$$
Conventions
- \(a_i\)’s and \(b_i\)’s will denote reals.
- \(V\) will denote an inner product space over \(\mathbb R\), and \(\|\cdot\|\) will denote the induced norm.
Notes
We start with an inductive proof of the following version of Cauchy’s inequality:
Proposition 1 One has \[\begin{equation*} a_1 b_1 + \cdots + a_n b_n\le \sqrt{a_1^2 + \cdots + a_n^2}\sqrt{b_1^2 + \cdots + b_n^2}. \end{equation*}\]
Proof. For \(n = 1\), we have equality. Let’s check the \(n = 2\) case, which will also be used in the inductive step. We have the following chain of equivalent statements: \[\begin{align*} (a_1 b_1 + a_2 b_2)^2 & \le (a_1^2 + a_2^2) (b_1^2 + b_2^2)\\ 2a_1 b_1 a_2 b_2 & \le a_1^2 b_2^2 + a_2^2 b_1^2\\ 0 & \le (a_1 b_2 - a_2 b_1)^2 \end{align*}\]
We now move on to the \((n + 1)\)-th case: \[\begin{align*} a_1 b_1 + \cdots + a_{n + 1} b_{n + 1} & = a_1 b_1 + \cdots + a_n b_n + a_{n + 1} b_{n + 1}\\ & \le \sqrt{a_1^2 + \cdots + a_n^2}\sqrt{b_1^2 + \cdots + b_n^2} + a_{n + 1} b_{n + 1}\\ & \le \sqrt{a_1^2 + \cdots + a_{n + 1}^2}\sqrt{b_1^2 + \cdots + b_{n + 1}^2} \end{align*}\] The second inequality uses the induction hypothesis, and the third uses the \(n = 2\) case.
We thus get:
Corollary 1 For \(a_1, a_2, \ldots, b_1, b_2, \ldots\), if \(\sum_i a_i^2\) and \(\sum_i b_i^2\) are finite, then so is \(\sum_i |a_i b_i|\).
However, we give an alternate proof not relying on Proposition 1:
Proof. We are motivated to investigate the smallness of \(|ab|\) when \(a^2\) and \(b^2\) are small. We’ll be done if we get hold of a constant \(C\) such that \[\begin{equation*} |ab|\le C(a^2 + b^2)\tag{1}\label{EQN: ab and a^2 b^2} \end{equation*}\] for then, that will imply a stronger result: \[\begin{equation*} \sum\nolimits_i |a_i b_i|\le C\left(\sum\nolimits_i a_i^2 + \sum\nolimits_i b_i^2\right) \end{equation*}\] Indeed, \(\ref{EQN: ab and a^2 b^2}\) holds for \(C = 1/2\) since \((|a| - |b|)^2\ge 0\).
The following stronger result also oozes out of the above proof:
Proposition 2 If \(\sum_i a_i^2\) and \(\sum_i b_i^2\) are finite, then \(\sum_i a_i b_i\) is absolutely convergent with \[\begin{equation*} \sum_i a_i b_i\le \frac{1}{2}\sum_i a_i^2 + \frac{1}{2}\sum_i b_i^2. \end{equation*}\]
Note that Proposition 2 can also be derived from Proposition 1 since \(\sqrt{x}\sqrt y\le (x^2 + y^2)/2\) for \(x, y\ge 0\). Conversely, we may derive Proposition 1 (even for the infinite case) from Proposition 2 as well by normalizing. Just make the following replacements: \(a_j\to a_j\big/\sqrt{\sum_i a_i^2}\) and \(b_j\to b_j\big/\sqrt{\sum_i b_i^2}\).
“Normalization gives us a systematic way to pass from an additive to a multiplicative inequality.”
The above discussion motivates a new way to prove the Cauchy-Schwarz inequality:
Theorem 1 (Cauchy-Schwarz) For \(u, v\in V\), one has \[\begin{equation*} \langle u, v\rangle\le \|u\|\|v\| \end{equation*}\] with equality holding \(\iff\) \(u\), \(v\) are linearly dependent.
Proof. It’s clear if any one of \(u\) or \(v\) is \(0\). Thus, we may without loss of generality assume that neither is \(0\). Note that the starting point for the proof of Corollary 1 (which was also the proof of Proposition 2) was that \((|a| - |b|)^2\ge 0\). Along the same lines, we observe that \[\begin{alignat*}{2} && 0 & \le \langle u - v, u - v\rangle &\\ &\stackrel{\text{\scriptsize w}}{\implies}& 0 & \le \|u\|^2 - 2\langle u, v\rangle + \|v\|^2 \\ &\stackrel{\text{\scriptsize w}}{\implies}& \langle u, v\rangle & \le (\|u\|^2 + \|v\|^2) / 2 \end{alignat*}\]
which is the generalization of Proposition 2. Replacing \(u\to u/\|u\|\) and \(v\to v/\|v\|\) (“normalizing”), we get \[\begin{equation*} \langle u, v\rangle\le\|u\|\|v\|. \end{equation*}\] Now, we show the fact about equality. “\(\Leftarrow\)” is evident. For the converse, let \(\langle u, v\rangle = \|u\|\|v\|\) which implies \(\langle\hat u, \hat v\rangle = 1\) where \(\hat u:= u/\|u\|\) and \(\hat v := v/\|v\|\) are the “normalized” variables. Then it follows that \(\|\hat u - \hat v\|^2 = 0\) so that \(v = \frac{\|v\|}{\|u\|} u\).
The popular quick proof involving the quadratic polynomial is due to Schwarz (see Exercise 11).
Exercises
Exercise 1 (The \(1\)-trick and the splitting trick) For any reals \(a_1, \ldots, a_n\), one has:
- \(\sum_i a_i\le \sqrt n \, \sqrt{\sum_i a_i^2}\)
- \(\sum_i a_i\le \sqrt{\sum_i |a_i|^{2/3}}\, \sqrt{\sum_i |a_i|^{4/3}}\)
Exercise 2 (Products of averages and averages of products) Let \(a_i, b_i, p_i\ge 0\) for \(i = 1, \ldots, n\) with \(\sum_i p_i = 1\) and each \(a_i b_i\ge 1\). Then we also have \[\begin{equation*} \left(\sum_{i = 1}^n p_i a_i\right) \left(\sum_{i = 1}^n p_i b_i\right)\ge 1 \end{equation*}\]
Exercise 3 (Why not three or more?) For reals \(a_i, b_i, c_i\)’s for \(i = 1, \ldots, n\), one has: \[\begin{equation*} \sum_i a_i b_i c_i\le \sqrt{\sum_i a_i^2}\, \sqrt{\sum_i b_i^2} \, \sqrt{\sum_i c_i^2}. \end{equation*}\]
Exercise 4 (Some help from symmetry) For \(x, y, z > 0\), one has: \[\begin{gather*} \left( \frac{x + y}{x + y + z} \right)^{1/2} + \left( \frac{y + z}{x + y + z} \right)^{1/2} + \left( \frac{z + x}{x + y + z} \right)^{1/2}\le \sqrt 6\\ x + y + z\le 2\left( \frac{x^2}{y + z} + \frac{y^2}{z + x} + \frac{z^2}{x + y} \right) \end{gather*}\]
Exercise 5 (A crystallographic inequality with a message) Let \(f\colon \mathbb{R\to R}\) be such that \(f(x)^2\le af(bx) + c\) where \(a, b, c\in\mathbb R\) (for instance, \(\cos(x)^2 = \cos(2x)/2 - 1/2\)). Let \(p_1 + \ldots + p_n = 1\) for \(p_i\ge 0\) and define \(g(x) := \sum_i p_i f(\alpha_i x_i)\) for \(\alpha_i\in\mathbb R\). Then \[\begin{equation*} g(x)^2\le a g(bx) + c. \end{equation*}\]
Exercise 6 (A sum of inversion preserving summands) For \(p_1, \ldots, p_n > 0\) with \(p_1 + \cdots + p_n = 1\), one has \[\begin{equation*} \sum_{i = 1}^n \left( p_i + \frac{1}{p_i} \right)^2\ge n^3 + 2n + \frac{1}{n} \end{equation*}\] with the equality holding \(\iff\) each \(p_i = 1/n\).
Lemma 1 Let \(0 < p_1, \ldots, p_n \le 1\) with \(\sum_i p_i = 1\). Then for any \(k\ge 0\), one has \[\begin{equation*} \sum_{i = 1}^n \frac{1}{p_i^k}\ge n^{k + 1} \end{equation*}\] with the equality holding \(\iff\) each \(p_i = 1/n\).
Proof. We induct strongly. Assume for all \(k'\) less than \(k\). We have two cases:
- \(k\) is even: Then \(n\sum_i 1/p_i^{k}\) \(\ge\) \(\left(\sum_i 1/p_i^{\frac{k}{2}}\right)^2\) \(\ge\) \(\bigl(n^{\frac{k}{2} + 1}\bigr)^2 = n^{k + 2}\) (without loss of generality, assume that \(k/2 < k\)) so that \(\sum_i 1/p_i^k \ge n^{k + 1}\).
- \(k\) is odd: Then \(\sum_i 1/p_i^k\) \(=\) \(\left(\sum_i p_i\right) \left( \sum_i 1/p_i^k \right)\) \(\ge\) \(\left( \sum_i p_i^{\frac{1}{2}}/p_i^{\frac{k}{2}} \right)^2\) \(=\) \(\left( \sum_i 1/p_i^{\frac{k-1}{2}} \right)^2\) \(\ge\) \(\bigl(n^{\frac{k+1}{2}}\bigr)^2\) \(=\) \(n^{k + 1}\).
Finally, note that if the equality holds, then all the inequalities above become equalities, so we may again deploy induction.
Exercise 7 (Flexibility of form) Let \(A\in\mathbb R^{n\times n}\). Then \[\begin{equation*} \langle x, y\rangle := x^t A\, y \end{equation*}\] defines an inner product on \(\mathbb R^n\) \(\iff\) \(A\) is symmetric and positive definite. Further, for \(n = 2\), the symmetric matrix \(\begin{bsmallmatrix} a & b\\ b & d \end{bsmallmatrix}\) is positive definite \(\iff\) \(a, d > 0\) and \(b^2 < ad\).
Exercise 8 (Doing the sums) For real \(a_i\)’s and \(-1 < x < 1\), one has: \[\begin{align*} \sum_{i = 0}^n x^i a_i & \le \frac{1}{\sqrt{1 - x^2}} \left(\sum_{i = 0}^n a_i^2\right)^{1/2}\\ \sum_{i = 1}^n \frac{a_i}{i} & \le \sqrt 2 \left(\sum_{i = 0}^n a_i^2\right)^{1/2}\\ \sum_{i = 1}^n \frac{a_i}{n + i} & \le \sqrt{\ln 2} \left(\sum_{i = 0}^n a_i^2\right)^{1/2}\\ \sum_{i = 0}^n \binom{n}{i} a_i & \le \sqrt{\binom{2n}{n}}\left(\sum_{i = 0}^n a_i^2\right)^{1/2} \end{align*}\]
Exercise 9 (Beating the obvious bounds) Let \(a_1, \ldots, a_n\) be reals. Then \[\begin{equation*} \left(\sum_{i = 1}^n a_i\right)^2 + \left( \sum_{i = 1}^n (-1)^i a_i \right)^2\le (n + 2)\sum_{i = 1}^n a_i^2. \end{equation*}\]
Exercise 10 (Schur’s lemma—the \(R\) and \(C\) bounds) Let \(A\in\mathbb R^{m\times n}\). Then for any \(x\in\mathbb R^m\) and \(y\in \mathbb R^n\), one has \[\begin{equation*} \left| x^t A\, y \right|\le \sqrt{RC} \left( \sum_{i = 1}^m x_i^2 \right)^{1/2} \left( \sum_{i = 1}^n y_i^2 \right)^{1/2} \end{equation*}\] where \(R := \max_i \sum_j |A_{i, j}|\) and \(C := \max_j \sum_i |A_{i, j}|\).
Exercise 11 (Schwarz’s argument in an inner product space) Prove Theorem 1 by considering the quadratic polynomial \(p(t) := \|u + tv\|^2\) on \(\mathbb R\).
Exercise 12 (Example of a self-generalization) If \(V_i\)’s are inner product spaces with inner products \(\langle\cdot, \cdot\rangle_i\) and \(w_i > 0\) for \(i = 1, \ldots, n\), then the following defines an inner product on \(\prod_i V_i\): \[\begin{equation*} \langle u, v\rangle := \sum_{i = 1}^n w_i \langle u_i, v_i\rangle_i. \end{equation*}\]
Exercise 13 (Application of Cauchy’s inequality to an array) If \(A\in\mathbb R^{m\times n}\), then one has \[\begin{equation*} m\sum_{i = 1}^m r_i^2 + n\sum_{j = 1}^n c_j^2 \le \Bigl(\sum_{i, j} a_{i, j}\Bigr)^2 + mn\sum_{i, j} a_{i, j}^2 \end{equation*}\] where \(r_i := \sum_{j = 1}^n a_{i, j}\) and \(c_j := \sum_{i = 1}^m a_{i, j}\). Aldo, the equality holds \(\iff\) there exist \(\alpha_1, \ldots, \alpha_m, \beta_1, \ldots, \beta_n\in\mathbb R\) such that \(a_{i, j} = \alpha_i + \beta_j\).
Exercise 14 (A Cauchy triple and Loomis-Whitney)
- Let \(A, B, C\in\mathbb R^{n\times n}\). Then one has \[\begin{equation*} \left(\sum_{i, j, k = 1}^n a_{i, j}\, b_{j, k}\, c_{k, i}\right)^2 \le \left(\sum_{i, j = 1}^n a_{i, j}^2 \right) \left(\sum_{j, k = 1}^n b_{j, k}^2 \right) \left(\sum_{k, i = 1}^n c_{k, i}^2 \right). \end{equation*}\]
- Let \(S\subseteq\mathbb Z^3\) be finite and let \(S_x\), \(S_y\), \(S_z\) be projections on the \(yz\)-, \(zx\)-, \(xy\)-planes respectively. Then these are also finite with \[\begin{equation*} |S|\le |S_x|^{1/2} |S_y|^{1/2} |S_z|^{1/2}. \end{equation*}\]
Exercise 15 (An application to statistical theory) Let \(D\) be finite a finite set and \(\Theta\subseteq\mathbb R\) be an interval. Let \(p\colon D\times\Theta\to[0, +\infty)\) be such that for each \(k\in D\), we have that \(\sum_{k\in D} p(k, \theta) = 1\) and that \(p(k, \cdot)\colon \Theta\to\mathbb R\) is differentiable. Let \(g\colon D\to \mathbb R\) such that \(\sum_{k\in D} g(k) p(k, \theta) = \theta\). Then \[\begin{equation*} \sum_{k\in D} (g(k) - \theta)^2 p(k, \theta) \ge \frac{1}{I(\theta)} \end{equation*}\] where \(I(\theta) := \sum_{k\in D}\left(\frac{p_\theta(k, \theta)}{p(k, \theta)}\right)^2 p(k, \theta)\).
Errata
- p. 11: “
binomialquadratic formula” in the third paragraph. - p. 17: \(\Theta\) must be given to be an appropriate set since differentiability is being talked of. For instance, \(\Theta\) may be taken to be an interval.
- p. 228: In the solution to Ex 7, \(\text{``}\langle x, y\rangle = 5x_1 y_1 + x_1 y_2 + x_2 y_1\) \({}+ \cancel{3y_2^2} {\color{YellowOrange}{3x_2 y_2}}\text{"}\) and \(\text{``}5z^2 + \cancel{3z}{\color{YellowOrange}{2z}} + 3 =0\text{"}\).