Angles and the Cauchy-Schwarz inequality

In $\mathbb{R}^n$, where vectors are no longer represented by arrows, it's not clear how we should define the angle between two vectors. We could try to define angles by using the formula

$$\cos\theta = \frac{\mathbf{u}\cdot\mathbf{v}}{|\mathbf{u}||\mathbf{v}|},$$

which we derived for geometric vectors by using the law of cosines for triangles, but there's a problem: if the right-hand side has to equal the cosine of some angle, how do we know it's between –1 and +1?

It turns out that you can prove it's between –1 and +1, as a consequence of the Cauchy-Schwarz inequality.

Cauchy-Schwarz Inequality

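If $\mathbf{u}$ and $\mathbf{v}$ are vectors in $\mathbb{R}^n$, then

$$(\mathbf{u}\cdot\mathbf{v})^2 \le (\mathbf{u}\cdot\mathbf{u})(\mathbf{v}\cdot\mathbf{v}).$$

Equality holds if and only if $\mathbf{u}$ and $\mathbf{v}$ are parallel.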
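As a quick numerical sanity check (not a proof), the inequality can be tested on integer vectors, where the arithmetic is exact. The helper names `dot` and `cauchy_schwarz_gap` and the sample vectors below are illustrative assumptions, not part of the original notes.

```python
def dot(u, v):
    """Dot product of two equal-length sequences of numbers."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def cauchy_schwarz_gap(u, v):
    """Return (u.u)(v.v) - (u.v)^2, which the inequality says is never negative."""
    return dot(u, u) * dot(v, v) - dot(u, v) ** 2


# Two non-parallel vectors in R^4: the gap should be strictly positive.
u = [1, 2, 3, 4]
v = [2, 0, -1, 5]
gap_general = cauchy_schwarz_gap(u, v)

# Two parallel vectors (p = 3u): the gap should be exactly zero.
p = [3 * x for x in u]
gap_parallel = cauchy_schwarz_gap(u, p)

print(gap_general, gap_parallel)
```

Integer entries are used here so that the equality case can be compared exactly, without floating-point rounding.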

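Proof: If either vector is a scalar multiple of the other, say $\mathbf{u} = a\mathbf{v}$, then

$$\begin{aligned}
(\mathbf{u}\cdot\mathbf{u})(\mathbf{v}\cdot\mathbf{v}) &= \{(a\mathbf{v})\cdot(a\mathbf{v})\}(\mathbf{v}\cdot\mathbf{v}) \\
&= a^2(\mathbf{v}\cdot\mathbf{v})^2 \\
&= \{a(\mathbf{v}\cdot\mathbf{v})\}^2 \\
&= (a\mathbf{v}\cdot\mathbf{v})^2 \\
&= (\mathbf{u}\cdot\mathbf{v})^2 .
\end{aligned}$$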

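If neither vector is a scalar multiple of the other, define the vector $\mathbf{w} = a\mathbf{u} + b\mathbf{v}$ for scalars $a = -\mathbf{u}\cdot\mathbf{v}$ and $b = \mathbf{u}\cdot\mathbf{u}$. Calculate $\mathbf{w}\cdot\mathbf{w}$ and simplify:

$$\begin{aligned}
\mathbf{w}\cdot\mathbf{w} &= a^2(\mathbf{u}\cdot\mathbf{u}) + 2ab(\mathbf{u}\cdot\mathbf{v}) + b^2(\mathbf{v}\cdot\mathbf{v}) \\
&= (\mathbf{u}\cdot\mathbf{v})^2(\mathbf{u}\cdot\mathbf{u}) - 2(\mathbf{u}\cdot\mathbf{v})(\mathbf{u}\cdot\mathbf{u})(\mathbf{u}\cdot\mathbf{v}) + (\mathbf{u}\cdot\mathbf{u})^2(\mathbf{v}\cdot\mathbf{v}) \\
&= -(\mathbf{u}\cdot\mathbf{v})^2(\mathbf{u}\cdot\mathbf{u}) + (\mathbf{u}\cdot\mathbf{u})^2(\mathbf{v}\cdot\mathbf{v}).
\end{aligned}$$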

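Since neither of $\mathbf{u}$ and $\mathbf{v}$ is a scalar multiple of the other, $\mathbf{w}$ can't be $\mathbf{0}$. Indeed, $\mathbf{u} \ne \mathbf{0}$ (otherwise $\mathbf{u} = 0\mathbf{v}$), so $b = \mathbf{u}\cdot\mathbf{u} > 0$, and $\mathbf{w} = \mathbf{0}$ would force $\mathbf{v} = -(a/b)\mathbf{u}$. Then $\mathbf{w}\cdot\mathbf{w}$ is positive, so we have

$$(\mathbf{u}\cdot\mathbf{v})^2(\mathbf{u}\cdot\mathbf{u}) < (\mathbf{u}\cdot\mathbf{u})^2(\mathbf{v}\cdot\mathbf{v}).$$

Divide both sides by $\mathbf{u}\cdot\mathbf{u}$ (which is positive) to get

$$(\mathbf{u}\cdot\mathbf{v})^2 < (\mathbf{u}\cdot\mathbf{u})(\mathbf{v}\cdot\mathbf{v}).$$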
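The algebra in this step can be checked on a concrete pair of vectors. The following is a minimal sketch, assuming the same choice of scalars as in the proof (a = -(u.v), b = u.u); the function names `dot` and `build_w` and the sample vectors are hypothetical and chosen only for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length sequences of numbers."""
    return sum(x * y for x, y in zip(u, v))


def build_w(u, v):
    """Return w = a*u + b*v with a = -(u.v) and b = u.u, as in the proof."""
    a = -dot(u, v)
    b = dot(u, u)
    return [a * x + b * y for x, y in zip(u, v)]


# Sample vectors in R^3, neither a scalar multiple of the other.
u = [1, 2, 2]
v = [3, 0, 4]

w = build_w(u, v)

# Left side: w.w computed directly from the components of w.
lhs = dot(w, w)

# Right side: the simplified expression -(u.v)^2 (u.u) + (u.u)^2 (v.v).
rhs = -dot(u, v) ** 2 * dot(u, u) + dot(u, u) ** 2 * dot(v, v)

print(w, lhs, rhs)
```

Here w is non-zero and both expressions agree, so the strict inequality follows for this pair after dividing by u.u = 9.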

 

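Combining the two cases gives $(\mathbf{u}\cdot\mathbf{v})^2 \le (\mathbf{u}\cdot\mathbf{u})(\mathbf{v}\cdot\mathbf{v})$ for all vectors. Now, take positive square roots of both sides of this inequality: $|\mathbf{u}\cdot\mathbf{v}| \le (\mathbf{u}\cdot\mathbf{u})^{1/2}(\mathbf{v}\cdot\mathbf{v})^{1/2}$, i.e.

$$|\mathbf{u}\cdot\mathbf{v}| \le |\mathbf{u}||\mathbf{v}|,$$

which is equivalent to

$$-|\mathbf{u}||\mathbf{v}| \le \mathbf{u}\cdot\mathbf{v} \le |\mathbf{u}||\mathbf{v}|.$$

If $\mathbf{u}$ and $\mathbf{v}$ are both non-zero, divide through by $|\mathbf{u}||\mathbf{v}|$ (which is positive) to get

$$-1 \le \frac{\mathbf{u}\cdot\mathbf{v}}{|\mathbf{u}||\mathbf{v}|} \le 1.$$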

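This last inequality says that its middle can indeed be the cosine of some angle. Our usual formula for the angle between two vectors does make sense in $\mathbb{R}^n$, so we can use it as a definition.

For two non-zero vectors $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ in $\mathbb{R}^n$, the angle $\theta$ between $\mathbf{u}$ and $\mathbf{v}$ is defined by the formula

$$\cos\theta = \frac{\mathbf{u}\cdot\mathbf{v}}{|\mathbf{u}||\mathbf{v}|} = \frac{u_1v_1 + u_2v_2 + \cdots + u_nv_n}{\sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}\,\sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}}, \qquad 0 \le \theta \le \pi.$$

$\mathbf{u}$ and $\mathbf{v}$ are said to be orthogonal whenever this angle is $\pi/2$, i.e. whenever $\mathbf{u}\cdot\mathbf{v} = 0$.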
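The definition translates directly into a short computation. The sketch below is one possible way to do it, assuming vectors are given as plain Python lists; the names `dot`, `norm` and `angle_between` are illustrative and not from the original notes. The clamping step is an added safeguard against floating-point rounding pushing the quotient slightly outside the interval from -1 to 1, which the Cauchy-Schwarz inequality rules out in exact arithmetic.

```python
import math


def dot(u, v):
    """Dot product of two equal-length sequences of numbers."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(u, v))


def norm(u):
    """Euclidean norm |u| = sqrt(u.u)."""
    return math.sqrt(dot(u, u))


def angle_between(u, v):
    """Angle in radians, between 0 and pi, for two non-zero vectors."""
    nu, nv = norm(u), norm(v)
    if nu == 0 or nv == 0:
        raise ValueError("the angle is undefined when either vector is zero")
    cos_theta = dot(u, v) / (nu * nv)
    # Guard against rounding error before calling acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)


# Orthogonal vectors in R^4: the dot product is 0.
theta_orth = angle_between([1, 0, 1, 0], [0, 1, 0, 1])

# Parallel vectors in R^5: the angle should be (very nearly) 0.
theta_same = angle_between([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])

print(theta_orth, theta_same)
```

For the orthogonal pair the result is π/2 up to rounding; for the parallel pair it is 0 up to rounding.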

 

The Cauchy-Schwarz inequality also lets us prove one of the essential properties of norms in $\mathbb{R}^n$.

Four essential rules for norms of vectors in $\mathbb{R}^n$

For any vectors $\mathbf{u}$ and $\mathbf{v}$ in $\mathbb{R}^n$ and any scalar $k$,

If $\mathbf{u} \ne \mathbf{0}$, then $|\mathbf{u}| > 0$

$|\mathbf{0}| = 0$

$|k\mathbf{u}| = |k||\mathbf{u}|$

$|\mathbf{u} + \mathbf{v}| \le |\mathbf{u}| + |\mathbf{v}|$      (triangle inequality)

The first three of these rules are easy to prove. To explain the triangle inequality for geometric vectors, we interpreted them as sides of a triangle and used elementary geometry. In $\mathbb{R}^n$, we don't have geometric triangles, but we can still prove this inequality as a consequence of the Cauchy-Schwarz inequality.

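Proof of the triangle inequality: Calculate:

$$\begin{aligned}
|\mathbf{u} + \mathbf{v}|^2 &= (\mathbf{u} + \mathbf{v})\cdot(\mathbf{u} + \mathbf{v}) \\
&= \mathbf{u}\cdot\mathbf{u} + 2\,\mathbf{u}\cdot\mathbf{v} + \mathbf{v}\cdot\mathbf{v} \\
&= |\mathbf{u}|^2 + 2\,\mathbf{u}\cdot\mathbf{v} + |\mathbf{v}|^2 \\
&\le |\mathbf{u}|^2 + 2|\mathbf{u}\cdot\mathbf{v}| + |\mathbf{v}|^2
\end{aligned}$$

(any number is less than or equal to its absolute value).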

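But (from the Cauchy-Schwarz inequality)

$$|\mathbf{u}\cdot\mathbf{v}| \le |\mathbf{u}||\mathbf{v}|.$$

Then

$$|\mathbf{u} + \mathbf{v}|^2 \le |\mathbf{u}|^2 + 2|\mathbf{u}||\mathbf{v}| + |\mathbf{v}|^2,$$

i.e.

$$|\mathbf{u} + \mathbf{v}|^2 \le \{|\mathbf{u}| + |\mathbf{v}|\}^2.$$

Take positive square roots:

$$|\mathbf{u} + \mathbf{v}| \le |\mathbf{u}| + |\mathbf{v}|.$$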
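The triangle inequality can also be spot-checked numerically. This is a small sketch under the assumption that vectors are plain Python lists; `norm` and `triangle_slack` are hypothetical helper names introduced for this example only.

```python
import math


def norm(u):
    """Euclidean norm |u| = sqrt(u.u)."""
    return math.sqrt(sum(x * x for x in u))


def triangle_slack(u, v):
    """Return |u| + |v| - |u + v|, which the triangle inequality says is >= 0."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    total = [x + y for x, y in zip(u, v)]
    return norm(u) + norm(v) - norm(total)


# Orthogonal vectors: |u| = 5, |v| = 5, |u + v| = sqrt(50), so the slack is positive.
slack_general = triangle_slack([3, 4, 0], [0, 0, 5])

# Vectors pointing the same way (v = 2u): the norms add exactly, so the slack is 0.
slack_same_dir = triangle_slack([1, 2, 2], [2, 4, 4])

print(slack_general, slack_same_dir)
```

The second case illustrates when equality holds: one vector is a non-negative multiple of the other.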