4.5. Inner Product Spaces

Inner products are a generalization of the notion of the dot product. We restrict our attention to real vector spaces and complex vector spaces. Thus, the field $\mathbb{F}$ can be either $\mathbb{R}$ or $\mathbb{C}$.

4.5.1. Inner Product

Definition 4.71 (Inner product)

An inner product over an $\mathbb{F}$-vector space $V$ is any map $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{F}$ mapping $(v_1, v_2) \mapsto \langle v_1, v_2 \rangle$ and satisfying the following properties:

  1. [Positive definiteness]

     $\langle v, v \rangle \geq 0$ and $\langle v, v \rangle = 0 \iff v = 0$.

  2. [Conjugate symmetry]

     $\langle v_1, v_2 \rangle = \overline{\langle v_2, v_1 \rangle} \quad \forall v_1, v_2 \in V$.

  3. [Linearity in the first argument]

     $$\langle \alpha v, w \rangle = \alpha \langle v, w \rangle \quad \forall v, w \in V, \ \forall \alpha \in \mathbb{F};$$

     $$\langle v_1 + v_2, w \rangle = \langle v_1, w \rangle + \langle v_2, w \rangle \quad \forall v_1, v_2, w \in V.$$

Theorem 4.70 (Scaling in second argument)

Let $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{F}$ be an inner product. Then

$$\langle v, \alpha w \rangle = \overline{\alpha} \langle v, w \rangle.$$

Proof. We proceed as follows:

$$\langle v, \alpha w \rangle = \overline{\langle \alpha w, v \rangle} = \overline{\alpha \langle w, v \rangle} = \overline{\alpha} \, \overline{\langle w, v \rangle} = \overline{\alpha} \langle v, w \rangle.$$

Theorem 4.71 (Distribution in second argument)

Let $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{F}$ be an inner product. Then for any $v, x, y \in V$:

$$\langle v, x + y \rangle = \langle v, x \rangle + \langle v, y \rangle.$$

Proof. We proceed as follows:

$$\langle v, x + y \rangle = \overline{\langle x + y, v \rangle} = \overline{\langle x, v \rangle + \langle y, v \rangle} = \overline{\langle x, v \rangle} + \overline{\langle y, v \rangle} = \langle v, x \rangle + \langle v, y \rangle.$$

Theorem 4.72 (Inner product with zero)

Let $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{F}$ be an inner product. Then,

$$\langle 0, v \rangle = \langle v, 0 \rangle = 0 \quad \forall v \in V.$$

Proof. We proceed as follows:

$$\langle u, v \rangle = \langle u + 0, v \rangle = \langle u, v \rangle + \langle 0, v \rangle.$$

By cancelling terms, we get:

$$\langle 0, v \rangle = 0.$$

Using the conjugate symmetry, we get:

$$\langle v, 0 \rangle = \overline{\langle 0, v \rangle} = \overline{0} = 0.$$
  • Linearity in the first argument extends to any arbitrary (finite) linear combination:

    $$\left\langle \sum_i \alpha_i v_i, w \right\rangle = \sum_i \alpha_i \langle v_i, w \rangle.$$

  • Similarly, we have conjugate linearity in the second argument for any arbitrary (finite) linear combination:

    $$\left\langle v, \sum_i \alpha_i w_i \right\rangle = \sum_i \overline{\alpha_i} \langle v, w_i \rangle.$$

Example 4.20

The standard inner product on $\mathbb{R}^n$ is defined as:

$$\langle x, y \rangle = \sum_{i=1}^n x_i y_i.$$

This is often called the dot product or scalar product.

Example 4.21

The standard inner product on $\mathbb{C}^n$ is defined as:

$$\langle x, y \rangle = \sum_{i=1}^n x_i \overline{y_i}.$$
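
As a quick numerical illustration (a minimal sketch using NumPy; the helper names `inner_rn` and `inner_cn` are ours), both standard inner products can be computed directly from their defining sums. Note that `numpy.vdot` conjugates its first argument, so its arguments must be swapped to match the convention $\langle x, y \rangle = \sum_i x_i \overline{y_i}$ used here.

```python
import numpy as np

def inner_rn(x, y):
    # <x, y> = sum_i x_i y_i : the standard inner product on R^n.
    return np.sum(x * y)

def inner_cn(x, y):
    # <x, y> = sum_i x_i conj(y_i) : the standard inner product on C^n.
    return np.sum(x * np.conj(y))

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, -1.0, 2.0])
print(inner_rn(x, y))            # 8.0

u = np.array([1 + 2j, 3 - 1j])
v = np.array([2 - 1j, 0 + 1j])
print(inner_cn(u, v))            # (-1+2j)
print(np.vdot(v, u))             # same value; vdot conjugates its first argument
```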

Example 4.22

Let $x, y \in \mathbb{R}^2$. Define:

$$\langle x, y \rangle = x_1 y_1 - x_2 y_1 - x_1 y_2 + 4 x_2 y_2.$$

Now:

  1. $\langle x, x \rangle = (x_1 - x_2)^2 + 3 x_2^2$. Thus, $\langle x, x \rangle \geq 0$ and $\langle x, x \rangle = 0 \iff x = 0$. Thus, it is positive definite.

  2. $\langle y, x \rangle = y_1 x_1 - y_2 x_1 - y_1 x_2 + 4 y_2 x_2 = \langle x, y \rangle$. It is symmetric.

  3. We can also verify that it is linear in the first argument.

Thus, it satisfies all the properties of an inner product.

Note that, in matrix notation, we can write this inner product as:

$$\langle x, y \rangle = \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ -1 & 4 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}.$$

The matrix

$$A = \begin{bmatrix} 1 & -1 \\ -1 & 4 \end{bmatrix}$$

is positive definite. Its trace is $5$ and its determinant is $3$. Its eigenvalues are approximately $4.303$ and $0.697$.
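
A short numerical check of this example (a sketch assuming NumPy; the function name `ip` is ours): the eigenvalues of $A$ are positive, and the matrix form $x^T A y$ reproduces the componentwise formula.

```python
import numpy as np

A = np.array([[1.0, -1.0],
              [-1.0, 4.0]])

# Both eigenvalues (5 +/- sqrt(13)) / 2, i.e. about 0.697 and 4.303,
# are positive, so A is positive definite.
print(np.linalg.eigvalsh(A))

def ip(x, y):
    # <x, y> = x^T A y
    return x @ A @ y

x = np.array([2.0, -1.0])
y = np.array([1.0, 3.0])
print(ip(x, y))                                            # matrix form
print(x[0]*y[0] - x[1]*y[0] - x[0]*y[1] + 4*x[1]*y[1])     # componentwise form
print(ip(x, x))                                            # (x1 - x2)^2 + 3 x2^2 >= 0
```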

Example 4.23

Let $\mathbb{C}^{n \times n}$ be the space of $n \times n$ complex matrices. For any $A = (a_{jk})$ and $B = (b_{jk})$ in $\mathbb{C}^{n \times n}$, we define the inner product as:

$$\langle A, B \rangle = \sum_{j,k} a_{jk} \overline{b_{jk}}.$$

It can be easily seen that:

$$\langle A, B \rangle = \operatorname{tr}(A B^H) = \operatorname{tr}(B^H A)$$

where $B^H$ is the conjugate transpose of $B$ and $\operatorname{tr}$ computes the trace of a matrix (the sum of its diagonal entries).
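
The following sketch (assuming NumPy; `frob_inner` is our name for this inner product, usually called the Frobenius inner product) verifies numerically that the elementwise sum agrees with both trace formulas.

```python
import numpy as np

def frob_inner(A, B):
    # <A, B> = sum_{j,k} a_{jk} conj(b_{jk})
    return np.sum(A * np.conj(B))

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

lhs = frob_inner(A, B)
print(np.allclose(lhs, np.trace(A @ B.conj().T)))   # tr(A B^H)
print(np.allclose(lhs, np.trace(B.conj().T @ A)))   # tr(B^H A)
```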

Example 4.24

Let $\mathbb{C}^{n \times 1}$ be the space of column vectors. Let $Q$ be an arbitrary $n \times n$ invertible matrix over $\mathbb{C}$.

For any $x, y \in \mathbb{C}^{n \times 1}$, define

$$\langle x, y \rangle = y^H Q^H Q x.$$

We identify the $1 \times 1$ matrix on the R.H.S. with its single entry, a complex number in $\mathbb{C}$. This is a valid inner product.

When $Q = I$, the identity matrix, the inner product reduces to:

$$\langle x, y \rangle = y^H x.$$

This is the standard inner product on the space of column vectors.
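
A small numerical sanity check (a sketch assuming NumPy; `ip_q` is our name) that this form is conjugate symmetric, positive on a nonzero sample, and reduces to $y^H x$ when $Q = I$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
# A random complex matrix; it is invertible with probability one.
Q = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

def ip_q(x, y, Q):
    # <x, y> = y^H Q^H Q x, read off as a scalar.
    return y.conj() @ Q.conj().T @ Q @ x

x = rng.normal(size=n) + 1j * rng.normal(size=n)
y = rng.normal(size=n) + 1j * rng.normal(size=n)

print(np.allclose(ip_q(x, y, Q), np.conj(ip_q(y, x, Q))))   # conjugate symmetry
print(ip_q(x, x, Q).real > 0)                               # <x, x> = ||Qx||^2 > 0

# With Q = I, the inner product reduces to the standard one, y^H x.
print(np.allclose(ip_q(x, y, np.eye(n)), np.vdot(y, x)))
```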

Theorem 4.73

A complex inner product is completely determined by its real part.

This statement may be confusing. Let us unpack what it means. Let

$$\langle x, y \rangle = \operatorname{Re} \langle x, y \rangle + i \operatorname{Im} \langle x, y \rangle.$$

Then, computing the inner product involves computing the real part as well as the imaginary part. What the statement means is that, if we know how to compute $\operatorname{Re} \langle x, y \rangle$ for any $x, y \in V$, then we can use the same method to compute $\operatorname{Im} \langle x, y \rangle$ too, but with different inputs. See below.

Proof. Let

$$\langle x, y \rangle = \operatorname{Re} \langle x, y \rangle + i \operatorname{Im} \langle x, y \rangle.$$

For any complex number $z = x + iy \in \mathbb{C}$, we have:

$$\operatorname{Re}(-i z) = \operatorname{Re}(-i (x + iy)) = \operatorname{Re}(y - i x) = y = \operatorname{Im}(z).$$

Since $\langle x, y \rangle$ is a complex number, hence:

$$\operatorname{Im} \langle x, y \rangle = \operatorname{Re}(-i \langle x, y \rangle) = \operatorname{Re} \langle x, iy \rangle,$$

where we used $\langle x, iy \rangle = \overline{i} \langle x, y \rangle = -i \langle x, y \rangle$. Thus,

$$\langle x, y \rangle = \operatorname{Re} \langle x, y \rangle + i \operatorname{Re} \langle x, iy \rangle.$$

4.5.2. Real Inner Product

From the perspective of convex analysis, the general inner product is not very useful. We prefer a special class of inner products whose value is always real. This is applicable to vector spaces where the field of scalars is $\mathbb{R}$.

Definition 4.72 (Real inner product)

A real inner product over an $\mathbb{R}$-vector space $V$ is any map $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{R}$ mapping $(v_1, v_2) \mapsto \langle v_1, v_2 \rangle$ and satisfying the following properties:

  1. [Positive definiteness]

     $\langle v, v \rangle \geq 0$ and $\langle v, v \rangle = 0 \iff v = 0$.

  2. [Symmetry]

     $\langle v_1, v_2 \rangle = \langle v_2, v_1 \rangle \quad \forall v_1, v_2 \in V$.

  3. [Linearity in the first argument]

     $$\langle \alpha v, w \rangle = \alpha \langle v, w \rangle \quad \forall v, w \in V, \ \forall \alpha \in \mathbb{R};$$

     $$\langle v_1 + v_2, w \rangle = \langle v_1, w \rangle + \langle v_2, w \rangle \quad \forall v_1, v_2, w \in V.$$

  • A real inner product is always real valued no matter whether the vectors are real or complex.

  • Since a real inner product is symmetric and linear in the first argument, it is linear in the second argument too:

$$\left\langle v, \sum_i \alpha_i w_i \right\rangle = \sum_i \alpha_i \langle v, w_i \rangle.$$

Example 4.25 (A real inner product for $\mathbb{C}^n$ over $\mathbb{R}$)

In this example, we are dealing with $n$-tuples of complex numbers in $\mathbb{C}^n$ with the field of scalars being $\mathbb{R}$. It can be easily checked that $\mathbb{C}^n$ over $\mathbb{R}$ is a vector space.

Let $z_1, z_2$ be complex numbers:

$$z_1 \overline{z_2} = (a_1 + i b_1)(a_2 - i b_2) = a_1 a_2 + b_1 b_2 + i (b_1 a_2 - a_1 b_2).$$

Then

$$\operatorname{Re}(z_1 \overline{z_2}) = a_1 a_2 + b_1 b_2.$$

  1. $\operatorname{Re}(z \overline{z}) = a^2 + b^2$ for $z = a + i b$ is positive definite; i.e., $\operatorname{Re}(z \overline{z}) = 0 \iff z = 0 + i 0$.

  2. $\operatorname{Re}(z_1 \overline{z_2}) = \operatorname{Re}(z_2 \overline{z_1})$, so it is symmetric.

  3. For any $\alpha \in \mathbb{R}$, $\operatorname{Re}(\alpha z_1 \overline{z_2}) = \alpha \operatorname{Re}(z_1 \overline{z_2})$. Thus, it is linear in the first argument.

Now, for any $x, y \in \mathbb{C}^n$, define:

$$\langle x, y \rangle = \operatorname{Re}\left( \sum_{i=1}^n x_i \overline{y_i} \right).$$

Following the argument above, it is a real inner product on $\mathbb{C}^n$.

Interestingly, if $u \in \mathbb{C}^n$ is identified with $v \in \mathbb{R}^{2n}$ by stacking the real and imaginary parts, then the real inner product defined above for $\mathbb{C}^n$ is nothing but the standard inner product for $\mathbb{R}^{2n}$.
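
This identification is easy to check numerically (a sketch assuming NumPy; `real_ip_cn` and `stack` are our helper names).

```python
import numpy as np

def real_ip_cn(x, y):
    # <x, y> = Re( sum_i x_i conj(y_i) ) : a real inner product on C^n over R.
    return np.sum(x * np.conj(y)).real

def stack(u):
    # Identify u in C^n with a vector in R^{2n} by stacking real and imaginary parts.
    return np.concatenate([u.real, u.imag])

rng = np.random.default_rng(2)
x = rng.normal(size=3) + 1j * rng.normal(size=3)
y = rng.normal(size=3) + 1j * rng.normal(size=3)

# The real inner product on C^n equals the standard inner product on R^{2n}.
print(np.allclose(real_ip_cn(x, y), stack(x) @ stack(y)))
```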

While the presentation in the rest of this section is based on the general conjugate symmetric inner product, it is easy to extrapolate the results to the special case of real inner products.

4.5.3. Inner Product Space

Definition 4.73 (Inner product space / Pre-Hilbert space)

An $\mathbb{F}$-vector space $V$ equipped with an inner product $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{F}$ is known as an inner product space or a pre-Hilbert space.

4.5.4. Orthogonality

Orthogonality is the generalization of the notion of perpendicularity from elementary geometry.

Definition 4.74 (Orthogonal vectors)

Any two vectors $u, v \in V$ are called orthogonal to each other if $\langle u, v \rangle = 0$.

We write $u \perp v$ if $u$ and $v$ are orthogonal to each other.

Definition 4.75 (Set of orthogonal vectors)

A set of non-zero vectors $\{v_1, \dots, v_p\}$ is called orthogonal or pairwise orthogonal if

$$\langle v_i, v_j \rangle = 0 \text{ if } i \neq j, \quad \forall 1 \leq i, j \leq p.$$

Theorem 4.74 (Orthogonality implies independence)

A set of orthogonal vectors is linearly independent.

Proof. Let $v_1, \dots, v_n$ be a set of orthogonal (non-zero) vectors. Suppose there is a linear combination:

$$\alpha_1 v_1 + \dots + \alpha_n v_n = 0.$$

Taking the inner product on both sides with $v_j$, we get:

$$\begin{aligned}
& \langle \alpha_1 v_1 + \dots + \alpha_n v_n, v_j \rangle = \langle 0, v_j \rangle \\
& \implies 0 + \dots + \alpha_j \langle v_j, v_j \rangle + \dots + 0 = 0 \\
& \implies \alpha_j \langle v_j, v_j \rangle = 0 \\
& \implies \alpha_j = 0
\end{aligned}$$

since $v_j \neq 0$ implies $\langle v_j, v_j \rangle > 0$.

Thus, the only linear combination equal to zero is the trivial one. Thus, the vectors are linearly independent.

4.5.5. Norm Induced by Inner Product

Definition 4.76 (Norm induced by inner product)

Every inner product $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{F}$ on a vector space $V$ induces a norm $\| \cdot \| : V \to \mathbb{R}$ given by:

$$\| v \| = \sqrt{\langle v, v \rangle} \quad \forall v \in V.$$

We shall justify that this function satisfies all the properties of a norm later. But before that, let us examine some implications of this definition which are useful in their own right.

Note that it is easy to see that $\| \cdot \|$ is positive definite; i.e., $\| 0 \| = 0$ and $\| v \| > 0$ if $v \neq 0$.

Also, it is positively homogeneous, since:

$$\| \alpha v \| = \sqrt{\langle \alpha v, \alpha v \rangle} = \sqrt{\alpha \overline{\alpha} \langle v, v \rangle} = |\alpha| \sqrt{\langle v, v \rangle} = |\alpha| \| v \|.$$

Theorem 4.75 (Pythagoras theorem)

If $u \perp v$ then

$$\| u + v \|^2 = \| u \|^2 + \| v \|^2.$$

Proof. Expanding:

$$\| u + v \|^2 = \langle u + v, u + v \rangle = \langle u, u \rangle + \langle u, v \rangle + \langle v, u \rangle + \langle v, v \rangle = \| u \|^2 + \| v \|^2$$

where we used the fact that $\langle u, v \rangle = \langle v, u \rangle = 0$ since $u \perp v$.

Theorem 4.76 (Cauchy-Schwarz inequality)

For any $u, v \in V$:

$$| \langle u, v \rangle | \leq \| u \| \, \| v \|.$$

The equality holds if and only if $u$ and $v$ are linearly dependent.

Proof. If either $u = 0$ or $v = 0$ then the equality holds. So, suppose that neither of them is the zero vector. In particular, $v \neq 0$ means $\| v \| > 0$.

Define

$$w = \frac{\langle u, v \rangle}{\| v \|^2} v.$$

Then,

$$\begin{aligned}
\langle w, u - w \rangle
&= \left\langle \frac{\langle u, v \rangle}{\| v \|^2} v, \; u - \frac{\langle u, v \rangle}{\| v \|^2} v \right\rangle \\
&= \frac{\langle u, v \rangle}{\| v \|^2} \left( \langle v, u \rangle - \overline{\frac{\langle u, v \rangle}{\| v \|^2}} \langle v, v \rangle \right) \\
&= \frac{\langle u, v \rangle}{\| v \|^2} \left( \langle v, u \rangle - \frac{\langle v, u \rangle}{\| v \|^2} \| v \|^2 \right) \\
&= 0.
\end{aligned}$$

Thus, $w \perp (u - w)$. Therefore, by the Pythagoras theorem,

$$\| u \|^2 = \| (u - w) + w \|^2 = \| u - w \|^2 + \| w \|^2 \geq \| w \|^2 = \left\| \frac{\langle u, v \rangle}{\| v \|^2} v \right\|^2 = \left| \frac{\langle u, v \rangle}{\| v \|^2} \right|^2 \| v \|^2 = \frac{| \langle u, v \rangle |^2}{\| v \|^2}.$$

Multiplying on both sides by $\| v \|^2$, we obtain:

$$\| u \|^2 \| v \|^2 \geq | \langle u, v \rangle |^2.$$

Taking square roots on both sides,

$$| \langle u, v \rangle | \leq \| u \| \, \| v \|.$$

In the derivation above, the equality holds if and only if

$$0 = \| u - w \| = \left\| u - \frac{\langle u, v \rangle}{\| v \|^2} v \right\|$$

which means that $u$ and $v$ are linearly dependent.

Conversely, if $u$ and $v$ are linearly dependent, then $u = \alpha v$ for some $\alpha \in \mathbb{F}$, and

$$w = \frac{\langle \alpha v, v \rangle}{\| v \|^2} v = \frac{\alpha \langle v, v \rangle}{\| v \|^2} v = \alpha v = u$$

giving us $u - w = 0$. Hence, the equality holds.
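
A quick numerical sanity check of the inequality and its equality case, using the standard inner product on $\mathbb{R}^5$ (a sketch assuming NumPy).

```python
import numpy as np

rng = np.random.default_rng(3)
u = rng.normal(size=5)
v = rng.normal(size=5)

# |<u, v>| <= ||u|| ||v|| for generic vectors.
print(abs(u @ v) <= np.linalg.norm(u) * np.linalg.norm(v))

# Equality holds when u and v are linearly dependent, e.g. u = 2.5 v.
w = 2.5 * v
print(np.isclose(abs(w @ v), np.linalg.norm(w) * np.linalg.norm(v)))
```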

Theorem 4.77 (Inner product induced norm justification)

The function $\| \cdot \| : V \to \mathbb{R}$ induced by the inner product $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{F}$ as defined in Definition 4.76 is indeed a norm.

Proof. We need to verify that $\| \cdot \|$ so defined is indeed a norm. We have already shown that it is positive definite and positively homogeneous. We now show the triangle inequality, with the help of the Cauchy-Schwarz inequality shown above:

$$\begin{aligned}
\| u + v \|^2 &= \langle u + v, u + v \rangle = \langle u, u \rangle + \langle u, v \rangle + \langle v, u \rangle + \langle v, v \rangle \\
&= \| u \|^2 + \langle u, v \rangle + \overline{\langle u, v \rangle} + \| v \|^2 = \| u \|^2 + 2 \operatorname{Re} \langle u, v \rangle + \| v \|^2 \\
&\leq \| u \|^2 + 2 | \langle u, v \rangle | + \| v \|^2 \leq \| u \|^2 + 2 \| u \| \| v \| + \| v \|^2 = (\| u \| + \| v \|)^2.
\end{aligned}$$

Taking the square root on both sides, we obtain:

$$\| u + v \| \leq \| u \| + \| v \|.$$

Thus, $\| \cdot \|$ is indeed a norm.

We recap the sequence of results to emphasize the logical flow:

  1. We started with just the definition of $\| \cdot \|$ in Definition 4.76.

  2. We proved positive definiteness from the definition itself.

  3. We proved positive homogeneity also from the definition itself.

  4. We proved the Pythagoras theorem utilizing previously established results for inner products.

  5. We proved the Cauchy-Schwarz inequality using positive definiteness, positive homogeneity and the Pythagoras theorem.

  6. We proved the triangle inequality using the Cauchy-Schwarz inequality.

Theorem 4.78 (Inner product space to metric space)

Every inner product space is a normed space. Hence it is also a metric space.

Proof. An inner product induces a norm which makes the vector space a normed space. A norm induces a metric which makes the vector space a metric space.

4.5.6. Hilbert Spaces

Definition 4.77 (Hilbert space)

An inner product space $V$ that is complete with respect to the metric induced by the norm induced by its inner product is called a Hilbert space.

In other words, $V$ is a Hilbert space if every Cauchy sequence in $V$ converges in $V$.

4.5.7. Orthonormality

Definition 4.78 (Set of orthonormal vectors)

A set of non-zero vectors $\{e_1, \dots, e_p\}$ is called orthonormal if

$$\begin{aligned}
& \langle e_i, e_j \rangle = 0 \text{ if } i \neq j, \quad \forall 1 \leq i, j \leq p, \\
& \langle e_i, e_i \rangle = 1, \quad \forall 1 \leq i \leq p;
\end{aligned}$$

i.e., $\langle e_i, e_j \rangle = \delta(i, j)$.

In other words, the vectors are unit norm ($\| e_i \| = 1$) and pairwise orthogonal ($e_i \perp e_j$ whenever $i \neq j$).

Since orthonormal vectors are orthogonal, they are linearly independent.

Definition 4.79 (Orthonormal basis)

A set of orthonormal vectors forms an orthonormal basis for its span.

Theorem 4.79 (Expansion of a vector in an orthonormal basis)

Let $\{e_1, \dots, e_n\}$ be an orthonormal basis for $V$. Then, any $v \in V$ can be written as:

$$v = \langle v, e_1 \rangle e_1 + \dots + \langle v, e_n \rangle e_n.$$

Proof. Since $\{e_1, \dots, e_n\}$ forms a basis for $V$, every $v \in V$ can be written as:

$$v = \alpha_1 e_1 + \dots + \alpha_n e_n$$

where $\alpha_1, \dots, \alpha_n \in \mathbb{F}$.

Taking the inner product with $e_j$ on both sides, we get:

$$\langle v, e_j \rangle = \alpha_1 \langle e_1, e_j \rangle + \dots + \alpha_n \langle e_n, e_j \rangle.$$

Since $\langle e_i, e_j \rangle = \delta(i, j)$, the above reduces to:

$$\langle v, e_j \rangle = \alpha_j.$$

Theorem 4.80 (Norm of a vector in an orthonormal basis)

Let $\{e_1, \dots, e_n\}$ be an orthonormal basis for $V$. For any $v \in V$, let its expansion in the orthonormal basis be:

$$v = \alpha_1 e_1 + \dots + \alpha_n e_n.$$

Then,

$$\| v \|^2 = | \alpha_1 |^2 + \dots + | \alpha_n |^2 = \sum_{i=1}^n | \alpha_i |^2.$$

Proof. Expanding the expression for the norm squared:

$$\| v \|^2 = \langle v, v \rangle = \langle \alpha_1 e_1 + \dots + \alpha_n e_n, \; \alpha_1 e_1 + \dots + \alpha_n e_n \rangle = \sum_{i=1}^n \sum_{j=1}^n \langle \alpha_i e_i, \alpha_j e_j \rangle = \sum_{i=1}^n | \alpha_i |^2.$$
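
Both results are easy to verify numerically for $\mathbb{R}^4$ with the standard inner product (a sketch assuming NumPy; the orthonormal basis is taken from a QR factorization of a random matrix).

```python
import numpy as np

rng = np.random.default_rng(4)
# Columns of E form an orthonormal basis e_1, ..., e_4 of R^4.
E, _ = np.linalg.qr(rng.normal(size=(4, 4)))

v = rng.normal(size=4)
alpha = E.T @ v                     # alpha_j = <v, e_j>

# v = sum_j <v, e_j> e_j  (Theorem 4.79)
print(np.allclose(E @ alpha, v))

# ||v||^2 = sum_j |alpha_j|^2  (Theorem 4.80)
print(np.isclose(np.linalg.norm(v) ** 2, np.sum(alpha ** 2)))
```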

Here are some interesting questions:

  • Can a basis in an inner product space be converted into an orthonormal basis?

  • Does a finite dimensional inner product space have an orthonormal basis?

  • Does every finite dimensional subspace of an inner product space have an orthonormal basis?

The answer to these questions is yes. We provide a constructive answer by the Gram-Schmidt algorithm described in the next section.

4.5.8. The Gram-Schmidt Algorithm

The Gram-Schmidt algorithm (described below) constructs, from an arbitrary basis, an orthonormal basis for the span of that basis.

Algorithm 4.1 (The Gram-Schmidt algorithm)

Inputs: $v_1, v_2, \dots, v_n$, a set of linearly independent vectors

Outputs: $e_1, e_2, \dots, e_n$, a set of orthonormal vectors

  1. $w_1 = v_1$.

  2. $e_1 = \frac{w_1}{\| w_1 \|}$.

  3. For $j = 2, \dots, n$:

    1. $w_j = v_j - \sum_{i=1}^{j-1} \langle v_j, e_i \rangle e_i$.

    2. $e_j = \frac{w_j}{\| w_j \|}$.
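
The following is a direct NumPy translation of Algorithm 4.1 for the real case (a minimal sketch, not optimized for numerical stability; in practice one would typically use a QR factorization instead).

```python
import numpy as np

def gram_schmidt(V):
    """Orthonormalize the columns of V (assumed linearly independent).

    Returns E whose columns e_1, ..., e_n are orthonormal and span
    the same subspace as the columns of V.
    """
    n = V.shape[1]
    E = np.zeros_like(V, dtype=float)
    for j in range(n):
        # w_j = v_j - sum_{i<j} <v_j, e_i> e_i
        w = V[:, j] - E[:, :j] @ (E[:, :j].T @ V[:, j])
        E[:, j] = w / np.linalg.norm(w)
    return E

rng = np.random.default_rng(5)
V = rng.normal(size=(5, 3))                # three independent vectors in R^5
E = gram_schmidt(V)
print(np.allclose(E.T @ E, np.eye(3)))     # the columns are orthonormal
```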

Theorem 4.81 (Justification for Gram-Schmidt algorithm)

Let $v_1, v_2, \dots, v_n$ be linearly independent. The Gram-Schmidt algorithm described above generates a set of orthonormal vectors.

Moreover, for each $j = 1, \dots, n$, the set $\{e_1, \dots, e_j\}$ is an orthonormal basis for the subspace $\operatorname{span} \{v_1, \dots, v_j\}$.

Proof. We prove this by mathematical induction. Consider the base case $j = 1$.

  1. $w_1 = v_1$.

  2. $e_1 = \frac{w_1}{\| w_1 \|} = \frac{v_1}{\| v_1 \|}$.

  3. Thus, $\| e_1 \| = 1$.

  4. $\operatorname{span} \{e_1\} = \operatorname{span} \{v_1\}$ because $e_1$ is a nonzero scalar multiple of $v_1$.

Now, assume that the set $\{e_1, \dots, e_{j-1}\}$ is an orthonormal basis for $\operatorname{span} \{v_1, \dots, v_{j-1}\}$.

  1. Thus, $\operatorname{span} \{e_1, \dots, e_{j-1}\} = \operatorname{span} \{v_1, \dots, v_{j-1}\}$.

  2. Since $v_j$ is linearly independent from $v_1, \dots, v_{j-1}$, hence $v_j \notin \operatorname{span} \{v_1, \dots, v_{j-1}\}$.

  3. Thus, $v_j \notin \operatorname{span} \{e_1, \dots, e_{j-1}\}$.

  4. Hence, $w_j = v_j - \sum_{i=1}^{j-1} \langle v_j, e_i \rangle e_i \neq 0$. If it were $0$, then $v_j$ would be linearly dependent on $e_1, \dots, e_{j-1}$.

  5. Thus, $\| w_j \| > 0$.

  6. Thus, $e_j = \frac{w_j}{\| w_j \|}$ is well-defined.

  7. Also, $\| e_j \| = 1$ by construction; thus, $e_j$ is unit-norm.

  8. Note that $w_j$ is orthogonal to $e_1, \dots, e_{j-1}$. For any $1 \leq k < j$, we have:

    $$\langle w_j, e_k \rangle = \left\langle v_j - \sum_{i=1}^{j-1} \langle v_j, e_i \rangle e_i, \; e_k \right\rangle = \langle v_j, e_k \rangle - \sum_{i=1}^{j-1} \langle v_j, e_i \rangle \langle e_i, e_k \rangle = \langle v_j, e_k \rangle - \langle v_j, e_k \rangle \langle e_k, e_k \rangle = \langle v_j, e_k \rangle - \langle v_j, e_k \rangle = 0$$

    since $e_1, \dots, e_{j-1}$ are orthonormal.

  9. Thus, for any $1 \leq k < j$:

    $$\langle e_j, e_k \rangle = \left\langle \frac{w_j}{\| w_j \|}, e_k \right\rangle = \frac{\langle w_j, e_k \rangle}{\| w_j \|} = 0.$$

  10. Thus, $e_j$ is orthogonal to $e_1, \dots, e_{j-1}$.

  11. Since all of them are unit norm, $e_1, \dots, e_{j-1}, e_j$ are indeed orthonormal.

We also need to show that $\operatorname{span} \{e_1, \dots, e_j\} = \operatorname{span} \{v_1, \dots, v_j\}$.

  1. Note that $w_j \in \operatorname{span} \{v_j, e_1, \dots, e_{j-1}\} = \operatorname{span} \{v_1, \dots, v_j\}$ since $\operatorname{span} \{e_1, \dots, e_{j-1}\} = \operatorname{span} \{v_1, \dots, v_{j-1}\}$ by the inductive hypothesis.

  2. Thus, $e_j \in \operatorname{span} \{v_1, \dots, v_j\}$ since $e_j$ is just a scaled $w_j$.

  3. Thus, $\operatorname{span} \{e_1, \dots, e_j\} \subseteq \operatorname{span} \{v_1, \dots, v_j\}$.

  4. For the converse, by definition $v_j = w_j + \sum_{i=1}^{j-1} \langle v_j, e_i \rangle e_i$.

  5. Hence, $v_j \in \operatorname{span} \{w_j, e_1, \dots, e_{j-1}\} = \operatorname{span} \{e_1, \dots, e_j\}$.

  6. Thus, $\operatorname{span} \{v_1, \dots, v_j\} \subseteq \operatorname{span} \{e_1, \dots, e_j\}$.

  7. Thus, $\operatorname{span} \{e_1, \dots, e_j\} = \operatorname{span} \{v_1, \dots, v_j\}$ must be true.

Theorem 4.82 (Existence of orthonormal basis)

Every finite dimensional inner product space has an orthonormal basis.

Proof. This is a simple application of the Gram-Schmidt algorithm.

  1. Every finite dimensional vector space has a finite basis.

  2. Every finite basis can be turned into an orthonormal basis by the Gram-Schmidt algorithm.

  3. Thus, we have an orthonormal basis.

Corollary 4.13

Every finite dimensional subspace of an inner product space has an orthonormal basis.

4.5.9. Orthogonal Complements

Definition 4.80 (Orthogonal complement)

Let $S$ be a subset of an inner product space $V$. The orthogonal complement of $S$ is the set of all vectors in $V$ that are orthogonal to every element of $S$. It is denoted by $S^{\perp}$.

$$S^{\perp} \triangleq \{ v \in V \;|\; v \perp s \; \forall s \in S \}.$$

Definition 4.81 (Orthogonal complement of a vector)

Let $a \in V$. The orthogonal complement of $a$ is the set of all vectors in $V$ that are orthogonal to $a$. It is denoted by $a^{\perp}$.

$$a^{\perp} \triangleq \{ v \in V \;|\; v \perp a \}.$$

Observation 4.2

$a^{\perp}$ is just a notational convenience.

$$a^{\perp} = \{a\}^{\perp} = (\operatorname{span} \{a\})^{\perp}.$$

Theorem 4.83 (Orthogonal complement is a linear subspace)

If $V$ is an inner product space and $S \subseteq V$, then $S^{\perp}$ is a subspace.

Proof. To verify that $S^{\perp}$ is a subspace, we need to check the following.

  1. It contains the zero vector.

  2. It is closed under vector addition.

  3. It is closed under scalar multiplication.

We proceed as follows:

  1. $\langle 0, s \rangle = 0$ holds for any $s \in S$. Thus, $0 \in S^{\perp}$.

  2. Let $u, v \in S^{\perp}$. Then,

    1. $\langle u, s \rangle = 0$ and $\langle v, s \rangle = 0$ for every $s \in S$.

    2. Thus, $\langle u + v, s \rangle = \langle u, s \rangle + \langle v, s \rangle = 0 + 0 = 0$ for every $s \in S$.

    3. Thus, $u + v \in S^{\perp}$.

  3. Similarly, if $v \in S^{\perp}$, then $\langle \alpha v, s \rangle = \alpha \langle v, s \rangle = 0$ for every $s \in S$.

Thus, $S^{\perp}$ is a subspace of $V$.

Observation 4.3

The orthogonal complement of the inner product space $V$ is its trivial subspace containing just the zero vector.

$$V^{\perp} = \{ 0 \}.$$

Theorem 4.84 (Orthogonal complement and basis)

If $S$ is a subspace of $V$, then to show that some vector $u \in S^{\perp}$, it is sufficient to show that $u$ is orthogonal to all the vectors in some basis of $S$.

Specifically, if $S$ is a finite dimensional subspace of $V$ and $B = \{v_1, \dots, v_m\}$ is a basis for $S$, then

$$S^{\perp} = \{ x \;|\; x \perp v_i, \; i = 1, \dots, m \}.$$

Proof. Let $B$ be a basis for $S$ (finite or infinite).

Then, any $s \in S$ can be written as a finite linear combination

$$s = \sum_{i=1}^p \alpha_i e_i$$

where $\alpha_i \in \mathbb{F}$ and $e_i \in B$.

Now, if $u$ is orthogonal to every vector in $B$, then

$$\langle s, u \rangle = \sum_{i=1}^p \alpha_i \langle e_i, u \rangle = 0.$$

Thus, $u \perp s$. Since $s$ was arbitrarily chosen from $S$, hence $u \in S^{\perp}$.

Now, assume $S$ to be finite dimensional and $B = \{v_1, \dots, v_m\}$ to be a basis of $S$. Let

$$T = \{ x \;|\; x \perp v_i, \; i = 1, \dots, m \}.$$

We first show that $S^{\perp} \subseteq T$.

  1. Let $v \in S^{\perp}$.

  2. Then, $v \perp s$ for every $s \in S$.

  3. In particular, $v \perp v_i$ for $i = 1, \dots, m$ since $B \subseteq S$.

  4. Thus, $v \in T$.

  5. Thus, $S^{\perp} \subseteq T$.

We next show that $T \subseteq S^{\perp}$.

  1. Let $x \in T$.

  2. Then, $x \perp v_i$ for $i = 1, \dots, m$.

  3. But then, for any $s \in S$

    $$\langle s, x \rangle = \left\langle \sum_{i=1}^m t_i v_i, x \right\rangle = \sum_{i=1}^m t_i \langle v_i, x \rangle = 0$$

    since $s = \sum_{i=1}^m t_i v_i$ is a linear combination of $B$.

  4. Thus, $x \perp s$ for every $s \in S$.

  5. Thus, $x \in S^{\perp}$.

  6. Thus, $T \subseteq S^{\perp}$.

Combining:

$$S^{\perp} = T = \{ x \;|\; x \perp v_i, \; i = 1, \dots, m \}.$$

Theorem 4.85 (Orthogonal decomposition)

Let $V$ be an inner product space and $S$ be a finite dimensional subspace of $V$. Then, every $v \in V$ can be written uniquely in the form:

$$v = v_{\parallel} + v_{\perp}$$

where $v_{\parallel} \in S$ and $v_{\perp} \in S^{\perp}$.

Proof. Let $e_1, \dots, e_p$ be an orthonormal basis for $S$.

Define:

$$v_{\parallel} \triangleq \sum_{i=1}^p \langle v, e_i \rangle e_i. \tag{4.3}$$

And

$$v_{\perp} = v - v_{\parallel}.$$

By construction, $v_{\parallel} \in \operatorname{span} \{e_1, \dots, e_p\} = S$.

Now, for every $1 \leq i \leq p$:

$$\langle v_{\perp}, e_i \rangle = \langle v - v_{\parallel}, e_i \rangle = \langle v, e_i \rangle - \langle v_{\parallel}, e_i \rangle = \langle v, e_i \rangle - \langle v, e_i \rangle = 0.$$

Thus, $v_{\perp} \in S^{\perp}$.

We have shown the existence of a decomposition of a vector $v$ into components belonging to $S$ and $S^{\perp}$. Next, we need to show that the decomposition is unique.

For contradiction, assume there was another decomposition:

$$v = u_{\parallel} + u_{\perp}$$

such that $u_{\parallel} \in S$ and $u_{\perp} \in S^{\perp}$.

Then,

$$v_{\parallel} + v_{\perp} = v = u_{\parallel} + u_{\perp}$$

gives us:

$$w = v_{\parallel} - u_{\parallel} = u_{\perp} - v_{\perp}.$$

Thus, $w \in S$ as well as $w \in S^{\perp}$. But then, $w \perp w$ giving us:

$$\langle w, w \rangle = \langle v_{\parallel} - u_{\parallel}, v_{\parallel} - u_{\parallel} \rangle = \| v_{\parallel} - u_{\parallel} \|^2 = 0.$$

This is possible only if $v_{\parallel} - u_{\parallel} = 0$; thus, $v_{\parallel} = u_{\parallel}$. Consequently, $u_{\perp} = v_{\perp}$ too.

Thus,

$$v = v_{\parallel} + v_{\perp}$$

is a unique decomposition.
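
The construction in the proof translates directly into code: with an orthonormal basis of $S$ at hand, $v_{\parallel}$ is a sum of inner products and $v_{\perp}$ is the remainder (a sketch assuming NumPy; the basis comes from a reduced QR factorization).

```python
import numpy as np

rng = np.random.default_rng(6)
# Columns of E: an orthonormal basis of a 2-dimensional subspace S of R^5.
E, _ = np.linalg.qr(rng.normal(size=(5, 2)))

v = rng.normal(size=5)
v_par = E @ (E.T @ v)       # v_parallel = sum_i <v, e_i> e_i, lies in S
v_perp = v - v_par          # v_perp lies in S^perp

print(np.allclose(v_par + v_perp, v))    # the decomposition reconstructs v
print(np.allclose(E.T @ v_perp, 0))      # v_perp is orthogonal to the basis of S
```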

Corollary 4.14 (Intersection between a subspace and its complement)

If $S$ is a finite dimensional subspace of an inner product space $V$, then

$$S \cap S^{\perp} = \{ 0 \}.$$

In other words, the only vector common between $S$ and its orthogonal complement is the zero vector.

Theorem 4.86 (Vector space as direct sum)

If $S$ is a finite dimensional subspace of an inner product space $V$, then

$$V = S \oplus S^{\perp}.$$

In other words, $V$ is a direct sum of $S$ and its orthogonal complement.

Proof. From Corollary 4.14, the intersection between $S$ and $S^{\perp}$ is the zero vector. Thus, by Definition 4.30, the direct sum $S \oplus S^{\perp}$ between the two spaces is well defined.

By Theorem 4.85, every vector $v \in V$ can be uniquely decomposed as

$$v = v_{\parallel} + v_{\perp}$$

where $v_{\parallel} \in S$ and $v_{\perp} \in S^{\perp}$.

Thus, $V \subseteq S \oplus S^{\perp}$.

However, since both $S$ and $S^{\perp}$ are subspaces of $V$, we also have $S \oplus S^{\perp} \subseteq V$. Hence

$$V = S \oplus S^{\perp}.$$

Theorem 4.87 (Dimension of vector space as direct sum)

Let $V$ be a finite dimensional inner product space. If $S$ is a subspace of $V$, then

$$\dim V = \dim S + \dim S^{\perp}.$$

Proof. Since $V$ is finite dimensional, both $S$ and $S^{\perp}$ are finite dimensional subspaces of $V$.

By Theorem 4.86,

$$V = S \oplus S^{\perp}.$$

Then, due to Theorem 4.20,

$$\dim V = \dim S + \dim S^{\perp}.$$

Theorem 4.88 (Orthogonal complement of orthogonal complement)

Let $V$ be a finite dimensional inner product space. Let $S$ be a subspace of $V$ and let $S^{\perp}$ be its orthogonal complement. Then

$$(S^{\perp})^{\perp} = S.$$

In other words, in a finite dimensional space, the orthogonal complement of the orthogonal complement is the original subspace itself.

Note that this result is valid only for finite dimensional spaces since in that case both $S$ and $S^{\perp}$ are finite dimensional.

Proof. Since $V$ is finite dimensional, both $S$ and $S^{\perp}$ are finite dimensional.

$$(S^{\perp})^{\perp} = \{ v \in V \;|\; u \perp v \; \forall u \in S^{\perp} \}.$$

We shall first show that $S \subseteq (S^{\perp})^{\perp}$.

  1. Let $s \in S$.

  2. Then, by definition, $s \perp u \; \forall u \in S^{\perp}$.

  3. Thus, $s \in (S^{\perp})^{\perp}$.

  4. Thus, $S \subseteq (S^{\perp})^{\perp}$.

We now show that $(S^{\perp})^{\perp} \subseteq S$.

  1. Let $u \in (S^{\perp})^{\perp}$.

  2. By Theorem 4.86, $V = S \oplus S^{\perp}$ since $S$ is a finite dimensional subspace of $V$.

  3. Thus, $u = v + w$ such that $v \in S$ and $w \in S^{\perp}$.

  4. Since $u - v = w$, hence $u - v \in S^{\perp}$.

  5. We have already shown above that $S \subseteq (S^{\perp})^{\perp}$. Hence $v \in (S^{\perp})^{\perp}$.

  6. Thus, $u - v = w \in (S^{\perp})^{\perp}$ since both $u$ and $v$ belong to $(S^{\perp})^{\perp}$.

  7. Thus, $u - v \in (S^{\perp})^{\perp} \cap S^{\perp}$ as $w \in S^{\perp}$ by the orthogonal decomposition above.

  8. But, by Corollary 4.14, $S^{\perp} \cap (S^{\perp})^{\perp} = \{ 0 \}$ since $(S^{\perp})^{\perp}$ is the orthogonal complement of $S^{\perp}$ and $S^{\perp}$ is finite dimensional.

  9. Thus, $u - v = 0$.

  10. Thus, $u = v$.

  11. Thus, $u \in S$.

  12. Since $u$ was an arbitrary element of $(S^{\perp})^{\perp}$, hence $(S^{\perp})^{\perp} \subseteq S$.

Combining the two:

$$(S^{\perp})^{\perp} = S.$$

Theorem 4.89 ($n-1$ dimensional subspaces)

Let $V$ be a finite dimensional inner product space with $\dim V = n$. Let $S$ be an $n-1$ dimensional subspace of $V$. Then, there exists a nonzero vector $b \in V$ such that

$$S = \{ x \in V \;|\; x \perp b \}.$$

In other words, the $n-1$ dimensional subspaces are the sets of the form $\{ x \;|\; x \perp b \}$ where $b \neq 0$.

Proof. Let $S$ be $n-1$ dimensional. Then, from Theorem 4.87,

$$\dim V = \dim S + \dim S^{\perp}.$$

This gives us $\dim S^{\perp} = n - (n - 1) = 1$.

Since $S^{\perp}$ is one dimensional, we can choose a non-zero vector $b \in S^{\perp}$ as its basis. Since $V$ is finite dimensional, hence

$$S = (S^{\perp})^{\perp}.$$

Thus, by Theorem 4.84, $S$ consists of the vectors which are orthogonal to a basis of $S^{\perp}$. Thus,

$$S = \{ x \in V \;|\; x \perp b \}.$$

4.5.10. Orthogonal Projection

Recall that a projection operator $P : V \to V$ is an operator which satisfies $P^2 = P$.

The range of $P$ is given by

$$R(P) = \{ v \in V \;|\; v = P x \text{ for some } x \in V \}.$$

The null space of $P$ is given by

$$N(P) = \{ v \in V \;|\; P v = 0 \}.$$

Definition 4.82 (Orthogonal projection operator)

A projection operator $P : V \to V$ over an inner product space $V$ is called an orthogonal projection operator if its range $R(P)$ and null space $N(P)$ as defined above are orthogonal to each other; i.e.

$$r \perp n \quad \forall r \in R(P), \; \forall n \in N(P).$$

Theorem 4.90 (Orthogonal projection operator for a subspace)

Let $S$ be a finite dimensional subspace of $V$. Let $\{e_1, \dots, e_p\}$ be an orthonormal basis of $S$. Let the operator $P_S : V \to V$ be defined as:

$$P_S v \triangleq v_{\parallel}$$

where

$$v = v_{\parallel} + v_{\perp}$$

is the unique orthogonal decomposition of $v$ w.r.t. the subspace $S$ as defined in Theorem 4.85. Then,

  1. $P_S v = \sum_{i=1}^p \langle v, e_i \rangle e_i$.

  2. For any $v \in V$, $v - P_S v \in S^{\perp}$.

  3. $P_S$ is a linear map.

  4. $P_S$ is the identity map when restricted to $S$; i.e., $P_S s = s \; \forall s \in S$.

  5. $R(P_S) = S$.

  6. $N(P_S) = S^{\perp}$.

  7. $P_S^2 = P_S$.

  8. For any $v \in V$, $\| P_S v \| \leq \| v \|$.

  9. For any $v \in V$ and $s \in S$:

    $$\| v - P_S v \| \leq \| v - s \|$$

    with equality if and only if $s = P_S v$.

$P_S$ is indeed an orthogonal projection onto $S$.

Proof. For the sake of brevity, we abbreviate $P = P_S$.

Following (4.3), indeed:

$$P v = \sum_{i=1}^p \langle v, e_i \rangle e_i.$$

For any $v \in V$ (due to Theorem 4.85):

$$v - P v = v - v_{\parallel} = v_{\perp}.$$

Since $v_{\perp} \in S^{\perp}$, hence $v - P v \in S^{\perp}$.

[Linear map]

  1. Let $u, v \in V$.

  2. Let $u = u_{\parallel} + u_{\perp}$ and $v = v_{\parallel} + v_{\perp}$.

  3. Consider $u + v = (u_{\parallel} + v_{\parallel}) + (u_{\perp} + v_{\perp})$.

  4. Then, $u_{\parallel} + v_{\parallel} \in S$ and $u_{\perp} + v_{\perp} \in S^{\perp}$.

  5. Since the orthogonal decomposition is unique, hence $P(u + v) = u_{\parallel} + v_{\parallel} = P u + P v$.

  6. Similarly, for $\alpha \in \mathbb{F}$, $\alpha u = \alpha u_{\parallel} + \alpha u_{\perp}$.

  7. With $\alpha u_{\parallel} \in S$ and $\alpha u_{\perp} \in S^{\perp}$, $P(\alpha u) = \alpha u_{\parallel} = \alpha P u$.

Thus, $P$ is a linear map.

For any $s \in S$, we can write it as $s = s + 0$. With $s \in S$ and $0 \in S^{\perp}$, we have: $P s = s$.

[Range]

  1. Since $P$ maps $v$ to a component in $S$, hence $R(P) \subseteq S$.

  2. Since for every $s \in S$, there is $v \in S$ such that $P v = s$ (specifically $v = s$), hence $S \subseteq R(P)$.

  3. Combining, $R(P) = S$.

[Null space]

  1. Let $v \in N(P)$. Write $v = v_{\parallel} + v_{\perp}$.

  2. Then, $P v = v_{\parallel} = 0$ as $v$ is in the null space of $P$.

  3. Hence, $v = v_{\perp} \in S^{\perp}$.

  4. Thus, $N(P) \subseteq S^{\perp}$.

  5. Now, let $v \in S^{\perp}$.

  6. We can write $v$ as $v = 0 + v$ where $0 \in S$ and $v \in S^{\perp}$.

  7. Thus, $P v = 0$.

  8. Thus, $S^{\perp} \subseteq N(P)$.

  9. Combining, $S^{\perp} = N(P)$.

[$P^2 = P$]

  1. For any $v \in V$, we have $P v = v_{\parallel}$.

  2. Since $v_{\parallel} \in S$, hence $P v_{\parallel} = v_{\parallel}$.

  3. Thus, $P^2 v = P v_{\parallel} = v_{\parallel} = P v$.

  4. Since $v$ was arbitrary, hence $P^2 = P$.

[$\| P v \| \leq \| v \|$]

  1. We have $v = v_{\parallel} + v_{\perp} = P v + v_{\perp}$.

  2. By the Pythagoras theorem: $\| v \|^2 = \| P v \|^2 + \| v_{\perp} \|^2$.

  3. Thus, $\| v \|^2 \geq \| P v \|^2$.

  4. Taking the square root on both sides: $\| P v \| \leq \| v \|$.

[$\| v - P v \| \leq \| v - s \|$]

  1. Let $v \in V$ and $s \in S$.

  2. Note that $P v \in S$ hence $P v - s \in S$.

  3. By definition $v - P v \in S^{\perp}$.

  4. Thus, $v - P v \perp P v - s$.

  5. We have: $v - s = (v - P v) + (P v - s)$.

  6. Applying the Pythagoras theorem:

    $$\| v - s \|^2 = \| v - P v \|^2 + \| P v - s \|^2 \geq \| v - P v \|^2.$$

  7. Taking the square root on both sides:

    $$\| v - P v \| \leq \| v - s \|.$$

  8. Equality holds if and only if $\| P v - s \|^2 = 0$, if and only if $P v = s$.

In order to show that $P$ is an orthogonal projection, we need to show that:

  1. $P$ is a projection operator.

  2. $r \perp n \; \forall r \in R(P), \; \forall n \in N(P)$.

We have shown that:

  1. $P^2 = P$. Hence $P$ is a projection operator.

  2. $R(P) = S$ and $N(P) = S^{\perp}$.

  3. By definition, for any $r \in S$ and $n \in S^{\perp}$, $r \perp n$.

  4. Thus, $P$ is an orthogonal projection operator.

Theorem 4.91 (Orthogonal projectors are self-adjoint)

A projection operator is orthogonal if and only if it is self-adjoint.

Example 4.26 (Orthogonal projection on a line)

Consider a unit norm vector $u \in \mathbb{R}^N$.
Thus $u^T u = 1$.

Consider

$$P_u = u u^T.$$

Now

$$P_u^2 = (u u^T)(u u^T) = u (u^T u) u^T = u u^T = P_u.$$

Thus $P_u$ is a projection operator.

Now,

$$P_u^T = (u u^T)^T = u u^T = P_u.$$

Thus $P_u$ is self-adjoint. Hence, $P_u$ is an orthogonal projection operator.

Also,

$$P_u u = (u u^T) u = u (u^T u) = u.$$

Thus $P_u$ leaves $u$ intact; i.e., the projection of $u$ onto $u$ is $u$ itself.

Let $v \perp u$, i.e. $\langle u, v \rangle = 0$.

Then,

$$P_u v = (u u^T) v = u (u^T v) = \langle u, v \rangle u = 0.$$

Thus $P_u$ annihilates all vectors orthogonal to $u$.

Any vector $x \in \mathbb{R}^N$ can be broken down into two components

$$x = x_{\parallel} + x_{\perp}$$

such that $\langle u, x_{\perp} \rangle = 0$ and $x_{\parallel}$ is collinear with $u$.

Then,

$$P_u x = u u^T x_{\parallel} + u u^T x_{\perp} = x_{\parallel}.$$

Thus $P_u$ retains the projection of $x$ on $u$, given by $x_{\parallel}$.
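
A concrete check in $\mathbb{R}^2$ (a sketch assuming NumPy) of the properties listed in this example.

```python
import numpy as np

u = np.array([3.0, 4.0]) / 5.0         # unit norm vector
P_u = np.outer(u, u)                   # P_u = u u^T

print(np.allclose(P_u @ P_u, P_u))     # idempotent: a projection
print(np.allclose(P_u.T, P_u))         # symmetric: an orthogonal projection
print(np.allclose(P_u @ u, u))         # leaves u intact

v = np.array([-4.0, 3.0]) / 5.0        # v is orthogonal to u
print(np.allclose(P_u @ v, 0))         # annihilates vectors orthogonal to u

x = np.array([1.0, 2.0])
print(P_u @ x, (u @ x) * u)            # both equal the component of x along u
```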

Example 4.27 (Projections over the column space of a matrix)

Let $A \in \mathbb{R}^{M \times N}$ with $N \leq M$ be a matrix given by

$$A = \begin{bmatrix} a_1 & a_2 & \dots & a_N \end{bmatrix}$$

where $a_i \in \mathbb{R}^M$ are its columns, which are linearly independent.

The column space of $A$ is given by

$$C(A) = \{ A x \;|\; x \in \mathbb{R}^N \} \subseteq \mathbb{R}^M.$$

It can be shown that $A^T A$ is invertible.

Consider the operator

$$P_A = A (A^T A)^{-1} A^T.$$

Now,

$$P_A^2 = A (A^T A)^{-1} A^T A (A^T A)^{-1} A^T = A (A^T A)^{-1} A^T = P_A.$$

Thus $P_A$ is a projection operator.

$$P_A^T = \left( A (A^T A)^{-1} A^T \right)^T = A \left( (A^T A)^{-1} \right)^T A^T = A (A^T A)^{-1} A^T = P_A.$$

Thus $P_A$ is self-adjoint.

Hence $P_A$ is an orthogonal projection operator onto the column space of $A$.
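
A numerical illustration (a sketch assuming NumPy): the explicit inverse is used only to mirror the formula above; in practice one would use a QR factorization or `numpy.linalg.lstsq` instead.

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.normal(size=(6, 3))                    # 3 independent columns in R^6

P_A = A @ np.linalg.inv(A.T @ A) @ A.T         # P_A = A (A^T A)^{-1} A^T

print(np.allclose(P_A @ P_A, P_A))             # idempotent
print(np.allclose(P_A.T, P_A))                 # symmetric

x = rng.normal(size=3)
print(np.allclose(P_A @ (A @ x), A @ x))       # identity on the column space C(A)

v = rng.normal(size=6)
print(np.allclose(A.T @ (v - P_A @ v), 0))     # residual is orthogonal to C(A)
```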

4.5.11. Parallelogram Identity

Theorem 4.92 (Parallelogram identity)

$$2 \| x \|^2 + 2 \| y \|^2 = \| x + y \|^2 + \| x - y \|^2 \quad \forall x, y \in V.$$

Proof. Expanding:

$$\| x + y \|^2 = \langle x + y, x + y \rangle = \langle x, x \rangle + \langle y, y \rangle + \langle x, y \rangle + \langle y, x \rangle.$$

Also:

$$\| x - y \|^2 = \langle x - y, x - y \rangle = \langle x, x \rangle + \langle y, y \rangle - \langle x, y \rangle - \langle y, x \rangle.$$

Thus,

$$\| x + y \|^2 + \| x - y \|^2 = 2 (\langle x, x \rangle + \langle y, y \rangle) = 2 \| x \|^2 + 2 \| y \|^2.$$

When the inner product is real valued, the following identity is quite useful.

Theorem 4.93 (Parallelogram identity for real inner product)

$$\langle x, y \rangle = \frac{1}{4} \left( \| x + y \|^2 - \| x - y \|^2 \right) \quad \forall x, y \in V.$$

Proof. Expanding:

$$\| x + y \|^2 = \langle x + y, x + y \rangle = \langle x, x \rangle + \langle y, y \rangle + \langle x, y \rangle + \langle y, x \rangle.$$

Also,

$$\| x - y \|^2 = \langle x - y, x - y \rangle = \langle x, x \rangle + \langle y, y \rangle - \langle x, y \rangle - \langle y, x \rangle.$$

Thus,

$$\| x + y \|^2 - \| x - y \|^2 = 2 (\langle x, y \rangle + \langle y, x \rangle) = 4 \langle x, y \rangle$$

since for real inner products

$$\langle x, y \rangle = \langle y, x \rangle.$$

4.5.12. Polarization Identity

When the inner product is complex valued, the polarization identity is quite useful.

Theorem 4.94 (Polarization identity for complex inner product)

$$\langle x, y \rangle = \frac{1}{4} \left( \| x + y \|^2 - \| x - y \|^2 + i \| x + i y \|^2 - i \| x - i y \|^2 \right) \quad \forall x, y \in V.$$

Proof. Expanding

$$\| x + y \|^2 = \langle x + y, x + y \rangle = \langle x, x \rangle + \langle y, y \rangle + \langle x, y \rangle + \langle y, x \rangle.$$

Also,

$$\| x - y \|^2 = \langle x - y, x - y \rangle = \langle x, x \rangle + \langle y, y \rangle - \langle x, y \rangle - \langle y, x \rangle.$$

And,

$$\| x + i y \|^2 = \langle x + i y, x + i y \rangle = \langle x, x \rangle + \langle i y, i y \rangle + \langle x, i y \rangle + \langle i y, x \rangle.$$

And,

$$\| x - i y \|^2 = \langle x - i y, x - i y \rangle = \langle x, x \rangle + \langle i y, i y \rangle - \langle x, i y \rangle - \langle i y, x \rangle.$$

Thus,

$$\begin{aligned}
\| x + y \|^2 - \| x - y \|^2 + i \| x + i y \|^2 - i \| x - i y \|^2
&= 2 \langle x, y \rangle + 2 \langle y, x \rangle + 2 i \langle x, i y \rangle + 2 i \langle i y, x \rangle \\
&= 2 \langle x, y \rangle + 2 \langle y, x \rangle + 2 \langle x, y \rangle - 2 \langle y, x \rangle \\
&= 4 \langle x, y \rangle
\end{aligned}$$

using $\langle x, i y \rangle = -i \langle x, y \rangle$ and $\langle i y, x \rangle = i \langle y, x \rangle$.
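
A quick numerical check of the identity for the standard complex inner product, which is linear in the first argument as assumed throughout this section (a sketch assuming NumPy; `ip` and `nsq` are our helper names).

```python
import numpy as np

def ip(x, y):
    # Standard complex inner product, linear in the first argument.
    return np.sum(x * np.conj(y))

def nsq(x):
    # Squared induced norm ||x||^2 = <x, x>.
    return ip(x, x).real

rng = np.random.default_rng(8)
x = rng.normal(size=4) + 1j * rng.normal(size=4)
y = rng.normal(size=4) + 1j * rng.normal(size=4)

rhs = 0.25 * (nsq(x + y) - nsq(x - y)
              + 1j * nsq(x + 1j * y) - 1j * nsq(x - 1j * y))
print(np.allclose(ip(x, y), rhs))              # the polarization identity holds
```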