4.10. Singular Values

In the previous section we saw that diagonalization of square matrices leads to the eigenvalue decomposition. This matrix factorization is very useful, yet it is not applicable in all situations. In particular, the eigenvalue decomposition is of no use if the matrix is not diagonalizable, or if it is not square at all. Moreover, the decomposition is particularly convenient only for real symmetric or Hermitian matrices, where the diagonalizing matrix is $\mathbb{F}$-unitary (see Definition 4.111). Otherwise, one has to work with the inverse of the diagonalizing matrix as well.

Fortunately, there is another decomposition which applies to all matrices and involves only $\mathbb{F}$-unitary matrices.

4.10.1. Singular value

Definition 4.131 (Singular value)

A non-negative real number $\sigma$ is a singular value for a matrix $A \in \mathbb{F}^{m \times n}$ if and only if there exist unit-length vectors $u \in \mathbb{F}^m$ and $v \in \mathbb{F}^n$ such that

$$A v = \sigma u$$

and

$$A^H u = \sigma v$$

hold. The vectors $u$ and $v$ are called left-singular and right-singular vectors for $\sigma$ respectively.

We first present the basic result of the singular value decomposition. We will not prove this result in full, though we will present proofs of some aspects.

Theorem 4.138 (Existence of singular values)

For every $A \in \mathbb{F}^{m \times n}$ with $k = \min(m, n)$, there exist two $\mathbb{F}$-unitary matrices $U \in \mathbb{F}^{m \times m}$ and $V \in \mathbb{F}^{n \times n}$ and a sequence of real numbers

$$\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_k \geq 0$$

such that

$$U^H A V = \Sigma \tag{4.14}$$

where

$$\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \dots, \sigma_k) \in \mathbb{F}^{m \times n}.$$

The non-negative real numbers $\sigma_i$ are the singular values of $A$ as per Definition 4.131.

The sequence of real numbers $\sigma_i$ does not depend on the particular choice of $U$ and $V$.

$\Sigma$ is rectangular with the same size as $A$. The singular values of $A$ lie on the principal diagonal of $\Sigma$. All other entries in $\Sigma$ are zero.

It is certainly possible that some of the singular values are themselves 0.

Remark 4.22

Since $U^H A V = \Sigma$, we have

$$A = U \Sigma V^H. \tag{4.15}$$

Definition 4.132 (Singular value decomposition)

The decomposition of a matrix $A \in \mathbb{F}^{m \times n}$ given by

$$A = U \Sigma V^H \tag{4.16}$$

is known as its singular value decomposition.
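In NumPy this decomposition is computed by `numpy.linalg.svd`. A minimal sketch, with an arbitrary example matrix, reconstructing $A = U \Sigma V^H$ as in Theorem 4.138:

```python
import numpy as np

A = np.array([[2.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
m, n = A.shape
k = min(m, n)

# full_matrices=True yields square unitary U (m x m) and V (n x n).
U, s, Vh = np.linalg.svd(A, full_matrices=True)

# Build the rectangular Sigma with singular values on the principal diagonal.
Sigma = np.zeros((m, n))
Sigma[:k, :k] = np.diag(s)

print(np.allclose(A, U @ Sigma @ Vh))  # True: A = U Sigma V^H
```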

Remark 4.23

When $\mathbb{F}$ is $\mathbb{R}$, the decomposition simplifies to

$$U^T A V = \Sigma \tag{4.17}$$

and

$$A = U \Sigma V^T.$$

Remark 4.24

There can be at most $k = \min(m, n)$ distinct singular values of $A$.

Remark 4.25

We can also write

$$A V = U \Sigma. \tag{4.18}$$

Remark 4.26 (SVD as a sum of rank 1 matrices)

Let us expand:

$$A = U \Sigma V^H = \begin{bmatrix} u_1 & u_2 & \dots & u_m \end{bmatrix} [\sigma_{ij}] \begin{bmatrix} v_1^H \\ v_2^H \\ \vdots \\ v_n^H \end{bmatrix} = \sum_{i=1}^m \sum_{j=1}^n \sigma_{ij} u_i v_j^H.$$

Since $\sigma_{ij} = 0$ whenever $i \neq j$, this reduces to

$$A = \sum_{i=1}^k \sigma_i u_i v_i^H,$$

a sum of rank-1 matrices.

Remark 4.27

Alternatively, let us expand:

$$\Sigma = U^H A V = \begin{bmatrix} u_1^H \\ u_2^H \\ \vdots \\ u_m^H \end{bmatrix} A \begin{bmatrix} v_1 & v_2 & \dots & v_n \end{bmatrix} = [u_i^H A v_j].$$

This gives us

$$\sigma_{ij} = u_i^H A v_j.$$

The following lemma verifies that $\Sigma$ indeed consists of singular values of $A$ as per Definition 4.131.

Lemma 4.70

Let $A = U \Sigma V^H$ be a singular value decomposition of $A$. Then the main diagonal entries of $\Sigma$ are singular values of $A$. The first $k = \min(m, n)$ column vectors in $U$ and $V$ are left and right singular vectors of $A$.

Proof. We have

$$A V = U \Sigma.$$

1. Let us expand the R.H.S.:

   $$U \Sigma = \left[ \sum_{j=1}^m u_{ij} \sigma_{jk} \right] = [u_{ik} \sigma_k] = \begin{bmatrix} \sigma_1 u_1 & \sigma_2 u_2 & \dots & \sigma_k u_k & 0 & \dots & 0 \end{bmatrix}$$

   where the 0 columns at the end appear $n - k$ times.

2. Expanding the L.H.S. we get

   $$A V = \begin{bmatrix} A v_1 & A v_2 & \dots & A v_n \end{bmatrix}.$$

3. Thus by comparing both sides we get

   $$A v_i = \sigma_i u_i \text{ for } 1 \leq i \leq k$$

   and

   $$A v_i = 0 \text{ for } k < i \leq n.$$

4. Now let us start with

   $$A = U \Sigma V^H \implies A^H = V \Sigma^H U^H \implies A^H U = V \Sigma^H.$$

5. Let us expand the R.H.S.:

   $$V \Sigma^H = \left[ \sum_{j=1}^n v_{ij} \sigma_{jk} \right] = [v_{ik} \sigma_k] = \begin{bmatrix} \sigma_1 v_1 & \sigma_2 v_2 & \dots & \sigma_k v_k & 0 & \dots & 0 \end{bmatrix}$$

   where the 0 columns appear $m - k$ times.

6. Expanding the L.H.S. we get

   $$A^H U = \begin{bmatrix} A^H u_1 & A^H u_2 & \dots & A^H u_m \end{bmatrix}.$$

7. Thus by comparing both sides we get

   $$A^H u_i = \sigma_i v_i \text{ for } 1 \leq i \leq k$$

   and

   $$A^H u_i = 0 \text{ for } k < i \leq m.$$

We now consider the three cases.

1. For $m = n$, we have $k = m = n$, and we get

   $$A v_i = \sigma_i u_i, \quad A^H u_i = \sigma_i v_i \text{ for } 1 \leq i \leq m.$$

   Thus $\sigma_i$ is a singular value of $A$, and $u_i$ is a left singular vector while $v_i$ is a right singular vector.

2. For $m < n$, we have $k = m$. We get for the first $m$ vectors in $V$

   $$A v_i = \sigma_i u_i, \quad A^H u_i = \sigma_i v_i \text{ for } 1 \leq i \leq m.$$

   Finally, for the remaining $n - m$ vectors in $V$, we can write

   $$A v_i = 0.$$

   They belong to the null space of $A$.

3. For $m > n$, we have $k = n$. We get for the first $n$ vectors in $U$

   $$A v_i = \sigma_i u_i, \quad A^H u_i = \sigma_i v_i \text{ for } 1 \leq i \leq n.$$

   Finally, for the remaining $m - n$ vectors in $U$, we can write

   $$A^H u_i = 0.$$

Lemma 4.71

$\Sigma \Sigma^H$ is an $m \times m$ matrix given by

$$\Sigma \Sigma^H = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \dots, \sigma_k^2, 0, 0, \dots, 0)$$

where the number of 0's following $\sigma_k^2$ is $m - k$.

Lemma 4.72

$\Sigma^H \Sigma$ is an $n \times n$ matrix given by

$$\Sigma^H \Sigma = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \dots, \sigma_k^2, 0, 0, \dots, 0)$$

where the number of 0's following $\sigma_k^2$ is $n - k$.

Lemma 4.73 (Rank from singular value decomposition)

Let $A \in \mathbb{F}^{m \times n}$ have a singular value decomposition given by

$$A = U \Sigma V^H.$$

Then

$$r = \mathrm{rank}(A) = \mathrm{rank}(\Sigma).$$

In other words, the rank of $A$ is the number of nonzero singular values of $A$. Since the singular values in $\Sigma$ are ordered in descending order, the first $r$ singular values $\sigma_1, \dots, \sigma_r$ are the nonzero ones.

Proof. This is a straightforward application of Lemma 4.4 and Lemma 4.5. Further, since the only nonzero entries of $\Sigma$ appear on its main diagonal, its rank equals the number of nonzero singular values $\sigma_i$.
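Numerically, this is exactly how rank is usually computed: count the singular values above a small tolerance. A sketch (the tolerance rule mirrors common practice; the matrix is an arbitrary rank-1 example):

```python
import numpy as np

# Rank-1 example: the second row is twice the first.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

s = np.linalg.svd(A, compute_uv=False)

# Count singular values above a tolerance proportional to machine epsilon.
tol = max(A.shape) * np.finfo(float).eps * s.max()
rank = int(np.sum(s > tol))

print(rank)                      # 1
print(np.linalg.matrix_rank(A))  # 1 (uses the same SVD-based rule)
```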

Corollary 4.24

Let $r = \mathrm{rank}(A)$. Then $\Sigma$ can be split as a block matrix

$$\Sigma = \begin{bmatrix} \Sigma_r & O \\ O & O \end{bmatrix}$$

where $\Sigma_r = \mathrm{diag}(\sigma_1, \sigma_2, \dots, \sigma_r)$ is an $r \times r$ diagonal matrix of the nonzero singular values. All other sub-matrices in $\Sigma$ are zero matrices $O$.

Lemma 4.74 (Eigenvalues of the Gram matrix)

The eigenvalues of the Hermitian matrix $A^H A \in \mathbb{F}^{n \times n}$ are $\sigma_1^2, \sigma_2^2, \dots, \sigma_k^2, 0, 0, \dots, 0$ with $n - k$ 0's after $\sigma_k^2$. Moreover, the eigenvectors are the columns of $V$.

Proof. Expanding the Gram matrix:

$$A^H A = (U \Sigma V^H)^H U \Sigma V^H = V \Sigma^H U^H U \Sigma V^H = V \Sigma^H \Sigma V^H.$$

1. We note that $A^H A$ is Hermitian.

2. Hence $A^H A$ is diagonalized by $V$.

3. The diagonalization of $A^H A$ is $\Sigma^H \Sigma$.

4. Thus the eigenvalues of $A^H A$ are $\sigma_1^2, \sigma_2^2, \dots, \sigma_k^2, 0, 0, \dots, 0$ with $n - k$ 0's after $\sigma_k^2$.

5. Clearly

   $$(A^H A) V = V (\Sigma^H \Sigma).$$

6. Thus the columns of $V$ are the eigenvectors of $A^H A$.

Lemma 4.75 (Eigenvalues of the frame operator)

The eigenvalues of the Hermitian matrix $A A^H \in \mathbb{F}^{m \times m}$ are $\sigma_1^2, \sigma_2^2, \dots, \sigma_k^2, 0, 0, \dots, 0$ with $m - k$ 0's after $\sigma_k^2$. Moreover, the eigenvectors are the columns of $U$.

Proof. Expanding the frame operator:

$$A A^H = U \Sigma V^H (U \Sigma V^H)^H = U \Sigma V^H V \Sigma^H U^H = U \Sigma \Sigma^H U^H.$$

1. We note that $A A^H$ is Hermitian.

2. Hence $A A^H$ is diagonalized by $U$.

3. The diagonalization of $A A^H$ is $\Sigma \Sigma^H$.

4. Thus the eigenvalues of $A A^H$ are $\sigma_1^2, \sigma_2^2, \dots, \sigma_k^2, 0, 0, \dots, 0$ with $m - k$ 0's after $\sigma_k^2$.

5. Clearly

   $$(A A^H) U = U (\Sigma \Sigma^H).$$

6. Thus the columns of $U$ are the eigenvectors of $A A^H$.

Observation 4.8

The matrices $A A^H$ and $A^H A$ share the same eigenvalues except for some extra 0's. Their eigenvalues are the squares of the singular values of $A$ plus some extra 0's. In other words, the singular values of $A$ are the square roots of the nonzero eigenvalues of $A A^H$ or $A^H A$.
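The sketch below verifies this observation on an arbitrary random matrix: the eigenvalues of $A A^H$ and $A^H A$ are the squared singular values, padded with extra zeros.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))  # k = min(3, 5) = 3

s = np.linalg.svd(A, compute_uv=False)          # 3 singular values
eig_frame = np.linalg.eigvalsh(A @ A.conj().T)  # 3 eigenvalues of A A^H
eig_gram = np.linalg.eigvalsh(A.conj().T @ A)   # 5 eigenvalues of A^H A

# eigvalsh returns ascending order; flip to compare with s**2 (descending).
print(np.allclose(eig_frame[::-1], s**2))          # True
print(np.allclose(eig_gram[::-1][:3], s**2))       # True
print(np.allclose(eig_gram[:2], 0.0, atol=1e-10))  # True: the extra zeros
```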

4.10.2. The Largest Singular Value

Lemma 4.76

For every $u \in \mathbb{F}^n$ the following holds:

$$\|\Sigma u\|_2 \leq \sigma_1 \|u\|_2.$$

Also, for every $u \in \mathbb{F}^m$ the following holds:

$$\|\Sigma^H u\|_2 \leq \sigma_1 \|u\|_2.$$

Proof. Let us expand the term $\Sigma u$:

$$\begin{bmatrix}
\sigma_1 & 0 & \cdots & 0 & \cdots & 0\\
0 & \sigma_2 & \cdots & 0 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots & & \vdots\\
0 & 0 & \cdots & \sigma_k & \cdots & 0\\
0 & 0 & \cdots & 0 & \cdots & 0
\end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_k \\ \vdots \\ u_n \end{bmatrix}
= \begin{bmatrix} \sigma_1 u_1 \\ \sigma_2 u_2 \\ \vdots \\ \sigma_k u_k \\ 0 \\ \vdots \\ 0 \end{bmatrix}.$$

Now since $\sigma_1$ is the largest singular value, we have

$$|\sigma_i u_i| \leq |\sigma_1 u_i| \quad \forall \, 1 \leq i \leq k.$$

Thus (with the convention $\sigma_i = 0$ for $i > k$)

$$\sum_{i=1}^n |\sigma_1 u_i|^2 \geq \sum_{i=1}^n |\sigma_i u_i|^2$$

or

$$\sigma_1^2 \|u\|_2^2 \geq \|\Sigma u\|_2^2.$$

The result follows.

A simpler representation of $\Sigma u$ can be given using the block representation of $\Sigma$ in Corollary 4.24.

Let $r = \mathrm{rank}(A)$. Then

$$\Sigma = \begin{bmatrix} \Sigma_r & O \\ O & O \end{bmatrix}.$$

We split the entries in $u$ as

$$u = \begin{bmatrix} (u_1, \dots, u_r) & (u_{r+1}, \dots, u_n) \end{bmatrix}^T.$$

Then

$$\Sigma u = \begin{bmatrix} \Sigma_r \begin{bmatrix} u_1 & \dots & u_r \end{bmatrix}^T \\ O \begin{bmatrix} u_{r+1} & \dots & u_n \end{bmatrix}^T \end{bmatrix} = \begin{bmatrix} \sigma_1 u_1 & \sigma_2 u_2 & \dots & \sigma_r u_r & 0 & \dots & 0 \end{bmatrix}^T.$$

Thus

$$\|\Sigma u\|_2^2 = \sum_{i=1}^r |\sigma_i u_i|^2 \leq \sigma_1^2 \sum_{i=1}^r |u_i|^2 \leq \sigma_1^2 \|u\|_2^2.$$

The second result can be proven similarly.

Lemma 4.77 (Upper bounds on norms of matrix vector product)

Let $\sigma_1$ be the largest singular value of an $m \times n$ matrix $A$. Then

$$\|A x\|_2 \leq \sigma_1 \|x\|_2 \quad \forall x \in \mathbb{F}^n.$$

Moreover,

$$\|A^H x\|_2 \leq \sigma_1 \|x\|_2 \quad \forall x \in \mathbb{F}^m.$$

Proof. We have

$$\|A x\|_2 = \|U \Sigma V^H x\|_2 = \|\Sigma V^H x\|_2$$

since $U$ is unitary. Now from the previous result we have

$$\|\Sigma V^H x\|_2 \leq \sigma_1 \|V^H x\|_2 = \sigma_1 \|x\|_2$$

since $V^H$ is also unitary. Thus we get the result

$$\|A x\|_2 \leq \sigma_1 \|x\|_2 \quad \forall x \in \mathbb{F}^n.$$

Similarly,

$$\|A^H x\|_2 = \|V \Sigma^H U^H x\|_2 = \|\Sigma^H U^H x\|_2$$

since $V$ is unitary. Now from the previous result we have

$$\|\Sigma^H U^H x\|_2 \leq \sigma_1 \|U^H x\|_2 = \sigma_1 \|x\|_2$$

since $U^H$ is also unitary. Thus we get the result

$$\|A^H x\|_2 \leq \sigma_1 \|x\|_2 \quad \forall x \in \mathbb{F}^m.$$

There is a direct connection between the largest singular value and the 2-norm of a matrix (see p-norm for Matrices).

Corollary 4.25

The largest singular value of $A$ is nothing but its 2-norm; i.e.,

$$\sigma_1 = \max_{\|u\|_2 = 1} \|A u\|_2.$$
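In NumPy, `np.linalg.norm(A, 2)` computes exactly this quantity. A quick check, on an arbitrary random matrix, of both this identity and the bound from Lemma 4.77:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 6))

sigma_1 = np.linalg.svd(A, compute_uv=False)[0]

# The matrix 2-norm equals the largest singular value.
print(np.isclose(np.linalg.norm(A, 2), sigma_1))  # True

# And ||A x||_2 <= sigma_1 ||x||_2 for any x (Lemma 4.77).
x = rng.standard_normal(6)
print(np.linalg.norm(A @ x) <= sigma_1 * np.linalg.norm(x) + 1e-12)  # True
```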

4.10.3. SVD and Pseudo Inverse

Lemma 4.78

Let $A = U \Sigma V^H$ and let $r = \mathrm{rank}(A)$. Let $\sigma_1, \dots, \sigma_r$ be the $r$ nonzero singular values of $A$. Then the Moore-Penrose pseudo-inverse of $\Sigma$ is an $n \times m$ matrix $\Sigma^{\dagger}$ given by

$$\Sigma^{\dagger} = \begin{bmatrix} \Sigma_r^{-1} & O \\ O & O \end{bmatrix}$$

where $\Sigma_r = \mathrm{diag}(\sigma_1, \dots, \sigma_r)$.

$\Sigma^{\dagger}$ is obtained by transposing $\Sigma$ and inverting all its nonzero (positive real) values.

Proof. Straightforward application of Lemma 4.24.

Corollary 4.26

The rank of $\Sigma$ and its pseudo-inverse $\Sigma^{\dagger}$ are the same; i.e.,

$$\mathrm{rank}(\Sigma) = \mathrm{rank}(\Sigma^{\dagger}).$$

Proof. The numbers of nonzero diagonal entries in $\Sigma$ and $\Sigma^{\dagger}$ are the same.

Lemma 4.79

Let $A$ be an $m \times n$ matrix and let $A = U \Sigma V^H$ be its singular value decomposition. Let $\Sigma^{\dagger}$ be the pseudo-inverse of $\Sigma$ as per Lemma 4.78. Then the Moore-Penrose pseudo-inverse of $A$ is given by

$$A^{\dagger} = V \Sigma^{\dagger} U^H.$$

Proof. As usual, we verify the requirements for a Moore-Penrose pseudo-inverse as per Definition 4.42. We note that since $\Sigma^{\dagger}$ is the pseudo-inverse of $\Sigma$, it already satisfies the necessary criteria.

First requirement:

$$A A^{\dagger} A = U \Sigma V^H \, V \Sigma^{\dagger} U^H \, U \Sigma V^H = U \Sigma \Sigma^{\dagger} \Sigma V^H = U \Sigma V^H = A.$$

Second requirement:

$$A^{\dagger} A A^{\dagger} = V \Sigma^{\dagger} U^H \, U \Sigma V^H \, V \Sigma^{\dagger} U^H = V \Sigma^{\dagger} \Sigma \Sigma^{\dagger} U^H = V \Sigma^{\dagger} U^H = A^{\dagger}.$$

We now consider

$$A A^{\dagger} = U \Sigma V^H V \Sigma^{\dagger} U^H = U \Sigma \Sigma^{\dagger} U^H.$$

Thus

$$(A A^{\dagger})^H = (U \Sigma \Sigma^{\dagger} U^H)^H = U (\Sigma \Sigma^{\dagger})^H U^H = U \Sigma \Sigma^{\dagger} U^H = A A^{\dagger}$$

since $\Sigma \Sigma^{\dagger}$ is Hermitian. Finally, we consider

$$A^{\dagger} A = V \Sigma^{\dagger} U^H U \Sigma V^H = V \Sigma^{\dagger} \Sigma V^H.$$

Thus

$$(A^{\dagger} A)^H = (V \Sigma^{\dagger} \Sigma V^H)^H = V (\Sigma^{\dagger} \Sigma)^H V^H = V \Sigma^{\dagger} \Sigma V^H = A^{\dagger} A$$

since $\Sigma^{\dagger} \Sigma$ is also Hermitian. This completes the proof.
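The sketch below builds $A^{\dagger} = V \Sigma^{\dagger} U^H$ explicitly and compares it against `numpy.linalg.pinv` (which is itself SVD-based); the matrix is an arbitrary full column rank example, so all its singular values are nonzero:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))  # full column rank almost surely
m, n = A.shape

U, s, Vh = np.linalg.svd(A, full_matrices=True)

# Sigma^dagger: transpose the shape and invert the nonzero diagonal entries.
Sigma_pinv = np.zeros((n, m))
Sigma_pinv[:len(s), :len(s)] = np.diag(1.0 / s)

A_pinv = Vh.conj().T @ Sigma_pinv @ U.conj().T

print(np.allclose(A_pinv, np.linalg.pinv(A)))  # True
print(np.allclose(A @ A_pinv @ A, A))          # True: first Penrose condition
```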

Finally, we can connect the singular values of $A$ with the singular values of its pseudo-inverse.

Corollary 4.27 (Rank of the pseudo-inverse)

The rank of any $m \times n$ matrix $A$ and its pseudo-inverse $A^{\dagger}$ are the same; i.e.,

$$\mathrm{rank}(A) = \mathrm{rank}(A^{\dagger}).$$

Proof. We have $\mathrm{rank}(A) = \mathrm{rank}(\Sigma)$. Also, it is easy to verify that $\mathrm{rank}(A^{\dagger}) = \mathrm{rank}(\Sigma^{\dagger})$. Using Corollary 4.26 completes the proof.

Lemma 4.80 (Singular values of the pseudo-inverse)

Let $A$ be an $m \times n$ matrix and let $A^{\dagger}$ be its $n \times m$ pseudo-inverse as per Lemma 4.79. Let $k = \min(m, n)$ denote the number of singular values and $r = \mathrm{rank}(A)$ the number of nonzero singular values of $A$. Let $\sigma_1, \dots, \sigma_r$ be the nonzero singular values of $A$. Then the number of singular values of $A^{\dagger}$ is the same as that of $A$, the nonzero singular values of $A^{\dagger}$ are

$$\frac{1}{\sigma_1}, \dots, \frac{1}{\sigma_r},$$

and the other $k - r$ singular values of $A^{\dagger}$ are zero.

Proof. $k = \min(m, n)$ denotes the number of singular values for both $A$ and $A^{\dagger}$.

1. Since the ranks of $A$ and $A^{\dagger}$ are the same, the number of nonzero singular values is the same.

2. Now look at

   $$A^{\dagger} = V \Sigma^{\dagger} U^H$$

   where

   $$\Sigma^{\dagger} = \begin{bmatrix} \Sigma_r^{-1} & O \\ O & O \end{bmatrix}.$$

3. Clearly $\Sigma_r^{-1} = \mathrm{diag}\left(\frac{1}{\sigma_1}, \dots, \frac{1}{\sigma_r}\right)$.

4. Thus expanding the R.H.S. we get

   $$A^{\dagger} = \sum_{i=1}^r \frac{1}{\sigma_i} v_i u_i^H$$

   where $v_i$ and $u_i$ are the first $r$ columns of $V$ and $U$ respectively.

5. If we reverse the order of the first $r$ columns of $U$ and $V$ and reverse the order of the first $r$ diagonal entries of $\Sigma^{\dagger}$, the R.H.S. remains the same while $A^{\dagger}$ gets expressed in the standard singular value decomposition form, with the nonzero diagonal entries $\frac{1}{\sigma_r} \geq \dots \geq \frac{1}{\sigma_1}$ in descending order.

6. Thus $\frac{1}{\sigma_1}, \dots, \frac{1}{\sigma_r}$ are indeed the nonzero singular values of $A^{\dagger}$.
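A quick numerical check of this lemma on an arbitrary full-rank matrix (so $r = k$ and no zero singular values appear):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 3))

s = np.linalg.svd(A, compute_uv=False)  # descending: sigma_1 >= ... >= sigma_r
s_pinv = np.linalg.svd(np.linalg.pinv(A), compute_uv=False)

# Singular values of pinv(A) are the reciprocals 1/sigma_i, re-sorted descending.
print(np.allclose(s_pinv, np.sort(1.0 / s)[::-1]))  # True
```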

4.10.4. Full Column Rank Matrices

In this subsection we consider some specific results related to the singular value decomposition of a full column rank matrix.

1. We will consider $A$ to be an $m \times n$ matrix in $\mathbb{F}^{m \times n}$ with $m \geq n$ and $\mathrm{rank}(A) = n$.

2. Let $A = U \Sigma V^H$ be its singular value decomposition.

3. From Lemma 4.73 we observe that there are $n$ nonzero singular values of $A$.

4. We will call these singular values $\sigma_1, \sigma_2, \dots, \sigma_n$.

5. We will define

   $$\Sigma_n = \mathrm{diag}(\sigma_1, \sigma_2, \dots, \sigma_n).$$

6. Clearly $\Sigma$ is a $2 \times 1$ block matrix given by

   $$\Sigma = \begin{bmatrix} \Sigma_n \\ O \end{bmatrix}$$

   where the lower $O$ is an $(m - n) \times n$ zero matrix.

7. From here we obtain that $\Sigma^H \Sigma$ is an $n \times n$ matrix given by

   $$\Sigma^H \Sigma = \Sigma_n^2$$

   where

   $$\Sigma_n^2 = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \dots, \sigma_n^2).$$

Lemma 4.81

Let $A$ be a full column rank matrix with singular value decomposition $A = U \Sigma V^H$. Then $\Sigma^H \Sigma = \Sigma_n^2 = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \dots, \sigma_n^2)$, and $\Sigma^H \Sigma$ is invertible.

Proof. Since all the singular values are nonzero, $\Sigma_n^2$ is invertible. Thus

$$(\Sigma^H \Sigma)^{-1} = (\Sigma_n^2)^{-1} = \mathrm{diag}\left(\frac{1}{\sigma_1^2}, \frac{1}{\sigma_2^2}, \dots, \frac{1}{\sigma_n^2}\right).$$

Lemma 4.82

Let $A$ be a full column rank matrix with singular value decomposition $A = U \Sigma V^H$. Let $\sigma_1$ be its largest singular value and $\sigma_n$ its smallest singular value. Then

$$\sigma_n^2 \|x\|_2 \leq \|\Sigma^H \Sigma x\|_2 \leq \sigma_1^2 \|x\|_2 \quad \forall x \in \mathbb{F}^n.$$

Proof. Let $x \in \mathbb{F}^n$.

1. We have

   $$\|\Sigma^H \Sigma x\|_2^2 = \|\Sigma_n^2 x\|_2^2 = \sum_{i=1}^n |\sigma_i^2 x_i|^2.$$

2. Now,

   $$\sigma_n \leq \sigma_i \leq \sigma_1;$$

3. hence

   $$\sigma_n^4 \sum_{i=1}^n |x_i|^2 \leq \sum_{i=1}^n |\sigma_i^2 x_i|^2 \leq \sigma_1^4 \sum_{i=1}^n |x_i|^2.$$

4. Thus

   $$\sigma_n^4 \|x\|_2^2 \leq \|\Sigma^H \Sigma x\|_2^2 \leq \sigma_1^4 \|x\|_2^2.$$

5. Taking square roots, we get

   $$\sigma_n^2 \|x\|_2 \leq \|\Sigma^H \Sigma x\|_2 \leq \sigma_1^2 \|x\|_2 \quad \forall x \in \mathbb{F}^n.$$

We recall from Corollary 4.7 that for such a matrix the Gram matrix of its column vectors, $G = A^H A$, is full rank and invertible.

Lemma 4.83 (Norm bounds for Gram matrix vector product)

Let $A$ be a full column rank matrix with singular value decomposition $A = U \Sigma V^H$. Let $\sigma_1$ be its largest singular value and $\sigma_n$ its smallest singular value. Then

$$\sigma_n^2 \|x\|_2 \leq \|A^H A x\|_2 \leq \sigma_1^2 \|x\|_2 \quad \forall x \in \mathbb{F}^n.$$

Proof. Expanding the Gram matrix:

$$A^H A = (U \Sigma V^H)^H (U \Sigma V^H) = V \Sigma^H \Sigma V^H.$$

Let $x \in \mathbb{F}^n$. Let

$$u = V^H x \implies \|u\|_2 = \|x\|_2.$$

Let

$$w = \Sigma^H \Sigma u.$$

Then from the previous lemma we have

$$\sigma_n^2 \|u\|_2 \leq \|\Sigma^H \Sigma u\|_2 = \|w\|_2 \leq \sigma_1^2 \|u\|_2.$$

Finally

$$A^H A x = V \Sigma^H \Sigma V^H x = V w.$$

Thus

$$\|A^H A x\|_2 = \|w\|_2.$$

Substituting, we get

$$\sigma_n^2 \|x\|_2 \leq \|A^H A x\|_2 \leq \sigma_1^2 \|x\|_2 \quad \forall x \in \mathbb{F}^n.$$
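These bounds are straightforward to verify numerically; the sketch below uses an arbitrary full column rank matrix and an arbitrary vector:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 3))  # full column rank almost surely

s = np.linalg.svd(A, compute_uv=False)
sigma_1, sigma_n = s[0], s[-1]

x = rng.standard_normal(3)
norm_Gx = np.linalg.norm(A.conj().T @ A @ x)
norm_x = np.linalg.norm(x)

# sigma_n^2 ||x|| <= ||A^H A x|| <= sigma_1^2 ||x||
print(sigma_n**2 * norm_x <= norm_Gx <= sigma_1**2 * norm_x)  # True
```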

There are similar bounds for the inverse of the Gram matrix. First let us work out the inverse of the Gram matrix.

Lemma 4.84 (SVD of inverse Gram matrix)

Let $A$ be a full column rank matrix with singular value decomposition $A = U \Sigma V^H$. Let the singular values of $A$ be $\sigma_1, \dots, \sigma_n$. Let the Gram matrix of the columns of $A$ be $G = A^H A$. Then

$$G^{-1} = V \Psi V^H$$

where

$$\Psi = \mathrm{diag}\left(\frac{1}{\sigma_1^2}, \frac{1}{\sigma_2^2}, \dots, \frac{1}{\sigma_n^2}\right).$$

Proof. We have

$$G = V \Sigma^H \Sigma V^H.$$

Thus

$$G^{-1} = (V \Sigma^H \Sigma V^H)^{-1} = (V^H)^{-1} (\Sigma^H \Sigma)^{-1} V^{-1} = V (\Sigma^H \Sigma)^{-1} V^H.$$

From Lemma 4.81 we have

$$\Psi = (\Sigma^H \Sigma)^{-1} = \mathrm{diag}\left(\frac{1}{\sigma_1^2}, \frac{1}{\sigma_2^2}, \dots, \frac{1}{\sigma_n^2}\right).$$

This completes the proof.

We can now state the bounds:

Lemma 4.85 (Norm bounds for inverse Gram matrix vector product)

Let $A$ be a full column rank matrix with singular value decomposition $A = U \Sigma V^H$. Let $\sigma_1$ be its largest singular value and $\sigma_n$ its smallest singular value. Then

$$\frac{1}{\sigma_1^2} \|x\|_2 \leq \|(A^H A)^{-1} x\|_2 \leq \frac{1}{\sigma_n^2} \|x\|_2 \quad \forall x \in \mathbb{F}^n.$$

Proof. From Lemma 4.84 we have

$$G^{-1} = (A^H A)^{-1} = V \Psi V^H$$

where

$$\Psi = \mathrm{diag}\left(\frac{1}{\sigma_1^2}, \frac{1}{\sigma_2^2}, \dots, \frac{1}{\sigma_n^2}\right).$$

Let $x \in \mathbb{F}^n$. Let

$$u = V^H x \implies \|u\|_2 = \|x\|_2.$$

Let

$$w = \Psi u.$$

Then

$$\|w\|_2^2 = \sum_{i=1}^n \left|\frac{1}{\sigma_i^2} u_i\right|^2.$$

Thus

$$\frac{1}{\sigma_1^2} \|u\|_2 \leq \|\Psi u\|_2 = \|w\|_2 \leq \frac{1}{\sigma_n^2} \|u\|_2.$$

Finally

$$(A^H A)^{-1} x = V \Psi V^H x = V w.$$

Thus

$$\|(A^H A)^{-1} x\|_2 = \|w\|_2.$$

Substituting, we get the result.

4.10.5. Low Rank Approximation of a Matrix

Definition 4.133 (Low rank matrix)

An $m \times n$ matrix $A$ is called low rank if

$$\mathrm{rank}(A) \ll \min(m, n).$$

A matrix is low rank if the number of nonzero singular values for the matrix is much smaller than its dimensions.

The following is a simple procedure for constructing a low rank approximation of a given matrix $A$; a code sketch follows the list.

  • Perform the singular value decomposition of $A$ given by $A = U \Sigma V^H$.

  • Identify the singular values of $A$ in $\Sigma$.

  • Keep the first $r$ singular values (where $r \ll \min(m, n)$ is the rank of the approximation).

  • Set all other singular values to 0 to obtain $\hat{\Sigma}$.

  • Compute $\hat{A} = U \hat{\Sigma} V^H$.
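A minimal NumPy sketch of this procedure (the matrix and the target rank `r` are arbitrary illustrative choices):

```python
import numpy as np

def low_rank_approximation(A, r):
    """Keep the r largest singular values of A and zero out the rest."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    return U[:, :r] @ np.diag(s[:r]) @ Vh[:r, :]

rng = np.random.default_rng(5)
A = rng.standard_normal((8, 6))

A_hat = low_rank_approximation(A, r=2)

print(np.linalg.matrix_rank(A_hat))  # 2
# The 2-norm error is the first discarded singular value sigma_{r+1}.
s = np.linalg.svd(A, compute_uv=False)
print(np.isclose(np.linalg.norm(A - A_hat, 2), s[2]))  # True
```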