1.9. Matrices

This section provides some basic definitions, notation and results on the theory of matrices.

1.9.1. Basic Definitions

Definition 1.98 (Matrix)

An m×n matrix A is a rectangular array of numbers.

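$$
A = \begin{bmatrix}
a_{11} & a_{12} & \dots & a_{1n} \\
a_{21} & a_{22} & \dots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \dots & a_{mn}
\end{bmatrix}.
$$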

The numbers in a matrix are called its elements.

The matrix consists of m rows and n columns. The entry in the i-th row and j-th column is denoted by $a_{ij}$.

If all the elements of a matrix are real, then we call it a real matrix.

If any of the elements of the matrix is complex, then we call it a complex matrix.

A matrix is often written in short as $A = (a_{ij})$.

Matrices are denoted by bold capital letters A, B, etc. They can be rectangular with m rows and n columns. Their elements or entries are referred to with small letters $a_{ij}$, $b_{ij}$, etc., where i denotes the row index and j denotes the column index.

Definition 1.99 (The set of matrices)

The set of all real matrices of shape m×n is denoted by $\mathbb{R}^{m \times n}$.

The set of all complex matrices of shape m×n is denoted by $\mathbb{C}^{m \times n}$.

Definition 1.100 (Square matrix)

An m×n matrix is called a square matrix if m=n.

Definition 1.101 (Tall matrix)

An m×n matrix is called a tall matrix if m>n, i.e., the number of rows is greater than the number of columns.

Definition 1.102 (Wide matrix)

An m×n matrix is called a wide matrix if m<n, i.e., the number of columns is greater than the number of rows.
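The three shape definitions above can be illustrated with a small sketch. It assumes a matrix is stored as a plain Python list of m rows, each a list of n numbers; the helper names `shape` and `classify` are hypothetical and chosen only for this illustration.

```python
def shape(matrix):
    """Return (m, n) for a matrix stored as a list of m rows of length n."""
    m = len(matrix)
    n = len(matrix[0]) if m else 0
    return m, n


def classify(matrix):
    """Classify a matrix as 'square', 'tall' or 'wide' from its shape."""
    m, n = shape(matrix)
    if m == n:
        return "square"
    return "tall" if m > n else "wide"


A = [[1, 2], [3, 4], [5, 6]]   # 3 x 2: more rows than columns
B = [[1, 2, 3], [4, 5, 6]]     # 2 x 3: more columns than rows
C = [[1, 0], [0, 1]]           # 2 x 2

for name, M in (("A", A), ("B", B), ("C", C)):
    print(name, shape(M), classify(M))
```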

Definition 1.103 (Vector)

A vector is an n-tuple of numbers written as:

$$
v = (v_1, v_2, \dots, v_n).
$$

If all the numbers are real, then it is called a real vector belonging to the set $\mathbb{R}^n$. If any of the numbers is complex, then it is called a complex vector belonging to the set $\mathbb{C}^n$. The numbers in a vector are called its components.

Sometimes, we may use a notation without commas:

$$
v = \begin{pmatrix} v_1 & v_2 & \dots & v_n \end{pmatrix}.
$$

Definition 1.104 (Column vector)

A matrix with shape m×1 is called a column vector.

Definition 1.105 (Row vector)

A matrix with shape 1×n is called a row vector.

Note

It should be easy to see that $\mathbb{R}^{m \times 1}$ and $\mathbb{R}^m$ are the same sets. Similarly, $\mathbb{R}^{1 \times n}$ and $\mathbb{R}^n$ are the same sets.

A row or column vector can easily be written as an n-tuple.

Definition 1.106 (Main diagonal)

Let $A = [a_{ij}]$ be an m×n matrix. The main diagonal consists of the entries $a_{ij}$ where i=j; i.e., the main diagonal is $\{a_{11}, a_{22}, \dots, a_{kk}\}$ where $k = \min(m, n)$.

The main diagonal is also known as the leading diagonal, major diagonal, primary diagonal or principal diagonal.

The entries of A which are not on the main diagonal are known as off-diagonal entries.

Definition 1.107 (Diagonal matrix)

A diagonal matrix is a matrix (usually a square matrix) whose entries outside the main diagonal are zero.

Whenever we refer to a diagonal matrix which is not square, we will use the term rectangular diagonal matrix.

A square diagonal matrix A is also written as $\operatorname{diag}(a_{11}, a_{22}, \dots, a_{nn})$, which lists only the diagonal entries of A (the only entries that can be non-zero).

If not specified, square matrices will be of size n×n and rectangular matrices will be of size m×n. If not specified, vectors (column vectors) will be of size n×1 and belong to either $\mathbb{R}^n$ or $\mathbb{C}^n$. Corresponding row vectors will be of size 1×n.
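As a minimal sketch of the main diagonal and of the diag notation, the following assumes matrices are stored as lists of rows. The function names `main_diagonal`, `diag` and `is_diagonal` are hypothetical helpers introduced here, not part of any library.

```python
def main_diagonal(A):
    """Return [a_11, a_22, ..., a_kk] with k = min(m, n)."""
    k = min(len(A), len(A[0]))
    return [A[i][i] for i in range(k)]


def diag(entries):
    """Build the square diagonal matrix diag(entries)."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]


def is_diagonal(A):
    """True if every off-diagonal entry is zero (A may be rectangular)."""
    return all(a == 0
               for i, row in enumerate(A)
               for j, a in enumerate(row)
               if i != j)


A = [[1, 2, 3], [4, 5, 6]]   # 2 x 3, so k = min(2, 3) = 2
D = diag([7, 0, 9])          # a diagonal entry may itself be zero
R = [[1, 0, 0], [0, 2, 0]]   # a rectangular diagonal matrix

print(main_diagonal(A))      # [1, 5]
print(D)                     # [[7, 0, 0], [0, 0, 0], [0, 0, 9]]
print(is_diagonal(R), is_diagonal(A))
```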

1.9.2. Matrix Operations

Definition 1.108 (Matrix addition)

Let A and B be two matrices with the same shape m×n. Then, their addition is defined as:

$$
A + B = (a_{ij}) + (b_{ij}) \triangleq (a_{ij} + b_{ij}).
$$

Definition 1.109 (Scalar multiplication)

Let A be a matrix of shape m×n and λ be a scalar. The product of the matrix A with the scalar λ is defined as:

$$
\lambda A = A \lambda \triangleq (\lambda a_{ij}).
$$
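Both operations act entry by entry, which the following sketch makes explicit. It assumes matrices are stored as lists of rows; `add` and `scale` are hypothetical helper names used only for illustration.

```python
def add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same shape")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(lam, A):
    """Multiply every entry of A by the scalar lam."""
    return [[lam * a for a in row] for row in A]


A = [[1, 2], [3, 4]]
B = [[10, 20], [30, 40]]

print(add(A, B))             # [[11, 22], [33, 44]]
print(scale(2, A))           # [[2, 4], [6, 8]]
print(add(A, scale(-1, A)))  # the zero matrix: [[0, 0], [0, 0]]
```

Adding matrices of different shapes raises a `ValueError` in this sketch, mirroring the requirement that both operands have the same shape.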

Theorem 1.35 (Properties of matrix addition and scalar multiplication)

Let A,B,C be matrices of shape m×n. Let λ,μ be scalars. Then:

  1. Matrix addition is commutative: A+B=B+A.

  2. Matrix addition is associative: A+(B+C)=(A+B)+C.

  3. Scalar multiplication distributes over addition of scalars: (λ+μ)A=λA+μA.

  4. Scalar multiplication distributes over addition of matrices: λ(A+B)=λA+λB.

  5. Scalar multiplication is compatible with multiplication of scalars: (λμ)A=λ(μA).

  6. There exists a matrix with all elements being zero denoted by O such that A+O=O+A=A.

  7. Existence of additive inverse: A+(-1)A=O.

Definition 1.110 (Matrix multiplication)

If A is an m×n matrix and B is an n×p matrix (thus, A has the same number of columns as B has rows), then we define the product of A and B as:

$$
AB \triangleq \left( \sum_{k=1}^n a_{ik} b_{kj} \right).
$$

This binary operation is known as matrix multiplication. The product matrix has the shape m×p. Its (i,j)-th element is $\sum_{k=1}^n a_{ik} b_{kj}$, obtained by multiplying the entries of the i-th row of A with the corresponding entries of the j-th column of B and summing the products.
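The row-times-column rule can be written out directly. This is a minimal sketch, assuming matrices are stored as lists of rows; `matmul` is a hypothetical helper name, and no attempt is made at efficiency.

```python
def matmul(A, B):
    """Product of an m x n matrix A and an n x p matrix B."""
    m, n, p = len(A), len(B), len(B[0])
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must match rows of B")
    # Entry (i, j) is the sum over k of a_ik * b_kj.
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]


A = [[1, 2, 3],
     [4, 5, 6]]          # 2 x 3
B = [[1, 0],
     [0, 1],
     [2, 2]]             # 3 x 2

C = matmul(A, B)         # 2 x 2
print(C)                 # [[7, 8], [16, 17]]
```

For instance, the (1,1) entry is 1·1 + 2·0 + 3·2 = 7, the first row of A combined with the first column of B.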

Theorem 1.36 (Properties of matrix multiplication)

Let A,B,C be matrices of appropriate shape.

  1. Matrix multiplication is associative: A(BC)=(AB)C.

  2. Matrix multiplication distributes over matrix addition: A(B+C)=AB+AC and (A+B)C=AC+BC.

1.9.3. Transpose

The transpose of a matrix A is denoted by $A^T$ while the Hermitian transpose is denoted by $A^H$. For real matrices $A^T = A^H$.
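The section does not spell out the two operations, so the sketch below assumes the standard definitions: the transpose swaps rows and columns, and the Hermitian transpose additionally takes the complex conjugate of every entry. The helper names are hypothetical, and matrices are stored as lists of rows.

```python
def transpose(A):
    """Swap rows and columns: entry (i, j) of the result is a_ji."""
    return [list(col) for col in zip(*A)]


def hermitian_transpose(A):
    """Transpose and conjugate every entry."""
    return [[a.conjugate() for a in col] for col in zip(*A)]


Z = [[1 + 2j, 3],
     [4j, 5 - 1j]]
R = [[1, 2],
     [3, 4]]

print(transpose(Z))
print(hermitian_transpose(Z))
# For a real matrix the two operations agree.
print(transpose(R) == hermitian_transpose(R))
```

Python's `int`, `float` and `complex` all provide a `conjugate()` method, so the same function works for real and complex entries.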

For statements which are valid both for real and complex matrices, we sometimes say that matrices belong to $\mathbb{F}^{m \times n}$, scalars belong to the field $\mathbb{F}$, and vectors belong to $\mathbb{F}^n$, where $\mathbb{F}$ refers to either the field of real numbers or the field of complex numbers. Most results from matrix analysis are written only for $\mathbb{C}^{m \times n}$ while still being applicable to $\mathbb{R}^{m \times n}$.

The identity matrix for $\mathbb{F}^{n \times n}$ is denoted by $I_n$, or simply $I$ whenever the size is clear from context.

Sometimes we will write a matrix in terms of its column vectors. We will use the notation

$$
A = \begin{bmatrix} a_1 & a_2 & \dots & a_n \end{bmatrix}
$$

indicating n columns.

When we write a matrix in terms of its row vectors, we will use the notation

$$
A = \begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_m^T \end{bmatrix}
$$

indicating m rows with $a_i$ being column vectors whose transposes form the rows of A.

1.9.4. Symmetric Matrices

Definition 1.111 (Symmetric matrix)

A symmetric matrix is a matrix $X \in \mathbb{F}^{n \times n}$ which satisfies $X = X^T$.

We define the set of real symmetric n×n matrices as

$$
\mathbb{S}^n = \{ X \in \mathbb{R}^{n \times n} \mid X = X^T \}.
$$
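A symmetry check follows directly from the definition. The sketch below assumes matrices are stored as lists of rows and uses hypothetical helper names; a non-square matrix is reported as not symmetric because the definition only covers n×n matrices.

```python
def transpose(X):
    """Swap rows and columns."""
    return [list(col) for col in zip(*X)]


def is_symmetric(X):
    """True if X is square and equal to its transpose."""
    is_square = all(len(row) == len(X) for row in X)
    return is_square and X == transpose(X)


S = [[1, 2, 3],
     [2, 5, 6],
     [3, 6, 9]]
N = [[1, 2],
     [3, 4]]

print(is_symmetric(S))   # True
print(is_symmetric(N))   # False
```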

1.9.5. Dot Products

The inner product or dot product of two column / row vectors u and v belonging to $\mathbb{R}^n$ is defined as

$$
u \cdot v = \langle u, v \rangle = \sum_{i=1}^n u_i v_i. \tag{1.1}
$$

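The inner product or dot product of two column / row vectors u and v belonging to $\mathbb{C}^n$ is defined as

$$
u \cdot v = \langle u, v \rangle = \sum_{i=1}^n u_i \overline{v_i}, \tag{1.2}
$$

where $\overline{v_i}$ denotes the complex conjugate of $v_i$.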
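The two definitions can be compared side by side in a short sketch. It assumes vectors are stored as Python lists and, for the complex case, that the conjugate is taken on the components of the second vector; some texts conjugate the first vector instead. The function names are hypothetical.

```python
def dot_real(u, v):
    """Sum of u_i * v_i for real vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(ui * vi for ui, vi in zip(u, v))


def dot_complex(u, v):
    """Sum of u_i * conj(v_i); conjugating v is an assumed convention."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(ui * complex(vi).conjugate() for ui, vi in zip(u, v))


x = [1, 2, 3]
y = [4, 5, 6]
print(dot_real(x, y))      # 1*4 + 2*5 + 3*6 = 32

u = [1 + 1j, 2]
v = [1j, 3]
print(dot_complex(u, v))   # (1+1j)*(-1j) + 2*3 = 7 - 1j
print(dot_complex(u, u))   # |1+1j|^2 + |2|^2 = 6, a real number
```

With the conjugate in place, the product of a complex vector with itself is real and non-negative, which is what makes it usable as a squared length.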

1.9.6. Block Matrices

Definition 1.112 (Block matrix)

A block matrix is a matrix whose entries themselves are matrices with the following constraints:

  • Entries in every row are matrices with the same number of rows.

  • Entries in every column are matrices with the same number of columns.

Let A be an m×n block matrix. Then

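$$
A = \begin{bmatrix}
A_{11} & A_{12} & \dots & A_{1n} \\
A_{21} & A_{22} & \dots & A_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
A_{m1} & A_{m2} & \dots & A_{mn}
\end{bmatrix}
$$

where $A_{ij}$ is a matrix with $r_i$ rows and $c_j$ columns.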

A block matrix is also known as a partitioned matrix.

Example 1.23 (2x2 block matrices)

Quite frequently we will be using 2x2 block matrices.

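$$
P = \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix}.
$$

An example:

$$
P = \left[ \begin{array}{cc|c}
a & b & c \\
d & e & f \\
\hline
g & h & i
\end{array} \right]
$$

We have

$$
P_{11} = \begin{bmatrix} a & b \\ d & e \end{bmatrix}, \quad
P_{12} = \begin{bmatrix} c \\ f \end{bmatrix}, \quad
P_{21} = \begin{bmatrix} g & h \end{bmatrix}, \quad
P_{22} = \begin{bmatrix} i \end{bmatrix}.
$$

  • $P_{11}$ and $P_{12}$ have 2 rows.

  • $P_{21}$ and $P_{22}$ have 1 row.

  • $P_{11}$ and $P_{21}$ have 2 columns.

  • $P_{12}$ and $P_{22}$ have 1 column.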
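The partition in this example can be reproduced with list slicing. This is a sketch under the assumption that a matrix is a list of rows; `partition_2x2` is a hypothetical helper whose arguments `r` and `c` give the number of rows and columns of the top-left block.

```python
def partition_2x2(P, r, c):
    """Split P into four blocks; the top-left block has r rows and c columns."""
    P11 = [row[:c] for row in P[:r]]
    P12 = [row[c:] for row in P[:r]]
    P21 = [row[:c] for row in P[r:]]
    P22 = [row[c:] for row in P[r:]]
    return P11, P12, P21, P22


P = [["a", "b", "c"],
     ["d", "e", "f"],
     ["g", "h", "i"]]

P11, P12, P21, P22 = partition_2x2(P, r=2, c=2)
print(P11)   # [['a', 'b'], ['d', 'e']]
print(P12)   # [['c'], ['f']]
print(P21)   # [['g', 'h']]
print(P22)   # [['i']]
```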

Lemma 1.1 (Shape of a block matrix)

Let $A = [A_{ij}]$ be an m×n block matrix with $A_{ij}$ being an $r_i \times c_j$ matrix. Then A is an r×c matrix where

$$
r = \sum_{i=1}^m r_i
$$

and

$$
c = \sum_{j=1}^n c_j.
$$
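The lemma can be checked on a small case. The sketch assumes a block matrix is stored as a list of block rows, each a list of blocks, and each block a list of rows; `block_shape` and `assemble` are hypothetical helper names.

```python
def block_shape(blocks):
    """Return (r, c): the sums of the block row counts and block column counts."""
    row_counts = [len(block_row[0]) for block_row in blocks]   # r_i
    col_counts = [len(block[0]) for block in blocks[0]]        # c_j
    return sum(row_counts), sum(col_counts)


def assemble(blocks):
    """Glue the blocks together into one ordinary matrix."""
    rows = []
    for block_row in blocks:
        for i in range(len(block_row[0])):
            rows.append([x for block in block_row for x in block[i]])
    return rows


blocks = [
    [[[1, 2], [4, 5]], [[3], [6]]],   # block row 1: a 2 x 2 and a 2 x 1 block
    [[[7, 8]],         [[9]]],        # block row 2: a 1 x 2 and a 1 x 1 block
]

full = assemble(blocks)
print(block_shape(blocks))            # (3, 3)
print(full)                           # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```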

Sometimes it is convenient to think of a regular matrix as a block matrix whose entries are 1×1 matrices themselves.

Definition 1.113 (Multiplication of block matrices)

Let $A = [A_{ij}]$ be an m×n block matrix with $A_{ij}$ being a $p_i \times q_j$ matrix. Let $B = [B_{jk}]$ be an n×p block matrix with $B_{jk}$ being a $q_j \times r_k$ matrix.

Then the two block matrices are compatible for multiplication and their product is defined by $C = AB = [C_{ik}]$ where

$$
C_{ik} = \sum_{j=1}^n A_{ij} B_{jk}
$$

and $C_{ik}$ is a $p_i \times r_k$ matrix.
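The block formula mirrors the entrywise one, with matrix products and sums in place of scalar ones. The sketch below assumes block matrices are stored as lists of block rows and that the partitions are compatible; all function names are hypothetical helpers.

```python
def matmul(A, B):
    """Ordinary matrix product of lists of rows."""
    n = len(B)
    return [[sum(row[k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for row in A]


def matadd(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def block_matmul(A, B):
    """C_ik is the sum over j of A_ij B_jk, computed block by block."""
    m, n, p = len(A), len(B), len(B[0])
    C = []
    for i in range(m):
        C_row = []
        for k in range(p):
            acc = matmul(A[i][0], B[0][k])
            for j in range(1, n):
                acc = matadd(acc, matmul(A[i][j], B[j][k]))
            C_row.append(acc)
        C.append(C_row)
    return C


# A is 3 x 3, split into a 2 x 2 grid of blocks (rows 2 + 1, columns 2 + 1).
A = [[[[1, 2], [4, 5]], [[3], [6]]],
     [[[7, 8]],         [[9]]]]
# B is 3 x 2, split into a 2 x 1 grid; its row split (2 + 1) matches A's columns.
B = [[[[1, 0], [0, 1]]],
     [[[1, 1]]]]

C = block_matmul(A, B)
print(C[0][0])   # [[4, 5], [10, 11]]
print(C[1][0])   # [[16, 17]]
```

Stacking the blocks of C gives the same result as multiplying the unpartitioned 3×3 and 3×2 matrices directly.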

Definition 1.114 (Block diagonal matrix)

A block diagonal matrix is a block matrix whose off-diagonal entries are zero matrices.