# 1.9. Matrices

This section provides some basic definitions, notation and results on the theory of matrices.

## 1.9.1. Basic Definitions

Definition 1.98 (Matrix)

An $$m \times n$$ matrix $$\bA$$ is a rectangular array of numbers.

$\begin{split} \bA = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1 n}\\ a_{21} & a_{22} & \dots & a_{2 n}\\ \vdots & \vdots & \ddots & \vdots \\ a_{m 1} & a_{m 2} & \dots & a_{m n}\\ \end{bmatrix}. \end{split}$

The numbers in a matrix are called its elements.

The matrix consists of $$m$$ rows and $$n$$ columns. The entry in the $$i$$-th row and $$j$$-th column is referred to with the notation $$a_{i j}$$.

If all the elements of a matrix are real, then we call it a real matrix.

If any of the elements of the matrix is complex, then we call it a complex matrix.

A matrix is often written in short as $$\bA = (a_{ij})$$.

Matrices are denoted by bold capital letters $$\bA$$, $$\bB$$, etc. They can be rectangular with $$m$$ rows and $$n$$ columns. Their elements or entries are referred to with small letters $$a_{i j}$$, $$b_{i j}$$, etc., where $$i$$ denotes the $$i$$-th row of the matrix and $$j$$ denotes the $$j$$-th column of the matrix.
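As a minimal sketch, assume a matrix is stored as a Python list of rows (a representation chosen here only for illustration). The shape $$m \times n$$ and the entry $$a_{i j}$$ can then be read off as follows. The helper names `shape` and `entry` are hypothetical.

```python
# A 2 x 3 real matrix stored as a list of rows (illustrative representation).
A = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]

def shape(M):
    """Return (m, n): the number of rows and the number of columns of M."""
    return len(M), len(M[0])

def entry(M, i, j):
    """Return a_{ij}, using the 1-based indices of the text."""
    return M[i - 1][j - 1]

m, n = shape(A)
a_23 = entry(A, 2, 3)  # entry in the 2nd row and 3rd column
```

The text indexes rows and columns from 1 while Python lists start at 0, which is why `entry` subtracts one from each index.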

Definition 1.99 (The set of matrices)

The set of all real matrices of shape $$m \times n$$ is denoted by $$\RR^{m \times n}$$.

The set of all complex matrices of shape $$m \times n$$ is denoted by $$\CC^{m \times n}$$.

Definition 1.100 (Square matrix)

An $$m \times n$$ matrix is called a square matrix if $$m = n$$.

Definition 1.101 (Tall matrix)

An $$m \times n$$ matrix is called a tall matrix if $$m > n$$, i.e., the number of rows is greater than the number of columns.

Definition 1.102 (Wide matrix)

An $$m \times n$$ matrix is called a wide matrix if $$m < n$$, i.e., the number of columns is greater than the number of rows.
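The three shape classes above depend only on comparing $$m$$ and $$n$$. A small sketch, with a hypothetical helper `classify`:

```python
def classify(m, n):
    """Classify an m x n shape as 'square', 'tall' or 'wide'."""
    if m == n:
        return "square"
    return "tall" if m > n else "wide"

kinds = [classify(3, 3), classify(5, 2), classify(2, 5)]
```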

Definition 1.103 (Vector)

A vector is an $$n$$-tuple of numbers written as:

$\bv = (v_1, v_2, \dots, v_n).$

If all the numbers are real, then it is called a real vector belonging to the set $$\RR^n$$. If any of the numbers is complex, then it is called a complex vector belonging to the set $$\CC^n$$. The numbers in a vector are called its components.

Sometimes, we may use a notation without commas.

$\bv = \begin{pmatrix}v_1 & v_2 & \dots & v_n \end{pmatrix}.$

Definition 1.104 (Column vector)

A matrix with shape $$m \times 1$$ is called a column vector.

Definition 1.105 (Row vector)

A matrix with shape $$1 \times n$$ is called a row vector.

Note

It should be easy to see that $$\RR^{m \times 1}$$ and $$\RR^m$$ are the same sets. Similarly, $$\RR^{1\times n}$$ and $$\RR^n$$ are the same sets.

A row or column vector can easily be written as an $$n$$-tuple.

Definition 1.106 (Main diagonal)

Let $$\bA= [a_{i j}]$$ be an $$m \times n$$ matrix. The main diagonal consists of entries $$a_{i j}$$ where $$i = j$$; i.e., the main diagonal is $$\{a_{11}, a_{22}, \dots, a_{k k} \}$$ where $$k = \min(m, n)$$.

The main diagonal is also known as the leading diagonal, major diagonal, primary diagonal or principal diagonal.

The entries of $$\bA$$ which are not on the main diagonal are known as off-diagonal entries.
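The following sketch extracts the main diagonal of a rectangular matrix stored as a list of rows (an assumed representation). The helper name `main_diagonal` is hypothetical.

```python
def main_diagonal(M):
    """Return [a_11, a_22, ..., a_kk] with k = min(m, n)."""
    m, n = len(M), len(M[0])
    return [M[i][i] for i in range(min(m, n))]

A = [[1, 2, 3],
     [4, 5, 6]]        # wide matrix: 2 x 3
d = main_diagonal(A)   # here k = min(2, 3) = 2
```

For a tall matrix the same function stops at the last column, since $$k = \min(m, n)$$ is then the number of columns.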

Definition 1.107 (Diagonal matrix)

A diagonal matrix is a matrix (usually a square matrix) whose entries outside the main diagonal are zero.

Whenever we refer to a diagonal matrix which is not square, we will use the term rectangular diagonal matrix.

A square diagonal matrix $$\bA$$ is also written as $$\Diag(a_{11}, a_{22}, \dots, a_{n n})$$ which lists only the diagonal (possibly non-zero) entries in $$\bA$$.
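A minimal sketch of the $$\Diag$$ notation, assuming matrices are stored as lists of rows. The names `diag` and `is_diagonal` are hypothetical helpers. Note that the diagonal entries themselves are allowed to be zero.

```python
def diag(*entries):
    """Build the square diagonal matrix Diag(entries[0], ..., entries[n-1])."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def is_diagonal(M):
    """True if every entry outside the main diagonal is zero.

    Works for rectangular matrices as well as square ones.
    """
    return all(
        M[i][j] == 0
        for i in range(len(M))
        for j in range(len(M[0]))
        if i != j
    )

D = diag(2, 0, 7)
```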

Unless specified otherwise, square matrices will be of size $$n \times n$$ and rectangular matrices will be of size $$m \times n$$. Similarly, vectors (column vectors) will be of size $$n \times 1$$ and belong to either $$\RR^n$$ or $$\CC^n$$. Corresponding row vectors will be of size $$1 \times n$$.

## 1.9.2. Matrix Operations

Definition 1.108 (Matrix addition)

Let $$\bA$$ and $$\bB$$ be two matrices with same shape $$m \times n$$. Then, their addition is defined as:

$\bA + \bB = (a_{ij}) + (b_{ij}) \triangleq (a_{ij} + b_{ij}).$

Definition 1.109 (Scalar multiplication)

Let $$\bA$$ be a matrix of shape $$m \times n$$ and $$\lambda$$ be a scalar. The product of the matrix $$\bA$$ with the scalar $$\lambda$$ is defined as:

$\lambda \bA = \bA \lambda \triangleq (\lambda a_{ij}).$
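Both operations act entry by entry. A minimal sketch, assuming matrices are lists of rows; the names `add` and `scale` are hypothetical.

```python
def add(A, B):
    """Entrywise sum (a_ij + b_ij) of two matrices with the same shape."""
    if len(A) != len(B) or len(A[0]) != len(B[0]):
        raise ValueError("matrices must have the same shape")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(lam, A):
    """Scalar multiple (lam * a_ij) of the matrix A."""
    return [[lam * a for a in row] for row in A]

A = [[1, 2], [3, 4]]
B = [[10, 20], [30, 40]]
S = add(A, B)      # [[11, 22], [33, 44]]
T = scale(3, A)    # [[3, 6], [9, 12]]
```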

Theorem 1.35 (Properties of matrix addition and scalar multiplication)

Let $$\bA, \bB, \bC$$ be matrices of shape $$m \times n$$. Let $$\lambda, \mu$$ be scalars. Then:

1. Matrix addition is commutative: $$\bA + \bB = \bB + \bA$$.

2. Matrix addition is associative: $$\bA + (\bB + \bC) = (\bA + \bB) + \bC$$.

3. Scalar multiplication distributes over addition of scalars: $$(\lambda + \mu)\bA = \lambda \bA + \mu \bA$$.

4. Scalar multiplication distributes over addition of matrices: $$\lambda (\bA + \bB) = \lambda \bA + \lambda \bB$$.

5. Scalar multiplication is compatible with multiplication of scalars: $$(\lambda \mu) \bA = \lambda (\mu \bA)$$.

6. There exists a matrix with all elements being zero denoted by $$\ZERO$$ such that $$\bA + \ZERO = \ZERO + \bA = \bA$$.

7. Existence of additive inverse: $$\bA + (-1)\bA = \ZERO$$.
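The seven properties can be spot-checked on a concrete example. This is only a sanity check on integer matrices, not a proof. The helpers `add` and `scale` are hypothetical and assume matrices stored as lists of rows.

```python
def add(A, B):
    """Entrywise sum of two matrices with the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(lam, A):
    """Scalar multiple of a matrix."""
    return [[lam * a for a in row] for row in A]

A = [[1, 2, 3], [4, 5, 6]]
B = [[0, -1, 2], [7, 0, -3]]
C = [[5, 5, 5], [1, 0, 1]]
lam, mu = 2, -3
ZERO = [[0, 0, 0], [0, 0, 0]]

# One boolean per property, numbered as in the theorem.
checks = {
    "1. commutativity": add(A, B) == add(B, A),
    "2. associativity": add(A, add(B, C)) == add(add(A, B), C),
    "3. (lam + mu) A": scale(lam + mu, A) == add(scale(lam, A), scale(mu, A)),
    "4. lam (A + B)": scale(lam, add(A, B)) == add(scale(lam, A), scale(lam, B)),
    "5. (lam mu) A": scale(lam * mu, A) == scale(lam, scale(mu, A)),
    "6. zero matrix": add(A, ZERO) == A and add(ZERO, A) == A,
    "7. additive inverse": add(A, scale(-1, A)) == ZERO,
}
```

Integer entries are used so that the equality tests are exact; with floating-point entries the comparisons would need a tolerance.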

Definition 1.110 (Matrix multiplication)

If $$\bA$$ is an $$m \times n$$ matrix and $$\bB$$ is an $$n \times p$$ matrix (thus, $$\bA$$ has the same number of columns as $$\bB$$ has rows), then we define the product of $$\bA$$ and $$\bB$$ as:

$\bA \bB \triangleq \left ( \sum_{k=1}^n a_{ik} b_{kj} \right ).$

This binary operation is known as matrix multiplication. The product matrix has the shape $$m \times p$$. Its $$(i,j)$$-th element is $$\sum_{k=1}^n a_{ik} b_{kj}$$, obtained by multiplying the $$i$$-th row of $$\bA$$ with the $$j$$-th column of $$\bB$$ element by element and then summing the products.
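The defining sum translates directly into code. A minimal sketch, assuming matrices are lists of rows; `matmul` is a hypothetical helper name.

```python
def matmul(A, B):
    """Product of an m x n matrix A and an n x p matrix B.

    The (i, j)-th entry is sum_k a_ik * b_kj, as in the definition.
    """
    m, n, p = len(A), len(B), len(B[0])
    if len(A[0]) != n:
        raise ValueError("columns of A must match rows of B")
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
        for i in range(m)
    ]

A = [[1, 2, 3],
     [4, 5, 6]]      # 2 x 3
B = [[1, 0],
     [0, 1],
     [2, 2]]         # 3 x 2
C = matmul(A, B)     # 2 x 2: [[7, 8], [16, 17]]
```

For instance, the $$(1,1)$$ entry is $$1 \cdot 1 + 2 \cdot 0 + 3 \cdot 2 = 7$$.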

Theorem 1.36 (Properties of matrix multiplication)

Let $$\bA, \bB, \bC$$ be matrices of appropriate shape.

1. Matrix multiplication is associative: $$\bA (\bB \bC) = (\bA \bB)\bC$$.

2. Matrix multiplication distributes over matrix addition: $$\bA (\bB + \bC) = \bA \bB + \bA \bC$$ and $$(\bA + \bB) \bC = \bA \bC + \bB \bC$$.
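As a sanity check (not a proof), the sketch below verifies both properties on small integer matrices of compatible shapes. The helpers `matmul` and `add` are hypothetical and assume matrices stored as lists of rows.

```python
def matmul(A, B):
    """Ordinary matrix product of list-of-rows matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    """Entrywise sum of two matrices with the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[1, 2, 3], [4, 5, 6]]        # 2 x 3
B = [[1, 0], [0, 1], [2, 2]]      # 3 x 2
D = [[0, 3], [1, -1], [4, 0]]     # 3 x 2
C = [[2, 1], [-1, 5]]             # 2 x 2

checks = {
    "associativity": matmul(A, matmul(B, C)) == matmul(matmul(A, B), C),
    "left distributivity": matmul(A, add(B, D)) == add(matmul(A, B), matmul(A, D)),
    "right distributivity": matmul(add(B, D), C) == add(matmul(B, C), matmul(D, C)),
}
```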

## 1.9.3. Transpose

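The transpose of an $$m \times n$$ matrix $$\bA = (a_{ij})$$, denoted by $$\bA^T$$, is the $$n \times m$$ matrix whose $$(i, j)$$-th entry is $$a_{ji}$$. The Hermitian transpose, denoted by $$\bA^H$$, additionally conjugates every entry, so its $$(i, j)$$-th entry is $$\overline{a_{ji}}$$. For real matrices $$\bA^T = \bA^H$$.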
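A short sketch of both operations, taking the transpose to swap rows and columns and the Hermitian transpose to also conjugate each entry. The helper names `transpose` and `hermitian` are hypothetical, and matrices are assumed to be lists of rows.

```python
def transpose(A):
    """Return A^T: rows of A become columns."""
    return [list(col) for col in zip(*A)]

def hermitian(A):
    """Return A^H: transpose with every entry conjugated."""
    return [[x.conjugate() for x in col] for col in zip(*A)]

A = [[1, 2, 3],
     [4, 5, 6]]                 # real 2 x 3 matrix
Z = [[1 + 2j, 3],
     [0, 4 - 1j]]               # complex 2 x 2 matrix

At = transpose(A)               # 3 x 2
Zh = hermitian(Z)
```

Python's `int`, `float` and `complex` types all provide a `conjugate()` method, so `hermitian` agrees with `transpose` on real matrices.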

For statements which are valid both for real and complex matrices, sometimes we might say that matrices belong to $$\FF^{m \times n}$$ while the scalars belong to the field $$\FF$$ and vectors belong to $$\FF^n$$ where $$\FF$$ refers to either the field of real numbers or the field of complex numbers. Most results from matrix analysis are written only for $$\CC^{m \times n}$$ while still being applicable for $$\RR^{m \times n}$$.

The identity matrix in $$\FF^{n \times n}$$ is denoted by $$\bI_n$$, or simply $$\bI$$ whenever the size is clear from the context.

Sometimes we will write a matrix in terms of its column vectors. We will use the notation

$\bA = \begin{bmatrix} \ba_1 & \ba_2 & \dots & \ba_n \end{bmatrix}$

indicating $$n$$ columns.

When we write a matrix in terms of its row vectors, we will use the notation

$\begin{split} \bA = \begin{bmatrix} \ba_1^T \\ \ba_2^T \\ \vdots \\ \ba_m^T \end{bmatrix} \end{split}$

indicating $$m$$ rows with $$\ba_i$$ being column vectors whose transposes form the rows of $$\bA$$.
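The column and row views of a matrix can be sketched as follows, assuming a list-of-rows representation. The variable names are illustrative only.

```python
A = [[1, 2, 3],
     [4, 5, 6]]   # m = 2 rows, n = 3 columns

# Column view: n columns, each a vector with m components.
columns = [[row[j] for row in A] for j in range(len(A[0]))]

# Row view: m rows, each a vector with n components.
rows = [list(row) for row in A]

# Placing the columns side by side recovers the original matrix.
rebuilt = [[col[i] for col in columns] for i in range(len(A))]
```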

## 1.9.4. Symmetric Matrices

Definition 1.111 (Symmetric matrix)

A symmetric matrix is a matrix $$\bX \in \FF^{n \times n}$$ which satisfies $$\bX = \bX^T$$.

We define the set of real symmetric $$n\times n$$ matrices as

$\SS^n = \{\bX \in \RR^{n \times n} | \bX = \bX^T\}.$
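A minimal membership test for the condition $$\bX = \bX^T$$, assuming matrices are lists of rows. The helper name `is_symmetric` is hypothetical.

```python
def is_symmetric(X):
    """True if X is square and x_ij == x_ji for all i, j."""
    n = len(X)
    if any(len(row) != n for row in X):
        return False  # a non-square matrix cannot equal its transpose
    return all(X[i][j] == X[j][i] for i in range(n) for j in range(i + 1, n))

S = [[2, -1, 0],
     [-1, 2, -1],
     [0, -1, 2]]
N = [[1, 2],
     [3, 4]]
results = (is_symmetric(S), is_symmetric(N))
```

Only the entries above the diagonal need to be compared with their mirror images, since the diagonal entries always match themselves.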

## 1.9.5. Dot Products

The inner product or dot product of two column / row vectors $$\bu$$ and $$\bv$$ belonging to $$\RR^n$$ is defined as

(1.1) $\bu \cdot \bv = \langle \bu, \bv \rangle = \sum_{i=1}^n u_i v_i.$

The inner product or dot product of two column / row vectors $$\bu$$ and $$\bv$$ belonging to $$\CC^n$$ is defined as

(1.2) $\bu \cdot \bv = \langle \bu, \bv \rangle = \sum_{i=1}^n u_i \overline{v_i}.$
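The two formulas differ only in the conjugation of the second argument. A minimal sketch with vectors stored as Python lists; the names `dot_real` and `dot_complex` are hypothetical.

```python
def dot_real(u, v):
    """Real dot product: sum of u_i * v_i."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))

def dot_complex(u, v):
    """Complex inner product: sum of u_i * conj(v_i)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b.conjugate() for a, b in zip(u, v))

s_real = dot_real([1, 2, 3], [4, 5, 6])   # 1*4 + 2*5 + 3*6 = 32

u = [1 + 1j, 2]
v = [1j, 3]
s_complex = dot_complex(u, v)             # (1+1j)(-1j) + 2*3 = 7 - 1j
s_self = dot_complex(u, u)                # |1+1j|^2 + |2|^2 = 6
```

With this convention the inner product of a complex vector with itself is real and non-negative, as `s_self` illustrates.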

## 1.9.6. Block Matrices

Definition 1.112 (Block matrix)

A block matrix is a matrix whose entries are themselves matrices, subject to the following constraints:

• Entries in every row are matrices with same number of rows.

• Entries in every column are matrices with same number of columns.

Let $$\bA$$ be an $$m \times n$$ block matrix. Then

$\begin{split} \bA = \begin{bmatrix} \bA_{11} & \bA_{12} & \dots & \bA_{1 n}\\ \bA_{21} & \bA_{22} & \dots & \bA_{2 n}\\ \vdots & \vdots & \ddots & \vdots\\ \bA_{m 1} & \bA_{m 2} & \dots & \bA_{m n}\\ \end{bmatrix} \end{split}$

where $$\bA_{i j}$$ is a matrix with $$r_i$$ rows and $$c_j$$ columns.

A block matrix is also known as a partitioned matrix.

Example 1.23 ($$2 \times 2$$ block matrices)

Quite frequently we will be using $$2 \times 2$$ block matrices.

$\begin{split} \bP = \begin{bmatrix} \bP_{11} & \bP_{12} \\ \bP_{21} & \bP_{22} \end{bmatrix}. \end{split}$

For example, consider

$\begin{split} \bP = \left[ \begin{array}{c c | c} a & b & c \\ d & e & f \\ \hline g & h & i \end{array} \right]. \end{split}$

We have

$\begin{split} \bP_{11} = \begin{bmatrix} a & b \\ d & e \end{bmatrix}, \; \bP_{12} = \begin{bmatrix} c \\ f \end{bmatrix}, \; \bP_{21} = \begin{bmatrix} g & h \end{bmatrix}, \; \bP_{22} = \begin{bmatrix} i \end{bmatrix}. \end{split}$

• $$\bP_{11}$$ and $$\bP_{12}$$ have $$2$$ rows.

• $$\bP_{21}$$ and $$\bP_{22}$$ have $$1$$ row.

• $$\bP_{11}$$ and $$\bP_{21}$$ have $$2$$ columns.

• $$\bP_{12}$$ and $$\bP_{22}$$ have $$1$$ column.
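The partition in this example can be reproduced by slicing index ranges. This is an illustrative sketch with symbolic entries stored as strings; the helper name `block` is hypothetical.

```python
def block(M, rows, cols):
    """Submatrix of M restricted to the given row and column indices."""
    return [[M[i][j] for j in cols] for i in rows]

P = [["a", "b", "c"],
     ["d", "e", "f"],
     ["g", "h", "i"]]

# Row groups (2 rows, then 1 row) and column groups (2 columns, then 1 column).
top, bottom = range(0, 2), range(2, 3)
left, right = range(0, 2), range(2, 3)

P11 = block(P, top, left)
P12 = block(P, top, right)
P21 = block(P, bottom, left)
P22 = block(P, bottom, right)
```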

Lemma 1.1 (Shape of a block matrix)

Let $$\bA = [\bA_{ij}]$$ be an $$m \times n$$ block matrix with $$\bA_{ij}$$ being an $$r_i \times c_j$$ matrix. Then $$\bA$$ is an $$r \times c$$ matrix where

$r = \sum_{i=1}^m r_i$

and

$c = \sum_{j=1}^n c_j.$
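The lemma can be checked by flattening a grid of blocks into an ordinary matrix. A minimal sketch, assuming a block matrix is stored as a list of block rows, each block being a list of rows; `assemble` and `block_shape` are hypothetical helpers.

```python
def assemble(blocks):
    """Flatten an m x n grid of blocks into an ordinary matrix."""
    out = []
    for block_row in blocks:
        # Every block in this block row has the same number of rows r_i.
        for i in range(len(block_row[0])):
            out.append([x for blk in block_row for x in blk[i]])
    return out

def block_shape(blocks):
    """Return (r, c) with r = sum of r_i and c = sum of c_j."""
    r = sum(len(block_row[0]) for block_row in blocks)
    c = sum(len(blk[0]) for blk in blocks[0])
    return r, c

# A 2 x 2 block matrix with r = (2, 1) and c = (2, 1).
blocks = [
    [[[1, 2], [4, 5]], [[3], [6]]],
    [[[7, 8]],         [[9]]],
]
M = assemble(blocks)
r, c = block_shape(blocks)
```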

Sometimes it is convenient to think of a regular matrix as a block matrix whose entries are $$1 \times 1$$ matrices themselves.

Definition 1.113 (Multiplication of block matrices)

Let $$\bA = [\bA_{ij}]$$ be an $$m \times n$$ block matrix with $$\bA_{ij}$$ being a $$p_i \times q_j$$ matrix. Let $$\bB = [\bB_{jk}]$$ be an $$n \times p$$ block matrix with $$\bB_{jk}$$ being a $$q_j \times r_k$$ matrix.

Then the two block matrices are compatible for multiplication and their multiplication is defined by $$\bC = \bA \bB = [\bC_{i k}]$$ where

$\bC_{i k} = \sum_{j=1}^n \bA_{i j} \bB_{j k}$

and $$\bC_{i k}$$ is a $$p_i \times r_k$$ matrix.
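The sketch below computes the product block by block and compares it with the ordinary product of the flattened matrices. It assumes block matrices are stored as lists of block rows; all helper names (`matmul`, `add`, `assemble`, `block_matmul`) are hypothetical.

```python
def matmul(A, B):
    """Ordinary matrix product of list-of-rows matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    """Entrywise sum of two matrices with the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def assemble(blocks):
    """Flatten a grid of blocks into an ordinary matrix."""
    out = []
    for block_row in blocks:
        for i in range(len(block_row[0])):
            out.append([x for blk in block_row for x in blk[i]])
    return out

def block_matmul(A, B):
    """Blockwise product: C_ik = sum_j A_ij B_jk."""
    C = []
    for block_row in A:
        C_row = []
        for k in range(len(B[0])):
            acc = matmul(block_row[0], B[0][k])
            for j in range(1, len(B)):
                acc = add(acc, matmul(block_row[j], B[j][k]))
            C_row.append(acc)
        C.append(C_row)
    return C

# Both matrices are 3 x 3, partitioned with sizes (2, 1) along each axis.
A = [[[[1, 2], [4, 5]], [[3], [6]]],
     [[[7, 8]],         [[9]]]]
B = [[[[1, 0], [0, 1]], [[2], [1]]],
     [[[3, 0]],         [[1]]]]

blockwise = assemble(block_matmul(A, B))
direct = matmul(assemble(A), assemble(B))
```

The column partition of the first factor matches the row partition of the second, which is what makes every product $$\bA_{i j} \bB_{j k}$$ well defined.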

Definition 1.114 (Block diagonal matrix)

A block diagonal matrix is a block matrix whose off-diagonal entries are zero matrices.
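A block diagonal matrix can be built by placing the given blocks along the diagonal and padding with zeros. A minimal sketch, assuming blocks are lists of rows; `block_diag` is a hypothetical helper name.

```python
def block_diag(*blocks):
    """Place the given blocks on the diagonal; fill the rest with zeros."""
    total_cols = sum(len(b[0]) for b in blocks)
    out = []
    offset = 0  # column at which the current block starts
    for b in blocks:
        width = len(b[0])
        for row in b:
            padded = [0] * offset + list(row) + [0] * (total_cols - offset - width)
            out.append(padded)
        offset += width
    return out

D = block_diag([[1, 2], [3, 4]], [[5]])
```

Here `D` is the $$3 \times 3$$ matrix with the $$2 \times 2$$ block in the top-left corner, the $$1 \times 1$$ block in the bottom-right corner, and zeros elsewhere.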