2.2 Matrices

Matrices play a central role in linear algebra. They provide a compact way to represent systems of linear equations and also serve as representations of linear functions (or mappings).

Definition 2.9 A matrix is a rectangular array of numbers arranged in \(m\) rows and \(n\) columns.
Formally, a real-valued matrix \(\mathbf{A} \in \mathbb{R}^{m \times n}\) is:

\[ \mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix} \]

The notation \((\mathbf{A})_{ij}\) refers to the \(ij^{th}\) element of \(\mathbf{A}\). So, \((\mathbf{A})_{ij} = a_{ij}\). For example, if \[ \mathbf{A} = \begin{bmatrix} 2 & 4 \\ 1 & 3 \end{bmatrix}, \] then \((\mathbf{A})_{11} = 2, (\mathbf{A})_{12} = 4, (\mathbf{A})_{21} = 1\), and \((\mathbf{A})_{22} = 3\).

Matrices with one row are called row vectors, and those with one column are column vectors.
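Many of the computations in this chapter are easy to check on a computer. As a running illustration (our addition, not part of the text: a minimal sketch assuming Python with the NumPy library is available), the matrix from the example above can be built and indexed as follows. Note that NumPy counts from 0, so the math-notation entry \((\mathbf{A})_{11}\) is `A[0, 0]`.

```python
import numpy as np

# The 2 x 2 matrix A from the indexing example above.
A = np.array([[2, 4],
              [1, 3]])

print(A.shape)   # (2, 2): m = 2 rows, n = 2 columns
print(A[0, 0])   # 2, the entry (A)_{11}
print(A[0, 1])   # 4, the entry (A)_{12}
```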


2.2.1 Matrix Addition

Definition 2.10 For two matrices \(\mathbf{A}, \mathbf{B} \in \mathbb{R}^{m \times n}\), their sum and difference are defined element-wise: \[ (\mathbf{A} + \mathbf{B})_{ij} = a_{ij} + b_{ij} \;\;\;\;\;\;\;\;\; (\mathbf{A} - \mathbf{B})_{ij} = a_{ij} - b_{ij} \]

The result of matrix addition is another \(m \times n\) matrix.

Example 2.21 Let: \[ \mathbf{A} = \begin{bmatrix} 2 & 4 \\ 1 & 3 \end{bmatrix}, \quad \mathbf{B} = \begin{bmatrix} 5 & 0 \\ -2 & 1 \end{bmatrix}. \] Then \[ \mathbf{A} + \mathbf{B} = \begin{bmatrix} 2 + 5 & 4 + 0 \\ 1 + (-2) & 3 + 1 \end{bmatrix} = \begin{bmatrix} 7 & 4 \\ -1 & 4 \end{bmatrix} \] \[ \mathbf{A} - \mathbf{B} = \begin{bmatrix} 2 - 5 & 4 - 0 \\ 1 - (-2) & 3 - 1 \end{bmatrix} = \begin{bmatrix} -3 & 4 \\ 3 & 2 \end{bmatrix}. \]

Example 2.22 In the previous example, we can see that \[(\mathbf{A}+\mathbf{B})_{21} = -1 \quad \text{and} \quad (\mathbf{A}-\mathbf{B})_{22} = 2.\]


2.2.2 Matrix Multiplication

Definition 2.11 For matrices \(\mathbf{A} \in \mathbb{R}^{m \times n}\) and \(\mathbf{B} \in \mathbb{R}^{n \times k}\), their product \(\mathbf{C} = \mathbf{A}\mathbf{B} \in \mathbb{R}^{m \times k}\) is defined as: \[ c_{ij} = \sum_{l=1}^{n} a_{il} b_{lj} \]

That is, each element of \(\mathbf{C}\) is obtained by taking the dot product of the corresponding row of \(\mathbf{A}\) and column of \(\mathbf{B}\).

Matrix multiplication is only defined when the inner dimensions match (the number of columns of \(\mathbf{A}\) equals the number of rows of \(\mathbf{B}\)).

Matrix multiplication is not commutative, meaning \(\mathbf{A}\mathbf{B} \neq \mathbf{B}\mathbf{A}\) in general.

Example 2.23 Let \[ \mathbf{A} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad \mathbf{B} = \begin{bmatrix} 2 & 0 \\ 1 & 2 \end{bmatrix}. \]

Then \[ \mathbf{A} \mathbf{B} = \begin{bmatrix} 1(2) + 2(1) & 1(0) + 2(2) \\ 3(2) + 4(1) & 3(0) + 4(2) \end{bmatrix} = \begin{bmatrix} 4 & 4 \\ 10 & 8 \end{bmatrix}. \]

Example 2.24 Let \[ \mathbf{C} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}, \quad \mathbf{D} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}. \]

Then \[ \mathbf{C} \mathbf{D} = \begin{bmatrix} 1(1) + 2(3) + 3(5) & 1(2) + 2(4) + 3(6) \\ 4(1) + 5(3) + 6(5) & 4(2) + 5(4) + 6(6) \end{bmatrix} = \begin{bmatrix} 22 & 28 \\ 49 & 64 \end{bmatrix}. \]
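Continuing the NumPy sketch, the `@` operator performs matrix multiplication (not `*`, which is element-wise; see Section 2.2.4). Computing both \(\mathbf{A}\mathbf{B}\) and \(\mathbf{B}\mathbf{A}\) also makes the failure of commutativity concrete:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[2, 0], [1, 2]])

print(A @ B)   # [[ 4  4]
               #  [10  8]]   matches Example 2.23
print(B @ A)   # [[ 2  4]
               #  [ 7 10]]   different from A @ B, so AB != BA here

C = np.array([[1, 2, 3], [4, 5, 6]])     # 2 x 3
D = np.array([[1, 2], [3, 4], [5, 6]])   # 3 x 2, so C @ D is 2 x 2
print(C @ D)   # [[22 28]
               #  [49 64]]   matches Example 2.24
```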


2.2.3 Identity Matrix

Definition 2.12 The identity matrix \(\mathbf{I}_n \in \mathbb{R}^{n \times n}\) is a square matrix with 1’s on the diagonal and 0’s elsewhere: \[ \mathbf{I}_n = \begin{bmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \end{bmatrix} \]

The identity matrix satisfies: \[ \mathbf{I}_m \mathbf{A} = \mathbf{A} \mathbf{I}_n = \mathbf{A} \] for any matrix \(\mathbf{A} \in \mathbb{R}^{m \times n}\).

Example 2.25 Let \[ \mathbf{A} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} .\] Then \[\mathbf{A}\mathbf{I} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}= \begin{bmatrix} 1(1) + 2(0) & 1(0) + 2(1) \\ 3(1) + 4(0) & 3(0) + 4(1) \end{bmatrix}= \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}.\]
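In the NumPy sketch, `np.eye(n)` builds \(\mathbf{I}_n\), and multiplying by it on either side leaves \(\mathbf{A}\) unchanged:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
I = np.eye(2, dtype=int)   # the 2 x 2 identity matrix I_2

print(A @ I)   # [[1 2]
               #  [3 4]]   A I = A
print(I @ A)   # [[1 2]
               #  [3 4]]   I A = A
```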


2.2.4 Matrix Properties

Matrix addition and multiplication satisfy several important algebraic properties.

Lemma 2.2 For any matrices \(\mathbf{A}, \mathbf{B}, \mathbf{C}\), and \(\mathbf{D}\) with appropriate dimensions for addition/multiplication, the following properties hold.

  • Associativity:
    \[ (\mathbf{A}\mathbf{B})\mathbf{C} = \mathbf{A}(\mathbf{B}\mathbf{C}) \]
  • Distributivity:
    \[ (\mathbf{A} + \mathbf{B})\mathbf{C} = \mathbf{A}\mathbf{C} + \mathbf{B}\mathbf{C}, \quad \mathbf{A}(\mathbf{C} + \mathbf{D}) = \mathbf{A}\mathbf{C} + \mathbf{A}\mathbf{D} \]
  • Identity Property:
    \[ \mathbf{I}_m \mathbf{A} = \mathbf{A} \mathbf{I}_n = \mathbf{A} \]

Matrix multiplication is not element-wise. When multiplication is performed element by element, it is called the Hadamard product.
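NumPy makes this distinction explicit: `@` is the matrix product, while `*` is the Hadamard (element-wise) product. A sketch using the matrices of Example 2.23:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[2, 0], [1, 2]])

print(A @ B)   # matrix product:   [[ 4  4]
               #                    [10  8]]
print(A * B)   # Hadamard product: [[2 0]
               #                    [3 8]]
```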


2.2.5 Matrix Inverse

Definition 2.13 A square matrix \(\mathbf{A} \in \mathbb{R}^{n \times n}\) is invertible (or nonsingular) if there exists a matrix \(\mathbf{B} \in \mathbb{R}^{n \times n}\) such that \[ \mathbf{A}\mathbf{B} = \mathbf{I}_n = \mathbf{B}\mathbf{A} .\]
In this case, \(\mathbf{B}\) is called the inverse of \(\mathbf{A}\) and is denoted \(\mathbf{A}^{-1}\).

If no such matrix exists, \(\mathbf{A}\) is singular or noninvertible.

Example 2.26 Let \[ \mathbf{A} = \begin{bmatrix} 2 & 1 \\ 7 & 4 \end{bmatrix}, \quad \mathbf{B} = \begin{bmatrix} 4 & -1 \\ -7 & 2 \end{bmatrix}. \] We will show that \(\mathbf{A}\) and \(\mathbf{B}\) are inverses of each other by verifying that: \[ \mathbf{A} \mathbf{B} = \mathbf{B} \mathbf{A} = \mathbf{I} \] Notice that \[ \mathbf{A} \mathbf{B} = \begin{bmatrix} 2 & 1 \\ 7 & 4 \end{bmatrix} \begin{bmatrix} 4 & -1 \\ -7 & 2 \end{bmatrix} = \begin{bmatrix} (2)(4) + (1)(-7) & (2)(-1) + (1)(2) \\ (7)(4) + (4)(-7) & (7)(-1) + (4)(2) \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \] A similar calculation shows that \(\mathbf{B}\mathbf{A} = \mathbf{I}\).
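A quick numerical check of Example 2.26 in the NumPy sketch: both products come out to the identity.

```python
import numpy as np

A = np.array([[2, 1], [7, 4]])
B = np.array([[4, -1], [-7, 2]])

print(A @ B)   # [[1 0]
               #  [0 1]]
print(B @ A)   # [[1 0]
               #  [0 1]]   so B is the inverse of A
```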

Theorem 2.1 The inverse of a matrix \(\mathbf{A}\), when it exists, is unique.

For a 2×2 matrix
\[ \mathbf{A} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \] the inverse is
\[ \mathbf{A}^{-1} = \frac{1}{a_{11}a_{22} - a_{12}a_{21}} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}, \] provided \(a_{11}a_{22} - a_{12}a_{21} \neq 0\). The term \(a_{11}a_{22} - a_{12}a_{21}\) is the determinant of \(\mathbf{A}\).

Example 2.27 Let
\[ \mathbf{A} = \begin{bmatrix} 4 & 3 \\ 2 & 1 \end{bmatrix}. \] The determinant is: \[ \det(\mathbf{A}) = (4)(1) - (3)(2) = 4 - 6 = -2. \] Applying the formula gives us the inverse of \(\mathbf{A}\): \[ \mathbf{A}^{-1} = \frac{1}{-2} \begin{bmatrix} 1 & -3 \\ -2 & 4 \end{bmatrix} = \begin{bmatrix} -\tfrac{1}{2} & \tfrac{3}{2} \\ 1 & -2 \end{bmatrix}. \]
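The 2×2 formula is short enough to implement directly. A sketch (the helper name `inverse_2x2` is our own, not a standard routine): it reproduces Example 2.27 and agrees with NumPy's general-purpose `np.linalg.inv`.

```python
import numpy as np

def inverse_2x2(A):
    """Invert a 2 x 2 matrix via the determinant formula."""
    a, b = A[0, 0], A[0, 1]
    c, d = A[1, 0], A[1, 1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return (1 / det) * np.array([[d, -b], [-c, a]])

A = np.array([[4, 3], [2, 1]])
print(inverse_2x2(A))      # [[-0.5  1.5]
                           #  [ 1.  -2. ]]
print(np.linalg.inv(A))    # same result
```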

Lemma 2.3 For square non-singular matrices \(\mathbf{A}\) and \(\mathbf{B}\), the following are true:

  1. \(\mathbf{A}\mathbf{A}^{-1} = \mathbf{I} = \mathbf{A}^{-1}\mathbf{A}\),
  2. \((\mathbf{A}\mathbf{B})^{-1} = \mathbf{B}^{-1}\mathbf{A}^{-1}\),
  3. \((\mathbf{A} + \mathbf{B})^{-1} \neq \mathbf{A}^{-1} + \mathbf{B}^{-1}\) in general.

Example 2.28 Let \[ \mathbf{A} = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}, \quad \mathbf{B} = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}. \] Then, using the 2×2 inverse formula, \[ \mathbf{A}^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & \tfrac{1}{2} \end{bmatrix}, \quad \mathbf{B}^{-1} = \begin{bmatrix} \tfrac{1}{2} & 0 \\ 0 & 1 \end{bmatrix}. \] So, their sum is \[ \mathbf{A}^{-1} + \mathbf{B}^{-1} = \begin{bmatrix} 1 + \tfrac{1}{2} & 0 \\ 0 & \tfrac{1}{2} + 1 \end{bmatrix} = \begin{bmatrix} \tfrac{3}{2} & 0 \\ 0 & \tfrac{3}{2} \end{bmatrix}. \]

On the other hand, \[ \mathbf{A} + \mathbf{B} = \begin{bmatrix} 1 + 2 & 0 \\ 0 & 2 + 1 \end{bmatrix} = \begin{bmatrix} 3 & 0 \\ 0 & 3 \end{bmatrix}. \] Using the inverse formula again, we get: \[ (\mathbf{A} + \mathbf{B})^{-1} = \begin{bmatrix} \tfrac{1}{3} & 0 \\ 0 & \tfrac{1}{3} \end{bmatrix}. \] Comparing, we see \[ (\mathbf{A} + \mathbf{B})^{-1} = \begin{bmatrix} \tfrac{1}{3} & 0 \\ 0 & \tfrac{1}{3} \end{bmatrix} \quad \text{vs.} \quad \mathbf{A}^{-1} + \mathbf{B}^{-1} = \begin{bmatrix} \tfrac{3}{2} & 0 \\ 0 & \tfrac{3}{2} \end{bmatrix}. \] Clearly, \[ (\mathbf{A} + \mathbf{B})^{-1} \neq \mathbf{A}^{-1} + \mathbf{B}^{-1}. \]
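The same comparison runs in a few lines of the NumPy sketch:

```python
import numpy as np

A = np.array([[1., 0.], [0., 2.]])
B = np.array([[2., 0.], [0., 1.]])

lhs = np.linalg.inv(A + B)                 # [[1/3  0 ] [ 0  1/3]]
rhs = np.linalg.inv(A) + np.linalg.inv(B)  # [[3/2  0 ] [ 0  3/2]]
print(np.allclose(lhs, rhs))               # False
```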


2.2.6 Matrix Transpose

Definition 2.14 The transpose of \(\mathbf{A} \in \mathbb{R}^{m \times n}\) is \(\mathbf{A}^\top \in \mathbb{R}^{n \times m}\), obtained by interchanging rows and columns.

Example 2.29 Let
\[ \mathbf{A} = \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \end{bmatrix}. \] Matrix \(\mathbf{A}\) is a \(2 \times 3\) matrix (2 rows, 3 columns). The transpose of \(\mathbf{A}\), denoted \(\mathbf{A}^\top\), is formed by turning rows into columns: \[ \mathbf{A}^\top = \begin{bmatrix} 1 & 2 \\ 4 & 5 \\ 7 & 8 \end{bmatrix}. \] Now \(\mathbf{A}^\top\) is a \(3 \times 2\) matrix (3 rows, 2 columns).

Lemma 2.4 For \(\mathbf{A} \in \mathbb{R}^{m \times n}\), the following properties hold, where \(\mathbf{B} \in \mathbb{R}^{n \times p}\) in property 2 and \(\mathbf{B} \in \mathbb{R}^{m \times n}\) in property 3:

  1. \((\mathbf{A}^\top)^\top = \mathbf{A}\),
  2. \((\mathbf{A}\mathbf{B})^\top = \mathbf{B}^\top \mathbf{A}^\top\),
  3. \((\mathbf{A} + \mathbf{B})^\top = \mathbf{A}^\top + \mathbf{B}^\top\).
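In the NumPy sketch, `A.T` is the transpose, and properties 2 and 3 of Lemma 2.4 can be spot-checked numerically (a check on one example, not a proof):

```python
import numpy as np

A = np.array([[1, 4, 7], [2, 5, 8]])    # 2 x 3, as in Example 2.29
B = np.array([[1, 2], [3, 4], [5, 6]])  # 3 x 2, so A @ B is defined
E = np.array([[9, 8, 7], [6, 5, 4]])    # 2 x 3, so A + E is defined

print(A.T.shape)                              # (3, 2)
print(np.array_equal((A @ B).T, B.T @ A.T))   # True: (AB)^T = B^T A^T
print(np.array_equal((A + E).T, A.T + E.T))   # True: (A + E)^T = A^T + E^T
```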

Example 2.30 Prove that for any two matrices \(\mathbf{A}\) and \(\mathbf{B}\) of the same size,
\[ (\mathbf{A} + \mathbf{B})^\top = \mathbf{A}^\top + \mathbf{B}^\top. \]

Proof: Let \(\mathbf{A} = [a_{ij}]\) and \(\mathbf{B} = [b_{ij}]\) be \(m \times n\) matrices.
Then the sum \(\mathbf{A} + \mathbf{B}\) is defined element-wise as: \[ (\mathbf{A} + \mathbf{B})_{ij} = a_{ij} + b_{ij}. \] The transpose of a matrix swaps its rows and columns. So, the \((i, j)\)-th entry of \((\mathbf{A} + \mathbf{B})^\top\) is: \[ (\mathbf{A} + \mathbf{B})^\top_{ij} = (\mathbf{A} + \mathbf{B})_{ji} = a_{ji} + b_{ji}. \]

Now, consider \(\mathbf{A}^\top + \mathbf{B}^\top\). The \((i, j)\)-th entry of this matrix is: \[ (\mathbf{A}^\top + \mathbf{B}^\top)_{ij} = \mathbf{A}^\top_{ij} + \mathbf{B}^\top_{ij} = a_{ji} + b_{ji}. \]

Since the corresponding entries are equal for all \(i, j\), \[ (\mathbf{A} + \mathbf{B})^\top = \mathbf{A}^\top + \mathbf{B}^\top. \]


2.2.7 Symmetric Matrices

Definition 2.15 A matrix \(\mathbf{A} \in \mathbb{R}^{n \times n}\) is symmetric if \(\mathbf{A} = \mathbf{A}^\top\).

Only square matrices can be symmetric. If \(\mathbf{A}\) is invertible, then \(\mathbf{A}^\top\) is also invertible and
\[ (\mathbf{A}^{-1})^\top = (\mathbf{A}^\top)^{-1}. \] The sum of symmetric matrices is symmetric, but their product generally is not.

Example 2.31 The matrix \(\mathbf{A} \in \mathbb{R}^{4 \times 4}\) below is symmetric: \[ \mathbf{A} = \begin{bmatrix} 2 & 1 & 0 & -1 \\ 1 & 3 & 4 & 2 \\ 0 & 4 & 5 & 3 \\ -1 & 2 & 3 & 6 \end{bmatrix}. \] Notice that since \(\mathbf{A}\) is symmetric, we have \((\mathbf{A})_{ij} = (\mathbf{A})_{ji}\).
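Symmetry is a one-line check in the NumPy sketch: compare \(\mathbf{A}\) with its transpose.

```python
import numpy as np

A = np.array([[ 2, 1, 0, -1],
              [ 1, 3, 4,  2],
              [ 0, 4, 5,  3],
              [-1, 2, 3,  6]])

print(np.array_equal(A, A.T))   # True: A is symmetric
```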


2.2.8 Scalar Multiplication

Matrices scale the same way that vectors do.

Definition 2.16 For \(\mathbf{A} \in \mathbb{R}^{m \times n}\) and \(\lambda \in \mathbb{R}\), scalar multiplication is defined componentwise as: \[ (\lambda \mathbf{A})_{ij} = \lambda a_{ij} \]

Example 2.32 Let \[ \mathbf{A} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad \text{and let the scalar } k = 3. \] Then the scalar multiplication \(k \cdot \mathbf{A}\) is: \[ 3 \cdot \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 3 & 6 \\ 9 & 12 \end{bmatrix}. \]

Lemma 2.5 For \(\mathbf{A}, \mathbf{B} \in \mathbb{R}^{m \times n}\) and scalars \(\lambda, \psi \in \mathbb{R}\): \[ (\lambda + \psi)\mathbf{A} = \lambda \mathbf{A} + \psi \mathbf{A}, \quad \lambda(\mathbf{A} + \mathbf{B}) = \lambda \mathbf{A} + \lambda \mathbf{B}. \]
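Scalar multiplication in the NumPy sketch is just `*` with a number, and both identities of Lemma 2.5 can be spot-checked on an example:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 0], [-2, 1]])
lam, psi = 3.0, -1.5

print(3 * A)   # [[ 3  6]
               #  [ 9 12]]   matches Example 2.32
print(np.allclose((lam + psi) * A, lam * A + psi * A))   # True
print(np.allclose(lam * (A + B), lam * A + lam * B))     # True
```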


2.2.9 Compact Form of Linear Systems

A system of linear equations such as \[ \begin{aligned} 2x_1 + 3x_2 + 5x_3 &= 1 \\ 4x_1 - 2x_2 - 7x_3 &= 8 \\ 9x_1 + 5x_2 - 3x_3 &= 2 \end{aligned} \] can be written in matrix form as
\[ \mathbf{A}\mathbf{x} = \mathbf{b}, \quad \text{where} \quad \mathbf{A} = \begin{bmatrix} 2 & 3 & 5 \\ 4 & -2 & -7 \\ 9 & 5 & -3 \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 1 \\ 8 \\ 2 \end{bmatrix}. \] The product \(\mathbf{A}\mathbf{x}\) is a linear combination of the columns of \(\mathbf{A}\) with the entries of \(\mathbf{x}\) as weights, so the system asks which such combination equals \(\mathbf{b}\).
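Writing a system as \(\mathbf{A}\mathbf{x} = \mathbf{b}\) is also how it is handed to numerical software. A sketch using NumPy's `np.linalg.solve` on the system above (which assumes \(\mathbf{A}\) is invertible):

```python
import numpy as np

A = np.array([[2,  3,  5],
              [4, -2, -7],
              [9,  5, -3]], dtype=float)
b = np.array([1, 8, 2], dtype=float)

x = np.linalg.solve(A, b)     # solves A x = b for x
print(np.allclose(A @ x, b))  # True: the solution checks out
```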

Example 2.33 Consider the system of equations: \[ \begin{cases} 2x + 3y = 5 \\ 4x - y = 1 \end{cases}. \] We can write this in matrix form as: \[ \begin{bmatrix} 2 & 3 \\ 4 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 5 \\ 1 \end{bmatrix}. \]

Example 2.34 Consider the matrix equation: \[ \begin{bmatrix} 1 & 2 & -1 \\ 3 & 0 & 4 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 7 \\ 5 \end{bmatrix}. \] This corresponds to the system of equations: \[ \begin{cases} x + 2y - z = 7 \\ 3x + 0 \cdot y + 4z = 5 \end{cases}. \]

Exercises

Exercise 2.17 Compute \(\begin{bmatrix} 2 & -3\\ 1 & 0 \\ -1 & 3\end{bmatrix} + \begin{bmatrix} 9 & -5 \\ 0 & 13 \\ -1 & 3\end{bmatrix}\).

Exercise 2.18 Compute \(\begin{bmatrix} 2 & -3\\ 1 & 0 \\ -1 & 3\end{bmatrix} - \begin{bmatrix} 9 & -5 \\ 0 & 13 \\ -1 & 3\end{bmatrix}\).

Exercise 2.19 Compute \(\begin{bmatrix} 3 & 6\\2 & 4\end{bmatrix} \begin{bmatrix} 1 & 3\\0 & 2\end{bmatrix}\).

Exercise 2.20 Compute \(\begin{bmatrix} 1&2&3\\4&3&2\end{bmatrix} \begin{bmatrix} 2&3\\3&4\\1&2\end{bmatrix}\).

Exercise 2.21 The textbook talks about \(\mathbf{A}\mathbf{B} = \mathbf{C}\), where \(c_{ij} = \sum_{l=1}^n a_{il}b_{lj}\). Use this definition to find \(c_{12}\) if \(\mathbf{A} = \begin{bmatrix} 1&2&3\\4&3&2\end{bmatrix}\) and \(\mathbf{B} = \begin{bmatrix} 2&3\\3&4\\1&2\end{bmatrix}\).

Exercise 2.22 Is \(\begin{bmatrix} 1&-2\\2&5\end{bmatrix}\) the inverse of \(\begin{bmatrix} 5&-2\\2&-1\end{bmatrix}\)?

Exercise 2.23 Prove that the inverse of a matrix \(\mathbf{A}\) is unique (if it exists).

Exercise 2.24 Show that if \(\mathbf{A}^{-1}\) exists, then \(\det(\mathbf{A}) \neq 0\).

Exercise 2.25 Find \(\mathbf{A}^{-1}\) and \(\mathbf{A}^\top\) for \(\begin{bmatrix} 2&-1\\-4&3\end{bmatrix}\).

Exercise 2.26 Find \(\mathbf{A}^{-1}\) and \(\mathbf{A}^\top\) for \(\begin{bmatrix} 3&4/3\\-3&-1\end{bmatrix}\).

Exercise 2.27 Show that, for a 2×2 matrix, \((\mathbf{A}^\top)^{-1} = (\mathbf{A}^{-1})^\top\).

Exercise 2.28 Show that the sum of symmetric matrices is symmetric.

Exercise 2.29 Find an example where the product of symmetric matrices is not symmetric.

Exercise 2.30 Prove that if \(\mathbf{A}\) and \(\mathbf{B}\) are square matrices with \(\mathbf{A}\mathbf{B} = \mathbf{I}\), then \(\mathbf{B}\mathbf{A} = \mathbf{I}\).

Exercise 2.31 Prove Lemma 2.3.

Exercise 2.32 Prove Lemma 2.4.

Exercise 2.33 Prove Lemma 2.5.