4.6 Matrix Approximation via SVD

The SVD of a matrix \(\mathbf{A} \in \mathbb{R}^{m \times n}\): \[ \mathbf{A} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^\top \] allows us to represent \(\mathbf{A}\) as a sum of rank-1 matrices: \[ \mathbf{A}_i := \mathbf{u}_i \mathbf{v}_i^\top, \quad \mathbf{A} = \sum_{i=1}^r \sigma_i \mathbf{A}_i, \] where \(r = \text{rank}(\mathbf{A})\), \(\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_r > 0\) are the nonzero singular values, and \(\mathbf{u}_i\) and \(\mathbf{v}_i\) are the \(i\)-th columns of \(\mathbf{U}\) and \(\mathbf{V}\).
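
To see this decomposition concretely, here is a minimal Python/NumPy sketch (the library choice and the test matrix are ours, for illustration only) that rebuilds \(\mathbf{A}\) from its rank-1 outer products:

```python
import numpy as np

# An arbitrary matrix for illustration.
A = np.array([[3.0, 2.0, 2.0],
              [2.0, 3.0, -2.0]])

# np.linalg.svd returns U, the singular values s (descending), and V^T.
U, s, Vt = np.linalg.svd(A)

# Rebuild A as the sum of sigma_i * u_i v_i^T over i = 1, ..., r.
r = np.linalg.matrix_rank(A)
A_rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(r))

print(np.allclose(A, A_rebuilt))  # True
```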

Definition 4.10 A rank-\(k\) approximation of \(\mathbf{A}\) (with \(k < r\)) is: \[ \mathbf{A}^{(k)} = \sum_{i=1}^k \sigma_i \mathbf{u}_i \mathbf{v}_i^\top. \]

A rank-\(k\) approximation reduces storage and computation costs compared to the original matrix: storing \(\mathbf{A}^{(k)}\) takes only \(k(m + n + 1)\) numbers (\(k\) left singular vectors, \(k\) right singular vectors, and \(k\) singular values). For example, a \(1432 \times 1910\) image approximated at rank 5 requires \(5(1432 + 1910 + 1) = 16{,}715\) numbers instead of \(1432 \times 1910 = 2{,}735{,}120\), about 0.6% of the original.
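
As a sketch of how this plays out in code (NumPy again, with the image dimensions from the text; the helper name rank_k_approx is ours):

```python
import numpy as np

def rank_k_approx(A, k):
    """Truncated SVD: keep the k largest singular values/vectors."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Storage comparison for the image-sized example from the text.
m, n, k = 1432, 1910, 5
full_storage = m * n                 # 2,735,120 numbers
truncated_storage = k * (m + n + 1)  # 16,715 numbers
print(truncated_storage / full_storage)  # ~0.0061, i.e. about 0.6%
```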

Example 4.22 Write \(\mathbf{A}\) as a sum of rank-1 matrices, where \[ \mathbf{A} = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}. \]

Step 1: Singular Value Decomposition

Since \(\mathbf{A}\) is diagonal with positive entries, its SVD is especially simple: \[ \mathbf{A} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^\top, \] where \[ \mathbf{U} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad \boldsymbol{\Sigma} = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}, \quad \mathbf{V} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \]

The singular values are \[ \sigma_1 = 3, \quad \sigma_2 = 1. \]

Step 2: Rank-1 Decomposition

The SVD gives the expansion \[ \mathbf{A} = \sum_{i=1}^2 \sigma_i \mathbf{u}_i \mathbf{v}_i^\top, \] where \(\mathbf{u}_i\) and \(\mathbf{v}_i\) are the \(i\)-th columns of \(\mathbf{U}\) and \(\mathbf{V}\), respectively.

Step 3: Individual Rank-1 Matrices

  • First term: \[ \sigma_1 \mathbf{u}_1 \mathbf{v}_1^\top = 3 \begin{bmatrix} 1 \\ 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \end{bmatrix} = \begin{bmatrix} 3 & 0 \\ 0 & 0 \end{bmatrix}. \]

  • Second term: \[ \sigma_2 \mathbf{u}_2 \mathbf{v}_2^\top = 1 \begin{bmatrix} 0 \\ 1 \end{bmatrix} \begin{bmatrix} 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}. \]

Each matrix has rank 1.

Step 4: Sum of Rank-1 Matrices

Adding the two rank-1 matrices: \[ \mathbf{A} = \begin{bmatrix} 3 & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}. \]

Each term \(\sigma_i \mathbf{u}_i \mathbf{v}_i^\top\) is a rank-1 matrix, so the SVD expresses \(\mathbf{A}\) as a weighted sum of rank-1 outer products. Truncating this sum after \(k\) terms gives the best rank-\(k\) approximation of \(\mathbf{A}\), both in the spectral norm (Theorem 4.8 below) and in the Frobenius (least-squares) sense.
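
A quick numerical check of this example, assuming NumPy as in the earlier sketches:

```python
import numpy as np

A = np.array([[3.0, 0.0],
              [0.0, 1.0]])
U, s, Vt = np.linalg.svd(A)

# The two rank-1 terms sigma_i * u_i v_i^T from Steps 3 and 4.
term1 = s[0] * np.outer(U[:, 0], Vt[0, :])
term2 = s[1] * np.outer(U[:, 1], Vt[1, :])

print(term1)                          # [[3. 0.] [0. 0.]]
print(term2)                          # [[0. 0.] [0. 1.]]
print(np.allclose(A, term1 + term2))  # True
```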

4.6.1 Error Measurement

The spectral norm of a matrix \(\mathbf{A}\) is: \[ \|\mathbf{A}\|_2 := \max_{\mathbf{x} \neq \mathbf{0}} \frac{\|\mathbf{A}\mathbf{x}\|_2}{\|\mathbf{x}\|_2}. \] The spectral norm of \(\mathbf{A}\) equals its largest singular value \(\sigma_1\).
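
This equality is easy to spot-check numerically. The following sketch (matrix chosen arbitrarily, NumPy assumed) compares the built-in matrix 2-norm with \(\sigma_1\) and samples random vectors \(\mathbf{x}\):

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[3.0, 0.0],
              [4.0, 5.0]])

# sigma_1 equals NumPy's built-in spectral (2-)norm.
sigma_1 = np.linalg.svd(A, compute_uv=False)[0]
print(np.isclose(np.linalg.norm(A, 2), sigma_1))  # True

# No sampled vector exceeds the ratio ||Ax|| / ||x|| <= sigma_1.
x = rng.standard_normal((2, 1000))
ratios = np.linalg.norm(A @ x, axis=0) / np.linalg.norm(x, axis=0)
print(ratios.max() <= sigma_1 + 1e-12)  # True
```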

Theorem 4.8 (Eckart-Young Theorem) Let \(\mathbf{A}^{(k)}\) be the rank-\(k\) approximation of Definition 4.10. Then \(\mathbf{A}^{(k)}\) is the best rank-\(k\) approximation to \(\mathbf{A}\) in the spectral norm: \[ \mathbf{A}^{(k)} = \arg \min_{\text{rank}(\mathbf{B}) \leq k} \|\mathbf{A} - \mathbf{B}\|_2, \qquad \|\mathbf{A} - \mathbf{A}^{(k)}\|_2 = \sigma_{k+1}. \]

The SVD provides the best low-rank approximation in the spectral norm sense. This is used for lossy compression, dimensionality reduction, noise filtering, and regularization.
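The theorem can also be spot-checked numerically. This sketch (random test matrix, NumPy assumed) compares the truncation error with \(\sigma_{k+1}\) and with the error of an arbitrary competing rank-\(k\) matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# The truncation error equals sigma_{k+1} (index k, zero-based) ...
print(np.isclose(np.linalg.norm(A - A_k, 2), s[k]))  # True

# ... and a random competing rank-k matrix can only do worse.
B = rng.standard_normal((6, k)) @ rng.standard_normal((k, 4))
print(np.linalg.norm(A - B, 2) >= s[k])  # True
```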

Example 4.23 Consider the matrix \[ \mathbf{A} = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}. \]

Since \(\mathbf{A}\) is diagonal, its singular values are simply the diagonal entries in descending order: \[ \sigma_1 = 3, \quad \sigma_2 = 1. \] We can write the SVD as \[ \mathbf{A} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^\top \] with \[ \mathbf{U} = \mathbf{V} = \mathbf{I}_2, \quad \boldsymbol{\Sigma} = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}. \]

The rank-1 approximation of \(\mathbf{A}\) is obtained by keeping only the largest singular value \(\sigma_1 = 3\): \[ \mathbf{A}^{(1)} = \sigma_1 \mathbf{u}_1 \mathbf{v}_1^\top = 3 \begin{bmatrix} 1 \\ 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \end{bmatrix} = \begin{bmatrix} 3 & 0 \\ 0 & 0 \end{bmatrix}. \]

The Eckart-Young Theorem states that this rank-1 approximation \(\mathbf{A}^{(1)}\) is optimal: no rank-1 matrix comes closer to \(\mathbf{A}\) in the spectral norm, and the error is \[ \|\mathbf{A} - \mathbf{A}^{(1)}\|_2 = \sigma_2 = 1. \]

Indeed: \[ \mathbf{A} - \mathbf{A}^{(1)} = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} 3 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}. \] The largest singular value of this difference is 1, which matches \(\sigma_2\).

Therefore, \(\mathbf{A}^{(1)}\) captures the most significant component of \(\mathbf{A}\), and the approximation error, measured in the spectral norm, is exactly the next singular value \(\sigma_2 = 1\).
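
The same check in code, assuming NumPy: the spectral norm of the residual equals \(\sigma_2\).

```python
import numpy as np

A = np.array([[3.0, 0.0],
              [0.0, 1.0]])
A_1 = np.array([[3.0, 0.0],
                [0.0, 0.0]])  # the rank-1 truncation A^(1) from the example

# Residual spectral norm matches sigma_2 = 1, as Eckart-Young predicts.
print(np.linalg.norm(A - A_1, 2))  # 1.0
```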


Exercises

Exercise 4.38 Find an SVD of \[\mathbf{A} = \begin{bmatrix}3&0\\4&5 \end{bmatrix}.\] Then find the rank-1 approximation of \(\mathbf{A}\).

Exercise 4.39 Find an SVD of \[\mathbf{A} = \begin{bmatrix}3&2&2\\2&3&-2 \end{bmatrix}.\] Then find the rank-1 approximation of \(\mathbf{A}\).

Exercise 4.40 Find an SVD of \[\mathbf{A} = \begin{bmatrix}4&0\\3&-5 \end{bmatrix}.\] Then find the rank-1 approximation of \(\mathbf{A}\).

Exercise 4.41 Find an SVD of \[\mathbf{A} = \begin{bmatrix}1&0&1\\-1&1&0 \end{bmatrix}.\] Then find the rank-1 approximation of \(\mathbf{A}\).

Exercise 4.42 Find an SVD of \[\mathbf{A} = \begin{bmatrix}1&-1\\0&1\\1&0 \end{bmatrix}.\] Then find the rank-1 approximation of \(\mathbf{A}\).

Exercise 4.43 Find an SVD of \[\mathbf{A} = \begin{bmatrix}1&1&1\\ -1&0&-2\\ 1&2&0 \end{bmatrix}.\] Then find the rank-1 approximation of \(\mathbf{A}\).