3.8 Orthogonal Projections
Definition 3.11 A projection \(\pi: V \to U\) satisfies \(\pi^2 = \pi\). The corresponding projection matrix \(\mathbf{P}_\pi\) also satisfies \(\mathbf{P}_\pi^2 = \mathbf{P}_\pi\).
Projections are linear transformations that map vectors in a vector space \(V\) onto a subspace \(U \subseteq V\). Applying the projection twice does not change the result — once a vector has been projected, it already lies in the subspace.
Formally, for any \(\mathbf{x} \in V\): \[ \pi(\pi(\mathbf{x})) = \pi(\mathbf{x}) \quad \text{or equivalently} \quad \mathbf{P}_\pi^2 = \mathbf{P}_\pi. \]
Example 3.20
Let \(\mathbf{B} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}\). Then the projection \(\mathbf{P}\), where
\[
\mathbf{P} = \mathbf{B}(\mathbf{B}^\top \mathbf{B})^{-1} \mathbf{B}^\top =
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},
\]
projects any vector in \(\mathbb{R}^2\) onto the x-axis. For example:
\[\mathbf{P}\begin{bmatrix} a \\ b \end{bmatrix} =\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} a \\ 0 \end{bmatrix}.\] Applying the projection twice leaves the result unchanged:
\[\mathbf{P} \mathbf{P}\begin{bmatrix} a \\ b \end{bmatrix} = \mathbf{P} \begin{bmatrix} a \\ 0 \end{bmatrix}= \begin{bmatrix} a \\ 0 \end{bmatrix}.\]
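The same computation is easy to verify numerically. The following is a minimal sketch assuming NumPy is available (the variable names are illustrative): it builds \(\mathbf{P} = \mathbf{B}(\mathbf{B}^\top\mathbf{B})^{-1}\mathbf{B}^\top\) and checks the idempotence property \(\mathbf{P}^2 = \mathbf{P}\).

```python
import numpy as np

# Basis of the subspace U (the x-axis) as the single column of B.
B = np.array([[1.0],
              [0.0]])

# Projection matrix P = B (B^T B)^{-1} B^T.
P = B @ np.linalg.inv(B.T @ B) @ B.T
print(P)                          # [[1. 0.]
                                  #  [0. 0.]]

x = np.array([3.0, -2.0])         # an arbitrary vector [a, b]
print(P @ x)                      # [3. 0.]  -- keeps only the x-component
print(np.allclose(P @ P, P))      # True: applying P twice changes nothing
```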
Example 3.21 Let \[ U = \text{span}\!\left(\begin{bmatrix} 1 \\ 1 \end{bmatrix}\right), \quad \mathbf{B} = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix}. \] Then \[ \mathbf{P} = \mathbf{B} \mathbf{B}^\top = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}. \] For \(\mathbf{x} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}\), we have \[ \mathbf{P} \mathbf{x} = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} =\frac{1}{2}\begin{bmatrix} 3 \\ 3 \end{bmatrix} = \begin{bmatrix} 1.5 \\ 1.5 \end{bmatrix}. \] Hence, \(\mathbf{P} \mathbf{x}\) is the orthogonal projection of \(\mathbf{x}\) onto the line \(y = x\). Again, notice that \(\mathbf{P}\mathbf{P}\mathbf{x} = \mathbf{P}\mathbf{x}\).
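Example 3.21 can be checked the same way; here is a short sketch (again assuming NumPy) using the rank-one form \(\mathbf{P} = \mathbf{B}\mathbf{B}^\top\), which applies because \(\mathbf{B}\) is a unit vector.

```python
import numpy as np

b = np.array([1.0, 1.0]) / np.sqrt(2)    # unit vector spanning U
P = np.outer(b, b)                       # P = b b^T = [[0.5, 0.5], [0.5, 0.5]]

x = np.array([2.0, 1.0])
print(P @ x)                             # [1.5 1.5]
print(np.allclose(P @ (P @ x), P @ x))   # True: P P x = P x
```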
3.8.1 Projection onto a Line
Definition 3.12 For a line \(U = \text{span}(\mathbf{b})\) with \(\mathbf{b} \neq \mathbf{0}\), the projection of \(\mathbf{x} \in \mathbb{R}^n\) onto \(U\) is \[ \pi_U(\mathbf{x}) = \frac{\mathbf{b}^T \mathbf{x}}{\mathbf{b}^T \mathbf{b}} \mathbf{b} = \frac{\mathbf{b}^T \mathbf{x}}{\|\mathbf{b}\|^2} \mathbf{b}= \frac{\langle \mathbf{b}, \mathbf{x} \rangle}{\|\mathbf{b}\|^2} \mathbf{b}, \] with projection matrix \[ \mathbf{P}_\pi = \frac{\mathbf{b} \mathbf{b}^T}{\|\mathbf{b}\|^2}. \]
Several different notations are in common use for the projection onto a line, so you may encounter any of the forms shown in the definition.
Example 3.22 Let
\[
\mathbf{u} = \begin{bmatrix} 3 \\ 4 \end{bmatrix},
\qquad
\mathbf{v} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}.
\]
The projection of \(\mathbf{u}\) onto \(\mathbf{v}\) is: \[ \pi_{\operatorname{span}(\mathbf{v})}(\mathbf{u}) = \text{proj}_{\mathbf{v}}(\mathbf{u}) = \frac{\langle \mathbf{u}, \mathbf{v} \rangle}{\lVert \mathbf{v} \rVert^2} \mathbf{v}. \] We have \[ \langle \mathbf{u},\mathbf{v}\rangle = 3(1) + 4(2) = 11, \] and \[ \lVert \mathbf{v} \rVert^2 = 1^2 + 2^2 = 5. \] Therefore, \[ \text{proj}_{\mathbf{v}}(\mathbf{u}) = \frac{11}{5} \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 11/5 \\ 22/5 \end{bmatrix}. \]
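A quick numerical cross-check of this example (a minimal sketch assuming NumPy):

```python
import numpy as np

u = np.array([3.0, 4.0])
v = np.array([1.0, 2.0])

proj = (u @ v) / (v @ v) * v   # <u, v> / ||v||^2 * v
print(proj)                    # [2.2 4.4], i.e. [11/5, 22/5]
```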
If \(\|\mathbf{b}\| = 1\), then \(\pi_U(\mathbf{x}) = (\mathbf{b}^T \mathbf{x}) \mathbf{b}\).
Example 3.23 Let
\[
\mathbf{u} = \begin{bmatrix} 6 \\ 2 \\ -1 \end{bmatrix},
\qquad
\mathbf{v} = \frac{1}{\sqrt{14}}
\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}.
\]
Since \(\mathbf{v}\) is a unit vector,
\[ \text{proj}_{\mathbf{v}}(\mathbf{u}) = \langle \mathbf{u}, \mathbf{v} \rangle \, \mathbf{v}. \]
We know that \[ \langle \mathbf{u}, \mathbf{v} \rangle = \left\langle \begin{bmatrix} 6 \\ 2 \\ -1 \end{bmatrix}, \frac{1}{\sqrt{14}} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \right\rangle = \frac{1}{\sqrt{14}} (6(1) + 2(2) - 1(3)) \]
\[ = \frac{1}{\sqrt{14}} (6 + 4 - 3) = \frac{7}{\sqrt{14}}, \]
and \[ \text{proj}_{\mathbf{v}}(\mathbf{u}) = \frac{7}{\sqrt{14}} \cdot \frac{1}{\sqrt{14}} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \frac{7}{14} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}. \]
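The unit-vector shortcut is equally easy to check numerically (a minimal sketch assuming NumPy):

```python
import numpy as np

u = np.array([6.0, 2.0, -1.0])
v = np.array([1.0, 2.0, 3.0]) / np.sqrt(14)   # unit vector

proj = (u @ v) * v      # proj_v(u) = <u, v> v, since ||v|| = 1
print(proj)             # [0.5 1.  1.5], i.e. (1/2) [1, 2, 3]
```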
3.8.2 Projection onto General Subspaces
For a subspace \(U = \text{span}(\mathbf{b}_1, ..., \mathbf{b}_m)\) with basis matrix \(\mathbf{B} = [\mathbf{b}_1, ..., \mathbf{b}_m]\) (so the \(\mathbf{b}_i\) are linearly independent and \(\mathbf{B}^T \mathbf{B}\) is invertible): \[ \pi_U(\mathbf{x}) = \mathbf{B}(\mathbf{B}^T \mathbf{B})^{-1} \mathbf{B}^T \mathbf{x}, \] with projection matrix \[ \mathbf{P}_\pi = \mathbf{B}(\mathbf{B}^T \mathbf{B})^{-1} \mathbf{B}^T. \] The normal equation \(\mathbf{B}^T \mathbf{B} \mathbf{\lambda} = \mathbf{B}^T \mathbf{x}\) gives the coordinates \(\mathbf{\lambda}\) of the projection with respect to the basis \(\{\mathbf{b}_i\}\). If the \(\mathbf{b}_i\) form an ONB, the formula simplifies to \[ \pi_U(\mathbf{x}) = \mathbf{B} \mathbf{B}^T \mathbf{x}. \] This is computationally efficient since \(\mathbf{B}^T \mathbf{B} = \mathbf{I}\).
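In code, it is numerically preferable to solve the normal equations rather than form \((\mathbf{B}^T \mathbf{B})^{-1}\) explicitly. The helper below is an illustrative sketch (the name `project_onto` is ours, and NumPy is assumed); Example 3.24 that follows can be reproduced with it.

```python
import numpy as np

def project_onto(B, x):
    """Project x onto the span of the columns of B (assumed linearly independent).

    Solves the normal equations B^T B lam = B^T x, then returns B @ lam.
    """
    lam = np.linalg.solve(B.T @ B, B.T @ x)
    return B @ lam
```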
Example 3.24 Project the vector \(\mathbf{x} = \begin{bmatrix}6\\0\\0\end{bmatrix}\in\mathbb{R}^3\) onto the subspace \[ U=\operatorname{span}\{\mathbf{b}_1,\mathbf{b}_2\},\qquad \mathbf{b}_1=\begin{bmatrix}1\\1\\1\end{bmatrix},\quad \mathbf{b}_2=\begin{bmatrix}0\\1\\2\end{bmatrix}. \]
Step 1 — form the basis matrix
\[ \mathbf{B} = [\,\mathbf{b}_1\ \mathbf{b}_2\,] = \begin{bmatrix} 1 & 0\\[4pt] 1 & 1\\[4pt] 1 & 2 \end{bmatrix}\in\mathbb{R}^{3\times 2}. \]
Step 2 — normal equations \(\mathbf{B}^\top \mathbf{B} \mathbf{\lambda} = \mathbf{B}^\top \mathbf{x}\)
Compute \[ \mathbf{B}^\top \mathbf{B} = \begin{bmatrix} 1 & 1 & 1\\ 0 & 1 & 2 \end{bmatrix} \begin{bmatrix} 1 & 0\\ 1 & 1\\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 3 & 3\\[4pt] 3 & 5 \end{bmatrix}, \qquad \mathbf{B}^\top \mathbf{x} = \begin{bmatrix} 1 & 1 & 1\\ 0 & 1 & 2 \end{bmatrix} \begin{bmatrix}6\\0\\0\end{bmatrix} = \begin{bmatrix}6\\0\end{bmatrix}. \]
Solve \((\mathbf{B}^\top \mathbf{B}) \mathbf{\lambda} = \mathbf{B}^\top \mathbf{x}\): \[ \begin{bmatrix}3 & 3\\[4pt]3 & 5\end{bmatrix} \begin{bmatrix}\lambda_1\\[4pt]\lambda_2\end{bmatrix} = \begin{bmatrix}6\\[4pt]0\end{bmatrix}. \] We can compute the inverse (or solve the system directly): \[ (\mathbf{B}^\top \mathbf{B})^{-1} = \frac{1}{(3)(5)-(3)(3)} \begin{bmatrix}5 & -3\\[4pt]-3 & 3\end{bmatrix} = \frac{1}{6}\begin{bmatrix}5 & -3\\[4pt]-3 & 3\end{bmatrix}. \] Thus \[ \mathbf{\lambda} = (\mathbf{B}^\top \mathbf{B})^{-1}\mathbf{B}^\top \mathbf{x} = \frac{1}{6}\begin{bmatrix}5 & -3\\[4pt]-3 & 3\end{bmatrix}\begin{bmatrix}6\\[4pt]0\end{bmatrix} = \frac{1}{6}\begin{bmatrix}30\\[4pt]-18\end{bmatrix} = \begin{bmatrix}5\\[4pt]-3\end{bmatrix}. \]
Step 3 — projection point \(\pi_U(\mathbf{x}) = \mathbf{B}\mathbf{\lambda}\)
\[ \pi_U(\mathbf{x}) = \mathbf{B} \mathbf{\lambda} = \begin{bmatrix} 1 & 0\\[4pt] 1 & 1\\[4pt] 1 & 2 \end{bmatrix} \begin{bmatrix}5\\[4pt]-3\end{bmatrix} = \begin{bmatrix} 5\\[4pt] 5+(-3)\\[4pt] 5+2(-3) \end{bmatrix} = \begin{bmatrix} 5\\[4pt]2\\[4pt]-1 \end{bmatrix}. \]
Step 4 — projection matrix (optional)
The projection matrix onto \(U\) is \[ \mathbf{P} = \mathbf{B}(\mathbf{B}^\top \mathbf{B})^{-1}\mathbf{B}^\top = \begin{bmatrix} 1 & 0\\[4pt] 1 & 1\\[4pt] 1 & 2 \end{bmatrix} \frac{1}{6}\begin{bmatrix}5 & -3\\[4pt]-3 & 3\end{bmatrix} \begin{bmatrix} 1 & 1 & 1\\[4pt] 0 & 1 & 2 \end{bmatrix}\;=\; \frac{1}{6} \begin{bmatrix} 5 & 2 & -1\\[6pt] 2 & 2 & 2\\[6pt] -1 & 2 & 5 \end{bmatrix}, \] and one can verify \(\mathbf{P}\mathbf{x}=\pi_U(\mathbf{x})\).
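The entire worked example can be reproduced in a few lines (a minimal sketch assuming NumPy); the coordinates \(\mathbf{\lambda}\), the projected point, and \(\mathbf{P}\mathbf{x}\) all match the hand computation.

```python
import numpy as np

B = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
x = np.array([6.0, 0.0, 0.0])

lam = np.linalg.solve(B.T @ B, B.T @ x)   # normal equations
print(lam)                                # [ 5. -3.]

P = B @ np.linalg.inv(B.T @ B) @ B.T      # projection matrix (1/6) [[5,2,-1],[2,2,2],[-1,2,5]]
print(B @ lam)                            # [ 5.  2. -1.]
print(P @ x)                              # [ 5.  2. -1.]  -- same point
```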
3.8.3 Gram-Schmidt Orthogonalization
The Gram-Schmidt algorithm constructs an orthogonal (or orthonormal) basis from any basis \(\{\mathbf{b}_1, ..., \mathbf{b}_n\}\): \[ \mathbf{u}_1 = \mathbf{b}_1, \quad \mathbf{u}_k = \mathbf{b}_k - \pi_{\text{span}(\mathbf{u}_1, ..., \mathbf{u}_{k-1})}(\mathbf{b}_k), \quad k = 2, \dots, n. \] After orthogonalization, normalize each \(\mathbf{u}_k\) to obtain an ONB.
Example 3.25 In \(\mathbb{R}^2\), find an ONB given \(\mathbf{b}_1 = \begin{bmatrix}2 \\ 0\end{bmatrix}\) and \(\mathbf{b}_2 = \begin{bmatrix}1 \\ 1\end{bmatrix}\).
Set \(\mathbf{u}_1 = \mathbf{b}_1\). Then \[ \mathbf{u}_2 = \mathbf{b}_2 - \text{proj}_{\mathbf{u}_1}(\mathbf{b}_2) = \mathbf{b}_2 - \dfrac{\langle \mathbf{u}_1, \mathbf{b}_2 \rangle}{\|\mathbf{u}_1 \|^2} \mathbf{u}_1 = \begin{bmatrix}1 \\ 1\end{bmatrix} - \dfrac{2}{4} \begin{bmatrix}2 \\ 0\end{bmatrix} = \begin{bmatrix}0 \\ 1\end{bmatrix}. \] We normalize to obtain the ONB: \[\mathbf{v}_1 = \dfrac{1}{2}\begin{bmatrix}2 \\ 0\end{bmatrix} = \begin{bmatrix}1 \\ 0\end{bmatrix}, \qquad \mathbf{v}_2 = \begin{bmatrix}0 \\ 1 \end{bmatrix}.\]
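A compact implementation of the procedure, applied to this example, is sketched below (assuming NumPy; the function name `gram_schmidt` is illustrative).

```python
import numpy as np

def gram_schmidt(vectors):
    """Return an orthonormal basis for the span of the given vectors."""
    basis = []
    for b in vectors:
        u = b.astype(float).copy()
        for v in basis:              # subtract the projection onto each earlier direction
            u = u - (v @ b) * v
        norm = np.linalg.norm(u)
        if norm > 1e-12:             # skip vectors that are (numerically) dependent
            basis.append(u / norm)
    return basis

b1 = np.array([2.0, 0.0])
b2 = np.array([1.0, 1.0])
print(gram_schmidt([b1, b2]))        # [array([1., 0.]), array([0., 1.])]
```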
3.8.4 Projection onto Affine Subspaces
For an affine subspace \(L = \mathbf{x}_0 + U\): \[ \pi_L(\mathbf{x}) = \mathbf{x}_0 + \pi_U(\mathbf{x} - \mathbf{x}_0). \] The distance from \(\mathbf{x}\) to \(L\) equals the distance from \(\mathbf{x} - \mathbf{x}_0\) to \(U\): \[ d(\mathbf{x}, L) = \|\mathbf{x} - \pi_L(\mathbf{x})\| = \|\mathbf{x} - \mathbf{x}_0 - \pi_U(\mathbf{x} - \mathbf{x}_0)\|. \] This concept is fundamental to support vector machines and hyperplane separation in later chapters.
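The shift, project, and shift-back pattern translates directly into code. Below is a minimal sketch (assuming NumPy; the function name `project_onto_affine` and the numbers are illustrative only):

```python
import numpy as np

def project_onto_affine(B, x0, x):
    """Project x onto the affine subspace L = x0 + span(columns of B)."""
    lam = np.linalg.solve(B.T @ B, B.T @ (x - x0))   # project the shifted vector x - x0 onto U
    return x0 + B @ lam

# Example: a line through x0 = (1, 1, 0) with direction (0, 0, 1).
B  = np.array([[0.0], [0.0], [1.0]])
x0 = np.array([1.0, 1.0, 0.0])
x  = np.array([3.0, 2.0, 5.0])

p = project_onto_affine(B, x0, x)
print(p)                          # [1. 1. 5.]
print(np.linalg.norm(x - p))      # d(x, L) = sqrt(5) ≈ 2.236
```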
Exercises
Exercise 3.50
Find the scalar projection (the component), vector projection, and orthogonal projection of \(\mathbf{u} = \begin{bmatrix}2\\-3\\1 \end{bmatrix}\) onto \(\mathbf{d} =\begin{bmatrix}1 \\ -2 \\ 3\end{bmatrix}\).
Exercise 3.51
Find the scalar projection (the component), vector projection, and orthogonal projection of \(\mathbf{u} = \begin{bmatrix}3\\-4\\5 \end{bmatrix}\) onto \(\mathbf{d} =\begin{bmatrix}1 \\ 0 \\ 3\end{bmatrix}\).
Exercise 3.52
Find the scalar projection (the component), vector projection, and orthogonal projection of \(\mathbf{u} = \begin{bmatrix}5\\7\\1 \end{bmatrix}\) onto \(\mathbf{d} = \begin{bmatrix}2\\-1\\3\end{bmatrix}\).
Exercise 3.53
Find the scalar projection (the component), vector projection, and orthogonal projection of \(\mathbf{u} = \begin{bmatrix}3\\-2\\1 \end{bmatrix}\) onto \(\mathbf{d} = \begin{bmatrix}4\\1\\1\end{bmatrix}\).
Exercise 3.54
Find the shortest Euclidean distance from the point \(P(1,3,-2)\) to the line through \(P_0(2,0,-1)\) with direction \(\mathbf{d} = \begin{bmatrix}1\\-1\\0\end{bmatrix}\).
Exercise 3.55
Given \(B=\{\mathbf{u}_1,\mathbf{u}_2\}\), where \(\mathbf{u}_1=\begin{bmatrix} 1\\1\end{bmatrix}\) and \(\mathbf{u}_2=\begin{bmatrix}2\\1\end{bmatrix}\), use the Gram-Schmidt procedure to find a corresponding orthonormal basis.
Exercise 3.56
Given \(B=\{\mathbf{u}_1,\mathbf{u}_2\}\), where \(\mathbf{u}_1=\begin{bmatrix} 1\\0\end{bmatrix}\) and \(\mathbf{u}_2=\begin{bmatrix}1\\1\end{bmatrix}\), use the Gram-Schmidt procedure to find a corresponding orthonormal basis.
Exercise 3.57
Given \(B=\{\mathbf{u}_1,\mathbf{u}_2,\mathbf{u}_3\}\), where \(\mathbf{u}_1=\begin{bmatrix} 1\\2\\0\end{bmatrix}, \mathbf{u}_2=\begin{bmatrix}8\\1\\-6\end{bmatrix}\), and \(\mathbf{u}_3 =\begin{bmatrix}0\\0\\1\end{bmatrix}\), use the Gram-Schmidt procedure to find a corresponding orthonormal basis.
Exercise 3.58
Given \(B=\{\mathbf{u}_1,\mathbf{u}_2, \mathbf{u}_3\}\), where \(\mathbf{u}_1=\begin{bmatrix} 1\\1\\1\\1\end{bmatrix}, \mathbf{u}_2=\begin{bmatrix}1\\1\\-1\\-1\end{bmatrix}\), and \(\mathbf{u}_3 =\begin{bmatrix}0\\-1\\2\\1\end{bmatrix}\), use the Gram-Schmidt procedure to find a corresponding orthonormal basis.
Exercise 3.59
Verify that \(\{\begin{bmatrix} -2\\1\\3\\0\end{bmatrix},\begin{bmatrix}0\\-3\\1\\-6\end{bmatrix}, \begin{bmatrix}-2\\-4\\0\\2\end{bmatrix}\}\) is an orthogonal set of vectors in \(\mathbb{R}^4\), and use it to construct an orthonormal basis in \(\mathbb{R}^4\).
Exercise 3.60
Obtain an orthonormal basis for the subspace of \(\mathbb{R}^4\) spanned by \(\{\begin{bmatrix}1\\0\\1\\0\end{bmatrix},\begin{bmatrix}1\\1\\1\\1\end{bmatrix}, \begin{bmatrix}-1\\2\\0\\1\end{bmatrix}\}\).