2.3 Solving Systems of Equations
We can represent a system of linear equations as
\[
\begin{aligned}
a_{11}x_1 + \dots + a_{1n}x_n &= b_1 \\
\vdots \quad\quad \quad\quad \quad\quad \vdots \quad & \quad \vdots \\
a_{m1}x_1 + \dots + a_{mn}x_n &= b_m
\end{aligned}
\]
where \(a_{ij}, b_i \in \mathbb{R}\) are known constants and \(x_j\) are unknowns. This can be written compactly in matrix form as
\[
\mathbf{A}\mathbf{x} = \mathbf{b}.
\]
Matrices allow for a compact representation and straightforward manipulation of linear systems. Next, we focus on finding solutions to these systems and introduce the idea of a matrix inverse.
2.3.1 Particular and General Solutions
Consider a system with fewer equations than unknowns. Such systems often have infinitely many solutions. To solve such a system of equations, follow these steps:
- Find a particular solution to \(\mathbf{A}\mathbf{x} = \mathbf{b}\).
- Find all homogeneous solutions to \(\mathbf{A}\mathbf{x} = \mathbf{0}\).
- Combine the results:
\[ \mathbf{x} = \mathbf{x}_p + \mathbf{x}_h, \] where \(\mathbf{x}_h\) ranges over all homogeneous solutions. Note that neither the particular solution nor this representation of the general solution is unique.
Example 2.35 \[ \begin{bmatrix} 1 & 0 & 8 & -4 \\ 0 & 1 & 2 & 12 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 42 \\ 8 \end{bmatrix}. \]
From inspection, we can find one particular solution: \[ \mathbf{x}_p = \begin{bmatrix} 42 \\ 8 \\ 0 \\ 0 \end{bmatrix}. \]
To find all possible solutions, we add every solution of \(\mathbf{A}\mathbf{x} = \mathbf{0}\) to the particular solution. Let \(c_i\) denote column \(i\) of \(\mathbf{A}\). By inspection, \[c_3 = 8\begin{bmatrix}1\\0 \end{bmatrix} + 2\begin{bmatrix}0 \\ 1 \end{bmatrix} = 8c_1 + 2 c_2, \qquad c_4 = -4c_1 + 12c_2,\] or equivalently \[ \begin{aligned} 8c_1 + 2c_2 - c_3 &= 0, \\ -4c_1 + 12c_2 - c_4 &= 0, \end{aligned} \] which yield two homogeneous solutions: \[ \mathbf{x}_1 = \begin{bmatrix} 8 \\ 2 \\ -1 \\ 0 \end{bmatrix}, \quad \mathbf{x}_2 = \begin{bmatrix} -4 \\ 12 \\ 0 \\ -1 \end{bmatrix}. \]
Thus, the general solution is \[ \mathbf{x} = \begin{bmatrix} 42 \\ 8 \\ 0 \\ 0 \end{bmatrix} + \lambda_1 \begin{bmatrix} 8 \\ 2 \\ -1 \\ 0 \end{bmatrix} + \lambda_2 \begin{bmatrix} -4 \\ 12 \\ 0 \\ -1 \end{bmatrix}, \quad \lambda_1, \lambda_2 \in \mathbb{R}. \]
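The solution of Example 2.35 can be checked numerically. The following sketch uses NumPy (an assumption, not part of the text) to verify that the particular solution plus any combination of the homogeneous solutions satisfies \(\mathbf{A}\mathbf{x} = \mathbf{b}\):

```python
import numpy as np

# System from Example 2.35.
A = np.array([[1., 0., 8., -4.],
              [0., 1., 2., 12.]])
b = np.array([42., 8.])

x_p = np.array([42., 8., 0., 0.])     # particular solution
x_h1 = np.array([8., 2., -1., 0.])    # homogeneous solutions
x_h2 = np.array([-4., 12., 0., -1.])

# Any x = x_p + l1*x_h1 + l2*x_h2 solves A x = b.
for l1, l2 in [(0., 0.), (1., -2.), (3.5, 0.25)]:
    x = x_p + l1 * x_h1 + l2 * x_h2
    assert np.allclose(A @ x, b)
```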
In general, systems of equations are not as simple as the example above. To solve any linear system, we can use Gaussian elimination, an algorithmic procedure that:
- Applies elementary row transformations to simplify the system, and
- Transforms it into a form where the three steps above can be applied directly.
This provides a systematic way to find all solutions to \(\mathbf{A}\mathbf{x} = \mathbf{b}\).
2.3.2 Elementary Transformations
Definition 2.17 Elementary transformations are operations that simplify a system of linear equations without changing its solution set.
These are:
- Row exchange: Swap two equations (rows).
- Row scaling: Multiply an equation (row) by a nonzero scalar \(\lambda \in \mathbb{R} \setminus \{0\}\).
- Row addition: Add a multiple of one equation (row) to another.
These operations are key to transforming a system into simpler forms, such as row-echelon form (REF) or reduced row-echelon form (RREF).
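As a sketch, the three elementary row operations can be written as small helper functions operating on an augmented matrix \([\mathbf{A} \mid \mathbf{b}]\); the names `swap_rows`, `scale_row`, and `add_multiple` are illustrative, not standard:

```python
import numpy as np

def swap_rows(M, i, j):
    """Row exchange: swap rows i and j."""
    M = M.copy()
    M[[i, j]] = M[[j, i]]
    return M

def scale_row(M, i, lam):
    """Row scaling: multiply row i by a nonzero scalar lam."""
    assert lam != 0
    M = M.copy()
    M[i] = lam * M[i]
    return M

def add_multiple(M, i, j, lam):
    """Row addition: add lam times row j to row i."""
    M = M.copy()
    M[i] = M[i] + lam * M[j]
    return M

# Each operation preserves the solution set of the underlying system.
M = np.array([[2., 4., 6.],
              [1., 1., 1.]])
M = swap_rows(M, 0, 1)
M = scale_row(M, 1, 0.5)        # row 2 <- row 2 / 2
M = add_multiple(M, 1, 0, -1.)  # row 2 <- row 2 - row 1
```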
Example 2.36 Given a system of equations, \[\begin{align*} -2x_1 + 4x_2 - 2x_3 - x_4 + 4x_5 &= -3\\ 4x_1 - 8x_2 + 3x_3 - 3x_4 + x_5 &= 2\\ x_1 - 2x_2 + x_3 - x_4 + x_5 &= 0\\ x_1 - 2x_2 - 3x_4 + 4x_5 &= a \end{align*}\] we write it in augmented matrix form: \[ [\mathbf{A} | \mathbf{b}] = \begin{bmatrix} -2 & 4 & -2 & -1 & 4 & -3 \\ 4 & -8 & 3 & -3 & 1 & 2 \\ 1 & -2 & 1 & -1 & 1 & 0 \\ 1 & -2 & 0 & -3 & 4 & a \end{bmatrix} \]
By applying elementary row operations, we obtain a simpler (row-echelon) form: \[ \begin{bmatrix} 1 & -2 & 1 & -1 & 1 & 0 \\ 0 & 0 & 1 & -1 & 3 & -2 \\ 0 & 0 & 0 & 1 & -2 & 1 \\ 0 & 0 & 0 & 0 & 0 & a + 1 \end{bmatrix} \]
The resulting equations are: \[ \begin{aligned} x_1 - 2x_2 + x_3 - x_4 + x_5 &= 0 \\ x_3 - x_4 + 3x_5 &= -2 \\ x_4 - 2x_5 &= 1 \\ 0 &= a + 1 \end{aligned} \]
A particular solution exists only when \(a = -1\): \[ \mathbf{x}_p = \begin{bmatrix} 2 \\ 0 \\ -1 \\ 1 \\ 0 \end{bmatrix} \]
The general solution is: \[ \mathbf{x} = \begin{bmatrix} 2 \\ 0 \\ -1 \\ 1 \\ 0 \end{bmatrix} + \lambda_1 \begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} + \lambda_2 \begin{bmatrix} 2 \\ 0 \\ -1 \\ 2 \\ 1 \end{bmatrix}, \quad \lambda_1, \lambda_2 \in \mathbb{R}. \]
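The particular and homogeneous solutions of Example 2.36 can be verified numerically (a NumPy sketch, taking the right-hand side from the augmented matrix and setting \(a = -1\)):

```python
import numpy as np

# Coefficient matrix and right-hand side of Example 2.36 with a = -1.
A = np.array([[-2., 4., -2., -1., 4.],
              [4., -8., 3., -3., 1.],
              [1., -2., 1., -1., 1.],
              [1., -2., 0., -3., 4.]])
b = np.array([-3., 2., 0., -1.])

x_p = np.array([2., 0., -1., 1., 0.])  # particular solution
h1 = np.array([2., 1., 0., 0., 0.])    # homogeneous solutions
h2 = np.array([2., 0., -1., 2., 1.])

assert np.allclose(A @ x_p, b)
assert np.allclose(A @ h1, 0)
assert np.allclose(A @ h2, 0)
```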
Definition 2.18 A matrix is in row-echelon form if:
- All-zero rows are at the bottom.
- Each pivot (first nonzero entry from the left) is strictly to the right of the pivot in the row above.
This produces a “staircase” structure.
Definition 2.19 Basic variables correspond to pivot columns. Free variables correspond to non-pivot columns.
In the previous example: \(x_1, x_3, x_4\) are basic; \(x_2, x_5\) are free.
Definition 2.20 A system is in reduced row-echelon form if:
- It is in REF.
- Each pivot is 1.
- Each pivot is the only nonzero entry in its column.
RREF makes it easy to:
- Identify basic and free variables.
- Read off particular and general solutions.
Gaussian elimination is an algorithm that systematically applies elementary transformations to bring a system into (reduced) row-echelon form, from which the solutions of \(\mathbf{A}\mathbf{x} = \mathbf{b}\) or \(\mathbf{A}\mathbf{x} = \mathbf{0}\) can be read off directly.
Example 2.37 \[ \mathbf{A} = \begin{bmatrix} 1 & 3 & 0 & 0 & 3 \\ 0 & 0 & 1 & 0 & 9 \\ 0 & 0 & 0 & 1 & -4 \end{bmatrix} \]
Pivot columns: 1, 3, and 4 (basic variables).
Non-pivot columns: 2 and 5 (free variables).
The general solution to \(\mathbf{A}\mathbf{x} = \mathbf{0}\) is: \[ \mathbf{x} = \lambda_1 \begin{bmatrix} 3 \\ -1 \\ 0 \\ 0 \\ 0 \end{bmatrix} + \lambda_2 \begin{bmatrix} 3 \\ 0 \\ 9 \\ -4 \\ -1 \end{bmatrix}, \quad \lambda_1, \lambda_2 \in \mathbb{R}. \]
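A quick numerical check of Example 2.37 (a NumPy sketch) confirms that both vectors lie in the null space of \(\mathbf{A}\) and are linearly independent:

```python
import numpy as np

A = np.array([[1., 3., 0., 0., 3.],
              [0., 0., 1., 0., 9.],
              [0., 0., 0., 1., -4.]])
v1 = np.array([3., -1., 0., 0., 0.])
v2 = np.array([3., 0., 9., -4., -1.])

assert np.allclose(A @ v1, 0)
assert np.allclose(A @ v2, 0)
# Two linearly independent vectors span the 2-dimensional null space
# (5 unknowns minus 3 pivots = 2 free variables).
assert np.linalg.matrix_rank(np.column_stack([v1, v2])) == 2
```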
2.3.3 The Minus-1 Trick
For a homogeneous system \(\mathbf{A}\mathbf{x} = \mathbf{0}\) in RREF:
- Extend \(\mathbf{A}\) to an \(n \times n\) matrix \(\tilde{\mathbf{A}}\) by inserting, for each non-pivot column, a row of the form \((0, \dots, 0, -1, 0, \dots, 0)\) with the \(-1\) on the diagonal, so that the diagonal of \(\tilde{\mathbf{A}}\) contains only pivots (1) and \(-1\) entries.
- The columns of \(\tilde{\mathbf{A}}\) that contain a \(-1\) on the diagonal form a basis of the solution space (the kernel, or null space) of \(\mathbf{A}\mathbf{x} = \mathbf{0}\).
Example 2.38 For \[ \mathbf{A} = \begin{bmatrix} 1 & 3 & 0 & 0 & 3 \\ 0 & 0 & 1 & 0 & 9 \\ 0 & 0 & 0 & 1 & -4 \end{bmatrix}, \] we use the -1 trick to get \[ \tilde{\mathbf{A}} = \begin{bmatrix} 1 & 3 & 0 & 0 & 3 \\ 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 9 \\ 0 & 0 & 0 & 1 & -4 \\ 0 & 0 & 0 & 0 & -1 \end{bmatrix}. \] Columns 2 and 5 (the columns containing a \(-1\)) form a basis of the solution space of the homogeneous system.
The columns with −1 yield the same solution vectors as before.
Example 2.39 Consider the homogeneous system: \[ \begin{cases} x + 2y - z = 0 \\ 2x + 4y - 2z = 0. \end{cases} \] We can write this as an augmented matrix: \[ \left[\begin{array}{ccc|c} 1 & 2 & -1 & 0 \\ 2 & 4 & -2 & 0 \end{array}\right]. \] Row reduction gives us \[ \left[\begin{array}{ccc|c} 1 & 2 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right] \] We rewrite with the -1 trick: \[ \left[\begin{array}{ccc|c} 1 & 2 & -1 & 0 \\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ \end{array}\right] \] Hence the general (homogeneous) solution is:
\[ \mathbf{x} = s\begin{bmatrix} 2 \\ -1 \\ 0 \end{bmatrix} + t\begin{bmatrix} -1 \\ 0 \\ -1 \end{bmatrix}, \quad s,t \in \mathbb{R}. \]
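The minus-1 trick can be sketched as a small function; `minus_one_trick` is an illustrative name, and the code assumes its input is in reduced row-echelon form with all zero rows removed:

```python
import numpy as np

def minus_one_trick(R):
    """Return a null-space basis of R (RREF, no zero rows) via the -1 trick."""
    m, n = R.shape
    # Pivot column of each row: index of its first nonzero entry.
    pivots = [int(np.argmax(row != 0)) for row in R]
    free = [j for j in range(n) if j not in pivots]
    # Place each RREF row so its pivot lands on the diagonal,
    # then put -1 on the remaining diagonal positions.
    A_tilde = np.zeros((n, n))
    A_tilde[pivots, :] = R
    A_tilde[free, free] = -1.0
    # The columns containing the -1 entries span the null space.
    return A_tilde[:, free]

R = np.array([[1., 2., -1.]])  # RREF row from Example 2.39
basis = minus_one_trick(R)
for v in basis.T:
    assert np.allclose(R @ v, 0)
```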
2.3.4 Calculating an Inverse Matrix via Gaussian Elimination
Lemma 2.6 To find \(\mathbf{A}^{-1}\) for \(\mathbf{A} \in \mathbb{R}^{n \times n}\), use Gaussian elimination to transform: \[ [\mathbf{A} | \mathbf{I}_n] \;\Rightarrow\; [\mathbf{I}_n | \mathbf{A}^{-1}]. \]
Example 2.40 \[ \mathbf{A} = \begin{bmatrix} 1 & 0 & 2 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 2 & 0 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}. \] Append the identity matrix. \[ [\mathbf{A}|\mathbf{I}_4] = \begin{bmatrix} 1 & 0 & 2 & 0 &1&0&0&0\\ 1 & 1 & 0 & 0 &0&1&0&0\\ 1 & 2 & 0 & 1 &0&0&1&0\\ 1 & 1 & 1 & 1 &0&0&0&1 \end{bmatrix}. \]
After elimination: \[ [\mathbf{I}_4|\mathbf{A}^{-1}] = \begin{bmatrix} 1&0&0&0& -1 & 2 & -2 & 2 \\ 0&1&0&0& 1 & -1 & 2 & -2 \\ 0&0&1&0& 1 & -1 & 1 & -1 \\ 0&0&0&1& -1 & 0 & -1 & 2 \end{bmatrix}. \] \[ \mathbf{A}^{-1} = \begin{bmatrix} -1 & 2 & -2 & 2 \\ 1 & -1 & 2 & -2 \\ 1 & -1 & 1 & -1 \\ -1 & 0 & -1 & 2 \end{bmatrix}. \]
Verification: You should check that \[ \mathbf{A} \mathbf{A}^{-1} = \mathbf{I}_4. \]
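Lemma 2.6 can be sketched directly as Gauss–Jordan elimination on \([\mathbf{A} \mid \mathbf{I}_n]\). This minimal NumPy implementation (with partial pivoting for numerical stability) assumes \(\mathbf{A}\) is invertible:

```python
import numpy as np

def inverse_gauss_jordan(A):
    """Invert A by reducing [A | I] to [I | A^{-1}] (assumes A invertible)."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])
    for i in range(n):
        # Partial pivoting: bring the largest entry in column i to row i.
        p = i + int(np.argmax(np.abs(M[i:, i])))
        M[[i, p]] = M[[p, i]]
        M[i] /= M[i, i]                 # make the pivot 1
        for r in range(n):
            if r != i:
                M[r] -= M[r, i] * M[i]  # clear the rest of the column
    return M[:, n:]

A = np.array([[1., 0., 2., 0.],
              [1., 1., 0., 0.],
              [1., 2., 0., 1.],
              [1., 1., 1., 1.]])
A_inv = inverse_gauss_jordan(A)
assert np.allclose(A @ A_inv, np.eye(4))
```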
2.3.5 Algorithms for Solving a System of Linear Equations
When solving a system of linear equations \(\mathbf{A}\mathbf{x} = \mathbf{b}\), we typically assume that a solution exists. If not, approximate methods such as linear regression (see Chapter 9) are used.
In special cases where \(\mathbf{A}\) is square and invertible, the exact solution can be written as
\[
\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}.
\]
However, computing the inverse explicitly is often infeasible. If \(\mathbf{A}\) has linearly independent columns, we can instead use the Moore–Penrose pseudo-inverse:
\[
\mathbf{x} = (\mathbf{A}^{\top}\mathbf{A})^{-1}\mathbf{A}^{\top}\mathbf{b},
\]
which gives the least-squares solution of \(\mathbf{A}\mathbf{x} = \mathbf{b}\) (in this full-column-rank case it coincides with the minimum-norm least-squares solution). Despite being conceptually simple, this approach involves expensive matrix products and an explicit inverse, and it can be numerically unstable, so it is rarely used directly in practice.
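The normal-equations solution can be compared against a library least-squares routine; the sketch below uses NumPy with illustrative random data (the specific shapes and seed are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 3))  # tall matrix, full column rank
b = rng.standard_normal(10)

# Normal equations: x = (A^T A)^{-1} A^T b (requires full column rank;
# squares the condition number, so it can be numerically unstable).
x_ne = np.linalg.solve(A.T @ A, A.T @ b)

# Preferred in practice: an SVD-based least-squares solver.
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)

assert np.allclose(x_ne, x_ls)
```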
For large-scale systems, we use iterative methods, which gradually improve an approximate solution. These methods include:
- Stationary iterative methods: Richardson, Jacobi, Gauss–Seidel, and successive over-relaxation.
- Krylov subspace methods: Conjugate gradients, generalized minimal residual (GMRES), and biconjugate gradients.
These methods iteratively update the estimate of the solution according to
\[
\mathbf{x}^{(k+1)} = \mathbf{C}\mathbf{x}^{(k)} + \mathbf{d},
\]
where \(\mathbf{C}\) and \(\mathbf{d}\) are chosen so that the error \(\|\mathbf{x}^{(k+1)} - \mathbf{x}^*\|\) shrinks at every step and the iterates converge to the true solution \(\mathbf{x}^*\).
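As a concrete instance of this update rule, here is a sketch of the Jacobi method, where \(\mathbf{C} = -\mathbf{D}^{-1}(\mathbf{A} - \mathbf{D})\) and \(\mathbf{d} = \mathbf{D}^{-1}\mathbf{b}\) for the diagonal part \(\mathbf{D}\) of \(\mathbf{A}\) (a minimal sketch, not production code):

```python
import numpy as np

def jacobi(A, b, iters=100):
    """Jacobi iteration x^{k+1} = C x^k + d.
    Converges, e.g., for strictly diagonally dominant A."""
    D = np.diag(np.diag(A))
    C = -np.linalg.solve(D, A - D)  # C = -D^{-1}(A - D)
    d = np.linalg.solve(D, b)       # d = D^{-1} b
    x = np.zeros_like(b)
    for _ in range(iters):
        x = C @ x + d
    return x

A = np.array([[4., 1., 0.],
              [1., 5., 1.],
              [0., 1., 3.]])  # strictly diagonally dominant
b = np.array([1., 2., 3.])
x = jacobi(A, b)
assert np.allclose(A @ x, b, atol=1e-8)
```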
Exercises
Exercise 2.34 Particular/general solution example.
Exercise 2.35 Gaussian Elimination example.
Exercise 2.36 -1 trick example.
Exercise 2.37 Find the inverse of \(\begin{bmatrix}1&2&3\\2&3&0\\3&0&1\end{bmatrix}\)
Exercise 2.38 Moore–Penrose pseudo-inverse example.
Exercise 2.39 Solve the system of equations \[\begin{align*}2x + 5y + 2z & = - 38\\ 3x - 2y + 4z & = 17\\ - 6x + y - 7z & = - 12\end{align*}\]
Exercise 2.40 Solve the system of equations \[\begin{align*}2x - 4y + 5z & = - 33\\ 4x - y & = - 5\\ - 2x + 2y - 3z & = 19\end{align*}\]
Exercise 2.41 Solve the system of equations \[\begin{align*}x - y & = 6\\ - 2x + 2y & = 1\end{align*}\]
Exercise 2.42 Solve the system of equations \[\begin{align*}2x + 5y & = - 1\\ - 10x - 25y & = 5\end{align*}\]
Exercise 2.43 Find the inverse of \(\begin{bmatrix}1&0&2&3\\2&3&0&1\\0&3&0&1\\1&0&0&1\end{bmatrix}\).
Exercise 2.44 Consider a \(4 \times 4\) matrix. How many operations does Gaussian Elimination require? What about for an \(n \times n\) matrix?