3.2 Constrained Optimization and Lagrange Multipliers

In many optimization problems, we seek to minimize a function \(f( \mathbf{x})\) subject to constraints on the variables \(\mathbf{x}\).

Definition 3.3 A constrained optimization problem has the form: \[ \min_{\mathbf{x}} f( \mathbf{x}) \quad \text{subject to } g_i( \mathbf{x}) \le 0, \; i = 1, \ldots, m \] where each \(g_i( \mathbf{x})\) defines a constraint.


3.2.1 From Constraints to the Lagrangian

Rather than directly enforcing the constraints, another approach is to penalize violations of the constraints using an indicator function: \[ J(\mathbf{x}) = f(\mathbf{x}) + \sum_{i=1}^{m} \mathbf{1}(g_i(\mathbf{x})) \quad \text{where} \quad \mathbf{1}(z) = \begin{cases} 0 & \text{if } z \le 0,\\ \infty & \text{otherwise.} \end{cases} \] However, this is difficult to optimize because the indicator function is non-differentiable. To overcome this, we introduce Lagrange multipliers \(\lambda_i \ge 0\) and form the Lagrangian: \[ L(\mathbf{x}, \boldsymbol{\lambda}) = f(\mathbf{x}) + \sum_{i=1}^{m} \lambda_i g_i(\mathbf{x}) = f(\mathbf{x}) + \boldsymbol{\lambda}^\top \mathbf{g}(\mathbf{x}) \]
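As a small numerical illustration (a sketch assuming the one-dimensional problem \(\min x^2\) subject to \(x - 1 \le 0\); the function names are ours), the indicator-penalized objective jumps to infinity at infeasible points, while the Lagrangian replaces that hard wall with a linear penalty priced by the multiplier:

```python
import math

def f(x):
    return x ** 2

def g(x):
    # single inequality constraint g(x) <= 0, here x - 1 <= 0
    return x - 1

def J(x):
    # indicator-penalized objective: +inf as soon as the constraint is violated
    return f(x) + (0.0 if g(x) <= 0 else math.inf)

def lagrangian(x, lam):
    # smooth relaxation: the multiplier lam >= 0 prices constraint violation
    return f(x) + lam * g(x)

print(J(0.5))              # feasible point: just f(0.5) = 0.25
print(J(2.0))              # infeasible point: inf
print(lagrangian(2.0, 3))  # finite linear penalty instead: 4 + 3*1 = 7
```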

To find candidate solutions, we look for stationary points of the Lagrangian, solving \[\nabla L(\mathbf{x}, \boldsymbol{\lambda}) = \mathbf{0}.\] For a single equality constraint \(g(\mathbf{x}) = 0\), this reduces to the familiar conditions \(\nabla f = \lambda \nabla g\) and \(g(\mathbf{x}) = 0\) (the sign of the multiplier is absorbed into \(\lambda\), which is unconstrained for equality constraints).

Example 3.7 Find the maximum and minimum of the function \[ f(x,y) = x^2 + y^2 \] subject to the constraint \[ g(x,y) = x + y - 4 = 0. \]

First, we find the gradients \[ \nabla f(x,y) = \begin{bmatrix} 2x \\ 2y \end{bmatrix}, \qquad \nabla g(x,y) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}. \]

The Lagrange multiplier condition is \[ \nabla f = \lambda \nabla g. \] This gives: \[ \begin{cases} 2x = \lambda \\ 2y = \lambda \\ x + y = 4 \end{cases} \]

From the first two equations: \[ 2x = 2y \quad \Rightarrow \quad x = y. \] Substitute into the constraint: \[ x + x = 4 \quad \Rightarrow \quad x = 2. \] Thus: \[ (x,y) = (2,2). \]

The minimum value is \[ f(2,2) = 2^2 + 2^2 = 8. \] No maximum exists, since \(x^2 + y^2\) can be made arbitrarily large along the line \(x + y = 4\); a bound on the size of \(x\) and \(y\) would be needed for a maximum to exist.
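The system \(\nabla f = \lambda \nabla g\), \(g = 0\) above can also be solved symbolically (a sketch using SymPy, assuming it is available; the symbol names are ours):

```python
import sympy as sp

x, y, lam = sp.symbols("x y lam", real=True)

f = x**2 + y**2
g = x + y - 4

# Stationarity grad f = lam * grad g, together with the constraint g = 0
eqs = [sp.Eq(sp.diff(f, x), lam * sp.diff(g, x)),
       sp.Eq(sp.diff(f, y), lam * sp.diff(g, y)),
       sp.Eq(g, 0)]

sol = sp.solve(eqs, [x, y, lam], dict=True)
print(sol)  # single solution: x = 2, y = 2, lam = 4
```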


3.2.1.1 Lagrangian Duality

Definition 3.4 The primal problem is: \[ \min_{\mathbf{x}} f(\mathbf{x}) \quad \text{subject to } g_i(\mathbf{x}) \le 0 \]

Definition 3.5 The corresponding Lagrangian dual problem is: \[ \max_{\boldsymbol{\lambda} \ge 0} D(\boldsymbol{\lambda}) \quad \text{where} \quad D(\boldsymbol{\lambda}) = \min_{\mathbf{x}} L(\mathbf{x}, \boldsymbol{\lambda}) \]

This dual formulation often simplifies computation, since \(D(\boldsymbol{\lambda})\) is concave even when \(f\) and the \(g_i\) are not convex.

Example 3.8 Consider the constrained optimization problem: \[ \begin{aligned} \min_{x \in \mathbb{R}} \quad & f(x) = x^2 \\ \text{subject to} \quad & g(x) = x - 1 \le 0. \end{aligned} \] This means we want to minimize \(x^2\) subject to \(x \le 1\).

We introduce a Lagrange multiplier \(\lambda \ge 0\) for the inequality constraint: \[ \mathcal{L}(x, \lambda) = x^2 + \lambda(x - 1). \]

The dual function is obtained by minimizing the Lagrangian over \(x\): \[ D(\lambda) = \inf_{x \in \mathbb{R}} \mathcal{L}(x, \lambda). \] (We write \(D\), as in Definition 3.5, to avoid a clash with the constraint function \(g(x)\).)

Take the derivative with respect to \(x\) and set it to zero: \[ \frac{d}{dx}(x^2 + \lambda x - \lambda) = 2x + \lambda = 0 \quad \Rightarrow \quad x^* = -\frac{\lambda}{2}. \] Substitute back: \[ \begin{aligned} D(\lambda) &= \left(-\frac{\lambda}{2}\right)^2 + \lambda\left(-\frac{\lambda}{2} - 1\right) \\ &= \frac{\lambda^2}{4} - \frac{\lambda^2}{2} - \lambda \\ &= -\frac{\lambda^2}{4} - \lambda. \end{aligned} \]

The dual problem is: \[ \begin{aligned} \max_{\lambda \ge 0} \quad & D(\lambda) = -\frac{\lambda^2}{4} - \lambda. \end{aligned} \]

We differentiate: \[ \frac{d}{d\lambda}\left(-\frac{\lambda^2}{4} - \lambda\right) = -\frac{\lambda}{2} - 1, \] and set equal to zero: \[ -\frac{\lambda}{2} - 1 = 0 \quad \Rightarrow \quad \lambda^* = -2. \]

Since the dual constraint requires \(\lambda \ge 0\) and \(D\) is decreasing on \([0, \infty)\), the maximum occurs at the boundary: \[ \lambda^* = 0. \] Thus: \[ D(0) = 0. \]

  • Primal optimum: \[ \min_{x \le 1} x^2 \quad \Rightarrow \quad x^* = 0, \quad f(x^*) = 0. \]

  • Dual optimum: \[ \max_{\lambda \ge 0} D(\lambda) = 0. \]

Therefore, \[ f^* = D^* = 0, \] so there is no duality gap.
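Both optima can be verified numerically (a sketch using SciPy, assuming it is available; the starting point and bounds are arbitrary choices):

```python
from scipy.optimize import minimize, minimize_scalar

# Primal: minimize x^2 subject to x - 1 <= 0.
# SciPy's "ineq" convention is c(x) >= 0, so we pass 1 - x.
primal = minimize(lambda x: x[0] ** 2, x0=[5.0],
                  constraints=[{"type": "ineq", "fun": lambda x: 1 - x[0]}])

# Dual: maximize -lam^2/4 - lam over lam >= 0,
# done by minimizing the negated dual function.
dual = minimize_scalar(lambda lam: lam ** 2 / 4 + lam,
                       bounds=(0, 10), method="bounded")

print(primal.fun)  # primal optimum, approximately 0
print(-dual.fun)   # dual optimum, approximately 0: no duality gap
```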

Example 3.9 Consider the linear program \[ \begin{aligned} \text{maximize } \quad & P = 3x + 5y \\ \text{subject to } \quad & 2x + y \le 20, \\ & x + 2y \le 12, \\ & x + 3y \le 15, \\ & x \ge 0,\; y \ge 0. \end{aligned} \] Solve the problem, reformulate it in matrix form, and find the dual problem.

Since this is a linear program in two variables, the optimum occurs at a corner point of the feasible region.

Corner points (the axis intercepts and the feasible pairwise intersections of the constraint lines):

  1. \((0,0) \Longrightarrow P = 0\)

  2. \((10,0)\), from \(2x + y = 20\) \(\Longrightarrow P = 3(10) = 30\)

  3. \((0,5)\), from \(x + 3y = 15\) \(\Longrightarrow P = 5(5) = 25\)

  4. \(\left(\tfrac{28}{3}, \tfrac{4}{3}\right)\), from \(2x + y = 20\) and \(x + 2y = 12\) \(\Longrightarrow P = 3\left(\tfrac{28}{3}\right) + 5\left(\tfrac{4}{3}\right) = \tfrac{104}{3} \approx 34.67\)

  5. \((6,3)\), from \(x + 2y = 12\) and \(x + 3y = 15\) \(\Longrightarrow P = 3(6) + 5(3) = 33\)

\[ (x^*, y^*) = \left(\tfrac{28}{3}, \tfrac{4}{3}\right), \quad P^* = \tfrac{104}{3} \approx 34.67 \]
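The corner-point search can be checked numerically (a sketch using SciPy's `linprog`, assuming it is available; `linprog` minimizes, so we negate the objective):

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([3.0, 5.0])
A = np.array([[2.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([20.0, 12.0, 15.0])

# Maximize c^T x by minimizing -c^T x; variable bounds default to x >= 0,
# matching the primal constraints.
res = linprog(-c, A_ub=A, b_ub=b)
print(res.x)     # optimal corner (28/3, 4/3), approx (9.33, 1.33)
print(-res.fun)  # optimal value P* = 104/3, approx 34.67
```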


Matrix Form of the Primal Problem

Let \[ \mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}, \quad \mathbf{c} = \begin{bmatrix} 3 \\ 5 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 20 \\ 12 \\ 15 \end{bmatrix}, \] \[ \mathbf{A} = \begin{bmatrix} 2 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}. \]

Then the primal problem is \[ \begin{aligned} \max_{\mathbf{x}} \quad & \mathbf{c}^\top \mathbf{x} \\ \text{subject to } \quad & \mathbf{A}\mathbf{x} \le \mathbf{b}, \\ & \mathbf{x} \ge 0. \end{aligned} \]

For the dual problem, we introduce Lagrange multipliers
\[ \boldsymbol{\lambda} = \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \end{bmatrix} \ge 0 \] for the inequality constraints. For a primal maximization problem with \(\le\) constraints and \(\mathbf{x} \ge 0\), the dual is a minimization problem:

\[ \begin{aligned} \min_{\boldsymbol{\lambda}} \quad & \mathbf{b}^\top \boldsymbol{\lambda} = 20\lambda_1 + 12\lambda_2 + 15\lambda_3 \\ \text{subject to } \quad & \mathbf{A}^\top \boldsymbol{\lambda} \ge \mathbf{c}, \\ & \boldsymbol{\lambda} \ge 0. \end{aligned} \]

That is, \[ \begin{aligned} \min \quad & 20\lambda_1 + 12\lambda_2 + 15\lambda_3 \\ \text{subject to } \quad & 2\lambda_1 + \lambda_2 + \lambda_3 \ge 3, \\ & \lambda_1 + 2\lambda_2 + 3\lambda_3 \ge 5, \\ & \lambda_1, \lambda_2, \lambda_3 \ge 0. \end{aligned} \]

At the primal optimum the first two constraints are binding and the third is slack, so \(\lambda_3 = 0\) and the dual constraints \(2\lambda_1 + \lambda_2 = 3\), \(\lambda_1 + 2\lambda_2 = 5\) hold with equality, giving \(\boldsymbol{\lambda}^* = \left(\tfrac{1}{3}, \tfrac{7}{3}, 0\right)\).

  • Primal optimal value: \(P^* = \tfrac{104}{3}\)
  • Dual optimal value: \(D^* = 20\left(\tfrac{1}{3}\right) + 12\left(\tfrac{7}{3}\right) + 15(0) = \tfrac{104}{3}\)

This confirms strong duality for this linear program.
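The dual can be checked the same way (a sketch using SciPy's `linprog`, assuming it is available; the \(\ge\) constraints are negated into `linprog`'s \(\le\) form):

```python
import numpy as np
from scipy.optimize import linprog

b = np.array([20.0, 12.0, 15.0])  # dual objective coefficients
At = np.array([[2.0, 1.0, 1.0],
               [1.0, 2.0, 3.0]])  # A^T from the primal
c = np.array([3.0, 5.0])

# A^T lam >= c  is equivalent to  -A^T lam <= -c;
# variable bounds default to lam >= 0.
res = linprog(b, A_ub=-At, b_ub=-c)
print(res.x)    # lam* = (1/3, 7/3, 0)
print(res.fun)  # D* = 104/3, equal to the primal optimum
```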


3.2.1.2 Weak Duality and the Minimax Inequality

The minimax inequality is given in the following theorem.

Theorem 3.1 For any function \(\phi(\mathbf{x}, \mathbf{y})\) of two arguments, the maximin is at most the minimax. That is, \[ \max_{\mathbf{y}} \min_{\mathbf{x}} \phi(\mathbf{x}, \mathbf{y}) \le \min_{\mathbf{x}} \max_{\mathbf{y}} \phi(\mathbf{x}, \mathbf{y}) \]

Using this, we derive weak duality.

Theorem 3.2 The optimal value of the dual problem is always less than or equal to the optimal value of the primal problem: \[ \min_{\mathbf{x}} \max_{\boldsymbol{\lambda} \ge 0} L(\mathbf{x}, \boldsymbol{\lambda}) \ge \max_{\boldsymbol{\lambda} \ge 0} \min_{\mathbf{x}} L(\mathbf{x}, \boldsymbol{\lambda}) \]

Thus, duality provides a lower bound on the primal problem’s optimal value.
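The minimax inequality is easy to observe numerically: for any matrix of values \(\phi(x_i, y_j)\), the largest column-minimum never exceeds the smallest row-maximum (a sketch using NumPy, assuming it is available; the random matrix and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
phi = rng.standard_normal((50, 40))  # rows index x, columns index y

maximin = phi.min(axis=0).max()  # max over y of (min over x)
minimax = phi.max(axis=1).min()  # min over x of (max over y)

print(maximin <= minimax)  # True for any phi
```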


3.2.1.3 Equality Constraints

If equality constraints are present, such as \(h_j(\mathbf{x}) = 0\), they can be rewritten as two inequalities: \[ h_j(\mathbf{x}) \le 0 \quad \text{and} \quad h_j(\mathbf{x}) \ge 0 \] Each inequality receives its own non-negative multiplier, and their difference, which acts as the effective multiplier on \(h_j\), can take any sign. The Lagrange multipliers for equality constraints are therefore unconstrained, while those for inequality constraints remain non-negative.


Exercises

Exercise 3.7 Find the maximum and minimum values of \(f\left( {x,y} \right) = 81{x^2} + {y^2}\) subject to the constraint \(4{x^2} + {y^2} = 9\).

Exercise 3.8 Find the maximum and minimum values of \(f\left( {x,y} \right) = 8{x^2} - 2y\) subject to the constraint \({x^2} + {y^2} = 1\).

Exercise 3.9 Find the maximum and minimum values of \(f\left( {x,y,z} \right) = {y^2} - 10z\) subject to the constraint \({x^2} + {y^2} + {z^2} = 36\).

Exercise 3.10 Find the maximum and minimum values of \(f\left( {x,y,z} \right) = xyz\) subject to the constraint \(x + 9{y^2} + {z^2} = 4\). Assume \(x \geq 0\) for this problem. Why is this needed?