7.3 Expectation-Maximization (EM) Algorithm

The Expectation-Maximization (EM) algorithm, introduced by Dempster et al. (1977), is an iterative method for estimating parameters in models with latent variables, such as Gaussian mixture models. In GMMs, the parameters \(\mu_k, \boldsymbol{\Sigma}_k, \pi_k\) have no closed-form maximum-likelihood solution because the responsibilities \(r_{nk}\) depend on them in a nonlinear way. The EM algorithm therefore alternates between estimating responsibilities and updating parameters until convergence.


7.3.1 Algorithm Steps

  1. Initialize parameters
    Choose initial values for the parameters to be estimated: \[ \{\pi_k, \mu_k, \boldsymbol{\Sigma}_k\}_{k=1}^{K}. \]

  2. E-step (Expectation)
    Compute the responsibility of component \(k\) for data point \(\mathbf{x}_n\): \[ r_{nk} = \frac{\pi_k \, \mathcal{N}(\mathbf{x}_n | \mu_k, \boldsymbol{\Sigma}_k)} {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(\mathbf{x}_n | \mu_j, \boldsymbol{\Sigma}_j)}. \]

  3. M-step (Maximization)
    Update the parameters using the new responsibilities. First compute the effective number of points assigned to component \(k\): \[ N_k = \sum_{n=1}^{N} r_{nk}. \]

    Update means: \[ \mu_k = \frac{1}{N_k} \sum_{n=1}^{N} r_{nk} \mathbf{x}_n. \]

    Update covariances: \[ \boldsymbol{\Sigma}_k = \frac{1}{N_k} \sum_{n=1}^{N} r_{nk}(\mathbf{x}_n - \mu_k)(\mathbf{x}_n - \mu_k)^{\top}. \]

    Update mixture weights: \[ \pi_k = \frac{N_k}{N}. \]

Each EM iteration is guaranteed not to decrease the log-likelihood, and convergence can be monitored by checking either the change in log-likelihood or the stability of the parameter estimates.
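The steps above can be sketched in NumPy as follows. This is a minimal illustration, not a production implementation: the initialization strategy (random data points as means, identity covariances, uniform weights) and the convergence tolerance are illustrative choices, and no safeguards against singular covariances are included.

```python
import numpy as np

def gaussian_pdf(X, mu, Sigma):
    """Multivariate normal density N(x | mu, Sigma) for each row of X."""
    D = X.shape[1]
    diff = X - mu
    Sigma_inv = np.linalg.inv(Sigma)
    quad = np.einsum('nd,de,ne->n', diff, Sigma_inv, diff)
    norm = np.sqrt((2 * np.pi) ** D * np.linalg.det(Sigma))
    return np.exp(-0.5 * quad) / norm

def em_gmm(X, K, n_iter=100, tol=1e-6, seed=0):
    """Fit a K-component GMM to X (N x D) with the EM updates of Section 7.3.1."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    # 1. Initialize: random data points as means, identity covariances, uniform weights.
    mu = X[rng.choice(N, K, replace=False)]
    Sigma = np.stack([np.eye(D)] * K)
    pi = np.full(K, 1.0 / K)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # 2. E-step: responsibilities r_nk = pi_k N(x_n | mu_k, Sigma_k) / sum_j (...).
        dens = np.stack([pi[k] * gaussian_pdf(X, mu[k], Sigma[k])
                         for k in range(K)], axis=1)          # (N, K)
        r = dens / dens.sum(axis=1, keepdims=True)
        # 3. M-step: effective counts N_k, then means, covariances, mixture weights.
        Nk = r.sum(axis=0)
        mu = (r.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            Sigma[k] = (r[:, k, None] * diff).T @ diff / Nk[k]
        pi = Nk / N
        # Monitor convergence via the log-likelihood.
        ll = np.log(dens.sum(axis=1)).sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return pi, mu, Sigma, ll

# Example: two well-separated 1-D clusters (synthetic data for illustration).
rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(-5.0, 1.0, 200), rng.normal(5.0, 1.0, 200)])[:, None]
pi, mu, Sigma, ll = em_gmm(X, K=2)
```

Because the E-step densities are reused for the log-likelihood, each iteration evaluates every Gaussian only once per data point.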


7.3.2 Example: GMM Fit

After a few iterations, the EM algorithm typically converges to a stable mixture model. For instance, a fitted one-dimensional GMM might be \[ p(x) = 0.29\,\mathcal{N}(x \,|\, {-2.75},\, 0.06) + 0.28\,\mathcal{N}(x \,|\, {-0.5},\, 0.25) + 0.43\,\mathcal{N}(x \,|\, 3.64,\, 1.63), \] where each component is parameterized by its mean and variance. Plots of the negative log-likelihood across iterations show a steady decrease until convergence.


Exercises

Exercise 7.6 Use Excel to analyze the GMM for 2 Gaussians and the data \((-3, -1, 0, 1, 3, 4)\).

Exercise 7.7 Use Excel to analyze the GMM for 3 Gaussians and the data \((-3, -1, 0, 1, 3, 4)\).

Exercise 7.8 Use Excel to analyze the GMM for 2 Gaussians and the data \((-4, -3, -1, 0, 1, 3, 4)\).

Exercise 7.9 Use Excel to analyze the GMM for 3 Gaussians and the data \((-4, -3, -1, 0, 1, 3, 4)\).

Exercise 7.10 Use Excel to analyze the GMM for 3 Gaussians and the data \((-4, -3, -2.5, 0, 1, 3, 4)\).

Exercise 7.11 Use Excel to analyze the GMM for 3 Gaussians and the data \((-4, -3, -3, 0, 1, 3, 4)\).
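Readers who want to check their Excel results can use a compact one-dimensional version of the EM updates from Section 7.3.1. The initialization (random data points as means, pooled variance, uniform weights), the fixed iteration count, and the small variance floor are illustrative choices, so the exact fitted values may differ from a spreadsheet run with a different starting point:

```python
import numpy as np

def em_gmm_1d(x, K, n_iter=200, seed=0):
    """Compact 1-D EM for a K-component GMM; returns (weights, means, variances)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    N = x.size
    mu = rng.choice(x, K, replace=False)      # initial means: random data points
    var = np.full(K, x.var())                 # initial variances: pooled variance
    pi = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: responsibilities r_nk, shape (N, K).
        dens = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: effective counts, then means, variances, weights.
        Nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / Nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
        var = np.maximum(var, 1e-6)           # floor to avoid variance collapse
        pi = Nk / N
    return pi, mu, var

# Exercise 7.6 data with K = 2 components:
pi, mu, var = em_gmm_1d([-3, -1, 0, 1, 3, 4], 2)
```

With so few data points, different initializations can converge to different local optima, which is worth keeping in mind when comparing against Excel.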