7.3 Expectation-Maximization (EM) Algorithm
The Expectation-Maximization (EM) algorithm, introduced by Dempster et al. (1977), is an iterative method for estimating parameters in models with latent variables, such as Gaussian mixture models. In GMMs, the maximum likelihood estimates of \(\mu_k, \boldsymbol{\Sigma}_k, \pi_k\) have no closed-form solution because the responsibilities \(r_{nk}\) depend on the parameters in a complex, nonlinear way. The EM algorithm therefore alternates between estimating responsibilities and updating parameters until convergence.
7.3.1 Algorithm Steps
Initialize parameters
Select initial values for the parameters to be estimated: \[ \{\pi_k, \mu_k, \boldsymbol{\Sigma}_k\}_{k=1}^{K}. \]
E-step (Expectation)
Compute the responsibility of component \(k\) for data point \(\mathbf{x}_n\): \[ r_{nk} = \frac{\pi_k \, \mathcal{N}(\mathbf{x}_n | \mu_k, \boldsymbol{\Sigma}_k)} {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(\mathbf{x}_n | \mu_j, \boldsymbol{\Sigma}_j)}. \]
M-step (Maximization)
Update the parameters using the new responsibilities. First compute the effective number of points assigned to component \(k\): \[ N_k = \sum_{n=1}^{N} r_{nk}. \]
Update means: \[ \mu_k = \frac{1}{N_k} \sum_{n=1}^{N} r_{nk} \mathbf{x}_n. \]
Update covariances: \[ \boldsymbol{\Sigma}_k = \frac{1}{N_k} \sum_{n=1}^{N} r_{nk}(\mathbf{x}_n - \mu_k)(\mathbf{x}_n - \mu_k)^{\top}. \]
Update mixture weights: \[ \pi_k = \frac{N_k}{N}. \]
Each EM iteration increases the log-likelihood, and convergence can be monitored by checking either the change in log-likelihood or parameter stability.
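The steps above can be sketched in Python for the one-dimensional case using only the standard library. This is a minimal illustration, not the text's implementation: the function name `em_gmm`, the random choice of initial means from the data, and the small variance floor (to keep a component from collapsing onto a single point) are all assumptions made for this sketch.

```python
import math
import random

def normal_pdf(x, mean, var):
    """Density of the 1-D Gaussian N(x | mean, var)."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm(data, k, iters=100, seed=0):
    """Fit a 1-D, K-component GMM by EM.

    Returns (weights, means, variances, log_likelihoods), where the last
    entry records the log-likelihood at the start of each iteration.
    """
    rng = random.Random(seed)
    n = len(data)
    # Initialization: uniform weights, means drawn from the data,
    # all variances set to the overall sample variance.
    weights = [1.0 / k] * k
    means = rng.sample(list(data), k)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * k
    log_liks = []
    for _ in range(iters):
        # E-step: responsibilities r[i][j] of component j for point i.
        r = []
        ll = 0.0
        for x in data:
            num = [weights[j] * normal_pdf(x, means[j], variances[j])
                   for j in range(k)]
            s = sum(num)
            ll += math.log(s)
            r.append([v / s for v in num])
        log_liks.append(ll)
        # M-step: N_k, then means, variances, and mixture weights.
        for j in range(k):
            nk = sum(r[i][j] for i in range(n))
            means[j] = sum(r[i][j] * data[i] for i in range(n)) / nk
            variances[j] = max(
                sum(r[i][j] * (data[i] - means[j]) ** 2 for i in range(n)) / nk,
                1e-6,  # variance floor: an assumption to avoid degeneracy
            )
            weights[j] = nk / n
    return weights, means, variances, log_liks
```

For example, `em_gmm([-3.0, -1.0, 0.0, 1.0, 3.0, 4.0], 2)` fits a two-component mixture to the data of Exercise 7.6; the returned log-likelihood trace can be inspected to monitor convergence as described above.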
7.3.2 Example: GMM Fit
After a few iterations, the EM algorithm typically converges to a stable mixture model. For instance, a fitted GMM might be: \[ p(\mathbf{x}) = 0.29\,\mathcal{N}(\mathbf{x}|-2.75, 0.06) + 0.28\,\mathcal{N}(\mathbf{x}|-0.5, 0.25) + 0.43\,\mathcal{N}(\mathbf{x}|3.64, 1.63). \] A plot of the negative log-likelihood across iterations shows a steady decrease until convergence.
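As a quick sanity check, the fitted example mixture can be evaluated numerically. The sketch below reads the second argument of each \(\mathcal{N}\) as a variance (an assumption about the notation) and verifies that the mixture density integrates to one, as it must since the weights sum to \(0.29 + 0.28 + 0.43 = 1\):

```python
import math

def normal_pdf(x, mean, var):
    """Density of the 1-D Gaussian N(x | mean, var)."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def p(x):
    """Fitted example mixture; second arguments treated as variances."""
    comps = [(0.29, -2.75, 0.06), (0.28, -0.5, 0.25), (0.43, 3.64, 1.63)]
    return sum(w * normal_pdf(x, mu, var) for w, mu, var in comps)

# Riemann-sum approximation of the integral of p over [-15, 15];
# the components' mass is well inside this interval.
dx = 0.001
total = sum(p(-15 + i * dx) for i in range(int(30 / dx))) * dx
# total is approximately 1.0
```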
Exercises
Exercise 7.6 Use Excel to fit a GMM with 2 Gaussian components to the data \((-3, -1, 0, 1, 3, 4)\).
Exercise 7.7 Use Excel to fit a GMM with 3 Gaussian components to the data \((-3, -1, 0, 1, 3, 4)\).
Exercise 7.8 Use Excel to fit a GMM with 2 Gaussian components to the data \((-4, -3, -1, 0, 1, 3, 4)\).
Exercise 7.9 Use Excel to fit a GMM with 3 Gaussian components to the data \((-4, -3, -1, 0, 1, 3, 4)\).
Exercise 7.10 Use Excel to fit a GMM with 3 Gaussian components to the data \((-4, -3, -2.5, 0, 1, 3, 4)\).
Exercise 7.11 Use Excel to fit a GMM with 3 Gaussian components to the data \((-4, -3, -3, 0, 1, 3, 4)\).