Chapter 5 Linear Regression

In linear regression, the goal is to find a function \(f\) that maps inputs \(\mathbf{x} \in \mathbb{R}^D\) to outputs \(f(\mathbf{x}) \in \mathbb{R}\). We are given training data \(\{(\mathbf{x}_n, y_n)\}_{n=1}^N\), where each observation is modeled as: \[ y_n = f(\mathbf{x}_n) + \epsilon_n. \] Here, \(\epsilon_n \sim \mathcal{N}(0, \sigma^2)\) is independent and identically distributed (i.i.d.) Gaussian noise with zero mean.
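To make the observation model concrete, the following minimal sketch (in Python with NumPy; the linear choice of \(f\) and the noise scale are illustrative assumptions, not part of the model) draws training data according to \(y_n = f(\mathbf{x}_n) + \epsilon_n\):

```python
import numpy as np

rng = np.random.default_rng(42)

def f(x):
    # Hypothetical ground-truth function; in practice f is unknown
    # and must be estimated from the training data.
    return 2.0 * x - 1.0

N = 100
x = rng.uniform(-5.0, 5.0, size=N)            # inputs x_n (here D = 1)
eps = rng.normal(loc=0.0, scale=0.1, size=N)  # i.i.d. zero-mean Gaussian noise
y = f(x) + eps                                # observations y_n = f(x_n) + eps_n
```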

With linear regression, we have two basic goals:

  • Modeling the data: Estimate the function \(f\) that explains the observed data.
  • Generalization: Ensure \(f\) predicts well for unseen inputs, not just the training data.

When using linear regression, we have to make some decisions a priori.

  1. Model Choice:
    • Decide on the type and parametrization of the regression function (e.g., polynomial degree).
    • Model selection (see Section 8.6) helps identify the simplest model that explains the data effectively.
  2. Parameter Estimation:
    • Determine the optimal model parameters using appropriate loss functions and optimization algorithms; a least-squares sketch follows this list.
  3. Overfitting and Model Selection:
    • Overfitting occurs when the model fits the training data too closely, capturing the noise rather than the underlying function, and therefore fails to generalize.
    • This typically arises from overly flexible or complex models; the high-degree polynomial fit in the sketch after this list illustrates the effect.
  4. Connection Between Loss Functions and Priors:
    • Many optimization objectives, such as the sum-of-squares loss, can be derived from probabilistic assumptions about the data; the derivation after this list makes this precise.
    • Understanding this relationship clarifies how prior beliefs influence model behavior.
  5. Uncertainty Modeling:
    • Since training data are finite, predictions carry uncertainty.
    • Modeling uncertainty provides confidence bounds for predictions, which are especially important with limited data.
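The sketch below (Python with NumPy; the sine-shaped ground truth, the noise scale, and the choice of degrees are illustrative assumptions) fits polynomials of increasing degree by least squares. The training error shrinks as the degree grows, but the degree-15 model is fitting the 20 noisy points themselves, which is exactly the overfitting described above. The mean squared residual also serves as a crude maximum likelihood estimate of the noise variance \(\sigma^2\), a first step toward the confidence bounds mentioned in item 5.

```python
import numpy as np

rng = np.random.default_rng(0)

# Twenty noisy samples of an (assumed) underlying sine function.
x = np.linspace(-1.0, 1.0, 20)
y = np.sin(np.pi * x) + rng.normal(scale=0.2, size=x.shape)

def fit_polynomial(x, y, degree):
    """Least-squares fit of a polynomial of the given degree.

    Phi has columns 1, x, x^2, ..., x^degree; lstsq solves the
    resulting linear system in a numerically stable way.
    """
    Phi = np.vander(x, degree + 1, increasing=True)
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return theta

for degree in (1, 3, 15):
    theta = fit_polynomial(x, y, degree)
    Phi = np.vander(x, degree + 1, increasing=True)
    mse = np.mean((y - Phi @ theta) ** 2)
    # The training MSE is misleadingly small for degree 15: the model
    # has enough flexibility to fit the noise itself.
    print(f"degree={degree:2d}  training MSE={mse:.4f}")
```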
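To make the connection between loss functions and probabilistic assumptions (item 4) precise: under the i.i.d. Gaussian noise model above, the negative log-likelihood of the training data is, up to terms that do not depend on \(f\), \[ -\log p(y_1, \ldots, y_N \mid \mathbf{x}_1, \ldots, \mathbf{x}_N) = \frac{1}{2\sigma^2} \sum_{n=1}^{N} \big(y_n - f(\mathbf{x}_n)\big)^2 + \text{const}, \] so maximizing the likelihood is equivalent to minimizing the sum-of-squares loss. Placing a prior on the parameters and maximizing the posterior instead adds a regularization term to this loss.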