Chapter 5 Linear Regression

In linear regression, the goal is to find a function \(f\) that maps inputs \(\mathbf{x} \in \mathbb{R}^D\) to outputs \(f(\mathbf{x}) \in \mathbb{R}\). We are given training data \(\{(\mathbf{x}_n, y_n)\}_{n=1}^N\), where each observation is modeled as: \[ y_n = f(\mathbf{x}_n) + \epsilon_n. \] Here, \(\epsilon_n \sim \mathcal{N}(0, \sigma^2)\) is independent and identically distributed (i.i.d.) Gaussian noise with zero mean.
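To make the observation model concrete, the following minimal sketch (in Python with NumPy; the linear choice of \(f\) and the noise scale are illustrative assumptions, not part of the model) draws training data according to \(y_n = f(\mathbf{x}_n) + \epsilon_n\):

```python
import numpy as np

rng = np.random.default_rng(42)

def f(x):
    # Hypothetical ground-truth function; in practice f is unknown
    # and must be estimated from the training data.
    return 2.0 * x - 1.0

N = 100
x = rng.uniform(-5.0, 5.0, size=N)            # inputs x_n (here D = 1)
eps = rng.normal(loc=0.0, scale=0.1, size=N)  # i.i.d. zero-mean Gaussian noise
y = f(x) + eps                                # observations y_n = f(x_n) + eps_n
```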

With linear regression, we have two basic goals:

  • Modeling the data: Estimate the function \(f\) that explains the observed data.
  • Generalization: Ensure \(f\) predicts well for unseen inputs, not just the training data.

When using linear regression, we have to make some decisions a priori.

  1. Model Choice:
    • Decide on the type and parametrization of the regression function (e.g., polynomial degree).
    • Model selection (see Section 8.6) helps identify the simplest model that explains the data effectively.
  2. Parameter Estimation:
    • Determine the optimal model parameters using appropriate loss functions and optimization algorithms; a least-squares sketch follows this list.
  3. Overfitting and Model Selection:
    • Overfitting occurs when the model fits the training data too closely, capturing the noise rather than the underlying function, and therefore fails to generalize.
    • This typically arises from overly flexible or complex models; the high-degree polynomial fit in the sketch after this list illustrates the effect.
  4. Connection Between Loss Functions and Priors:
    • Many optimization objectives, such as the sum-of-squares loss, can be derived from probabilistic assumptions about the data; the derivation after this list makes this precise.
    • Understanding this relationship clarifies how prior beliefs influence model behavior.
  5. Uncertainty Modeling:
    • Since training data are finite, predictions carry uncertainty.
    • Modeling uncertainty provides confidence bounds for predictions, which are especially important with limited data.
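The sketch below (Python with NumPy; the sine-shaped ground truth, the noise scale, and the choice of degrees are illustrative assumptions) fits polynomials of increasing degree by least squares. The training error shrinks as the degree grows, but the degree-15 model is fitting the 20 noisy points themselves, which is exactly the overfitting described above. The mean squared residual also serves as a crude maximum likelihood estimate of the noise variance \(\sigma^2\), a first step toward the confidence bounds mentioned in item 5.

```python
import numpy as np

rng = np.random.default_rng(0)

# Twenty noisy samples of an (assumed) underlying sine function.
x = np.linspace(-1.0, 1.0, 20)
y = np.sin(np.pi * x) + rng.normal(scale=0.2, size=x.shape)

def fit_polynomial(x, y, degree):
    """Least-squares fit of a polynomial of the given degree.

    Phi has columns 1, x, x^2, ..., x^degree; lstsq solves the
    resulting linear system in a numerically stable way.
    """
    Phi = np.vander(x, degree + 1, increasing=True)
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return theta

for degree in (1, 3, 15):
    theta = fit_polynomial(x, y, degree)
    Phi = np.vander(x, degree + 1, increasing=True)
    mse = np.mean((y - Phi @ theta) ** 2)
    # The training MSE is misleadingly small for degree 15: the model
    # has enough flexibility to fit the noise itself.
    print(f"degree={degree:2d}  training MSE={mse:.4f}")
```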
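To make the connection between loss functions and probabilistic assumptions (item 4) precise: under the i.i.d. Gaussian noise model above, the negative log-likelihood of the training data is, up to terms that do not depend on \(f\), \[ -\log p(y_1, \ldots, y_N \mid \mathbf{x}_1, \ldots, \mathbf{x}_N) = \frac{1}{2\sigma^2} \sum_{n=1}^{N} \big(y_n - f(\mathbf{x}_n)\big)^2 + \text{const}, \] so maximizing the likelihood is equivalent to minimizing the sum-of-squares loss. Placing a prior on the parameters and maximizing the posterior instead adds a regularization term to this loss.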