6.4 Summary Statistics and Independence
This section explores how to summarize and compare random variables using summary statistics, covariance, and independence. These concepts are essential for understanding uncertainty, quantifying relationships between variables, and giving random variables a geometric interpretation.
6.4.1 Means and Covariances
Definition 6.11 The expected value \(E_X[g(x)]\) of a function \(g(x)\) with respect to a random variable \(X\) represents the long-run average outcome weighted by its probability.
- Continuous case:
\[ E_X[g(x)] = \int_X g(x)p(x)\,dx \]
- Discrete case:
\[ E_X[g(x)] = \sum_{x \in X} g(x)p(x) \]
The mean is a special case where \(g(x) = x\), giving \(E_X[x]\). Two other related measures of central tendency are:
- Median: Middle value (robust to outliers)
- Mode: Most frequent or likely value
Example 6.17 Discrete Expected Value
Let \(X\) be a discrete random variable representing the number shown when rolling a fair six-sided die. Its probability mass function is \[ P(X = x) = \frac{1}{6}, \quad x = 1,2,3,4,5,6. \] Thus, the expected value is \[ \mathbb{E}[X] = \sum_{x=1}^6 x \, P(X=x) = \frac{1}{6}(1+2+3+4+5+6) = \frac{21}{6} = 3.5. \]
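The same calculation can be reproduced directly from the PMF; the following minimal Python sketch simply encodes the table above.

```python
# Expected value of a fair six-sided die, computed from its PMF.
pmf = {x: 1 / 6 for x in range(1, 7)}

expected_value = sum(x * p for x, p in pmf.items())
print(expected_value)  # 3.5
```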
Example 6.18 Continuous Expected Value with \(g(x) = x\) (The Mean)
Let \(X \sim \text{Uniform}(0,2)\). The probability density function is \[ f(x) = \begin{cases} \frac{1}{2}, & 0 \le x \le 2, \\ 0, & \text{otherwise}. \end{cases} \] Therefore, the expected value is \[ \mathbb{E}[X] = \int_{-\infty}^{\infty} x f(x)\, dx = \int_0^2 x \cdot \frac{1}{2} \, dx \] \[ = \frac{1}{2} \left[ \frac{x^2}{2} \right]_0^2 = \frac{1}{2} \cdot 2 = 1 \] This is the mean of the distribution.
Example 6.19 Continuous Expected Value with \(g(x) \neq x\)
Let \(X \sim \text{Uniform}(0,1)\), and define \[ g(X) = X^2. \] The expected value of \(g(X)\) is \[ \mathbb{E}[g(X)] = \mathbb{E}[X^2] = \int_0^1 x^2 \cdot 1 \, dx \] \[ = \left[ \frac{x^3}{3} \right]_0^1 = \frac{1}{3}. \]
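Both continuous examples can be checked numerically with a simple midpoint (Riemann-sum) approximation. The helper function and step count below are illustrative choices, not part of the examples themselves.

```python
# Midpoint-rule approximation of E[g(X)] = integral of g(x) * pdf(x) dx over [a, b].
def riemann_expectation(g, pdf, a, b, steps=10_000):
    dx = (b - a) / steps
    return sum(
        g(a + (i + 0.5) * dx) * pdf(a + (i + 0.5) * dx) * dx
        for i in range(steps)
    )

# Example 6.18: X ~ Uniform(0, 2), g(x) = x.
print(riemann_expectation(lambda x: x, lambda x: 0.5, 0, 2))      # ≈ 1.0

# Example 6.19: X ~ Uniform(0, 1), g(x) = x^2.
print(riemann_expectation(lambda x: x**2, lambda x: 1.0, 0, 1))   # ≈ 0.3333
```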
Definition 6.12 The covariance of two random variables measures how they vary together: \[ \text{Cov}[x, y] = E\big[(x - E[x])(y - E[y])\big] = E[xy] - E[x]E[y] \]
Example 6.20 Covariance of Two Random Variables
Let \(X\) and \(Y\) be discrete random variables with the following joint probability distribution:
| \(x\) | \(y\) | \(p(x,y)\) |
|---|---|---|
| 0 | 0 | \(0.25\) |
| 0 | 1 | \(0.25\) |
| 1 | 0 | \(0.25\) |
| 1 | 1 | \(0.25\) |
The expected value of \(X\) is \[ E[X] = \sum_{x,y} x \, p(x,y) = (0)(0.25) + (0)(0.25) + (1)(0.25) + (1)(0.25) = 0.5. \]
The expected value of \(Y\) is \[ E[Y] = \sum_{x,y} y \, p(x,y) = (0)(0.25) + (1)(0.25) + (0)(0.25) + (1)(0.25) = 0.5. \]
The expected value of \(XY\) is \[ E[XY] = \sum_{x,y} xy \, p(x,y) = (0)(0)(0.25) + (0)(1)(0.25) + (1)(0)(0.25) + (1)(1)(0.25) = 0.25. \]
Therefore, the covariance is \[ \text{Cov}[X,Y] = E[XY] - E[X]E[Y] = 0.25 - (0.5)(0.5) = 0, \] meaning there is no linear relationship between \(X\) and \(Y\). In this particular case, \(X\) and \(Y\) are also independent, since \(p(x,y) = p(x)p(y)\) for every pair of values; note, however, that zero covariance does not imply independence in general.
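The table-based computation translates directly into code; the dictionary below is just the joint PMF from Example 6.20, and the sketch reproduces each expectation in turn.

```python
# Covariance from the joint PMF of Example 6.20.
joint = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

E_x  = sum(x * p for (x, y), p in joint.items())
E_y  = sum(y * p for (x, y), p in joint.items())
E_xy = sum(x * y * p for (x, y), p in joint.items())

cov = E_xy - E_x * E_y
print(E_x, E_y, E_xy, cov)  # 0.5 0.5 0.25 0.0
```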
Definition 6.13 Variance is the covariance of a variable with itself: \[ V[x] = \text{Cov}[x, x] \]
The variance describes how much the data spread around the mean.
Example 6.21 Let \(X\) be a discrete random variable with probability mass function:
| \(x\) | \(p(x)\) |
|---|---|
| 0 | \(0.2\) |
| 1 | \(0.5\) |
| 2 | \(0.3\) |
What is the variance of \(X\)?
The expected value of \(X\) is \[ E[X] = \sum_x x p(x) = 0(0.2) + 1(0.5) + 2(0.3) = 1.1 \]
The expected value of \(X^2\) is \[ E[X^2] = \sum_x x^2 p(x) = 0^2(0.2) + 1^2(0.5) + 2^2(0.3) = 1.7 \]
Therefore, by definition, \[ \mathrm{Var}(X) = \mathrm{Cov}(X, X) = E[X^2] - E[X]^2 = 1.7 - (1.1)^2 = 1.7 - 1.21 = 0.49 \]
The variance measures how much the values of \(X\) spread out around the mean. Here, the variance of \(X\) is \(0.49\), which quantifies the variability of the random variable.
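As a sanity check, the same variance can be computed from the PMF in a few lines of Python; this sketch uses only the \(E[X^2] - E[X]^2\) route shown above.

```python
# Variance of the discrete random variable in Example 6.21.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

E_x  = sum(x * p for x, p in pmf.items())
E_x2 = sum(x**2 * p for x, p in pmf.items())

variance = E_x2 - E_x**2
print(E_x, E_x2, variance)  # 1.1 1.7 ≈ 0.49
```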
The covariance matrix summarizes all pairwise covariances for multivariate data and is symmetric and positive semi-definite. The covariance matrix has the following general form:
\[ \Sigma = \begin{bmatrix} \text{Cov}[x_1, x_1] & \text{Cov}[x_1, x_2] & \cdots & \text{Cov}[x_1, x_D] \\ \text{Cov}[x_2, x_1] & \text{Cov}[x_2, x_2] & \cdots & \text{Cov}[x_2, x_D] \\ \vdots & \vdots & \ddots & \vdots \\ \text{Cov}[x_D, x_1] & \text{Cov}[x_D, x_2] & \cdots & \text{Cov}[x_D, x_D] \end{bmatrix} \]
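For concreteness, the sketch below builds such a matrix from a small synthetic data set with NumPy (the data and dimensions are arbitrary assumptions) and checks that the result is symmetric with non-negative eigenvalues, i.e. positive semi-definite.

```python
import numpy as np

# Empirical covariance matrix of a synthetic data set (N = 500 samples, D = 3 features).
rng = np.random.default_rng(0)
data = rng.normal(size=(500, 3))  # illustrative data; any real-valued matrix works

Sigma = np.cov(data, rowvar=False, bias=True)  # bias=True -> normalization by N

print(np.allclose(Sigma, Sigma.T))                   # symmetric: True
print(np.all(np.linalg.eigvalsh(Sigma) >= -1e-12))   # positive semi-definite: True
```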
Correlation normalizes covariance to the range \([-1, 1]\): \[ \text{corr}[x, y] = \frac{\text{Cov}[x, y]}{\sqrt{V[x]V[y]}} \]
- \(\text{corr} = 1\): perfect positive relationship
- \(\text{corr} = -1\): perfect negative relationship
- \(\text{corr} = 0\): uncorrelated variables
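The normalization can be verified on sample data. In the sketch below, the correlation computed from the covariance and variances matches NumPy's `np.corrcoef`; the particular sample is an illustrative assumption.

```python
import numpy as np

# Correlation as covariance normalized by the standard deviations.
rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
y = 2.0 * x + rng.normal(size=10_000)  # positively correlated with x by construction

cov_xy = ((x - x.mean()) * (y - y.mean())).mean()
corr = cov_xy / np.sqrt(x.var() * y.var())

print(corr)                       # close to the value below
print(np.corrcoef(x, y)[0, 1])    # NumPy's sample correlation
```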
6.4.2 Empirical Means and Covariances
In practice, we estimate population statistics using sample data:
Definition 6.14 Empirical Mean: \[ \bar{x} = \frac{1}{N}\sum_{n=1}^{N} x_n \] Empirical Covariance: \[ \Sigma = \frac{1}{N}\sum_{n=1}^{N}(x_n - \bar{x})(x_n - \bar{x})^\top \]
Both the empirical mean and the empirical covariance are sample-based estimates of the corresponding population parameters.
Example 6.22 Suppose we observe paired data from two random variables \(X\) and \(Y\). The data consist of \(n = 4\) observations:
\[ (x_i, y_i) = (1, 2),\; (2, 3),\; (3, 5),\; (4, 4). \]
The empirical mean (sample mean) of \(X\) and \(Y\) is defined as \[ \bar{x} = \frac{1}{n}\sum_{i=1}^n x_i, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^n y_i. \]
Compute each mean: \[ \bar{x} = \frac{1+2+3+4}{4} = 2.5, \qquad \bar{y} = \frac{2+3+5+4}{4} = 3.5. \]
The empirical covariance between \(X\) and \(Y\) is \[ \widehat{\mathrm{Cov}}(X,Y) = \frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y}). \]
Compute each term: \[ \begin{aligned} (1-2.5)(2-3.5) &= ( -1.5)(-1.5) = 2.25, \\ (2-2.5)(3-3.5) &= (-0.5)(-0.5) = 0.25, \\ (3-2.5)(5-3.5) &= (0.5)(1.5) = 0.75, \\ (4-2.5)(4-3.5) &= (1.5)(0.5) = 0.75. \end{aligned} \]
Sum and divide by \(n\): \[ \widehat{\mathrm{Cov}}(X,Y) = \frac{2.25 + 0.25 + 0.75 + 0.75}{4} = 1.0. \]
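The same numbers can be reproduced with NumPy. Note that `np.cov` uses the \(1/(N-1)\) convention by default, so `bias=True` is passed here to match the \(1/N\) definition used above.

```python
import numpy as np

# Empirical mean and covariance for the paired data of Example 6.22.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 3.0, 5.0, 4.0])

print(x.mean(), y.mean())                        # 2.5 3.5
print(((x - x.mean()) * (y - y.mean())).mean())  # 1.0
print(np.cov(x, y, bias=True)[0, 1])             # 1.0 (bias=True -> divide by N)
```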
6.4.3 Alternative Expressions for the Variance
There are a number of alternative forms of the variance:
- Definition:
\[ V[x] = E[(x - \mu)^2] \]
- Raw-score formula:
\[ V[x] = E[x^2] - (E[x])^2 \]
- Pairwise difference form, with \(E[\cdot]\) interpreted as the empirical average over the sample \(x_1, \dots, x_N\) (a numerical check follows below):
\[ \frac{1}{N^2}\sum_{i,j=1}^{N}(x_i - x_j)^2 = 2\left(E[x^2] - (E[x])^2\right) \]
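The pairwise-difference identity is easy to verify numerically. In this sketch the expectations are the empirical averages of a randomly generated sample, which is an illustrative assumption.

```python
import numpy as np

# Numerical check of the pairwise-difference form of the variance.
rng = np.random.default_rng(0)
x = rng.normal(size=1_000)
N = len(x)

lhs = ((x[:, None] - x[None, :]) ** 2).sum() / N**2   # average squared pairwise difference
rhs = 2 * ((x**2).mean() - x.mean() ** 2)             # twice the (empirical) variance
print(np.isclose(lhs, rhs))  # True
```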
6.4.4 Sums and Transformations of Random Variables
Lemma 6.1 For two random variables \(X\) and \(Y\): \[ \begin{aligned} E[x + y] &= E[x] + E[y] \\ V[x + y] &= V[x] + V[y] + 2\,\text{Cov}[x, y] \end{aligned} \]
If we apply an affine transformation \(y = A x + b\): \[ \begin{aligned} E[y] &= A E[x] + b, \\ V[y] &= A V[x] A^\top \end{aligned} \]
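A Monte Carlo check of these rules is sketched below, using an arbitrarily chosen Gaussian \(X\), matrix \(A\), and offset \(b\) (all of which are assumptions for illustration, not part of the lemma).

```python
import numpy as np

# Monte Carlo check of E[y] = A E[x] + b and V[y] = A V[x] A^T for y = A x + b.
rng = np.random.default_rng(0)
mu    = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
b = np.array([0.5, -1.0])

x = rng.multivariate_normal(mu, Sigma, size=200_000)  # rows are samples of x
y = x @ A.T + b                                       # affine transformation per sample

print(y.mean(axis=0), A @ mu + b)                # both close to E[y]
print(np.cov(y, rowvar=False), A @ Sigma @ A.T)  # both close to V[y]
```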
Example 6.23 The definition \(V[x] = E[(x - \mu)^2]\) is equivalent to the raw-score formula \(V[x] = E[x^2] - (E[x])^2\). Recalling that \(\mu = E[x]\), we can use the linearity of the expected value to get: \[\begin{align*} E[(x - \mu)^2] &= E[(x - \mu)(x-\mu)] \\ &= E[x^2 - 2\mu x + \mu^2] \\ &= E[x^2] - 2\mu E[x] + \mu^2 \\ &= E[x^2] - 2E[x]^2 + E[x]^2\\ &=E[x^2] - E[x]^2. \end{align*}\]
6.4.5 Statistical Independence
Definition 6.15 Independent variables:
Two random variables \(X\) and \(Y\) are independent if
\[
p(x, y) = p(x)p(y)
\]
Conditional independence:
\(X\) and \(Y\) are conditionally independent given \(Z\) if
\[
p(x, y \mid z) = p(x \mid z)p(y \mid z)
\]
Conditional independence is often written as \(X \perp\!\!\!\perp Y \mid Z\). These relationships are fundamental in probabilistic modeling and graphical models.
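For discrete variables, independence can be checked by comparing the joint PMF with the product of its marginals. The sketch below reuses the joint table from Example 6.20, where the factorization holds exactly.

```python
# Checking p(x, y) = p(x) p(y) for a finite joint PMF (discrete case only).
joint = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

p_x, p_y = {}, {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p   # marginal of X
    p_y[y] = p_y.get(y, 0.0) + p   # marginal of Y

independent = all(
    abs(p - p_x[x] * p_y[y]) < 1e-12 for (x, y), p in joint.items()
)
print(independent)  # True
```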
6.4.6 Inner Products and Geometry of Random Variables
Random variables can be viewed as vectors in a space whose inner product is defined by the covariance: \[ \langle X, Y \rangle = \text{Cov}[x, y] \] The length of a random variable is its standard deviation: \[ \|X\| = \sqrt{V[x]} = \sigma[x] \] The angle \(\theta\) between two random variables is given by their correlation: \[ \cos(\theta) = \frac{\text{Cov}[x, y]}{\sqrt{V[x]V[y]}} = \text{corr}[x, y] \] In particular, \(\theta = 90^\circ\) if and only if the variables are uncorrelated (orthogonal in this space).
This geometric interpretation helps visualize dependence and uncertainty.
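A sample-based sketch of this geometry is shown below, with a pair of variables constructed so that their correlation is approximately 0.6; the construction and sample size are assumptions made purely for illustration.

```python
import numpy as np

# Geometric view: standard deviation as length, correlation as cosine of the angle.
rng = np.random.default_rng(2)
x = rng.normal(size=100_000)
y = 0.6 * x + 0.8 * rng.normal(size=100_000)  # built so that corr(x, y) ≈ 0.6

length_x = x.std()                  # "length" of X
length_y = y.std()                  # "length" of Y
corr = np.corrcoef(x, y)[0, 1]      # cosine of the angle between X and Y
theta_deg = np.degrees(np.arccos(corr))

print(length_x, length_y, corr, theta_deg)  # ≈ 1, 1, 0.6, 53.1 degrees
```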
Exercises
Exercise 6.17
The distribution of the amount of gravel (in tons) sold by a particular construction supply company in a given week is a continuous random variable \(X\) with pdf \[f(x) = \begin{cases} \dfrac{3}{2}(1-x^2) & 0 \leq x \leq 1 \\ 0 & \text{otherwise} \end{cases}.\]
Exercise 6.18
Let \(X\) be a random variable with PDF given by \[f_X(x)= \begin{cases} cx^2 & |x| \leq 1\\ 0 & \text{otherwise} \end{cases}.\]
Exercise 6.19
Let \(X\) be a positive continuous random variable. Prove that \(E[X] = \int_{0}^{\infty} P(X \geq x) dx\).