6.4 Summary Statistics and Independence
This section explores how to summarize and compare random variables using summary statistics, covariance, and independence. These concepts are essential for understanding uncertainty, quantifying relationships between variables, and giving random variables a geometric interpretation.
6.4.1 Means and Covariances
Definition 6.11 The expected value \(E_X[g(x)]\) of a function \(g(x)\) with respect to a random variable \(X\) represents the long-run average outcome weighted by its probability.
- Continuous case:
\[ E_X[g(x)] = \int_X g(x)p(x)\,dx \]
- Discrete case:
\[ E_X[g(x)] = \sum_{x \in X} g(x)p(x) \]
The mean is a special case where \(g(x) = x\), giving \(E_X[x]\). Two other related measures of central tendency are:
- Median: Middle value (robust to outliers)
- Mode: Most frequent or likely value
Example 6.17 Discrete Expected Value
Let \(X\) be a discrete random variable representing the number shown when rolling a fair six-sided die. Its probability mass function is \[ P(X = x) = \frac{1}{6}, \quad x = 1,2,3,4,5,6. \] Thus, the expected value is \[ \mathbb{E}[X] = \sum_{x=1}^6 x \, P(X=x) = \frac{1}{6}(1+2+3+4+5+6) = \frac{21}{6} = 3.5. \]
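The same calculation can be reproduced directly from the PMF; the following minimal Python sketch simply encodes the table above.

```python
# Expected value of a fair six-sided die, computed from its PMF.
pmf = {x: 1 / 6 for x in range(1, 7)}

expected_value = sum(x * p for x, p in pmf.items())
print(expected_value)  # 3.5
```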
Example 6.18 Continuous Expected Value with \(g(x) = x\) (The Mean)
Let \(X \sim \text{Uniform}(0,2)\). The probability density function is \[ f(x) = \begin{cases} \frac{1}{2}, & 0 \le x \le 2, \\ 0, & \text{otherwise}. \end{cases} \] Therefore, the expected value is \[ \mathbb{E}[X] = \int_{-\infty}^{\infty} x f(x)\, dx = \int_0^2 x \cdot \frac{1}{2} \, dx \] \[ = \frac{1}{2} \left[ \frac{x^2}{2} \right]_0^2 = \frac{1}{2} \cdot 2 = 1 \] This is the mean of the distribution.
Example 6.19 Continuous Expected Value with \(g(x) \neq x\)
Let \(X \sim \text{Uniform}(0,1)\), and define \[ g(X) = X^2. \] The expected value of \(g(X)\) is \[ \mathbb{E}[g(X)] = \mathbb{E}[X^2] = \int_0^1 x^2 \cdot 1 \, dx \] \[ = \left[ \frac{x^3}{3} \right]_0^1 = \frac{1}{3}. \]
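Both continuous examples can be checked numerically with a simple midpoint (Riemann-sum) approximation. The helper function and step count below are illustrative choices, not part of the examples themselves.

```python
# Midpoint-rule approximation of E[g(X)] = integral of g(x) * pdf(x) dx over [a, b].
def riemann_expectation(g, pdf, a, b, steps=10_000):
    dx = (b - a) / steps
    return sum(
        g(a + (i + 0.5) * dx) * pdf(a + (i + 0.5) * dx) * dx
        for i in range(steps)
    )

# Example 6.18: X ~ Uniform(0, 2), g(x) = x.
print(riemann_expectation(lambda x: x, lambda x: 0.5, 0, 2))      # ≈ 1.0

# Example 6.19: X ~ Uniform(0, 1), g(x) = x^2.
print(riemann_expectation(lambda x: x**2, lambda x: 1.0, 0, 1))   # ≈ 0.3333
```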
Definition 6.12 The covariance of two random variables measures how they vary together: \[ \text{Cov}[x, y] = E\big[(x - E[x])(y - E[y])\big] = E[xy] - E[x]E[y] \]
Example 6.20 Covariance of Two Random Variables
Let \(X\) and \(Y\) be discrete random variables with the following joint probability distribution:
| \(x\) | \(y\) | \(p(x,y)\) |
|---|---|---|
| 0 | 0 | \(0.25\) |
| 0 | 1 | \(0.25\) |
| 1 | 0 | \(0.25\) |
| 1 | 1 | \(0.25\) |
The expected value of \(X\) is \[ E[X] = \sum_{x,y} x \, p(x,y) = (0)(0.25) + (0)(0.25) + (1)(0.25) + (1)(0.25) = 0.5. \]
The expected value of \(Y\) is \[ E[Y] = \sum_{x,y} y \, p(x,y) = (0)(0.25) + (1)(0.25) + (0)(0.25) + (1)(0.25) = 0.5. \]
The expected value of \(XY\) is \[ E[XY] = \sum_{x,y} xy \, p(x,y) = (0)(0)(0.25) + (0)(1)(0.25) + (1)(0)(0.25) + (1)(1)(0.25) = 0.25. \]
Therefore, the covariance is \[ \text{Cov}[X,Y] = E[XY] - E[X]E[Y] = 0.25 - (0.5)(0.5) = 0, \] meaning there is no linear relationship between \(X\) and \(Y\). In this particular case, \(X\) and \(Y\) are also independent, since \(p(x,y) = p(x)p(y)\) for every pair of values; note, however, that zero covariance does not imply independence in general.
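The table-based computation translates directly into code; the dictionary below is just the joint PMF from Example 6.20, and the sketch reproduces each expectation in turn.

```python
# Covariance from the joint PMF of Example 6.20.
joint = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

E_x  = sum(x * p for (x, y), p in joint.items())
E_y  = sum(y * p for (x, y), p in joint.items())
E_xy = sum(x * y * p for (x, y), p in joint.items())

cov = E_xy - E_x * E_y
print(E_x, E_y, E_xy, cov)  # 0.5 0.5 0.25 0.0
```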
Definition 6.13 Variance is the covariance of a variable with itself: \[ V[x] = \text{Cov}[x, x] \]
The variance describes how much the data spread around the mean.
Example 6.21 Let \(X\) be a discrete random variable with probability mass function:
| \(x\) | \(p(x)\) |
|---|---|
| 0 | \(0.2\) |
| 1 | \(0.5\) |
| 2 | \(0.3\) |
What is the variance of \(X\)?
The expected value of \(X\) is \[ E[X] = \sum_x x p(x) = 0(0.2) + 1(0.5) + 2(0.3) = 1.1 \]
The expected value of \(X^2\) is \[ E[X^2] = \sum_x x^2 p(x) = 0^2(0.2) + 1^2(0.5) + 2^2(0.3) = 1.7 \]
Therefore, by definition, \[ \mathrm{Var}(X) = \mathrm{Cov}(X, X) = E[X^2] - E[X]^2 = 1.7 - (1.1)^2 = 1.7 - 1.21 = 0.49 \]
The variance measures how much the values of \(X\) spread out around the mean. Here, the variance of \(X\) is \(0.49\), which quantifies the variability of the random variable.
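As a sanity check, the same variance can be computed from the PMF in a few lines of Python; this sketch uses only the \(E[X^2] - E[X]^2\) route shown above.

```python
# Variance of the discrete random variable in Example 6.21.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

E_x  = sum(x * p for x, p in pmf.items())
E_x2 = sum(x**2 * p for x, p in pmf.items())

variance = E_x2 - E_x**2
print(E_x, E_x2, variance)  # 1.1 1.7 ≈ 0.49
```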
The covariance matrix summarizes all pairwise covariances for multivariate data and is symmetric and positive semi-definite. The covariance matrix has the following general form:
\[ \Sigma = \begin{bmatrix} \text{Cov}[x_1, x_1] & \text{Cov}[x_1, x_2] & \cdots & \text{Cov}[x_1, x_D] \\ \text{Cov}[x_2, x_1] & \text{Cov}[x_2, x_2] & \cdots & \text{Cov}[x_2, x_D] \\ \vdots & \vdots & \ddots & \vdots \\ \text{Cov}[x_D, x_1] & \text{Cov}[x_D, x_2] & \cdots & \text{Cov}[x_D, x_D] \end{bmatrix} \]
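For concreteness, the sketch below builds such a matrix from a small synthetic data set with NumPy (the data and dimensions are arbitrary assumptions) and checks that the result is symmetric with non-negative eigenvalues, i.e. positive semi-definite.

```python
import numpy as np

# Empirical covariance matrix of a synthetic data set (N = 500 samples, D = 3 features).
rng = np.random.default_rng(0)
data = rng.normal(size=(500, 3))  # illustrative data; any real-valued matrix works

Sigma = np.cov(data, rowvar=False, bias=True)  # bias=True -> normalization by N

print(np.allclose(Sigma, Sigma.T))                   # symmetric: True
print(np.all(np.linalg.eigvalsh(Sigma) >= -1e-12))   # positive semi-definite: True
```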
Correlation normalizes covariance to the range \([-1, 1]\): \[ \text{corr}[x, y] = \frac{\text{Cov}[x, y]}{\sqrt{V[x]V[y]}} \]
- \(\text{corr} = 1\): perfect positive relationship
- \(\text{corr} = -1\): perfect negative relationship
- \(\text{corr} = 0\): uncorrelated variables
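The normalization can be verified on sample data. In the sketch below, the correlation computed from the covariance and variances matches NumPy's `np.corrcoef`; the particular sample is an illustrative assumption.

```python
import numpy as np

# Correlation as covariance normalized by the standard deviations.
rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
y = 2.0 * x + rng.normal(size=10_000)  # positively correlated with x by construction

cov_xy = ((x - x.mean()) * (y - y.mean())).mean()
corr = cov_xy / np.sqrt(x.var() * y.var())

print(corr)                       # close to the value below
print(np.corrcoef(x, y)[0, 1])    # NumPy's sample correlation
```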
6.4.2 Empirical Means and Covariances
In practice, we estimate population statistics using sample data:
Definition 6.14 Empirical Mean: \[ \bar{x} = \frac{1}{N}\sum_{n=1}^{N} x_n \] Empirical Covariance: \[ \Sigma = \frac{1}{N}\sum_{n=1}^{N}(x_n - \bar{x})(x_n - \bar{x})^\top \]
Both the empirical mean and the empirical covariance are sample-based estimates of the corresponding population parameters.
Example 6.22 Suppose we observe paired data from two random variables \(X\) and \(Y\). The data consist of \(n = 4\) observations:
\[ (x_i, y_i) = (1, 2),\; (2, 3),\; (3, 5),\; (4, 4). \]
The empirical mean (sample mean) of \(X\) and \(Y\) is defined as \[ \bar{x} = \frac{1}{n}\sum_{i=1}^n x_i, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^n y_i. \]
Compute each mean: \[ \bar{x} = \frac{1+2+3+4}{4} = 2.5, \qquad \bar{y} = \frac{2+3+5+4}{4} = 3.5. \]
The empirical covariance between \(X\) and \(Y\) is \[ \widehat{\mathrm{Cov}}(X,Y) = \frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y}). \]
Compute each term: \[ \begin{aligned} (1-2.5)(2-3.5) &= ( -1.5)(-1.5) = 2.25, \\ (2-2.5)(3-3.5) &= (-0.5)(-0.5) = 0.25, \\ (3-2.5)(5-3.5) &= (0.5)(1.5) = 0.75, \\ (4-2.5)(4-3.5) &= (1.5)(0.5) = 0.75. \end{aligned} \]
Sum and divide by \(n\): \[ \widehat{\mathrm{Cov}}(X,Y) = \frac{2.25 + 0.25 + 0.75 + 0.75}{4} = 1.0. \]
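The same numbers can be reproduced with NumPy. Note that `np.cov` uses the \(1/(N-1)\) convention by default, so `bias=True` is passed here to match the \(1/N\) definition used above.

```python
import numpy as np

# Empirical mean and covariance for the paired data of Example 6.22.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 3.0, 5.0, 4.0])

print(x.mean(), y.mean())                        # 2.5 3.5
print(((x - x.mean()) * (y - y.mean())).mean())  # 1.0
print(np.cov(x, y, bias=True)[0, 1])             # 1.0 (bias=True -> divide by N)
```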
6.4.3 Alternative Expressions for the Variance
There are a number of alternative forms of the variance:
- Definition:
\[ V[x] = E[(x - \mu)^2] \]
- Raw-score formula:
\[ V[x] = E[x^2] - (E[x])^2 \]
- Pairwise difference form, with \(E[\cdot]\) interpreted as the empirical average over the sample \(x_1, \dots, x_N\) (a numerical check follows below):
\[ \frac{1}{N^2}\sum_{i,j=1}^{N}(x_i - x_j)^2 = 2\left(E[x^2] - (E[x])^2\right) \]
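The pairwise-difference identity is easy to verify numerically. In this sketch the expectations are the empirical averages of a randomly generated sample, which is an illustrative assumption.

```python
import numpy as np

# Numerical check of the pairwise-difference form of the variance.
rng = np.random.default_rng(0)
x = rng.normal(size=1_000)
N = len(x)

lhs = ((x[:, None] - x[None, :]) ** 2).sum() / N**2   # average squared pairwise difference
rhs = 2 * ((x**2).mean() - x.mean() ** 2)             # twice the (empirical) variance
print(np.isclose(lhs, rhs))  # True
```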
6.4.4 Sums and Transformations of Random Variables
Lemma 6.1 For two random variables \(X\) and \(Y\): \[ \begin{aligned} E[x + y] &= E[x] + E[y] \\ V[x + y] &= V[x] + V[y] + 2\,\text{Cov}[x, y] \end{aligned} \]
If we apply an affine transformation \(y = A x + b\): \[ \begin{aligned} E[y] &= A E[x] + b, \\ V[y] &= A V[x] A^\top \end{aligned} \]
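A Monte Carlo check of these rules is sketched below, using an arbitrarily chosen Gaussian \(X\), matrix \(A\), and offset \(b\) (all of which are assumptions for illustration, not part of the lemma).

```python
import numpy as np

# Monte Carlo check of E[y] = A E[x] + b and V[y] = A V[x] A^T for y = A x + b.
rng = np.random.default_rng(0)
mu    = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
b = np.array([0.5, -1.0])

x = rng.multivariate_normal(mu, Sigma, size=200_000)  # rows are samples of x
y = x @ A.T + b                                       # affine transformation per sample

print(y.mean(axis=0), A @ mu + b)                # both close to E[y]
print(np.cov(y, rowvar=False), A @ Sigma @ A.T)  # both close to V[y]
```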
Example 6.23 The definition \(V[x] = E[(x - \mu)^2]\) is equivalent to the raw-score formula \(V[x] = E[x^2] - (E[x])^2\). Recalling that \(\mu = E[x]\), we can use the linearity of the expected value to get: \[\begin{align*} E[(x - \mu)^2] &= E[(x - \mu)(x-\mu)] \\ &= E[x^2 - 2\mu x + \mu^2] \\ &= E[x^2] - 2\mu E[x] + \mu^2 \\ &= E[x^2] - 2E[x]^2 + E[x]^2\\ &=E[x^2] - E[x]^2. \end{align*}\]
6.4.5 Statistical Independence
Definition 6.15 Independent variables:
Two random variables \(X\) and \(Y\) are independent if
\[
p(x, y) = p(x)p(y)
\]
Conditional independence:
\(X\) and \(Y\) are conditionally independent given \(Z\) if
\[
p(x, y \mid z) = p(x \mid z)p(y \mid z)
\]
Conditional independence is often written as \(X \perp\!\!\!\perp Y \mid Z\). These relationships are fundamental in probabilistic modeling and graphical models.
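For discrete variables, independence can be checked by comparing the joint PMF with the product of its marginals. The sketch below reuses the joint table from Example 6.20, where the factorization holds exactly.

```python
# Checking p(x, y) = p(x) p(y) for a finite joint PMF (discrete case only).
joint = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

p_x, p_y = {}, {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p   # marginal of X
    p_y[y] = p_y.get(y, 0.0) + p   # marginal of Y

independent = all(
    abs(p - p_x[x] * p_y[y]) < 1e-12 for (x, y), p in joint.items()
)
print(independent)  # True
```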
6.4.6 Inner Products and Geometry of Random Variables
Random variables can be viewed as vectors in a space whose inner product is defined by the covariance: \[ \langle X, Y \rangle = \text{Cov}[x, y] \] The length of a random variable is its standard deviation: \[ \|X\| = \sqrt{V[x]} = \sigma[x] \] The angle \(\theta\) between two random variables is given by their correlation: \[ \cos(\theta) = \frac{\text{Cov}[x, y]}{\sqrt{V[x]V[y]}} = \text{corr}[x, y] \] In particular, \(\theta = 90^\circ\) if and only if the variables are uncorrelated (orthogonal in this space).
This geometric interpretation helps visualize dependence and uncertainty.
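A sample-based sketch of this geometry is shown below, with a pair of variables constructed so that their correlation is approximately 0.6; the construction and sample size are assumptions made purely for illustration.

```python
import numpy as np

# Geometric view: standard deviation as length, correlation as cosine of the angle.
rng = np.random.default_rng(2)
x = rng.normal(size=100_000)
y = 0.6 * x + 0.8 * rng.normal(size=100_000)  # built so that corr(x, y) ≈ 0.6

length_x = x.std()                  # "length" of X
length_y = y.std()                  # "length" of Y
corr = np.corrcoef(x, y)[0, 1]      # cosine of the angle between X and Y
theta_deg = np.degrees(np.arccos(corr))

print(length_x, length_y, corr, theta_deg)  # ≈ 1, 1, 0.6, 53.1 degrees
```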
Exercises
Exercise 6.17
The distribution of the amount of gravel (in tons) sold by a particular construction supply company in a given week is a continuous random variable \(X\) with pdf \[f(x) = \begin{cases} \dfrac{3}{2}(1-x^2) & 0 \leq x \leq 1 \\ 0 & \text{otherwise} \end{cases}.\]
Exercise 6.18
Let \(X\) be a random variable with PDF given by \[f_X(x)= \begin{cases} cx^2 & |x| \leq 1\\ 0 & \text{otherwise} \end{cases}.\]
Exercise 6.19
Let \(X\) be a positive continuous random variable. Prove that \(E[X] = \int_{0}^{\infty} P(X \geq x) dx\).