13.5 Chapter Summary

Interactive Tools:

  1. Law of Large Numbers Interactive - Use this interactive to watch sample averages converge to the underlying expectation as the number of samples grows.

  2. Convolution Visualizer - Use this tool to visualize convolutions.

  3. Distribution Plotter - Use this tool to visualize densities and to experiment with limiting distributions.

Variance of Sums and Averages

All definitions and results are available in Section 13.1.

  1. The variance in a sum of $n$ random variables, $S_n = \sum_{j=1}^n X_j$, is the sum of all the pairwise covariances:

    $$\text{Var}[S_n] = \sum_{i=1}^n \sum_{j=1}^n \text{Cov}[X_i, X_j] = \sum_{j=1}^n \text{Var}[X_j] + 2 \sum_{i=1}^n \sum_{j=i+1}^n \text{Cov}[X_i, X_j].$$
    • Positive associations increase the variance of the sum; negative associations decrease it.

    • If the variables are uncorrelated, as when they are independent, then the variance in the sum is the sum of the variances:

      $$\text{Var}[S_n] = \sum_{j=1}^n \text{Var}[X_j].$$
    • If all of the variables have the same variance and are uncorrelated, then:

      $$\text{Var}[S_n] = n \, \text{Var}[X_1].$$
  2. The variance in the sample average of $n$ random variables, $\bar{X}_n = \frac{1}{n} \sum_{j=1}^n X_j = \frac{1}{n} S_n$, is the average of all the pairwise covariances:

    $$\text{Var}[\bar{X}_n] = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \text{Cov}[X_i, X_j].$$
    • If the variables are uncorrelated, as when they are independent, then the variance in the sample average is:

      $$\text{Var}[\bar{X}_n] = \frac{1}{n^2}\sum_{j=1}^n \text{Var}[X_j].$$
    • If all of the variables have the same variance and are uncorrelated, then:

      $$\text{Var}[\bar{X}_n] = \frac{1}{n} \text{Var}[X_1].$$
    • So, the standard deviation of a sample average of $n$ independent, identical random variables is $\text{SD}[\bar{X}_n] = \frac{1}{\sqrt{n}} \text{SD}[X_1]$.

  3. As a result, the sample averages of independent, identical random variables are consistent estimators for their expectation, in the sense that the expected squared error in the estimate converges to zero as the number of samples diverges (see the simulation sketch after this list).

    • The same result holds provided the random variables have a convergent mean and are sufficiently uncorrelated; it does not require independent or identical random variables.
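To make the $1/\sqrt{n}$ scaling concrete, here is a minimal simulation sketch, assuming NumPy is available; the Exponential(1) distribution and the sample sizes are illustrative choices, not taken from the chapter. It estimates the standard deviation of the sample average at several values of $n$ and compares it to the predicted $\text{SD}[X_1]/\sqrt{n}$:

```python
import numpy as np

rng = np.random.default_rng(0)
num_trials = 10_000              # Monte Carlo experiments per sample size
sd_x1 = 1.0                      # SD of an Exponential(1) random variable

for n in [10, 100, 1_000]:
    # Each row is one experiment: n i.i.d. Exponential(1) draws.
    samples = rng.exponential(scale=1.0, size=(num_trials, n))
    xbar = samples.mean(axis=1)  # one sample average per experiment
    print(f"n={n:5d}  empirical SD of average={xbar.std():.4f}  "
          f"predicted SD/sqrt(n)={sd_x1 / np.sqrt(n):.4f}")
```

The empirical standard deviations should track the $1/\sqrt{n}$ prediction closely, shrinking toward zero as $n$ grows, which is exactly the consistency statement above.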

Tail Bounds

All definitions and results are available in Section 13.2.

  1. Markov’s Inequality: If $Y$ is a nonnegative random variable:

    $$\text{Pr}(Y > y_*) \leq \frac{\mathbb{E}[Y]}{y_*}$$

    for any $y_* > 0$.

  2. Chebyshev’s Inequality: If $Y$ is a random variable:

    $$\text{Pr}(|Y - \mathbb{E}[Y]| > \epsilon) \leq \frac{\text{Var}[Y]}{\epsilon^2}$$

    for any $\epsilon > 0$.
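As a quick numerical check of both bounds, here is a minimal sketch assuming NumPy; Exponential(1) is an arbitrary nonnegative test distribution with $\mathbb{E}[Y] = \text{Var}[Y] = 1$, chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.exponential(scale=1.0, size=100_000)  # nonnegative, E[Y] = Var[Y] = 1

# Markov: Pr(Y > y*) <= E[Y] / y*
for y_star in [2.0, 4.0, 8.0]:
    empirical = (y > y_star).mean()
    print(f"Markov    y*={y_star}: empirical {empirical:.4f} <= bound {1.0 / y_star:.4f}")

# Chebyshev: Pr(|Y - E[Y]| > eps) <= Var[Y] / eps^2
for eps in [2.0, 4.0]:
    empirical = (np.abs(y - 1.0) > eps).mean()
    print(f"Chebyshev eps={eps}: empirical {empirical:.4f} <= bound {1.0 / eps**2:.4f}")
```

The empirical tail probabilities sit well below both bounds; tail bounds of this kind trade tightness for generality, since they hold for any distribution satisfying the hypotheses.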

The Law of Large Numbers

All definitions and results are available in Section 13.3.

  1. The (Weak) Law of Large Numbers: If $\{X_j\}_{j=1}^n$ is a sequence of independent, identically distributed random variables with mean $\mu$ and finite variance, then:

    $$\text{Pr}(|\bar{X}_n - \mu| > \epsilon) \xrightarrow{n \rightarrow \infty} 0$$

    at rate $\mathcal{O}(n^{-1})$ or faster for all $\epsilon > 0$.

    • In other words, the distribution of the sample average concentrates about the underlying expectation.

    • We proved the weak law using Chebyshev’s inequality. The same statement holds anytime the variance in the sample average converges to zero as $n$ diverges, so it may also hold for sufficiently uncorrelated random variables.

  2. Observed frequencies are sample averages of indicators. The weak law therefore implies that the observed frequency of an event, in $n$ identical, independent repetitions of the process, is guaranteed to converge to the underlying chance of the event, if such a chance exists. Thus, the law of large numbers shows that, if chances exist, then they must be measurable using long-run frequencies (see the coin-flip sketch below).
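To see this in action (mirroring the Law of Large Numbers Interactive above), here is a minimal sketch, assuming NumPy; the fair coin and sample sizes are illustrative. The running frequency of heads is a running average of indicator variables, and it settles near the underlying chance 0.5:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.5                               # underlying chance of the event "heads"
flips = rng.random(100_000) < p       # indicator variable for each repetition

# Observed frequency of heads after the first n flips.
running_freq = np.cumsum(flips) / np.arange(1, flips.size + 1)
for n in [10, 100, 1_000, 10_000, 100_000]:
    print(f"n={n:6d}  observed frequency={running_freq[n - 1]:.4f}")
```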

The Central Limit Theorem

All definitions and results are available in Section 13.4.

  1. Normal Random Variables: A random variable $X \sim \text{Normal}(\mu, \sigma^2)$ is normally distributed with mean $\mu$ and variance $\sigma^2$ if it can take on any real value and has density function:

    $$f_{X}(x) = \frac{1}{\sqrt{2 \pi}\, \sigma} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma} \right)^2}.$$
    • A random variable $Z$ is drawn from a standard normal distribution if $\mu = 0$ and $\sigma = 1$. Then:

      $$f_{Z}(z) = \frac{1}{\sqrt{2 \pi}} e^{-\frac{1}{2}z^2}.$$
  2. The Central Limit Theorem (CLT): If $\{X_j\}_{j=1}^n$ are independent, identically distributed random variables with finite mean $\mu$ and finite variance $\sigma^2$, then the standardized sample average:

    $$Z_n = \frac{\bar{X}_n - \mathbb{E}[\bar{X}_n]}{\text{SD}[\bar{X}_n]} = \frac{S_n - \mathbb{E}[S_n]}{\text{SD}[S_n]}$$

    converges, in distribution, to a standard normal random variable:

    $$\lim_{n \rightarrow \infty} Z_n = Z \sim \text{Normal}(0, 1)$$

    regardless of the distribution used to sample the $X$’s.

    • As a result, sums and sample averages of many independent, identical random variables are approximately normally distributed.

    • In particular, $S_n$ is approximately drawn from a $\text{Normal}(n \mu, n \sigma^2)$ distribution and $\bar{X}_n$ is approximately drawn from a $\text{Normal}(\mu, \sigma^2/n)$ distribution.
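Here is a minimal sketch of the CLT, assuming NumPy; a skewed Exponential(1) source distribution is used deliberately to emphasize that the starting shape does not matter, and the sample sizes are illustrative. It standardizes many independent sample averages and compares a few empirical quantiles of $Z_n$ against standard normal quantiles:

```python
import numpy as np

rng = np.random.default_rng(0)
n, num_trials = 400, 20_000
mu, sigma = 1.0, 1.0                 # mean and SD of Exponential(1)

# Standardize each sample average: Z_n = (X̄_n - mu) / (sigma / sqrt(n)).
xbar = rng.exponential(scale=1.0, size=(num_trials, n)).mean(axis=1)
z_n = (xbar - mu) / (sigma / np.sqrt(n))

# Compare empirical quantiles of Z_n to standard normal quantiles.
for q, z_ref in [(0.025, -1.96), (0.5, 0.0), (0.975, 1.96)]:
    print(f"quantile {q}: empirical {np.quantile(z_n, q):+.3f}, Normal(0,1) {z_ref:+.3f}")
```

Even though each $X_j$ is heavily skewed, the standardized averages already match the standard normal quantiles closely at this $n$, which is the content of the convergence-in-distribution statement above.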