Interactive Tools:¶
Law of Large Numbers Interactive - Use this interactive to watch sample averages over large data sets converge to expectations.
Convolution Visualizer - Use this tool to visualize convolutions.
Distribution Plotter - Use this tool to visualize densities and to experiment with limiting distributions.
Variance of Sums and Averages¶
All definitions and results are available in Section 13.1.
The variance of a sum of random variables, $X_1, \dots, X_n$, is the sum of all the pairwise covariances:

$$\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \sum_{j=1}^{n} \mathrm{Cov}(X_i, X_j)$$
Positive associations increase the variance of the sum; negative associations decrease it.
If the variables are uncorrelated, as when they are independent, then the variance of the sum is the sum of the variances:

$$\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \mathrm{Var}(X_i)$$
If all of the variables have the same variance $\sigma^2$ and are uncorrelated, then:

$$\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = n \sigma^2$$
The variance of the sample average of random variables, $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$, is the average of all the pairwise covariances:

$$\mathrm{Var}(\bar{X}_n) = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} \mathrm{Cov}(X_i, X_j)$$
If the variables are uncorrelated, as when they are independent, then the variance of the sample average is:

$$\mathrm{Var}(\bar{X}_n) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(X_i)$$
If all of the variables have the same variance $\sigma^2$ and are uncorrelated, then:

$$\mathrm{Var}(\bar{X}_n) = \frac{\sigma^2}{n}$$
So, the standard deviation of a sample average of $n$ independent, identically distributed random variables is $\sigma / \sqrt{n}$.
As a result, sample averages of independent, identically distributed random variables are consistent estimators of their expectation, in the sense that the expected squared error of the estimate converges to zero as the number of samples grows.
The same result holds provided the random variables have a convergent mean and are sufficiently uncorrelated; it does not require independent or identically distributed random variables.
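As a quick check of the $\sigma^2/n$ result above, the following sketch (not from the text) simulates many sample averages of $n$ independent Uniform$(0,1)$ draws, which have variance $1/12$, and compares the empirical variance of those averages to the predicted $\sigma^2 / n$:

```python
import random

random.seed(0)

n = 50          # draws per sample average
trials = 20000  # number of sample averages simulated

# Each trial: average n iid Uniform(0, 1) draws.
averages = [sum(random.random() for _ in range(n)) / n for _ in range(trials)]

# Empirical variance of the sample averages.
mean = sum(averages) / trials
empirical_var = sum((a - mean) ** 2 for a in averages) / trials

# Predicted variance: sigma^2 / n, with sigma^2 = 1/12 for Uniform(0, 1).
predicted_var = (1 / 12) / n

print(empirical_var, predicted_var)  # the two should be close
```

With 20,000 trials the empirical variance typically lands within a few percent of the prediction.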
Tail Bounds¶
All definitions and results are available in Section 13.2.
Markov’s Inequality: If $X$ is a nonnegative random variable, then:

$$P(X \geq a) \leq \frac{E[X]}{a}$$

for any $a > 0$.
Chebyshev’s Inequality: If $X$ is a random variable with finite variance, then:

$$P(|X - E[X]| \geq a) \leq \frac{\mathrm{Var}(X)}{a^2}$$

for any $a > 0$.
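To see how loose these bounds can be, here is a small illustration (not from the text) using $X \sim \text{Exponential}(1)$, for which $E[X] = 1$, $\mathrm{Var}(X) = 1$, and the exact tail is $P(X \geq a) = e^{-a}$:

```python
import math

for a in [2, 4, 8]:
    exact = math.exp(-a)          # true tail probability for Exponential(1)
    markov = 1 / a                # Markov: P(X >= a) <= E[X] / a
    # Chebyshev applied to {X >= a}, which for a > 1 is contained in
    # {|X - E[X]| >= a - 1}, giving the bound Var(X) / (a - 1)^2:
    chebyshev = 1 / (a - 1) ** 2
    print(f"a={a}: exact={exact:.4f}  markov={markov:.3f}  chebyshev={chebyshev:.3f}")
```

Both bounds hold at every $a$, but the exact exponential tail is far smaller, which is typical: Markov and Chebyshev trade tightness for generality.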
The Law of Large Numbers¶
All definitions and results are available in Section 13.3.
The (Weak) Law of Large Numbers: If $X_1, X_2, \dots$ is a sequence of independent, identically distributed random variables with mean $\mu$, then:

$$P(|\bar{X}_n - \mu| \geq \epsilon) \to 0 \text{ as } n \to \infty$$

at rate $1/n$ or faster, for all $\epsilon > 0$.
In other words, the distribution of the sample average concentrates about the underlying expectation.
We proved the weak law using Chebyshev’s inequality. The same statement holds anytime the variance of the sample average converges to zero in the limit as $n$ diverges, so it may also hold for sufficiently uncorrelated random variables.
Observed frequencies are sample averages of indicators. The weak law implies that the observed frequency of an event, in identical, independent repetitions of the process, is guaranteed to converge to the underlying chance of the event, if such a chance exists. Thus, the law of large numbers shows that, if chances exist, then they must be measurable using long-run frequencies.
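The frequency interpretation can be sketched in a few lines. Assuming a fair coin, so the underlying chance is $0.5$, the observed frequency of heads is a sample average of indicators and concentrates around that chance as the number of flips grows:

```python
import random

random.seed(1)

freqs = {}
for n in [100, 10_000, 1_000_000]:
    # Each flip is an indicator: 1 if heads (probability 0.5), else 0.
    heads = sum(random.random() < 0.5 for _ in range(n))
    freqs[n] = heads / n
    print(n, freqs[n])  # observed frequency approaches 0.5
```

The fluctuations around $0.5$ shrink like $1/\sqrt{n}$, consistent with the $\sigma/\sqrt{n}$ standard deviation of a sample average.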
The Central Limit Theorem¶
All definitions and results are available in Section 13.4.
Normal Random Variables: A random variable $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$ if it can take on any real value and has density function:

$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$
A random variable $Z$ is drawn from a standard normal distribution if $\mu = 0$ and $\sigma^2 = 1$. Then:

$$f(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2 / 2}$$
The Central Limit Theorem (CLT): If $X_1, \dots, X_n$ are independent, identically distributed random variables with finite mean $\mu$ and finite variance $\sigma^2$, then the standardized sample average:

$$Z_n = \frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}}$$

converges, in distribution, to a standard normal random variable:

$$\lim_{n \to \infty} P(Z_n \leq z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-t^2 / 2} \, dt$$

regardless of the distribution used to sample the $X_i$’s.
As a result, sums and sample averages of many independent, identical random variables are approximately normally distributed.
In particular, $\sum_{i=1}^{n} X_i$ is approximately drawn from a $\mathcal{N}(n\mu, n\sigma^2)$ distribution and $\bar{X}_n$ is approximately drawn from a $\mathcal{N}(\mu, \sigma^2/n)$ distribution.
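A simulation sketch (not from the text) of the CLT: standardize sample averages of $n$ iid Uniform$(0,1)$ draws, for which $\mu = 1/2$ and $\sigma^2 = 1/12$, and check that roughly 68% of the standardized values fall within one unit of zero, as a standard normal predicts:

```python
import math
import random

random.seed(2)

n, trials = 100, 10_000
mu, sigma = 0.5, math.sqrt(1 / 12)  # mean and SD of Uniform(0, 1)

# Standardized sample average: (X_bar - mu) / (sigma / sqrt(n)).
z = [(sum(random.random() for _ in range(n)) / n - mu) / (sigma / math.sqrt(n))
     for _ in range(trials)]

# For a standard normal, P(|Z| < 1) is about 0.6827.
within_one = sum(abs(v) < 1 for v in z) / trials
print(within_one)
```

The printed fraction lands near $0.68$ even though the underlying draws are uniform, not normal, which is exactly the content of the theorem.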