In Section 5.1 we saw that two sequences converging to zero can appear quite similar but behave differently. Both the geometric sequence, $r^n$ for $r \in (0, 1)$, and the harmonic sequence, $1/n$, decay monotonically and converge to zero as $n$ diverges. However, the geometric series, $\sum_{n=1}^{\infty} r^n$, converges while the harmonic series, $\sum_{n=1}^{\infty} 1/n$, diverges. These two sequences are distinguished by the rate at which they approach zero.
This section introduces the visual and analytic tools needed to compare the rates at which sequences approach zero in a limit. Since these rates apply in a limit, they are often called asymptotic rates, where asymptotic stands for, “regarding approach to an asymptote.” We will use asymptotic rates to categorize the tail behavior of distributions. Distributions whose tails decay quickly are “light” tailed, while distributions whose tails decay slowly are “heavy” tailed. The heavier the tail, the more of the mass of the distribution is spread out over unusually extreme outcomes.
First, some definitions to standardize our language.
For example, $1, 2, 3, \dots$ is the sequence of all positive integers, while $1, 4, 9, \dots$ is the sequence of all squared integers.
Sequences are commonly expressed:
$$a_n = f(n), \qquad n = 1, 2, 3, \dots$$
where $f$ is a function that returns the $n$th entry in the sequence. For example, the sequence consisting of all squared integers could be expressed $a_n = n^2$.
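Discrete distribution functions for count variables are naturally expressed using an ordered list. If $X \in \{1, 2, 3, \dots\}$ then we can express its PMF as a sequence:
$$p_1, \; p_2, \; p_3, \; \dots$$
where $p_n = \Pr(X = n)$. For example, if $X$ is geometric with success probability $p$ and support starting at 1, then $p_n = p (1 - p)^{n - 1}$.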
The tail decay rate of a discrete distribution is the rate at which the sequence of PMF values, $\Pr(X = n)$, decays as $n$ diverges.
Visualizing Tail Decay
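Consider a geometric sequence with ratio $r \in (0, 1)$ and the harmonic sequence again:
$$r^n \qquad \text{and} \qquad \frac{1}{n}$$
The geometric sequence converges to zero faster, but that is hard to see if we plot each sequence as a function of $n$. When $r$ is close to 1, the geometric sequence decays more slowly at first, and, by the time it catches the harmonic sequence, both are already very small.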
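To make the crossover concrete, here is a minimal sketch in Python. The ratio `r = 0.9` is an assumption chosen for illustration (the plotted example may use a different value), and `first_crossover` is a hypothetical helper name.

```python
def first_crossover(r, start=2, max_n=10_000):
    """Return the first n >= start where r**n < 1/n, or None if none is found.

    Illustrative assumption: r is the ratio of a geometric sequence r**n,
    which we compare against the harmonic sequence 1/n.
    """
    for n in range(start, max_n + 1):
        if r**n < 1.0 / n:
            return n
    return None


r = 0.9  # assumed ratio, close to 1 so the geometric sequence starts out larger
n_star = first_crossover(r)
print(f"geometric drops below harmonic at n = {n_star}")

# Print both sequences side by side around the crossover.
for n in (2, 10, n_star, 100):
    print(n, r**n, 1.0 / n)
```

With this choice of `r`, the geometric sequence stays above the harmonic sequence until $n = 34$, at which point both values are already below 0.03. That is why a plot on linear axes hides the difference.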

The same problem shows up anytime we want to compare the tails of distributions. For example, if $X$ has power law tails, then we could have $\Pr(X = n) \propto n^{-k}$ for any $k > 1$. Does $n^{-2}$ converge faster or slower than $n^{-3}$? How does the tail decay rate depend on the power of the power law, $k$?
Once again, a plot is not too helpful, since the action is all squeezed too close to zero for the eye to discern.
For specificity, let’s focus on three power law type sequences:
$$n^{-1}, \qquad n^{-2}, \qquad n^{-3}$$
It shouldn’t be too surprising that the larger the negative power, the faster the sequence decays. Here’s a table comparing values for inputs spaced as powers of ten.
| $n$ | 1 | 10 | 100 | 1000 |
|---|---|---|---|---|
| $n^{-1}$ | 1 | 0.1 | 0.01 | 0.001 |
| $n^{-2}$ | 1 | 0.01 | 0.0001 | 0.000001 |
| $n^{-3}$ | 1 | 0.001 | 0.000001 | 0.000000001 |
Notice that the last sequence converges much faster than the first. The first sequence adds one zero before the “1” each time we increase $n$ by a factor of ten. The second adds two zeros per factor of ten. The third adds three zeros per factor of ten. Notice that the number of zeros added matches the power of the power law.
Let’s write the table again, this time using scientific notation:
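| $n$ | $10^{0}$ | $10^{1}$ | $10^{2}$ | $10^{3}$ |
|---|---|---|---|---|
| $n^{-1}$ | $10^{0}$ | $10^{-1}$ | $10^{-2}$ | $10^{-3}$ |
| $n^{-2}$ | $10^{0}$ | $10^{-2}$ | $10^{-4}$ | $10^{-6}$ |
| $n^{-3}$ | $10^{0}$ | $10^{-3}$ | $10^{-6}$ | $10^{-9}$ |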
Finally, report only the exponents by changing the header from $n$ to $\log_{10}(n)$, and the entries from the sequence values to the $\log_{10}$ of the sequence values:
| $\log_{10}(n)$ | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| $\log_{10}(n^{-1})$ | 0 | -1 | -2 | -3 |
| $\log_{10}(n^{-2})$ | 0 | -2 | -4 | -6 |
| $\log_{10}(n^{-3})$ | 0 | -3 | -6 | -9 |
Converting to a log scale on the input and output makes it much easier to compare rates of convergence. Here’s a corresponding plot using a log scale on both the input and output axes.

Notice, on the log-log plot, the power laws have become lines, whose slopes match their powers. The larger the negative power, the steeper the line on the log-log plot.
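A short worked equation shows why the power laws become lines. Write the power as $k$ (a symbol assumed here for illustration) and take $\log_{10}$ of the power law $n^{-k}$:

$$\log_{10}\left(n^{-k}\right) = -k \, \log_{10}(n)$$

If the horizontal axis shows $u = \log_{10}(n)$ and the vertical axis shows $v = \log_{10}(n^{-k})$, then $v = -k \, u$. This is a line through the origin with slope $-k$.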
Log-log plots are very helpful tools for distinguishing rates of convergence. They are a good choice for visualizing tail decay since they convert power laws to lines, whose slope equals the power of the power law.
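The slope can also be read off numerically, without drawing the plot. The sketch below is a minimal illustration: `loglog_slope` is a hypothetical helper, and the power laws $n^{-1}$, $n^{-2}$, $n^{-3}$ are used as test inputs.

```python
import math


def loglog_slope(f, n1, n2):
    """Slope of the line through (log10 n1, log10 f(n1)) and (log10 n2, log10 f(n2))."""
    rise = math.log10(f(n2)) - math.log10(f(n1))
    run = math.log10(n2) - math.log10(n1)
    return rise / run


# Test inputs: power laws n**(-k) for k = 1, 2, 3 (illustrative choice).
for k in (1, 2, 3):
    slope = loglog_slope(lambda n, k=k: float(n) ** (-k), 10, 1000)
    print(f"k = {k}: estimated log-log slope = {slope:.3f}")
```

For a pure power law the estimate is the same no matter which two inputs are used. For other sequences the estimate depends on where it is measured, which is itself a useful diagnostic.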
Comparing Asymptotic Rates
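Formally, we can compare the rates at which two sequences go to zero by taking the limit of their ratio. Given two sequences $a_n$ and $b_n$ that both converge to zero, $a_n$ converges faster than $b_n$ if $\lim_{n \to \infty} a_n / b_n = 0$. If the ratio instead converges to a finite, nonzero constant, then the two sequences converge at the same rate.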
For example, $n^{-2}$ converges faster than $n^{-1}$ since $n^{-2} / n^{-1} = n^{-1}$, and $n^{-1}$ converges to zero as $n$ diverges.
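A quick numeric check of a ratio can suggest what its limit will be before doing any algebra. This is a minimal sketch: `ratio_table` is a hypothetical helper, and the pair $n^{-2}$ and $n^{-1}$ is an illustrative choice of sequences.

```python
def ratio_table(a, b, ns=(10, 100, 1000, 10_000)):
    """Return the ratios a(n) / b(n) for each n in ns."""
    return [a(n) / b(n) for n in ns]


# Illustrative pair: a_n = n^-2 and b_n = n^-1, so a_n / b_n = n^-1.
ns = (10, 100, 1000, 10_000)
ratios = ratio_table(lambda n: float(n) ** -2, lambda n: float(n) ** -1, ns)
for n, ratio in zip(ns, ratios):
    print(n, ratio)
```

The ratios shrink by a factor of ten each time $n$ grows by a factor of ten, which is consistent with a limit of zero. A table like this only suggests the limit; the algebraic argument is what establishes it.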
Less trivially, consider the example sequences:
Taking their ratio:
Then, we want to find the value of the limit of that ratio as $n$ diverges.
There are two ways to evaluate this limit. First, try simplifying. Divide the top and the bottom by the largest power of $n$ that appears:
Alternatively, recognize that higher powers of $n$ diverge faster than lower powers, so for large $n$ the terms with the largest power will dominate. For example, when $n$ is large:
So, since the largest-power terms dominate, we can simplify by dropping all smaller terms:
So, in this case, the two sequences converge to zero at the same rate.
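As a worked illustration of both approaches, take two hypothetical sequences (assumed here for illustration, not necessarily the ones used in the example above):

$$a_n = \frac{3n + 1}{n^3 + 2n}, \qquad b_n = \frac{1}{n^2}, \qquad \frac{a_n}{b_n} = \frac{3n^3 + n^2}{n^3 + 2n}$$

Simplifying by dividing the top and the bottom by $n^3$:

$$\lim_{n \to \infty} \frac{3n^3 + n^2}{n^3 + 2n} = \lim_{n \to \infty} \frac{3 + 1/n}{1 + 2/n^2} = \frac{3 + 0}{1 + 0} = 3$$

Keeping only the dominating terms gives the same answer:

$$\lim_{n \to \infty} \frac{3n^3 + n^2}{n^3 + 2n} = \lim_{n \to \infty} \frac{3n^3}{n^3} = 3$$

The limit is finite and nonzero, so these two sequences converge to zero at the same rate.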
For example, the sequence goes to zero as fast as , so .
Anytime you are given a rational function for $a_n$, you can identify its order of convergence by stripping out everything except the dominating term in the numerator and denominator. These are the terms where $n$ is raised to the largest power.
For example, the sequence:
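The dominating-term rule for rational functions is mechanical enough to sketch in code. The block below is a minimal illustration, not a general-purpose tool: `dominant_term` and `rational_decay` are hypothetical helper names, and the coefficients describe a made-up sequence, $a_n = (3n + 1)/(n^3 + 2n)$.

```python
def dominant_term(coeffs):
    """Return (power, coefficient) of the highest-power nonzero term.

    coeffs[i] is the coefficient of n**i, so [4, 0, 1] means n**2 + 4.
    """
    for power in range(len(coeffs) - 1, -1, -1):
        if coeffs[power] != 0:
            return power, coeffs[power]
    raise ValueError("polynomial is identically zero")


def rational_decay(numerator, denominator):
    """Return (k, c) such that the rational sequence behaves like c * n**(-k) for large n."""
    p, a = dominant_term(numerator)
    q, b = dominant_term(denominator)
    return q - p, a / b


# Hypothetical example: a_n = (3n + 1) / (n^3 + 2n)
k, c = rational_decay([1, 3], [0, 2, 0, 1])
print(f"a_n behaves like {c} * n^(-{k}) for large n")
```

Here the numerator is dominated by $3n$ and the denominator by $n^3$, so the sketch reports a decay like $3 n^{-2}$. The constant does not change the rate, only the power does.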
If you can’t simplify the limit directly, or by stripping out everything but the dominating terms, then try using L’Hopital’s rule.
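Consider the sequences:
$$a_n = e^{-n}, \qquad b_n = n^{-k}$$
for some integer $k \geq 1$. Which converges faster?
Let $f(x) = x^k$ and $g(x) = e^{x}$, so that $a_n / b_n = n^k e^{-n} = f(n) / g(n)$. Both $f$ and $g$ diverge, so we can apply L’Hopital’s rule to work out the limit.
If $k = 1$:
$$\lim_{x \to \infty} \frac{x}{e^{x}} = \lim_{x \to \infty} \frac{1}{e^{x}} = 0$$
So, $e^{-n}$ converges faster than $n^{-1}$. That’s not surprising. Exponential sequences decay quickly, while harmonic sequences decay slowly.
If $k = 2$:
$$\lim_{x \to \infty} \frac{x^2}{e^{x}} = \lim_{x \to \infty} \frac{2x}{e^{x}} = 2 \lim_{x \to \infty} \frac{x}{e^{x}} = 0$$
since we’ve already shown that $\lim_{x \to \infty} x / e^{x} = 0$.
So, $e^{-n}$ also converges faster than $n^{-2}$.
If $k = 3$:
$$\lim_{x \to \infty} \frac{x^3}{e^{x}} = \lim_{x \to \infty} \frac{3x^2}{e^{x}} = 3 \lim_{x \to \infty} \frac{x^2}{e^{x}} = 0$$
since we’ve already shown that $\lim_{x \to \infty} x^2 / e^{x} = 0$.
So, $e^{-n}$ also converges faster than $n^{-3}$.
Repeating this argument inductively yields the following conclusion: the exponential sequence $e^{-n}$ converges to zero faster than the power law $n^{-k}$ for every integer $k \geq 1$.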
This rule generalizes our observation about geometric and harmonic sequences. The geometric sequence decays exponentially, so it decays faster than any power law. A distribution with exponential tails will have “lighter” tails than any distribution with power law tails.
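The rule can be checked numerically. The sketch below assumes the exponential sequence is $e^{-n}$ and the power law is $n^{-k}$ (symbols chosen here for illustration), so their ratio is $n^k e^{-n}$. The helper name `exp_to_power_ratio` is hypothetical.

```python
import math


def exp_to_power_ratio(n, k):
    """Ratio e^(-n) / n^(-k), which simplifies to n^k * e^(-n)."""
    return n**k * math.exp(-n)


# Evaluate the ratio at increasingly large n for several powers k.
for k in (1, 2, 3, 10):
    values = [exp_to_power_ratio(n, k) for n in (10, 50, 100, 200)]
    print(f"k = {k}:", ["%.3g" % v for v in values])
```

For large $k$ the ratio can be large at first, since $n^k e^{-n}$ peaks at $n = k$, but it collapses toward zero once $n$ is well past $k$. The limit is about what happens eventually, not what happens early on.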
The plot below compares the exponential function to the power laws we plotted before.

The exponential clearly converges faster than $n^{-1}$, $n^{-2}$, or $n^{-3}$. It converges faster than $n^{-k}$ for any $k$ since the power law $n^{-k}$ would be a line with slope $-k$ on the log-log plot, while the exponential is a concave curve whose slope grows steeper as $n$ increases on the log-log plot. No matter what $k$ you choose, I could pick an $n$ large enough so that the exponential curve is decreasing faster than $n^{-k}$.
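The steepening slope can be worked out directly, assuming the exponential sequence is $e^{-n}$ and the power law is $n^{-k}$ (symbols chosen here for illustration). Let $u = \log_{10}(n)$ be the horizontal coordinate on the log-log plot, so $n = 10^{u}$. Then the height of the exponential curve is:

$$\log_{10}\left(e^{-n}\right) = -n \, \log_{10}(e) = -10^{u} \log_{10}(e)$$

and its slope with respect to $u$ is:

$$\frac{d}{du}\left[-10^{u} \log_{10}(e)\right] = -10^{u} \ln(10) \log_{10}(e) = -10^{u} = -n$$

since $\ln(10) \log_{10}(e) = 1$. So on the log-log plot the exponential curve has slope $-n$ at input $n$, which is steeper than the power law's slope of $-k$ as soon as $n > k$.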
Ordering Tail Decay Rates
During this week you will practice ordering sequences by their decay rates. We’ve set up some essential references in this chapter.
Our slowest example is the harmonic sequence, $1/n$, which converges so slowly that its associated series, $\sum_{n=1}^{\infty} 1/n$, diverges.
Our fastest examples are exponential/geometric, $r^n$ for $r \in (0, 1)$.
Power laws of the form $n^{-k}$ for $k > 1$ fall in between. They are faster than harmonic, but slower than exponential. They are ordered by their power: the larger $k$, the faster the decay.
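The reference ordering can be sketched in code by comparing the logarithm of each sequence at a single large input. This is a heuristic under stated assumptions, not a proof: the names in `log_sequences`, the ratio 0.9, and the helper `order_by_decay` are all hypothetical, and each sequence is entered in log form to avoid numerical underflow.

```python
import math

# Hypothetical reference sequences, each given as log(a_n).
log_sequences = {
    "harmonic 1/n": lambda n: -math.log(n),
    "power law n^-2": lambda n: -2 * math.log(n),
    "power law n^-3": lambda n: -3 * math.log(n),
    "geometric 0.9^n": lambda n: n * math.log(0.9),
}


def order_by_decay(log_seqs, n=10**6):
    """Sort names from slowest to fastest decay by comparing log(a_n) at one large n."""
    return sorted(log_seqs, key=lambda name: log_seqs[name](n), reverse=True)


print(order_by_decay(log_sequences))
```

The choice of `n` matters. At `n=10` the geometric sequence with ratio 0.9 is still the largest of the four, so the heuristic would misplace it. The limit comparisons described above are what settle the ordering.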
Whenever you are given a distribution function, it is a good idea to check the tails, and to try and determine the order of convergence of the tail to zero. In essence, try to match its decay rate to the decay rate of some simple reference function (e.g. a power law or an exponential).