Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

10.1 Iterated Integrals and Sums

Iterated Integration

Consider a pair of variables [x,y][x,y]. Suppose that f(x,y)f(x,y) is a surface. Then, the volume beneath the surface over a region R\mathcal{R} is defined by the double integral:

In Section 8.3 we saw that, if fX,Yf_{X,Y} is a density function for a random vector with two entries, then the chance a random vector falls in some region is defined as a double integral. Similarly, the expected value of a function of a random vector is defined as a double integral. In particular:

Pr([X,Y]R)=[x,y]RfX,Y(x,y)dxdy,E[g(x,y)]=all [x,y]g(x,y)fX,Y(x,y)dxdy.\begin{aligned} &\text{Pr}([X,Y] \in \mathcal{R}) = \iint_{[x,y] \in \mathcal{R}} f_{X,Y}(x,y) dx dy, \\ &\mathbb{E}[g(x,y)] = \iint_{\text{all } [x,y]} g(x,y) f_{X,Y}(x,y) dx dy. \end{aligned}

These observations generalize to more variables. For example, the chance a random vector with three entries lands in some region is expressed as a triple integral. In this chapter we will focus on the two-dimensional case. All results generalize directly to higher dimensions.

The uniform example worked above provides our first solution to a double integral.

You can compare this result to the same statement in one dimension. In one dimension, the integral of a constant over the interval is the constant times the length of the interval.

The same rule applies for double sums. The double sum of a constant equals the constant times the total number of possible index pairs used in the sum.

To evaluate general double integrals we use iterated integration.

Geometrically, iterated integration finds the volume beneath a surface by cutting the surface into cross-sections, evaluating the area under each cross-section then integrating the areas.

Note, we get to choose which variable to integrate over first. We can either integrate over yy, then over xx, or over xx, then over yy. This is an imprecise statement of Fubini’s Theorem. We won’t study the theorem formally in this class, but will use the result.

Usually we evaluate a double integral in three steps:

  1. Pick an order. Choose the order so that the inner integral is easy to evaluate.

  2. Suppose we chose to integrate over yy first. Then, let R(x)\mathcal{R}(x) denote the set of all yy such that [x,y]R[x,y] \in \mathcal{R}. You can picture R(x)\mathcal{R}(x_*) as the interval formed by drawing the intersection of R\mathcal{R} with the vertical line at x=xx = x_*. Evaluate:

    A(x)=yR(x)f(x,y)dyA(x) = \int_{y \in \mathcal{R}(x)} f(x,y) dy

    to find the area of the yy-cross section of ff, over R\mathcal{R}, at xx.

  3. Integrate over xx:

    [x,y]Rf(x,y)dxdy=xRA(x)dx.\iint_{[x,y] \in \mathcal{R}} f(x,y) dx dy = \int_{x \in \mathcal{R}} A(x) dx.

Notice that, by iterating, we convert a single integration problem over two variables into a sequence of two integration problems, each with respect to a single variable. To evaluate each single integral, we can use any of the techniques we’ve learned before for integration of a univariate function.

Rectangular Regions

Suppose that R\mathcal{R} is a rectangle. Then R={x[a,b],y[c,d]}\mathcal{R} = \{x \in [a,b], y \in [c,d]\} for some aba \leq b and some cdc \leq d. Often, rectangles are denoted using the shorthand, R=[a,b]×[c,d]\mathcal{R} = [a,b] \times [c,d].

If R\mathcal{R} is a rectangle, then the range of available yy is [c,d][c,d], no matter xx. Similarly, the range of available xx is [a,b][a,b], no matter yy. So, the double integral can be expanded:

[x,y][a,b]×[c,d]f(x,y)dxdy={x=ab(y=cdf(x,y)dy)dxy=cd(x=abf(x,y)dx)dy\iint_{[x,y] \in [a,b] \times [c,d]} f(x,y) dx dy = \begin{cases} & \int_{x =a}^b \left(\int_{y = c}^d f(x,y) dy \right) dx \\ & \int_{y =c}^d \left(\int_{x = a}^b f(x,y) dx \right) dy \end{cases}

Run the code cell below to visualize a joint density surface as a three-D histogram. It will open the same demo we used in Section 8.3.

To help visualize this volume, try running the code cell below. Pick n=10n = 10. Then select “3D Perspective” and “Normalize by area”. Then, select the rectangular region R={x[0.20,0.50],y[0,1]}\mathcal{R} = \{x \in [0.20,0.50], y \in [0,1]\}. You will see your rectangle highlighted with red boundaries.

from utils_joint_distribution import run_joint_distribution_demo

run_joint_distribution_demo()

Read the volume reported in the blue box at the bottom. Record the value.

Next, repeat the same experiment, computing the volumes for x[0.20,0.30]x \in [0.20,0.30], then for x[0.30,0.40]x \in [0.30, 0.40], then for x[0.40,0.50]x \in [0.40, 0.50]. Record each reported volume. Notice that each volume is the volume under a yy-cross-section of the histogram.

Add up your three volumes. They will return the volume your originally computed for x[0.20,0.50]x \in [0.20,0.50]. This should be no surprise. The original volume was equal to the sum of the volume under every highlighted histogram bar. By adding over yy with xx fixed, then over xx, you add over all the histogram bars.

This same logic applies no matter how small we make the boxes. It explains the iterated integral formula. Iterated integration is the same process we repeated above, in the limit as nn goes to infinity.

Try making nn large now. The reported volume is still the sum of the volume under each bar, and the collection of bars can still be added by first summing over yy fixing xx, then over xx.

Examples

  1. Consider the function f(x,y)=y(1(x+y))f(x,y) = y (1 - (x + y)). Let’s integrate it over [0,0.25]×[0,0.5][0,0.25] \times [0,0.5]. First, apply iterated integration:

    [x,y][0,0.25]×[0,0.5]y(1(x+y))dxdy=x=00.25(y=00.5y(1(x+y))dy)dx\begin{aligned} \iint_{[x,y] \in [0,0.25] \times [0,0.5]} y (1 - (x + y)) dx dy & = \int_{x = 0}^{0.25} \left( \int_{y = 0}^{0.5} y (1 - (x + y)) dy \right) dx \end{aligned}

    Next, evaluate the inner integral

    y=00.5y(1(x+y))dy=(1x)y=00.5ydyy=00.5y2dy=(1x)12y2y=00.513y3y=00.5=(1x)12(0.520)13(0.530)=18(1x)112=(18112)18x=12418x=18(13x)\begin{aligned} \int_{y = 0}^{0.5} y (1 - (x + y)) dy & = (1 - x) \int_{y = 0}^{0.5} y dy - \int_{y = 0}^{0.5} y^2 dy \\ & = (1 - x) \frac{1}{2} y^2 \Big|_{y = 0}^{0.5} - \frac{1}{3} y^3 \Big|_{y = 0}^{0.5} \\ & = (1 - x) \frac{1}{2} (0.5^2 - 0) - \frac{1}{3}(0.5^3 - 0) \\ & = \frac{1}{8}(1 - x) - \frac{1}{12} \\ & = \left(\frac{1}{8} - \frac{1}{12} \right) - \frac{1}{8} x \\ & = \frac{1}{24} - \frac{1}{8} x \\ & = \frac{1}{8} \left(\frac{1}{3} - x \right)\end{aligned}

    So:

    A(x)=18(13x).A(x) = \frac{1}{8} \left(\frac{1}{3} - x \right).

    Notice, the area under yy cross-section depends on xx since changing xx changes the cross-section.

    Finally, evaluating the outer integral:

    x=00.25A(x)dx=18x=00.25(13x)dx=18(13x12x2)x=00.25=18(112132)=18×596=57680.007\begin{aligned} \int_{x = 0}^{0.25} A(x) dx & = \frac{1}{8} \int_{x = 0}^{0.25} \left(\frac{1}{3} - x \right) dx \\ & = \frac{1}{8} \left( \frac{1}{3} x - \frac{1}{2} x^2 \right)_{x =0}^{0.25} \\ & = \frac{1}{8} \left( \frac{1}{12} - \frac{1}{32} \right) \\ & = \frac{1}{8} \times \frac{5}{96} = \frac{5}{768} \approx 0.007 \end{aligned}
  2. Suppose that [X,Y][X,Y]is a continuous random vector supported on the entire x,yx,y plane. Let fX,Yf_{X,Y} denote the joint density function of XX and YY. What is the chance that X[1,b]X \in [1,b]?

    We can approach this problem two ways. First, since our question only involved XX, we can ignore YY, and work with the marginal density of XX, fX(x)f_X(x). Then, as in Section 2.3 and Section 2.4, the chance XX is contained in an interval is:

    Pr(X[a,b])=x=abfX(x)dx.\text{Pr}(X \in [a,b]) = \int_{x = a}^b f_X(x) dx.

    Alternately, the chance X[a,b]X \in [a,b] is the chance [X,Y]R[X,Y] \in \mathcal{R} where R\mathcal{R} is the rectangle with sides [a,b][a,b] and [,][-\infty,\infty] since this rectangle places no constraints on YY. Then:

    Pr(X[a,b])=Pr(X[a,b],Y(,))=[x,y]RfX,Y(x,y)dxdy.\text{Pr}(X \in [a,b]) = \text{Pr}(X \in [a,b],Y \in (-\infty,\infty)) = \iint_{[x,y] \in \mathcal{R}}f_{X,Y}(x,y) dx dy.

    Let’s check that these two answers are the same.

    Expand the double integral as an iterated integral:

    [x,y]RfX,Y(x,y)dxdy=x=ab(y=fX,Y(x,y)dy)dx.\iint_{[x,y] \in \mathcal{R}}f_{X,Y}(x,y) dx dy = \int_{x = a}^b \left( \int_{y = -\infty}^{\infty} f_{X,Y}(x,y) dy \right) dx.

    The inner integral returns the marginal density of XX (see Section 8.3):

    y=fX,Y(x,y)dy=fX(x).\int_{y = -\infty}^{\infty} f_{X,Y}(x,y) dy = f_X(x).

    So, the iterated integral is equivalent to:

    x=ab(y=fX,Y(x,y)dy)dx=x=abfX(x)dx.\int_{x = a}^b \left( \int_{y = -\infty}^{\infty} f_{X,Y}(x,y) dy \right) dx = \int_{x = a}^b f_X(x) dx.

    The last integral is the integral we used originally, so our answers match!

Sums and Products

If ff can be expressed as a sum or product of univariate functions, then the double integral of ff over a rectangular region simplifies.

For example:

[x,y][1,5]×[0,3]xy2dxdy=3x=15xdx4y=03y2dy.\iint_{[x,y] \in [1,5] \times [0,3]} x - y^2 dx dy = 3 \int_{x = 1}^5 x dx - 4 \int_{y = 0}^3 y^2 dy.

Here are two examples:

  1. Suppose that XX and YY are uniformly distributed over the rectangle [0,1]×[0,2][0,1] \times [0,2]. The rectangle has area 1×2=21 \times 2 = 2, so the joint density equals 1/21/2 everywhere inside the support. Then:

    E[XY]=[x,y][0,1]×[0,2]xyfX,Y(x,y)dxdy=12[x,y][0,1]×[0,2]xydxdy.\mathbb{E}[XY] = \iint_{[x,y] \in [0,1] \times [0,2]} x y f_{X,Y}(x,y) dx dy = \frac{1}{2} \iint_{[x,y] \in [0,1] \times [0,2]} x y dx dy.

    Then, applying our rule for products, the double integral of the product is the product of the single integrals:

    E[XY]=12(x=01xdx)×(y=02ydy).\mathbb{E}[XY] = \frac{1}{2} \left(\int_{x = 0}^1 x dx \right) \times \left(\int_{y = 0}^2 y dy \right).
  2. Suppose that XX and YY are independent variables. Then, as shown in Section 8.3, their joint density is the product of their marginals: fX,Y(x,y)=fX(x)×fY(y)f_{X,Y}(x,y) = f_X(x) \times f_Y(y). So:

    Pr(X[a,b],Y[c,d])=[x,y][a,b]×[c,d]fX,Y(x,y)dxdy=(x=abfX(x)dx)×(y=cdfY(y)dy)=Pr(X[a,b])×Pr(Y[c,d]).\begin{aligned} \text{Pr}(X \in [a,b],Y \in [c,d]) & = \iint_{[x,y] \in [a,b] \times [c,d]} f_{X,Y}(x,y) dx dy \\ & = \left( \int_{x = a}^b f_X(x) dx \right) \times \left( \int_{y = c}^d f_Y(y) dy \right) \\ & = \text{Pr}(X \in [a,b]) \times \text{Pr}(Y \in [c,d]). \end{aligned}

    So, the product rule for double integrals recovers the product rule for independent random variables. Joint probabilities equal the product of the corresponding marginal probabilities:

    Pr(X[a,b],Y[c,d])=Pr(X[a,b])×Pr(Y[c,d]).\text{Pr}(X \in [a,b],Y \in [c,d]) = \text{Pr}(X \in [a,b]) \times \text{Pr}(Y \in [c,d]).

Non-Rectangular Regions

If the region R\mathcal{R} is not a rectangle, then the bounds of the inner integral will depend on the outer variable. This is easiest to see by example.

Suppose that [X,Y][X,Y] is a random vector drawn from the region X0X \geq 0, Y0Y \geq 0, X+Y1X + Y \leq 1, with joint density:

PDF(x,y)=60x2y\text{PDF}(x,y) = 60 x^2 y

The figure below shows the support, S\mathcal{S}:

Support of [X,Y].

What is the probability that X<0.5?X < 0.5?

Let R={x0,y0,x+y1,x0.5}\mathcal{R} = \{x \geq 0, y \geq 0, x + y \leq 1, x \leq 0.5 \}. The figure below shows the region R\mathcal{R}.

The region R.

This region is a quadrilateral, with corners {[0,0],[0.5,0],[0,1],[0.5,0.5]}\{[0,0], [0.5,0],[0,1], [0.5,0.5]\}. It includes all xx between 0 and 0.5. For any x[0,0.5]x \in [0,0.5] the range of available yy is [0,1x][0,1 - x] since, x+y1x + y \leq 1 implies y1xy \leq 1 - x.

So:

Pr([X,Y]R)=[x,y]RPDF(x,y)dxdy=x=00.5(y=01xPDF(x,y)dy)dx.\text{Pr}([X,Y] \in \mathcal{R}) = \iint_{[x,y] \in \mathcal{R}} \text{PDF}(x,y) dx dy = \int_{x = 0}^{0.5} \left( \int_{y = 0}^{1 - x} \text{PDF}(x,y) dy \right) dx.

Pay careful attention to the bounds of the inner integral. When R\mathcal{R} is not a rectangle the bounds for the inner integral will depend on the outer variable. In this case, the upper bound for yy depends on the choice of xx.

To evaluate the double integral, first evaluate the inner integral:

A(x)=y=01x60x2ydy=60x2y=01xydy=30x2y2y=01x=30x2(1x)2.\begin{aligned} A(x) & = \int_{y = 0}^{1 - x} 60 x^2 y dy = 60 x^2 \int_{y = 0}^{1 - x} y dy \\ & = 30 x^2 y^2 \Big|_{y = 0}^{1 - x} = 30 x^2 (1 - x)^2.\end{aligned}

So:

Pr([X,Y]R)=x=00.5A(x)dx=30x=00.5x2(1x)2dx.\text{Pr}([X,Y] \in \mathcal{R}) = \int_{x = 0}^{0.5} A(x) dx = 30 \int_{x = 0}^{0.5} x^2 (1 - x)^2 dx.

We can evaluate this integral by factoring (1x)2(1 - x)^2 or via integration by parts (see Section 7.1). Factoring, (1x)2=x22x+1 (1 - x)^2 = x^2 - 2 x + 1, so:

Pr([X,Y]R)=30x=00.5x42x3+x2dx=30(15x524x3+13x2)x=00.5=30(150.55120.53+130.52)=30×160=12.\begin{aligned} \text{Pr}([X,Y] \in \mathcal{R}) & = 30 \int_{x = 0}^{0.5} x^4 - 2 x^3 + x^2 dx \\ & = 30 \left( \frac{1}{5} x^5 - \frac{2}{4} x^3 + \frac{1}{3} x^2 \right)_{x = 0}^{0.5} \\ & = 30 \left( \frac{1}{5} 0.5^5 - \frac{1}{2} 0.5^3 + \frac{1}{3} 0.5^2 \right) \\ & = 30 \times \frac{1}{60} \\ & = \frac{1}{2}. \end{aligned}

That’s a surprisingly simple answer!

Here’s a simpler way to work it out. Recall that, finding the area under a cross-section of the joint density, over the full range of the free variable returns a marginal density (see Section 8.3):

y such that [x,y]SfX,Y(x,y)dy=fX(x)\int_{y \text{ such that } [x,y] \in \mathcal{S}} f_{X,Y}(x,y) dy = f_X(x)

So, in our example, A(x)=fX(x)A(x) = f_X(x), the marginal density of XX. Then, borrowing our previous work, the random variable XX is supported on [0,1][0,1] with marginal density:

fX(x)g(x)=x2(1x)2.f_X(x) \propto g(x) = x^2 (1 - x)^2.

The function g(x)g(x) is symmetric about x=1/2x = 1/2 since it treats xx and 1x1 - x equivalently. Therefore, XX is a random variable whose distribution is even about x=1/2x = 1/2. It follows that the chance X<0.5X < 0.5 must match the chance X>0.5X > 0.5. Therefore, Pr(X0.5)=0.5\text{Pr}(X \leq 0.5) = 0.5.

Iterated Summation

Most integration rules also apply, by analogy, to summations. For example integrals of linear functions are linear functions of integrals. Sums of linear functions are also linear functions of sums. Integrals of products can be expanded with integration by parts. Sums of products may be expanded using summation by parts. Just as double integrals may be expanded as an iterated pair of integrals, double sums may be expanded as an iterated pair of sums.

Iterated sums expand a double sum as a sum over an outer index of a sum over an inner index, treating the outer index as if it were constant. For example, the double sum over ii and jj is the same as a sum over ii of a sum over jj given ii.

Note that, as we saw for integrals, we can run the sum in either order. We can sum over ii on the outside and jj on the inside or ii on the inside and jj on the outside. We used this approach in Section 7.1 to derive the tail sum formula for expectations.

Here are two concrete ways to think about an iterated sum:

  1. By Analogy to For Loops: If you implemented an iterated sum in code, you would use two for loops. The outer loop could run over all possible ii. The inner loop would run over all possible jj given ii.

  2. By Analogy to Table Operations: Suppose you were given a table whose rows are indexed by ii and whose columns are indexed by jj. Let f(i,j)f(i,j) denote the value of the i,ji,j entry of the table. Then, a double sum over ff is the same as the sum of a subset of the entries of the table. Summing over ii, then jj given ii, is the same as using a sum over the columns, then a sum over the rows. Summing over jj, then ii given jj, is the same as running a sum over the rows, then a sum over the columns.

For example, suppose that X{1,2,3,4}X \in \{1,2,3,4\} and Y{0,1,2}Y \in \{0,1,2\} are discrete random variables. Then we can represent their joint distribution with a joint distribution table (see Section 1.4):

Jointx=1x = 1x=2x = 2x=3x = 3x=4x = 4
y=0y = 00.30.10.050.05
y=1y = 10.20.10.050.05
y=2y = 20.050.0500

Then, the probability that X<3X < 3 and Y>0Y > 0 is the probability that X[1,2]X \in [1,2] and Y[1,2]Y \in [1,2]. Therefore:

Pr(X[1,2],Y[1,2])=[x,y]RPr(X=x,Y=y)\text{Pr}(X \in [1,2], Y \in [1,2]) = \sum_{[x,y] \in \mathcal{R}} \text{Pr}(X = x, Y = y)

where R\mathcal{R} is the rectangle {x[1,2],y[1,2]}\{x \in [1,2], y \in [1,2]\}. Excerpting the matching elements of the table:

Jointx=1x = 1x=2x = 2x=3x = 3x=4x = 4
y=0y = 0....
y=1y = 10.20.1..
y=2y = 20.050.05..

then adding give:

Pr(X[1,2],Y[1,2])=0.2+0.1+0.05+0.05=0.4.\text{Pr}(X \in [1,2], Y \in [1,2]) = 0.2 + 0.1 + 0.05 + 0.05 = 0.4.

Grouping terms in the sum:

0.2+0.1+0.05+0.05=(0.2+0.1)+(0.05+0.05)0.2 + 0.1 + 0.05 + 0.05 = (0.2 + 0.1) + (0.05 + 0.05)

is equivalent to summing across each row, then adding the row sums:

(0.2+0.1)+(0.05+0.05)=(Pr(X=1,Y=1)+Pr(X=2,Y=1))+(Pr(X=1,Y=2)+Pr(X=2,Y=2))=y=12(x=12Pr(X=x,Y=y))\begin{aligned} (0.2 + 0.1) + (0.05 + 0.05) & = (\text{Pr}(X = 1,Y = 1) + \text{Pr}(X = 2,Y = 1)) + (\text{Pr}(X = 1,Y = 2) + \text{Pr}(X = 2,Y = 2)) \\& = \sum_{y = 1}^2 \left( \sum_{x = 1}^2 \text{Pr}(X = x,Y = y) \right) \end{aligned}

The last expression is an iterated sum.