
2.5 Chapter Summary

This chapter introduced our main modeling tools.

Interactive Tools:

  1. Distribution Plotter. This is the tool used in Sections 2.2 and 2.4 to visualize PMFs and PDFs. You will need it for your first HW, and it is a good reference to come back to throughout the course. I suggest you bookmark it.

  2. Dartboard Sampler. This is the example used in Section 2.3 to derive the idea of a density function.

Random Variables and Distributions

These definitions are all available in Section 2.1.

  1. A random variable is a randomly selected number.

    • The support of a random variable is the range of possible values it can attain. The support is to random variables as the outcome space is to randomly chosen outcomes.

  2. Random variables are modelled using distribution functions

    • A probability mass function (PMF) is the function:

      $$\text{PMF}(x) = \text{Pr}(X = x)$$
    • We often visualize a PMF with a bar chart (a probability histogram): one bar per possible value of the random variable, with heights equal to the chance that value occurs.

    • A valid PMF must return nonnegative values, and must be normalized (its values must sum to one). Visually, the total area of the bars in a probability histogram must equal 1.

    • A cumulative distribution function (CDF) is the function:

      $$\text{CDF}(x) = \text{Pr}(X \leq x)$$
    • The PMF and CDF are related by the additivity property:

    $$\text{CDF}(x) = \sum_{\text{all } y \leq x} \text{PMF}(y)$$
    • The CDF can be used to compute chances on intervals:

    $$\text{Pr}(X \in (a,b]) = \text{CDF}(b) - \text{CDF}(a)$$
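The additivity and interval rules above can be checked directly in Python. Here is a minimal sketch using a fair six-sided die as a hypothetical example (it is not one of the named models in this chapter):

```python
# Hypothetical PMF for a fair six-sided die.
support = [1, 2, 3, 4, 5, 6]
pmf = {x: 1 / 6 for x in support}

# Validity check: nonnegative and normalized.
assert all(p >= 0 for p in pmf.values())
assert abs(sum(pmf.values()) - 1) < 1e-12

# Additivity property: CDF(x) is the sum of PMF(y) over all y <= x.
def cdf(x):
    return sum(p for y, p in pmf.items() if y <= x)

# Interval rule: Pr(X in (a, b]) = CDF(b) - CDF(a).
interval_prob = cdf(4) - cdf(2)  # Pr(X in {3, 4}) = 2/6
```

Computing interval probabilities as CDF differences avoids summing the PMF over the interval by hand, which is the main reason the CDF is worth tabulating.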

Discrete Models

These definitions are all available in Section 2.2.

  1. A discrete random variable is a random variable that is not continuous. It usually takes on finitely many values, or is restricted to the integers, and so represents random counts.

    • Random variables may be defined implicitly, by the process that generates outcomes, or explicitly, by fixing a support and a distribution function.

  2. A Bernoulli random variable is:

    • Implicit: an indicator for a random event that returns 0 if the event doesn’t happen and 1 if the event does happen.

    • Explicit: a binary random variable with support $\{0,1\}$ and where $\text{Pr}(X = 1) = p$.

    • The parameter of the Bernoulli is the success probability of the associated event.

  3. A Geometric random variable is:

    • Implicit: the number of repetitions of independent, identical Bernoulli (binary) trials up to and including the first success.

    • Explicit: a random variable with support equal to the positive integers, $\{1,2,3,\ldots\}$, and PMF:

    $$\text{PMF}(x) = \text{Pr}(X = x) = (1 - p)^{x - 1} p$$
    • The parameter of the Geometric is the success probability of each trial.

  4. A Binomial random variable is:

    • Implicit: the number of successes in a string of repeated identical, independent Bernoulli (binary) trials.

    • Explicit: a random variable supported on $\{0,1,2,\ldots,n\}$ for some positive integer $n$, with PMF:

    $$\text{PMF}(x) = \text{Pr}(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}.$$
    • The parameter $n$ is the number of trials, and $p$ is the chance of success in each trial.
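The three discrete PMFs above can be sketched in a few lines of Python. The parameter values $p = 0.3$ and $n = 10$ are illustrative choices, not values from the text:

```python
from math import comb

p, n = 0.3, 10  # illustrative parameter choices

def bernoulli_pmf(x):
    return p if x == 1 else 1 - p  # support {0, 1}

def geometric_pmf(x):
    return (1 - p) ** (x - 1) * p  # support {1, 2, 3, ...}

def binomial_pmf(x):
    return comb(n, x) * p**x * (1 - p) ** (n - x)  # support {0, 1, ..., n}

# Normalization checks. The Binomial PMF sums to 1 over its finite support;
# the Geometric's infinite sum is approximated by truncating at a large value.
assert abs(sum(binomial_pmf(x) for x in range(n + 1)) - 1) < 1e-12
assert abs(sum(geometric_pmf(x) for x in range(1, 1000)) - 1) < 1e-9
```

The truncation in the Geometric check works because the leftover tail probability, $(1-p)^{999}$, is vanishingly small.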

Continuous Models

Section 2.3 is largely philosophical. It proves, and works to justify, the following statement:

If $X$ is a continuous random variable, then $\text{Pr}(X = x) = 0$ for all $x$.

That is, all exact events have chance equal to zero. Section 2.3 shows that this property is needed in any model where chances vary continuously with changes to events. Open up the [Dartboard Sampler](Dartboard Sampler) and use it to try to compute the chance a dart lands exactly a distance $r$ from the center of the board. You’ll see that, no matter how many samples you draw, and no matter what $r$ you pick, you’ll never find one exactly a distance $r$ from the center.

  • As a consequence, we never need to distinguish the event $X \leq b$ from $X < b$, or $X \geq b$ from $X > b$.

  • We showed, by symmetry, that if $X$ is a uniform random variable, then probability is equal to proportion, where the size of sets is measured using length (1 dimension), area (2 dimensions), or volume (3 dimensions).
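Both observations can be mimicked in a short Python simulation. This is a hypothetical sketch, not the Dartboard Sampler itself: it samples uniform points on the unit disk, finds no sample exactly at any fixed distance $r$, and recovers probability as a proportion of area:

```python
import random

random.seed(0)

def uniform_disk_point():
    """Sample a uniform point on the unit disk by rejection from the square."""
    while True:
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        if x * x + y * y <= 1:
            return x, y

n = 100_000
distances = [(x * x + y * y) ** 0.5
             for x, y in (uniform_disk_point() for _ in range(n))]

r = 0.5
exact_hits = sum(d == r for d in distances)      # count of samples exactly at distance r
proportion = sum(d <= r for d in distances) / n  # close to the area ratio 0.5**2 / 1**2 = 0.25
```

No sample ever lands exactly at distance $r$, while the fraction within distance $r$ tracks the area ratio $r^2$, just as the uniform-probability-equals-proportion argument predicts.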

Probability Densities

These results are all explained in Section 2.3.

  1. If XX is a continuous random variable then its probability density function is defined:

$$\text{PDF}(x) = f_X(x) = \lim_{\Delta x \rightarrow 0} \frac{1}{\Delta x} \text{Pr}\left(X \in x \pm \frac{1}{2} \Delta x \right).$$
  2. Any function $f(x)$ that is both nonnegative and normalized (integrates to 1) could be a density. No function that is ever negative, or integrates to a number other than one, is a density.

  3. We specify a continuous random variable by PDF, CDF, or measure, and can move between all three:

    • PDF to measure: $\text{Pr}(X \in [a,b]) = \int_{x = a}^{b} f_X(x)\, dx.$

    • PDF to CDF: $F_X(x) = \text{Pr}(X \leq x) = \int_{s = -\infty}^{x} f_X(s)\, ds.$

    • CDF to measure: $\text{Pr}(X \in [a,b]) = F_X(b) - F_X(a)$

    • CDF to PDF: $f_X(x) = \frac{d}{dx} F_X(x)$

  4. $X$ is a Uniform random variable on $[a,b]$ if $X \in [a,b]$ and $f_X(x) = 1/(b - a)$ is constant for all $x \in [a,b]$.

  5. $X$ is an Exponential random variable with parameter $\lambda$ if $X \geq 0$ and $f_X(x) = \lambda e^{-\lambda x}$ for $x \geq 0$.

    • The parameter $\lambda$ must be greater than 0.

  6. $X$ is a Pareto random variable with parameters $x_m, \alpha$ if $X \geq x_m$ and $f_X(x) = \alpha x_m^{\alpha} x^{-(\alpha + 1)}$ for $x \geq x_m$.

    • Both parameters $x_m$ and $\alpha$ must be greater than 0.

  7. Density functions are often written $f(x) \propto g(x)$, where $g(x)$ is a simpler functional form that determines the shape of the distribution. Then $f(x) = c\, g(x)$ where $c$ is the normalizing constant $c = 1/\int_{-\infty}^{\infty} g(x)\, dx$.

    • In general, $g(x)$ is a function of $x$ with some free parameters, for example $e^{-\lambda x}$. The normalizing constant is then a function of the free parameters, but not a function of $x$.

    • For example, the normalizing constant for the Exponential is $\lambda$.

    • We should read densities by recognizing their support and functional forms first, then their normalizing constants.
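The PDF-to-measure and CDF-to-measure rules above can be sanity-checked numerically. Here is a sketch using the Exponential density with an arbitrary illustrative choice $\lambda = 2$ (the Riemann-sum step count is likewise arbitrary):

```python
from math import exp

lam = 2.0  # illustrative parameter choice

def pdf(x):
    return lam * exp(-lam * x) if x >= 0 else 0.0

def cdf(x):
    return 1 - exp(-lam * x) if x >= 0 else 0.0  # closed form of the integral

# PDF to measure: approximate Pr(X in [a, b]) with a midpoint Riemann sum...
a, b, steps = 0.5, 1.5, 10_000
dx = (b - a) / steps
riemann = sum(pdf(a + (i + 0.5) * dx) for i in range(steps)) * dx

# ...and compare against the CDF-to-measure rule F(b) - F(a).
gap = abs(riemann - (cdf(b) - cdf(a)))  # should be tiny

# Normalization: e^{-lam x} integrates to 1/lam over [0, inf),
# so the normalizing constant is c = lam, as noted above.
assert abs(cdf(1000.0) - 1) < 1e-12
```

Agreement between the two computations is just the fundamental theorem of calculus applied to $F_X(x) = \int_{-\infty}^{x} f_X(s)\, ds$.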