
1.7 Chapter Summary

Phew. That was a lot to think through.

Here’s a summary of the important definitions and results from Chapter 1. Reading the summary is not a substitute for reading the chapter. It will provide most of the information to complete your studysheets, but copying results from the summary to the studysheet is not a substitute for completing the sheet yourself. Make sure you can find where each result listed here was explained in the main chapter text.

If there is one table to summarize the chapter, it is the logic-to-set-operation table at the end of the first section below.

Outcomes, Events, and Sets

These results are all explained in Section 1.1.

  1. A random process is some process that produces unpredictable outcomes

    • An outcome is a specific, distinct result of the process

    • The outcome space, $\Omega$, is the set of all possible outcomes

    • An event is any collection of outcomes. Events, $E$, are subsets of $\Omega$.

  2. Sets may be defined:

    • explicitly by listing their entries. For example, $A = \{a,b,c\}$.

    • implicitly by defining rules that all entries must satisfy, and that, if satisfied, ensure membership in the set. For example, $A = \{\text{all letters before } d \text{ in the alphabet}\}$.

    • The size of a set, $A$, is denoted $|A|$. If a set is finite, then its size is the number of entries in the set.

  3. Logic and set operations

    • Sets can be defined by combining a collection of rules into logical sentences. For instance, $S = \{\text{all letters before } d \text{ or after } w\} = \{a,b,c,y,z\}$

    • Appending not before a set’s implicit definition produces the set complement. For example: $\text{not } A = A^c = \{\text{all outcomes not in } A\}$

    • Concatenating sets with an or produces their union. For example, $S = A \cup V$ where $V = \{y,z\}$.

    • Concatenating sets with an and produces their intersection. For example, if $B = \{b,c,d,e\}$, then $A \cap B = \{b,c\}$.

    • Modifying a probability statement with an if adds conditions that restrict the space of possible outcomes $\Omega$. We denote if with a vertical bar, $\mid$.

    • In summary:

| Logical | Set Operation | Notation |
| --- | --- | --- |
| not | complement | $^c$ |
| or | union | $\cup$ |
| if | restrict $\Omega$ | $\mid$ |
| and | intersect | $\cap$ |
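The logical connectives in the table map directly onto Python’s built-in set operators. Here is a minimal sketch using the chapter’s letter examples (the variable names are my own):

```python
# The chapter's letter examples, expressed with Python sets.
Omega = set("abcdefghijklmnopqrstuvwxyz")  # outcome space: all 26 letters
A = {"a", "b", "c"}                        # letters before d
V = {"y", "z"}                             # letters after w
B = {"b", "c", "d", "e"}

complement_A = Omega - A   # "not A" -> set complement, A^c
union_AV = A | V           # "or"    -> union
intersection_AB = A & B    # "and"   -> intersection

print(sorted(union_AV))         # ['a', 'b', 'c', 'y', 'z']
print(sorted(intersection_AB))  # ['b', 'c']
```

Note that the complement is always taken relative to the outcome space, which is why it is computed as a set difference from `Omega`.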

Probability as Proportion

These results are all explained in Section 1.2.

  1. A probability measure is a function that accepts events and returns their chance. We denote the measure $\text{Pr}(\cdot)$, so $\text{Pr}(A)$ is the chance the event $A$ occurs.

  2. Probability as Frequency: The chance of an event equals the long run frequency with which it would occur in an arbitrarily long sequence of trials.

    • It follows that:

      • All chances are between 0 and 1

      • The chance that something happens, $\text{Pr}(\Omega)$, equals 1

      • Chances for disjoint events add: $\text{Pr}(A \cup B) = \text{Pr}(A) + \text{Pr}(B)$ if $A$ and $B$ are disjoint.

      • Expanding an event to include more outcomes never makes it less likely. Contracting an event so it includes fewer outcomes never makes it more likely.

  3. We say that all outcomes are equally likely if:

    • They would occur with the same long run frequency

    • We have no better model and want to start simple

    • The features that distinguish outcomes cannot possibly influence their frequency, or the process that selects outcomes

  4. If all outcomes are equally likely then probability is equivalent to proportion:

    • The probability of every outcome is $1/|\Omega|$, where $|\Omega|$ is the number of possible outcomes

    • The probability of every event is:

    $$\text{Pr}(E) = \frac{|E|}{|\Omega|} = \frac{\text{the number of ways } E \text{ can happen}}{\text{the number of distinct things that can happen}}$$
    • So, if all outcomes are equally likely, we can compute probabilities by (a) enumerating the outcome space, (b) counting the number of possible outcomes, (c) enumerating the event, (d) counting the number of ways the event can happen, and (e) evaluating their ratio.
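The five-step recipe above can be sketched in a few lines of Python. The two-dice outcome space is an illustrative assumption, not an example from the chapter:

```python
from itertools import product
from fractions import Fraction

# (a) enumerate the outcome space: all 36 ordered rolls of two fair dice
Omega = list(product(range(1, 7), repeat=2))
# (b) count it: len(Omega) == 36
# (c) enumerate the event: rolls whose faces sum to 7
E = [(i, j) for (i, j) in Omega if i + j == 7]
# (d) count it: len(E) == 6
# (e) evaluate the ratio |E| / |Omega| exactly
prob = Fraction(len(E), len(Omega))
print(prob)  # 1/6
```

Using `Fraction` keeps the ratio exact, which makes it easy to compare computed probabilities against answers worked by hand.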

The Rules of Chance

All of these results are explained in Section 1.3.

  1. A probability model is a choice of outcome space, all relevant events, and probability measure, such that:

    1. Nonnegativity: $\text{Pr}(E) \geq 0$ for all events $E$.

    2. Normalization: $\text{Pr}(\Omega) = 1$.

    3. Additivity: $\text{Pr}(A \cup B) = \text{Pr}(A) + \text{Pr}(B)$ if $A$ and $B$ are disjoint.

  2. Probability rules that follow from the axioms:

    1. Complements: $\text{Pr}(E^c) = 1 - \text{Pr}(E)$

    2. Sub-additivity: $\text{Pr}(A \cup B) = \text{Pr}(A) + \text{Pr}(B) - \text{Pr}(A \text{ and } B) \leq \text{Pr}(A) + \text{Pr}(B)$.
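Both rules can be checked numerically. Here is a sketch on an assumed two-dice outcome space (the specific events are my own choices):

```python
from itertools import product
from fractions import Fraction

# Probability as proportion on the 36 equally likely rolls of two dice.
Omega = set(product(range(1, 7), repeat=2))

def pr(event):
    """Probability of an event as the proportion |event| / |Omega|."""
    return Fraction(len(event), len(Omega))

A = {w for w in Omega if w[0] == 6}      # first die shows 6
B = {w for w in Omega if sum(w) >= 10}   # total is at least 10

# Complements: Pr(A^c) = 1 - Pr(A)
assert pr(Omega - A) == 1 - pr(A)
# Inclusion-exclusion, hence sub-additivity:
assert pr(A | B) == pr(A) + pr(B) - pr(A & B)
assert pr(A | B) <= pr(A) + pr(B)
print(pr(A | B))  # 1/4
```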

Joint and Marginal Probability

All of these results are explained in Section 1.4.

  1. A joint probability is the probability that two events both happen: $\text{Pr}(A,B) = \text{Pr}(A \text{ and } B) = \text{Pr}(A \cap B)$

    • Since $A \cap B$ is contained in both $A$ and $B$, $\text{Pr}(A,B) \leq \min\{\text{Pr}(A), \text{Pr}(B)\}$.

    • Given a collection of joint probabilities, $\text{Pr}(A,B), \text{Pr}(A,B^c), \text{Pr}(A^c,B), \text{Pr}(A^c,B^c)$, the marginal probabilities are the chances of the individual events, $\text{Pr}(A), \text{Pr}(A^c), \text{Pr}(B), \text{Pr}(B^c)$.

    • The act of breaking an event into all the ways it can occur is called partitioning (breaking into disjoint parts)

    • The act of summing the chances of disjoint parts is called marginalization

  2. Joint and marginal probabilities may be arranged into a joint probability table where

    • The sum of the joint probabilities in any row or column must add to the corresponding marginal

    • The sum of all joint probabilities must equal 1

    • The sum of any pair of marginals must equal 1

For example:

| Event | $A$ | not $A$ | $B$ Marginals |
| --- | --- | --- | --- |
| $B$ | $\text{Pr}(A,B)$ | $\text{Pr}(A^c,B)$ | $\text{Pr}(B)$ |
| not $B$ | $\text{Pr}(A,B^c)$ | $\text{Pr}(A^c,B^c)$ | $\text{Pr}(B^c)$ |
| $A$ Marginals | $\text{Pr}(A)$ | $\text{Pr}(A^c)$ | 1 |
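The consistency checks on a joint probability table can be sketched directly. The numbers below are hypothetical, chosen only so the sums work out:

```python
from fractions import Fraction as F

# A hypothetical joint probability table, keyed by (A-side, B-side).
joint = {
    ("A", "B"): F(1, 4), ("Ac", "B"): F(1, 4),
    ("A", "Bc"): F(1, 8), ("Ac", "Bc"): F(3, 8),
}

# Marginalization: sum the joints along a row or column.
pr_A = joint[("A", "B")] + joint[("A", "Bc")]
pr_Ac = joint[("Ac", "B")] + joint[("Ac", "Bc")]
pr_B = joint[("A", "B")] + joint[("Ac", "B")]

assert sum(joint.values()) == 1   # all joint probabilities sum to 1
assert pr_A + pr_Ac == 1          # paired marginals sum to 1
print(pr_A, pr_B)  # 3/8 1/2
```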

Conditional Probability

All of these results are explained in Section 1.5.

  1. A conditional probability is the probability of one event given that another occurs: $\text{Pr}(B|A) = \text{Pr}(B \text{ if } A)$

    • Conditioning on an event, $A$, restricts the set of possible outcomes to $A$

    • Conditioning on $A$ does not change the relative likelihood (i.e. the odds) of any outcomes in $A$

  2. Normalization is the action of scaling a list of nonnegative numbers by their sum

  3. To find conditional probabilities from a joint probability table:

    1. Excerpt the appropriate rows or columns of the joint table

    2. Scale all entries by their sum, which equals the marginal assigned to the row/column (i.e. normalize)

  4. The conditional probability of $B$ given $A$ is always the ratio of a joint to a marginal:

$$\text{Pr}(B|A) = \frac{\text{Pr}(B,A)}{\text{Pr}(A)}$$
  5. The multiplication rule expresses any joint as a product of a marginal and a conditional:

$$\text{Pr}(A,B) = \text{Pr}(A) \times \text{Pr}(B|A)$$
  6. An outcome tree is a diagram with one node for every possible event in a sequence of events, and arrows for the possible transitions between nodes, each labelled with the marginal or conditional probability of that transition.

    • We can use the multiplication rule to compute chances by evaluating products along paths in outcome trees

  7. Bayes Rule recovers $\text{Pr}(A|B)$ from marginals for $A$ and conditionals for $B$ given $A$:

$$\text{Pr}(A|B) = \frac{\text{Pr}(A,B)}{\text{Pr}(B)} = \frac{\text{Pr}(A)\,\text{Pr}(B|A)}{\text{Pr}(A,B) + \text{Pr}(A^c,B)} = \frac{\text{Pr}(A)\,\text{Pr}(B|A)}{\text{Pr}(A)\,\text{Pr}(B|A) + \text{Pr}(A^c)\,\text{Pr}(B|A^c)}$$
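The last form of Bayes Rule, in which the denominator is built by marginalizing over the partition $\{A, A^c\}$, can be evaluated directly. The numbers below are illustrative assumptions, not values from the chapter:

```python
from fractions import Fraction as F

# Hypothetical inputs: a marginal for A and conditionals for B given A, A^c.
pr_A = F(1, 100)           # Pr(A)
pr_B_given_A = F(9, 10)    # Pr(B|A)
pr_B_given_Ac = F(1, 10)   # Pr(B|A^c)

# Denominator: Pr(B) = Pr(A)Pr(B|A) + Pr(A^c)Pr(B|A^c), by marginalization.
pr_B = pr_A * pr_B_given_A + (1 - pr_A) * pr_B_given_Ac

# Bayes Rule: Pr(A|B) = Pr(A)Pr(B|A) / Pr(B)
pr_A_given_B = pr_A * pr_B_given_A / pr_B
print(pr_A_given_B)  # 1/12
```

Even with a strong conditional $\text{Pr}(B|A) = 9/10$, the small marginal $\text{Pr}(A) = 1/100$ keeps the posterior modest, which is the usual base-rate lesson these computations illustrate.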

Independent Events

All of these results are explained in Section 1.6.

  1. Events $A$ and $B$ are independent if and only if any of the following are true:

    • Knowing the outcome of one tells us nothing about the other.

    • $\text{Pr}(A|B) = \text{Pr}(A), \quad \text{Pr}(B|A) = \text{Pr}(B)$

      • that is, the conditionals equal the marginals because we learn nothing by conditioning

    • $\text{Pr}(A|B) = \text{Pr}(A|B^c), \quad \text{Pr}(B|A) = \text{Pr}(B|A^c)$

      • that is, the conditionals don’t depend on the conditioning statement, since the events tell us nothing about each other

    • $\text{Pr}(A,B) = \text{Pr}(A) \times \text{Pr}(B)$

      • that is, the joint is the product of the marginals

      • This is a special case of the general multiplication rule. Only use it for independent events.

      • This is useful for computing joint probabilities and checking independence.

      • Do not take this as the definition of independence; it is really a consequence.

  2. If two events are not independent, then they are dependent.
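The product check $\text{Pr}(A,B) = \text{Pr}(A) \times \text{Pr}(B)$ can be sketched on an assumed two-dice outcome space, where events on separate dice are independent but overlapping events are not:

```python
from itertools import product
from fractions import Fraction

# Probability as proportion on the 36 equally likely rolls of two dice.
Omega = set(product(range(1, 7), repeat=2))

def pr(event):
    return Fraction(len(event), len(Omega))

A = {w for w in Omega if w[0] == 6}      # first die shows 6
B = {w for w in Omega if w[1] == 6}      # second die shows 6
C = {w for w in Omega if sum(w) >= 10}   # total is at least 10

print(pr(A & B) == pr(A) * pr(B))  # True: separate dice -> independent
print(pr(A & C) == pr(A) * pr(C))  # False: A and C are dependent
```

Here $\text{Pr}(A, C) = 3/36$ while $\text{Pr}(A)\,\text{Pr}(C) = 1/36$: knowing the first die shows 6 makes a total of at least 10 much more likely, so the events are dependent.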