Outcomes, Events and Sets - Data 89 Course Notes

Probability is used (and misused) in amazingly diverse ways. We might ask for the probability that:

it rains tomorrow, or an earthquake strikes,
a lottery ticket hits big, or a poker hand wins,
an atom decays, or an electron tunnels and flips a computer bit,
the stock market crashes, or a stock doubles in value,
a political candidate wins an election, or a poll returns a spurious result,
a medical test returns a true detection, or a recommended vaccine is safe,
an image contains a pedestrian and the pedestrian is waiting to cross the road, or that the last three letters in this list are etc.

To study probability we will need an equally flexible language that can sensibly discuss electrons, politicians, stock prices, and word prediction.

Outcomes and Outcome Spaces

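Any random process or experiment produces outcomes that cannot be predicted from information that is available before the outcome occurs. We will use the following language to describe the possible outcomes of a random process: an outcome, $\omega$, is a single possible result of the process; the outcome space, $\Omega$, is the set of all possible outcomes; and the size of the outcome space, $|\Omega|$, is the number of possible outcomes.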

If we want to define a set by listing its entries, we use curly braces, $\{...\}$ , and use commas to separate distinct elements of the set. For instance, if $|\Omega| = 5$ , then $\Omega = \{\omega_1,\omega_2,\omega_3,\omega_4,\omega_5\}$ where $\omega_j$ is the $j^{th}$ possible outcome. The order of the elements in the list does not matter.

Examples

For example:

If I flip a coin, then:
- the possible outcomes are $H = \{\text{the coin lands on heads}\}$ , $T = \{\text{the coin lands on tails}\}$ .
- The outcome space, $\Omega$ , is the list of possible sides, $\{H,T\}$ .
- The size of the outcome space is $|\Omega| = 2$ since the coin can land on two different sides.
If I flip two coins, then:
- the possible outcomes are $HH = \{\text{the first coin lands on heads and the second lands on heads}\}$ , $HT = \{\text{the first coin lands on heads and the second lands on tails}\}$ , and so on.
- The outcome space, $\Omega$ , is the list of possible ordered pairs of tosses, $\{HH, HT, TH, TT\}$ . Notice that this list includes $HT$ and $TH$ as distinct outcomes. This means that we pay attention to the order of the tosses.
- The size of the outcome space is $|\Omega| = 4$ since there are two possible options for each coin. If we ignore the order of the tosses, as when we cannot distinguish the coins, then $\Omega$ has only three possible outcomes: 2 heads, 1 head, or no heads.
If I roll a six sided die, then:
- the possible outcomes are $1 = \{\text{the die lands on a 1}\}$ , $2 = \{\text{the die lands on a 2}\}$ , and so on.
- The outcome space, $\Omega$ , is the list of possible rolls of the die, $\{1,2,3,4,5,6\}$ .
- The size of the outcome space is $|\Omega| = 6$ since the die can land on six different sides.
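
The three outcome spaces above are small enough to enumerate by brute force. The following is a minimal Python sketch, not part of the original notes; the variable names (`coin`, `two_coins`, `two_coins_unordered`, `die`, `sizes`) are illustrative assumptions.

```python
from itertools import product

# One coin: two possible outcomes.
coin = {"H", "T"}

# Two coins, order matters: every ordered pair of tosses.
two_coins = {first + second for first, second in product("HT", repeat=2)}

# Two coins, order ignored: record only the number of heads.
two_coins_unordered = {outcome.count("H") for outcome in two_coins}

# One six-sided die.
die = set(range(1, 7))

# len() plays the role of |Omega|, the size of each outcome space.
sizes = (len(coin), len(two_coins), len(two_coins_unordered), len(die))
print(sizes)
```

Python sets ignore order and never store the same element twice, which matches the convention for sets listed in curly braces.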

Sets and Subsets

Sets can contain smaller sets. For instance, the set $\{HH,HT,TH,TT\}$ contains the smaller set $\{HH,HT,TH\}$. It is common to denote sets contained in the outcome space with capital letters, e.g. $A = \{HH,HT,TH\}$. We use the $\in$ symbol to show that an element is contained in a set. For instance, $HT \in A$.

Now that we have multiple sets, we can start comparing their entries. It is standard practice to start from a Venn diagram:

Set Relations

Equality: Two sets are equal, $A = B$ , if they contain all of the same elements.
Subset (proper): A set, $A$, is a subset of another set, $B$, if every element of $A$ is also an element of $B$. That is, $A$ is contained in $B$. We denote containment with the subset symbol $\subset$. The statement $A \subset B$ means every element of $A$ is an element of $B$, but there are some elements of $B$ not in $A$. In terms of outcomes, $A \subset B$ means that, if an outcome $\omega$ is in $A$, then it is also in $B$.
Subset (equal): The symbol $\subseteq$ stands for subset or equal. The statement $A \subseteq B$ means that every element of $A$ is an element of $B$. If $A \subseteq B$ then $A$ and $B$ might be the same set.

Disjoint: Two sets, $A$ and $B$ , are disjoint if they don’t share any elements. In terms of outcomes, if $A$ and $B$ are disjoint, then the outcomes in each are mutually exclusive. If $\omega \in A$ then $\omega \notin B$ . The symbol $\notin$ means “not in”.

Every set of outcomes is a subset of the outcome space $\Omega$. Every set of outcomes, $A$, divides $\Omega$ into two disjoint parts since every element of $\Omega$ is either in $A$, or not in $A$. We let $A^c$ denote the complement of $A$. This is the collection of all outcomes in $\Omega$ that are not in $A$. Any collection of disjoint sets that breaks a larger set into pieces is a partition of the larger set. For instance, the pair $\{A,A^c\}$ is a partition of $\Omega$.

If $A = \Omega$ , then $A^c$ is empty, since every outcome is in $A$ . The empty set is the opposite of $\Omega$ . It contains no outcomes. The empty set is given the special symbol $\emptyset$ .
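
The set relations in this section map directly onto Python's built-in `set` type. The sketch below is an illustration under that assumption, not part of the original notes; it reuses the two-coin outcome space, and the names `omega`, `A`, `B`, and the result variables are hypothetical.

```python
omega = {"HH", "HT", "TH", "TT"}   # outcome space for two coin flips
A = {"HH", "HT", "TH"}             # the set A used earlier in this section
B = {"TT"}                         # a set that shares no outcomes with A

is_subset = A <= omega             # A is a subset of (or equal to) omega
is_proper_subset = A < omega       # A is a proper subset of omega
is_disjoint = A.isdisjoint(B)      # A and B share no elements

A_complement = omega - A           # outcomes in omega that are not in A
omega_complement = omega - omega   # complement of omega: the empty set

# {A, A^c} is a partition of omega: the pieces are disjoint
# and together they recover all of omega.
is_partition = A.isdisjoint(A_complement) and (A | A_complement) == omega

print(A_complement, omega_complement, is_partition)
```

Here the complement is computed as a set difference from `omega`, since a complement only makes sense relative to a fixed outcome space.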

Events

Sets can be defined explicitly by listing their entries. Sets can also be defined implicitly by introducing a list of conditions that, if satisfied, guarantee that an outcome is contained in the set.

The relationship between a condition (or list of conditions) on outcomes, and a collection of outcomes that satisfy the condition, is the first key idea in probability. It relates natural language descriptions of events to precise collections of outcomes.

Any time we ask a question of the form, “What is the probability that ...” or “What is the chance that ...” we will need to replace the ellipsis (...) with an event. For instance, “What is the probability that the sum of two dice is even?” or “What is the probability that a randomly sampled student is a Data Science major and is pursuing a double major in Economics or Engineering?”

If we randomly poll a student and the student:

is a Data Science major, and
is pursuing a double major, and
their double major is either in Economics or Engineering

then the event whose chance we were interested in has occurred.

Every event is a list of conditions on outcomes. So, every time we ask “What is the probability that some event occurs?” we are really asking, “What is the probability that the outcome of this random process satisfies the conditions that define the event?” Since every list of conditions defines a set, $E$, this is the same as asking, “What is the probability that the outcome of this random process is contained in the set $E$?”

In notation, we could express this question “What is $\text{Pr}(\omega \in E)$ ?” or, in standard shorthand, “What is $\text{Pr}(E)$ ?”
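
To make the link between a condition and a set concrete, here is a minimal Python sketch, not part of the original notes. It takes the event “the sum of two dice is even” and builds the set $E$ of outcomes that satisfy it. The representation of outcomes as ordered pairs and the helper name `sum_is_even` are assumptions made for illustration.

```python
from itertools import product

# Outcome space: ordered pairs of rolls of two six-sided dice.
omega = set(product(range(1, 7), repeat=2))

def sum_is_even(outcome):
    """Condition that defines the event: the two rolls sum to an even value."""
    first, second = outcome
    return (first + second) % 2 == 0

# The event E is the set of all outcomes that satisfy the condition.
E = {outcome for outcome in omega if sum_is_even(outcome)}

print(len(omega), len(E))
```

Asking whether the event occurred for a particular outcome is then a membership test, e.g. `(1, 3) in E`.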

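These observations suggest the following definition: an event is a subset of the outcome space $\Omega$, namely the set of all outcomes that satisfy the conditions describing the event. The event occurs when the outcome of the random process is in that set.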

Many probability questions can be expressed simply as “Find $\text{Pr}(E)$.” We will devote about a third of our effort in this class to this problem, i.e. to computing chances.

Example: Permutations

The following example is adapted from the Data 140 textbook.

Suppose you are shuffling three cards labeled $a$ , $b$ , and $c$ . Then the space of all possible outcomes is

$$\Omega ~=~ \{ abc, ~acb, ~bac, ~bca, ~cab, ~cba \} \tag{1}$$

The event $\{abc, ~ acb, ~ bca, ~cba \}$ can be described as “ $a$ appears first or last”. 🛠️ Try filling in the events in the table below:

Event	Verbal Description	Subset
$A$	$a$ appears first	$\{abc, acb\}$
$B$	$a$ and $b$ are next to each other
$C$	the letters are in reverse alphabetical order
$D$	$a$ does not appear
$E$	$b$ is either first, second, or third
$F$	the letters form a word that means ‘taxi’

Answers

Event	Verbal Description	Subset
$A$	$a$ appears first	$\{abc, acb\}$
$B$	$a$ and $b$ are next to each other	$\{abc, bac, cab, cba\}$
$C$	the letters are in reverse alphabetical order	$\{cba\}$
$D$	$a$ does not appear	$\emptyset$
$E$	$b$ is either first, second, or third	$\Omega$
$F$	the letters form a word that means ‘taxi’	$\{cab\}$

Notice that we can describe events that contain every possible outcome. Then the event equals $\Omega$ .

We can also describe events that are impossible. If an event is impossible then no outcome satisfies the event, so the corresponding set is empty. Hence, $D = \emptyset$ in the table above.
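
The answers in the table can be checked by translating each verbal description into a condition and filtering $\Omega$. This is a hedged Python sketch, not part of the original notes; it assumes each outcome is stored as a three-letter string. Event $F$ is omitted because “means taxi” is a fact about English rather than a condition on letter positions.

```python
from itertools import permutations

# Outcome space: all orderings of the three cards.
omega = {"".join(p) for p in permutations("abc")}

# Each event is the subset of omega satisfying its verbal description.
A = {w for w in omega if w[0] == "a"}                        # a appears first
B = {w for w in omega if "ab" in w or "ba" in w}             # a and b adjacent
C = {w for w in omega if list(w) == sorted(w, reverse=True)} # reverse alphabetical
D = {w for w in omega if "a" not in w}                       # a does not appear
E = {w for w in omega if "b" in w}                           # b is in some position

for name, event in [("A", A), ("B", B), ("C", C), ("D", D), ("E", E)]:
    print(name, sorted(event))
```

An impossible event filters out every outcome and leaves the empty set, while a certain event keeps all of `omega`.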

Combining Events

We will often define events as combinations of other events. For instance, we might be interested in the event that a randomly sampled student is either a sophomore or a junior. This event combines the event that the student is a sophomore with the event that the student is a junior. Formally, we combine events by listing the constraints that define each component of the event, then the rules for combining those constraints. These rules can be boiled down to four logical operations:

not: e.g. event $A$ does not happen. Applying “not” replaces $A$ with its complement, $A^c$ .
- So, every time you see (or think) the word not you should consider the event’s complement.
or: e.g. either event $A$ happens or event $B$ happens. Applying “or” combines all the outcomes contained in the collections $A$ and $B$.
- For instance, the set of die rolls that are either even or less than 4 is the set $\{2,4,6,1,3\}$ . Notice that we never list an identical outcome twice. This set is produced by listing all the distinct entries that appear in $\{2,4,6\}$ or $\{1,2,3\}$ .
- The set operation that corresponds to or is a union, denoted $\cup$. For instance, if $A = \{2,4,6\}$ and $B = \{1,2,3\}$ then $A \cup B = \{1,2,3,4,6\}$. You should read $A \cup B$ as “$A$ union $B$”. Like in a good marriage, the union of two sets is the collection of all entries owned by either set.
- Notice: combining sets with the conjunction “or” never makes the sets smaller. If the sets are distinct, the union is larger than at least one of them. So, adding an “or” statement makes an event more general. The more ways an event can occur, the more likely it is. So, expanding an event with an “or” statement never decreases its chance.
and: e.g. event $A$ happens and event $B$ happens. Applying “and” restricts our attention to only the outcomes that are contained in both sets. These are the outcomes that satisfy all of the conditions in $A$ and $B$ . For instance, the set of all die rolls that are even and are less than 4 is the set $\{2\}$ .
- The set operation that corresponds to and is an intersection, denoted $\cap$. For instance, if $A = \{2,4,6\}$ and $B = \{1,2,3\}$ then $A \cap B = \{2\}$. You should read $A \cap B$ as “$A$ intersect $B$”.
- Notice: combining sets with the conjunction “and” never makes the sets larger. It usually makes the sets smaller. So, adding an “and” statement makes an event more specific. Contracting an event with an “and” statement never increases its chance.
if: e.g. what is the probability that event $A$ happens if event $B$ happens? Applying if restricts the outcome space. If $B$ happens, then any outcome $\omega \notin B$ cannot happen. So, since we don’t need to consider impossible outcomes when computing chances, we can replace $\Omega$ with $B$.
- Restricting the outcome space to all outcomes that satisfy an if statement is called conditioning.
- For instance, the probability that a die roll is less than 4 if the roll is even should be computed using the outcome space $\{2,4,6\}$ .
- It is common to use “given that” instead of “if”. For instance, the statement “Find the probability that a die roll is less than 4 if it is even” is the same as the statement “Find the probability that a die roll is less than 4 given that the roll is even.”
- The standard notation for “if” is a vertical bar. E.g. $\text{Pr}(A|B)$ means, “the probability that $A$ happens if $B$ happens”.
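
The four operations can be tried out on the die-roll examples above. The following is a minimal Python sketch, not part of the original notes; the set names (`even`, `less_than_4`, and so on) are illustrative assumptions, and the last step only restricts the outcome space without computing any probability.

```python
omega = {1, 2, 3, 4, 5, 6}    # outcome space for one die roll
even = {2, 4, 6}              # the roll is even
less_than_4 = {1, 2, 3}       # the roll is less than 4

# not: the complement, relative to omega
not_even = omega - even

# or: the union, listing each distinct outcome once
even_or_small = even | less_than_4

# and: the intersection, outcomes satisfying both conditions
even_and_small = even & less_than_4

# if: condition on "the roll is even" by replacing omega with that event,
# then keep the outcomes of "less than 4" that remain possible
restricted_space = even
small_given_even = less_than_4 & restricted_space

print(not_even, even_or_small, even_and_small, small_given_even)
```

The conditioned event and the “and” event contain the same outcomes. What differs is the outcome space they are compared against: `restricted_space` instead of `omega`.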

Solution

Here’s a reasonable diagram:

We’ve grayed out all of $\Omega$ and $A$ outside of $B$ since, when we condition on $B$ , we are restricting our attention to $B$ . Then, the set $A \cap B$ represents the only outcomes where $A$ happens, given that we start in $B$ . Notice that the definition of the event is the same as for and. What’s changed is the outcome space. Instead of looking at the intersection relative to all of $\Omega$ , we are only comparing it to $B$ . We’ll see in Section 1.5 that the formula for conditional chances looks like the formula for joint chances (and statements), but rescaled so that proportions are evaluated out of the event we condition on ( $B$ ) instead of $\Omega$ .

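In summary: “not” corresponds to the complement, $A^c$; “or” corresponds to the union, $A \cup B$; “and” corresponds to the intersection, $A \cap B$; and “if” corresponds to conditioning, which replaces the outcome space $\Omega$ with the event we condition on, $B$.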

P.S. A Comment on Notation

We introduced a lot of notation in this section. Notation is crucial in mathematics since abstract notation can express a general idea that is realized in many different specific ways. For instance, $\text{Pr}(E)$ means the probability of the event $E$ , which can serve as shorthand for any of the probabilities listed at the start of the chapter. We could let $E$ stand for “it rains tomorrow” or “this stock doubles in value in the next quarter” without changing the notation used for the probability of an event. Compressing many different statements with an abstract shorthand is powerful since it will allow us to prove common facts about apparently unrelated systems. It will also allow us to say complicated things quickly.

Notation is also intimidating and exclusionary. Like any language, it is foreign on first exposure, and can seem more like a special code, or pedant’s jargon. We will do our best to avoid unneeded notation and to explain every notation introduced. In exchange, we promise that the notation we do use is worth learning, and expect you to learn it.

If you struggle with new notation, or found the notation above opaque, be proactive, and treat it like you would a foreign language. Make a table, with symbols on one side, name each symbol, then add a plain language description of each symbol or notation. It can help to write out a long form explanation, then a concise, pithy, “key” that you use to express the essence of the idea. Try to make the “key” memorable.

Many ideas in set theory are best represented visually with a Venn diagram. So, try drawing a Venn diagram for the cases $A = B$ , $A \subset B$ , $B \subset A$ , and $A$ and $B$ disjoint. Make space in your table for these visuals. You’ll rehearse these skills in your first discussion section.