How to explain the following probability problem to high school students?

Question

I will present two problems along with their solutions. The students find problems of Type I a cakewalk, but have several issues with problems of Type II.

Type I

Consider an experiment of rolling two dice:

Sample space $$ S = \{(1,1),(1,2),(1,3), \cdots, (6,6) \} $$

Let $A$ be the event of getting a sum of 6 on the two dice:

Event $$A = \{(1,5),(2,4),(3,3),(4,2),(5,1)\}$$

Let $B$ be the event of getting 4 on the first die:

Event $$B = \{(4,1),(4,2),(4,3),(4,4),(4,5),(4,6)\}$$

Now, the probability of getting a sum of 6 on the two dice, given that the first die shows 4, is

$$p(A \mid B) = \dfrac{p(A\cap B)}{p(B)} = \dfrac{p\{(4,2)\}}{p(B)} = \dfrac{1}{6}$$

Type II

Let $C$ be the event of having cancer and $p(C) = 1/2$

The probability of having both tumor and cancer is $p(C\cap T) = 1/6$

Then the probability of having tumor given cancer is given by

$$p(T\mid C) = \dfrac{p(T\cap C)}{p(C)} = \dfrac{1}{3}$$

The students can follow the calculations in both Type I and Type II problems, but some of them ask me to present $C$, $T$, and $C\cap T$ as sets. I tried to convince them using Venn diagrams, but they want either roster form (for discrete sets) or set-builder form (for any set).

I am trying the following:

\begin{align}
\text{Sample space} = S &= \{x \mid \text{$x$ is a living being} \} \\
C &= \{x \mid \text{$x$ has cancer}\} \\
C\cap T &= \{x \mid \text{$x$ has both tumor and cancer}\}
\end{align}

Is this the right way to do it? The students are having difficulty, and they ask for every event in the form of a set, since by definition an event is a subset of the sample space (which is a set).

I might say $S = \{ x \mid \text{$x$ is a person} \}$ (since I am guessing that your sample space is people, not all living things; can a paramecium even get cancer?), and I would likely define the set $T := \{ x \mid \text{$x$ has a tumor} \} \subseteq S$, but it otherwise looks fine. That being said, "$T$ is the set of people with tumors" is a reasonable definition of a set in this context; I wonder why you are getting so much guff about it...
– Xander Henderson ♦, Commented Jan 15, 2018 at 15:33
I would be inclined to write $$ S = \{ x \mid x \text{ is a sample} \} \qquad \qquad C = \{ x \mid x \text{ is in event $C$} \}$$ and so forth if someone insisted on writing the events as sets in a context where they don't matter.
– user797, Commented Jan 15, 2018 at 20:17
There's a step missing in both solutions, and this serves to hide the fact that in the Type I solution one is likely answering by simply substituting frequencies (not probabilities). That is, it submarines the possibility that one thinks $p((4, 2)) = 1$ and $p(B) = 6$; this in turn can then feed errors in Type II.
– Daniel R. Collins, Commented Jan 16, 2018 at 2:12
Isn't the secret problem here that in type 1 there is a finite sample space which can be explicitly written down, but not in type 2?
– Torsten Schoeneberg, Commented Oct 2 at 2:21

Nate Bade · Accepted Answer · 2018-01-18 20:15:14Z

To use set builder notation you need to carefully state to which set every element you want to discuss belongs. In general it goes

$P = \{x\in\text{some set} : x\text{ satisfies some truth evaluable proposition}\}$.

Here's how I would do it in your case:

Let $P$ be the set of all humans, let $C=\{x\in P: x\text{ has cancer}\}$ and let $T=\{x\in P: x\text{ has a tumor}\}$. Then $C\cap T$ is the set of humans that have both cancer and a tumor, or $C\cap T = \{x\in P: (x\text{ has a tumor})\wedge (x\text{ has cancer})\}$. Here the wedge $\wedge$ denotes the logical 'and'.
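For students who want to see the sets in roster form, the set-builder definitions above can be mirrored in a few lines of Python. This is only a sketch: the six-person population, the names, and the health data are invented for illustration, chosen so that the numbers match the question ($p(C) = 1/2$, $p(C\cap T) = 1/6$), and it assumes one person is drawn uniformly at random.

```python
from fractions import Fraction

# Hypothetical population: name -> (has_cancer, has_tumor).
# All names and data are invented for illustration.
people = {
    "Ann": (True, False),
    "Bob": (True, False),
    "Cid": (True, True),
    "Dee": (False, True),
    "Eve": (False, False),
    "Fay": (False, False),
}

# Set-builder notation maps directly onto set comprehensions.
P = set(people)
C = {x for x in P if people[x][0]}                       # x has cancer
T = {x for x in P if people[x][1]}                       # x has a tumor
C_and_T = {x for x in P if people[x][1] and people[x][0]}


def prob(event):
    """Probability of an event when one person is drawn uniformly from P."""
    return Fraction(len(event), len(P))


p_T_given_C = prob(C_and_T) / prob(C)
print(sorted(C), sorted(T), sorted(C_and_T), p_T_given_C)
```

With this made-up data, `C_and_T` comes out as the one-element set containing "Cid", and the conditional probability is 1/3, the same value as in the question.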

Stef · Accepted Answer · 2025-10-02 09:41:25Z

Rolling dice is simple, and easy to imagine or visualise. Enumerating die results is easy. Drawing and visualising a table of 36 possible two-roll outcomes is easy.

But sampling from a population of humans who may or may not have cancer is hard to imagine and visualise. In fact, the question never mentions sampling from a population. It never explains why probabilities are relevant to the situation. In fact, it never even presents a situation!

The question opens abruptly with "The event of having cancer" and "The probability of having both tumor and cancer". But "having cancer" is a verb, and it has no subject here! Who has this cancer? Where do the probabilities come from?

One way to interpret the problem is to imagine that we are presented with a fixed person, and this person will randomly get cancer or not. In that case, it's not obvious at all what the sample space would be, and it's pretty hard to link the intuitive idea of the event "this person gets cancer" with the corresponding mathematical event, which must be a mathematical set.

Another way to interpret the problem is to imagine that we randomly sample a person from a population. So, every person already has cancer or not, and there is no randomness involved as to whether each given person has cancer; the randomness comes from us randomly selecting one person from this population. A statistician would immediately interpret the problem this way. But will a student interpret it this way, and be able to visualise the population and the events?

I would bring an actual Guess Who? set to the classroom, with tumor and cancer written on the little characters. Now you have a table of characters to sample from, the two sets $C$ and $T$ are easy to visualise, and your two types of problems are now pretty similar.

That being said, I would be wary of the proposed solution to the type I problem. Specifically this line:

$$p(A \mid B) = \dfrac{p(A\cap B)}{p(B)} = \dfrac{p\{(4,2)\}}{p(B)} = \dfrac{1}{6}$$

It looks like someone just plugged in $p(\{(4,2)\})=1$ and $p(B)=6$, which would obviously be incorrect. There are two ways to calculate this fraction: either the simple but hard way, plug in $p(\{(4,2)\})=1/36$ and $p(B)=1/6$, then simplify the resulting fraction to finally arrive at $1/6$; or the subtle way, don't plug in probabilities at all, because the ratio of the probabilities of two events is equal to the ratio of the cardinalities of those two events, although this is only true because all outcomes of the sample space are equiprobable. Either way, there is a step missing in the solution, and although this step might seem obvious to the educator, it is somewhat subtle and might hide (or cause!) a lack of understanding.
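As a rough check of the two routes described above, here is a small Python sketch that enumerates the 36 outcomes and computes the answer both ways. The function name `p` and the variable names are my own choices, and the code assumes fair dice, so every outcome has probability $1/36$.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered outcomes of two dice, assumed equiprobable.
S = set(product(range(1, 7), repeat=2))
A = {(a, b) for (a, b) in S if a + b == 6}   # the sum is 6
B = {(a, b) for (a, b) in S if a == 4}       # the first die shows 4


def p(event):
    """Probability of an event under the equiprobable assumption."""
    return Fraction(len(event), len(S))


# Way 1: plug in actual probabilities, (1/36) / (1/6).
via_probabilities = p(A & B) / p(B)

# Way 2: ratio of cardinalities, |A ∩ B| / |B|; valid only because
# every outcome in S is equally likely.
via_counts = Fraction(len(A & B), len(B))

print(p(A & B), p(B), via_probabilities, via_counts)
```

This prints `1/36 1/6 1/6 1/6`: both routes agree, but only the first one shows the actual probabilities involved.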

Andreas Blass · Accepted Answer · 2018-01-23 00:03:38Z

Since they can do Type-1 problems, I'd begin by trying to make your Type-2 problem look as similar to Type 1 as possible, despite the fact that, in the Type-2 problem, we don't actually know the size of the sample space. So I'd temporarily assume some specific size, say $6$ (to make the arithmetic easy). So we have $6$ people of whom $3$ have cancer, and one of those $3$ also has a tumor. Let them work out the conditional probability in this case, which I hope they can do because this version of the question is very similar to Type 1. Then, re-do the problem with a sample space of size, say, 24. And again with 60 or 600, etc., until it becomes clear to them that the information in the problem is enough to determine the answer, regardless of the size of the sample space. The formula for conditional probability in terms of absolute probabilities summarizes the results of these calculations.
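The repeated calculation can be sketched in Python as follows. The function name `p_tumor_given_cancer` and the way people are labelled by integers are assumptions made purely for illustration; the only inputs taken from the problem are that half the population has cancer and one sixth has both cancer and a tumor.

```python
from fractions import Fraction


def p_tumor_given_cancer(n):
    """p(T | C) in a hypothetical population of n people (n divisible by 6)."""
    if n % 6 != 0:
        raise ValueError("n must be divisible by 6")
    population = set(range(n))     # people labelled 0 .. n-1
    C = set(range(n // 2))         # half the population has cancer
    C_and_T = set(range(n // 6))   # one sixth has both; a subset of C
    p_C = Fraction(len(C), len(population))
    p_C_and_T = Fraction(len(C_and_T), len(population))
    return p_C_and_T / p_C


for n in (6, 24, 60, 600):
    print(n, p_tumor_given_cancer(n))
```

Each line of output shows the population size followed by 1/3, which is the point of the exercise: the size of the sample space cancels out.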

guest · Accepted Answer · 2018-01-21 19:40:15Z

The question is very confusingly stated. (Hopefully it is more clearly done in the native language.) I can't help with the logic barrier for the students, but if they are having a hard time with the logic and then you add on confusing language, this makes things worse. Or the language may be more of a problem than the logic barrier itself.

"Let T be the event of having cancer and p(C)=1/2".

Is T tumor or cancer??? And why have you combined such separate ideas (definition of T tumor and probability of C cancer) into one joined compound sentence? [You are lacking a comma, too. Wouldn't normally nit about this, as I make mistakes myself, but it adds to the bafflement.] For that matter, what do "cancer" and "tumor" mean? (Note how I add the term "detectable" below to help with the huh factor of people equating C and T.)

Let me try:

C is the fraction of the population that has cancer.
T is the fraction with a detectable tumor.*

Given:

A. Probability of cancer C in the population is 0.5.
B. Probability of having both cancer C and a detectable tumor T is 1/6.

Question: What is the probability of having a detectable tumor T, given having cancer C?

Answer: (your equation)

P.S. This is just a student math problem, but mistakes in the medical literature are easy to make and common, in some cases even affecting patient treatment. So try to be super precise. I'm not perfect either and probably have some mistakes myself. But I do know the importance of clear communication in medical work!

*Not clear to me if you are considering benign detectable tumors (i.e. possible to have T sans C). This has implications for how to think about the problem in the real world. For instance, you could have a person who has C (cancer) and has T (detectable tumor) but the detected tumor is benign. Thus examination, biopsy, surgery etc. might not detect or treat the person's cancer. I don't think this affects your math problem, but just be careful.

I agree with the points made in this answer, but I would avoid the words "fraction of the population" to describe an event. A fraction is a fraction. An event is a set. So I would say "C is the set of people who have cancer" or "C is the subset of the population of people who have cancer" and avoid "C is the fraction...".
– Stef, Commented Oct 1 at 15:37
