Submitted by eth_trader_12 t3_ycxaz9 in askscience
Say Juliet wins two state lotteries back to back in 1999.
A person who thinks it was rigged might say "Well, the probability of her winning two state lotteries back to back is infinitesimally low. Therefore she probably rigged it."
A person who looks at the broader scope of lotteries might say "Well, the probability of someone winning two state lotteries at some time is much higher. This is explainable by chance and the lottery being rigged isn't necessary."
The second is a logically weaker description of the first. It "seems" more correct, but is it? What is the justification for it? Many would say that the second description seems more relevant, but why? Why is it exactly more relevant?
Curiously, one can construct logically stronger and logically weaker descriptions of the exact same sequence of events, and yet each would have a wildly different probability. For example, these descriptions go from logically strong to logically weak:
- What is the probability that Juliet decided to play two New Jersey state lotteries, won both of them, and won them at 9 PM? (assuming the results were announced then)
- What is the probability that Juliet won two New Jersey state lotteries?
- What is the probability that Juliet won two state lotteries?
- What is the probability that someone won two state lotteries?
- What is the probability that someone won two lotteries?
- What is the probability that some rare meaningful event in the world occurs?
- What is the probability that an event occurs?
The last statement almost seems like a tautology, and yet, can one come up with a satisfactory answer as to why it's not relevant here? Which description is most relevant?
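To put rough numbers on the gap between two of these descriptions ("Juliet won two state lotteries" vs. "someone won two state lotteries"), here is a minimal sketch. Everything in it is an assumption for illustration: the per-draw win probability `p_win`, the number of players `n_players`, the number of back-to-back pairs each player enters `n_pairs`, and the simplification that all draws are independent. The function names are hypothetical too.

```python
import math


def p_specific_double_win(p_win: float) -> float:
    """P(one named person wins two independent draws in a row)."""
    return p_win ** 2


def p_anyone_double_win(p_win: float, n_players: int, n_pairs: int = 1) -> float:
    """P(at least one of n_players wins some back-to-back pair of draws).

    Assumes every player enters n_pairs independent pairs of draws and that
    all outcomes are independent (a simplification, not how real lotteries work).
    """
    p_pair = p_win ** 2
    trials = n_players * n_pairs
    # 1 - (1 - p_pair)**trials, computed via log1p/expm1 to avoid rounding to 0.
    return -math.expm1(trials * math.log1p(-p_pair))


# Illustrative numbers only: 1-in-10-million odds, 10 million players.
p_named = p_specific_double_win(1e-7)
p_someone = p_anyone_double_win(1e-7, n_players=10_000_000)
print(f"named person: {p_named:.3e}")
print(f"someone:      {p_someone:.3e}")
print(f"ratio:        {p_someone / p_named:.3e}")
```

Under these assumptions the "someone" version is larger by roughly a factor of `n_players * n_pairs`, which is the sense in which the weaker description is "much more probable". Whether it is the more relevant description is a separate question that the arithmetic alone does not settle.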
albasri t1_itq7f8y wrote
Part of this is simply about having more information to allow us to make better judgments. If I ask "what's the probability that it is going to rain on Sunday" my answer will be different if I have no other information than if I know we are talking about a particular place and time of year. But then we can represent this as different probabilities: P(rain on Sunday) vs. P(rain on Sunday | we are in Seattle in October and are having a rainy season and it's been overcast and rainy for the last 3 days and is overcast today and the weather app said it was going to rain). And note here that the conditional probability is different from the joint probability! (See conjunction fallacy below.) In the conditional case we are saying that those other events have already happened; in the joint case, we are asking about the probability of it raining AND those other events happening.
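The difference between the unconditional, joint, and conditional probabilities can be seen in a small sketch. The day counts below are made up purely for illustration (they are not real weather data), and "context" stands in for the whole list of conditions in the example (Seattle, October, overcast, and so on).

```python
from fractions import Fraction

# Hypothetical counts over 1000 Sundays, invented for illustration only.
total_days = 1000
context_days = 100        # Sundays where the described context held
rain_and_context = 80     # ...and it also rained
rain_no_context = 180     # rainy Sundays without that context

# P(rain): no extra information.
p_rain = Fraction(rain_and_context + rain_no_context, total_days)
# P(rain AND context): both things happen.
p_joint = Fraction(rain_and_context, total_days)
# P(context)
p_context = Fraction(context_days, total_days)
# P(rain | context) = P(rain AND context) / P(context)
p_conditional = p_joint / p_context

print("P(rain)           =", float(p_rain))
print("P(rain, context)  =", float(p_joint))
print("P(rain | context) =", float(p_conditional))
```

With these invented counts the joint probability is smaller than P(rain), while the conditional probability is larger: adding information can raise a conditional probability even though adding conjuncts can only lower a joint one.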
This is also related to how we define a probability space. Once we agree on the possible events and outcomes, we can go about assigning probabilities to them or calculating them. This is something that we don't always do exactly the same way, especially in ambiguous circumstances. For example, you can interpret my rain question as a question about it raining or not raining on Sunday (those are the possible events that make up our probability space) or as a question about the probability that it will rain on that day of the week vs. other days of the week. We can be primed to think about the problem in either way and this can bias our answers (Fox and Rottenstreich 2003).
Which brings up the issue of language / assumptions. When someone asks "what is the probability of rolling a 3" they are leaving out "on one roll of a fair, 6-sided die that has a different number on each side". We generally understand what possible outcomes we are considering (the probability space) when we are asking these questions, but sometimes we need to be more specific.
Part of what you describe might also be related to the conjunction fallacy, which is a kind of probability error people tend to make: judging the conjunction of two events (A and B) to be more likely than just one of those events (A). The classic example is the "Linda problem" from Tversky and Kahneman (1983) that I copy in full from wiki:
> Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.
> Which is more probable?
> 1. Linda is a bank teller.
> 2. Linda is a bank teller and is active in the feminist movement.
> The majority of those asked chose option 2. However, the probability of two events occurring together (that is, in conjunction) is always less than or equal to the probability of either one occurring alone—formally, for two events A and B this inequality could be written as Pr(A and B) <= Pr(A) and Pr(A and B) <= Pr(B).
> For example, even choosing a very low probability of Linda's being a bank teller, say Pr(Linda is a bank teller) = 0.05 and a high probability that she would be a feminist, say Pr(Linda is a feminist) = 0.95, then, assuming these two facts are independent of each other, Pr(Linda is a bank teller and Linda is a feminist) = 0.05 × 0.95 or 0.0475, lower than Pr(Linda is a bank teller).
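The quoted example assumes independence, but the inequality holds regardless of how the two events depend on each other. A minimal sketch, using the same 0.05 and 0.95 from the quote (the function name `conjunction_bounds` is just a label chosen here):

```python
from fractions import Fraction


def conjunction_bounds(p_a: Fraction, p_b: Fraction) -> tuple[Fraction, Fraction]:
    """Smallest and largest possible P(A and B) given only P(A) and P(B).

    These are the standard Frechet bounds; they hold for any dependence
    between A and B.
    """
    lower = max(Fraction(0), p_a + p_b - 1)
    upper = min(p_a, p_b)
    return lower, upper


p_teller = Fraction(5, 100)     # Pr(Linda is a bank teller), from the quote
p_feminist = Fraction(95, 100)  # Pr(Linda is a feminist), from the quote

lower, upper = conjunction_bounds(p_teller, p_feminist)
independent = p_teller * p_feminist  # the quote's independence assumption

print("bounds on P(teller and feminist):", float(lower), "to", float(upper))
print("value under independence:        ", float(independent))
```

Even in the most favorable case (every bank teller is a feminist), the conjunction can only tie Pr(Linda is a bank teller); it can never exceed it.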
So I suppose the answer to your question is that it depends on the context and what you are actually interested in estimating. For that we need to be clear and precise and sometimes say more than just one sentence that can be interpreted in many ways.