Submitted by eth_trader_12 t3_ycxaz9 in askscience

Say Juliet wins two state lotteries back to back in 1999.

A person who thinks it was rigged might say "Well, the probability of her winning two state lotteries back to back is infinitesimally low. Therefore she probably rigged it."

A person who looks at the broader scope of lotteries might say "Well, the probability of someone winning two state lotteries at some time is much higher. This is explainable by chance and the lottery being rigged isn't necessary."

The second is a logically weaker description of the first. It "seems" more correct, but is it? What is the justification for it? Many would say that the second description seems more relevant, but why? Why is it exactly more relevant?

Curiously, one can construct logically stronger and logically weaker versions of the same exact sequence of events, and yet...each would have wildly different probabilities. For example, these descriptions go from logically strong to logically weak:

  1. What is the probability that Juliet decided to play two New Jersey state lotteries, won both, and won them at 9 PM? (assuming the results were announced then)
  2. What is the probability that Juliet won two New Jersey state lotteries?
  3. What is the probability that Juliet won two state lotteries?
  4. What is the probability that someone won two state lotteries?
  5. What is the probability that someone won two lotteries?
  6. What is the probability that some rare meaningful event in the world occurs?
  7. What is the probability that an event occurs?

The last statement almost seems like a tautology, and yet, can one come up with a satisfactory answer as to why it's not relevant here? Which description is most relevant?

1

Comments

albasri t1_itq7f8y wrote

Part of this is simply about having more information to allow us to make better judgments. If I ask "what's the probability that it is going to rain on Sunday" my answer will be different if I have no other information than if I know we are talking about a particular place and time of year. But then we can represent this as different probabilities: P(rain on Sunday) vs. P(rain on Sunday | we are in Seattle in October and are having a rainy season and it's been overcast and rainy for the last 3 days and is overcast today and the weather app said it was going to rain). And note here that the conditional probability is different from the joint probability! (See conjunction fallacy below.) In the conditional case we are saying that those other events have already happened; in the joint case, we are asking about the probability of it raining AND those other events happening.
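To make the conditional-versus-joint distinction concrete, here is a minimal sketch. The joint distribution is entirely made up for illustration (the labels `seattle_october` and `elsewhere` and all of the numbers are assumptions, not data); it only shows how the three quantities relate.

```python
from fractions import Fraction

# Hypothetical joint distribution over (context, weather). The numbers are
# invented for illustration and sum to 1.
joint = {
    ("seattle_october", "rain"): Fraction(3, 20),
    ("seattle_october", "dry"): Fraction(1, 20),
    ("elsewhere", "rain"): Fraction(4, 20),
    ("elsewhere", "dry"): Fraction(12, 20),
}

# Marginal: P(rain), with no other information.
p_rain = sum(p for (ctx, w), p in joint.items() if w == "rain")

# Marginal: P(we are in the Seattle-in-October context).
p_context = sum(p for (ctx, w), p in joint.items() if ctx == "seattle_october")

# Joint: P(rain AND Seattle-in-October context).
p_joint = joint[("seattle_october", "rain")]

# Conditional: P(rain | Seattle-in-October context) = joint / marginal.
p_conditional = p_joint / p_context

print(p_rain, p_joint, p_conditional)
```

With these invented numbers the conditional probability is larger than the marginal, while the joint probability is smaller than both, which is the pattern the paragraph above describes.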

This is also related to how we define a probability space. Once we agree on the possible events and outcomes, we can go about assigning or calculating probabilities to them. This is something that we don't always do exactly the same way, especially in ambiguous circumstances. For example, you can interpret my rain question as a question about it raining or not raining on Sunday (those are the possible events that make up our probability space) or as a question about the probability that it will rain on that day of the week vs. other days of the week. We can be primed to think about the problem in either way and this can bias our answers (Criag and Rottenstreich 2003 <- pdf!).

Which brings up the issue of language / assumptions. When someone asks "what is the probability of rolling a 3" they are leaving out "on one roll of a fair, 6-sided die that has a different number on each side". We generally understand what possible outcomes we are considering (the probability space) when we are asking these questions, but sometimes need to be more specific.

Part of what you describe might also be related to the conjunction fallacy which is an example of a kind of probability error people tend to make in judging two events (A and B) to be more likely than just one of those events (A). The classic example is the "Linda problem" from Kahneman and Tversky (1981) that I copy in full from wiki:

> Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.

> Which is more probable?

> 1. Linda is a bank teller.

> 2. Linda is a bank teller and is active in the feminist movement.

> The majority of those asked chose option 2. However, the probability of two events occurring together (that is, in conjunction) is always less than or equal to the probability of either one occurring alone—formally, for two events A and B this inequality could be written as Pr(A and B) <= Pr(A) and Pr(A and B) <= Pr(B).

> For example, even choosing a very low probability of Linda's being a bank teller, say Pr(Linda is a bank teller) = 0.05 and a high probability that she would be a feminist, say Pr(Linda is a feminist) = 0.95, then, assuming these two facts are independent of each other, Pr(Linda is a bank teller and Linda is a feminist) = 0.05 × 0.95 or 0.0475, lower than Pr(Linda is a bank teller).
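The arithmetic in the quoted example can be checked with a short sketch. The helper name `joint_probability` is hypothetical; it uses the general identity P(A and B) = P(A) × P(B | A), which reduces to the quoted product when the two facts are assumed independent.

```python
from fractions import Fraction

def joint_probability(p_a, p_b_given_a):
    """P(A and B) = P(A) * P(B | A). Because P(B | A) <= 1, this is never above P(A)."""
    return p_a * p_b_given_a

p_teller = Fraction(5, 100)     # Pr(Linda is a bank teller) from the quoted example
p_feminist = Fraction(95, 100)  # Pr(Linda is a feminist) from the quoted example

# Under the independence assumption, P(feminist | teller) = P(feminist).
p_both = joint_probability(p_teller, p_feminist)

print(float(p_both))            # the conjunction
print(p_both <= p_teller)       # conjunction is no more likely than either part
print(p_both <= p_feminist)
```

The inequality does not depend on independence: whatever value P(B | A) takes, it is at most 1, so the conjunction cannot exceed P(A).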

So I suppose the answer to your question is that it depends on the context and what you are actually interested in estimating. For that we need to be clear and precise and sometimes say more than just one sentence that can be interpreted in many ways.

14

quants_pants t1_itqkh23 wrote

A similar discussion emerged after Mexico was hit by an earthquake three times on the same date. It is nicely discussed in this podcast: https://www.bbc.co.uk/programmes/m001cq31 In essence, if you only wonder how likely an event was after it happened, it is incorrect to look at the probability of this happening to a specific person at a specific time and place. It could have happened to another person, at a different time and at a different place, and you would still have read about it and wondered how unlikely it seems.

3

Brainy_Gal t1_itqqzh7 wrote

Good question! To answer it, I'll pose another one: why don't we take it as evidence the lottery was rigged every single time someone wins (i.e. not twice in a row but just once)? In other words, why is the relevant statement "given the lottery isn't rigged, the probability that SOMEONE will win is (close to) 1" rather than "given the lottery isn't rigged, the probability that JULIET will win is very small"? Clearly, if we use the latter approach, we will (almost) always find "evidence" the lottery is rigged even when it isn't, so this can't be correct. The reason is that we formulated the probability statement AFTER we knew Juliet won.

So, with someone winning twice in a row, the relevant statement is: given the lottery isn't rigged, what is the probability that SOMEONE will win twice in a row? Assuming that's reasonably high, we can't use this as evidence the lottery is rigged. But let's assume it's low; you may rightly ask: why isn't the relevant question: given the lottery isn't rigged, what is the probability that someone will win twice (at any two time points)? Why are we assigning special weight to the fact that it's twice in a row? And here is where there is really no good answer. Yes, it's true we formulated the former probability statement before knowing someone actually did win twice in a row. Or did we? What if Juliet had not won back-to-back, but twice within the space of a week? Would we still insist this was evidence of the lottery being rigged? These are the questions that keep statisticians awake at night.
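A rough sketch of how far apart the "Juliet" and "someone" probabilities are under a fair lottery. The model is an assumption made for illustration: `n_players` people each hold exactly one ticket, every draw has exactly one winner chosen uniformly, and draws are independent. The function names are hypothetical.

```python
from fractions import Fraction

def p_named_player_back_to_back(n_players):
    """P(one specific, pre-named player wins two given consecutive draws)."""
    return Fraction(1, n_players) ** 2

def p_someone_back_to_back(n_players):
    """P(the same player, whoever it is, wins two given consecutive draws).

    There are n_players mutually exclusive ways for this to happen,
    one per player, so the probabilities add.
    """
    return n_players * p_named_player_back_to_back(n_players)

def p_someone_back_to_back_ever(n_players, n_pairs):
    """P(at least one back-to-back win across n_pairs pairs of draws),
    treating the pairs as independent (an assumption of this sketch)."""
    return 1 - (1 - p_someone_back_to_back(n_players)) ** n_pairs

n = 10_000
print(p_named_player_back_to_back(n))                  # 1/100000000
print(p_someone_back_to_back(n))                       # 1/10000
print(float(p_someone_back_to_back_ever(n, 1_000)))    # grows with the number of pairs
```

The more pairs of draws (or lotteries, or years) are pooled, the closer the last figure gets to 1, which is why the choice of reference class changes the answer so much.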

2

eth_trader_12 OP t1_itqty1l wrote

I think that even in your example of just one lottery, the relevant thing to consider is that the probability of Juliet winning is small. And there's no problem in that. The probability of Juliet winning given that it was rigged is 1. The probability of Juliet winning normally is very small.

But that's fine. That's the probability of Juliet winning given chance, not the probability of chance given Juliet winning. In order to arrive at the second, we need to look at prior probabilities. Given that the rate at which lotteries are rigged is probably less than the chance of Juliet winning, one would still conclude Juliet won fairly.

An example that makes this clear is imagining that every other lottery is rigged. If every other lottery is rigged, would it make sense to look at the probability of merely "someone" winning the lottery? Clearly not, since the probability of "someone" winning the lottery is very high, if not 1, depending on the lottery. But does that mean the probability of it occurring by chance is close to 1? No, because every other lottery is still rigged.

−1

ohnoyoudin t1_itrcspg wrote

The results of two completely separate lottery events do not affect one another. It’s the same idea as flipping a coin twice- just because one toss came up heads this has no effect on the outcome of the second toss.

So really the relevant question here is ‘what is the probability of someone playing two state lotteries at once’. The probability of this is high- people who gamble on terrible odds are likely to do so in multiplicity.

2

uh-okay-I-guess t1_itrh646 wrote

Let's say you let J be the event that Juliet wins two state lotteries back to back, and let A be the event that anyone wins two state lotteries back to back. Let R be the event that the lottery is rigged. What you really want to know is the posterior probability that the lottery is rigged, and this could be P(R|J) or P(R|A).

I'm interpreting your question as asking: how should I choose whether to care about P(R|J) or P(R|A)?

Fortunately, if we know nothing about Juliet, it doesn't really matter. See, you can calculate P(R|J) = P(J|R)P(R)/P(J), and you can also calculate P(R|A) = P(A|R)P(R)/P(A). I claim these calculations will produce very similar values when we know nothing about Juliet. P(R) is of course common to both. Canceling it from both expressions leaves P(J|R)/P(J) in the first expression and P(A|R)/P(A) in the second.

In the second expression, both the numerator and denominator are larger. There are at least a thousand people who play the New Jersey lottery [citation needed], so we expect P(A) > 1000P(J). But similarly, if the lottery is rigged, it could be rigged in favor of any of those one thousand people, so we also expect P(A|R) > 1000P(J|R). If we genuinely do not think any of those people is more likely to be favored by the rigging than any other, then the ratios will be exactly the same and our calculated probabilities will be the same too. In this case it doesn't matter whether we care about P(R|J) or P(R|A).

Remember, this only works if Juliet is an arbitrary person. If we do know that Juliet is the daughter of a crooked New Jersey politician, then maybe we think that if the lottery is rigged, it's quite likely to be rigged in her favor. In that case, we might say P(A|R) = 10P(J|R), while still believing that the lottery is probably not rigged and P(A) = 1000P(J). Then P(R|J) is going to be 100 times bigger than P(R|A). In this case it would be wrong to use P(R|A), because it's throwing away important information (Juliet's crooked connections).

In summary, in both cases we should really condition on the more specific event (i.e. we care about P(R|J)), because that takes into account all the available information. But luckily for our sanity, when we have no special information about Juliet, P(R|J) = P(R|A). So even though you want to condition on all the available information, it's fine to ignore information that means nothing to you.
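The argument above can be checked numerically. This is only a sketch under stated assumptions: exactly 1000 players, a prior P(R) of one in a million, rigging that guarantees someone wins back to back (P(A|R) = 1), and fair-lottery probabilities of 1/1000² for Juliet and 1/1000 for anyone. None of these numbers come from real lotteries, and the helper names are hypothetical.

```python
from fractions import Fraction

def posterior_rigged(prior, p_event_if_rigged, p_event_if_fair):
    """Bayes' rule for two hypotheses: P(R | event)."""
    numerator = p_event_if_rigged * prior
    return numerator / (numerator + p_event_if_fair * (1 - prior))

def odds(p):
    """Convert a probability to odds in favor."""
    return p / (1 - p)

N = 1000                          # assumed number of players
prior = Fraction(1, 1_000_000)    # assumed prior P(R)

p_J_fair = Fraction(1, N**2)      # Juliet wins both draws by chance
p_A_fair = N * p_J_fair           # anyone wins both draws by chance
p_A_rigged = Fraction(1)          # assumed: rigging guarantees a double win

# Case 1: Juliet is an arbitrary player, so P(A|R) = 1000 * P(J|R).
p_J_rigged_arbitrary = p_A_rigged / N
# Case 2: Juliet has crooked connections, so P(A|R) = 10 * P(J|R).
p_J_rigged_connected = p_A_rigged / 10

post_A = posterior_rigged(prior, p_A_rigged, p_A_fair)
post_J_arbitrary = posterior_rigged(prior, p_J_rigged_arbitrary, p_J_fair)
post_J_connected = posterior_rigged(prior, p_J_rigged_connected, p_J_fair)

print(post_A == post_J_arbitrary)             # identical when Juliet is arbitrary
print(odds(post_J_connected) / odds(post_A))  # ratio of posterior odds
```

In this sketch the posterior odds for the connected Juliet come out exactly 100 times those for "anyone"; the posterior probabilities differ by roughly that factor as long as both stay small.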

1

eth_trader_12 OP t1_its2ozt wrote

I completely agree with you except for the last statement. By the same principle of using more information that you just described, P(R|A) assumes that "all the information" we have is that someone at some time won two lotteries. As in, if you knew only that someone won two lotteries at some point or another, then yes, P(R|A) would suffice.

But in this case, we know that Juliet won. Hence, we calculate P(R|J), even if we don't know anything else about Juliet.

1

Coomb t1_its9nxp wrote

> I completely agree with you except the last statement. P(R|A) given the same principle of more information that you just said assumes that "all the information" we have is that someone at some time won two lotteries twice. As in, if you knew that someone won two lotteries at some point or another, then yes, P (R|A) would suffice.
>
> But in this case, we know that Juliet won. Hence, we calculate P (R|J), even if we don't know anything else about Juliet

Why? Why do we do that when as far as we know Juliet is no different from anyone else? There is no more reason to assume the lottery is rigged because a particular individual whose name you know won twice in a row if you know nothing at all about that individual. If you don't know them from Adam, then Juliet could just as easily have been Adam or Bobby or Charles or Doug. Knowing literally nothing other than her name is the same as knowing nothing at all about her, unless the lottery is rigged for anyone named Juliet and you know that.

1

uh-okay-I-guess t1_itsc8xy wrote

Your original post has a list of 7 events (A1 through A7) in increasing order of probability. The difference between P(A1) and P(A7) is many orders of magnitude, with huge jumps from each statement to the next, and there are seemingly no grounds for preferring to study one of these statements over any other. Even if you try to use the principle of including all information, it's not clear where to stop. You'll just worry that you needed to choose A0: Juliet won twice, at 9 PM, and the weather was cloudy both times. Compared to P(A1), P(A0) is another order of magnitude lower.

But if you realize that you actually care about, not P(A1), ..., P(A7), but P(R|A1), ..., P(R|A7), then it's easy to see that adding cloudiness does not actually change the result. In a Bayesian formulation of the problem, only relevant information matters. P(R|A0) = P(R|A1). You can ignore the weather.
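A small sketch of why the weather drops out. All numbers are invented; the one assumption that matters is that cloudiness is equally likely whether or not the lottery is rigged, so it multiplies both likelihoods by the same factor.

```python
from fractions import Fraction

def posterior_rigged(prior, p_event_if_rigged, p_event_if_fair):
    """Bayes' rule for two hypotheses: P(R | event)."""
    numerator = p_event_if_rigged * prior
    return numerator / (numerator + p_event_if_fair * (1 - prior))

prior = Fraction(1, 1000)            # assumed prior P(R)
p_A1_rigged = Fraction(1, 100)       # assumed P(A1 | rigged)
p_A1_fair = Fraction(1, 1_000_000)   # assumed P(A1 | fair)
p_cloudy = Fraction(3, 10)           # assumed P(cloudy both times), same under R and not-R

# A0 = A1 plus "it was cloudy both times": both likelihoods shrink by p_cloudy.
p_A0_rigged = p_A1_rigged * p_cloudy
p_A0_fair = p_A1_fair * p_cloudy

post_A1 = posterior_rigged(prior, p_A1_rigged, p_A1_fair)
post_A0 = posterior_rigged(prior, p_A0_rigged, p_A0_fair)

print(p_A0_fair < p_A1_fair)   # A0 is less probable than A1 ...
print(post_A0 == post_A1)      # ... but the posterior P(R | .) is unchanged
```

The common factor cancels in Bayes' rule, so the more specific description is less probable without being any more (or less) incriminating.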

The point of the last statement in the previous post is that, if you know no particular distinguishing information about Juliet, you can calculate P(R|A) or P(R|J) and it doesn't matter, because you still get the same answer. So as long as you use Bayesian reasoning, you are basically free to pay attention to the specific identity of the person, or ignore it, without affecting your conclusion.

1

eth_trader_12 OP t1_itsfjq6 wrote

That’s what you’re confusing. Looking at the specific probability of Juliet winning the lottery twice does increase the probability that it is rigged, just perhaps not enough. The only reason many consider it still not rigged is because of prior probabilities: the vast majority of lotteries in history have been fair; very few have been rigged.

But now imagine that half of all lotteries were fair and half were rigged. Let’s assume 10,000 tickets and 10,000 people. Let’s now look at the first case the other commenter mentioned: Juliet won the lottery once. The prior probabilities are the same for rigged and fair, so they can be ignored. We now look at the likelihoods. The probability of Juliet winning the lottery given a fair lottery is 1 in 10k. The probability of Juliet winning the lottery given a rigged lottery is ALSO 1 in 10k (given others could have rigged it). A fair lottery is just as likely as a rigged one.

Now, let’s assume Juliet won the lottery twice. The priors are again the same, so let’s look at the likelihoods. The probability of Juliet winning two lotteries given chance is (1/10k * 1/10k). The probability of Juliet winning two lotteries given that it’s rigged is 1/10k (1 out of 10k people could have rigged it twice). Now, the rigged lottery is MORE likely. Note that if we looked at the more generic description of SOMEONE winning the lottery twice, the likelihood of SOMEONE winning the lottery back to back would be 1… given enough time. But the likelihood of SOMEONE winning the lottery back to back given it’s rigged… is also 1. Now we must conclude they’re equally likely, but that’s not accurate.

As you can see, looking at specifics seems to work better.
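The numbers in this comment can be run through Bayes' rule directly. This sketch takes the comment's own assumptions (50/50 priors, 10,000 players with one ticket each, a rigger equally likely to be any of them) and adds one more that is purely an assumption of the sketch: a rigged lottery guarantees that the rigger wins both draws. The helper name is hypothetical.

```python
from fractions import Fraction

def posterior_rigged(prior, p_event_if_rigged, p_event_if_fair):
    """Bayes' rule for two hypotheses: P(R | event)."""
    numerator = p_event_if_rigged * prior
    return numerator / (numerator + p_event_if_fair * (1 - prior))

N = 10_000
prior = Fraction(1, 2)   # half of all lotteries rigged, as in the comment

# Juliet wins once: likelihood 1/N under both hypotheses.
post_once = posterior_rigged(prior, Fraction(1, N), Fraction(1, N))

# Juliet wins twice: 1/N if rigged, 1/N^2 if fair.
post_juliet_twice = posterior_rigged(prior, Fraction(1, N), Fraction(1, N**2))

# SOMEONE wins the same two specific draws: certain if rigged (sketch assumption),
# and N * (1/N^2) = 1/N if fair.
post_someone_twice = posterior_rigged(prior, Fraction(1), Fraction(1, N))

print(post_once)            # 1/2
print(post_juliet_twice)    # 10000/10001
print(post_someone_twice)   # 10000/10001
```

For one specific pair of draws, the "someone" description gives the same posterior as the "Juliet" description. The fair-lottery likelihood only approaches 1 when many pairs of draws are pooled ("given enough time"), which is a different event from a double win at a particular time.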

1

eth_trader_12 OP t1_itsfut7 wrote

Sorry I replied to the wrong person but what I said applies to you as well.

In regards to what you said though, “what you care about” is subjective so we’re back to square one.

Ultimately though, I think the more specific information should be taken into account. The example in my other comment highlights that

1

eth_trader_12 OP t1_itvas30 wrote

It doesn’t. You missed the entire point of the example. Even if I didn’t know her name, I would know that a specific person won it, and the math would be the same.

The math would be different only if I knew that someone at some time won two lotteries, not at a specific time. The time is what’s relevant here.

−1