
watabadidea t1_j61lwjw wrote

>They don't have to explain; anyone who's trained in statistics knows

Whenever someone starts off by trying to make it known how much smarter and better educated/trained they are, I know I'm in for some excellent analyses and good faith engagement.

>It's always the same reason: you have many more people to work with; because comparatively very few people get hospitalized.

So the reason one question is easier to answer than another is "always" because one has "many" more examples to work with? It doesn't matter how much more complex one system might be, how easy or hard it is to actually observe the two systems, how easy or hard it is to make accurate measurements, etc.?

It "always" comes down to the one that has "many" more examples to work with?

>It's like if you're trying to check if two dice are loaded, but there's one die you can roll every few seconds and another you can roll only once every hour

So the first die is rolled roughly a thousand times more frequently than the second. Your statistical training tells you that this means it will "always" be easier to tell whether the first die is loaded than the second?

So what if the second die is 10 billion times less fair/more loaded than the first die? I would think that, even with the slower rolling speed, I'd be able to determine that the second die is loaded long before I can tell the first die is loaded.
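Here's a rough sketch of what I mean, with made-up numbers (a die that only slightly favors 6 rolled every few seconds vs. a die that lands on 6 ninety percent of the time rolled once an hour, an exact binomial test against a fair die; none of this is from the article, it's just to make the point concrete):

```python
# Toy simulation, my numbers, not anything from the CDC data: one die is
# rolled every 3 seconds and only slightly favors six (P(six) = 0.17
# instead of 1/6); the other is rolled once an hour but is heavily loaded
# (P(six) = 0.90). After six hours, which one can we already call loaded?
import math
import random

def p_at_least(k, n, p=1 / 6):
    """Exact P(X >= k) for X ~ Binomial(n, p), summed in log space."""
    if k <= 0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        log_term = (math.log(math.comb(n, i))
                    + i * math.log(p) + (n - i) * math.log(1 - p))
        total += math.exp(log_term)
    return min(total, 1.0)

def count_sixes(p_six, n_rolls, rng):
    """Roll a (possibly loaded) die n_rolls times and count the sixes."""
    return sum(rng.random() < p_six for _ in range(n_rolls))

rng = random.Random(0)
hours = 6
fast_rolls = hours * 3600 // 3            # one roll every 3 seconds
slow_rolls = hours                        # one roll per hour

fast_sixes = count_sixes(0.17, fast_rolls, rng)   # slightly loaded
slow_sixes = count_sixes(0.90, slow_rolls, rng)   # heavily loaded

print("fast, slightly loaded die:", fast_sixes, "sixes in", fast_rolls,
      "rolls, p-value", p_at_least(fast_sixes, fast_rolls))
print("slow, heavily loaded die: ", slow_sixes, "sixes in", slow_rolls,
      "rolls, p-value", p_at_least(slow_sixes, slow_rolls))
```

On most runs, the slow, heavily loaded die blows past any reasonable evidence threshold within its first handful of rolls, while the fast, slightly loaded die is still ambiguous after six hours of rolling every few seconds. Raw counts matter, but so does effect size.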

Your stance is that I'm wrong and that the error in my thinking is because of the superior statistical training that you have?

>Of course if you could collect literally all data on all patients nationwide/worldwide you'd have enough cases

Of course, but how is that relevant? OP never suggested that you'd need a data set that big. I never suggested that you would need a data set that big. The CDC never said you would need a data set that big.

Are you saying that you'd need a data set that big?

If so, maybe we should discuss your thought process here. Otherwise, if literally nobody that matters to the conversation is suggesting we need a data set that large, then why are you choosing to focus on it?

>I don't think you realize just how much the cost/benefit ratio is skewed towards getting vaccinated.

Obviously, at least for the vast vast majority of people. We aren't talking about getting vaccinated though. We are talking about getting boosted.

So, again, how is this relevant?

>You're basically complaining that the weight of your car is given in pounds rather than in ounces

If I have the accurate weight of the car in pounds, I can literally directly calculate how many ounces it is. Are you telling me that knowing:

>The bivalent mRNA boosters from Pfizer-BioNTech and Moderna were 48% effective against symptomatic infection from the predominant omicron subvariant (XBB/XBB.1.5) in persons aged 18-49 years according to early data published by the CDC

...allows you to directly calculate the answers to all the questions I asked? NGL, that would be really impressive.

0

Coquenico t1_j61ypu7 wrote

> Whenever someone starts off by trying to make it known how much smarter and better educated/trained they are, I know I'm in for some excellent analyses and good faith engagement.

I'm telling you that the answer you're looking for is statistical in essence, and that you cannot understand the answer if you do not understand the underlying statistical approach

> So what if the second die is 10 billion times less fair/more loaded than the first die?

even if the die always rolled the same number, you'd still need at least 5 hours before you could conclude anything. In those 5 hours you would have been able to detect/exclude very small deviations in the other die (note that you can never exclude extremely small deviations)
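Rough back-of-the-envelope version of that number, under the (arbitrary, my choice) rule that we only call the die loaded once the chance of seeing the pattern from a fair die drops below 0.001:

```python
# n identical rolls of a fair die have probability (1/6) ** (n - 1):
# the first roll can land on anything, every later roll must match it.
for n in range(2, 7):
    print(n, "identical rolls:", (1 / 6) ** (n - 1))
# (1/6)**4 is about 0.00077 < 0.001, so roughly 5 rolls, i.e. about
# 5 hours at one roll per hour, even for a die that is as loaded as
# a die can possibly be.
```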

So of course there are other factors involved, but statistical power is always hugely dependent on the raw numbers

the current problem isn't like this anyway. Proving that the booster is at most 99% efficient against hospitalization is relatively easy, but it's a useless result; the question that's relevant for policy is getting an estimate of its efficiency to within a ballpark of maybe 10 points: is it around 20%, 50%, 80%? So it's the same order of magnitude as for the efficiency on symptoms
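To give a feel for why, here's a crude sketch with invented numbers (not the CDC's): same cohort sizes in both comparisons, a standard Wald interval on the log risk ratio, and the only difference being how many events there are to count:

```python
# Invented numbers, just to show how the interval width is driven by the
# number of events rather than by the size of the cohorts.
import math

def ve_with_ci(events_boosted, n_boosted, events_unboosted, n_unboosted, z=1.96):
    """Effectiveness = 1 - risk ratio, with a rough Wald CI on log(RR)."""
    rr = (events_boosted / n_boosted) / (events_unboosted / n_unboosted)
    se = math.sqrt(1 / events_boosted - 1 / n_boosted
                   + 1 / events_unboosted - 1 / n_unboosted)
    rr_hi, rr_lo = rr * math.exp(z * se), rr * math.exp(-z * se)
    return 1 - rr, 1 - rr_hi, 1 - rr_lo   # point estimate, lower, upper

# Symptomatic infection: thousands of events in each arm
print("symptoms        :", ve_with_ci(2000, 50000, 3500, 50000))
# Hospitalization: same cohorts, only a few dozen events
print("hospitalization :", ve_with_ci(20, 50000, 50, 50000))
```

With those made-up counts, the estimate for symptoms comes out around 43% with an interval only a few points wide, while the estimate for hospitalization comes out around 60% with an interval running from roughly 33% to 76%, even though the cohorts are identical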

> Your stance is that I'm wrong and that the error in my thinking is because of the superior statistical training that you have?

It's not a stance. Whenever statistics are involved, it's intrinsically harder to work with events that are rare, and it's something that's very intuitive to all statisticians. I've tried to explain why that's the case but it's useless if you don't listen

2

watabadidea t1_j62ylyf wrote

>I'm telling you that the answer you're looking for is statistical in essence, and that you cannot understand the answer if you do not understand the underlying statistical approach

That's fundamentally very different from what you actually said though.

You can't understand the answer without understanding the approach != If you were trained in statistics, you wouldn't need someone to explain the answer to you.

If you honestly can't see the different implications in those two statements, I'm not sure what to tell you.

>(note that you can never exclude extremely small deviations)

Again, this is fundamentally at odds with what you said previously. Based on your earlier claims, it is "always" easier to determine if die 1 is loaded than die 2. That means that by the time we have enough info to tell that die 2 is loaded, we should "always" have enough info to determine that die 1 is loaded, regardless of how extremely small the deviations are.

Is that really the conclusion that your statistical training leads you to? Or were you just making dishonest overgeneralizations to try to shut down questions you didn't like?

>So of course there are other factors involved,

Of course there are!

That's literally not what you said earlier though.

>...but statistical power is always hugely dependent on the raw numbers

Nobody is arguing against this. Your claim is that relative ease is "always" determined by this. Again, those are two massively different claims. I have a hard time believing that you don't see this.

>So it's the same order of magnitude as for the efficiency on symptoms

If the efficiency on symptoms is 48%, that is .48 or 4.8*10^-1 for an order of magnitude of -1. Using an order of magnitude of -1, we can have anything from 1.0*10^-1 to 9.9*10^-1. Stated another way, everything from 10% effective to 99% effective has the same order of magnitude as 48%.

It seems like you agree that this is too wide of a range to be useful in a practical sense.

>Whenever statistics are involved, it's intrinsically harder to work with events that are rare,

Well, this isn't true, though. Go back to the dice example. What if die 1 is rolled inside a completely black box that you have no ability to interact with? You have literally no way of observing (directly or indirectly) what the die lands on when it is rolled. On the other hand, die 2 is perfectly observable but is rolled 1,000 times less frequently than die 1.

Or what if you have a sensor to automatically measure and record what number you get for each roll? However, the sensor for die 1 is broken and always spits out a completely random result regardless of what the true result is on die 1. In contrast, the sensor for die 2 is known to be 100% accurate, but die 2 is rolled 1,000 times less frequently.

The idea that it is intrinsically harder to work with die 2 in these scenarios because the events are more rare is just flat out wrong.
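A toy version of the broken-sensor scenario, just to make it concrete (my own construction, obviously not something from the study):

```python
# Die 1 is heavily loaded and rolled a huge number of times, but the
# logged reading is pure noise, so the recorded data never reveals the
# bias no matter how large the sample gets.
import random

rng = random.Random(1)
true_p_six = 0.9          # die 1 actually lands on six 90% of the time
n_rolls = 1_000_000       # and gets rolled a million times

logged_sixes = 0
for _ in range(n_rolls):
    true_roll_is_six = rng.random() < true_p_six   # what actually happened (never logged)
    sensor_reading = rng.randint(1, 6)             # what the broken sensor reports
    logged_sixes += (sensor_reading == 6)

print("recorded share of sixes:", logged_sixes / n_rolls)   # ~1/6 regardless
```

A million observations, and the recorded frequency of sixes still tells you nothing about die 1, while a thousand honest observations of die 2 would.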

> ...and it's something that's very intuitive to all statisticians.

If the professors in your statistics program are telling you die 2 is harder to work with in those scenarios, you should probably ask for a refund.

>I've tried to explain why that's the case but it's useless if you don't listen

Have you considered the possibility that you don't actually know as much as you think you do?

0

Coquenico t1_j6f4w9z wrote

there's nothing I can do for you here. You need to read through a basic statistics book. Seems like you have some training in physics so hopefully the mathematical aspects won't be a problem for you

And stop believing people more competent than you on a subject are out to get you. I'm trying to explain in a few lines things you need several years of formal learning to fully understand, of course there are going to be many caveats. That doesn't mean I'm not doing my best to portray things honestly. Now if you don't want to trust me, well, as I said, learn statistics yourself

2

watabadidea t1_j6i1up3 wrote

>You need to read through a basic statistics book.

I think that this might highlight your problem here. This idea that it "always" comes down to which scenario has more frequent occurrences is exactly the type of dumbed-down, overgeneralized claim you'd find in a basic statistics book.

While it might be useful for discussing the issue with someone who has literally zero knowledge of the field, it has no place in a higher level discussion of real-world studies with other professionals.

Seriously, have you ever been involved in a real-world research study where you were going to have to collect a ton of data and then analyze it? I have. "How do we measure what's happening?" and "How do we know our measurements are accurate?" and "How do we reduce system complexity to isolate the thing that we are actually interested in?" are some of the very first conversations that we have.

If someone on the team responded by essentially saying:

>None of that matters. All we have to do is pick something that happens more frequently because that is always the answer.

...we are going to start wondering how this guy ever got on the team in the first place.

>And stop believing people more competent than you on a subject are out to get you.

It is more about people that:

  • Don't know anything about me.
  • Insist on repeatedly stressing how incompetent/untrained/unskilled I am, despite knowing nothing about me.
  • Use these repeated, baseless claims about my competency to dismiss my clearly accurate and legitimate criticisms/critiques out of hand.

If you are doing that, which you clearly are, you might not be out to get me, but you are certainly trying to shut down rational discussion.

>Now if you don't want to trust me, well, as I said, learn statistics yourself

Have you considered the possibility that I don't trust you and have learned statistics myself? You think that not properly considering this is a root cause of our disagreement?

>That doesn't mean I'm not doing my best to portray things honestly.

I mean, I don't know any professional in my field that would dismiss the importance of observability, measurement accuracy, system complexity etc. when attempting to perform statistical analysis of some event of interest.

You've dismissed the importance of these things over and over in this conversation by repeatedly stressing that it "always" comes down to how frequently something occurs. When I call you out, you resort to personal attacks on my understanding, training, and competency.

I'm not a mind reader so I can't say with 100% certainty what your motivation is, but it certainly doesn't seem like you are making a serious effort to engage honestly.

1

Coquenico t1_j6jaziu wrote

> I think that this might highlight your problem here. This idea that it "always" comes down to which scenario has more frequent occurrences is exactly the type of dumbed-down, overgeneralized claim you'd find in a basic statistics book

not at all; I'm not recommending a basic book because it will give you the answer you're looking for, but because it's where you need to start

> Seriously, have you ever been involved in a real-world research study where you were going to have to collect a ton of data and then analyze it?

it's my job

> Don't know anything about me.

it seems you have formal training in physics but not in statistics

> Insist on repeatedly stressing how incompetent/untrained/unskilled I am, despite knowing nothing about me.

you keep denying elementary statistical principles, so I assume that you don't have that knowledge

you keep failing to see the problem from a broad statistical perspective. That alone is proof of your incompetence, and why I recommended reading a basic book. You don't have the foundation to transfer your knowledge of physical data analysis to medical data analysis

1

watabadidea t1_j6jlqhh wrote

>not at all; I'm not recommending a basic book because it will give you the answer you're looking for, but because it's where you need to start

That implies that I have no "start" in understanding statistics. This is a baseless (and inaccurate) implication.

>it's my job

Ok, so apply that. If you get a real-world scenario that you are trying to analyze, you don't consider how observable it is? You don't consider how easily you can measure it? You don't consider how accurate your measurements are? You don't consider how complex the system is?

Instead, as long as it occurs "many" times more frequently than a problem that can be successfully analyzed with a high degree of accuracy, you "know" that this problem will be "easier"?

Seriously, there are instances where you can get statistically meaningful results with only a few dozen observations. 1,000 is certainly "many" more than that. Your stance is literally that you can model the most complex systems in the universe as long as they have happened at least 1,000 times.

Not 1,000 times that you've seen. Not 1,000 times that you can accurately measure. They just have to have happened 1,000 times, period.

Again, this assertion is just ridiculous on its face, yet that's what is suggested by your position. When I've pointed out that it is ridiculous, your go-to move is to resort to personal attacks.

>it seems you have formal training in physics but not in statistics

Those are the assumptions you've made. That's not the same as that actually being the case, nor is it the same as there being a logical basis to form that conclusion.

>you keep denying elementary statistical principles, so I assume that you don't have that knowledge

The idea that many more occurrences always makes one thing easier to analyze than another, regardless of relative observability, measurability, accuracy of measurements, system complexity, etc., is not an elementary statistical principle. Saying it repeatedly doesn't change the reality.

>you keep failing to see the problem from a broad statistical perspective.

Well, your claims aren't limited to a broad statistical perspective, though. When you claim that this is "always" the case and you make personal attacks on the knowledge base of anyone who disagrees, then you are pretty clearly taking the stance that it applies in any and all scenarios, including very specific circumstances.

1

Coquenico t1_j6jmfqu wrote

> Those are the assumptions you've made. That's not the same as that actually being the case, nor is it the same as there being a logical basis to form that conclusion

there's definitely a logical basis :) now of course, you're clearly not being honest with me, so I can only go on suspicions

1

watabadidea t1_j6jq3s6 wrote

Look in the mirror.

You're either pretending to have a job collecting and analyzing data or you are pretending to believe that you can easily reach statistically relevant results for any question of interest, as long as something has happened ~1,000 times, even if it is impossible to observe or measure these ~1,000 events.

Not only that, you claim that this is an "elementary statistical principle." Maybe you should pump the brakes on accusing others of not being honest here.

1

Coquenico t1_j6jt22j wrote

I've already given answers to these arguments. You're over-interpreting what I've said and have built a straw man that I won't bother taking down

if you want to believe you know, do just that

1

watabadidea t1_j6jtw3t wrote

>You're over-interpreting

Nope. You said "always." I called you out on that as being an overgeneralization that didn't hold water when applied to all specific instances. Your response was to make personal attacks about how I don't understand statistics.

>...and have built a straw man that I won't bother taking down

You didn't say "always"? You didn't push back and resort to personal attacks when I called you out on this being an overgeneralization?

Or are you saying that you agree that it was an overgeneralization, but you still personally attacked me for pointing it out?

1

Coquenico t1_j6jxzfu wrote

> of course there are other factors involved, but statistical power is always hugely dependent on the raw numbers

always is correct

my very first answer could have specified "always in epidemiology studies", but it was evident from context; unless you've forgotten what this discussion is about (which very much seems to be the case; at this point you just want to convince yourself that you are right to doubt the faithfulness of the original article and whoever defends it)

1

watabadidea t1_j6jzpaz wrote

>always is correct

Now you are just being disingenuous. You and I both know that this wasn't your first use of the word "always," nor was it the one I was referring to.

>my very first answer could have specified "always in epidemiology studies", but it was evident from context;

Really? Your very first answer included the following example:

>It's like if you're trying to check if two dice are loaded, but there's one die you can roll every few seconds and another you can roll only once every hour

The reality is that, unless you are suggesting that rolling dice is an epidemiology study, then the context clearly wasn't limiting your claim to epidemiology studies. At the very least, the context was applying your statement to dice rolls as well.

EDIT: Funny that you don't even attempt to address my claim (i.e., that, at the very least, the context of your example was meant to apply to epidemiology studies and dice rolls). Instead, you just made a reply that doesn't address this point and then blocked me.

1

Coquenico t1_j6k57l9 wrote

the metaphor is valid for epidemiology studies. at the core you're just tallying the chances of an objectively observable binary outcome in a series of predetermined groups

I'm not sure where your experiment of rolling infinitesimally loaded dice in a sealed black box is coming from, but it's so completely absurd and disconnected from the practical and theoretical considerations associated with epidemiology that I needn't comment on it

1