Is a statistician smarter than a fifth…

Mar 28, 2024

Take this quiz to find out.

14 Comments

Mar 28, 2024

Statistics gives us a framework for working with uncertainty in reality. The Gaussian distribution originated with Gauss wanting to measure the error in observations of celestial objects. In that context, we already have a way to observe reality: celestial movement can be verified. Unfortunately, many other phenomena where a probability distribution is assumed cannot be repeated or verified. The truth might be that statisticians cannot solve that problem. Introducing differentials, asymptotic analysis, or measure theory cannot solve this fundamental problem. It can even make the problem worse by introducing more assumptions. We cannot pretend that we know what we can't know by using sigma-algebras...

Rob Nowak

Mar 28, 2024

My point is just that we know that the standard deviation will decrease like 1/sqrt(n), so we can use this to help guide how we allocate samples going forward. If the first and third decks' empirical red proportions are separated by multiple standard deviations at some point, then we could stop sampling cards from deck 1 and focus just on decks 2 and 3. So, this is some justification for studying the standard deviation or confidence intervals of estimates.
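To make the 1/sqrt(n) argument concrete, here is a minimal Python sketch. It assumes each draw behaves like an independent coin flip with the deck's red proportion (reasonable for very large decks), uses the plug-in estimate sqrt(p(1-p)/n) for the standard deviation, and borrows the counts of 10, 20, and 50 reds out of 100 draws quoted elsewhere in this thread. The function names are illustrative, not from the post.

```python
import math


def std_error(reds: int, n: int) -> float:
    """Plug-in standard deviation of the empirical red proportion reds/n."""
    p = reds / n
    return math.sqrt(p * (1 - p) / n)


def separation_in_stds(reds_a: int, n_a: int, reds_b: int, n_b: int) -> float:
    """Gap between two empirical proportions, measured in standard
    deviations of their difference (assumes independent samples)."""
    gap = abs(reds_a / n_a - reds_b / n_b)
    combined = math.sqrt(std_error(reds_a, n_a) ** 2 + std_error(reds_b, n_b) ** 2)
    return gap / combined


# Counts quoted in the thread: reds seen in 100 draws from each deck.
n = 100
counts = {"deck 1": 10, "deck 2": 20, "deck 3": 50}

for name, reds in counts.items():
    print(name, "proportion", reds / n, "std", round(std_error(reds, n), 3))

# Quadrupling the sample size halves the standard deviation (1/sqrt(n)).
print(std_error(10, 100) / std_error(40, 400))

# How far apart are the candidate decks, in standard deviations?
print("deck 1 vs deck 3:", round(separation_in_stds(10, n, 50, n), 2))
print("deck 2 vs deck 3:", round(separation_in_stds(20, n, 50, n), 2))
```

With these counts the gap between decks 1 and 3 is several standard deviations, which is the kind of evidence that could justify spending the remaining budget elsewhere. How many standard deviations count as enough is a separate choice that the sketch does not make.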

Ben Recht

Mar 28, 2024

Yes, the notion that we *can* randomly sample does a lot of work there, but this is a good argument for intervals.

Non Linear Panacea

Mar 29, 2024

This kind of “thought experiment” is used to support many statistical arguments, but it oversimplifies the problem. Deck arguments like this are applied to estimating animal populations, which is fine. But don’t forget that most animals are not decks of cards: they move, are born, and die. They are also not celestial objects whose trajectories can be predicted through some Newtonian law. An outbreak of disease could destroy the power of an interval. Variance and mean give us little information about a dynamical system, even if we assume we know the pdf formula.

I suggest another kind of thought experiment as an educational supplement. One day, God gives his power to a statistician. The power is not as awesome as God's own: God knows the real law, the explicit form of all functions (including random variables) whose measurable spaces contain the whole multiverse.

With such power, the statistician knows the pdfs of all quantities in the Universe. One day a student comes to the statistician with three fairies. Those fairies can change between two colors: red and non-red. Over 10 minutes, the statistician notices:

fairy 1 has been red for 1 minute.

fairy 2 has been red for 2 minutes.

fairy 3 has been red for 5 minutes.

The statistician uses his super pdf power to explain to the student which fairy most likes to display red, along with the frequentist interpretation of the confidence interval.

“Sir, I am confused. Why don’t you just ask them?” the student replied.

Willy, son of Willy

Apr 25, 2024

For your question in the appendix see Laplace's rule of succession:

https://en.m.wikipedia.org/wiki/Rule_of_succession
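For readers who don't want to follow the link: with a uniform prior on the unknown success probability, the rule of succession estimates the chance that the next trial succeeds as (s + 1)/(n + 2) after s successes in n trials. A small sketch, with a hypothetical function name and made-up counts:

```python
from fractions import Fraction


def rule_of_succession(successes: int, trials: int) -> Fraction:
    """Laplace's estimate of P(next trial succeeds), assuming a uniform
    prior on the unknown success probability."""
    return Fraction(successes + 1, trials + 2)


# With no data at all, the estimate is an even bet.
print(rule_of_succession(0, 0))

# Made-up counts for illustration: 5 successes in 20 trials.
print(rule_of_succession(5, 20))
```

Unlike the raw frequency s/n, this estimate never returns exactly 0 or 1 from a finite sample.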

Rob Nowak

Mar 28, 2024

I agree with the post. Things get more interesting when you have multiple hypotheses. Suppose you have three very large decks of cards. You want to determine which deck has the highest proportion of reds. You draw 100 cards from each deck and see 10, 20, 50 reds, respectively. Now you are allowed to draw more cards from each deck, but you have a budget of 300 total. How should you allocate your budget?

Ben Recht

Mar 28, 2024

I still need more information before I can answer your question, no?

J Lee MD PhD

Mar 28, 2024 (Edited)

My first guesstimate, about what could be a Trick Question regarding allocating the budget of 300 new card draws, would be that the ULTIMATE sizes of the three new samples from the three "very large card universes" should be inversely proportional to the 10/20/50 red counts that we already have in hand. How about 120 extra for the 10 pile to give us 130 total cards, 110 for the 20 pile to give 130 total, and 70 for the third pile to give 120 total?

J Lee MD PhD

Mar 28, 2024 (Edited)

I'm no card-carrying Bayesian, but I must say that the *brilliant* 2021 text of Aubrey Clayton, "Bernoulli's Fallacy -- Statistical Illogic and the Crisis of Modern Science", should be on the book shelves of all persons reading this interesting post. I would argue that it's one-half of maybe THE most important reading assignment every serious thinker should tackle (the other half being Deb Mayo's 1996 text, "Error and the Growth of Experimental Knowledge").

Ben Recht

Mar 28, 2024

Maybe I should reread, but I found Clayton's fixation on the Frequentist vs Bayesian arguments caused him to not see that everyone is wrong about probability. That, to me, is a harder but more necessary case to articulate. Mathematical probability is rigorous, but there isn't a "right way" to apply it to reality in all cases. I'll keep working on fleshing out this view here.

But I haven't read that particular book by Mayo. Added it to my list.

J Lee MD PhD

Mar 28, 2024

Deborah Mayo is a definite genius and there are two books from her subsequent to the one mentioned. They are each exquisitely written in my opinion. She has a busy website that is a good "meeting place" for folks pumped up about the niche (?) field of Philosophy of Science. I agree that Professor Clayton seemed a wee bit too exercised in places about F vs. B wrangling that is now, what, at least 100 years old? However, I am not actually credentialed to make a truly authoritative evaluation (PhD organic chemistry and then MD and Board Certified General Surgery). As my sister told me, "Oh, you are *just* a general surgeon".

Ben Recht

Mar 29, 2024

Your sister is mean! I'd also argue that these probabilistic arguments are too important to defer to statisticians. Most certainly don't know more metaphysics than a fifth grader.

And I agree with your assessment of Mayo. Her blog is indeed full of great discussions.

Fırat Kıyak

Mar 31, 2024

Why do confidence intervals make no sense in the case of 21 cards? You can give a confidence interval for the total number of red cards. For example, this basic one works: let k be the number of red cards in the first 20. If k < 10, then the "interval" is {k}. If k = 10, then {10, 11}. If k > 10, then {k + 1}. The probability that the number of red cards is inside this interval is minimized when there are 10 or 11 red cards in total, and then you have a 10/21 probability of failure. For other cases it is much better: for example, when there are 5 red cards in total, the probability of predicting the correct number of cards is 16/21, and when there are 6 red cards in total, it is 15/21.

One interesting remark: if you want your confidence interval to consist of only a single number, i.e. you want to make a prediction, and if you also want a coverage probability of at least 50% for every possible choice of parameters, then no matter what comes up you need to flip a fair coin and announce one of the two consistent outcomes. Nothing does better than this.
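The coverage figures quoted for the 21-card interval can be checked by exact enumeration. The sketch below assumes a uniformly shuffled deck of 21 cards of which the first 20 are seen, so the single unseen card is red with probability (total reds)/21. The function names are illustrative.

```python
from fractions import Fraction

DECK_SIZE = 21  # assumption: 21 cards in the deck, 20 of them observed


def interval(k: int) -> set:
    """The comment's 'interval' for the total red count, given k reds seen."""
    if k < 10:
        return {k}
    if k == 10:
        return {10, 11}
    return {k + 1}


def coverage(total_reds: int) -> Fraction:
    """Exact probability that the interval contains the true total."""
    p_hidden_red = Fraction(total_reds, DECK_SIZE)
    prob = Fraction(0)
    # Case 1: the unseen card is red, so we observe k = total_reds - 1.
    if total_reds >= 1 and total_reds in interval(total_reds - 1):
        prob += p_hidden_red
    # Case 2: the unseen card is not red, so we observe k = total_reds.
    if total_reds <= DECK_SIZE - 1 and total_reds in interval(total_reds):
        prob += 1 - p_hidden_red
    return prob


for reds in (5, 6, 10, 11):
    print(reds, coverage(reds))

# Worst case over every possible deck composition.
print(min(coverage(r) for r in range(DECK_SIZE + 1)))
```

Under these assumptions the enumeration agrees with the comment: coverage is 16/21 and 15/21 for 5 and 6 reds, and the worst case is 11/21 (a 10/21 failure probability) at 10 or 11 reds.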

Diplo

Mar 29, 2024

Another fun post! "How many red cards do you think were in total in the deck?"

And then, "Beyond these facts, _I don’t know_ how the deck was assembled." I think we can question this "IDK". You say "we can all more or less agree this is a reasonable model". So, it's a thought experiment where we try to check what our reasonable intuitions are. My intuition is that there are two possibilities. Either the deck is approximately 50%-50% (e.g., we found a big pile of cards at the cottage and my cousin "expertly shuffled it") or it's not (e.g., my statistician friend is having fun with me; last year my cousin removed half of all cards of one color to make children's arts and crafts). I guess my reasoning is Bayesian. Personally, I find the first possibility quite plausible. I'm tempted to at least consider the possibility that at first the pile was approximately 500-500. If that's true, I know that 5-15 is unlikely. Without calculations I wouldn't know how unlikely, but I'd know it's somewhat unlikely (apparently ~4% for <6 of either color). In real life, I think it's very reasonable to put a big prior on 500-500. So, either around 500 or around 250. I think of this "thought experiment" linguistically and socially before thinking about the maths. In other words, are the cards drawn from a hypergeometric distribution, or from a normal pile of 30 old decks at the cottage? Now that I think of it, having 1000 cards at the cottage is unlikely; maybe at the community center where they play bridge.
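The "~4%" figure can be reproduced with a short hypergeometric calculation. The sketch assumes 20 cards drawn without replacement from a 1000-card deck containing exactly 500 reds, and reads "<6 of either color" as at most 5 reds or at most 5 non-reds. Those parameters are inferred from the numbers in the comment, and the function names are illustrative.

```python
from math import comb


def hypergeom_pmf(k: int, population: int, reds: int, draws: int) -> float:
    """P(exactly k reds) when drawing without replacement."""
    return comb(reds, k) * comb(population - reds, draws - k) / comb(population, draws)


def prob_lopsided(population: int = 1000, reds: int = 500,
                  draws: int = 20, threshold: int = 5) -> float:
    """P(at most `threshold` cards of either color in the sample)."""
    few_reds = sum(hypergeom_pmf(k, population, reds, draws)
                   for k in range(0, threshold + 1))
    few_others = sum(hypergeom_pmf(k, population, reds, draws)
                     for k in range(draws - threshold, draws + 1))
    return few_reds + few_others


print(f"{prob_lopsided():.4f}")
```

Under these assumptions the probability comes out at roughly 4%, in line with the comment's estimate.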

arg min
