9 Comments
David J. Hunter

I know Hoekstra says it’s wrong, but #5 is fairly standard shorthand for “The interval [0.1, 0.4] was produced using a procedure such that, if we were to repeat the procedure over and over, then 95% of the time the confidence intervals produced would contain the true value of q.” It’s in most introductory stats books, presumably as a convenient way to point toward the correct interpretation.

Ben Recht

The full statement you wrote is correct, but Hoekstra et al.'s point is that the shorthand is confusing: the statement refers to the returned boundaries and not to the procedure. So in spirit, it's not totally wrong as shorthand, but logically speaking, it's fallacious.

Now, I agree that this sort of statistical pedantry is often unhelpful. But I find it useful to point out how we've invented a cookbook of statistics where at least 95% of the chefs are confused about their recipes.
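The "repeat the procedure over and over" reading can be checked with a small simulation. This is only a sketch: the true value q = 0.25, the sample size, the trial count, and the Hoeffding-style interval are all assumptions made for illustration, not the procedure behind the [0.1, 0.4] interval discussed above. The function names `hoeffding_interval` and `coverage` are hypothetical.

```python
import math
import random


def hoeffding_interval(samples, alpha=0.05):
    """Return a (1 - alpha) interval for the mean of 0/1 samples.

    Assumption: uses the Hoeffding bound, which is conservative,
    so the long-run coverage is at least 1 - alpha.
    """
    n = len(samples)
    mean = sum(samples) / n
    half_width = math.sqrt(math.log(2 / alpha) / (2 * n))
    return max(0.0, mean - half_width), min(1.0, mean + half_width)


def coverage(q, n, trials, alpha=0.05, seed=0):
    """Fraction of repeated experiments whose interval contains q.

    q is fixed (deterministic); only the samples, and therefore the
    interval endpoints, change from one repetition to the next.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        samples = [1 if rng.random() < q else 0 for _ in range(n)]
        lo, hi = hoeffding_interval(samples, alpha)
        hits += lo <= q <= hi
    return hits / trials


if __name__ == "__main__":
    # Hypothetical settings chosen only for illustration.
    print(coverage(q=0.25, n=100, trials=2000))
```

The printed fraction describes the procedure across many repetitions. Any single returned interval either contains q or it does not, which is why the shorthand about one specific interval is logically off.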

T Coddington

But Tampa would have been kicking the extra point in an indoor stadium. Surely the probability of a successful kick is higher indoors? 😜🤣

Ben Recht

Love it. The point estimate is higher, but the error bars are wider.

empiko

Why is #4 not correct? Is there an intuitive explanation? How is it different from the quote above: "95% of the true value of q would lie inside the interval C(X)"?

Ben Recht

I agree it's confusing! It's because "q" itself isn't a probabilistic entity. At least in the weird frequentist framework that confidence intervals come from (the Monte Carlo Algorithm framework), q is a deterministic quantity, so assertions about p(q) don't make sense.

Philipp Renz

Isn't the question of whether a coin is biased ill-posed in the first place? The coins that I've seen are certainly not perfectly symmetrical. P(heads) might be close to 0.5, but it is most certainly a tiny bit off.

So we already know that the coin is biased and there really is no need to do any kind of statistical analysis.

Ben Recht

Right, I sort of talk about this in today's blog. The probabilistic null hypotheses are often very suspect.

Thomas Dent

Your argument for 5-nines confidence sounds awfully like the one CERN et al. use for '5 sigma', i.e., at this point one can stop arguing about pure statistical issues (and move on to arguing about systematics ...)
