Difficult Statistics

A Tribute to David Freedman

Jul 07, 2023

A decade ago, I was at a campus reception where a wine-soaked physicist confidently declared there are two types of statisticians: Bayesians and Frequentists. This dichotomy is undoubtedly conventional wisdom, and many popularizations of statistical thinking cast these two viewpoints as armies at war, each side religiously believing in the right way to model reality with probability distributions. The physicist turned to Philip Stark, then chair of the Statistics department, and tried to gauge his allegiance, asking “Philip, what kind of statistician are you?” Philip grinned wryly and quipped “Difficult.”

It took me several years to understand what Philip meant, and I didn’t figure it out until David Blei recommended I read David Freedman. I started with Freedman’s article “Statistical Models and Shoe Leather” and, hooked by his hard-boiled prose, proceeded to read the entirety of “Statistical Models: Theory and Practice” and “Statistical Models and Causal Inference: A Dialogue with the Social Sciences.” Freedman was the quintessential Difficult statistician, highlighting countless misinterpretations of observational data and misapplications of statistical techniques in brusque, tactless narrative.

Now, you’d think Freedman must have some affinity for statistical modeling since he used the term in the title of every book. But no! Freedman hated statistical models. He argued that statistics was a tool to help guide experiments and that sophisticated statistical methods could not fix poor data. He found example after example where statistical methods and models were not justified when applied. In such cases, he would declare “the evidence is weak.” The inferences we could draw from such modeling are on the shakiest of ground.

Freedman’s writing helped me crystallize not only my own discomfort with statistical modeling but also my discomfort with how probability was abused more broadly. First I started seeing cracks in the logic of machine learning. Then I started seeing cracks everywhere. But once you cross the Rubicon into Philippe Lemoine’s “Science Isn’t Real” cult, what do you do? I guess, in classic internet fashion, you Post It To Your Blog.

At least for starters, that’s what I’m doing here. This substack is a tribute to David Freedman. I’ll directly discuss some of his arguments in future posts. I’ll also highlight some of my other favorite critics (Cartwright, Collins, Gigerenzer, Daston, Meehl, Leamer, Pinch, …). But I also want to understand why these criticisms of statistical practice (which are 50 years old now) fail to gain traction. Everyone knows the Ritualized Null-Hypothesis Significance Test is nonsense. Why can’t we communally kick the habit?

Finally, I want (and hope) to take this substack a step beyond critique. My current research asks “what do you do after abandoning statistical models?” So a secondary thread will be about probability as a measurement device rather than a descriptive tool of phenomenology. Statistical methodology is useful even if one never builds a statistical model of the natural world.

Damek Davis

Jul 7, 2023

"probability as a measurement device" as in "# of samples we need to estimate 'x'?"

i'm not happy about asking here

Lior Fox

Jul 8, 2023

In the previous post with the regression example, you were criticizing (if I understand correctly) the interpretation of statistical models as "mechanistic" (or "causal") models. Here it seems you argue against their interpretation (or even usage?) as phenomenological models as well?

Surely a phenomenological model alone without _any_ accompanied mechanism is something very limited, but there could still be value in it sometimes.

Anyways, looking forward to the following posts; it sounds like this is going to be most interesting and relevant.

arg min
