Debate me bro
Statistical inference is rhetoric.
A recurring theme on the blog is how statistics can be used in many different ways: prediction, rule-making, simulation, exploratory data analysis, algorithms. Any time we want to summarize counts, statistics lurks around the corner. Today I want to loop back to something I’ve said a few times on the blog but need to reiterate in light of last week’s post. The most controversial, but perhaps most common, use of statistics is in rhetoric.
That statistics are frequently deployed as rhetoric is not controversial. Mark Twain famously threw statistics in the rhetoric bucket over a century ago, alongside its friends lies and damned lies. In his 1988 essay, “Social Science as Moral Theology,” Neil Postman lamented how much more authority we have ceded to numbers since Twain’s time, yielding a veneer of science atop moral storytelling. As Postman quips, counting things doesn’t make you a scientist. It is only when you speak of counts in the right language that you are granted authority. It’s not hard to examine the rule books of scientific validity and realize they are style guides, not unimpeachable tools of divination.
Fisher’s Design of Experiments is best read as a style guide for endless argument. Fisher loved to argue with people! Paraphrasing some of his greatest hits: “You tasted 8 cups correctly, but the p-value of random guessing is 1/70; I remain unconvinced!” “Here are another dozen reasons why your data associating cancer with smoking is unconvincing.” “Let me convince you that some races are less intelligent than others.” Most of these arguments don’t hold up, but Fisher was the master of constructing statistical rhetoric to protect his beliefs and values.
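Fisher’s 1/70 is just a count of arrangements: in the tea-tasting design there are C(8, 4) = 70 equally likely ways to pick which four of the eight cups got milk first, so a perfect guess under pure chance has probability 1/70. A minimal sketch of the arithmetic:

```python
from math import comb

# Eight cups, four with milk poured first: the taster must identify which four.
arrangements = comb(8, 4)           # number of equally likely ways to choose 4 of 8
p_perfect_guess = 1 / arrangements  # chance of getting all 8 right by pure luck

print(arrangements)     # 70
print(p_perfect_guess)  # ≈ 0.0143
```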
Statistics becomes rhetorical when we deploy it to convince people to believe claims based on counts. The question raised by the last few posts is whether we can transmute counts into knowledge. “How many times do I have to see something to believe it?” This is the problem of induction. As we’ve known since David Hume cursed Western philosophy, there is no logical means of answering this question and justifying beliefs from past observations. Sorry! All we can do is argue about it.
In “A Bureaucratic Theory of Statistics,” I call the confusing collection of probabilistic approaches to the problem of induction “post-hoc inference.” I use the term post hoc because it occurs after data collection. I use the term inference because the techniques work by relating a sample to a data-generating distribution, which is what most statisticians think of as statistical inference: error bars, confidence intervals, p-values.[1] In class, we are taught that the tools of statistical inference are the best way to keep from being fooled by our common sense, that they are an algorithm for growing our personal knowledge. But when people deploy the tools of statistical inference, they are not trying to convince themselves. They are trying to convince everyone else.
Post-hoc inference sets formal rules for debate about historical counts. One simply needs to learn the proper techniques of parsing CSV files with R or Stata, or at least how to prompt Claude Code to do the parsing for you. From this ruleset, you can computationally generate pedantic arguments about why your statistics should convince others, why your counts should trump their beliefs. Just remember that I, like Ronald Fisher, may never be convinced by your argument.
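To show how mechanically such an argument can be generated, here is a one-sided binomial test written from scratch and run on made-up counts (the 60-of-100 figures are hypothetical, invented purely for illustration):

```python
from math import comb

def binom_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the one-sided p-value for k hits in n trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical counts: 60 heads in 100 flips, offered as evidence the coin is biased.
p_value = binom_tail(60, 100)
print(round(p_value, 3))  # ≈ 0.028, "significant" by the usual 0.05 convention
```

Whether 0.028 should actually convince anyone of anything is, of course, exactly the rhetorical question.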
Some diehards might say, “Despite their flaws, there exists no better set of techniques for learning from numerical evidence than Bayesian and frequentist statistics.” People committed to this view need to get out more. There are plenty of other ways that people make sense of data without statistics, even in human-facing science. The last couple of posts discussed several examples. Moreover, as I and an overabundance of critics have chronicled, statistics more often than not confuses rather than enlightens when carted out as evidence for claims.
There is no algorithm for scientific inference. You don’t have to go full-on Feyerabend to believe that science is a messy, sociological system for cultivating expertise, with many different methods for presenting, discussing, and interpreting empirical facts. There is no universal scientific method, and that’s fine. Post-hoc inference has proven useful as a language for scientific debate in many contexts. It’s never dispositive, but no rhetoric ever is.
Moreover, rhetoric also helps explain why statistics functions so well in bureaucratic contexts. Formal rules facilitate the creation of evidence standards. These just need to be good enough to move forward, not good enough to convince everyone. We can then build a regulatory system that requires clearing that statistical bar. The language of statistical inference functions like legal writing.
I want statisticians and data scientists to be more honest and explicit about the rhetorical role of statistical inference. No one in introductory statistics tells you that they are teaching a language of persuasion with numbers. They should! Let’s admit that’s what statistical inference is. Sure, statistics can be shoveled into a tier of bullshit below lies and damned lies. But statistical inference can also be a powerful formal language for storytelling and persuasion.
Statistics has many uses. Debate strategy is just one of them.
[1] In All of Statistics, Larry Wasserman has a much broader definition of inference, but I think it’s better for statistics if we separate this hypothesis testing stuff from actually useful things like machine learning, exploratory data analysis, and randomized algorithms. One way to secure a big tent for statistics is to embrace the diversity of meaning in its multitudinous applications.


The idea of statistics (and other aspects of scientific process and communication) as rhetoric, in the sense of formally structured argument intended to persuade, is one that I like as one perspective on the goals of inference, though not the only one. For one, some statistical procedures play a role in automated systems with no human audience at all (this is often "machine learning"), and sometimes people do learn and adjust their own views in response to quantitative results (sometimes this is "exploratory data analysis", though not always via the specific methods that usually get that name).
When I taught graduate Econometrics, I liked to devote just a tiny sliver of the first lecture to alternative interpretations, before plunging students into an otherwise standard course on probability theory and matrix algebra: see roughly minutes 12-18 of a recording https://www.youtube.com/watch?v=h727zDsAy1Q&t=1s
In it, for the idea of modeling as rhetoric, I cite in particular Deirdre McCloskey's 1998 "The Rhetoric of Economics", which also includes a classic bit on p-values that I think fits quite well with the perspective you take here. (For propriety, I fail to mention that I find her, stylistically, one of the most grating and painful-to-read authors in economics, though the points still stand.) Looking more at the rhetoric of quantitative theoretical models, I also point to Ariel Rubinstein's excellent "Economic Fables". Rubinstein's student Ran Spiegler has a more recent book in the same vein, written in a literary fashion but drawing on what is now an active literature on models of learning from, and persuading people with, models. This area (with authors like Spiegler, Philipp Strack, etc.) builds on an earlier literature that did the same with formal decision-theoretic models of learning, but lately incorporates various behavioral features, since it is hard to explain much of scientific practice as rational learning. Of course, how convincing you find these models of being persuaded by models will depend on how persuasive you find models of that kind, so I believe there are opportunities for infinitely more layers of turtles.
The parallel to legal writing is spot on. Both systems create artificial consensus thresholds that feel objective but are fundamentally arbitrary. I spent a few years working with clinical trial data and saw how p = 0.05 became this ritual barrier, when really it was just convenient bureaucracy. What's interesting, though, is that once we call it rhetoric, we risk making statistics seem optional in contexts where it genuinely adds rigor to debates that would otherwise devolve into pure narrative warfare.