The idea of statistics (and other aspects of scientific process and communication) as rhetoric, in the sense of formally structured argument with the intention to persuade, is one I like as one perspective on the goals of inference, though not the only one. For one, some statistical procedures play a role in automated systems with no human audience at all (this is often "machine learning"), and maybe people just sometimes learn and adjust their own views in response to quantitative results (sometimes this is "Exploratory Data Analysis", though not always via the specific methods that get called that).
When I taught graduate Econometrics, I liked to devote just a tiny sliver of the first lecture to alternative interpretations, before plunging students into an otherwise standard course on probability theory and matrix algebra; see roughly minutes 12-18 of this recording: https://www.youtube.com/watch?v=h727zDsAy1Q&t=1s
In it, for the idea of modeling as rhetoric, I particularly cite Deirdre McCloskey's 1998 "The Rhetoric of Economics", which also includes a classic bit on p-values that I think fits quite well with the perspective you take here. (For propriety I fail to mention that I find her to be, stylistically, one of the most grating and painful-to-read authors in economics, though the points still stand.) Looking more at the rhetoric of quantitative theoretical models, I also point to Ariel Rubinstein's excellent "Economic Fables". Rubinstein's student Ran Spiegler has a more recent book in the same vein, written in a literary fashion but drawing on what is now an active literature on models of learning from, and persuading people with, models. This area (by authors like Spiegler, Philipp Strack, etc.) builds on an earlier literature that did the same with formal decision-theoretic models of learning, but lately incorporates various kinds of behavioral features, since it is hard to explain much of scientific practice as reflecting rational learning. Of course, how convincing you find these models of being persuaded by models will depend on how persuasive you find models of that kind, so I believe there are opportunities for infinitely more layers of turtles.
Always more turtles! But we need some language of persuasion to move forward collectively, be it in research, governance, or other contexts that necessitate participatory decision making.
My one minor point of terminological disagreement is that I would not call EDA and ML statistical inference. This is me breaking with Wasserman. For, um, rhetorical purposes, I like the dappled view, keeping the many local applications of statistics separate.
The parallel to legal writing is spot on. Both systems create artificial consensus thresholds that feel objective but are fundamentally arbitrary. I spent a few years working with clinical trial data and saw how p=0.05 became this ritual barrier, when really it was just convenient bureaucracy. What's interesting, though, is that once we call it rhetoric, it risks making stats seem optional in contexts where it genuinely adds rigor to debates that would otherwise devolve into pure narrative warfare.
Yes. It's funny how if you call it "rhetoric" it feels optional, but "logic" and it feels mandatory.
I'm teaching Detection and Estimation this semester (which I've taught off and on for the last 10 years). It's an EE class at heart, but with all of the very dodgy statistics carted out for ML applications, what I feel I've had to emphasize more over the years is that all the models we use are gross oversimplifications and ultimately an ansatz. This might be safe for, say, communications (yeah, noise isn't additive or Gaussian, but the stuff we build on those assumptions works in practice), but not for most other applications.
Not sure if it gets through...
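To make the point concrete, here's a minimal sketch of the kind of mismatch I have in mind (my own toy illustration, nothing from the course or the post): a sign detector whose threshold is justified under an additive-Gaussian-noise assumption, run against heavy-tailed Laplacian noise of the same variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Binary antipodal signaling: transmit +A or -A, decide by the sign of y.
# The sign rule is the optimal detector under additive Gaussian noise; we
# also run it against Laplacian noise of the same variance to see how the
# Gaussian-based design fares when the model is wrong.
A = 1.0
sigma = 1.0
n = 200_000

bits = rng.integers(0, 2, n)
s = A * (2 * bits - 1)

noises = {
    "Gaussian": rng.normal(0.0, sigma, n),
    "Laplacian": rng.laplace(0.0, sigma / np.sqrt(2), n),  # scale b: var = 2b^2
}

for name, noise in noises.items():
    y = s + noise
    decisions = (y > 0).astype(int)  # Gaussian-derived sign test
    ber = np.mean(decisions != bits)
    print(f"{name:9s} noise: bit error rate = {ber:.4f}")
```

For antipodal signals with symmetric noise the sign rule happens to stay sensible, which is exactly the communications story; the fragility shows up once the statistic or the application is less forgiving.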
It's hard! I struggle with exactly the same problems when teaching machine learning. Shannon and Wiener cursed us with heavy stochastic baggage.
Shannon at least had the bandwagon paper, which I trot out now as some OG curmudgeonliness that was not wrong.
Worth pointing out that Feyerabend was playing the same game as Fisher. His anti-method wasn't the result of deep reasoning but of the fact that "alternative" medical theories and treatments, to which he was committed, always bombed out in statistical testing.
Come now, that is a rather uncharitable reading of Feyerabend. His critiques in Against Method were leveled at far "harder" science than medicine.
He presented the attack in terms of scientific methodology in general, but medicine was the real target and (AFAICT) the main area where he has had continuing influence, for example https://www.sciencedirect.com/science/article/abs/pii/S1369848613000733
For example, while there's still debate among philosophers of science about his defence of astrology, neither astronomers nor astrologers pay any attention to it, AFAICT.
The rules of statistical inference as outlined (discovered?) by Bernoulli are remarkable. If you read his work in some depth, he talks about the idea of "moral certainty" (which we would translate today as a confidence level with many nines).
Of course, the 95% confidence interval is most common today - and while it is deeply misunderstood, it is most prevalently cited by nonexperts, I believe, in the field of polling.
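To spell out the arithmetic (with made-up poll numbers, purely for illustration): here is the standard 95% polling interval, plus the sample sizes a Bernoulli-style "moral certainty" would demand, via the conservative Hoeffding bound.

```python
import math

# A typical poll: n respondents, observed proportion p_hat (made-up numbers).
n = 1000
p_hat = 0.52

# Standard 95% normal-approximation interval: p_hat +/- 1.96 * SE.
se = math.sqrt(p_hat * (1 - p_hat) / n)
print(f"95% CI: {p_hat:.3f} +/- {1.96 * se:.3f}")

# Bernoulli-style 'moral certainty': how large must n be so that
# P(|p_hat - p| > eps) < delta?  Hoeffding's inequality gives the
# conservative bound n >= ln(2/delta) / (2 * eps^2).
eps = 0.02  # within two percentage points
for delta in (0.05, 1e-3, 1e-6):  # 95%, three nines, six nines
    n_req = math.ceil(math.log(2 / delta) / (2 * eps**2))
    print(f"confidence {1 - delta:g}: n >= {n_req}")
```

At this margin, going from 95% to "many nines" costs roughly a factor of four in sample size, which is part of why the 95% convention won out.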
Probability itself is a VERY new field, and far less developed than other "math" despite being simpler on its face. I describe probability as the logic of statistics: "what does this data say?" (the statistic) seems easy, but probability asks "what does this data mean?" - and that logic is often poorly applied: say and mean are often conflated.
In response to "lies, damned lies, and statistics" I say:
Statistics do not lie, but people who do not understand statistics are easily misled.
I'd be glad to debate you, bro... but here I speak to your readers.
When a method does genuine inferential work and yields warranted inferences, it not only is persuasive—it deserves to be. Recht’s reasoning, however, flattens this distinction, making it all too easy to dismiss such methods as merely “rhetorical.” Although his position is equivocal enough (as in his bureaucratic view of statistics), the overall impression is that in his view statistical inference is little more than officially sanctioned snake oil. The fallacy here is a failure to ask why some methods deserve to be persuasive and others do not. In the case of statistical inference, when correctly applied, the answer is straightforward: it earns its persuasiveness by exposing itself to stringent criticism and severe testing. Recht’s argument would undercut the central task of statistical science—namely, distinguishing warranted inferences and sound methods from bad but persuasive ones: those built on biased selection, missing error estimates, charming anecdotes, hasty generalization, or a convenient blindness to alternative explanations of the data.
Worse still, by portraying statistical inference as post hoc—entering only after data collection—Recht overlooks the single most important achievement and ongoing task of error-statistical inference: planning and experimental design. It is this forward-looking design perspective, advanced by Fisher and by Neyman and Pearson, that makes error estimation possible and provides the basis for the critical scrutiny that moves us beyond rhetoric. Fisher was explicit about this point: his insistence on randomization, he explains, was motivated by a desire to stop 'important' critics from discounting experimental results as the product of poor controls or of rhetorical tricks dressed up as science. Recht seems to be suggesting either that that’s all statistical inference is, or perhaps that’s all science is. There may be other reasons he is so keen to derogate statistical inference...
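To make the design point concrete, here is a minimal permutation-test sketch (toy numbers, purely illustrative): because treatment labels are assigned at random, reshuffling them generates the null distribution of the statistic directly, so the reported error probability is grounded in the design itself rather than in rhetoric.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy outcomes from a small randomized experiment (made-up numbers).
treated = np.array([5.1, 6.0, 4.8, 5.9, 6.3, 5.5])
control = np.array([4.7, 5.2, 4.9, 5.0, 4.6, 5.1])
observed = treated.mean() - control.mean()

# Fisher's logic: under the null of no treatment effect, the random
# assignment is the only source of variation, so re-randomizing the
# labels reproduces the null distribution of the test statistic.
pooled = np.concatenate([treated, control])
n_t = len(treated)
reps = 100_000
null = np.empty(reps)
for i in range(reps):
    perm = rng.permutation(pooled)
    null[i] = perm[:n_t].mean() - perm[n_t:].mean()

p_value = np.mean(null >= observed)  # one-sided
print(f"observed difference = {observed:.3f}, p = {p_value:.4f}")
```

The error probability here comes from the physical act of randomization, not from any distributional assumption; that is the kind of warrant that separates severe testing from mere persuasion.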
Maybe you don't necessarily need to go full Feyerabend, but if you've already been radicalized by years of reading arg min, why not go all the way?
Nearly everything in the world created by humans involves rhetoric, because most things created by humans for other humans to use require communication, justification, and argument.