For All The Cows
Steampunk Data Science: An afterword and some thoughts on Cyberpunk Data Science.
This is the final part of a blogged essay “Steampunk Data Science.” A table of contents is here.
To be a practicing scientist, you need to maintain a state of vigilant cognitive dissonance. If you get too deep into the history and philosophy of your discipline, it becomes very hard to keep writing papers. A little perspective reveals the absurdity and irrationality of the system. And once you see it, you can’t unsee it. Participating requires a lot more effort.
Of course, I speak from experience. Here’s my super brief history. My research from 2015-2018 convinced me that machine learning was lying to itself about its foundations. I went to other disciplines for answers. I first looked to statistics and disappointingly found its epistemology even more confused than machine learning’s. David Blei sent me David Freedman’s Statistical Models and Shoe Leather, and I became a Freedman completionist. The last chapter of Freedman’s posthumously collected works, Statistical Models and Causal Inference, is a sequel to this famous Shoe Leather essay. It details several examples of “scientific shoe leather,” and devotes a page to the history of vitamins. I was a bit surprised that he wrote a lot about Eijkman, but didn’t mention Wisconsin. I wondered what else was missing in this summary. And here we are, six years later.
Though I wrote the first draft of this essay four years ago, I have never been able to figure out what to do with it. Initially, I thought it would be part of The Irrational Decision, but that project took me in a very different direction. The Irrational Decision is about science, engineering, and decision-making after the computer. This story was about how we did things before the standardizing forces of the computer and statistics. It didn’t cleanly fit, and I ended up jettisoning this chapter.
As a standalone piece, Henry Farrell warned me that it lived in an uncanny valley. The writing wasn’t academic enough to get published in a journal or pop enough to appear in a magazine. But you know where that sort of stuff can go? Directly to you, my argmin readers. Maybe this blog is the uncanny valley between NeurIPS and The New Yorker. I should make that the masthead.
Whatever the case, this piece not finding an external home is fine. Writing through a project is an exercise in itself. I finished this essay before I started substacking, and the recurring themes on here are built upon my research and writing about vitamins. This unpublished project laid the foundation of a research program. We’ll see where it ends up.
Now, for all of you scientists reading this, I’m not suggesting you follow me down this path. They’re uncommon, but scientific breakthroughs still happen, and looking for them can be thrilling. Keep looking.
The most dramatic plot in this essay is F. Gowland Hopkins’ crossover curves, in which he showed that some factor in milk was needed to sustain the growth of rats.
You might look at that plot and say, “Gee, it would have been nice to be a scientist back in 1900, as all of that low-hanging fruit is gone.” You might think that we never find such clear success stories in our modern, complex world. This couldn’t be further from the truth. I mean, this plot is from 2021:
This plot graphs the effect of starting and stopping semaglutide (Ozempic). The evidence of a revolutionary breakthrough in weight management is a plot of average weight over time, drawn with a visualization scheme identical to F. Gowland Hopkins’ plots from a century earlier.
Yes, interventions like GLP-1 agonists come along rarely. But they do come along!
And for all of our results that are not robust and large? We should accept that, though they are likely illusory, the many small results help us build a scientific record. They form a scattered pile of puzzle pieces that we can all try to assemble. It’s only from the incremental pieces that aren’t precisely reproducible or clean that we can see the big picture. Don’t worry about the distribution of p-values. Do think hard about how to produce reproducible data and coding artifacts.
I’ll repeat what I said yesterday. The most important thing we can learn from the discovery of vitamins is that discovery starts with a mess. It’s in learning from the mess that we find the undeniable effects that completely transform our understanding.
Oh, one more thing. You might be wondering what happened to those cows in the Single-Grain Experiment. The cows were fed their monotonous diets for years, and the research team closely monitored the cows’ growth and health. They weighed each animal monthly and took a photograph once every six months. The cows all grew at similar rates, but there were noticeable differences in appearance, offspring, and milk production.
The corn-fed animals looked smooth of coat, fuller through the barrel; and as expressed by experienced feeders and judges of domestic animals, they were in a better state of nutrition. On the other extreme stood the wheat-fed group with rough coats, gaunt and thin in appearance, small of girth and barrel, and to the practiced eye, in rather a lower state of nutrition.
While those fed on corn gave birth to healthy calves, the wheat-fed cows’ offspring all died within a day. The milk of the cows had different fat contents depending on the diet. The corn-fed had somewhat less fat than the oat-fed, but the wheat-fed had almost no fat content whatsoever. Wheat alone was a perplexingly poor diet for cows.
Elmer McCollum was no dummy. He and Marguerite Davis began publishing papers about their rat colony well before the Single-Grain Experiment had finished. Hart eventually terminated the Single-Grain Experiment in 1911, and the team published their results in the Proceedings of the National Academy of Sciences, several years after McCollum and Davis announced their discovery of Vitamin A and confirmed the existence of Vitamin B.[1]
The Wisconsin researchers never figured out what was wrong with the wheat. Following McCollum and Davis’ discovery of Vitamin A, they tried adding butter to the rations, but this didn’t seem to improve the cows’ health. Their most likely hypothesis was that there was something toxic in their wheat ration. In his autobiography, McCollum later reflected that the harvested wheat itself was just of poor quality. Due to the way they grew, processed, and stored the wheat on the Madison campus, the cows ended up being fed only wheat grain and straw.
Had the cows eaten their full quota of leaf, as the corn- and oat-fed animals did, they would not have been in such poor nutritive condition. Through four years we had been inexcusably uncritical of some important details.
The experiment, though wildly influential, yielded nothing but inconclusive results.
I’d like to thank Mihaela Curmei, Jessica Dai, Shamik Dasgupta, Henry Farrell, Sara Fridovich-Keil, Paula Gradu, Chris Harshaw, Lauren Kroiz, Kevin Munger, Deb Raji, Philippe Rigollet, Scott Shenker, and Chris Wiggins for many helpful comments and suggestions. Special thanks to the students in the 2024 Spring seminar “The Philosophy and History of Automated Decision Making,” who participated in a lively discussion about an earlier draft of this essay.
[1] Even back then, PNAS was the journal for articles “Previously rejected from Nature And Science.”




