Individual Experience vs. The Cochrane Review
On my decade-long exploration seeking a scientific language for singular evidence.
I had a bit of a throwaway line in the last post about how maximizing the welfare of populations requires the erasure of individuals. You might ask why. Individuals are units in a broader population. A population is a group of individuals. Surely improving welfare at scale must improve the welfare of the units.
Except we all know this isn’t true. Maximizing averages doesn’t say anything about the outcomes of any particular individual. In fact, decisions that maximize averages often harm some of the individuals in the total sum. Individuals in a community always have some shared interests, but they have plenty of disparate interests too. Which interests are maximized is a political decision that necessarily leaves other interests neglected. Our metrics and measures can have broad societal value while still making many unhappy.
And how should those individuals make decisions about their own lives? You may be able to convince yourself that Mathematical Rationality makes sense for a bureaucratic state or company. However, it’s much harder to make the case for being a mathematically rational individual.1 Quantification, abstraction, hierarchy, and statistics can help organize and steer decision making at scale. But if one of the core goals of quantification is legibility for intersubjectivity, why do you need numbers to make sense of your personal experience? Why is it useful to see yourself like a state?
Nothing motivates my research more than this tension between the population and the individual. It’s been my main focus since 2020.2 But getting traction on these topics has been an uphill battle. Try telling someone in the human-facing sciences that you want to study the epistemology of case studies. It’s so easy to fall into the cracks of crankhood.
Now, of course, scientists are incapable of seeing the pure irrationality of science. Blindly applying population results to individuals requires a lot of faith. We have formal scientific language to understand population averages. This language is incoherent when directed back towards individuals. Take our gold standard of causal inference, the randomized controlled experiment. These trials can estimate the fraction of people in an experimental cohort who would benefit from taking a drug. But let’s say you do a trial with 600 people and find that 20% of the control group has a bad outcome, and 10% of the treatment group has a bad outcome.3 What does this say about my outcome? Unfortunately, that result alone says nothing. We’d like to argue that the intervention reduces my risk by a factor of 2. But what even is individual risk?
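To make the arithmetic of that hypothetical trial concrete, here is a minimal sketch in Python. It assumes the 600 people are split evenly into two arms of 300, and it uses a pooled two-proportion z-test as a stand-in, since the post doesn't say which test sits behind the footnoted p-value. The function name `two_arm_summary` is made up for illustration.

```python
import math

def two_arm_summary(n_control, bad_control, n_treat, bad_treat):
    """Cohort-level summaries of a two-arm trial with a binary bad outcome."""
    risk_c = bad_control / n_control   # fraction of bad outcomes, control arm
    risk_t = bad_treat / n_treat       # fraction of bad outcomes, treatment arm
    arr = risk_c - risk_t              # absolute risk reduction
    relative_risk = risk_t / risk_c    # "reduces risk by a factor of 2" means 0.5

    # Pooled two-proportion z-test (an assumed choice of test).
    pooled = (bad_control + bad_treat) / (n_control + n_treat)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treat))
    z = arr / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail

    return {
        "arr": arr,
        "relative_risk": relative_risk,
        "nnt": 1 / arr,  # number needed to treat to avoid one bad outcome
        "z": z,
        "p_value": p_value,
    }

# Assumed split: 300 per arm, so 20% of controls is 60 and 10% of treated is 30.
summary = two_arm_summary(n_control=300, bad_control=60, n_treat=300, bad_treat=30)
for name, value in summary.items():
    print(f"{name}: {value:.4g}")
```

Every number this returns is a property of the two arms. Nothing in the inputs or outputs refers to any one participant, which is the gap the paragraph above is pointing at.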
Despite this bizarre inability to really say anything about individual benefit, when you try to come up with a non-statistical, non-quantized language to say precise things about people, you are relegated to the bucket of pseudoscience or, if you can bench enough, bro-science. Anecdotes are not data. Your miracle cure is always a fluke. Your personal experience is trumped by this impenetrable 500-page systematic review. Anyone who disagrees with the consensus of experts is being an irrational contrarian.
However, a lot of practices people find beneficial are immune to the postmodern lens of the randomized trial. It’s really hard to do RCTs on organ transplantation. You can’t do RCTs for physical therapy. You certainly can’t do them for massage or chiropractic. It’s hilarious that we have convinced ourselves that we can do this for psychology, despite decades of embarrassing “scientific” failures. And when you start looking at “sports science,” you realize how silly it is to try to put a scientific corset on all of human experience. No randomized trial explains Victor Wembanyama.
I could pick on more than just the RCT. Individuals don’t exist in the calculus of rationality. None of the pillars of mathematical rationality I talk about in The Irrational Decision make much sense for individual people. I could give similar spiels about game theory, statistical prediction, or optimization.
Optimization, in particular, is tough to grapple with. I got a lot of questions about personal optimization in the conversations about my book. Many people find it useful to think about their lives as a collection of optimization problems. If you want to strive for the best, that means some number must go up, right?
Don’t get me wrong, I love optimizing too! Do I obsess about my home coffee setup, my exercise program, my writing schedule? You bet I do. Is that bad? Does that mean I’m just a cog in the capitalist machine? These questions form the basis of a conversation worth having.
Now, here’s an annoying paradox. If we want a language to talk about individual experience, it has to have some element of intersubjectivity. This is where the quantification trap comes in. The mimetic power of the quantification trap means that a shared language for discussing individual experience is always at risk of being contaminated by scientific quantification. But it doesn’t have to be. Most people share their experiences without numbers and charts. We obviously share experience through art, music, and literature. These are all shared languages, too.
For the next bit on this blog, I want to find language to talk to each other about individual experiences. I want to write about this weird tension between the quantification trap and the individual. People figure out how to do amazing things without consulting the scientific literature all the time. How can we talk about commonalities without reducing them to numbers or statistics? I initially thought this would be the topic of the final chapter of The Irrational Decision, but I realized it was far too sprawling and unwieldy to fit. It will have to be its own book. Some day.
In the meantime, I’m going to try to type it out on here.
1. Unless you drink a lot of slate-star-less-wrong Kool-Aid.
2. What happened in 2020? I don’t remember.
3. (p<0.001)


Love your blog, Ben. I semi-grasp your point but then, as a cardiologist, I view the double-blind RCT as our best current defence against the cognitive biases that trap us into doing harmful things to patients. The list of treatments that we thought helpful, but were debunked by an RCT, is long and strewn with fatalities. Simple examples include oxygen to reduce the size of a heart attack, and hormone replacement therapy to prevent heart attacks. Yes, applying RCT findings to an individual patient is fraught. But maybe not as fraught as using an anecdotal experience in one patient as the template for treating the next patient - "I gave the patient pill x, a week later their runny nose stopped, therefore x treats runny noses". Each element of the modern RCT protects against a number of cognitive biases - randomisation prevents self-selection bias, an appropriate control group mitigates the Hawthorne effect and blinding both the patient and researcher will combat expectation and confirmation biases. Now I'm left thinking - did all these RCTs debunk quackery or am I on shakier ground - eek
As I think more about this, I wonder whether (a) a clear separation can be made between the population versus individual health approaches that you raise, both here, and in your discussion of Meehl where you show that nomothetic approaches to evaluation will always return the result that actuarial approaches beat clinical judgment and (b) whether your concern may be more about an uneasiness with the epistemic authority granted to a certain form of knowledge rather than intrinsic differences in theory. This is new material to me, but quite important for work that I am trying to do in health. The example is the following:
1. Consider the RCT approach to population health, and view it as a missing data problem; I need a way to 'fill in' the missing counterfactuals for treated units. I can do this in a number of ways depending on precisely what parameter I am interested in, but one standard formulation is to use the average of the control group to fill in the missing data for the treatment group, had they not been assigned to treatment. Essentially, I am 'borrowing' data from the control group and applying it to the treatment group.
2. Now consider a case where a 45 y.o. man walks into the doctor's office. Perhaps on entering, the doctor evaluates the patient and notices something from the way the person walks and/or what the person is wearing. Then, the doctor asks the patient why they have come and the patient says "I have a headache." But that phrase has no meaning without a shared understanding of what a headache means within the local shared context. As with the RCT, the doctor must 'borrow' data from a broader context, and (dare I say), use a shared language to help her manage this very individual case. A specific example is that in the slums where I work, people may say "I have low blood pressure." This does not have a clear biomedical interpretation, but is used to convey general feelings of malaise and perhaps depression.
In both cases, then, we borrow data from other contexts to understand how patients are to be treated. I am not sure that 'intersubjectivity' is the right word here, but if it is, both approaches require intersubjectivity, perhaps of different forms (not sure about that). So, the difference between the population and individual approaches is not necessarily in the particularity of the latter, since the use of language will always require some degree of sharing. Instead the difference is in ______________.
I am not sure what that _________ is, but will keep working on it. A book that I have been told to look at is "Towards a Contextual Realism." It is far afield, so will take me some time, but perhaps it will help.
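The 'borrowing' in the first example of the comment above can be sketched in a few lines of Python. This is only an illustration of the missing-data framing under loud assumptions: the outcome lists are invented, the function name `att_by_control_mean` is hypothetical, and filling in with the control-group mean is just the one standard formulation the comment mentions, not the only option.

```python
from statistics import mean

def att_by_control_mean(treated_outcomes, control_outcomes):
    """Estimate the average effect on the treated by imputing each treated
    unit's missing untreated outcome with the control-group average."""
    fill = mean(control_outcomes)                # the value "borrowed" from controls
    imputed = [fill for _ in treated_outcomes]   # every treated unit gets the same fill-in
    unit_effects = [y1 - y0 for y1, y0 in zip(treated_outcomes, imputed)]
    return mean(unit_effects), imputed

# Invented data: 1 = bad outcome, 0 = no bad outcome.
treated = [1, 0, 0, 0]
control = [1, 1, 0, 0]

att, imputed = att_by_control_mean(treated, control)
print("imputed untreated outcomes for treated units:", imputed)
print("estimated average effect on the treated:", att)
```

Each treated person receives the identical imputed counterfactual, so the per-person differences in `unit_effects` are bookkeeping rather than individual estimates. Only their average carries information.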