Clinical versus Statistical Prediction (I)
Meehl's Philosophical Psychology, Lecture 10, part 1.
This post digs into Lecture 10 of Paul Meehl’s course “Philosophical Psychology.” Technically speaking, this lecture starts at minute 74 of Lecture 9. The video for Lecture 10 is here. Here’s the full table of contents of my blogging through the class.
Throughout his undergraduate and graduate studies in Minnesota, Meehl found himself at the center of a personal and professional conflict for the soul of psychology. As a high schooler, Meehl had been drawn to psychology by the psychodynamic school (pioneered by Freud), which considered the myriad connections between a patient’s past experiences–even their dreams–and their current mental state. At Minnesota, he was educated by a rigid behaviorist crew, strongly anti-Freudian, focused on understanding the impact of external factors on mental states, and adamantly scientific and statistical.
This struggle in psychology was part of a broader struggle in social science between the idiographic and the nomothetic. The idiographic focuses on the particulars, on the individual, trying to make sense of the singular and unverifiable. The nomothetic focuses on the general, trying to determine laws and principles that explain categories with clear measurements. The idiographic treats every case as unique. The nomothetic treats every case as a statistic. The last two lectures of his course are about Meehl’s career-long project of demarcating the purviews of the idiographic and the nomothetic.
One of Paul Meehl’s most famous works, Clinical versus Statistical Prediction, grew out of a lecture series from 1947 probing this boundary. Though he wouldn’t call it by name, Meehl makes the first argument for machine learning in the clinic. After struggling to find a publisher, his book finally appeared in 1954, two years before the famous Dartmouth AI conference. It was four years before Rosenblatt’s Perceptron. Even as computers were just coming online, there was already ample evidence that statistical pattern recognition could, and perhaps should, play a role in critical decision-making.
Meehl’s book focuses on how to predict behavior. He gives some examples of what he had in mind in Lecture 10:
Given an application with LSAT score, undergraduate grades, and letters of recommendation, who should be admitted into law school?
Given a record of behavior, should a jailed person be released on parole?
Should you hospitalize a patient who is clinically depressed to prevent suicide?
Should a person who doesn’t respond to antidepressant prescriptions be given shock therapy?
These questions demand consequential decisions about people’s lives. They are all concerned with how a human being reacts under particular circumstances. And in all cases, the outcomes are uncertain. We just don’t know what will happen as a result of many very consequential decisions.
To answer these questions, we have to predict what will happen as a result of our actions. If you’ve been following along with the blog, you should be comfortable saying that all such predictions are probabilistic. Even though these questions are about individual people, their answers have epistemic uncertainty. Answers to these questions are logical statements that can be measured with Probability 1.
“I believe this candidate will do well in our law program.”
“I believe this person will not commit crimes if released.”
“I believe this person will harm themselves unless they are committed.”
“I believe this person will find some relief from shock therapy.”
These are all beliefs, and Meehl wanted to know how best to quantify them. What would be the best way to decide in the face of the inherent uncertainty?
In Clinical versus Statistical Prediction, Meehl aims to compare the idiographic and nomothetic approaches to such decision-making. I will clarify the technical distinction momentarily, but let me first sketch the high-level difference between these approaches. The nomothetic approach is statistical, transmuting an assessment of past rates into future uncertainty. We could look at all similar cases in the past and count the number of times a treatment had worked. We could use the success percentage as a proxy for our belief that the treatment will work on the patient before us. Then, we could use optimal statistical decision rules to weigh the costs and benefits and select an action. This is statistical prediction. We convert past performance into future confidence.
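The actuarial recipe here is simple enough to write down in a few lines. Here is a minimal Python sketch of it, with entirely hypothetical case data and utility numbers chosen for illustration (Meehl's actuaries used tables, not code, but the arithmetic is the same):

```python
# Statistical (actuarial) prediction in miniature: count past outcomes in the
# reference class, treat the base rate as our belief, then choose the action
# with higher expected utility. All numbers below are made up for illustration.

past_cases = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]  # 1 = treatment worked, 0 = it did not
p_success = sum(past_cases) / len(past_cases)  # base rate as proxy for belief

# Hypothetical utilities (arbitrary units) for each action/outcome pair.
u_treat_success = 10   # treat, and it works
u_treat_failure = -5   # treat, and it fails
u_no_treatment = 0     # do nothing

expected_u_treat = p_success * u_treat_success + (1 - p_success) * u_treat_failure
expected_u_skip = u_no_treatment

decision = "treat" if expected_u_treat > expected_u_skip else "do not treat"
print(p_success, expected_u_treat, decision)
```

The whole nomothetic bet is packed into the first two lines: membership in the reference class is assumed to license the probability assignment. Everything after that is uncontroversial decision theory.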
Clinical prediction, on the other hand, starts from the idea that all patients are unique. In the 1940s, the validity of inference from class membership was not at all conventional wisdom. In his book, Meehl quotes Gordon Allport making the case for the idiographic:
“A fatal non-sequitur occurs in the reasoning that if 80% of the delinquents who come from broken homes are recidivists, then this delinquent from a broken home has an 80% chance of becoming a recidivist. The truth of the matter is that this delinquent has either 100% certainty of becoming a repeater or 100% certainty of going straight. If all the causes in his case were known, we could predict for him perfectly (barring environmental accidents). His chances are determined by the pattern of his life and not by the frequencies found in the population at large. Indeed, psychological causation is always personal and never actuarial.”
I still hear this argument today. It is made by doctors and patients. It is made by advocates against algorithmic decision systems. In fact, I’ve made this argument multiple times myself. I’m personally very sympathetic to Allport. Meehl himself concurs with the general sentiment. Cases are indeed unique.
Is it always impossible to make inferences from class membership? That seems too strong. Moreover, you’ll note that Allport uses the word “chances.” Chance is, by its very nature, a probabilistic concept. The question remains whether that chance can be usefully estimated through actuarial methods. When is generalizing about the future just a question of careful counting?
This was a radical question in the 1940s, but it seems quaint today. We are living in the glory days of statistical pattern recognition. The tech industry and half of academia have decided that general intelligence is nothing but making decisions by counting things in appropriate reference classes. Everything we do is a sum of our past experience. All decisions are actuarial. It’s just a matter of finding the formula.
As we’ll see, Meehl wouldn’t go this far, but he’d come closer to agreeing than disagreeing. In 1947–before computers, before machine learning, before AI–he tried to understand the effectiveness and limits of actuarial tables in human decision-making. This week, I will walk through Meehl’s argument. I’ll make precise the sets of questions, decisions, and evaluation methods he considers. I’ll provide his evidence. And I’ll close with some reflections on the valuable lessons we can still learn from reading Meehl’s 1954 book.
Funny you should mention Rosenblatt. His 1956 PhD thesis at Cornell was on psychometrics, and it starts with the following passage:
"All research psychologists are familiar with problems in which the simultaneous working of a large number of variables seems to determine a piece of behavior, or a personality trait, or the outcome of an experiment. Such complex relationships are not peculiar to psychology; they are equally true, for example, of the gas laws in physics. However, psychology more than the physical sciences must deal with these relationships statistically, rather than as perfect mathematical functions."
It's not at all obvious to me that the uncertainty in these examples is more epistemic than aleatoric. It could be, if you buy the claim that "If all the causes in his case were known, we could predict for him perfectly". But that claim seems at least as true of the outcome of a literal dice roll, the paradigmatic example (and etymological origin) of aleatoric uncertainty—if anything, the dice roll is probably _more_ predictable, since the physics governing it are less chaotic than those governing psychological causation.
Maybe a pedantic distinction, but possibly important if we're talking about the philosophical foundations of probability.