Instrumentalized Actuarial Predictions
The randomized controlled trial as a natural extension of machine learning
This is a live blog of Lecture 17 of the 2025 edition of my graduate machine learning class “Patterns, Predictions, and Actions.” A Table of Contents is here.
Way back in Lecture 3, we discussed how a core application of statistical prediction was evaluating the impact of actions. I presented Paul Meehl’s actuarial framework for decisions: When given a decision problem with a small set of possible outcomes and an appropriate, fixed data format, actuarial rules provide more accurate predictions than clinical judgment on average.
Part of the problem with the actuarial method is that it assumes this data already exists. If we have some novel action that we've never observed before, how can we collect the data needed to predict average outcomes?
One answer to this question is the randomized controlled trial. We can view this as a machine learning problem. Let’s suppose we have a new drug that we are considering offering to patients. We imagine there’s a population of such individuals out there who could benefit from taking it. We say we’ll evaluate the intervention based on some average measure of benefit for this population, like the average number of quality-adjusted life years gained, or some other dehumanizing aggregated statistic.
Now we have a cost function to optimize. It’s an average over a population. We’ve spent the semester so far developing data-driven tools to approximately solve this optimization problem. We randomly sample a collection of individuals from the population. Let’s say the outcome we aim to predict is whether they got better or not. Each subject has some associated features that we think are predictive of their outcome, like age or family history. Our goal is to devise a treatment plan that analyzes an individual’s features and predicts whether they will benefit from the drug.
To do this, we add one more feature to the list. We randomly assign each subject to receive the drug or not. That is, the feature vector is the concatenation of the data we think is predictive of the outcome and their treatment assignment: a boolean variable equal to one if they received the treatment and zero otherwise.
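To make this concrete, here is a minimal sketch of assembling such a data set. The array names, the sample size, and the two synthetic covariates are my own illustrative assumptions, not anything from the lecture.

import numpy as np

rng = np.random.default_rng(0)
n_subjects = 500

# Covariates we think are predictive of the outcome (e.g., age, family history),
# here just two columns of synthetic numbers standing in for real measurements.
covariates = rng.normal(size=(n_subjects, 2))

# Randomly assign each subject to treatment (1) or control (0).
assignment = rng.integers(0, 2, size=n_subjects)

# The feature vector is the concatenation of the covariates and the assignment bit.
features = np.column_stack([covariates, assignment])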
We now wait to observe the labels. These record how each subject fares on our pre-defined evaluation metric. Once we observe the outcomes on our study sample, we can build a statistical predictor, guess, from the observed data set.
outcome = guess(features, assignment)

Once we have computed this function, using a transformer or something, we are solidly back in the world of decision theory. We can then make decisions about everyone else in the population by appropriately thresholding the value

prediction = guess(features, treated) - guess(features, not treated)

This will be effectively the same as performing a likelihood ratio test, assuming our prediction function provides a good sense of the likelihood of benefit under treatment and no treatment.
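For concreteness, here is one way such a decision rule could look, using logistic regression in place of "a transformer or something." The function names, the use of predicted probabilities as the output of guess, and the threshold value are assumptions made for this sketch, not a prescription.

from sklearn.linear_model import LogisticRegression
import numpy as np

def fit_guess(features, outcomes):
    # Fit a predictor of the binary outcome from covariates plus the assignment bit.
    return LogisticRegression().fit(features, outcomes)

def predicted_benefit(model, covariates):
    # guess(features, treated) - guess(features, not treated) for each individual.
    treated = np.column_stack([covariates, np.ones(len(covariates))])
    untreated = np.column_stack([covariates, np.zeros(len(covariates))])
    return model.predict_proba(treated)[:, 1] - model.predict_proba(untreated)[:, 1]

# Example usage on the rest of the population (new_covariates is hypothetical):
# model = fit_guess(features, outcomes)
# treat = predicted_benefit(model, new_covariates) > 0.1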
From this perspective, randomized controlled trials are a data-driven approach to optimal decision making. We assume that you’d like to make a good decision in the standard average error decision theory setup. We look back at Lecture 4 and see that we need to be able to compute the expected value of a good outcome under treatment. We then estimate these expected values using machine learning techniques and treat the estimates as accurate when we make decisions. That is, we assume that estimates of rates in the past are good surrogates for rates in the future, as we always do.
Predicting the impact of actions is still just prediction. The major difference in the randomized controlled trial is our ability to manipulate one of the features to obtain a reasonable estimate of the conditional probability of an outcome. Random assignment is sufficient to ensure that our counts give accurate, unconfounded estimates.
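In its simplest form, the estimate is just a pair of conditional averages over the trial sample. The sketch below assumes binary outcomes recorded alongside the assignment bit from the earlier snippet.

import numpy as np

def estimated_outcome_rates(outcomes, assignment):
    # Average outcome among treated and among untreated subjects. Because the
    # assignment was randomized, these counts are unconfounded estimates of the
    # expected outcome under each action.
    outcomes = np.asarray(outcomes, dtype=float)
    assignment = np.asarray(assignment)
    return outcomes[assignment == 1].mean(), outcomes[assignment == 0].mean()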
Now, sadly, almost no one presents randomized trials this way. Philip Dawid has a phenomenal paper, “Causal Inference without Counterfactuals”, that sets things up as I do. As we’ll see, this setup provides a bridge from the statistical prediction we’ve developed so far in this course to the active manipulation in control design, policymaking, and reinforcement learning.
If you are sympathetic to “causal inference,” Dawid pens a must-read critique of the more common mathematical frameworks that claim to have the mystical power to divine causation. I have written before about why I think these claims are epistemically fraught. The randomized trial as a logical extension of the actuarial method makes sense to me. The randomized trial as a gold standard of causal inference does not.
Now, you might rightly complain that treating the output of a machine learning model as an accurate estimate of the likelihood of a medical outcome is problematic, too. I agree; we need to be careful. I could gripe about the many shortcomings of actuarial decision making, as I did in an earlier post. It’s not a panacea. And even from a statistical perspective, a great deal of care is needed. As is always the case with the actuarial method, you need a lot of samples to make highly accurate predictions. How many samples you need is hard to calculate in advance. Moreover, the idea that you can randomly sample subjects from a population is almost never true. And yet, such statistical assumptions are seldom true in machine learning, either. Machine learning and actuarial prediction push on despite these concerns. It’s possible that obsessing over the fine details of quantitative central limit theorems on causal trees might be missing the forest.

