One of the more common objections to case studies is that the lack of randomization makes them too susceptible to confounding and cherry-picking. Is it possible to run an experiment where an individual is both the treatment and control group? A partial positive answer is the N-of-1 trial design.
Typical N-of-1 trials work as follows. At the beginning of each week, the patient and doctor pick a bottle of pills from an urn. Half of these bottles contain Drug A, and the other half contain Drug B. Drug B might be a placebo. Neither the patient nor the doctor knows which drug is in the selected bottle. The patient takes their medication from that week's bottle, logging how they feel. They select a new random bottle each week for a set number of weeks. At the end of the trial, the doctor and patient discuss the journals and decide which weeks had better outcomes. They then unblind the identities of the bottles and check whether there is a correlation between either drug and the patient’s well-being.
The fancy way of describing this trial is a “multi-cycle within-patient, randomized, double-blind, cross-over” experiment. But we need not dwell on the fancy statistics here. The N-of-1 trial is just a randomized controlled trial. Instead of comparing multiple patients, we compare multiple time windows of a single patient and try to draw conclusions about differences between interventions.
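To make the "time windows as units" idea concrete, here is a minimal sketch of how such a trial could be randomized and analyzed. It assumes a balanced schedule, a numeric well-being score per period, and an exact permutation test on the difference in means. The function names and the scores are hypothetical, not taken from any particular study.

```python
import itertools
import random
from statistics import mean


def draw_schedule(n_periods, rng):
    """Balanced random schedule: half the periods get 'A', the rest get 'B'."""
    schedule = ["A"] * (n_periods // 2) + ["B"] * (n_periods - n_periods // 2)
    rng.shuffle(schedule)
    return schedule


def mean_difference(scores, a_periods):
    """Mean score in the 'A' periods minus mean score in the 'B' periods."""
    a = [s for i, s in enumerate(scores) if i in a_periods]
    b = [s for i, s in enumerate(scores) if i not in a_periods]
    return mean(a) - mean(b)


def permutation_p_value(scores, schedule):
    """Two-sided exact p-value over every balanced relabeling of the periods."""
    n = len(scores)
    observed_a = {i for i, label in enumerate(schedule) if label == "A"}
    observed = abs(mean_difference(scores, observed_a))
    at_least_as_extreme = 0
    total = 0
    for combo in itertools.combinations(range(n), len(observed_a)):
        total += 1
        # Small tolerance so ties with the observed statistic are counted.
        if abs(mean_difference(scores, set(combo))) >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / total


if __name__ == "__main__":
    rng = random.Random(0)
    schedule = draw_schedule(6, rng)      # what the urn would produce
    scores = [8, 3, 9, 2, 7, 1]           # hypothetical weekly well-being scores
    print(schedule, permutation_p_value(scores, schedule))
```

The permutation test leans only on the randomization of the schedule, which is the one thing the trial design guarantees.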
What conditions would we need to draw reasonable inferences? The patient's baseline condition should be chronic and relatively stable so that treatments at different times are comparable. The treatments should act faster than the individual periods of a trial. That is, the effects should ramp up and ramp down faster than the length of a trial period.
Chronic pain is an ideal condition for such experiments. Patients respond differently to different drugs, and finding the one that works best for an individual can be a long and arduous process. Pain medication has a short half-life, so its effects come and go well within a trial period.
A less obvious application of the N-of-1 trial is testing side effects. Statins are a wildly popular class of cholesterol-lowering drugs that are effective in reducing the risk of bad cardiovascular events. Every drug has risks, though, and many people report negative side effects from statins. But what if these side effects are a “nocebo” effect, where the expectation of harm from taking the drug manifests as physical symptoms?
A team of London cardiologists decided to try N-of-1 trials for statins to help patients navigate such side effects. They enrolled patients who had previously had harmful reactions to statins. Each month, each patient would take either a statin or a placebo. The trial lasted one year. The team chose a monthly period because statin side effects tended to arise within two weeks. In such a trial, the patient would also get some of the cholesterol-lowering benefits of the statin.
That N-of-1 trials are just RCTs where the units are time windows also points to their limitations. Requiring blocks of the trial to be independent significantly limits the settings where this method can be applied. And since we can’t scale up the number of units (typical N-of-1 trials have only 6 or 12 periods), the effect sizes must be rather large.
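A quick back-of-the-envelope calculation shows why so few periods demand large effects. The sketch below assumes a balanced design analyzed with an exact two-sided permutation test (my assumption, not something the trials above necessarily used) and computes the smallest p-value such a test can ever return. The helper name is hypothetical.

```python
from math import comb


def min_two_sided_p(n_periods):
    """Smallest attainable two-sided permutation p-value for a balanced design.

    With n periods split evenly, there are C(n, n/2) possible schedules.
    The most extreme outcome and its mirror image are the only two that
    reach the maximal statistic, so the floor is 2 / C(n, n/2).
    """
    n_assignments = comb(n_periods, n_periods // 2)
    return 2 / n_assignments


if __name__ == "__main__":
    for n in (6, 12):
        print(n, comb(n, n // 2), min_two_sided_p(n))
```

With 6 periods there are only 20 balanced schedules, so even a perfect separation of treatment and control weeks cannot produce a p-value below 0.1. With 12 periods there are 924 schedules and the floor drops to roughly 0.002.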
But perhaps these limitations stem from this specific design, which may be too primitive. Can this N-of-1 mindset guide us to more sophisticated experiments? Can we use the temporal dynamics to our advantage? As a person who has done too much control theory, I can’t help but think this looks an awful lot like an algorithm for system identification. One of the most tried and true methods for determining the frequency response of a linear system is to inject random noise and measure the output. Is there a way to bring more statistics and dynamics into the picture to create more widely applicable designs for individuals?
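For readers who have not seen the system identification trick, here is a toy sketch. It assumes a linear system with a short, made-up impulse response, drives it with a random plus-or-minus-one input, and recovers the response by cross-correlating output with input. It works in the time domain (the impulse response, whose Fourier transform is the frequency response) and is only an illustration of the idea, not a proposed trial design. All names and numbers are hypothetical.

```python
import random


def simulate(h, n_steps, noise_sd, rng):
    """Drive a linear system with impulse response h using a random +/-1 input."""
    u = [rng.choice((-1.0, 1.0)) for _ in range(n_steps)]
    y = []
    for t in range(n_steps):
        # Output is the convolution of past inputs with h, plus measurement noise.
        response = sum(h[k] * u[t - k] for k in range(len(h)) if t - k >= 0)
        y.append(response + rng.gauss(0.0, noise_sd))
    return u, y


def estimate_impulse_response(u, y, n_lags):
    """Cross-correlate output with lagged input.

    Because the input is white with unit variance, the average of
    y[t] * u[t - k] converges to h[k].
    """
    n = len(u)
    return [
        sum(y[t] * u[t - k] for t in range(k, n)) / (n - k)
        for k in range(n_lags)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    true_h = [0.0, 0.5, 0.3, 0.1]   # made-up delayed, decaying response
    u, y = simulate(true_h, 20000, 0.5, rng)
    print(estimate_impulse_response(u, y, len(true_h)))
```

The random input plays the role of the urn: because it is uncorrelated with everything else, correlating it against the output isolates the system's response at each lag, carryover included.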