This is the live blog of Lecture 14 of my graduate class “Convex Optimization.” A Table of Contents is here.
In last week’s panel, I challenged Philip Stark and Bin Yu about Statistics’ “ownership” of experiment design. No discipline owns the scientific method, nor should one department be solely responsible for teaching experimentation to the rest of campus. Philip quipped back that Statistics does indeed own agricultural experiments randomized over four-by-four plots. On this point, I concur.
For lack of better jargon, I’m going to use statistical experiment design to mean the sort of mathematical modeling of experimentation pioneered by the likes of Fisher and Neyman in the 1920s. This sort of experimentation is a narrow but powerful application of inverse problems. In statistical experiment design, you get to design the inverse problem. Optimization is part of this design process, particularly in minimizing potential sources of error.
The measurement model in experiment design is the same as in inverse problems. Your goal is to build a machine such that you will observe
measurement = forward model(state) + noise
With the measurements in hand, you then must find a way to solve the inverse problem to get a sense of that unobserved state.
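In the linear case, this whole story is a few lines of code. Here’s a minimal sketch; the forward model, the state, and the noise level are all invented for illustration, and least squares stands in for “a way to solve the inverse problem”:

```python
# Minimal sketch of a linear measurement model and its inverse problem.
# The forward model A, the dimensions, and the noise level are placeholders.
import numpy as np

rng = np.random.default_rng(0)
n_state, n_meas = 5, 20

A = rng.standard_normal((n_meas, n_state))  # forward model (assumed linear)
state = rng.standard_normal(n_state)        # the unobserved state

# measurement = forward model(state) + noise
measurement = A @ state + 0.1 * rng.standard_normal(n_meas)

# Solve the inverse problem by least squares.
estimate, *_ = np.linalg.lstsq(A, measurement, rcond=None)
print(np.linalg.norm(estimate - state))     # small when A is well conditioned
```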
Just as in the case of inverse problems, the linear measurement model is a powerful starting point. Most of the inverse problems I described earlier in this class can be recast as experiment design problems. Experiment design is deciding which measurements we want to include in an inverse problem. If we are going to take a CT scan, we’d like to minimize the amount of X-rays we shoot into a person. If we are going to take an MRI, we’d like to minimize the amount of time someone spends in a claustrophobic tunnel. Imaging design requires cleverly balancing resources to collect the most informative set of measurements as efficiently as possible.
A less obvious example that fits into this class of linear experiments is the randomized controlled trial. The standard potential outcomes model of a randomized trial assumes that a subject has two possible outcomes: outcome0 if the intervention is not applied and outcome1 if it is. Our goal is to estimate the effect of the treatment, equal to the difference between outcome1 and outcome0. If we let “intervention” denote a variable equal to 1 if the treatment is applied and 0 otherwise, then we have the linear relationship:
experimental outcome = intervention*(outcome1 - outcome0) + outcome0
For each subject, the experimental outcome is a linear measurement of the individual’s treatment effect. Randomizing the intervention lets us get a sense of the average effect of the treatment on a population.
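A toy simulation makes the arithmetic concrete. The outcome distributions, the sample size, and the true effect of 2 below are all invented; the point is that a fair coin flip per subject turns the difference in group means into an estimate of the average treatment effect:

```python
# Toy potential-outcomes simulation; all numbers are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

outcome0 = rng.normal(0.0, 1.0, n)   # outcome if the intervention is not applied
outcome1 = outcome0 + 2.0            # outcome if it is (true effect = 2)

intervention = rng.integers(0, 2, n) # randomize: a fair coin flip per subject

# experimental outcome = intervention*(outcome1 - outcome0) + outcome0
observed = intervention * (outcome1 - outcome0) + outcome0

# Difference in means estimates the average treatment effect.
ate = observed[intervention == 1].mean() - observed[intervention == 0].mean()
print(ate)                           # close to 2
```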
Compressive sensing, which was trendy in the 2010s, occupies a middle ground between the imaging applications and the randomized experiments. It selects measurements at random from a long list of potential options. Through some rather hairy math, researchers proved that such random subsets can provide estimates as good as if we had used all the measurements.
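The hairy math is beyond today’s scope, but the recipe itself is short. Here’s a sketch, assuming a sparse state, Gaussian random measurements, and scikit-learn’s Lasso as the l1 solver; the dimensions and the regularization weight are arbitrary choices:

```python
# Sketch of compressive sensing: recover a sparse state from a few random
# linear measurements via l1-regularized least squares. All sizes are
# illustrative.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, k, m = 200, 5, 60                 # ambient dimension, sparsity, measurements

state = np.zeros(n)
state[rng.choice(n, k, replace=False)] = rng.standard_normal(k)

A = rng.standard_normal((m, n)) / np.sqrt(m)   # m randomly drawn measurements
measurement = A @ state

# The l1 penalty promotes sparse solutions.
lasso = Lasso(alpha=1e-3, fit_intercept=False, max_iter=50_000)
estimate = lasso.fit(A, measurement).coef_
print(np.linalg.norm(estimate - state))        # near-exact recovery, with luck
```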
With this view of experiments, measurement is a complex pipeline that connects physical sensors to computation and error analysis on a computer. Statistics and optimization play a key role in the codesign of this computational experimental pipeline.
The standard way this happens is as follows. First, the designer must declare which algorithm will be used to solve the inverse problem. Boyd and Vandenberghe focus on least-squares methods, but you can choose whatever algorithm suits you. With the algorithm fixed, you use statistics to characterize the estimation error given what you know about the measurement process. You might do this by careful calibration to characterize the shape of the noise. Or you might design the measurement itself to be random to remove systematic bias in the noise. Statistics comes in here to analyze both the process of measurement and the randomness in your choice of measurements. This deliberate randomness in intervention, which casts measurement as a randomized algorithm, remains Statistics’ most important contribution.
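For the least-squares case with noise of covariance sigma^2 I, this characterization is explicit: the estimation error has covariance sigma^2 (A^T A)^{-1}, where the rows of A are the chosen measurements. Here’s a sketch with a placeholder A, along with the standard scalar summaries of the error covariance that a designer might choose to minimize:

```python
# Error characterization for least squares: with noise covariance sigma^2 I,
# the least-squares estimate has error covariance sigma^2 (A^T A)^{-1}.
# The matrix A and the noise level are placeholders.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))  # rows are the chosen measurements
sigma = 0.1                       # assumed noise level

error_cov = sigma**2 * np.linalg.inv(A.T @ A)

# Scalar summaries a designer might minimize over choices of A:
print(np.trace(error_cov))                # average variance ("A-optimality")
print(np.linalg.slogdet(error_cov)[1])    # log volume of the error ellipsoid ("D-optimality")
print(np.linalg.eigvalsh(error_cov)[-1])  # worst-case variance ("E-optimality")
```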
Finally, we bring in optimization again. Since the estimation error will be a function of your measurement design, you can search for the design that minimizes the error under appropriate metrics. This is the main focus of today’s lecture. Minimum error designs will be our first encounter with problems that don’t naturally lend themselves to convex optimization. There will be many choices as to what it means to minimize error, and deciding which is best ends up being a matter of convenience, taste, and convention. This is fine. There is never going to be an optimal design. Design is a problem of good enough, not optimal. You have to accept open questions and more experiments if you don’t see what you want.
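To make one of those choices concrete: selecting which measurements to take is a combinatorial problem, but Boyd and Vandenberghe’s relaxation of it is convex. In the D-optimal version, you choose what fraction of your measurement budget to spend on each candidate measurement vector so as to maximize the log determinant of the information matrix. Here’s a sketch with cvxpy, using a random candidate list as a stand-in for a real menu of measurements:

```python
# Sketch of the relaxed D-optimal experiment design problem. The candidate
# measurement vectors are random placeholders for a real menu of options.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
p, n = 30, 5                          # candidate measurements, state dimension
V = rng.standard_normal((p, n))       # candidate measurement vectors

w = cp.Variable(p, nonneg=True)       # fraction of the budget per measurement
information = sum(w[i] * np.outer(V[i], V[i]) for i in range(p))

problem = cp.Problem(cp.Maximize(cp.log_det(information)), [cp.sum(w) == 1])
problem.solve()

print(np.round(w.value, 3))           # weight concentrates on a few candidates
```

The solution typically puts its weight on a handful of candidates, which is the optimizer telling you which few measurements are worth taking.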