I’ve been having a hard time figuring out how to close this series on action and uncertainty quantification. I guess that if I had good answers, I wouldn’t be blogging. Arg min is where I express my confusion out loud on the internet. This series has been helpful for me in articulating said confusion, and maybe it’s just worth a recap and some directed questions to investigate.
Here’s a rough, cleaned-up version of the decision algorithm scatter plot.
I’ve left three extremes on the plot. You can pin your own decision problem accordingly. In each regime, uncertainty quantification plays a different role and looks substantially different. There is no universal algorithm for uncertainty quantification.
Let me elucidate by recapping the three corners. Most people focus on uncertainty quantification for stochastic optimization problems with a single stage of recourse. In these problems, actions don’t impact predictions. For example, an optimizer might try to predict weather or climate and plan accordingly for their utility given nature’s whims. Quantifying probabilistic ranges of various outcomes and forecasts could improve an optimizer’s yield. This might lead them to the controversial space of chance constraints and value-at-risk, which have some limited utility in practice.
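To make that setup concrete, here is the generic shape of such a problem in standard notation (a sketch on my part, not a formulation from any particular post in the series); x is the plan, ξ is nature’s whim, and α is a small risk tolerance:

```latex
% Single-stage stochastic program with a chance constraint:
% choose the plan x before the uncertain outcome \xi is revealed.
\min_{x}\; \mathbb{E}\big[c(x,\xi)\big]
\quad \text{subject to} \quad
\mathbb{P}\big[g(x,\xi) \le 0\big] \;\ge\; 1-\alpha

% The same constraint phrased as a value-at-risk requirement, using
% \mathrm{VaR}_{\beta}(Z) = \inf\{\, t : \mathbb{P}[Z \le t] \ge \beta \,\}:
\mathrm{VaR}_{1-\alpha}\big(g(x,\xi)\big) \;\le\; 0
```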
We then turned to bandit algorithms. Actions have no impact on the world in the bandit framework, but an optimizer can act frequently. There, we saw that the faster you acted, the less of a role uncertainty quantification played. In the standard problem, the multi-armed bandit, an optimizer would only use a loose bound to check when one option yielded a higher return than another. Oftentimes, as side information became available to the optimizer, simple greedy methods based on point predictions of return would perform as well as anything else. Faster action meant less dependence on uncertainty quantification.
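As a toy illustration, here is a sketch of the two selection rules on a made-up two-armed Bernoulli bandit (my own example; the arm means and horizon are invented). The greedy rule acts on point estimates of each arm’s return, while the UCB-style rule adds the kind of loose confidence bonus described above:

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = [0.5, 0.55]   # made-up Bernoulli arms
T = 5000                   # made-up horizon

def run(select):
    """Play T rounds, choosing an arm each round with the supplied rule."""
    counts = np.ones(2)                                       # one warm-up pull per arm
    sums = np.array([float(rng.binomial(1, m)) for m in true_means])
    total = 0.0
    for t in range(1, T + 1):
        arm = select(sums / counts, counts, t)
        reward = rng.binomial(1, true_means[arm])
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return total / T

def greedy(means, counts, t):
    # Act on the point estimate of each arm's return.
    return int(np.argmax(means))

def ucb(means, counts, t):
    # Compare arms only after adding a loose confidence bonus.
    return int(np.argmax(means + np.sqrt(2.0 * np.log(t) / counts)))

print("average reward, greedy:", run(greedy))
print("average reward, UCB:   ", run(ucb))
```

The square-root bonus in the UCB rule is the only place uncertainty quantification enters.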
Turning to when actions had impact naturally led to control systems. I described how certainty equivalence was surprisingly effective in control. This didn’t make sense to me, but it has guided design since the 1940s. The key is that if you perfectly observe the system state, you don’t need to quantify uncertainty. The state sums up everything you need to know to act accordingly.
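Here is a minimal sketch of what certainty equivalence looks like in practice, assuming a made-up linear system and an LQR objective (my example, not one from the post): take a point estimate of the model, design as if it were exact, and feed back the fully observed state.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

rng = np.random.default_rng(0)

# Made-up "true" dynamics (a discretized double integrator), unknown to the designer.
A_true = np.array([[1.0, 0.1], [0.0, 1.0]])
B_true = np.array([[0.0], [0.1]])

# Stand-in for a fitted model: a point estimate with some estimation error and no error bars.
A_hat = A_true + 0.01 * rng.normal(size=(2, 2))
B_hat = B_true + 0.01 * rng.normal(size=(2, 1))

# Certainty equivalence: solve the LQR problem as if the point estimate were exact.
Q, R = np.eye(2), np.eye(1)
P = solve_discrete_are(A_hat, B_hat, Q, R)
K = np.linalg.solve(R + B_hat.T @ P @ B_hat, B_hat.T @ P @ A_hat)

# Apply u = -K x to the true system, feeding back the perfectly observed state.
x = np.array([[1.0], [0.0]])
for _ in range(100):
    x = A_true @ x + B_true @ (-K @ x)
print("state norm after 100 steps:", float(np.linalg.norm(x)))
```

No uncertainty estimate for A_hat or B_hat appears anywhere in the design.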
When you didn’t fully observe the state, the problem became much more complicated. I recalled Doyle’s paradox, where overconfidence in state estimation leads to disastrous sensitivity to model uncertainty. Carl Boettiger shared how this sort of behavior, where a good predictive model led to worse performance, arose in simple ODE models used in conservation.
This led me to a discussion of uncertainty quantification in classic control. The beauty of feedback is that big ensembles of completely unreliable systems behave the same way in negative feedback loops. Uncertainty quantification buys you little when systems are regulated by feedback control. Feedback is mitigating the uncertainty for you! Indeed, almost no modeling was needed. However, certain uncertainties about feedback systems could be extremely dangerous. Delays were particularly troublesome. If you are trying to subtract two large signals, being slightly off in alignment will leak large errors. Delay sensitivity led me to an apparent contradiction of the earlier discussion: I could construct two systems that appeared to have the same model yet performed very differently in a feedback loop. The key was knowing which uncertainty mattered.
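Here is a toy simulation of both halves of that story (my own sketch; the plant, controller gain, and delay are invented). An integral feedback loop drives plants whose gains differ by a factor of three to the same setpoint, but the same controller falls apart once a five-step delay sneaks into the loop:

```python
import numpy as np

def closed_loop(b, k=0.5, delay=1, r=1.0, T=80):
    """Static plant y_t = b * u_(t-delay) under integral feedback u_t = u_(t-1) + k * (r - y_t)."""
    u_hist = [0.0] * delay      # past inputs the plant has not responded to yet
    u, ys = 0.0, []
    for _ in range(T):
        y = b * u_hist[-delay]  # the plant sees a delayed input
        u = u + k * (r - y)     # integral action on the tracking error
        u_hist.append(u)
        ys.append(y)
    return np.array(ys)

# Feedback hides gain uncertainty: plants with gains differing by 3x settle at the same setpoint.
for b in (0.5, 1.5):
    print(f"plant gain {b}: output after 80 steps = {closed_loop(b)[-1]:.3f}")

# The same controller with a five-step delay in the loop goes unstable.
ys = closed_loop(1.0, delay=5, T=200)
print("five-step delay: max |error| over the last 20 steps =", float(np.abs(1.0 - ys[-20:]).max()))
```

The amplification uncertainty washes out; the timing uncertainty does not.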
This subtlety in uncertainty quantification led to some troubling issues in mission-critical design. Gunter Stein’s “Respect the Unstable” lecture highlighted why I’ve drawn the red barrier in my plot. There are likely some regimes where extremely fast action with extreme control authority is impossible. These cases tend to be dangerous systems like airplanes or nuclear power plants. In such systems, every component of the design, including the human operator, becomes mission critical. Every component must be designed with extreme care to avoid disaster.
The examples I worked through revealed that uncertainty quantification was a moving target. Marginal prediction intervals played the smallest role, and I couldn’t find any applications where they were the crucial missing piece. Instead, holistic uncertainty quantification that accounted for the various ways unknowns could arise was necessary. Knowing which uncertainties are worth tending to is a subtle art that requires deep modeling knowledge. There is no such thing as “model-free uncertainty quantification.”
In most cases, uncertainty quantification was needed in counterintuitive ways. For example, delay times were important to quantify in feedback systems, but amplification levels were not. This is why I want a better understanding of the sorts of decision problems machine learning and statistics researchers are hoping to help solve. We need to pose the problems clearly and carefully before posing modern solutions to uncertainty quantification. And it’s why I want to reiterate that most of the applications of algorithmic decision aids that I hear about have multiple points of action and assessment. The ability to act more than once changes how we formulate problems and think about uncertainty.
And maybe we should ask about what happens below that red curve too. What do we do when we face high-impact decisions under high uncertainty? Obviously, we can’t just not act. But we haven’t developed any reasonable frameworks to mathematically automate decision making in that dark space. Better uncertainty quantification is not going to improve emergency surgeries. It’s not going to improve macroeconomic policy either. When we get into these regimes where we get to act only once with huge impact, it’s beyond presumptuous to think that math or stats will save the day. At that point, the uncertainty that needs to be quantified is the value of the math itself.
"Better uncertainty quantification is not going to improve emergency surgeries. It’s not going to improve macroeconomic policy either." These statements seem too broad to me. To say that better uncertainty quantification can't improve settings like macroeconomic policy making sounds like you're saying there is no value in considering how predicted distributions across different models compare. But my experience with policy makers (such as with central bankers, who have invited me to their conferences several times based on their interests in expressing uncertainty) is that decision makers often perceive value in attempts to quantify uncertainty, even if they know the assumptions behind any particular quantification can't be verified. I.e., the problems we tend to see with uncertainty quantification in some of these contexts are not that uncertainty quantification isn't useful or we shouldn't be trying to improve on our methods, it's that some people expect the "small world" view of uncertainty that we can quantify within some model to capture all of our uncertainty. So I wouldn't say better uncertainty quantification is necessarily useless in these settings.
“At that point, the uncertainty that needs to be quantified is the value of the math itself.”
Love that quote