It’s AI Winter week on the arg min blog. While the rest of my field is out eating po boys and beignets and getting free T-shirts, I am going to get bogged down in philosophy.

Having ended the semester confused about decision making under uncertainty, I started compiling a reading list to better understand how engineers approached these problems. Chris Strong offhandedly mentioned that Mykel Kochenderfer had written a book titled *Decision Making Under Uncertainty*. I immediately downloaded it. I’m a big fan of Kochenderfer and his work, and this seemed like a perfect place to start.

The book begins:

“Rational decision making requires reasoning about one’s uncertainty and objectives.”

Oh wow. There is so much to unpack in that single sentence. *Rational*. What on earth does rational mean? I’ll be coming back to this question a lot, as a certain sphere of online nerds think they own the concept of rationality. But the ideal of rationality has existed for millennia and has seldom resembled anything taught in econ 101.

But let’s not get bogged down by rationality just yet. Today, I’m more interested in the second half of the quoted sentence. It suggests that this book is closely related to what I taught last semester: decision making cast as optimizing a policy, constrained in some fashion by uncertainty.

Now, what does Kochenderfer mean by uncertainty?

“In problems of uncertainty, it is essential to be able to compare the plausibility of different statements.”

He goes on to argue that modeling uncertainty requires two critical components.

Given your experience, you should be able to compare the plausibility of any two statements. Kochenderfer gives an example: is it more plausible that “there is an electrical anomaly on our satellite” or “there is a thruster system anomaly on our satellite”? You have to be able to decide whether one scenario is more likely than the other, or whether the two are equally likely.

For any three statements, you need transitivity. If A is more likely than B and B is more likely than C, then A is more likely than C.

Once you have these, you might as well just assign *a number* to your beliefs about statements. That is, we can find some function, Kochenderfer calls it *P,* that takes as input any statement whatsoever and returns a number representing its plausibility. The next sentence invokes magic:

“If we make a set of additional assumptions about the form of P, then we can show that P must satisfy the basic axioms of probability… This book does not provide a comprehensive review of probability theory…”

The book continues on from there, and it’s great, but I am stuck on page 2. That incantation was a huge jump! We went from being able to compare our personal attitudes towards outcomes to “clearly all uncertainty is modeled as probability.” Is it that easy? I mean, I’m still stuck on the idea that we should be able to compare the plausibility of any two statements. No modeler can do that. Sure, maybe I can compare the plausibility of two malfunctioning events on my satellite. But what about the plausibility that the anomaly was caused by a blue alien versus an alien with four eyes? Is my belief about color stronger than my belief about physiology? The protomolecule in the Expanse was a blue crystal, so…
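To see how small the first step is (and how large the jump), here’s a toy sketch. The function names and the third statement are my own inventions, not the book’s. Comparability plus transitivity gives you a total preorder, and a preorder over finitely many statements can always be summarized by numeric ranks. But that’s all you get: an ordinal ranking. The “additional assumptions about the form of P” are doing all the work of turning ranks into actual probabilities.

```python
import functools

# A hypothetical comparison oracle: returns 1 if a is more plausible
# than b, -1 if less, 0 if equally plausible. The underlying ORDER
# table is made up for illustration.
ORDER = {"electrical anomaly": 2, "thruster anomaly": 1, "alien sabotage": 0}

def more_plausible(a, b):
    return (ORDER[a] > ORDER[b]) - (ORDER[a] < ORDER[b])

def plausibility_scores(statements):
    """Assign each statement a number consistent with the comparisons.

    This only works because the comparisons are complete (any two
    statements compare) and transitive, so sorting is well defined.
    """
    ranked = sorted(statements, key=functools.cmp_to_key(more_plausible))
    scores, score = {}, 0
    for i, s in enumerate(ranked):
        # Equally plausible statements share a score.
        if i > 0 and more_plausible(ranked[i - 1], s) != 0:
            score += 1
        scores[s] = score
    return scores

print(plausibility_scores(["alien sabotage", "electrical anomaly", "thruster anomaly"]))
# → {'alien sabotage': 0, 'thruster anomaly': 1, 'electrical anomaly': 2}
```

Note that the numbers 0, 1, 2 here are interchangeable with any other increasing sequence. Nothing in the two axioms pins down additivity, normalization, or how plausibilities of compound statements combine; that’s exactly what the extra assumptions on P have to supply.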

Look, I feel for Kochenderfer and his co-authors here. Because when you try to motivate probability as a rational foundation of uncertainty, you end up writing hundreds of pages of philosophy. Where did the particular model of subjective probability in this book come from? Kochenderfer cites E. T. Jaynes, a physicist who revolutionized our thinking about statistical mechanics and also convinced half of my grad school peers to become Bayesians. And if you read the first two chapters and the Appendix in Jaynes’ *Probability Theory*, you’ll see how thorny all of these issues are.

Jaynes does not assuage my concerns about comparability. Here’s his perplexing passage arguing for why all statements are comparable:

“For example, is it more likely that (A) Tokyo will have a severe earthquake on June 1, 2230; or that (B) Norway will have an unusually good fish catch on that day? To most people, the contexts of propositions A and B seem so different that we do not see how to answer this. But with a little education in geophysics and astrophysics, one realizes that the moon could well affect both phenomena, by causing phase-locked periodic variations in the amplitudes of both the tides and stresses in the Earth’s crust. Recognition of a possible common physical cause at work makes the propositions seem comparable after all.”

No it doesn’t! The only reasonable answer to Jaynes’ question is: “I don’t know.”

The upside of this (very) initial dive into the literature is that everyone is as confused as I am. It’s clear that in many contexts, probabilistic modeling is very *useful*. But in all of the rational choice literature, there’s an inherent tension between pragmatism and universality. The arguments tend to proceed aggressively.

The rationalist will say “I have found the best and only way to represent uncertainty.”

The critic will respond with an obvious hole in the formalism.

The rationalist will respond, “Ah, but *look at all we can do.* Let’s be pragmatic.” The critic begrudgingly agrees.

The next day, the rationalist returns to regaling everyone with how they have the best and only way to represent uncertainty.

I’m fine with being a pragmatist, but what if we just tried—*from the start*—to formulate a pragmatic theory and not get bogged down in normative battles? What would a pragmatic theory of decision making look like? Watch this space for some partial answers to these questions. The field is far richer than the rationalists would like us to believe.

## The Rational Landscapes

It has been a while since I read Jaynes, but isn't one of his underlying principles that nothing is actually random? I think his defense of comparability makes more sense if you accept his view that everything is deterministic, so probability only represents epistemic uncertainty, not some notion of true randomness.

Lots to argue with there, of course....

I've been eagerly reading your lecture blogging series and these follow-ups on uncertainty, thanks for writing them! They've put into words a lot of the struggles I have had with probability and uncertainty over the past few years.

I started thinking about modelling uncertainty seriously about 6 years ago while working on a research program with military scientists. One of their areas of interest was subjective logic - where you reason with "subjective opinions" that quantify your subjective uncertainty in a point probability estimate. You can then do some maths to e.g. replace the point probabilities in a Bayesian network with a probability distribution at each vertex - the variance of the distribution representing your uncertainty in that probability estimate. The "subjective" component comes from how you originally specify this variance - you use a beta distribution and treat the two parameters as pseudo-counts of True/False outcomes (assuming a binary variable), so with no observations you just get a flat prior ("no opinion"). They had worked on ways to incorporate human subjective opinions into this framework, to allow you to combine e.g. field agent reports with sensor data to arrive at a probability for some variable with a reasonable uncertainty around this probability. All very interesting, but it just felt like kicking the can up the road - you still have to define a "subjective" opinion mathematically, and if you use the pseudo-count approach it seems to be equivalent to asking your subjective human what their frequentist estimate of a variable's probability is...
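The pseudo-count idea described above can be sketched in a few lines. This is a toy illustration under my own naming, not the subjective-logic formalism itself: a binary variable's probability is modeled as a Beta(a, b) distribution, with a and b counting True/False observations on top of a flat Beta(1, 1) prior.

```python
# Toy sketch of the Beta pseudo-count representation of a "subjective
# opinion" about a binary variable. Function name is hypothetical.

def beta_opinion(true_count, false_count, prior=1.0):
    """Return (point probability, variance) of the Beta posterior.

    The mean plays the role of the aleatoric point estimate; the
    variance represents epistemic uncertainty about that estimate.
    """
    a = true_count + prior
    b = false_count + prior
    mean = a / (a + b)
    var = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, var

# No observations: flat prior, i.e. "no opinion" with maximal spread.
print(beta_opinion(0, 0))      # mean 0.5, variance 1/12

# Many observations at the same 50/50 rate: same mean, variance shrinks.
print(beta_opinion(50, 50))    # mean 0.5, much smaller variance
```

Which also makes the commenter's complaint concrete: eliciting the pseudo-counts from a human is just asking them for a frequentist estimate plus a sample size.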

Another framework we talked about a lot was the Rumsfeldian "known knowns" etc. This seems like a pretty good way of critiquing probability theory used to model some kinds of uncertainty. Both "aleatoric" and "epistemic" uncertainty come under the "known unknowns" category (in subjective logic the aleatoric uncertainty would be your point probability and the epistemic would be the variance of the distribution around that point - defined by your prior and how much count data you observed). But how can you model "unknown unknowns" with probability?

Herbert Weisberg's "Willful Ignorance: The Mismeasure of Uncertainty" uses the framework of ambiguity versus doubt - where ambiguity could cover both known and unknown unknowns (at least in my interpretation). His thesis is that probability theory was developed for situations of "doubt" (e.g. dice games with quantifiable states), but that most situations where it is now applied are rather "ambiguous", and probability is not necessarily suitable for these situations. Unfortunately the book needs a good edit and a firmer conclusion, otherwise I'm sure it would be more frequently cited in these kinds of discussions.

I've just joined <large famous tech company> and am in the process of being indoctrinated into their processes for decision making within the organisation. Most business decisions have to be made quickly with little data, and are made effectively without any reference to probability estimates! Some of these decision making frameworks may be of interest in this discussion.

Anyway apologies for the long note, excited to read more of your thoughts on this!