6 Comments
Carl Boettiger:

Great post Ben! Really enjoy your clear explanations. I think (?) I recently encountered another example of this ill-conditioning effect in my work on optimal control in fisheries management and conservation problems, where the model preferred by adaptive learning (the one that makes the most accurate predictions) still leads to a worse policy, both economically and ecologically, in what I referred to as 'the forecast trap': https://doi.org/10.1111/ele.14024.

Ben Recht:

Thanks, Carl! I'm looking forward to reading this.

rvenkat:

I am looking at various climate/weather models and associated policy recommendations. Do you think whatever you've been blogging about in this series of posts has straightforward implications for debates over climate change mitigation strategies?

Ben Recht:

This is a good question and one I hope to come back to later in this series. What is the role of predictive models of catastrophic events that are deeply uncertain and have dozens of modeling degrees of freedom? I don't have answers, but I have many more questions that I'll raise in a future blog.

Zach:

This seems somewhat related to the bias-variance tradeoff, particularly Stein's paradox, where allowing bias in your estimates in high dimensions reduces variance and lowers overall error.
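(The Stein effect mentioned above is easy to check numerically. A minimal sketch, not from the post itself: the positive-part James-Stein estimator shrinks the raw observation toward zero and beats the unbiased estimate in total squared error once the dimension is at least 3.)

```python
import numpy as np

rng = np.random.default_rng(0)
d, trials = 20, 2000
theta = rng.normal(size=d)  # fixed true mean vector

mse_mle, mse_js = 0.0, 0.0
for _ in range(trials):
    x = theta + rng.normal(size=d)  # one noisy observation per coordinate
    # Unbiased estimate: just use x.
    mse_mle += np.sum((x - theta) ** 2)
    # Positive-part James-Stein: shrink toward zero. Biased in every
    # coordinate, yet lower total squared error for d >= 3.
    shrink = max(0.0, 1.0 - (d - 2) / np.sum(x ** 2))
    mse_js += np.sum((shrink * x - theta) ** 2)

print(mse_js / trials < mse_mle / trials)  # shrinkage wins on average
```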

The example from your second 2020 blog, where you describe certainty equivalence and its optimality under certain conditions, was helpful; I had been trying to find references on that topic since your earlier post.

Roy Fox:

> You can fix this by pretending your data is bad, or you can fix this by better understanding your broken model.

There's a 3rd option: curb your optimizer. The idea follows E. T. Jaynes: if you know that you're going to end up with a suboptimal policy (because of model uncertainty, limited data, early stopping, unknown unknowns, whatever), then don't optimize it too hard, thus avoiding the exact issue you raise here. In RL this has many names (max-entropy, energy-based, etc.), but I like to call it Bounded RL. It can be done model-based (https://link.springer.com/chapter/10.1007/978-3-642-24647-0_3) or model-free (https://royf.org/pub/pdf/Fox2016Glearning.pdf), and is the principle behind some great algorithms like SAC.
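(The "don't optimize too hard" idea above is often implemented as a Boltzmann/softmax policy over action values with an inverse temperature beta. A minimal sketch under hypothetical action values, not taken from the linked papers: small beta keeps the policy spread out and hedged against model error, large beta recovers the greedy argmax.)

```python
import numpy as np

def soft_policy(q, beta):
    """Boltzmann policy over action values q.

    beta is the inverse temperature: small beta gives a heavily
    regularized, near-uniform policy; large beta is near-greedy.
    """
    z = beta * (q - q.max())  # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

# Hypothetical action values under a shaky model: actions 0 and 1 look
# nearly tied, so a hedged policy should not commit fully to either.
q = np.array([1.0, 1.1, 0.2])

print(soft_policy(q, beta=0.5))   # regularized: mass spread across actions
print(soft_policy(q, beta=50.0))  # near-greedy: almost all mass on argmax
```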
