The Cost of Modeling
On the abstraction boundary between mathematical optimization problems and what they model.
This is the second part of the live blog of Lecture 6 of my graduate class “Convex Optimization.” A Table of Contents is here.
The promise of optimization is a clean abstraction between the math and the modeling (again, this is a key reason why I’m much more excited to be teaching optimization than machine learning). Once you hand a solver a model, the problem solving is purely deductive. Either the problem is convex or it is not. Either the problem is well-conditioned, or it is not. Either I can guarantee efficient solving, or I can’t. There is no guesswork about what the future looks like. That’s the modeler’s problem.
But all of the models themselves, as we are constantly reminded by the ghost of George Box,1 are wrong. Every time I go through examples of optimization problems, I’m struck by how many idealizations need to be made to get a tractable problem. I want to belabor this point with the simplest example that every linear programming class starts with: the minimum cost diet problem.
Educators love the diet problem because it’s easy to describe and, at first blush, is clearly linear. I start with a list of nutritional requirements that my diet must satisfy. For example, I could specify a maximum number of calories, a minimum amount of vitamins, and a minimum amount of protein. I then get a list of foods from my diet tracking app and look at their nutritional content. This content is proportional to weight: every 100g of spinach contains 3g of protein, 2g of fiber, 100mg of vitamin C, etc. I can make a similar nutrient content list for every food I might consider consuming.
A nutritional requirement is then a linear constraint on the allocation of foods in my diet. If I eat twice as much of some food, I get twice the nutrients. If I eat two foods, I get the sum of their nutrients. Moreover, if I am buying this food slurry in bulk from the Berkeley Bowl, the price is proportional to the weight as well, so the cost is additive. With enough assumptions, I can formulate finding the diet of minimum cost that satisfies my nutritional requirements as a linear program.
But is it really? The variation of nutrients in food is unknown. In packaged foods, the USDA allows for 20% errors in calorie count alone. And fresh food is much more highly variable. The nutritional content of an apple varies with size, ripeness, and variety. Tart apples have considerably more Vitamin C than sweet ones, and the amount of vitamin C decays over time after the apple is picked.
And what about the constraints on nutrients? How do we set those? The USDA minimum nutrition guidelines all contain guesswork. Recommended allowances of nutrients are based on averages, observing how much of the consumed nutrients come out the other side (yeah, I know, it’s gross). Many of the daily allowances are inflated by engineering factors just to be safe. If you really care about minimum cost, maybe you eat less than the recommended allowances. If you’re a multivitamin parishioner, maybe you want to exceed these numbers by a factor of 10.
I could go on.
Ironically, the diet problem was invented by economist George Stigler to argue that the entire idea of a minimum cost diet was an absurdity. Stigler, who would go on to be an influential figure in the Chicago School and would win the economics fake Nobel prize in 1982, was a rabid antiregulation, small government conservative. In his 1945 polemical screed, “The Cost of Subsistence,” he argued that USDA recommendations about minimum cost diets weren’t scientific. In the process, he wrote up one of the first linear programs—several years before George Dantzig would coin the term. But Stigler’s linear programming solution was polemical. He argued that there was an “almost infinite complexity of a refined and accurate assessment of nutritive value of a diet.” Moreover:
“...the particular judgments of the dieticians as to minimum palatability, variety, and prestige are at present highly personal and non-scientific, and should not be presented in the guise of being parts of a scientifically-determined budget… these cultural judgments, while they appear modest enough to government employees and even to college professors, can never be valid in such a general form.”
Stigler’s proposed minimum cost diet was particularly grotesque. Every day you get to eat one pound of flour, two ounces of evaporated milk, five ounces of cabbage, one ounce of spinach, and five cans of navy beans. Seasonings add to the cost, so you can’t have any of those. Have fun with that. But it cost 39.93 per year in 1939.2 It’s closer to 800 dollars a year today. Maybe this could work for you FIRE folks out there. Just put it in a blender and drink it like Soylent. For everyone else, this diet is a war crime.
Even though it was introduced as a polemical screed against the absurdity of government nutrition recommendations, we now start every linear programming class with the diet problem as an example. Even the simplest optimization model we present in our courses is loaded with approximations, idealizations, and value judgments. What do you think happens with any of the more complicated optimization problems we discuss? In class yesterday, I also introduced Markowitz portfolio optimization, which requires modeling the future returns on investments. Portfolio optimization rests on far shakier ground than diet planning.
A central lesson I want us to take away from this semester is that just because we can find optimal solutions to well posed optimization problems, the problems themselves are all idealized models. We have to interpret their policy suggestions accordingly. How optimization models are useful is a domain specific question that is necessarily outside the scope of the course. Having laid out my necessary caveat emptor, we can safely dig into the abstract convex geometry of duality theory.
Today’s blog only talks about dead dudes named George.
George Dantzig would later lead a team to solve Stigler’s LP exactly, finding that the optimal solution was 39.69. By hand, Stigler had solved an LP with 86 nonnegative variables and 9 equality constraints and was off by less than 1%. In machine learning today, we consider that solved to optimality.
My dad asked me to do the same experiment for him, but to minimize calories instead of cost. He came to the same conclusion that such a diet is a war crime. But interestingly, this variant of the Stigler diet put loads of spices, so much so that I had to explicitly constraint the quantity of each spice.
https://www.jeremykun.com/2017/09/24/linear-programming-and-healthy-diets-part-2/