7 Comments
Sarah Dean:

It's definitely interesting to consider why or when pure prediction (in the ML sense) is useful for science. One example that I think of often is the negative one: prediction of life outcomes (https://www.pnas.org/doi/10.1073/pnas.1915006117). What does it mean when our fanciest ML models and most extensive data to date *can't* make more accurate predictions than embarrassingly simple models? There are definitely ethical/validity implications of this observation (https://predictive-optimization.cs.princeton.edu/), but maybe there are also scientific ones? It's also interesting to consider the role of mass collaboration/prediction competitions in justifying such a negative result.

Ben Recht:

Hmm, I'm not sure there's anything deep to say here. In the Future of Families task, ML is no worse at prediction than anything else. It's possible that the social outcomes aren't actually predictable, especially given the messiness of the data (75% of the fields are missing!). I don't know why we'd think we should be able to predict someone's GPA to the second decimal point from a bunch of standardized survey questions. Did you have a different takeaway?

I also couldn't disagree more with that predictive optimization rant. Yikes. I might have to call that one out specifically when I write about Meehl's Clinical vs. Statistical Prediction Lecture.

Bill Taylor:

"Some criticize black-box predictive models as punting on scientific understanding, but we learn things when the predictions fail." Yes, but. When we are using large ML models and then trying to learn from their failures, we're studying something truly different. The underlying cause of such failures are by nature not human-understandable; and the improvements we make with more data or larger models are often nothing more than overtraining of the model to drive toward the single truth we are seeking... and not global truths which others will demand of the same model.

Take the example of an ML-based vision processor trying to detect human movers in a given space.

In our example, we may observe that a given model doesn't detect bicyclists very well. More training data with more labeled bicyclists may certainly improve the model on this axis, and we would rejoice... if we're in the bicyclist-detection business. But what has that updated training done to the other predictions... of pedestrians and resting/sitting/standing persons, for example?

In simpler software (or big software performing clearly defined functions) we can run regression tests to ensure we've not stepped backwards. In modern meta-models, I'm not sure we are very good at regression testing. I am sure we're overfitting a lot of models in the name of improving them.
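To make that concrete, here's a minimal sketch of what a per-class regression test for a retrained detector could look like. Everything here is hypothetical: the class names, the tolerance, and the arrays of held-out labels and predictions are made up for illustration.

```python
# Sketch: per-class regression test for a retrained detector.
# Hypothetical setup: y_true holds held-out labels; preds_old and
# preds_new are the model's predictions before and after retraining.
import numpy as np

CLASSES = ["pedestrian", "bicyclist", "sitting_person"]
TOLERANCE = 0.02  # maximum per-class recall drop we tolerate

def per_class_recall(y_true, y_pred):
    """Fraction of each class's instances the model labels correctly."""
    recall = {}
    for c in CLASSES:
        mask = y_true == c
        recall[c] = float(np.mean(y_pred[mask] == c)) if mask.any() else float("nan")
    return recall

def assert_no_regression(y_true, preds_old, preds_new):
    """Fail if retraining hurt any class's recall by more than TOLERANCE."""
    old = per_class_recall(y_true, preds_old)
    new = per_class_recall(y_true, preds_new)
    regressions = [
        f"{c}: {old[c]:.3f} -> {new[c]:.3f}"
        for c in CLASSES
        if new[c] < old[c] - TOLERANCE
    ]
    assert not regressions, "Recall regressed:\n" + "\n".join(regressions)
```

The point being: the bicyclist improvement is only trustworthy if the pedestrian and sitting-person numbers survive the retrain.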

Miguel:

> A core assumption throughout CS is that our software is bug-free.

This segues quite nicely into the discussion of expediency in science. Expediency shapes not only the questions we ask but also the degree to which we are critical when adopting the assumptions others make. And there are clear examples that such an assumption (about optimization software, simulators, etc.) is not a very safe one... if you get results that are too good to be true, try to replicate on a different library or middleware or whatever.

And definitely, software version numbers are hyperparameters :)
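As a toy illustration of that replicate-it-elsewhere advice, here's a sketch that recomputes the same least-squares fit with two independent implementations. The problem is made up; the point is the cross-check, not the solvers.

```python
# Sketch: sanity-check a result by recomputing it with an
# independent implementation of the same computation.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 5))   # made-up design matrix
b = rng.normal(size=100)        # made-up observations

# Implementation 1: closed-form linear algebra (numpy).
x_direct, *_ = np.linalg.lstsq(A, b, rcond=None)

# Implementation 2: generic iterative optimizer (scipy) on the same loss.
loss = lambda x: np.sum((A @ x - b) ** 2)
x_iter = minimize(loss, np.zeros(5), method="L-BFGS-B").x

# If two independent implementations disagree, distrust the software
# before you celebrate the science.
assert np.allclose(x_direct, x_iter, atol=1e-4), "implementations disagree"
```

If the assert trips, the "result" was a property of the code, not the data.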

David Chapman:

Reminded of Leo Breiman's *Statistical Modeling: The Two Cultures* (although your point about software correctness is not in there, as far as I recall)

Non Linear Panacea:

Words like “predict/infer” really give those models too many responsibilities.

Humans “predict/infer” something from a model. The model itself just takes input and produces output. When the output fails to match reality, we humans are wrong, and we learn something. The model does not need to bear any responsibility for advancing human science. Maybe “fitting/approximating” is a better word.

A polynomial or autoregressive model does not need to save humanity. Computation is computation.

Non Linear Panacea:

No one uses a “universal prediction theorem.”
