7 Comments
Johan Ugander

Inspired by all your digging, I started looking into the history of cross-validation. Via Cosma Shalizi's excellent notes https://www.stat.cmu.edu/~cshalizi/ADAfaEPoV/ I discovered Stone (1974) "Cross-Validatory Choice and Assessment of Statistical Predictions". It contains a brief discussion of pre-1959 sample-splitting work (from the 30s, 40s, and 50s), including a series of papers published in a 1951 "symposium" on "The need and means of cross-validation". I'd love to hear your thoughts on this line of historical work, as it relates to Highleyman’s contributions!

Ben Recht

I'm on it! I will report back with what I find.

Erik

Platt scaling always seemed like some kind of duct tape (in a respectful way).

Ben Recht

John Platt is an endless source of clever and innovative ideas.

Mario Figueiredo

I have been thoroughly enjoying these recent posts of yours! They made me go back to an old favourite book from 1996, by Brian D. Ripley, "Pattern Recognition and Neural Networks". Unlike most ML books, but just like your PPA, it starts with a chapter on decision/prediction theory. Also unlike any other ML book I've read, Ripley gives credit to Highleyman: "The idea of a test set is sometimes called the hold-out method and goes back at least to Highleyman (1962)".

Ben Recht

Ripley's is such a great book. Very clear and very aligned with the old and new conceptions of pattern recognition.

Davis Yoshida

> There was tremendous excitement in the air, even if we were all deeply confused. Everyone was defining their own problem and pulled in different directions.

This could easily be describing the current LLM moment as well!
