I once searched troves of stats, ML, and phil-of-sci literature for a general definition of overfitting. There is none. The best that can be said is that it is a relation between a fitting method, a model, and data: if someone doesn't like the relation, they call it overfitting. ML people then decided to confuse me even more by renaming perfect fit (interpolation) to "benign overfitting."
Charles Isbell had a very concrete definition of overfitting when I took ML from him in 2010: "when the training error continues to decrease, but the testing error increases." So, interestingly, it's also defined in terms of a holdout set.
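Concretely, here's a minimal sketch of that criterion (assuming a synthetic 1-D polynomial-regression setup of my own, not anything from the course): as model capacity grows, training error keeps falling while holdout error eventually turns back up.

```python
# Sketch of the hold-out definition of overfitting: train MSE falls
# monotonically with capacity, while test MSE bottoms out and rises.
# Synthetic data and degree choices are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 40)
y = np.sin(3 * x) + rng.normal(0, 0.2, 40)   # noisy ground truth

x_tr, y_tr = x[:30], y[:30]                  # training split
x_te, y_te = x[30:], y[30:]                  # hold-out split

for degree in [1, 3, 5, 9, 15]:
    coefs = np.polyfit(x_tr, y_tr, degree)   # fit polynomial of this capacity
    tr_err = np.mean((np.polyval(coefs, x_tr) - y_tr) ** 2)
    te_err = np.mean((np.polyval(coefs, x_te) - y_te) ** 2)
    print(f"degree {degree:2d}: train MSE {tr_err:.4f}, test MSE {te_err:.4f}")
```

The crossover point, where the train column is still dropping but the test column starts climbing, is exactly what the definition keys on.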
He taught using Tom Mitchell's book, but I'm not sure the definition is found there.
I've continued to use this as the definition in intro ML contexts when I'm teaching, to avoid the hand-waving issue. However, I've never loved it, and it really doesn't make sense in the more recent context of phenomena like double descent in neural nets.
100% agree with you. I'll write more on this topic soon.
Looking forward to reading an installment on why overfitting doesn't exist