4 Comments

I once searched troves of stats, ML, and phil-sci literature for a general definition of overfitting. There is none. The best that can be said is that it is a relation between a fitting method, a model, and data: if someone doesn't like the relation, they call it overfitting. ML people then decided to confuse me even more and renamed a perfect fit to "(benign) overfitting."


Charles Isbell had a very concrete definition of overfitting when I took ML from him in 2010: "when the training error continues to decrease, but the testing error increases." So, interestingly, it's also tied to holdout data.

He taught using Tom Mitchell's book, but I'm not sure the definition is found there.

I've continued to use this as the definition in intro ML contexts when I'm teaching, to avoid the hand-waving issue. But I've never loved it, and it really doesn't make sense in the more recent context of phenomena like double descent in neural nets.
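That train-vs-test definition is easy to see in a toy experiment. Here is a minimal sketch (my own illustration, not from the class or the post): fit polynomials of increasing degree to noisy data with a simple holdout split, and watch training error keep falling while held-out error eventually rises. All data and names here are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of a smooth function.
x = np.linspace(-1.0, 1.0, 40)
y = np.sin(3.0 * x) + 0.3 * rng.standard_normal(x.size)

# Simple holdout split: even indices train, odd indices test.
x_tr, y_tr = x[::2], y[::2]
x_te, y_te = x[1::2], y[1::2]

degrees = range(1, 13)
train_mse, test_mse = [], []
for d in degrees:
    coeffs = np.polyfit(x_tr, y_tr, d)  # least-squares polynomial fit
    train_mse.append(np.mean((np.polyval(coeffs, x_tr) - y_tr) ** 2))
    test_mse.append(np.mean((np.polyval(coeffs, x_te) - y_te) ** 2))

# Per the definition: training error shrinks with capacity,
# while held-out error eventually turns upward.
print(f"train MSE, degree 1 -> 12: {train_mse[0]:.3f} -> {train_mse[-1]:.3f}")
print(f"test  MSE, degree 1 -> 12: {test_mse[0]:.3f} -> {test_mse[-1]:.3f}")
```

Under this definition, the model is "overfitting" once the test curve turns up even as the train curve keeps dropping; double descent is exactly the regime where that picture breaks down at still higher capacity.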


100% agree with you. I'll write more on this topic soon.


Looking forward to reading an installment on why overfitting doesn't exist.
