Discussion about this post

Badri:

On the monotonicity: could it be explained by importance sampling? Test error is an aggregate metric, and maybe what we are looking at is an importance-sample-weighted average that accounts for the distribution difference.
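To make the importance-sampling idea concrete, here is a minimal sketch on synthetic data (the distributions, error indicator, and numbers are all illustrative assumptions, not from the post): per-example errors measured under a source distribution are reweighted by the likelihood ratio w(x) = p_target(x) / p_source(x) to estimate the error under a shifted target distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic example: 0/1 errors of a hypothetical model on test points
# drawn from a source distribution N(0, 1).
n = 10_000
x = rng.normal(0.0, 1.0, n)           # test inputs ~ source
errors = (x > 1.0).astype(float)      # assumed error indicator per example

# Importance weights w(x) = p_target(x) / p_source(x),
# with target distribution N(0.5, 1).
def normal_pdf(x, mu, sigma=1.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

w = normal_pdf(x, 0.5) / normal_pdf(x, 0.0)

plain_error = errors.mean()                      # error under the source
weighted_error = (w * errors).sum() / w.sum()    # self-normalized IS estimate
```

Here the target distribution puts more mass past the error threshold, so the weighted estimate comes out higher than the plain average; the point is just that the same per-example errors aggregate to different test errors under different weightings.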

Chris:

I'm a little surprised, though not overly, at the monotonicity. It makes sense that the only way to do well on every test set is to learn the precise concept, and by that I mean not just the task but also the selection criteria and so on. What's harder for me to fathom is how narrowly *linear* the relationship is: for every X errors on the original test set there will be mX errors on some new set. Why would it be so precise?
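The "mX errors" claim can be illustrated with a tiny synthetic sketch (the numbers, the slope m, and the noise level are all made up for illustration): if each model's error on a new test set really were m times its error on the original set plus noise, a least-squares fit through the origin would recover m.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical error rates of 50 models on the original test set.
old_err = rng.uniform(0.05, 0.40, 50)

# Assume the new test set's errors follow new = m * old plus small noise.
m_true = 1.8
new_err = m_true * old_err + rng.normal(0.0, 0.01, 50)

# Least-squares slope through the origin: m_hat = sum(x*y) / sum(x*x).
m_hat = (old_err * new_err).sum() / (old_err ** 2).sum()
```

With noise this small the fitted slope lands very close to the assumed m, which is what a tight linear error relation between two test sets would look like empirically.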

Gael Varoquaux once explained it to me like this: suppose your dataset is undersampled, and say it spans only some affine subspace, or some sub-manifold. Do you conclude that your model describes only that affine space or that manifold? No. Why? Because if there were a meaningful concept that applies only to that subspace, and that differs from the ambient concept, then in order to discover it you would have had to sample from a zero-measure subset, and that doesn't happen. The only thing that has any real chance of happening is missing small modes.

