Discussion about this post

Alyosha Efros:

Doesn't seem like a new issue. Back in the 2000s, most of those fancy graphical model papers performed worse than nearest neighbor baselines and no one cared. Another great example I always quote is Viola & Jones (2001) getting much more fame than Rowley, Baluja, and Kanade (1998) just because the former used a fancy novel algorithm.

Mark Nelson:

This is an interesting provocation! Here are two takes on it that I don't think entirely duplicate what's been said so far:

In the specific case of high-profile applied deep learning papers, I read these papers as implicitly making a kind of historical claim: the Deep Learning Revolution, which led to major progress on problems like facial recognition, protein folding, and machine translation, could revolutionize [our field X] too, if we adopt the new techniques, and [paper] presents evidence that it does. For that claim to hold up, the new techniques have to actually outperform the old ones. Any given paper's authors can of course say they aren't actually arguing this, but I think it's in the background of why people care so much about these papers in the first place.

In a more general sense, I think it is just genuinely interesting to know when you might "need" or even "want" a more complex model. Maybe it is interesting in somewhat the same way that reverse mathematics is, although the analogy isn't great because provable results are hard to come by here.

