Discussion about this post

Ani:

Hi, very interesting blog.

1. The term "memorization" seems to get thrown around a lot. What is the difference between memorization and generalization? (A small sketch of the contrast appears after this list.)

2. Is there any theory of what happens when the i.i.d. assumption is broken? For example, something that mathematically quantifies how badly the i.i.d. assumption is violated and then provides a sample-complexity guarantee?

3. How do over-parameterization and over-fitting relate to generalization theory?

4. Are there any instances (even hypothetical ones) where more data is not good? Perhaps if you've built your model under the i.i.d. assumption and it does not hold in practice, the model just "collapses"? (See the second sketch after this list.)

5. Also, how does the feature representation affect generalization bounds?
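
A minimal sketch of the contrast in question 1, in the spirit of Zhang et al.'s (2017) random-label experiments; the synthetic task and model below are illustrative stand-ins, not anything from the post:

```python
# Memorization vs. generalization: a network can fit labels that carry
# no signal ("memorization"), but only fitting true labels transfers
# to held-out test data ("generalization").
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
y_rand = rng.integers(0, 2, size=y_tr.shape)  # labels with no signal at all

for name, labels in [("true labels  ", y_tr), ("random labels", y_rand)]:
    clf = MLPClassifier(hidden_layer_sizes=(512,), max_iter=2000,
                        random_state=0).fit(X_tr, labels)
    print(f"{name}: train acc {clf.score(X_tr, labels):.2f}, "
          f"test acc {clf.score(X_te, y_te):.2f}")
# Expected pattern: high train accuracy in both rows, but test accuracy
# near chance (~0.5) with random labels -- pure memorization.
```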
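
And a minimal sketch of the failure mode in question 4, where extra training data generated under a shifted labeling rule hurts performance on the distribution you actually care about (again, the synthetic setup is an illustrative assumption):

```python
# "More data is not good": 2,000 extra samples labeled by a shifted rule
# drag the decision boundary away from the i.i.d. test distribution, so
# the larger training set scores worse than the small clean one.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def sample(n, shift=0.0):
    """Binary task: label is 1 iff x[0] > shift (boundary moves with shift)."""
    X = rng.normal(size=(n, 2))
    y = (X[:, 0] > shift).astype(int)
    return X, y

X_tr, y_tr = sample(200)              # on-distribution training data
X_te, y_te = sample(2000)             # on-distribution test data
X_sh, y_sh = sample(2000, shift=1.5)  # "more data", from a shifted rule

small = LogisticRegression().fit(X_tr, y_tr)
big = LogisticRegression().fit(np.vstack([X_tr, X_sh]),
                               np.concatenate([y_tr, y_sh]))
print("200 i.i.d. samples:        ", small.score(X_te, y_te))
print("200 i.i.d. + 2000 shifted: ", big.score(X_te, y_te))
# The contaminated fit typically drops sharply on the i.i.d. test set:
# the extra data pulls the learned boundary toward shift = 1.5.
```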

Sam:

What are your thoughts on modern theory like Arora and Goyal's "A Theory for Emergence of Complex Skills in Language Models" (https://arxiv.org/abs/2307.15936)? Should we see more work like this? Less?

Generalization research in AI seems to be in the "extraordinary science" phase, to use the words of Kuhn. Theoretical progress from here will require recourse to philosophy and first principles, as well as cross-disciplinary interactions with psychology and cognitive science.
