Discussion about this post

Jacob N Oppenheim

To your point about tasks where it's easy to write a program to solve them vs. having to rely on DL, we can see the same thing in the creation of complex patterns. If you didn't know, say, how to write down reaction-diffusion equations and solve them, some relatively simple three-parameter images would look endlessly complex. But you can't describe them easily with typical equations; you need solved PDEs.

The lurking definition of parameter therein has troubled me since grad school.

Joao

Random question, Ben. I was reading your book https://mlstory.org/index.html and I had an (irrelevant) question on chapter 6 about generalization. It's about this passage: "These powerful concentration inequalities let us precisely quantify how close the sample average will be to the population average. For instance, we know a person’s height is a positive number and that there are no people who are taller than nine feet. With these two facts, Hoeffding’s inequality tells us that if we sample the heights of thirty thousand individuals, our sample average will be within an inch of the true average height with probability at least 83%. This assertion is true no matter how large the population of individuals. The required sample size is dictated only by the variability of height, not by the number of total individuals." Shouldn't the probability be at least 99.42% instead of 83%? I got 99.42% by computing 1 - exp(-2 * 30000 * (1/12)^2 / 9^2).
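
For anyone who wants to check the arithmetic in the comment above, here is a minimal sketch. It assumes heights are bounded in [0, 9] feet, that "within an inch" means a deviation of 1/12 foot, and that the samples are i.i.d.; the function name `hoeffding_confidence` is made up for illustration. It evaluates both the one-sided form used in the comment and the two-sided form, which picks up a factor of 2 on the tail.

```python
import math

def hoeffding_confidence(n, t, lo, hi, two_sided=True):
    """Hoeffding lower bound on the probability that the sample mean of n
    i.i.d. draws in [lo, hi] deviates from the true mean by less than t
    (in one direction if two_sided is False, in either direction otherwise)."""
    tail = math.exp(-2 * n * t**2 / (hi - lo) ** 2)
    if two_sided:
        tail *= 2  # union bound over the upper and lower deviations
    return 1 - tail

n = 30_000   # sample size from the quoted passage
t = 1 / 12   # one inch, expressed in feet (assumed units)

one_sided = hoeffding_confidence(n, t, 0.0, 9.0, two_sided=False)
two_sided = hoeffding_confidence(n, t, 0.0, 9.0, two_sided=True)

print(f"one-sided: {one_sided:.4f}")  # 0.9942
print(f"two-sided: {two_sided:.4f}")  # 0.9883
```

Under these assumptions the one-sided bound reproduces the 99.42% figure, and the two-sided bound (the one that matches "within an inch" in either direction) comes out near 98.83%. Both are above 83%, so "at least 83%" still holds as a lower bound, but it is looser than what this formula gives.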
