My Mathematical Mind

Sep 25, 2023

A call for a qualitative theory of machine learning

12 Comments

Sep 25, 2023

To your point about tasks where it's easy to write a program to solve it vs. having to rely on DL, we can see the same thing with the creation of complex patterns. If you didn't know, say, how to write reaction-diffusion equations and solve them, some relatively simple three-parameter images would look endlessly complex. But you can't describe them easily with typical equations; you need solved PDEs.

The lurking definition of parameter therein has troubled me since grad school.

Ben Recht

Sep 25, 2023

True, chaos is an interesting third case: We can define fractal images simply, but their structure is undecidably complex.

Jacob N Oppenheim

Sep 25, 2023

Question would be how different a case though? If we lacked knowledge of certain signal processing or language primitives wouldn't fitting other classes of data seem much more difficult?

Counterargument would be that the "simple" pde models can't be easily fit even when you know their form in advance?

Ben Recht

Sep 25, 2023

I'm not sure I understand what you mean, but when I think of things like chaos and turbulence, these are phenomena that are hard to predict even though we have reasonable models. So it's very different than "is this image a dog or a giraffe" which is a very simple problem but we can't write down a simple math program to solve it.

Jacob N Oppenheim

Sep 25, 2023

I think that's fair, but I don't think you need chaos here. Take a Turing Pattern for instance --- simple to generate from 2 paired PDEs with a handful of parameters --- but that doesn't mean we can easily extract the parameters from a picture of the pattern despite knowing the underlying model.

Doesn't "difficult to predict" in general mean that extracting the correct parameter values is very hard and that there's exponential sensitivity to them? So it may not be as different as you implied.
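
To make the "two coupled PDEs with a handful of parameters" point concrete, here is a minimal sketch of a forward simulation. It assumes the Gray-Scott reaction-diffusion model, which is one common Turing-pattern-style system and is not named in the comment; the grid size, step count, and the `feed`, `kill`, `du`, `dv` values are illustrative choices only. The forward direction is a few lines of code, while recovering `feed` and `kill` from the final profile is the hard inverse problem the comment describes.

```python
def gray_scott_1d(feed, kill, du=0.16, dv=0.08, n=64, steps=500, dt=1.0):
    """Explicit-Euler Gray-Scott reaction-diffusion on a periodic 1D grid.

    All parameter values are illustrative assumptions, not taken from the thread.
    """
    u = [1.0] * n
    v = [0.0] * n
    # Seed a small perturbation in the middle of the domain.
    for i in range(n // 2 - 3, n // 2 + 3):
        u[i], v[i] = 0.5, 0.25
    for _ in range(steps):
        new_u, new_v = u[:], v[:]
        for i in range(n):
            left, right = (i - 1) % n, (i + 1) % n
            # Discrete Laplacian with unit grid spacing.
            lap_u = u[left] - 2.0 * u[i] + u[right]
            lap_v = v[left] - 2.0 * v[i] + v[right]
            reaction = u[i] * v[i] * v[i]
            new_u[i] = u[i] + dt * (du * lap_u - reaction + feed * (1.0 - u[i]))
            new_v[i] = v[i] + dt * (dv * lap_v + reaction - (feed + kill) * v[i])
        u, v = new_u, new_v
    return u, v


# Same initial condition, slightly different kill rate.
u_a, v_a = gray_scott_1d(feed=0.035, kill=0.060)
u_b, v_b = gray_scott_1d(feed=0.035, kill=0.065)
print(max(abs(a - b) for a, b in zip(v_a, v_b)))
```

The printed number is the largest pointwise difference between the two final `v` profiles, a rough way to see how a small parameter change shows up in the output.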

Joao

Sep 15

Random question, Ben. I was reading your book https://mlstory.org/index.html and I had an (irrelevant) question on chapter 6 about generalization. It's about this passage: "These powerful concentration inequalities let us precisely quantify how close the sample average will be to the population average. For instance, we know a person’s height is a positive number and that there are no people who are taller than nine feet. With these two facts, Hoeffding’s inequality tells us that if we sample the heights of thirty thousand individuals, our sample average will be within an inch of the true average height with probability at least 83%. This assertion is true no matter how large the population of individuals. The required sample size is dictated only by the variability of height, not by the number of total individuals." Shouldn't the probability be at least 99.42% instead of 83%? I got 99.42% by computing 1-exp((-2*(30000)*(1/12)^2)/(9^2)).
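
The arithmetic in this question can be checked directly. The sketch below assumes the standard form of Hoeffding's inequality for the mean of n independent samples bounded in an interval of width w: the one-sided tail is at most exp(-2 n t^2 / w^2), and the two-sided tail is at most twice that. The function name `hoeffding_confidence` is made up for illustration. Heights are measured in feet, so one inch is t = 1/12 and the range is w = 9.

```python
import math


def hoeffding_confidence(n, t, width, two_sided=False):
    """Lower bound on P(sample mean is within t of the true mean).

    Assumes n independent samples, each confined to an interval of the
    given width. One-sided by default, as in the comment's formula.
    """
    tail = math.exp(-2.0 * n * t**2 / width**2)
    if two_sided:
        tail = min(1.0, 2.0 * tail)
    return 1.0 - tail


one_sided = hoeffding_confidence(30_000, 1 / 12, 9.0)
two_sided = hoeffding_confidence(30_000, 1 / 12, 9.0, two_sided=True)
print(f"one-sided: {one_sided:.2%}, two-sided: {two_sided:.2%}")
```

The one-sided value reproduces the 99.42% in the comment. Since "within an inch" is a two-sided statement, the two-sided bound of about 98.83% is arguably the more direct comparison; both are well above 83%.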

Michael Molin

Sep 25, 2023

A Unified Theory - Universal Language https://www.linkedin.com/pulse/unified-theory-consciousness-michael-molin/

rvenkat

Sep 25, 2023

Would you consider a naturalist's approach qualitative? For example this paper (https://www.nature.com/articles/s41586-019-1138-y) cites Ethology and Behavioral Ecology as a way to study machine behavior. And there is a field (not so popular in neuroscience these days) called neuro-ethology.

Are these kinds of methods appropriate, in your opinion?

Ben Recht

Sep 25, 2023

When I say qualitative, I refer to the study of people and their practice. I am personally not interested in using social science to study computers. I think we can learn a lot by looking at what machine learning researchers and engineers themselves do (and writing papers anthropomorphizing machines is definitely something they love to do).

rvenkat

Sep 25, 2023

Thanks for the clarification.

Maybe Donald MacKenzie's social study of finance community (https://mitpress.mit.edu/9780262633673/an-engine-not-a-camera/) or Barry Barnes' sociology of knowledge models (https://www.jstor.org/stable/42852643) are more like it.

Zoë Ruha Bell

Sep 26, 2023

MacKenzie has a lot of great stuff that I keep meaning to read, perhaps particularly relevantly a book using a sociological approach to study the interaction between computing and mathematical proof: https://mitpress.mit.edu/9780262632959/mechanizing-proof/

rvenkat

Sep 26, 2023

Thanks! You may also like this recent paper: https://www.journals.uchicago.edu/doi/abs/10.1086/697318

arg min
