16 Comments

Wonderful topic. I think I reached a point where I would rarely recommend ML for anything with high stakes or high risks. This contrasts so much with my enthusiasm for ML when I started my PhD 15 years ago 😳.

I can’t believe you waited until AFTER I left Berkeley to teach this.

tbf, your class this semester seems pretty fun...

Great topic, guys. Looking forward to reading more about this.

I'm definitely interested in this topic as it applies to science. "Everyone" (okay, many people) tout AI as revolutionizing science. Hell, there were two Nobel Prizes awarded this year in AI/ML! But I'm not convinced that applying ML or AI to scientific questions is that straightforward. It's definitely not for traditional computational problems.

I don't disagree, but wonder what you mean when you say it's not for traditional computational problems. What are the examples you have in mind?

I was thinking of the FFT. It would be nonsense to replace an FFT with a neural net to compute the DFT of a vector. The precision would be awful, the compute time terrible compared to the FFT (that's *after* training), and you'd need to train the network with a bunch of data in the first place.
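
Here's a rough sketch of the kind of comparison I mean (purely illustrative; the architecture, training budget, and sizes are arbitrary choices of mine, and it assumes NumPy and PyTorch are available):

```python
# Rough, illustrative sketch: train a small MLP to approximate the DFT of
# length-64 real vectors and compare its precision to np.fft.fft.
import numpy as np
import torch
import torch.nn as nn

n = 64
rng = np.random.default_rng(0)

def dft_targets(x):
    """Exact DFT via the FFT, with real and imaginary parts stacked."""
    X = np.fft.fft(x, axis=-1)
    return np.concatenate([X.real, X.imag], axis=-1)

# "Training data": random vectors and their exact transforms.
x_train = rng.standard_normal((10_000, n)).astype(np.float32)
y_train = dft_targets(x_train).astype(np.float32)

model = nn.Sequential(nn.Linear(n, 512), nn.ReLU(), nn.Linear(512, 2 * n))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
xb, yb = torch.from_numpy(x_train), torch.from_numpy(y_train)

for _ in range(1_000):                      # a modest (and not cheap) training budget
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(xb), yb)
    loss.backward()
    opt.step()

# Compare precision on fresh inputs.
x_test = rng.standard_normal((100, n)).astype(np.float32)
with torch.no_grad():
    y_pred = model(torch.from_numpy(x_test)).numpy()
err = np.abs(y_pred - dft_targets(x_test)).max()
print("max abs error of the net vs. the exact DFT:", err)
# Expect this to come out many orders of magnitude worse than the FFT's
# near-machine-precision accuracy, after far more compute than np.fft.fft needs.
```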

I do think that there might very well be situations (or computational models) in which all that overhead is worthwhile or sensible. For instance: you have a lot of training data readily available, and you're willing to spend the computational time to train a network in exchange for fast evaluation on new data (faster than running a standard scientific computing solver, for instance). And you're not that concerned about accuracy.

That computational model I described above is really different from standard scientific computing, in which you don't have a bunch of training data, you do care tremendously about accuracy, and you might be willing to trade off running time for some accuracy. But, mainly, you don't have lots of instances of solutions to a PDE; you want to compute the solution to that PDE.

I am curious. I am teaching a Generative AI course, and evaluation is the biggest challenge for such systems. Don't ask me whether I fully know what I'm doing either, but I'm happy to share if you're curious. Looking forward to your posts.

Is this a grad or undergrad class? Would you mind sharing your syllabus?

Grad. I'll email you the syllabus. I started with PCA and PPCA and voila, latent spaces already appear. :)
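
If it helps anyone following along, here's a minimal sketch of what I mean (my own toy example, not course material): PCA is just an SVD of the centered data, and the projected coordinates are already a latent space.

```python
# Toy NumPy-only sketch: the projected coordinates Z form a latent representation.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10)) @ rng.standard_normal((10, 50))  # rank-10 data in 50-d

mu = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mu, full_matrices=False)
k = 10
Z = (X - mu) @ Vt[:k].T          # latent codes, shape (200, k)
X_hat = Z @ Vt[:k] + mu          # decode back from the latent space
print("max reconstruction error:", np.abs(X_hat - X).max())  # tiny here, since rank(X) = 10
```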

Will you all be releasing lecture videos? Or is there a way to audit the course for those of us in the Bay Area?

There's no video, but I'll be sporadically blogging here and hope to release something more detailed at the end of the semester. Perhaps I'll do a reprise of the class in a more open setting if it goes well.

You’d do a great service to the world by recording the lecture :)

Thank you. I'm trying to structure the class as a seminar, and the format doesn't lend itself to video.

Though maybe it would be a new pedagogical innovation if I streamed the seminar on Twitch...

I've recently started thinking of "machine learning" systems (especially LLMs) as "impostorhood evasion machines". That is, the machines are impostors by construction, and the additive (i.e., summed over examples) loss functions are designed to make them evade detection through statistical means. For the last 50 years, this has been hard enough that people have forgotten there is anything more to "AI" than the part about evading detection by statistical means. Or "certain" statistical means, at any rate.

However, now that there are solutions to this problem that are approaching maturity (i.e., diminishing returns and practical utility), people are immediately finding other ways of detecting impostorhood. Twitter/X is full of examples, especially if you follow people with an interest in that. Most of these have to do with spotting inconsistencies, and an inability to resolve them when confronted. Now the question is whether these new means of detecting impostorhood are statistical at all and, if they are, how they are different.

On the one hand, I'm inclined to think that certain tests are sufficient by themselves to expose impostorhood, meaning they are not "statistical", but really that just means they don't depend on an average. A "statistic" is just an aggregation of a sample that exposes a property of interest, and the presence of a single dispositive test fits that description. On the other hand, as humans we often face devastating rhetorical attacks exposing some degree of impostorhood, and eventually recover our standing. I think that's because for humans, impostorhood is generally rectifiable, whereas for a machine system it may not be. Maybe it's because when a human is an "impostor", the thing they're an impostor of is a narrowly scoped role or skillset, not sentience itself. There's also the fact that even when humans fail a consistency check, we have the other thing - self-awareness - that implies a path to rehabilitation, and LLMs regularly fail not only on consistency but also on self-awareness when called out.

How to use this concept in evaluating systems? For one thing, if the average case is all you care about, then by all means, optimize for it. If you care about the worst case, then statistical learning theory doesn't have much to offer except hardness results. There could be a third way, which is something like regret analysis, where the thing you try to bound is, "given one occurrence of an asymptotically worst case outcome, what is the probability of a recurrence?"
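
If it helps, here is one way I'd write the three options down (my own sketch of the idea, not standard notation):

```latex
% My own sketch, not standard notation: \ell is a bounded loss, f the system.
\begin{align*}
  \widehat{R}_{\mathrm{avg}}(f) &= \frac{1}{n}\sum_{i=1}^{n} \ell\big(f(x_i), y_i\big)
    && \text{average case: what additive losses optimize} \\
  R_{\mathrm{wc}}(f) &= \sup_{(x,y)} \ell\big(f(x), y\big)
    && \text{worst case: mostly hardness results} \\
  \rho_c(f) &= \Pr\!\big[\,\ell_{t+1} \ge c \;\big|\; \ell_t \ge c\,\big]
    && \text{``third way'': chance a level-}c\text{ failure recurs}
\end{align*}
```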

Discussing this further offline makes me realize a couple more things. One is that, while the effort is to quantify, in interesting application domains there are often qualitatively different outcomes that need to be either avoided or handled. Traditional statistical learning theory has punted this onto asymptotic worst-case analysis, finding only that that is difficult, or impossible, in PAC theory. If, aside from quantifying machine-system fitness, we can qualify it, we can then approach each of the qualitatively different categories in turn. In other words, we don't care so much about the *risk* of an outcome per se as we do about *whether it's in the support set at all*. Note that outcomes of a different asymptotic order of badness often have entirely different constructions from modal cases.

For LLMs, the qualitative categories are in the product space of, "Is it accurate?", "Is it internally self consistent?", and "Is it able to reconcile situations where its sub-perfections are called out to it?"

Most engineering practices can be described as the search for a configuration of a system such that all of its performance criteria are contained within some specified tolerances, i.e., a feasible set. The search for such a configuration may be quantitative, but the task itself is inherently qualitative: the system is either within tolerance or it is not. Other ancillary tasks may also be quantitative, such as quantifying the sensitivity to external perturbations, the severity or risk of departures, or the cost of mitigation if the system does depart from tolerance.
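
A throwaway sketch of what I mean by "inherently qualitative" (the criteria and tolerances here are made up by me):

```python
# Toy sketch with made-up criteria and tolerances: each criterion is measured
# quantitatively, but feasibility itself is a yes/no question.
TOLERANCES = {                      # criterion -> (lower bound, upper bound)
    "accuracy":           (0.95, 1.00),
    "p99_latency_ms":     (0.0, 250.0),
    "hallucination_rate": (0.0, 0.01),
}

def within_tolerance(measurements: dict) -> bool:
    """True iff every measured criterion lies inside its tolerance band."""
    return all(lo <= measurements[name] <= hi
               for name, (lo, hi) in TOLERANCES.items())

print(within_tolerance({"accuracy": 0.97,
                        "p99_latency_ms": 180.0,
                        "hallucination_rate": 0.02}))   # False: one criterion is out
```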

The other thing is that I'm not aware of any machine learning system that can inherently distinguish between originality and regurgitation. There simply is no structural distinction, and researchers are left having to invent post-hoc measures. Current LLM benchmarks are absolutely plagued by test set contamination, to the point where it's commonly understood that it doesn't matter if the exact questions don't occur in the training set, as long as the general *kind* of questions do. Human benchmarks are wholly inappropriate because the impostor machines have long since surpassed the human ability to regurgitate, and at that level, regurgitative methods can successfully evade detection via statistical averages in ways that human impostors cannot.
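
For concreteness, the kind of crude post-hoc measure I mean looks something like this (my own toy sketch, a verbatim n-gram overlap check that the "same kind of question" problem easily defeats):

```python
# Toy post-hoc contamination check (my own sketch): flag benchmark items whose
# word n-grams also appear verbatim in the training corpus. It says nothing
# about items that merely resemble the *kind* of questions seen in training.
def ngrams(text: str, n: int = 8) -> set:
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def contamination_score(benchmark_item: str, training_docs: list[str], n: int = 8) -> float:
    """Fraction of the item's n-grams that occur verbatim somewhere in training."""
    item_grams = ngrams(benchmark_item, n)
    if not item_grams:
        return 0.0
    train_grams = set().union(*(ngrams(doc, n) for doc in training_docs))
    return len(item_grams & train_grams) / len(item_grams)
```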

Given that a system inherently cannot say whether it's leveraging what it's heard vs. genuinely synthesizing a novel solution, or, given a solution, cannot articulate (other than as an impostor) which parts of the solution were borrowed vs. novel/synthetic, it may not be realistic to evaluate it in the way we would evaluate a human worker. After all, it was *designed* from the start to evade detection.
