Cool Kids Keep

Nov 27, 2023

On the academic imperialism of reinforcement learning.

12 Comments

Nov 29, 2023

You're telling me that if I upload GPT-N to a robot's brain and start running PPO, it won't struggle to its feet moments later? And half an hour later it won't be running at 20 miles per hour?

Ben Recht

Nov 29, 2023

For which value of N?

Sarah Dean

Nov 29, 2023

N >> 4

Maxim Raginsky

Nov 27, 2023

Here's a spicy take: Sutton and Barto's book has completely ruined a generation of researchers. Among other things, they barely mention partially observed scenarios. I was shocked to discover that some of my colleagues who work on RL don't see a problem with using uncompressed histories of observations and actions when working with POMDPs, and the mention of belief state elicits blank stares.
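
For readers who have not met the term: a belief state is a probability distribution over the hidden state, updated recursively so that its size stays fixed no matter how long the observation-action history grows. The snippet below is a minimal sketch of that update for a finite POMDP. The function name, the table layout, and the two-state "listen" model are all hypothetical and chosen only for illustration; none of it comes from the comment or from Sutton and Barto.

```python
def belief_update(belief, transition, observation_prob, action, obs):
    """One Bayes-filter step for a finite POMDP (illustrative sketch).

    belief[s]                        : current probability of hidden state s
    transition[action][s][s2]        : P(s2 | s, action)        (assumed layout)
    observation_prob[action][s2][o]  : P(o | s2, action)        (assumed layout)
    """
    n = len(belief)
    # Predict: push the current belief through the dynamics.
    predicted = [
        sum(belief[s] * transition[action][s][s2] for s in range(n))
        for s2 in range(n)
    ]
    # Correct: reweight each state by the likelihood of the observation.
    unnormalized = [
        observation_prob[action][s2][obs] * predicted[s2] for s2 in range(n)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [p / total for p in unnormalized]


# Hypothetical two-state model: the hidden state never changes, and
# "listen" reports the true state with probability 0.85.
T = {"listen": [[1.0, 0.0], [0.0, 1.0]]}
O = {"listen": [[0.85, 0.15], [0.15, 0.85]]}

b = [0.5, 0.5]              # uniform prior over the two hidden states
history = [0, 0, 1]         # observations received so far
for o in history:
    b = belief_update(b, T, O, "listen", o)
```

However long `history` becomes, `b` remains a list of two numbers, which is the compression that carrying the raw history forgoes.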

Ben Recht

Nov 27, 2023

I don't consider this spicy at all.

I'm more puzzled by why so many people want to latch their reputation to this notation, terminology, and worldview.

Braham Snyder

Nov 30, 2023 (edited)

I plan on looking for more rigor, but off the top of my head there are two reasons I use Sutton and Barto: I happened to read it first, and I think they pose important and relatively easy unsolved problems. I think it would take me much longer to begin contributing if I started from, e.g., control theory, since it feels so much more developed.

edit: My comment makes me sound more critical of their book than I am. I'm not yet certain I'll prefer the tradeoff of more mathematical rigor. More important, I think Sutton and Barto might more strongly make the same counterargument I somewhat made here already: don't let the perfect be the enemy of the good (or the great).

Miguel

Feb 16, 2024

Coming late to this party... but this discussion reminds me of something I heard from a professor in Political History a long time ago: "Marx was a philosopher, his followers imparted doctrine".

Braham Snyder

Feb 16, 2024

I'm not sure I understand. Are you saying I'm taking Sutton and Barto's ideas too far? I roughly only claimed they've posed some important questions. And that, while I think the original post makes many good arguments, I don't feel so strongly about those arguments myself. (Though I'm sure I'll change my tune when someone releases an "Agent Learning" textbook or something...)

Or are you only saying _other_ people have taken Sutton and Barto's ideas too far?

Miguel

Feb 16, 2024

> Or are you only saying _other_ people have taken Sutton and Barto's ideas too far?

That

∂jalel

Nov 29, 2023

Thanks for the post!

What resource (book, course) would you recommend to unRL one's brain?

Ben Recht

Nov 29, 2023

Shameless self-promotion, the last couple of chapters here perhaps? https://mlstory.org/

But if you tell me a bit more about what you're looking for, I could send other potential resources.

Justin Bayer

Nov 29, 2023

A good way to understand **cooking** is to consider some of the examples and possible applications that have guided its development.

- A red pill that, if taken, reveals unpleasant truths for you.

- A druid recipe from ancient Gaul that lets you prepare a drink so powerful, you will have the muscles of ten for the rest of the day!

- A magic potion that will turn the user into an invincible bear, immune to the arrows of all hunters of the realm combined.

- A medicine so strong, it cures cancer.

arg min