Discussion about this post

Sarah Dean

You're telling me that if I upload GPT-N to a robot's brain and start running PPO, it won't struggle to its feet moments later? And half an hour later it won't be running at 20 miles per hour?

Maxim Raginsky

Here's a spicy take: Sutton and Barto's book has completely ruined a generation of researchers. Among other things, they barely mention partially observed scenarios. I was shocked to discover that some of my colleagues who work on RL don't see a problem with using uncompressed histories of observations and actions when working with POMDPs, and the mention of belief state elicits blank stares.
