You're telling me that if I upload GPT-N to a robot's brain and start running PPO, it won't struggle to its feet moments later? And half an hour later it won't be running at 20 miles per hour?
For which value of N?
N >> 4
Here's a spicy take: Sutton and Barto's book has completely ruined a generation of researchers. Among other things, they barely mention partially observed scenarios. I was shocked to discover that some of my colleagues who work on RL don't see a problem with using uncompressed histories of observations and actions when working with POMDPs, and the mention of belief state elicits blank stares.
I don't consider this spicy at all.
I'm more puzzled by why so many people want to latch their reputation to this notation, terminology, and worldview.
For me, I plan on looking for more rigor, but off the top of my head, here are two reasons I use Sutton and Barto: I happened to read it first, and I think they pose important and relatively easy unsolved problems. I think it would take me much longer to begin contributing if I started from, e.g., control theory, since it feels so much more developed.
edit: My comment makes me sound more critical of their book than I am. I'm not yet certain I'll prefer the tradeoff of more mathematical rigor. More important, I think Sutton and Barto might more strongly make the same counterargument I somewhat made here already: don't let the perfect be the enemy of the good (or the great).
Coming late to this party... but this discussion reminds me of something I heard from a professor in Political History a long time ago: "Marx was a philosopher, his followers imparted doctrine".
I'm not sure I understand. Are you saying I'm taking Sutton and Barto's ideas too far? I roughly only claimed they've posed some important questions. And that, while I think the original post makes many good arguments, I don't feel so strongly about those arguments myself. (Though I'm sure I'll change my tune when someone releases an "Agent Learning" textbook or something...)
Or are you only saying _other_ people have taken Sutton and Barto's ideas too far?
> Or are you only saying _other_ people have taken Sutton and Barto's ideas too far?
That
Thanks for the post!
What resource (book, course) would you recommend to unRL one's brain?
Shameless self-promotion, the last couple of chapters here perhaps? https://mlstory.org/
But if you tell me a bit more specifics about what you're looking for, I could send other potential resources.
A good way to understand **cooking** is to consider some of the examples and possible applications that have guided its development.
- A red pill that, if taken, reveals unpleasant truths to you.
- A druid recipe from ancient Gaul that lets you prepare a drink so powerful, you will have the muscles of ten for the rest of the day!
- A magic potion that will turn the user into an invincible bear, immune to the arrows of all hunters of the realm combined.
- A medicine so strong, it cures cancer.