6 Comments
James McDermott:

Excellent, I love this type of post!

What type exactly? Hard to put my finger on it, but roughly: posts that place multiple algorithms in a single framework so you can see where gaps remain.

Jason Hartline:

I could not understand the figure without reading this post: https://www.argmin.net/p/predictions-and-actions-redux

(It wasn't clear what the action impact axis meant.)

Ben Recht:

Totally agree, that's my bad. I should have linked to that post rather than the table of contents. I made a small edit to this post to reference it.

Nico Formanek:

"We’ve collectively decided that the best strategies against random adversaries are those that maximize the expected value of the score. Don’t ask me why."

Obvious: the scheme proved itself by allowing Voltaire to set up a large (and successful) lottery scam.

Smokey Cigaretto:

Action selection is optimization. Feedback is parametric optimization: you can't evaluate the solution until the parameters are known. In single-agent processes, your future actions are nested parametric optimization problems. In the LQ setting there is a closed-form solution for that nesting, which is why LQR has a closed-form feedback policy. When it's not LQ, that's okay: all the nested optimization problems share the same objective, so you can collapse them into a single (nonconvex) problem. In multi-agent settings it gets crazy 🤪 parametric games with all sorts of nested shit that can't be collapsed. Fun to think about, but not really practically useful.
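The LQ claim above can be sketched concretely: the nested optimization over future actions collapses into a backward Riccati recursion, yielding a closed-form linear feedback policy u = -K x at each step. This is a minimal illustration, not anyone's reference implementation; the system matrices (a double integrator) and costs below are my own illustrative choices.

```python
import numpy as np

def lqr_feedback_gains(A, B, Q, R, horizon):
    """Finite-horizon discrete-time LQR via backward Riccati recursion.

    Each backward step solves one inner parametric problem in closed
    form, which is exactly why the nesting never has to be unrolled.
    """
    P = Q.copy()          # terminal cost-to-go matrix
    gains = []
    for _ in range(horizon):
        # Closed-form minimizer of the one-step quadratic subproblem.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Propagate the cost-to-go backward in time.
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]    # gains[t] is the feedback gain at time t

# Illustrative example: a double integrator with unit costs.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
Ks = lqr_feedback_gains(A, B, Q, R, horizon=50)
```

With a long enough horizon, the early gains settle to the steady-state LQR gain, and the closed-loop matrix A - B K is stable.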

Avik De:

I’ve never thought of the “plant” in control as being a (potentially random) adversary, but I can wrap my mind around it. Love the big-picture view covering different fields. Thanks!