### Updates on Policy Gradients

I’ve been swamped with a bit of a travel binge and am hopelessly behind on blogging. But I have updates! This should tide us over until next week. After my last post on nominal control,...

### A Model, You Know What I Mean?

This is the seventh part of “An Outsider’s Tour of Reinforcement Learning.” Part 8 is here. Part 6 is here. Part 1 is here. The role of models in reinforcement learning remains hotly debated. Model-free...

### The Policy of Truth

This is the sixth part of “An Outsider’s Tour of Reinforcement Learning.” Part 7 is here. Part 5 is here. Part 1 is here. Our first generic candidate for solving reinforcement learning is Policy Gradient....

### A Game of Chance to You to Him Is One of Real Skill

This is the fifth part of “An Outsider’s Tour of Reinforcement Learning.” Part 6 is here. Part 4 is here. Part 1 is here. The first two parts of this series highlighted two parallel aspirations...

### The Linear Quadratic Regulator

This is the fourth part of “An Outsider’s Tour of Reinforcement Learning.” Part 5 is here. Part 3 is here. Part 1 is here. What would be a dead simple baseline for understanding optimal control...

### The Linearization Principle

This is the third part of “An Outsider’s Tour of Reinforcement Learning.” Part 4 is here. Part 2 is here. Part 1 is here. I have an ethos for tackling problems in machine learning that...

### Total Control

This is the second part of “An Outsider’s Tour of Reinforcement Learning.” Part 3 is here. Part 1 is here. In addition to the reasons I’ve discussed so far, I’ve been fascinated with the resurgence...

### Make It Happen

This is the first part of “An Outsider’s Tour of Reinforcement Learning.” Part 2 is here. If you read Hacker News, you’d think that deep reinforcement learning can be used to solve any problem. Deep...

### Lessons from Optics, The Other Deep Learning

Would you say deep learning is mature enough to be taught in high schools? Here’s why I ask. Some time ago, I received an email from a product manager at a very large company. I...