An Outsider's Tour of Reinforcement Learning

Parts 13/14.

Table of Contents.

  1. Make It Happen. Reinforcement Learning as prescriptive analytics.
  2. Total Control. Reinforcement Learning as Optimal Control.
  3. The Linearization Principle. If a machine learning algorithm does crazy things when restricted to linear models, it’s going to do crazy things on complex nonlinear models too.
  4. The Linear Quadratic Regulator. A quick intro to LQR and why it is a great baseline for benchmarking Reinforcement Learning.
  5. A Game of Chance to You to Him Is One of Real Skill. Laying out the rules of the RL Game and comparing to Iterative Learning Control.
  6. The Policy of Truth. Policy Gradient is a Gradient Free Optimization Method.
  7. A Model, You Know What I Mean? Nominal control and the power of models.
  8. Updates on Policy Gradients. Can we fix policy gradient with algorithmic enhancements?
  9. Clues for Which I Search and Choose. Simple methods solve apparently complex RL benchmarks.
  10. The Best Things in Life Are Model Free. PID control and its connection to optimization methods popular in machine learning.
  11. Catching Signals That Sound in the Dark. PID for iterative learning control.
  12. Lost Horizons. Relating popular techniques from RL to methods from Model Predictive Control.
  13. Coarse-ID Control. Combining high-dimensional statistics and robust optimization for the data-driven control of uncertain systems.

Bonus Post: Benchmarking Machine Learning with Performance Profiles. The Five Percent Nation of Atari Champions.