Discussion about this post

Avik De

This course sounds great, and I'm also very much looking forward to the syllabus and other info! I'm also a huge fan of L4DC; thanks for your work on that.

One thing I'm maybe not understanding is the distinction you're making between optimal control and feedback (possibly just a matter of definitions). E.g., LQR is an optimal controller for an LTI system, and it outputs a linear feedback controller. If you consider a PID controller as feedback, its behavior is tied to a local Lyapunov / LaSalle function, just as an optimal controller's behavior is tied to a value function. MPC locally estimates (and re-estimates online) the value function, and would also be considered a "feedback" controller.

Trajectory optimization (I agree) has no feedback, but typically in robotics such a trajectory is still stabilized by a feedback controller generated via MPC or LQR. Similarly, for a learned policy generating actions from observations, domain randomization during training necessitates some amount of stochastic robustness in the feedback control, which appears to be key for all the robotics behaviors being developed with RL in simulation these days.
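A minimal numerical sketch of that first point (the plant, costs, and iteration count here are my own hypothetical choices, not from the post): discrete-time LQR on an LTI system yields a static linear feedback gain K, with x'Px as the associated value function.

```python
import numpy as np

# Hypothetical toy plant: a discrete-time double integrator,
# x_{k+1} = A x_k + B u_k, with quadratic stage cost x'Qx + u'Ru.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.eye(2)
R = np.array([[1.0]])

# Iterate the discrete-time Riccati recursion to a fixed point P.
P = Q.copy()
for _ in range(1000):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P_next = Q + A.T @ P @ (A - B @ K)
    if np.allclose(P_next, P, atol=1e-12):
        P = P_next
        break
    P = P_next

# The optimal policy is pure state feedback, u = -K x,
# and x'Px is its value function (cost-to-go).
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
print("feedback gain K:", K)
```

So the "optimal" and "feedback" views coincide here: the infinite-horizon optimal controller is itself a feedback law, and the closed-loop matrix A - BK has all eigenvalues inside the unit circle.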

Josh

Is there any way I can access the notes or recording or tune in live?

Doing my PhD in AI and human control processes and your article captures a lot of the concepts I'm looking into!

