2 Comments
reed hundt

Yes do please

Deborah Carver

I would love an explanation without the equations. I've been thinking about RL, at least in terms of the current slate of LLMs, as "optimization for a values system wherein nobody except machine learning engineers is able to select the values or has any say in why they are prioritized." Would love to know if that's an accurate description.
