1 Comment

I don't like to be the type of guy who advertises his own work, but I guess I'll have to be that type of guy here. For over a decade, I've been obsessed with the question of why gradient descent (or mirror descent, more generally) is so universal. In fact, right after I joined Illinois, I started having regular chats about this with Angelia Nedić and Alex Olshevsky, but we didn't really get anywhere. Then both Angelia and Alex moved to other universities, and in the meantime I made some progress on that question together with Belinda Tzen, Anant Raj, and Francis Bach:

https://ieeexplore.ieee.org/document/10120759

Indeed, the "steepest descent" story is only local, but we were able to show in what sense mirror descent is optimal by viewing it as an infinite-horizon steady-state stabilization problem.
