Discussion about this post

Kameron Decker Harris:

There actually is some interesting work on optimizing the weights at init; essentially, that's transfer learning in a bunch of setups. I was trying to do something like this and had trouble getting it to work. Eventually these folks figured out one way to do it: https://openreview.net/pdf/304a7da98b79c8c15e8baa9f038951976a9ef764.pdf

My take is that the developmental program does a lot to optimize the init of our biological networks.
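To make the "transfer learning as a learned init" idea concrete, here is a minimal sketch assuming PyTorch. The architecture, tasks, and sizes are placeholders for illustration, not the setup from the linked paper: the point is simply that the target network starts from weights learned on a related source task rather than from a random draw.

```python
# A minimal sketch of "optimizing the weights at init" via transfer
# learning, assuming PyTorch. Architecture and tasks are hypothetical.
import torch.nn as nn

source_net = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
# ... train source_net on a related source task here ...

target_net = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
# Instead of keeping the default random init, copy the source task's
# learned weights into the target network as its starting point.
target_net.load_state_dict(source_net.state_dict())
# ... fine-tune target_net on the new task from this learned init ...
```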

Badri:

LOL @ No. 9. I still remember this tweet - https://mobile.x.com/beenwrekt/status/913418862191710208 - and now that "optimally" is in scare quotes in the blog description, we can probably welcome No. 9 as the sneaky way of doing what you suggested in the footnote. (Early neural networks did in fact directly tune the weights, and we have algorithms like NEAT and its cousins.)
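For readers unfamiliar with that family of methods, here is a toy sketch of directly tuning weights with a simple evolutionary strategy, one of NEAT's "cousins" (NEAT itself also evolves network topology, which this sketch does not). The fitness function and population sizes here are made up for illustration.

```python
# A toy evolutionary strategy that tunes a fixed-size weight vector
# directly, with no gradients. Objective and sizes are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def fitness(w):
    # Stand-in objective: higher is better, maximized at w == 1.
    return -np.sum((w - 1.0) ** 2)

pop = rng.normal(size=(50, 20))  # 50 candidate weight vectors
for generation in range(100):
    scores = np.array([fitness(w) for w in pop])
    parents = pop[np.argsort(scores)[-10:]]  # keep the 10 fittest
    # Offspring: copies of random parents plus Gaussian mutation noise.
    children = parents[rng.integers(0, 10, size=40)]
    children = children + 0.1 * rng.normal(size=children.shape)
    pop = np.concatenate([parents, children])

best = pop[np.argmax([fitness(w) for w in pop])]
```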
