But don’t all of the best scientists from the biggest American companies tell the thought leaders in congress that open models are dangerous?
Love the OLMo project from the Allen Institute - they've already spent all the GPU money and given away competitive models up to 32B parameters, with open data, open training, and some great insights into how the models work.
https://allenai.org/olmo
https://arxiv.org/abs/2504.07096
Edit: I am reading more of Nathan’s work and appreciating his previous analysis of OLMo and other open weight models like Gemma.
I'm also optimistic this can be done; just consider the explosion of new entrants last year. Large companies have unlimited GPUs, but they also have an unlimited capacity to waste them, and they get better at wasting them as they grow.
Shh! You're not supposed to say that out loud. ;)
Appreciate you adding some more color to my piece. Lovely
And a few months after the 30-minute TPU record, some riff-raff beat them with an 18-minute, $40 GPU run -- https://www.technologyreview.com/2018/08/10/141098/small-team-of-ai-coders-beats-googles-code/
It's entry #4 on DawnBench site https://dawn.cs.stanford.edu/dawnbench
The back-story on this is that I was out of Google and bored. I had first-hand experience tuning such runs, and the iteration was so painful that I was sure I could get better results by iterating rapidly with OSS and some cloud credits. Fast.AI had a good PyTorch implementation of a single-machine ImageNet model; I offered to parallelize it and wrote a harness for fast iteration. Iterating fast, it felt like we could try 10x more things per day than I could at Google. For instance, Andrew Shaw discovered a better way to initialize batch norm layers this way, among many other things he tried.
We put the code up at https://github.com/cybertronai/imagenet18 and it ran out of the box for at least one other person, not connected to us.
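(The thread doesn't say which batch-norm init change it was, but a well-known trick from that era of fast ImageNet training is zero-initializing the batch norm scale on a residual branch, so each block starts out as the identity. A minimal NumPy sketch of why that helps, under that assumption:)

```python
import numpy as np

def batchnorm(x, gamma, beta, eps=1e-5):
    # Normalize each feature over the batch, then scale by gamma and shift by beta.
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))  # a batch of activations entering a residual block

# Zero-init trick: gamma = 0 makes the residual branch output zero at init...
branch = batchnorm(x, gamma=np.zeros(4), beta=np.zeros(4))

# ...so the block behaves as the identity at the start of training: y = x + 0.
y = x + branch
assert np.allclose(y, x)
```

Starting each residual block as the identity keeps the early training signal well-conditioned, which is why this one-line init change can measurably speed up convergence.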
Lol, I love it. Time to bring your talents to open LLMs!
great post!