Discussion about this post

Carl Boettiger

Thanks Ben, great piece as always!

I think an overlooked corollary to this is that you don't actually need a super sophisticated LLM for tool calling to work. Even the largest LLMs still can't add reliably, but they've all learned not to try; they just use calculators. And this is a game small/local/open LLMs can play too. Have you poked at the open LLM ecosystem for tool calls (err, "agents")?

Sure, if the tool is just "here's a bash shell", then an LLM still needs to be pretty clever. But you can give an LLM tools with more narrowly scoped and clearly explained uses, and voila, even a tiny model can suddenly be very powerful. The beautiful thing about this is, as you point out so nicely here, that building a tool doesn't involve any GPUs or transformers; it all happens here in the good old land of conventional software development, with JSON schemas and function calls. We've had great success building simple MCP tools with which a model like gpt-oss or nvidia nemotron-3 can easily outperform what Opus can do with only the generic tools claude-code gives it...
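To make the "JSON schemas and function calls" point concrete, here is a minimal sketch of what a narrowly scoped tool might look like, using the calculator case from above. Everything in it is an illustrative assumption: the tool name `add_numbers`, the registry layout, and the `dispatch` helper are hypothetical and not taken from any MCP server or SDK. It only shows that the tool side is ordinary code: a description the model reads, a schema for the arguments, and a plain function.

```python
import json

def add_numbers(a, b):
    """Exact addition, so the model never has to do arithmetic itself."""
    return a + b

# Hypothetical tool registry: one narrowly scoped tool with a clear
# description and a JSON-Schema-style declaration of its arguments.
TOOLS = {
    "add_numbers": {
        "description": "Add two numbers exactly. Use this instead of computing sums yourself.",
        "input_schema": {
            "type": "object",
            "properties": {
                "a": {"type": "number"},
                "b": {"type": "number"},
            },
            "required": ["a", "b"],
        },
        "handler": add_numbers,
    }
}

def dispatch(tool_call_json):
    """Run one tool call (a JSON string emitted by a model) and return a JSON reply."""
    call = json.loads(tool_call_json)
    tool = TOOLS.get(call.get("name"))
    if tool is None:
        return json.dumps({"error": "unknown tool"})
    args = call.get("arguments", {})
    # Minimal check against the schema: every required argument must be present.
    missing = [k for k in tool["input_schema"]["required"] if k not in args]
    if missing:
        return json.dumps({"error": "missing arguments", "missing": missing})
    return json.dumps({"result": tool["handler"](**args)})

# What a model's tool call might look like (the message shape is assumed).
reply = dispatch('{"name": "add_numbers", "arguments": {"a": 123456789, "b": 987654321}}')
print(reply)
```

The model only has to pick the tool and fill in two numbers, which is a far easier job than driving a shell; the correctness of the sum comes from the function, not from the model.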

Notger Heinz

Probably the best explanation of how agentic systems work that I have seen so far. Thank you!

