Language model teams as distributed systems (arxiv.org)

by jryio 46 comments 104 points

[−] causalityltd 60d ago
Apart from rediscovering all the problems with distributed systems, I think LM teams will also rediscover their own version of the mythical man-month, and very quickly too.

There were three core insights: adding people to a late project makes it later, communication cost grows as n^2, and time isn't fungible.

For agents, maybe the first insight won't hold, and adding a new agent won't necessarily increase dev time. But the second will be worse: communication cost will grow faster than n^2 because of LLM drift and orchestration overhead.

The third doesn't translate cleanly, but I'll try: time isn't fungible for us, and assumptions and context, however fragmented, aren't fungible for agents on a team. If they hallucinate at the wrong time, even a little, it could be the equivalent of a human developer doing a side project on company time.
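The n^2 claim is just Brooks's pairwise-channel count; a quick sketch of how fast it grows (plain arithmetic, nothing agent-specific assumed):

```python
# Brooks's pairwise-communication count: n team members need
# n * (n - 1) / 2 distinct channels to all talk to each other.
def channels(n: int) -> int:
    return n * (n - 1) // 2

for n in (3, 5, 10, 20):
    print(n, channels(n))  # 3->3, 5->10, 10->45, 20->190
```

And that's the floor: the comment's point is that agent orchestration adds overhead on top of each channel, so the effective cost curve is steeper still.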

An agent should write an article on it and post it on moltbook: "The Inevitable Agent Drift"

[−] dasil003 60d ago
I've long thought of the analogy as useful for human teams, and it even shows up in corporate jargon (e.g. "I'm blocked", "We need to align"). It's surprisingly common for whole branches of a large org to be doing net-negative work due to conflicting goals, or to not realizing some implication that cuts across teams and local contexts. Sometimes these issues are technical, but just as often they are pure product or business decisions with no explicit dependency until a lightbulb goes off somewhere.

With hand-written code, things generally move slowly enough, and there's enough common sense sprinkled across the org chart, that these issues get uncovered organically. With agent teams, speed increases by several orders of magnitude and common sense is out the window, so I suspect the ceiling on productive use of agents will be far lower than for traditional engineering teams, and it will depend heavily on which humans are plugged into the right places, and how.

One thing I suspect professional researchers underestimate is how much positive output a human team can produce with vague or hand-wavy direction and surprisingly little deep thinking, let alone a robust specification or structure to keep them on track. The reality is that any large team regresses to the mean, and it's usually a few savvy people who actually drive outcomes. These people don't necessarily have official authority, just a nose for the right thing. That won't spontaneously emerge from agents (at least until they become a lot more human-like in terms of big-picture common sense, and dial the sycophancy down to a more "skeptical engineer" level).

[−] kaicianflone 61d ago
We’ve been building exactly this as an open-source ecosystem at consensus-tools. It’s a governance layer for multi-agent systems with a runtime wrapper that intercepts agent decisions before they execute: .consensus(fn, opts).

The coordination and consistency problems the paper describes are what the monorepo is designed around: giving agents an auditable stake in decisions. Happy to share more if anyone’s working in this space.
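For anyone wondering what "intercepts agent decisions before they execute" could look like in practice, here's a hypothetical sketch (none of these names are the real consensus-tools API, and `approve` stands in for whatever governance policy you'd plug in):

```python
# Hypothetical governance wrapper: every call to fn is first described to
# an approval policy; vetoed calls never execute.
def consensus(fn, approve):
    def wrapped(*args, **kwargs):
        proposal = f"{getattr(fn, '__name__', 'fn')} args={args} kwargs={kwargs}"
        if not approve(proposal):
            raise PermissionError("vetoed: " + proposal)
        return fn(*args, **kwargs)
    return wrapped

def deploy(env):
    return f"deployed to {env}"

# Toy policy: block anything that mentions prod.
guarded = consensus(deploy, approve=lambda p: "prod" not in p)
print(guarded("staging"))  # deployed to staging
```

The audit trail would come from logging each `proposal` alongside the policy's verdict; the sketch only shows the interception point.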

[−] calmkeepai 60d ago
I’ve found an interesting mental model to be the production crew, as in the television world: it may be a better frame for how agents, and the people working alongside them, should coordinate than basing the simulated team on the typical office-worker framework.
[−] bhewes 61d ago
This is how we design at HewesNguyen AI. We are both MIS, so once LLMs came out we were like: sweet, whole teams that can each be tasked with one thing done well. Thank you, Unix philosophy.
[−] bob1029 61d ago
I find depth to be far more interesting than breadth with these models.

Descending into a problem space recursively won't necessarily find the best solution, but it's going to tend to find some solution faster than going wide across a swarm of agents. Theoretically it's exponentially faster to have one symbolically recursive agent than to have any number of parallel agents.

I think agent-swarm stuff sucks for complex multi-step problems because it's mostly a form of BFS. It never actually gets to a good solution because it's searching too wide, and no one can afford to wait for it to strip-mine down to something valuable.
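The depth-vs-breadth intuition above can be made concrete with textbook search-cost arithmetic (a toy model, not a claim about any particular agent framework): if a solution sits at depth d in a tree with branching factor b, a depth-first descent that picks reasonable branches touches on the order of d nodes, while breadth-first expansion touches on the order of b^d.

```python
# Toy node counts for finding a solution at depth d, branching factor b.
def dfs_cost(d: int) -> int:
    # Best case for a recursive descent: one node per level.
    return d

def bfs_cost(b: int, d: int) -> int:
    # BFS expands every node down to depth d: sum of b^k for k = 0..d.
    return sum(b ** k for k in range(d + 1))

print(dfs_cost(6), bfs_cost(3, 6))  # 6 vs 1093
```

Of course the DFS number assumes the agent descends into productive branches; the commenter's hedge ("won't necessarily find the best solution") is exactly the failure mode where it doesn't.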

[−] measurablefunc 61d ago
Next up, LLMs as actors & processes in π-calculus.
[−] seanp2k2 61d ago
Everyone wants to be the CEO of their own megacorp managing thousands of AI engineers, I guess. Just like with microservices, there’s probably a ton of overhead doing things this way vs. a monolithic / single-agent approach. Certain types of engineers just love over-engineering hugely complex stuff to see it work. Rube Goldberg architecture was already prevalent and bad enough in enterprise before the AI boom.
[−] 50lo 61d ago
Once you run more than one agent in a loop, you inevitably recreate distributed systems problems: message ordering, retries, partial failure, etc. Most agent frameworks pretend these don’t exist; some address them partially, but none of the ones I've seen address all of them.
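Take just one item from that list: retries under partial failure give you at-least-once delivery, so duplicate messages reach the handler and it has to deduplicate by message id. A minimal sketch of the standard fix (generic, not tied to any agent framework):

```python
# At-least-once delivery: retries can redeliver a message, so the handler
# keeps a set of seen ids and drops duplicates (idempotent processing).
seen = set()
state = {"count": 0}

def handle(msg_id):
    if msg_id in seen:      # duplicate from a retry; ignore it
        return
    seen.add(msg_id)
    state["count"] += 1     # the actual side effect runs at most once per id

for m in ["a", "b", "a", "c", "b"]:   # "a" and "b" arrive twice
    handle(m)
print(state["count"])  # 3 unique messages processed
```

Ordering and partial failure need their own machinery (sequence numbers, timeouts, reconciliation); the point is that each item on the list is a real mechanism someone has to build, not a detail a framework can wish away.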
[−] rando1234 61d ago
Struggling to find anything interesting or non-obvious in this article. You give a bunch of LLMs various parallelizable tasks; some models manage them well and others don't, with no insight as to why. As someone with a distributed-systems background, I find the supposed 'insights' from distributed computing almost trivial.
[−] woah 61d ago
The current fad for "agent swarms" or "model teams" seems misguided, although it definitely makes for great paper fodder (especially if you combine it with distributed systems!) and gets the VCs hot.

An LLM running one query at a time can already generate a huge amount of text in a few hours, and drain your bank account too.

A "different agent" is just different context supplied in the query to the LLM. There is nothing more than that. Maybe some of them use a different model, but again, this is just a setting in OpenRouter or whatever.

Agent parallelism just doesn't seem necessary and makes everything harder. Not an expert though, tell me where I'm wrong.

[−] justboy1987 61d ago
[flagged]
[−] agenticbtcio 61d ago
[dead]