Gas Town: From Clown Show to v1.0 (steve-yegge.medium.com)

by martythemaniak 164 comments 113 points

[−] bayarearefugee 30d ago

> But no need to wait. At a high level, Gas City is the answer to all your problems. Ha! At least, for certain classes of problem, such as, “How can I bring AI into my company and pass an audit trail,”

The important audit at my company is conducted by the FDA.

I have a feeling when they ask what processes we followed to mitigate any user harm that could be caused by software changes that "I told an AI-mayor in the form of a cartoon fox what to do and he spit out a bunch of vibecode software written by AI-driven virtual cartoon characters" is not among the answers they want to hear.

[−] avaer 30d ago
Keep in mind investing in cartoon foxes was a "business strategy" a lot of (otherwise serious) people bought into in 2020-2021.

And those cartoon foxes didn't even do anything! I guess these ones do?

Don't put it past the masses. These are crazy times.

[−] cdata 30d ago
The influence of cartoon foxes on business strategies in tech has a long history and cannot be overstated.

https://poignant.guide/book/chapter-3.html

[−] zactato 30d ago
Ehhh, in my experience compliance auditors are 10 years behind the cutting edge. I still see auditors who don't understand Kubernetes and so ask the same questions they would about on-prem machines. They don't know the questions to ask to get to the real meat of the risks. This leads them to allow things through that probably deserve more scrutiny. I bet the same thing will happen with LLM tools like this. They'll just ask if you use PRs and wave you on through.
[−] Quarrelsome 30d ago
I did an induction at some ISO certified company some years back, reading their docs. A good 50% of them contained significant content that basically read:

> the thing must be in the place where it should be

With no further information e.g. what place, where, how, when, who facilitates that?

> the person who facilitates it, is the person who facilitates it.

Yea thanks. So their ISO accredited process was basically no process. Would have been way better with a talking fox.

So I feel like humans are capable of just as bad. I'd be interested in what answer the Fox could spit out and I kinda wonder where it might fit on the bell curve of all non Gas-Town "auditable" processes. I'm all for skepticism but I feel like it would be more tangible if we instead criticised the response instead of just conjuring it as "definitely awful" because it happens to be on top of a generated stack.

I mean: I don't want it to work, but maybe we're not as good as we think we are, or the stuff we rate as super important is actually way less important with a generated context. As much as I love good code, the thought that gnaws at the back of my head is the truism that some of the most profitable code in history has been some of the "worst" code (e.g. MySpace's janky code base on top of ColdFusion or Twitter's "Fail Whale" era).

So I'm happy that someone is exploring this space in an open way. I'm just glad I'm not the one finding that out with my face first.

[−] vidarh 30d ago
Which ISO certification matters, but the key thing people should be aware of is that the primary value of the certification to customers is that your processes are documented and that deviations are tracked, so that customers can check whether the processes make sense before signing a contract. It's important not to expect the certification itself to guarantee quality.
[−] siva7 30d ago
Not yet... but if I had told you in 2020 what the HN front page would look like in 2026, you would have sent me to a mental institution, wouldn't you?
[−] throwup238 30d ago
Same institution I’d send Steve today.

The sanatorium from American Horror Story Asylum comes to mind.

Dominique, nique, nique…

[−] Quarrelsome 30d ago
We can do better than "that man is crazy". Why not pull up a line in his OPENLY AVAILABLE CODE BASE and mock that instead?
[−] throwup238 30d ago
Beads, his glorified CLI-based work tracker, was several hundred thousand lines of code when I last checked in January.

Where do I even begin to mock that except at the source? That’s just absolute insanity.

[−] Avshalom 30d ago
"every 5th article is about no-code-solutions that sometimes work" might be unexpected but it's hardly the stuff of institutionalization.
[−] cma 30d ago

> The important audit at my company is conducted by [Trump's second term] FDA.

Could work

[−] pron 30d ago
Does Yegge really think that building production software this way is a good idea?

Let's assume that managing context well is a problem and that this kind of orchestration solves it. But I see another problem with agents:

When designing a system or a component we have ideas that form invariants. Sometimes the invariant is big, like a certain grand architecture, and sometimes it's small, like the selection of a data structure. Eventually, though, you want to add a feature that clashes with that invariant. At that point there are usually three choices:

* Don't add the feature. The invariant is a useful simplifying principle and it's more important than the feature; it will pay dividends in other ways.

* Add the feature inelegantly or inefficiently on top of the invariant. Hey, not every feature has to be elegant or efficient.

* Go back and change the invariant. You've just learnt something new that you hadn't considered and puts things in a new light, and it turns out there's a better approach.

Often, only one of these is right. Usually, one of these is very, very wrong, and with bad consequences.

But picking among them isn't a matter of context. It's a matter of judgment, and the models - not the harnesses - get this judgment wrong far too often (they go with what they know - the "average" of their training - or they just don't get it). So often, in fact, that mistakes quickly accumulate and compound, and after a few bad decisions like this the codebase is unsalvageable. Today's models are just not good enough (yet) to create a complete sustainable product on their own. You just can't trust them to make wise decisions. Study after study and experiment after experiment show this.

Now, perhaps we make better judgment calls because we have context that the agent doesn't. But we can't really dump everything we know, from facts to lessons, and that pertains to every abstraction layer of the software, into documents. Even if we could, today's models couldn't handle them. So even if it is a matter of context, it is not something that can be solved with better context management. Having an audit trail is nice, but not if it's a trail of one bad decision after another.

[−] mmastrac 30d ago
Serious question - there's a lot of fluff talking about Gas Town, but has Gas Town shipped anything in public that can be evaluated without all of this surrounding hype and blogposting?

At this point it should be clear that Gas Town has done something we can evaluate the value of.

[−] 0xbadcafebee 30d ago
Beads is cool, but I tried to use it, and the backend didn't really make sense. I have to run a SQL database in the background? How does it sync with Git? (I didn't see any files/objects committed to the repo.) Plus, Dolt ended up using a constant 3-30kB/s of I/O in the background, while nothing was actually going on. That, and Beads has a lot of features I'm not gonna use. All of this was just too complicated for my tiny brain.

So I slapped together my own Beads implementation (https://codeberg.org/mutablecc/dingles) over a day or two. Probably has bugs, and I'm sure race conditions if you tried to use with Gas Town, and likely does not scale. But it has the minimum functionality needed to create and track issues and sync them (locally and remotely, either via normal merge, or a dedicated ticket branch). No SQL, no extra features, just JSONL and Git. Threw a whole large software project at it, and the AI took to it like a duck to water, used it to make epics for the whole project, methodically worked through them all, dependencies first, across multiple context sessions. The paradigm of making tools the AI wants to use is clearly a winner.
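To make the "just JSONL and Git" idea concrete, here is a minimal sketch of an append-only JSONL issue log. It is not the code from the linked repository; the field names (`id`, `title`, `status`, `deps`) and the helper functions are assumptions chosen for illustration. Each update is written as a new line, and loading replays the file so the last record for an id wins.

```python
import json
import tempfile
from pathlib import Path


def append_issue(path: Path, issue: dict) -> None:
    """Append one issue record as a single JSON line."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(issue, sort_keys=True) + "\n")


def load_issues(path: Path) -> dict:
    """Replay the log; the last record written for each id wins."""
    issues = {}
    if not path.exists():
        return issues
    for line in path.read_text(encoding="utf-8").splitlines():
        if line.strip():
            record = json.loads(line)
            issues[record["id"]] = record
    return issues


def demo() -> dict:
    """Write three records (one is an update) and return the final state."""
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "issues.jsonl"  # hypothetical file name
        append_issue(log, {"id": "T-1", "title": "Set up schema", "status": "open", "deps": []})
        append_issue(log, {"id": "T-2", "title": "Write importer", "status": "open", "deps": ["T-1"]})
        append_issue(log, {"id": "T-1", "title": "Set up schema", "status": "closed", "deps": []})
        return load_issues(log)
```

Because every record is one line of text, a file like this can be committed and diffed like any other source file, which is presumably part of the appeal over a background database.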

[−] giancarlostoro 30d ago
I loved Beads, but kept running into issues because it is so git heavy. One: not every system and project I work on uses git. Two: sometimes I'd switch branches, and that would screw up Beads state entirely. Three (at least as of when I last used it): there's no safety net; Claude would close a Bead without validating anything.

I wound up building my own with Claude. It is SQLite-first, syncs to GitHub, and can pull down from GitHub. I also added "Gates" to stop Claude (or whatever agent) from marking things complete if they have not been compiled, had unit tests run, or been through simple human testing / confirmation. The Gates concept improved my experience with Claude; all too often it says it finished something when in fact it did not. Every task must have a gate, and gates must pass before you can close a task. Gates can be reused across tasks: if "Run unit tests" is one gate, you can reuse it for every task, and when it passes, it passes only for that one task <-> gate combination.
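A rough sketch of the gate rule described above, assuming nothing about the real tool: the `Task` class, its method names, and the `unit_tests_pass` placeholder are all hypothetical. The point it illustrates is that a gate definition can be shared, while the pass result is recorded per task.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Task:
    title: str
    # Gate name -> check function. The same function may be reused by many tasks.
    gates: dict[str, Callable[[], bool]] = field(default_factory=dict)
    # Names of gates that have passed for THIS task only.
    passed: set[str] = field(default_factory=set)
    closed: bool = False

    def run_gate(self, name: str) -> bool:
        """Run one gate and record the result for this task."""
        ok = self.gates[name]()
        if ok:
            self.passed.add(name)
        else:
            self.passed.discard(name)
        return ok

    def close(self) -> None:
        """Refuse to close unless the task has gates and all of them passed."""
        if not self.gates:
            raise ValueError("every task must have at least one gate")
        missing = sorted(set(self.gates) - self.passed)
        if missing:
            raise RuntimeError(f"gates not passed: {missing}")
        self.closed = True


def unit_tests_pass() -> bool:
    """Placeholder check; a real gate would run the test suite."""
    return True
```

With two tasks sharing `unit_tests_pass`, running the gate on the first lets only the first close; the second still raises until its own gate has been run.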

Anyway, I'm happy for Beads; Gas Town, on the other hand, is not so much in my wheelhouse.

[−] ethanlew-is 30d ago
Gas Town has always struck me as more of a performance art piece than a tool that was actually meant to be used, even among the recent hyped AI projects. If you’re using it for real, what are you using it for?
[−] sailingparrot 30d ago
Gas Town really feels not just vibe coded but also vibe designed. I looked into it, to see whether multi agent setups really made a difference, the entire design philosophy feels like it was « let’s add one more layer of agent and surely this time it will work » about 10 times in a row.

So now you have agents of type mayor, polecats, witnesses, deacons, dogs etc plus a slew of unneeded constructs with incomprehensible names.

In one of the blog posts for Gas Town I remember reading something by the author along the lines of « it’s super inefficient, but because you burn so many tokens, you still get what you want at the end! » Clearly this is also the design philosophy behind this project: just (get your AI to) throw more random abstractions and more agent types until you feel like it kinda works, and don’t bother asking yourself if they actually contribute anything.

This gave me the very clear feeling that most of the complexity of gas town is absolutely not needed and probably detrimental.

Ended up building my own thing that is 10x simpler, just a simple main agent you talk to, that can dispatch subagents, they all communicate, wake each other up and keep track of work through a simple CLI. No « refinery » or « wasteland » or « molecule » or « convoys » or « deacons » or …

[−] vessenes 30d ago
Nice timing. I was just noting that beads in an old repo, just ... worked. Updates worked, I didn't have super weird errors to track down... I was like "nice!" Beads bumping to 1.0 is great. I haven't used gas town in a month or so, but a stable gas town sounds very valuable.

I think Yegge's instinct that a programmable / editable coordination layer (he calls this Gas City) is a great idea is right. The early days of Gas Town were definitely a wild experience in terms of needing to watch carefully lest your system be destroyed, and then I put that energy into OpenClaw. I'll probably spin up Gas City and see what it can do soon, though. Very cool.

[−] phpnode 30d ago
I'm pretty excited about agentic coding myself, but this does appear to be an extended AI psychosis (I'm not super comfortable with this phrase, but it is becoming pretty recognisable).

I think he's boxed himself in by continually layering more complexity on his approach, rather than stepping back and questioning the fundamentals or the overall direction.

All of the steps Gas Town or Gas City etc are taking are towards reducing human oversight and control. This is profoundly misguided! In a world of infinite cheap software it is precisely this human decision making and control that matters.

> There will be nothing like it. You are going to want to use Gas City.

No. I do not want to talk to the mayor of my software factory, as its cartoonish minions build an infinite mountain of slop. Unreviewable, both in terms of code and the finished product.

Instead, I want to precisely capture human ideas, have those ideas questioned, challenged, improved, and then I want to bring those ideas to life, keeping the human in the loop whenever they want. Neither Beads, Gas Town, nor Gas City or anything like them are required for that.

[−] ncruces 30d ago
The original Beads, it seems, used my CGO-free SQLite driver.

Seems like I'm back to obscurity.

:)

[−] thedevilslawyer 30d ago
For "hacker" news, this group is reacting like it's "labor" news. I get it - this will lead to the current ways of building software getting deprecated.

But this is a genuinely cutting-edge, new paradigm of building software. Something that will iron its kinks out, and potentially be the way of developing software in the new era. That's worth some curious and exploratory discussion, rather than more comments about not being the "true way of software development".

(Sorry, was put off by the crab-pulling down discussion.)

[−] cdrnsf 30d ago
An apt name for something doing its part to spike energy consumption and accelerate the climate crisis.
[−] siliconc0w 30d ago
My experience is that Agentic Coding can legitimately get you mostly-working software. You do, however, still need to spend a few days grokking, validating, and usually nudging/whacking it to conform to the shape you intended vs what the agent inferred.

It is pretty magical to go from brainstorming an idea in the evening, having ChatGPT Pro spit out a long list of beads to implement it, leaving it running over night in a totally empty repo and waking up to a mostly-implemented project.

[−] dbbk 30d ago
I tried Beads and it kept breaking in such frustratingly random ways that I just added a Linear MCP server and called it a day. That's really all you need.
[−] pianopatrick 30d ago
I searched on Google for the cost of running Gas Town. The Gemini AI response claimed Gas Town costs $100 / hour and can spit out 4000 lines of code per hour, so Gas Town costs 2.5 cents per line of code.
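For what it's worth, the arithmetic itself is consistent if the two quoted figures are taken at face value. A tiny sketch, with a hypothetical helper name; the inputs are the unverified numbers from the search result, not measurements:

```python
def cents_per_line(dollars_per_hour: float, lines_per_hour: float) -> float:
    """Convert an hourly cost and an hourly line count into cents per line."""
    return dollars_per_hour * 100 / lines_per_hour


# Figures as quoted above: $100 per hour, 4000 lines per hour.
print(cents_per_line(100, 4000))  # prints 2.5
```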

I tried tracking down where those numbers came from and the sources were a bit sketchy. Can anybody who has used Gas Town confirm those numbers, or report their personal numbers?

[−] iwontberude 30d ago
I really need to huff more chemicals to keep up with the state of this insane type of engineering
[−] guybedo 30d ago
I've experimented quite a lot with multi-agent setups and orchestrations.

In the end, it didn't feel worth it, mostly because of high token overhead (inter-agent communications, agents re-reading the same code, etc.) and synchronization / cooperation issues (who should do what).

What actually works for me and provides good results: multi step workflows with clearly defined steps and strong guidance for the agent.

[−] selimthegrim 30d ago
Also I chuckled at the AI-generated "The Overseer is Alnays Right | Vacation Approved" poster in the background of the split image where the mayor is reading so you can supervise. This has strong Boondocks Catcher Freeman vibes. I want to hear the polecats/badgers' version.
[−] spprashant 30d ago
I don't know if Gastown will work out. But it is quite a bold take. I am interested to see how it plays out. I suspect they will eventually roll back some of the stringent "no code reading" approach in favor of observability as the community grows.
[−] munificent 30d ago
> I’ve been saying since last year that by the end of 2026, people will be mostly programming by talking to a face. There’s absolutely NO reason to type with the Mayor. You should be able to chat with them like a person. You’ll have a cartoon fox there onscreen, in costume, building and managing your production software, and showing you pretty status updates whenever you ask for one. This is the end state for IDEs.

This is a desirable end state for highly social but perhaps slightly sociopathic extroverts who want to spend all day talking even though they aren't talking to a person.

For anyone else, it's hard to imagine considering that a desirable way to spend eight hours a day.

[−] coldtea 30d ago
Beads is needlessly overengineered. Puts me off from checking Gas Town.
[−] solomatov 30d ago
Does anyone have any tips for starting with Gastown? I am comfortable with a couple of agents running, but not yet comfortable with what Gastown offers.
[−] avaer 30d ago
TBH this post still reads like a clown show.
[−] hackingonempty 29d ago
How does this compare to G-Stack?
[−] rcarmo 30d ago
I cannot really get behind Gas Town or any other “agent swarm” setup. They always seem to waste an incredible amount of tokens on passing the buck around as half-finished specs, and even with a healthy amount of tokens pre-allocated they burn money faster than setting my wallet on fire…
[−] slopinthebag 30d ago
I can't believe people give this any air time at all. If it was just some rando producing slop people wouldn't give it the time of day.

I think we need to take a hard line with AI stuff like this, and put the onus on the creator to prove these ideas have merit.

[−] wenc 30d ago
I feel Gastown is an attempt at answering: what if I push the multi-agent paradigm to its chaotic end?

But I think the point that Yegge doesn't address and that I had to discover for myself is: getting many agents working in parallel doing different things -- while cool and exciting (in an anthropomorphic way) -- might not actually be solving the right problem. The bottleneck in development isn't workflow orchestration (what Gastown does) -- it's actually problem decomposition.

And Beads doesn't actually handle the decomposed problem well. I thought it did. But all it is is a task-graph system. Each bead is a task, and agents can just pick up tasks to work on. That looks a lot like an SDE picking up a JIRA ticket, right? But the problem is embedding just enough context in the task that the agent can do it right. Often the task doesn't carry enough, so the agent has to guess the missing context. And it often produces plausible code that is wrong.
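To show how little a bare task graph encodes, here is a minimal sketch of "agents pick up ready tasks" using only the Python standard library. This is not how Beads is implemented; the task names and the `ready_tasks` / `full_order` helpers are made up for illustration. Note that the graph knows ordering and nothing else: there is no place in it for intent or context.

```python
from graphlib import TopologicalSorter


def ready_tasks(deps: dict[str, set[str]], done: set[str]) -> list[str]:
    """Return unfinished tasks whose dependencies have all been completed."""
    return sorted(
        task
        for task, needs in deps.items()
        if task not in done and needs <= done
    )


def full_order(deps: dict[str, set[str]]) -> list[str]:
    """Return one valid dependencies-first ordering of all tasks."""
    return list(TopologicalSorter(deps).static_order())


# Hypothetical project: each key depends on the tasks in its set.
DEPS = {
    "schema": set(),
    "importer": {"schema"},
    "report": {"importer"},
    "docs": set(),
}
```

Here `ready_tasks(DEPS, set())` offers `docs` and `schema`, and once `schema` is done `importer` becomes available. Whether the agent that picks up `importer` knows enough to build the right thing is a separate question the graph cannot answer.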

Devolving a goal into the smaller slices is really where a lot of difficulty lies. You might say, oh, "I can just tell Claude to write Epics/Stories/Tasks, and it'll figure it out". Right? But without something grounding it like a spec, Claude doesn't do a good job. It won't know exactly how much context to provide to each independent agent.

What I have found useful is spec-driven development, especially of the opinionated variety that Kiro IDE offers. Kiro IDE is a middling Cursor, but an excellent spec generator -- in fact one of the best. It generates 3 specs at 3 levels of abstraction. It generates a Requirements doc in EARS/INCOSE (used at Rolls-Royce and Boeing for reducing spec ambiguity), then generates a Design doc (commonly done at FAANG), and... then generates a Task list, which cross-references the sections of the requirements/design.

This kind of spec hugely limits the degrees of freedom. The Requirements part of the spec actually captures intent, which is key. The Design part mocks interfaces, embeds glossaries, and also embeds PBTs (property-based tests using Hypothesis -- maybe eventually Hegel?) as gating mechanisms to check invariants. The Task list is what Beads is supposed to do -- but Beads can't do a good job because it doesn't have the other two specs.

I've deployed 4 products now using Kiro spec-driven dev (+ Simon Willison's tip "do red/green tdd") and they're running in prod and so far so good. They're pressure-tested using real data.

Spec-driven development isn't perfect but I feel its aim is the correct one -- to capture intentions, to reduce the degrees of freedom, and to constrain agents toward correctness. I tried using Claude Code's /plan mode but it's nowhere as rigorous, and there's still spec drift in the generated code. It doesn't pin down the problem sufficiently.

Gastown/Beads are solutions for the workflow orchestration problem (which is exciting for tech bros), but at its core, that's not the most important problem. Problem decomposition is.

Otherwise you're just solving the wrong problem, fast.

[−] _doctor_love 30d ago
I'm a long-time Steve Yegge fan but a major Gas Town hater (now Gas City too, I guess). It's doubling down on all the wrong metaphors.

I also simply detest how Gas Town is modeled fundamentally on an extractive and destructive metaphor, the 19th century factory. I want to live in a verdant software garden, not a dystopian industrialist hellscape.

In my view the StrongDM guys are on the right long-term path.

[−] throw1234567891 30d ago
Does this support OpenAI-compatible APIs? Or is it only clowncode, codex and copilot? Love to try it but without OpenAI-compatible APIs it is junk.