I'm looking forward to trying this. I've had a positive but high-variance experience with Gastown[1], which is in the same genre. I hope that Scion does better.
My main complaints with Gastown are that (1) it's expensive, partly because (2) it refuses to use anything but Claude models, in spite of my configuration attempts, (3) I can't figure out how to back up or add a remote to its beads/dolt bug database, which makes me afraid to touch the installation, and (4) upgrading it often causes yak shaving and lost context. These might all be my own skill issues, but I do RTFM.
But wow, Gastown gets results. There's something magic about the dialogue and coordination between the mayor and the polecats that leads to an even better experience than Claude Code alone.
Really interesting to see Google's approach to this.
Recently I shared my approach, Optio, which is also an Agent Orchestration platform: https://news.ycombinator.com/item?id=47520220
I was much more focused on integrating with ticketing systems (Notion, GitHub Issues, Jira, Linear), and then having coding agents specifically work towards merging a PR.
Scion's support for long-running agents and inter-container communication looks really interesting though. I think I'll have to go plan some features around that. Some of their concepts make less sense to me: I chose to build on top of k8s, whereas they seem to be trying to make something that recreates the control plane. I'm somewhat skeptical that the recreation and grove/hub are needed, but maybe they'll make more sense once I see them in action.
The documentation mentions OAuth configuration, but doesn't list Claude Code as a harness that supports this. Just to confirm my understanding, does this mean that the only authentication and therefore billing method for Claude is API key, which means you get billed at the API rate, not toward your subscription usage?
The "isolation over constraints" framing is interesting. Scion enforces safety at the infrastructure layer, letting agents operate freely inside containers while controlling what they can reach on the outside. That is a runtime approach.
We have been exploring a different layer for the same problem. ARIA (aria-ir.org) is an intermediate representation designed for AI-authored code. Instead of constraining the agent at runtime, it constrains what the agent produces at the representation level. Functions must declare effects, intent annotations are mandatory and verifiable, and the compiler enforces memory safety at compile time before anything executes.
The two approaches are not mutually exclusive. Scion handles what the agent can reach. ARIA handles what the agent generates. A system that uses both would have safety at the output layer and safety at the execution layer. Curious whether the Scion team has thought about what properties the code an agent produces should have, independent of how that agent is isolated.
Isolation over constraints sounds like the right philosophy. Containers give you a boundary but not visibility into what ran inside them. Curious how much execution context Scion surfaces. Without that, you're still in a position similar to the LiteLLM attack, where something can run and cause damage before you know it happened.
The failure mode most underrepresented in agent testbeds is cascading failure: what happens when individually correct agents interact in ways that produce collectively incorrect outcomes. Most testing focuses on individual agent behaviour.
Does the testbed have a model for multi-agent state conflicts? Can you simulate two agents concurrently modifying the same resource and observe the resolution behaviour?
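To make the question concrete, here is a minimal sketch of the kind of conflict I mean. It is not based on Scion's code or API; `Resource`, `ConflictError`, `write`, and `simulate` are hypothetical names, and the resolution rule (versioned compare-and-set, first writer wins) is just one assumed policy among many.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """A shared resource with a version counter (hypothetical model)."""
    value: str = ""
    version: int = 0


class ConflictError(Exception):
    """Raised when a write is based on a stale read."""


def write(resource: Resource, new_value: str, read_version: int) -> None:
    # Compare-and-set: apply the write only if nobody wrote since we read.
    if resource.version != read_version:
        raise ConflictError(
            f"stale write: read v{read_version}, current v{resource.version}"
        )
    resource.value = new_value
    resource.version += 1


def simulate():
    shared = Resource()
    # Both agents read the same snapshot before either one writes.
    seen_by_a = shared.version
    seen_by_b = shared.version
    attempts = [
        ("agent-a", seen_by_a, "patch from A"),
        ("agent-b", seen_by_b, "patch from B"),
    ]
    log = []
    for agent, seen, value in attempts:
        try:
            write(shared, value, seen)
            log.append((agent, "committed"))
        except ConflictError:
            # Each agent behaved correctly on its own; the conflict only
            # exists because of the interleaving.
            log.append((agent, "conflict"))
    return shared, log


if __name__ == "__main__":
    shared, log = simulate()
    print(shared, log)
```

The interleaving is fixed on purpose so the outcome is reproducible. A real testbed would presumably want to vary the ordering and the resolution policy (retry, merge, escalate) and record what each agent does after the conflict.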
This looks really promising. I am curious about the choice to use containers as the isolation layer though. If the goal is to treat agents as untrusted and isolate them fully, I feel like microVMs would be a better option.
If it supports OCI runtimes though then maybe kata containers can be plugged in, I'll have to dig in after work and see.
> This project is early and experimental. Core concepts are settled, but expect rough edges. Local mode: relatively stable - Hub-based workflows: ~80% verified - Kubernetes runtime: early with known rough edges
i guess gastown is a better choice for now? idk i don't feel good about "relatively stable"
I want to experiment more with agents but my employer only pays for Claude Code, and TOS disallows using the subscription API for other purposes. Anyone else in the same boat? Token based pricing also gets expensive fast.
Their agent tooling is shaping up to be another instance of the well-known product cancellation problem. How many different takes on this do they have now? (gemini-cli, antigravity, AI studio, this, Gemini app)
I've not been impressed with any of them. I do use their ADK in my custom agent stack for the core runtime. That one I think is good and has legs for longevity.
The main enterprise problem here is getting the various agent frameworks to play nice. How should one have shared runtimes, session clones, sandboxes, memory, etc between the tooling and/or employees?
I swore never to be burned by Google again after TensorFlow. This looks cool, and I will give it to my Codex to chew on and explain whether it fits (or could fit) what I am building right now (msx.dev), and then move on. I don't trust Google to maintain the tools I rely on.
Agent orchestration is one side of the problem. The other side is: where does the data go?
When agents process EU user data (names, emails, IBANs) and route it to US model providers, that's a GDPR violation.
I open sourced a routing layer that detects PII in prompts and forces EU-only inference when personal data is found:
https://github.com/mahadillahm4di-cyber/mh-gdpr-ai.eu
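A rough illustration of the routing idea, not the code from the repository above: the regexes, `contains_pii`, and `choose_region` are hypothetical names made up for this sketch, and pattern matching of this kind only catches structured identifiers such as emails and IBAN-shaped strings. It would miss names and most free-form personal data.

```python
import re

# Simplified patterns, assumed for illustration only.
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
# IBAN-like: country code, two check digits, then grouped alphanumerics.
IBAN_RE = re.compile(
    r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){2,7}(?: ?[A-Z0-9]{1,3})?\b"
)


def contains_pii(prompt: str) -> bool:
    """Return True if the prompt contains an email or IBAN-shaped string."""
    return bool(EMAIL_RE.search(prompt) or IBAN_RE.search(prompt))


def choose_region(prompt: str, default: str = "us") -> str:
    """Force EU-only inference when personal data is detected."""
    return "eu" if contains_pii(prompt) else default


if __name__ == "__main__":
    for text in (
        "Refund the order for alice@example.com",
        "Pay to DE89 3704 0044 0532 0130 00",
        "Summarize this changelog",
    ):
        print(choose_region(text), "<-", text)
```

The point of the sketch is the decision shape: detection runs before the provider is chosen, and a positive match overrides the default region rather than merely logging a warning.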
1. https://github.com/gastownhall/gastown/
https://github.com/GoogleCloudPlatform/scion
ADK was (and is) exceptional, but nobody is actually making noise and pushing for it as they should. It feels like Microsoft .NET back in the day.
Let's see how it goes. I'm rooting for y'all
> https://en.wikipedia.org/wiki/SCION_(Internet_architecture)
and also wrote about it https://s2.dev/blog/distributed-ai-agents