Show HN: Real-time dashboard for Claude Code agent teams (github.com)

by simple10 28 comments 77 points


[−] saadn92 44d ago
The hooks performance finding matches what I've seen. I run multiple Claude Code agents in parallel on a remote VM and the first thing I learned was that anything blocking in the agent's critical path kills throughput. Even a few hundred milliseconds per hook call compounds fast when you have agents making dozens of tool calls per minute.
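The compounding claim is easy to put rough numbers on. A tiny back-of-envelope sketch, using made-up but plausible figures for "a few hundred milliseconds" and "dozens of tool calls per minute":

```python
# Back-of-envelope cost of a blocking hook (all numbers illustrative).
hook_latency_s = 0.3        # a few hundred milliseconds per hook call
tool_calls_per_min = 40     # "dozens of tool calls per minute"

blocked_s_per_min = hook_latency_s * tool_calls_per_min
throughput_loss = blocked_s_per_min / 60

print(blocked_s_per_min)    # seconds of every minute spent waiting on hooks
print(throughput_loss)      # fraction of agent throughput lost
```

At these assumed rates the agent spends 12 seconds of every minute blocked, i.e. roughly a 20% throughput loss from hooks alone.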

The docker-based service pattern is smart too. I went a different direction for my own setup -- tmux sessions with worktree isolation per agent, which keeps things lightweight but means I have zero observability into what each agent is actually doing beyond tailing logs manually. This solves that gap in a way that doesn't add overhead to the agent itself, which is the right tradeoff.
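A minimal sketch of that no-overhead pattern, assuming a hypothetical local collector at `localhost:4000/events` (the endpoint, and the idea of forwarding from a hook script, are illustrative, not taken from the project): the hook hands the payload to a detached child process and returns immediately, so the agent never waits on the network.

```python
"""Hypothetical non-blocking hook script: the process the agent waits on
exits in milliseconds; a detached child does the slow HTTP POST."""
import subprocess
import sys

# The slow part runs in the child. Errors are swallowed so a logging
# failure never surfaces to the agent.
FORWARDER = """
import sys, urllib.request
req = urllib.request.Request(
    "http://localhost:4000/events",              # assumed collector endpoint
    data=sys.argv[1].encode(),
    headers={"Content-Type": "application/json"},
)
try:
    urllib.request.urlopen(req, timeout=5)
except Exception:
    pass
"""

def forward(event_json: str) -> None:
    # Spawn a detached child and return without waiting on it.
    subprocess.Popen(
        [sys.executable, "-c", FORWARDER, event_json],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # let it outlive this hook process
    )

# As a hook entry point you would call: forward(sys.stdin.read())
```

The agent-facing cost is just the `Popen` call; whether the collector is up or down, the hook returns at the same speed.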

Curious about one thing -- how does the dashboard handle the case where a sub-agent spawns its own sub-agents? Does it track the full tree or just one level deep?

[−] simple10 44d ago
Sub-agent trees are fully tracked by the dashboard. When an agent is spawned, it always has a parent agent ID; Claude sends this in the hooks payload. When you mouse over an agent in the dashboard, it shows which agent spawned it. There currently isn't a tree view of agents in the UI, but it would be easy to add since the data is all there.

[Edit] When Claude spawns sub-agents, they inherit the parent's hooks, so all sub-agent activity gets logged by default.
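Since every event carries a parent agent ID, reconstructing the full spawn tree from a log is only a few lines. A sketch, where the field names (`agent_id`, `parent_id`) are assumptions for illustration, not the exact hook schema:

```python
# Rebuild the spawn tree from logged hook events (field names assumed).
from collections import defaultdict

def build_tree(events):
    """Group agent IDs under their parents; events without a parent are roots."""
    children, roots = defaultdict(list), []
    for ev in events:
        if ev.get("parent_id") is None:
            roots.append(ev["agent_id"])
        else:
            children[ev["parent_id"]].append(ev["agent_id"])
    return roots, dict(children)

def render(agent, children, depth=0):
    """Indent each agent two spaces per level of the spawn tree."""
    lines = ["  " * depth + agent]
    for child in children.get(agent, []):
        lines.extend(render(child, children, depth + 1))
    return lines
```

This handles arbitrarily deep nesting, which is why a one-level-vs-full-tree distinction never comes up: each event only needs to name its immediate parent.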

[−] petcat 44d ago
Are you guys spending hundreds (or thousands) of dollars a day on Claude tokens? Holy crap. I can't get more than one or two agents to do anything useful for very long before I'm hitting my usage limits.
[−] kami23 44d ago
I'm in a great situation where I've been piloting Claude for the company among a small group of others, and I've been obsessed with pushing the limits of how many sessions and agents I can keep working at a time. We threw some work at Gas Town and another orchestrator, but they felt too rigid and opinionated for my liking. Then again, I'm biased: I want to build my own eventually.

When I go home to my $20 plan I'm sad and annoyed, but I don't want to pay more for something that's good enough to let me work a bit at a time. It's a good pomodoro timer for me, personally.

Something like this is perfect for some of the problems I've wanted to solve with a command-and-control tool that has malleable visuals.

OP: This is cool, thank you for sharing.

[−] simple10 44d ago
I hit a lot of limits on the Pro plan. I upgraded to the Max $200/mo plan and haven't hit limits in a while.

It's super important to check your plugins or use a proxy to inspect raw prompts. If you have a lot of skills and plugins installed, you'll burn through tokens 5-10x faster than normal.

Also, have Claude use sub-agents and agent teams. They're significantly lighter on token usage when they're spawned with fresh context windows. In the Agents Observe dashboard you can see exactly what prompt and response Claude uses to spawn sub-agents.

[−] edwhitesell 44d ago
I'd bet there are many. I know a few teams with spends in the thousands of dollars per day. It sounds crazy, but not too unrealistic.
[−] PermissionTrail 44d ago
I've been having the same issue. It's such a shame because it is levels above the other AIs
[−] kami23 44d ago
I tried using hooks to set up my DIY version of what is now channels in Claude. I had Claude writing them without really looking at the results, because the vibes were strong, and it struggled with odd behaviors around them. Nice to see some of the possible reasons; I ended up killing that branch of work, so I never figured out exactly what was happening.

Now I'm regretting not going deeper on these. This is the type of interface that I think will be perfect for some things I want to demonstrate to a greater audience.

Now that we have the actual internals I have so many things I want to dig through.

[−] nemo8312 43d ago
This is exactly what I needed. I'm running 4 autonomous marketing agents (content, engagement, learning, strategy), and the hardest part is visibility into what they're doing. I've built a custom daily activity summary, but it's basic. How do you handle the case where agents are running fine but producing bad outputs? We had an issue where the quality scorer's centroids went stale and the agent kept posting content that scored "ok" internally but got zero real engagement.
[−] LeoStehlik 44d ago
This is what I've been missing running multi-agent ops through OpenClaw.

The opacity problem is the one I hit hard: when a coordinator spawns 3-4 agents in parallel (builder, reviewer, tester, each with their own tool calls), the only visibility you have is what they choose to report back. Which is often sanitised and … dangerously optimistic.

The role separation / independent verification structure I run helps catch bad outputs, but it doesn't give me the live timeline of HOW an agent got to a conclusion. That's why I find this genuinely useful.

Noticed OpenClaw is already on the roadmap - had my hands tingling to fork and adapt it. Starring it for now and added to my watchlist. The hook architecture should translate … OpenClaw fires session events that could feed the same pipeline. Looking forward to seeing that happen.

[−] cdnsteve 42d ago
This looks like it solves something I've been struggling with in my project, Sugar (1). Using the SDK with sub-agents running, I found it difficult to get real-time insight into exactly what they were doing.

You can create a huge task list and Ralph mode can crank through it and also store persistent memory.

Interested in trying them together.

1. https://github.com/roboticforce/sugar

[−] silbercue 42d ago

> Claude code hooks are blocking - performance degrades rapidly if you have a lot of plugins that use hooks

can confirm. ended up being really careful about what runs synchronously vs in the background.
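One shape that sync/background split can take (a sketch, with illustrative names, not this commenter's actual setup): the hook path only parse-checks the payload and enqueues it, and a daemon worker thread does the slow file append.

```python
# Sync/background split: the hook-facing call returns in microseconds;
# a worker thread drains the queue and appends to the log.
import json
import queue
import threading

_events: "queue.Queue[str]" = queue.Queue()

def _writer(path: str) -> None:
    # Background worker: drains the queue and appends one line per event.
    with open(path, "a") as f:
        while True:
            f.write(_events.get() + "\n")
            f.flush()

def start_writer(path: str) -> None:
    threading.Thread(target=_writer, args=(path,), daemon=True).start()

def on_hook_event(raw: str) -> None:
    # Synchronous part: fail fast on malformed payloads, then just enqueue.
    json.loads(raw)
    _events.put(raw)
```

The synchronous cost is one `json.loads` plus a queue put; disk latency never touches the agent's critical path.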

IMHO the "sanitised optimism" thing others mention here is real too. had to add explicit verification steps because Claude kept reporting success when it just silenced the error. now I always make it prove things actually worked before moving on.

[−] kangraemin 42d ago
The "sanitised optimism" problem is real. I've seen agents report "fixed!" when they just suppressed the error.

Role separation (builder/reviewer/tester) helps but the reviewer agent also tends to be too polite. Making the reviewer explicitly output PASS/FAIL/UNKNOWN with no room for "looks good overall" is the only thing that worked for me.
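One way to enforce that contract mechanically (a sketch, assuming the reviewer is prompted to put the verdict alone on its first line): anything that isn't an exact verdict token, including "looks good overall", is coerced to UNKNOWN.

```python
# Strict verdict parsing: a polite non-answer counts as UNKNOWN, never PASS.
VERDICTS = {"PASS", "FAIL", "UNKNOWN"}

def parse_verdict(reply: str) -> str:
    """Return PASS/FAIL/UNKNOWN from the first line of a reviewer reply."""
    stripped = reply.strip()
    first = stripped.splitlines()[0].strip().upper() if stripped else ""
    return first if first in VERDICTS else "UNKNOWN"
```

Defaulting to UNKNOWN rather than PASS is the whole point: the coordinator can then treat UNKNOWN the same as FAIL and send the work back.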

[−] sumeno 44d ago
So many comment sections on these vibe coded Show HNs are just full of obvious bots talking to each other in the most generic boring way.
[−] warwickmcintosh 44d ago
The sanitised optimism problem mentioned upthread is the real gap here. Event stream logging tells you what tools were called and in what order, but it doesn't tell you whether the agent's self-reported outcome matches reality.
[−] nwlsrb 43d ago
I think there's a huge amount of value in just having a clear visual timeline of what all these sub-agents are actually doing behind the scenes
[−] ivaivanova 44d ago
Good to know background hooks make that much of a difference. How are you handling the case where multiple agent teams are writing to the same jsonl files simultaneously?
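Not the project's answer, but one common approach on POSIX systems (a sketch with illustrative names): take an advisory lock around each append so concurrent writers can't interleave mid-line.

```python
# Safe concurrent JSONL appends, assuming POSIX advisory locks (fcntl).
import fcntl
import json

def append_event(path: str, event: dict) -> None:
    """Append one JSON line, holding an exclusive lock for the write."""
    line = json.dumps(event) + "\n"
    with open(path, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)   # block until we own the file
        try:
            f.write(line)
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

In practice small writes to a file opened in append mode are usually atomic on local filesystems anyway, so the lock mainly buys safety for large events and networked filesystems.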
[−] theagentwall 44d ago
great idea. I am curious what the future of coding with multiple terminals and agents will look like and this looks like a great start!
[−] minnzen 44d ago
Cool project. The React reconciler underneath Claude Code's terminal layer is a solid foundation for this kind of real-time rendering.
[−] neozz 42d ago
Looks cool