OpenCode was the first open source agent I used, and my main workhorse after experimenting briefly with Claude Code and realizing the potential of agentic coding. Due to that, and because it's a popular open source alternative, I want to be able to recommend it and be enthusiastic about it. The problem for me is that the development practices of the people that are working on it are suboptimal at best; they're constantly releasing at an extremely high cadence, where they don't even spend the time to test or fix things (or even build a proper list of changes for each release), and they add, remove, refine, change, fix, and break features constantly at that accelerated pace.
More than that, it's an extremely large and complex TypeScript code base — probably larger and more complex than it needs to be — and (partly as a result) it's fairly resource inefficient (often uses 1GB of RAM or more. For a TUI).
On top of that, I personally find the TUI to be overbearing and a little bit buggy, and the agent so full of features I don't really need — and also mildly buggy — that it becomes hard to use and to remember how everything is supposed to work and interact.
I am more concerned about their, umm, cavalier approach to security. Not only is OpenCode permissive by default in what it is allowed to do, it also apparently tries to pull its config from the web (a provider-based URL) by default [1]. There is also this open GitHub issue [2], which I find quite concerning (worst case, it's an RCE vulnerability).
[1] https://opencode.ai/docs/config/#precedence-order
[2] https://github.com/anomalyco/opencode/issues/10939
It also sends all of your prompts to Grok's free tier by default, and the free tier trains on your submitted information; xAI can do whatever they want with that, including building ad profiles, etc.
You need to set an explicit "small model" in OpenCode to disable that.
This. I work on projects that warrant a self-hosted model to ensure nothing is leaked to the cloud. Imagine my surprise when I discovered that even though the only configured model is local, all my prompts are sent to the cloud to... generate a session title. Fortunately I caught it during the testing phase.
If you're using software someone else wrote, you'd have to repeat this testing phase any time an update is installed, right?
(I do mean this as a general principle, but also it was pointed out elsewhere in the thread that this is a particularly "high velocity" project as far as unexpected changes go.)
I’m curious if there’s a reason you’re not just coding in a container without access to the internet, or some similar setup? If I was worried about things in my dev chain accessing any cloud service, I’d be worried about IDE plugins, libraries included in imports, etc. and probably not want internet access at all.
I mean, the default model being Grok - whatever, everyone sets that to their favorite anyway. But the hidden use of a different model is wow.
The small_model option configures a separate model for lightweight tasks like title generation. By default, OpenCode tries to use a cheaper model if one is available from your provider, otherwise it falls back to your main model.
I would expect that if you set a local model it would just use the same model. Or if, for example, you set GPT as the main model, it would use something else from OpenAI. I see no mention of Grok as a default.
https://opencode.ai/docs/config/
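For anyone wanting the concrete fix discussed above: it's a one-line setting in opencode.json. A minimal sketch, assuming the documented model/small_model keys; the model IDs here are placeholders for whatever provider/model you actually use:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "model": "ollama/qwen3:32b",
  "small_model": "ollama/qwen3:32b"
}
```

With small_model pinned to the same local model, lightweight tasks like title generation should never leave your machine.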
> and (partly as a result) it's fairly resource inefficient (often uses 1GB of RAM or more. For a TUI).
That's (one of the reasons) why I'm favoring Codex over Claude Code.
Claude Code is an... Electron app (for a TUI? WTH?) and Codex is Rust. The difference is tangible: the former feels sluggish and does some odd redrawing when the terminal size changes, while the latter definitely feels more snappy to me (leaving aside that GPT's responses also seem more concise). At some point, I had both chewing concurrently on the same machine and same project, and Claude Code was using multiple GBs of RAM and 100% CPU whereas Codex was happy with 80 MB and 6%.
Performance _is_ a feature and I'm afraid the amounts of code AI produces without supervision lead to an amount of bloat we haven't seen before...
> The problem for me is that the development practices of the people that are working on it are suboptimal at best; they're constantly releasing at an extremely high cadence, where they don't even spend the time to test or fix things (or even build a proper list of changes for each release), and they add, remove, refine, change, fix, and break features constantly at that accelerated pace.
this is what i notice with openclaw as well. there have been releases where they break production features. unfortunately this is what happens when code becomes a commodity; everyone thinks that shipping fast is the moat, at the expense of quality, since they know a fix can be shipped quickly in the next release.
I recently listened to this episode from the Claude Code creator (here is the video version: https://www.youtube.com/watch?v=PQU9o_5rHC4) and it sounded like their development process was somewhat similar - he said something like their entire codebase has 100% churn every 6 months. But I would assume they have a more professional software delivery process.
I would (incorrectly) assume that a product like this would be heavily tested via AI - why not? AI should be writing all the code, so why would the humans not invest in and require extreme levels of testing since AI is really good at that?
OpenCode's creator acknowledged that the ease of shipping has let them ship prototype features that probably weren't worth shipping and that they need to invest more time cleaning up and fixing things.
https://x.com/thdxr/status/2031377117007454421
I'm still trying to figure out how "open" it really is; there are reports that it phones home a lot [0], and there is even a fork that claims to remove this behavior [1]. That being said, I do prefer OpenCode to Codex and Claude Code.
[0] https://www.reddit.com/r/LocalLLaMA/comments/1rv690j/opencod...
[1] https://github.com/standardnguyen/rolandcode
Probably all the described problems stem from the developers using agentic coding, including the choice of TypeScript, since these tools are usually more familiar with JS and JS-adjacent web development languages.
The value of having (and executing) a coherent product vision is extremely undervalued in FOSS, and IMO it's the difference between a successful project in the long term and the kind of sploogeware that just snowballs with low-value features.
Yeah, every time I want to like it, scrolling is glitchy compared to Codex and Claude. And various other things, like: why is this giant model list hardcoded for Ollama and other local methods instead of loading what I actually have...
On top of that, OpenCode Go was a complete scam. It was not advertised as having lower-quality models when I paid, and GLM5 was broken compared to another provider, returning gibberish and acting very dumb on the same prompt.
Is there a name for these types of "overbearing" and visually busy "TUIs"? It seems like all the other agents have the same aesthetic, and it is unlike traditional curses or plain-text interfaces in a bad way IMO. The constant spinners, sidebars and needless margins are a nuisance to me. Especially over an SSH connection in a tmux session, it feels wrong.
I’m a little surprised by your description of constant releases and instability. That matches how I would describe Claude Code, and has been one of the main reasons I tend to use OpenCode more than Claude Code.
OpenCode has been much more stable for me in the 6 months or so that I’ve been comparing the two in earnest.
I’ve been testing OpenCode and it feels TUI in appearance only. I prefer the command line and TUIs, and in my mind the idea of a TUI is to be a low-level, extremely portable interface that gets out of the way. OpenCode does not have a low-color, standard-terminal theme, so I had to switch to a different terminal program. Copy/paste is hijacked, so I need to write code out to a file in order to get a snippet. The Enter key (as in the return key on the keypad) does not work for sending a line. I have not tested it, but I don't think this would even work over SSH. I've been googling around to find out if I'm holding it wrong, but it breaks the expectations of a terminal app in a way that makes me wish they had made it a GUI. It makes me sad, because I think the goods are there and it's otherwise good.
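On the theme complaint, there may be a partial fix. A minimal opencode.json sketch, assuming the documented theme option, where "system" is meant to inherit your terminal's existing colors rather than impose its own palette:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "theme": "system"
}
```

It wouldn't address the copy/paste hijacking or the keypad-Enter issue, though.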
> Due to that, and because it's a popular open source alternative, I want to be able to recommend it and be enthusiastic about it. The problem for me is that the development practices of the people that are working on it are suboptimal at best;
This is my experience with most AI tools that I spend more than a few weeks with. It's happening so often it's making me question my own judgement: "if everything smells of shit, check your own shoes." I left professional software engineering a couple of years ago, and I don't know how much of this is also just me losing touch with the profession, or being an old man moaning about how we used to do it better.
It reminds me of social media: there was a time when social media platforms were defined by their features — Vine was short video, snapchat was disappearing pictures, twitter was short status posts, etc. — but now they're all bloated messes that try to do everything.
The same looks to be happening with AI and agent software. They start off defined by one feature, and then become messes trying to implement the latest AI approach (skills, or tools, or functions, or RAG, or AGENTS.md, or claws, etc. etc.)
I tried running OpenCode on my $7/yr 512 MB VPS, but it hit the OOM issue; yes, it needs 1 GB of RAM or more.
I then tried running other options like picoclaw/picocode etc but they were all really hard to manage/create
The UI/UX I want is to just put in my free OpenRouter API key and be ready to go, with access to free models like Arcee AI right now.
After reading the comments in this thread, I tried Crush by Charmbracelet again, and it gives the UI/UX that I want.
I am definitely impressed by Crush and the Charm team. They are on HN, and their tools work great for me; highly recommended if you want something that can work on low-resource devices.
I do feel like Charm's TUIs are almost too beautiful, in the sense that over an SSH connection the rendering can lag; when I tried to copy some things, the delay made them less copy-able. But overall I am using Crush and am happy for the most part :-)
Edit: That being said, just as I was typing this, Crush burned through all the free requests I get from OpenRouter. A minor issue, and not really Crush's fault; overall, my point stands that Crush is worth checking out.
Kudos to the CharmBracelet team for making awesome golang applications!
> they're constantly releasing at an extremely high cadence, where they don't even spend the time to test or fix things
Tbf, this seems exactly like Claude Code: they release about one new version per day, sometimes even multiple. It’s a bit annoying constantly getting those messages telling me to upgrade cc to the latest version.
I'm quite impressed by how a team managed to wrangle the mental model of React and JSX into a terminal interface; in fact, I can only imagine that that itself is a product of AI.
That said, the runtime is so resource-heavy that, even though the heavy computational workload is handed to AI on a remote cluster of servers, it will bring an old-ish laptop to a standstill.
I do wonder, though... highly interactive TUIs are not novel. I would wager that AI plus the attention of frontend devs has created an environment where you can make fancy terminal UIs without concern for how terminals generally work, and if Electron is sitting in the background, that proves it.
I agree that OpenCode is using a lot of RAM, but regarding the features, I am only using the built-in features and I wouldn't say there are too many; they are just enough for a complete workflow. If you need more you can install plugins, which I haven't done yet, and it's been my daily driver for four months.
Isn't this pretty much the standard across projects that make heavy use of AI code generation?
Using AI to generate all your code only really makes sense if you prioritize shipping features as fast as possible over the quality, stability and efficiency of the code, because that's the only case in which the actual act of writing code is the bottleneck.
I found out about OpenCode through the Anthropic feud. I now spend most of my AI time in it, both at work and at home. It turns out to be pretty great for general chat too, with the ability to easily integrate various tools you might need (search being the top one of course).
I have things to criticize about it, their approach to security and pulling in code being my main one, but overall it’s the most complete solution I’ve found.
They have a server/client architecture, a client SDK, a pretty good web UI and use pretty standard technologies.
The extensibility story is good and just seems like the right paradigms mostly, with agents, skills, plugins and providers.
They also ship very fast, both for good and bad, I’ve personally enjoyed the rapid improvements (~2 days from criticizing not being able to disable the default provider in the web ui to being able to).
I think OpenCode has a pretty bright future and so far I think that my issues with it should be pretty fixable. The amount of tasteful choices they’ve made dwarfs the few untasteful ones for me so far.
I love OpenCode! I wrote a plugin that adds two tools: prune and retrieve. Prune lets the LLM select messages to remove from the conversation and replace with a summary and key terms. The retrieve tool lets it get those original messages back in case they're needed. I've been livestreaming the development (https://www.youtube.com/live/z0JYVTAqeQM?si=oLvyLlZiFLTxL7p0) and using it on side projects to make sure it's actually effective... And it turns out it really is! It feels like working with an infinite context window.
I’ve been extraordinarily productive with this, their $10 Go plan, and a rigorous spec-driven workflow. Haven’t touched Claude in 2 months.
I sprinkle in some billed API usage to power my task-planner and reviewer subagents (both use GPT 5.4 now).
The ability to switch models is very useful and a great learning experience. GLM, Kimi and their free models surprised me. Not the best, not perfect, but still very productive. I would be a wary shareholder if I owned a stake in the frontier labs… that moat seems to be shrinking fast.
I don't use it for coding but as an agent backend. Maybe OpenCode was intended mainly for coding, but for me it's incredibly good as an agent, especially when paired with skills and a FastAPI server, and OpenCode Go (MiniMax) is just so much intelligence at an incredibly cheap price. Plus, you can talk to it via channels if you use a claw.
opencode stands out as one of the few agents with a proper client/server architecture, which allows something like openchambers' great vscode extension, so it's possible to seamlessly switch between tui, vscode, webapp, and desktop app. i think there is hardly a usable alternative for most coding agent usecases (assuming agents from model providers are a no-go; they cannot be allowed to own the tools AND the models).

but it's also far from perfect. the webui is secretly served from their servers instead of locally, for no reason. worse, the fallback route also gets sent to their servers, so any unknown request to the opencode api ends up being sent to opencode's servers, potentially leaking data. the security defaults are horrific; it's impossible to use it safely outside a controlled container. it will just serve your whole hard drive via a rest endpoint instead of constraining itself to project folders. the share feature uploading your conversations to their servers is also so weirdly communicated and implemented that it leaves a bad taste.

i don't think this will become much better until the agent ecosystem is more modular and less monolithic. acp, a2a and mcp need to become good enough that tools, prompts, skills, subagent setups, workflow engines and UIs are completely swappable, and the agent core only has to focus on the essentials like runtime and glue architecture. i really hope we don't see all of these grow into full agent oses with artificial lock-in effects and big-effort buy-in.
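For anyone wanting to tighten those defaults without a container, a hedged opencode.json sketch, assuming the documented share and permission options (the exact keys and values are worth verifying against the current docs):

```json
{
  "$schema": "https://opencode.ai/config.json",
  "share": "disabled",
  "permission": {
    "edit": "ask",
    "bash": "ask",
    "webfetch": "ask"
  }
}
```

This only constrains the agent's tool use and sharing, though; it does nothing about the fallback-route and remotely-served-webui concerns above.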
i've been using this as my primary harness for llama.cpp models, Claude, and Gemini for a few months now. the LSP integration is great. i also built a plugin to enable a very minimal OpenClaw alternative as a self modifying hook system over IPC as a plugin for OpenCode: https://github.com/khimaros/opencode-evolve -- and here's a deployment ready example making use of it which runs in an Incus container/VM: https://github.com/khimaros/persona
The team also is not breathlessly talking about how coding is dead. They have pretty sane takes on AI coding including trying to help people who care about code quality.
I'd really like to get more clarification on offline mode and privacy. The GitHub issues related to privacy did not leave a good feeling, despite my initial excitement. Is offline mode a thing yet? I want to use this, but I don't want my code to leave my device.
The only thing I'm wondering is whether they have eval frameworks (for lack of a better word). Their prompts don't seem to have changed for a while, and I find greater success after testing and writing my own system prompts, plus modifying the harness to use the smallest, most concise system prompt with dynamic prompt snippets per project.
I feel that if you want to build a coding agent / harness, the first thing you should do is build an evaluation framework to track coding performance with your own internal metrics and task benchmarks; instead, I see most coding agents just fiddling with adding features that don't improve the agent's core ability.
What would be the advantage using this over say VSCode with Copilot or Roo Code? I need to make some time to compare, but just curious if others have a good insight on things.
The agent that is blacklisted by Anthropic; soon more to come.
I really like how their subagents work, as a bonus I get to choose which model is in which agent. Sadly I have to resort to the mess that Anthropic calls Claude Code
Stupid question, but are there models worth using that specialize in a particular programming language? For instance, I'd love to be able to run a local model on my GPU that is specific to C/C++ or Python. If such a thing exists, is it worth it vs one of the cloud-based frontier models?
I'm guessing that a model which only covers a single language might be more compact and efficient vs a model trained across many languages and non-programming data.
Can someone explain how Claude Code can instantly determine what file I have open and what lines I have selected in VS Code even if it's just running in a VS Code terminal instance, yet I cannot for the life of me get OpenCode to come anywhere close to that same experience?
The OpenCode docs suggest it's possible, but it only works with their extension (not in an already-open VS Code terminal), with a very specific keyboard shortcut, and only barely at that.
I use it with Qwen 3.5 running locally when my daily limits run out on my other subscriptions.
The harness is great. Local models are just slow enough that the subscription models are easier to use. For most of my tasks these days, the model's capability is sufficient; it is just not as snappy.
Hugely grateful for what they do.
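For reference, wiring a local OpenAI-compatible server into OpenCode looks roughly like the sketch below, following the custom-provider pattern from their docs; the npm package, baseURL, and model name are assumptions for a stock Ollama install:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama (local)",
      "options": { "baseURL": "http://localhost:11434/v1" },
      "models": {
        "qwen3:32b": { "name": "Qwen 3 32B (local)" }
      }
    }
  }
}
```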
I want to love this, but the "just install it globally, what could go wrong?" approach is simply not happening for an AI-written codebase. Open Source was never truly "you can trust it because everyone can vet it"; you still had to do your due diligence. Now with AI codebases, it's "it might be open source, but no one actually knows how it works, and only other AIs can check if it's safe because no one can read the code". Who's getting the data? No idea. How would you find out? I guess you can Wireshark your network? This is not a great feeling.
I tried to use it, but OpenCode won't even open for me on Wayland (Ubuntu 24.04), whichever terminal emulator I use. I wasn't even aware a TUI could have compatibility issues with Wayland.
OpenCode works awesome for me. The BigPickle model is all I want. I don't throw large tasks at the agent that require a lot of reasoning, thinking or decision making. It's my role to chop the work down into bite-size pieces and ask the fantastic BigPickle to just do the damn coding or a bit of explaining. It works very well in interactive sessions with small tasks; I'm not giving it something to work on overnight.
I used Claude with a paid subscription, and Codex as well, and settled on OpenCode with free models.
One thing that makes OpenCode stand out to me is the web UI. I host it on my rPi 4B, serving as my AI assistant and remote mobile access to my homelab.
Since the homelab doesn't really have access to any risky data, I just gave OpenCode full Docker access and connect to it through Tailscale on my iPhone https://github.com/pprotas/homelab
Since this is blowing up, gonna plug my opencode/claude-code plugin that lets you annotate LLM plans like a Google Doc, with strikethroughs, comments, etc., and loop with your agent until you're happy with the plan.
https://github.com/ndom91/open-plan-annotator
I started with Codex, then switched to OpenCode, then switched to Codex.
OpenCode just has more bugs, and it's incredibly derivative, so it doesn't really do anything Codex doesn't.
The advantage of OpenCode is that it can use any underlying model, but that's a disadvantage because it breaks the native integration. If you use Opus + Claude Code, or Gpt-Codex + Codex App, you are using it the way it was designed to be used.
If you don't actually use different models, or plan to switch, or somehow value vendor neutrality strategically, you are paying a large cost without much reward.
This is a rule in general: vendor neutrality is often seen as a generic positive, but it is actually a tradeoff. If you just build on top of AWS, for example, you make use of its features and build much faster and simpler than if you use Terraform.
- GH copilot API is a first class citizen with access to multiple providers’ models at a very good price with a pro plan
- no terminal flicker
- it seems really good with subagents
- I can’t see any terminal history inside my emacs vterm :(
I wish they would add back support for Anthropic Max/Pro plans via calling the claude CLI in -p mode. As I understand it, that's still very much allowed usage of the claude CLI (you're still using it as intended, and it fixes the issue of cache hits, which I believe was the primary reason Anthropic sent them the C&D). I love the UX of OpenCode (I loved setting it up in web mode on my home server and coding from the web browser instead of running claude code over SSH), but until I can use my Pro/Max subscription I can't go back; the API pricing is way too much for my third-world-country wallet.
One thing I notice across all these coding agents is that none of them have a trust or reputation layer. If an agent generates code, merges a PR, or deploys something — there is no cross-tool way to verify whether that agent has a track record of reliable output. We treat every agent invocation as equally trustworthy. In a world where agents are increasingly calling other agents' tools via MCP, that seems like a gap. The agent running your code review has no way to know if the agent that wrote the code has ever produced working code before.
I use this. I run it in a sandbox [0]. I run it inside Emacs vterm so it's really quick for me to jump back and forth between this and magit, which I use to review what it's done.
[0] https://blog.gpkb.org/posts/ai-agent-sandbox/
I really should look into more "native" Emacs options as I find using vterm a bit of a clunky hack. But I'm just not that excited about this stuff right now. I use it because I'm lazy, that's all. Right now I'm actually getting into woodwork.
Question: How do we use Agents to Schedule and Orchestrate Farming and Agricultural production, or Manufacturing assembly machines, or Train rail transportation, or mineral and energy deposit discovery and extraction or interplanetary terraforming and mining, or nuclear reactor modulation, or water desalination automation, or plutonium electric fuel cell production with a 24,000 year half-life radiation decay, or interplanetary colonization, or physics equation creation and solving for faster-than-light travel?
- With love, The Official Pink Eye #ThereIsNoOther
Many folks coming from other tools only get exposed to the same functionality they're used to, but it offers much more than other harnesses, especially for remote coding.
You can start a service via opencode serve; it can be accessed from anywhere, and the experience on mobile is great except for a few bugs. It's a really good way to work with your agents remotely, and it goes really well with Tailscale.
The WebUI that they have can connect to multiple OpenCode backends at once, so you may use multiple VPS-es for various projects you have and control all of them from a single place.
Lastly, there's a desktop app, but TBH I find it redundant when WebUI has everything needed.
Make no mistake though, it's not a perfect tool; my gripes with it:
- There are random bugs with loading/restoring state of the session
- Model/Provider selection switch across sessions/projects is often annoying
- I had a bug making Sonnet/Opus unusable from mobile phone because phone's clock was 150ms ahead of laptop's (ID generation)
- Sometimes the agent gets randomly stuck. It especially sucks for long/nested sessions
- WebUI on my laptop just completely forgot all the projects one day
- opencode serve doesn't pick up new skills automatically; it needs to be restarted
I've used it but recently moved back to plain Claude Code. We use Claude at the company, and weirdly the experience using OpenCode has become less and less productive. I'm a bit sad about it, as it was the first experience that really clicked and got me great results. I'm actually curious whether Anthropic knows which client is being used and negatively influences the experience on purpose. It's very difficult to prove, because nothing about this is exact science.
Interesting timing — I've been building on Cloudflare Workers with edge-first constraints, and the resource footprint of most AI coding tools is striking by comparison. A TypeScript agent that uses 1GB+ RAM for a TUI feels like the wrong abstraction. The edge computing model forces you to think differently about state, memory, and execution — maybe that's where lighter agentic tools will emerge.
I've been using opencode for a few months and really like it, both from a UX and a results perspective.
It started getting increasingly flaky with Anthropic's API recently, so I switched back to Claude Code for a couple of days. Oh my, what a night and day difference. Tokens, MCP use, everything.
For anyone reading at OpenAI, your support for OpenCode is the reason I now pay you 200 bucks a month instead.
What it does well: helps context switching by letting one window control many repos, with many worktrees each.
What could it do better? It puts AI too much in control. What if I want to edit a function myself in the workspace I'm working on, or select a snippet and refer to it in the prompt? Without that, I feel it's missing a non-negotiable feature.
I wish the team would be more responsive to popular issues - like the inability to provide a dynamic API key helper like Claude has. This one even has a PR open: https://github.com/anomalyco/opencode/issues/1302
That's my favorite CLI agent, over codex, claude, copilot and qwen-code.
It has beautified markdown output, many more subagents, and access to free models, unlike claude and codex. Best is opencode with GitHub Opus 4.6, but the fun only lasts a day; then you're out of tokens for a month.
I've used both. I stuck with Claude Code, the ergonomics are better and the internals are clearly optimized for Opus which I use daily, you can feel it. That said OpenCode is still a very good alternative, well above Codex, Gemini CLI or Mistral Vibe in my experience.
The decision to build this as a TUI rather than a web app is interesting. Terminal-native tools tend to get out of the way and let you stay in flow. Curious how the context management works when you have a large codebase: do you chunk by file or do something smarter?
Have fun on Windows - automatic no from me. https://github.com/anomalyco/opencode/issues?q=is%3Aissue%20...
It's fully open, fairly minimal, very extensible, and (while getting very frequent updates) has never broken on me so far.
Been using it more and more in the last two months, steadily switching from codex to it now.
I tried OpenCode but it was just too much? Same with Crush: 10/10 pretty, but lacking features I need. LSP support was cool though.
> they add, remove, refine, change, fix, and break features constantly at that accelerated pace.
I wonder how much of this is because the maintainers are using OpenCode to vibe the code for OpenCode.
"we see occasional complaints about memory issues in opencode
if you have this can you press ctrl+p and then "Write heap snapshot"
Upload here: https://romulus.warg-snake.ts.net/upload
Original post:https://x.com/i/status/2035333823173447885