OpenCode was the first open source agent I used, and it became my main workhorse after I experimented briefly with Claude Code and realized the potential of agentic coding. Because of that, and because it's a popular open source alternative, I want to be able to recommend it and be enthusiastic about it. The problem for me is that the development practices of the people working on it are suboptimal at best: they release at an extremely high cadence, without taking the time to test or fix things (or even to build a proper list of changes for each release), and they add, remove, refine, change, fix, and break features constantly at that accelerated pace.
More than that, it's an extremely large and complex TypeScript code base — probably larger and more complex than it needs to be — and (partly as a result) it's fairly resource inefficient (often uses 1GB of RAM or more. For a TUI).
On top of that, at least I personally find the TUI to be overbearing and a little bit buggy, and the agent to be so full of features that I don't really need — also mildly buggy — that it sort of becomes hard to use and remember how everything is supposed to work and interact.
I found out about OpenCode through the Anthropic feud. I now spend most of my AI time in it, both at work and at home. It turns out to be pretty great for general chat too, with the ability to easily integrate various tools you might need (search being the top one of course).
I have things to criticize about it, their approach to security and pulling in code being my main one, but overall it’s the most complete solution I’ve found.
They have a server/client architecture, a client SDK, a pretty good web UI and use pretty standard technologies.
The extensibility story is good and just seems like the right paradigms mostly, with agents, skills, plugins and providers.
They also ship very fast, for better and for worse. I’ve personally enjoyed the rapid improvements (~2 days from criticizing not being able to disable the default provider in the web UI to being able to).
I think OpenCode has a pretty bright future and so far I think that my issues with it should be pretty fixable. The amount of tasteful choices they’ve made dwarfs the few untasteful ones for me so far.
I love OpenCode! I wrote a plugin that adds two tools: prune and retrieve. Prune lets the LLM select messages to remove from the conversation and replace with a summary and key terms. The retrieve tool lets it get those original messages back in case they're needed. I've been livestreaming the development and using it on side projects to make sure it's actually effective... And it turns out it really is! It feels like working with an infinite context window.
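For anyone wondering what prune and retrieve amount to mechanically, here is a rough sketch of the idea in Python. It is not the plugin's code (which I have not seen) and it does not use the OpenCode plugin API; the `Message` and `Conversation` classes and the `[pruned]` placeholder format are made up purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    id: int
    text: str


@dataclass
class Conversation:
    messages: list = field(default_factory=list)
    archive: dict = field(default_factory=dict)  # placeholder id -> pruned originals
    next_id: int = 0

    def add(self, text):
        """Append a message and return its id."""
        msg = Message(self.next_id, text)
        self.next_id += 1
        self.messages.append(msg)
        return msg.id

    def prune(self, ids, summary, key_terms):
        """Swap the selected messages for a single summary placeholder."""
        wanted = set(ids)
        selected = [m for m in self.messages if m.id in wanted]
        if not selected:
            raise ValueError("no matching messages to prune")
        position = self.messages.index(selected[0])
        text = f"[pruned] {summary} (key terms: {', '.join(key_terms)})"
        placeholder = Message(self.next_id, text)
        self.next_id += 1
        self.archive[placeholder.id] = selected
        self.messages = [m for m in self.messages if m.id not in wanted]
        self.messages.insert(position, placeholder)
        return placeholder.id

    def retrieve(self, placeholder_id):
        """Return the original messages that a placeholder stands for."""
        return list(self.archive[placeholder_id])
```

The point of keeping the key terms in the placeholder is that the model still has something to match against when it decides whether the originals are worth retrieving.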
I’ve been extraordinarily productive with this, their $10 Go plan, and a rigorous spec-driven workflow. Haven’t touched Claude in 2 months.
I sprinkle in some billed API usage to power my task-planner and reviewer subagents (both use GPT 5.4 now).
The ability to switch models is very useful and a great learning experience. GLM, Kimi and their free models surprised me. Not the best, not perfect, but still very productive. I would be a wary shareholder if I owned a stake in the frontier labs… that moat seems to be shrinking fast.
I don't use it for coding but as an agent backend. Maybe opencode was designed mainly for coding, but for me it's incredibly good as an agent, especially when paired with skills and a fastapi server, and opencode go (minimax) is just so much intelligence at an incredibly cheap price. Plus, you can talk to it via channels if you use a claw.
Opencode stands out as one of the few agents with a proper client-server architecture, which allows something like openchamber's great vscode extension, so it's possible to switch seamlessly between tui, vscode, webapp and desktop app. I think there is hardly a usable alternative for most coding agent use cases (assuming agents from model providers are a no-go; they cannot be allowed to own the tools AND the models). But it's also far from perfect: the webui is secretly served from their servers instead of locally, for no reason. Worse, the fallback route also gets sent to their servers, so any unknown request to the opencode api ends up being sent to opencode servers, potentially leaking data. The security defaults are horrific; it's impossible to use it safely outside a controlled container. It will just serve your whole hard drive via a rest endpoint and not constrain access to project folders. The share feature, which uploads your conversations to their servers, is also so weirdly communicated and implemented that it leaves a bad taste. I don't think this will become much better until the agent ecosystem is more modular and less monolithic: acp, a2a and mcp need to become good enough that tools, prompts, skills, subagent setups, workflow engines and UIs are completely swappable, and the agent core only has to focus on the essentials like runtime and glue architecture. I really hope we don't see all of these grow into full agent OSes with artificial lock-in effects and big-effort buy-in.
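To make the "constrain to project folders" point concrete, this is roughly the kind of check a file-serving endpoint would need. It is a generic Python sketch, not OpenCode's code; the function name `resolve_inside` and the example root are made up for illustration.

```python
from pathlib import Path


def resolve_inside(project_root, requested):
    """Resolve a requested path and refuse anything outside the project root."""
    root = Path(project_root).resolve()
    # Joining with an absolute path discards the root, so "/etc/passwd" is caught too.
    target = (root / requested).resolve()
    try:
        target.relative_to(root)
    except ValueError:
        raise PermissionError(f"{requested!r} is outside the project root") from None
    return target


if __name__ == "__main__":
    root = "/tmp/demo-project"  # hypothetical project folder
    print(resolve_inside(root, "src/main.py"))
    for attempt in ("../secrets.txt", "/etc/passwd"):
        try:
            resolve_inside(root, attempt)
        except PermissionError as err:
            print("refused:", err)
```

Resolving before comparing is what matters here, since it collapses ".." segments and follows symlinks before the containment test runs.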
i've been using this as my primary harness for llama.cpp models, Claude, and Gemini for a few months now. the LSP integration is great. i also built a plugin to enable a very minimal OpenClaw alternative as a self modifying hook system over IPC as a plugin for OpenCode: https://github.com/khimaros/opencode-evolve -- and here's a deployment ready example making use of it which runs in an Incus container/VM: https://github.com/khimaros/persona
The team also is not breathlessly talking about how coding is dead. They have pretty sane takes on AI coding including trying to help people who care about code quality.
I'd really like to get more clarification on offline mode and privacy. The github issues related to privacy did not leave a good feeling, despite being initially excited. Is offline mode a thing yet? I want to use this, but I don't want my code to leave my device.
The only thing I'm wondering is if they have eval frameworks (for lack of a better word). Their prompts don't seem to have changed for a while and I find greater success after testing and writing my own system prompts + modification to the harness to have the smallest most concise system prompt + dynamic prompt snippets per project.
I feel that if you want to build a coding agent / harness the first thing you should do is to build an evaluation framework to track performance for coding by having your internal metrics and task performance, instead I see most coding agents just fiddle with adding features that don't improve the core ability of a coding agent.
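By "evaluation framework" I mean something as small as the following to start with. This is only a sketch: the `Task` shape, the `evaluate` function and the idea of treating an agent as a plain callable from prompt to output are my own assumptions, not anything OpenCode ships.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Task:
    name: str
    prompt: str
    check: Callable[[str], bool]  # True if the agent's output is acceptable


def evaluate(agent, tasks):
    """Run every task through the agent; return (pass rate, per-task results)."""
    results = {task.name: bool(task.check(agent(task.prompt))) for task in tasks}
    pass_rate = sum(results.values()) / len(results) if results else 0.0
    return pass_rate, results


if __name__ == "__main__":
    tasks = [
        Task("adds", "write add(a, b)", lambda out: "def add" in out),
        Task("tests", "write a test for add", lambda out: "assert" in out),
    ]

    # Stand-in for a real harness call with one particular system prompt.
    def stub_agent(prompt):
        return "def add(a, b):\n    return a + b"

    print(evaluate(stub_agent, tasks))
```

Run the same task list against each system prompt or harness change and compare the pass rates before deciding whether the change actually helped.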
What would be the advantage using this over say VSCode with Copilot or Roo Code? I need to make some time to compare, but just curious if others have a good insight on things.
The Agent that is blacklisted from Anthropic AI, soon more to come.
I really like how their subagents work, as a bonus I get to choose which model is in which agent. Sadly I have to resort to the mess that Anthropic calls Claude Code
Stupid question, but are there models worth using that specialize in a particular programming language? For instance, I'd love to be able to run a local model on my GPU that is specific to C/C++ or Python. If such a thing exists, is it worth it vs one of the cloud-based frontier models?
I'm guessing that a model which only covers a single language might be more compact and efficient vs a model trained across many languages and non-programming data.
Can someone explain how Claude Code can instantly determine what file I have open and what lines I have selected in VS Code even if it's just running in a VS Code terminal instance, yet I cannot for the life of me get OpenCode to come anywhere close to that same experience?
The OpenCode docs suggest it's possible, but it only works with their extension (not in an already open VS Code terminal), with a very specific keyboard shortcut, and only barely at that.
I use it with Qwen 3.5 running locally when my daily limits run out on my other subscriptions.
The harness is great. Local models are just slow enough that the subscription models are easier to use. For most of my tasks these days, the model's capability is sufficient; it is just not as snappy.
I want to love this, but the "just install it globally, what could go wrong?" is simply not happening for an AI-written codebase. Open Source was never truly "you can trust it because everyone can vet it", so you had to do your due diligence. Now with AI code bases, that's "it might be open source, but no one actually knows how it works and only other AIs can check if it's safe because no one can read the code". Who's getting the data? No idea. How would you find out? I guess you can wireshark your network? This is not a great feeling.
I tried to use it but OpenCode won't even open for me on Wayland (Ubuntu 24.04), whichever terminal emulator I use. I wasn't even aware a TUI could have compatibility issues with Wayland.
OpenCode works awesome for me. The BigPickle model is all I want. I don't throw large pieces of work at the agent that require a lot of reasoning, thinking or decision making. It's my role to chop the work down to bite-size pieces and ask the fantastic BigPickle to just do the damn coding or a bit of explaining. It works very well in interactive sessions with small tasks. I'm not giving it something to work on overnight.
I used Claude with a paid subscription, and codex as well, and settled on OpenCode with free models.
One thing that makes OpenCode stand out to me is the web UI. I host it on my rPi 4B, serving as my AI assistant and remote mobile access to my homelab.
Since the homelab doesn't really have access to any risky data, I just gave OpenCode full Docker access and connect to it through Tailscale on my iPhone https://github.com/pprotas/homelab
Since this is blowing up, gonna plug my opencode/claude-code plugin that allows you to annotate LLMs plans like a Google doc with strikethroughs, comments, etc. and loop with your agent until you're happy with the plan.
I started with Codex, then switched to OpenCode, then switched to Codex.
OpenCode just has more bugs, and it's incredibly derivative, so it doesn't really do anything that Codex doesn't.
The advantage of OpenCode is that it can use any underlying model, but that's a disadvantage because it breaks the native integration. If you use Opus + Claude Code, or Gpt-Codex + Codex App, you are using it the way it was designed to be used.
If you don't actually use different models, or plan to switch, or somehow value vendor neutrality strategically, you are paying a large cost without much reward.
This is a general rule: vendor neutrality is often seen as a generic positive, but it is actually a tradeoff. If you just build on top of AWS, for example, you make use of its features and build much faster and simpler than if you use Terraform.
- GH copilot API is a first class citizen with access to multiple providers’ models at a very good price with a pro plan
- no terminal flicker
- it seems really good with subagents
- I can’t see any terminal history inside my emacs vterm :(
I wish they would add back support for anthropic max/pro plans via calling the claude cli in -p mode. As I understand it, that's still very much allowed usage of the claude code cli (you are still using the claude cli as it was intended, and it fixes the issue of cache hits, which I believe was the primary reason anthropic sent them the c&d). I love the UX of OpenCode (I loved setting it up in web mode on my home server and coding from the web browser vs doing claude code over ssh), but until I can use my pro/max subscription I can't go back; the API pricing is way too much for my third world country wallet.
One thing I notice across all these coding agents is that none of them have a trust or reputation layer. If an agent generates code, merges a PR, or deploys something — there is no cross-tool way to verify whether that agent has a track record of reliable output. We treat every agent invocation as equally trustworthy. In a world where agents are increasingly calling other agents' tools via MCP, that seems like a gap. The agent running your code review has no way to know if the agent that wrote the code has ever produced working code before.
I use this. I run it in a sandbox[0]. I run it inside Emacs vterm so it's really quick for me to jump back and forth between this and magit, which I use to review what it's done.
I really should look into more "native" Emacs options as I find using vterm a bit of a clunky hack. But I'm just not that excited about this stuff right now. I use it because I'm lazy, that's all. Right now I'm actually getting into woodwork.
Question: How do we use Agents to Schedule and Orchestrate Farming and Agricultural production, or Manufacturing assembly machines, or Train rail transportation, or mineral and energy deposit discovery and extraction or interplanetary terraforming and mining, or nuclear reactor modulation, or water desalination automation, or plutonium electric fuel cell production with a 24,000 year half-life radiation decay, or interplanetary colonization, or physics equation creation and solving for faster-than-light travel?
Many folks from other tools are only getting exposed to the same functionality they got used to, but it offers much more than other harnesses, especially for remote coding.
You can start a service via opencode serve; it can be accessed from anywhere and offers a great experience on mobile apart from a few bugs. It's a really good way to work with your agents remotely, and it goes really well with Tailscale.
The WebUI that they have can connect to multiple OpenCode backends at once, so you may use multiple VPS-es for various projects you have and control all of them from a single place.
Lastly, there's a desktop app, but TBH I find it redundant when WebUI has everything needed.
Make no mistake though, it's not a perfect tool. My gripes with it:
- There are random bugs with loading/restoring state of the session
- Model/Provider selection switch across sessions/projects is often annoying
- I had a bug making Sonnet/Opus unusable from mobile phone because phone's clock was 150ms ahead of laptop's (ID generation)
- Sometimes the agent gets randomly stuck. It especially sucks for long/nested sessions
- WebUI on laptop just completely forgot all the projects one day
- opencode serve doesn't pick up new skills automatically, it needs to be restarted
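On the clock-skew gripe above: I don't know how OpenCode actually builds its IDs, but here is a sketch of how any time-prefixed ID scheme can go wrong when two devices disagree about the time. The `make_id` format and all the numbers are hypothetical, chosen only to mirror the 150ms case.

```python
def make_id(clock_ms, sequence):
    """Build a sortable ID from a device's own clock plus a sequence number."""
    return f"{clock_ms:013d}-{sequence:04d}"


TRUE_TIME_MS = 1_700_000_000_000  # arbitrary reference instant
PHONE_SKEW_MS = 150               # phone clock runs 150 ms ahead of the laptop

# The phone sends a message at the reference instant, stamped with its fast clock.
user_msg = make_id(TRUE_TIME_MS + PHONE_SKEW_MS, 0)
# The laptop creates the reply 100 ms later in real time, stamped with its own clock.
reply = make_id(TRUE_TIME_MS + 100, 1)

# Sorting by ID now puts the reply before the message it answers.
ordered = sorted([user_msg, reply])
print(ordered)
```

Any consumer that assumes ID order equals conversation order would see the reply arrive before its question, which is one plausible way a small skew could break a session.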
I've used it but recently moved back to plain claude code. We use claude at the company and, weirdly, the experience has become less and less productive using opencode. I'm a bit sad about it, as it was the first experience that really clicked and that I got great results out of. I'm actually curious whether Anthropic knows which client is used and whether they negatively influence the experience on purpose. It's very difficult to prove because nothing about this is an exact science.
Interesting timing: I've been building on Cloudflare Workers with edge-first constraints, and the resource footprint of most AI coding tools is striking by comparison. A TypeScript agent that uses 1GB+ RAM for a TUI feels like the wrong abstraction. The edge computing model forces you to think differently about state, memory, and execution, so maybe that's where lighter agentic tools will emerge.
I've been using opencode for a few months and really like it, both from a UX and a results perspective.
It started getting increasingly flaky with Anthropic's API recently, so I switched back to Claude Code for a couple of days. Oh my, what a night and day difference. Tokens, MCP use, everything.
For anyone reading at OpenAI, your support for OpenCode is the reason I now pay you 200 bucks a month instead.
What it does well: helps context switching by using one window to control many repos with many worktrees each.
What can it do better?
It puts the AI too much in control. What if I want to edit a function myself in the workspace I'm working on? Or select a snippet and refer to it in the prompt? Without that, I feel it's missing a non-negotiable feature.
I wish the team would be more responsive to popular issues - like inability to provide a dynamic api key helper like claude has. This one even has a PR open: https://github.com/anomalyco/opencode/issues/1302
That's my favorite CLI agent, over codex, claude, copilot and qwen-code.
It has beautified markdown output, many more subagents, and access to free models, unlike claude and codex. Best is opencode with GitHub opus 4.6, but the fun only lasts for a day, then you're out of tokens for a month.
I've used both. I stuck with Claude Code, the ergonomics are better and the internals are clearly optimized for Opus which I use daily, you can feel it. That said OpenCode is still a very good alternative, well above Codex, Gemini CLI or Mistral Vibe in my experience.
The decision to build this as a TUI rather than a web app is interesting. Terminal-native tools tend to get out of the way and let you stay in flow -- curious how the context management works when you have a large codebase, do you chunk by file or do something smarter?
619 comments
https://www.youtube.com/live/z0JYVTAqeQM?si=oLvyLlZiFLTxL7p0
To change that, you need to set a custom "small model" in the settings.
Hugely grateful for what they do.
"we see occasional complaints about memory issues in opencode
if you have this can you press ctrl+p and then "Write heap snapshot"
Upload here: https://romulus.warg-snake.ts.net/upload"
Original post: https://x.com/i/status/2035333823173447885
https://github.com/ndom91/open-plan-annotator
[0] https://blog.gpkb.org/posts/ai-agent-sandbox/
- With love The Official Pink Eye #ThereIsNoOther
https://opencode.de/