Hey folks, I'm Alex from the reliability engineering team at Anthropic. We've just posted the retrospective for this incident:
> On March 26–27, 2026, customers experienced elevated error rates when using Claude Opus 4.6 and Claude Sonnet 4.6. The issue was caused by a networking performance degradation within our cloud infrastructure that disrupted communication between components of our serving stack. We resolved the incident by migrating the affected workloads to healthy infrastructure, restoring normal service by 9:30 AM PT on March 27.
Yes, the general trend is the unprecedented growth that we've seen. Typically one would have some time in advance to re-engineer the systems to support the increase in traffic and users. But we're dealing with very compressed timelines, and while most of the time we're able to fix the issues beforehand, sometimes we have to fix them in production. Sorry for that.
It's pretty damn good, and it's seen a real exodus of conscientious users; the QuitGPT movement alone hit 1.5 million participants, with Claude skyrocketing to #1 on the App Store virtually overnight. No surprise the servers are getting hammered.
The ironic thing about outages such as this one and GitHub's recent spate of outages is that, if those vendors' sales pitches are to be believed, the vendors could just ask their LLMs to program reliable replacements overnight (okay, maybe over a weekend).
They seem to be a victim of their own success. Their response times are quite bad, and it's widely believed they are doing something to degrade service quality (quantizing?) in order to stretch resources. They just announced that they're cutting their usage limits down during peak hours as well.
They're at serious risk of losing their lead with this sort of performance.
This is not an outage, Claude just gets lazier on Fridays.
Sometimes Claude wants more lunch breaks, takes a half day and leaves the desk early just like any human would. (since AI boosters like comparing LLMs to humans all the time) /s
> On March 26–27, 2026, customers experienced elevated error rates when using Claude Opus 4.6 and Claude Sonnet 4.6. The issue was caused by a networking performance degradation within our cloud infrastructure that disrupted communication between components of our serving stack. We resolved the incident by migrating the affected workloads to healthy infrastructure, restoring normal service by 9:30 AM PT on March 27.
https://status.claude.com/incidents/b9802k1zb5l2
> Our uptime has a '9' in it! -- Anthropic
Not one of the usual ones that has service problems :)
I personally prefer per-token, it makes you more thoughtful about your setup and usage, instead of spray and pray.
You can also access the notable open-weight models with Vertex AI; you only need to change the model ID string.
Very few cases these days. Feels like we are lucky to get 2 9s anymore.
time to give your devops guy his job back.
Anthropic has had more than that.
Yikes.
They are the best.
ChatGPT is Walmart.
Gemini is Kroger.
Claude is... idk your local grocer that is always amazing and costs more?