As long as everyone is here, have you seen the token usage just go up remarkably recently for the $100 plan? it lasts a lot less time than it used to recently. Might be related to recent releases of claude.
Yes, it’s extremely obvious. The recent “we give you $100/$200 extra credit for a month” is clearly just “you’re supposed to pay extra for the same usage from now on” dressed up as a “bonus”, just like the “bonus” off-peak usage they handed out shortly before announcing a faster burn rate during peak hours.
And the recent “Investigating usage limits hitting faster than expected” [1] is probably them intentionally gauging how much they can push it without too much of an uproar.
I'm on the basic £18/month plan and with Sonnet 4.6 I literally get 20 maybe 30 minutes of use out of it per day. It's borderline useless now. I was using it for some Home Assistant changes yesterday and it used up my entire daily allowance after 8 prompts.
I guess 2026 is the last year AI is widely available to anyone who isn't willing to shell out hundreds if not thousands for a monthly subscription. I guess all that's left is to thank all the investors for the free ride LOL
Hard to square that with how good open-weights models are getting? I'm doing stuff with Qwen3.5-4b that required a frontier hosted model less than a year ago.
The problem is you're still a year behind with this approach, and it isn't at all clear locally hosted models can keep the gap from widening. We need more turboquant-like algorithmic boosts for this to happen.
Maybe you're experiencing normal usage rates now that the 2x March promotion is over?
> From March 13, 2026 through March 28, 2026, your five-hour usage is doubled during off-peak hours (outside 8 AM-2 PM ET / 5-11 AM PT / 12-6 PM GMT) on weekdays. Usage remains unchanged from 8 AM-2 PM ET / 5-11 AM PT / 12-6 PM GMT on weekdays.
Source: https://support.claude.com/en/articles/14063676-claude-march...
We discovered a bug in AWS Bedrock that is double counting cache writes when thinking/reasoning is enabled for the Anthropic models. It’s not clear to me if this is limited to just AWS Bedrock or all providers. AWS Support is aware.
We’ve also observed a much higher cache miss rate in the past few weeks. Combine both together and your usage consumption can be greatly increased.
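Those two effects compound. Here's a rough sketch of the arithmetic; all the per-token rates, token counts, hit rates, and the 2x write multiplier below are illustrative assumptions, not Anthropic's or AWS's actual numbers:

```python
# Illustrative only: how double-counted cache writes plus a higher cache
# miss rate compound into much higher consumption for the same prompt.
INPUT_RATE = 3.00        # $ per million input tokens (assumed)
CACHE_WRITE_RATE = 3.75  # $ per million cache-write tokens (assumed, 1.25x input)
CACHE_READ_RATE = 0.30   # $ per million cache-read tokens (assumed, 0.1x input)

def session_cost(prompt_tokens, hit_rate, write_multiplier=1.0):
    """Cost of one request where `hit_rate` of the prompt is served from
    cache and the remainder is written to cache (possibly double-counted)."""
    hits = prompt_tokens * hit_rate
    misses = prompt_tokens - hits
    return (hits * CACHE_READ_RATE
            + misses * CACHE_WRITE_RATE * write_multiplier) / 1e6

normal = session_cost(200_000, hit_rate=0.9)                          # healthy cache
degraded = session_cost(200_000, hit_rate=0.5, write_multiplier=2.0)  # misses + double count

print(f"normal:   ${normal:.4f}")
print(f"degraded: ${degraded:.4f}  ({degraded / normal:.1f}x)")
```

With these made-up numbers the degraded session costs roughly 6x the healthy one, which is the kind of jump people are reporting.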
I'm on the max 20 plan, and yes, it's the same for me. The week before last it used to last all week for me, but now it's Wednesday and it's already at 40% usage.
Like any company, they will squeeze usage as much as they possibly can. There's a real chance prices end up at $1k+ so that only enterprises can afford coding subscriptions. Those who have the ROI will pay for it.
Current phase of usage/pricing is just testing the waters. Especially considering they are the market leader in this category.
I read this on Reddit daily; we have usage monitoring running and collect all stats, and we have seen no difference at all. Maybe they are split testing or something?
Could you elaborate what these usage monitors look like? I collect data locally and can easily show that cost per token has gone up in some of my sessions
All our people run a cron script which counts token use (from the jsonl logs) and runs a scripted CLI /usage (sending keyboard input to the running claude code), then sends both to a central system where we can review them. We see no real changes on any of the accounts, individually or averaged. I have to note that we only use Sonnet 4.6; Opus always ran over limits unless continuously monitored and switched over to Sonnet, so it has been useless to us since it came out.
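For anyone wanting to run the same kind of check, here's a minimal sketch of the token-counting half. It assumes the jsonl layout Claude Code currently writes under `~/.claude/projects/` (one JSON object per line, with per-request counts under `message.usage`); field names may vary across versions, so treat this as a starting point:

```python
# Sum token usage fields from Claude Code session logs (jsonl files).
# Assumes each line is a JSON object with an optional message.usage dict.
import json
from collections import Counter
from pathlib import Path

FIELDS = ("input_tokens", "output_tokens",
          "cache_creation_input_tokens", "cache_read_input_tokens")

def count_tokens(log_dir: Path) -> Counter:
    totals = Counter()
    for log in log_dir.glob("**/*.jsonl"):
        for line in log.read_text().splitlines():
            try:
                usage = json.loads(line).get("message", {}).get("usage", {})
            except (json.JSONDecodeError, AttributeError):
                continue  # skip malformed or non-object lines
            for key in FIELDS:
                totals[key] += usage.get(key, 0) or 0
    return totals

if __name__ == "__main__":
    log_dir = Path.home() / ".claude" / "projects"
    if log_dir.exists():
        print(count_tokens(log_dir))
```

Run it from cron and ship the totals wherever you aggregate metrics; comparing `cache_read_input_tokens` against `cache_creation_input_tokens` over time is the quickest way to spot a cache-miss-rate change like the one described above.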
Unless you’re somehow on a different quota system, or maybe using Haiku, there’s no way you can sustain five continuous hours of parallel agents running without hitting the 5h quota limit, even on the 20x max plan. But maybe your company is flagged as VIP or something.
A bit surprised by the snarky comments here -- I also want Claude to work reliably but very few (no?) companies have ever seen this level of rapid growth. We're going to go through a long fail-whale-style period and I can imagine very, very few companies that could avoid that.
Claude Code started making stupid errors around Saturday. I have been using it frequently for months, and now it feels like back in the day when I tried Gemini for the first time.
The legendary one nine of reliability. Frankly, feels like they should be down to zero nines by now.
I get that they barely have the infrastructure to run their models at scale even when absolutely nothing goes wrong in any of it, but holy shit does it suck to be on the receiving end of that.
Makes me wonder where all the "bubble" talk is even coming from when we have a top 3 provider getting fucked over on every day of the week that ends in Y because of its inability to bring compute online faster than inference demand grows.
[1] https://www.reddit.com/r/ClaudeAI/comments/1s7zgj0/investiga...
Did you already try tools that can help reduce token usage so you can get more prompts in on the same plan? Some great ones are:
https://github.com/gglucass/headroom-desktop
https://github.com/rtk-ai/rtk
https://github.com/chopratejas/headroom
https://github.com/samuelfaj/distill
I am using Claude constantly, multiple agents, around 8-10hrs a day, 5 or 6 days a week, and I'm never anywhere near my limit.
API Error: 529 {"type":"error","error":{"type":"overloaded_error","message":"Overloaded"},"request_id": "xxxxxxx"}
After all, it's so dangerous.
(though Copilot is working :) and OpenCode)
But any combination of the Claude models are up or down on any given day: https://status.claude.com/
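A 529 is a server-side "overloaded" signal, so the standard client-side mitigation is to retry with exponential backoff and jitter rather than hammering the endpoint. A minimal sketch, where `OverloadedError` and the wrapped call are stand-ins for whatever your SDK actually raises and invokes:

```python
# Retry a flaky API call with capped exponential backoff and full jitter.
import random
import time

class OverloadedError(Exception):
    """Stand-in for the error your client raises on an HTTP 529 response."""

def with_backoff(call, max_retries=5, base=1.0, cap=30.0):
    for attempt in range(max_retries):
        try:
            return call()
        except OverloadedError:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the error
            # full jitter: sleep a random amount up to the capped exponential
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

Tools like Claude Code presumably do something similar internally, but if you're scripting against the API directly, wrapping your calls this way turns transient overload spikes into latency instead of hard failures.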