As long as everyone is here, have you seen the token usage just go up remarkably recently for the $100 plan? it lasts a lot less time than it used to recently. Might be related to recent releases of claude.
Yes, it’s extremely obvious. The recent “we give you $100/$200 extra credit for a month” is clearly just “you’re supposed to pay extra for the same usage from now on” dressed up as a “bonus”, just like the “bonus” off-peak usage they handed out shortly before announcing a faster burn rate during peak hours.
And the recent “Investigating usage limits hitting faster than expected” [1] is probably them intentionally gauging how much they can push it without too much of an uproar.
I'm on the basic £18/month plan and with Sonnet 4.6 I literally get 20 maybe 30 minutes of use out of it per day. It's borderline useless now. I was using it for some Home Assistant changes yesterday and it used up my entire daily allowance after 8 prompts.
I guess 2026 is the last year AI is widely available to anyone who isn't willing to shell out hundreds if not thousands for a monthly subscription. I guess all that's left is to thank all the investors for the free ride LOL
Hard to square that with how good open-weights models are getting? I'm doing stuff with Qwen3.5-4b that required a frontier hosted model less than a year ago.
The problem is you're still a year behind with this approach, and it isn't at all clear locally hosted models can keep the gap from widening. We need more turboquant-like algorithmic boosts for this to happen.
Maybe you're experiencing normal usage rates now that the 2x March promotion is over?
> From March 13, 2026 through March 28, 2026, your five-hour usage is doubled during off-peak hours (outside 8 AM-2 PM ET / 5-11 AM PT / 12-6 PM GMT) on weekdays. Usage remains unchanged from 8 AM-2 PM ET / 5-11 AM PT / 12-6 PM GMT on weekdays.
Source: https://support.claude.com/en/articles/14063676-claude-march...
We discovered a bug in AWS Bedrock that is double counting cache writes when thinking/reasoning is enabled for the Anthropic models. It’s not clear to me if this is limited to just AWS Bedrock or all providers. AWS Support is aware.
We’ve also observed a much higher cache miss rate in the past few weeks. Combine both together and your usage consumption can be greatly increased.
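Those two effects compound. Here's a rough sketch of the arithmetic; all the per-token rates, token counts, hit rates, and the 2x write multiplier below are illustrative assumptions, not Anthropic's or AWS's actual numbers:

```python
# Illustrative only: how double-counted cache writes plus a higher cache
# miss rate compound into much higher consumption for the same prompt.
INPUT_RATE = 3.00        # $ per million input tokens (assumed)
CACHE_WRITE_RATE = 3.75  # $ per million cache-write tokens (assumed, 1.25x input)
CACHE_READ_RATE = 0.30   # $ per million cache-read tokens (assumed, 0.1x input)

def session_cost(prompt_tokens, hit_rate, write_multiplier=1.0):
    """Cost of one request where `hit_rate` of the prompt is served from
    cache and the remainder is written to cache (possibly double-counted)."""
    hits = prompt_tokens * hit_rate
    misses = prompt_tokens - hits
    return (hits * CACHE_READ_RATE
            + misses * CACHE_WRITE_RATE * write_multiplier) / 1e6

normal = session_cost(200_000, hit_rate=0.9)                          # healthy cache
degraded = session_cost(200_000, hit_rate=0.5, write_multiplier=2.0)  # misses + double count

print(f"normal:   ${normal:.4f}")
print(f"degraded: ${degraded:.4f}  ({degraded / normal:.1f}x)")
```

With these made-up numbers the degraded session costs roughly 6x the healthy one, which is the kind of jump people are reporting.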
I'm on the max 20 plan, and yes, it's the same for me. The week before last it used to last all week for me, but now it's Wednesday and it's already at 40% usage.
Like any company, they will squeeze usage as much as they possibly can. There's a real chance prices end up at $1k+ so that only enterprises can afford coding subscriptions. Those who have the ROI will pay for it.
Current phase of usage/pricing is just testing the waters. Especially considering they are the market leader in this category.
I read this on Reddit daily; we have usage monitoring running and collect all stats, and we have seen no difference at all. Maybe they are split testing or something?
Could you elaborate what these usage monitors look like? I collect data locally and can easily show that cost per token has gone up in some of my sessions
All our people run a cron script which counts token use (from the jsonl logs) and runs a scripted CLI /usage (sending keyboard input to the running claude code), then sends both to a central system where we can review them. We see no real changes on any of the accounts, individually or averaged. I have to note that we only use Sonnet 4.6; Opus always ran over limits unless continuously monitored and switched over to Sonnet, so it has been useless to us since it came out.
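For anyone wanting to run the same kind of check, here's a minimal sketch of the token-counting half. It assumes the jsonl layout Claude Code currently writes under `~/.claude/projects/` (one JSON object per line, with per-request counts under `message.usage`); field names may vary across versions, so treat this as a starting point:

```python
# Sum token usage fields from Claude Code session logs (jsonl files).
# Assumes each line is a JSON object with an optional message.usage dict.
import json
from collections import Counter
from pathlib import Path

FIELDS = ("input_tokens", "output_tokens",
          "cache_creation_input_tokens", "cache_read_input_tokens")

def count_tokens(log_dir: Path) -> Counter:
    totals = Counter()
    for log in log_dir.glob("**/*.jsonl"):
        for line in log.read_text().splitlines():
            try:
                usage = json.loads(line).get("message", {}).get("usage", {})
            except (json.JSONDecodeError, AttributeError):
                continue  # skip malformed or non-object lines
            for key in FIELDS:
                totals[key] += usage.get(key, 0) or 0
    return totals

if __name__ == "__main__":
    log_dir = Path.home() / ".claude" / "projects"
    if log_dir.exists():
        print(count_tokens(log_dir))
```

Run it from cron and ship the totals wherever you aggregate metrics; comparing `cache_read_input_tokens` against `cache_creation_input_tokens` over time is the quickest way to spot a cache-miss-rate change like the one described above.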
Unless you’re somehow on a different quota system, or maybe using Haiku, there’s no way you can sustain five continuous hours of parallel agents running without hitting the 5h quota limit, even on the 20x max plan. But maybe your company is flagged as VIP or something.
A bit surprised by the snarky comments here -- I also want Claude to work reliably but very few (no?) companies have ever seen this level of rapid growth. We're going to go through a long fail-whale-style period and I can imagine very, very few companies that could avoid that.
Claude Code started making stupid errors around Saturday. I have been using it frequently for months, and now it feels like back in the day when I tried Gemini for the first time.
The legendary one nine of reliability. Frankly, feels like they should be down to zero nines by now.
I get that they barely have the infrastructure to run their models at scale even when absolutely nothing goes wrong in any of it, but holy shit does it suck to be on the receiving end of that.
Makes me wonder where all the "bubble" talk is even coming from when we have a top 3 provider getting fucked over on every day of the week that ends in Y because of its inability to bring compute online faster than inference demand grows.
[1] https://www.reddit.com/r/ClaudeAI/comments/1s7zgj0/investiga...
Did you already try tools that can help reduce token usage so you can get more prompts in on the same plan? Some great ones are:
https://github.com/gglucass/headroom-desktop
https://github.com/rtk-ai/rtk
https://github.com/chopratejas/headroom
https://github.com/samuelfaj/distill
I am using Claude constantly, multiple agents, around 8-10hrs a day, 5 or 6 days a week, and I'm never anywhere near my limit.
API Error: 529 {"type":"error","error":{"type":"overloaded_error","message":"Overloaded"},"request_id": "xxxxxxx"}
After all, it's so dangerous.
(though Copilot is working :) and OpenCode)
But any combination of the Claude models are up or down on any given day: https://status.claude.com/
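A 529 is a server-side "overloaded" signal, so the standard client-side mitigation is to retry with exponential backoff and jitter rather than hammering the endpoint. A minimal sketch, where `OverloadedError` and the wrapped call are stand-ins for whatever your SDK actually raises and invokes:

```python
# Retry a flaky API call with capped exponential backoff and full jitter.
import random
import time

class OverloadedError(Exception):
    """Stand-in for the error your client raises on an HTTP 529 response."""

def with_backoff(call, max_retries=5, base=1.0, cap=30.0):
    for attempt in range(max_retries):
        try:
            return call()
        except OverloadedError:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the error
            # full jitter: sleep a random amount up to the capped exponential
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

Tools like Claude Code presumably do something similar internally, but if you're scripting against the API directly, wrapping your calls this way turns transient overload spikes into latency instead of hard failures.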