Claude Code users hitting usage limits 'way faster than expected' (theregister.com)

by samizdis 224 comments 331 points

[−] pxtail 45d ago
Recently, after noticing how quickly limits are consumed and reading others' complaints about the same issue on Reddit, I was wondering how much of this is a real error or a bug hidden somewhere, and how much is testing what threshold of tightened limits people will tolerate without cancelling their accounts. If the shit ever hits the fan, it can always be dismissed by waving hands and apologizing (or not) about some abstract "bug".

The lack of transparency and accountability behind all of this is incredible, in my view.

[−] vintagedave 45d ago
I've run into this, and I highly doubt I am one of the more extraordinary users. I have delays between working with it, don't have many running at once, am running on smaller codebases, etc. Yet just a few minutes ago I hit a quota. In the past I did far more work with it without running into the quota.

I emailed their support a few days ago with details, concerns, a link to the twitter thread from one of their employees, and a concrete support request, which had an AI agent ('Fin') tell me:

> While our Support team is unable to manually reset or work around usage limits, you can learn about best practices here. If you’ve hit a message limit, you’ll need to wait until the reset time, or you can consider purchasing an upgraded plan (if applicable).

I replied saying that was not an appropriate answer.

You're absolutely right re the lack of transparency and accountability. On one hand, Anthropic generates goodwill by appearing to have a more ethical stance than OpenAI, and a better product. On the other hand, they kill it fast through extremely poor treatment of their customers.

If they have a bug, they need to resolve it, and in the meantime refund quotas. 'Unable to' - that's shocking. This is simple and reasonable. It's basic customer service. I don't know if they realise the damage their attitude is doing.

[−] Kim_Bruning 45d ago
Fin is the most useless thing ever. There's no obvious way to get reports in front of a human in a timely manner, and there's no reason to believe Fin interactions are even retained.

Ultimately this means no loyalty. I can't stay loyal to a brand that doesn't respond to inquiries, bug reports or outage reports at all.

I do understand that Anthropic is operating at a tremendous scale and can't have enough humans in the loop. This sounds like a good use for AI classification and triage, really!

[−] traceroute66 45d ago

> I can't stay loyal to a brand that doesn't actually respond to inquiries, bug reports or down reports at all.

Amen to this.

Being in business means having to respond to customer enquiries at some point.

Given the amount of billions being pumped into Anthropic's pockets and given the millions their senior-leadership no doubt pay themselves, I'm sure they could spare a bit of cash to get off their backsides and sort out the Customer Service.

I simply do not buy the "poor Anthropic, they are operating at scale, they are too busy winning to deal with customer service" argument that comes up time and time again.

The fact is there are many large businesses and many large governments that manage to deal with customers "at scale".

Scale means you respond a bit slower, maybe a few days, or a couple of weeks at most. But complete silence for months or years is inexcusable.

All of my experiences with "Fin" match those of my friends and colleagues .... namely that "Fin" is a synonym for "black hole". I've got "tickets" opened with "Fin" months ago that have not had a modicum of a reply.

[−] gaws 44d ago

> Being in business means having to respond to customer enquiries at some point.

Tell that to Google or Meta.

[−] Xmd5a 45d ago
[flagged]
[−] ThunderSizzle 45d ago
What started that though?
[−] therobots927 45d ago
It’s funny to me that you think this is a bug.
[−] joshuak 45d ago
It is also interesting to observe that the most valuable accounts in this kind of pricing model are the ones that are least used and therefore never run into the limits. Heavy users cancelling their accounts in frustration is a win for Anthropic, not a punishment, at least in the short term.
[−] HWR_14 45d ago
Casual users follow the recommendations of power users. Pushing heavy users off your service is a post-growth optimization.
[−] falkensmaize 45d ago
I suspect casual users are MUCH more likely to either cancel their account or switch providers on a whim.
[−] JambalayaJimbo 45d ago
Once you get used to using claude as an abstraction layer you start getting pretty reckless with it.

My organization has the concept of "premium models" where our limits reset every month. I hit my limit pretty quickly last month because I was burning tokens doing things that would have been a simple bash loop in the past - all because I was used to interfacing with Claude at the chat layer for all my automation needs and not thinking any more about it.
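For what it's worth, here is the kind of thing I mean by a "simple bash loop" (a hypothetical example, not the actual task: bulk-renaming files, which burns zero tokens):

```shell
# A token-free alternative to a chat-driven bulk task: rename every
# .log file in the current directory to .txt. (Hypothetical example;
# adjust the glob and extension to taste.)
for f in *.log; do
  [ -e "$f" ] || continue   # skip if the glob matched nothing
  mv -- "$f" "${f%.log}.txt"
done
```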

[−] devmor 45d ago
This is a real danger that I think a lot of people will run into as prices go up more and more in the future.

Completely outside of the productivity debate, offloading cognitive tasks to LLMs leaves you less practiced in them and less ready to do them when the LLM isn't available. When you have to delegate only certain tasks to the LLM for financial reasons, you may find yourself very frustrated.

[−] johntash 45d ago
I'm really hoping locally hosted llms get to the point of competing with current-day frontier models so that we all have "unlimited" usage.
[−] totalmarkdown 45d ago
[flagged]
[−] 3abiton 45d ago
This is the bet of many of the big AI companies, and why they're heavily subsidizing the calls. With the latest crackdowns by the US government, it seems Anthropic is starting to reduce those subsidies given their edge in the game. I am starting to consider local models more seriously beyond just testing, but nowadays the RAM/GPU market is inflated.
[−] devmor 45d ago
Local models just don't seem that useful for me for these particular tasks yet - the most recent versions of Codex and Claude Opus are the first time I've found them to be particularly useful in a "real engineering" context that isn't just vibe coding.

Google's TurboQuant might help address this, but it also might just widen the gap even further.

I am far on the skeptic edge when it comes to the generative AI side of ML tools though, so do take my opinion with that weight.

[−] 3abiton 45d ago
TurboQuant is totally irrelevant compared to current quantization methods. It has been thoroughly tested by people who build inference engines for local models. It's all talk, no actual meat to it.
[−] devmor 44d ago
Do you have any reading on this? I find it hard to believe something announced a week ago has been “thoroughly tested”.
[−] 3abiton 44d ago
Their paper TurboQuant (TQ) is not new per se. It was released last year, and is heavily a rehash of old ideas released a year prior (RaBitQ). There is also [a bit of drama](https://openreview.net/forum?id=tO3ASKZlok) there that boils down to what seems like a bit of malpractice by Google's researchers. TQ does a few things: it claims better compression quality and speed, and better KV cache handling. Currently the KV cache takes a load of resources besides the model itself. Many people have applied different quantization strategies to it, but the quality degradation is too apparent. Enter attention rotation. This seems to have genuinely helped KV cache compression, as per [llama.cpp's latest tests](https://github.com/ggml-org/llama.cpp/pull/21038). On the other hand, [ik_llama.cpp](https://www.reddit.com/r/LocalLLaMA/comments/1s7nq6b/technic...) did tests on the quality of TurboQuant-3 compared to IQ4-quantized models, and the quality degradation is much worse. So it's two things: KV compression -> good. TurboQuant quantization -> not good.
[−] cyanydeez 45d ago
Seriously, who isn't planning a local-first strategy?
[−] devmor 45d ago
I am sure a lot of people and orgs are - but realistically the majority of users need to understand and prepare not for local-first, but for the fact that they will never have that option for the models they know are the most useful to them.
[−] therobots927 44d ago
Every series A-C startup
[−] flu_bar 41d ago
do you think we're already seeing mental atrophy play out?

or do you think model inference/training will get so cheap that we won't reach the point of "high prices"?

[−] thisisit 45d ago
They keep running experiments, like free $50 in extra usage credits, or 2x usage outside certain windows when inference is very slow. You can't help but think this is all a boiling-the-frog experiment: testing how much they can charge.
[−] blharr 45d ago
They're boiling the frog pretty quickly, honestly. Token usage has clearly been an issue with Claude Code from the beginning. It just blows through tokens.
[−] joshuafuller 45d ago
This feels a lot like the same playbook we’re seeing with dynamic pricing in retail, just applied to compute instead of products. You never really know what you’re getting, and the rules shift under you.

What makes it worse is the lack of transparency. If there were clear, hard limits, people could plan around it. Instead it’s this moving target that makes it impossible to trust for real work.

At some point it stops feeling like a bug and starts feeling like a pricing experiment on users.

[−] bayarearefugee 45d ago
The clear trend over the past decade or so has been using analytics and data gathering to extract maximum rents from every customer in every industry and AI is going to massively accelerate this.

The only way out is government regulation which means we are screwed in the US (our government is too far gone to represent average citizen interests in any meaningful way) but Europeans maybe have a chance if they get it together and demand change.

[−] captainbland 45d ago
It's been pretty clear for a while that companies who have developed foundation models have essentially unprecedented levels of investment to recoup. For all the talk of faster hardware and more efficient models, that spend hasn't gone away and ultimately that investment needs to get a return somewhere.

Dependency on cloud AI models is, in effect, dependency on VC subsidy. From the user's point of view, this dependency is debt which will either be repaid with interest to a model provider or through the hard work of making themselves independent of such models after having become dependent.

[−] therobots927 44d ago
Wow, someone here has above room-temp IQ.
[−] tartoran 45d ago
What a horrid glimpse in the future. I hope we won't get there and we all collectively fight back with our wallets.
[−] ryandrake 45d ago
It's going to get much worse. We're soon going to have enough data and compute (and are losing enough online privacy) to allow every company to apply personalized pricing down to the individual. My local restaurant is going to know that I am willing to buy a burger for at most $4.57 and my neighbor is only willing to pay $2.91 for it, and they will have the ability to charge us individually. Every business is going to soak each of us to the maximum extent that the data says they can.
[−] falkensmaize 45d ago
I think there’s a pretty good argument to be made that this is discriminatory. Certainly it’s not something I would tolerate as a consumer. I suspect there will be heavy pressure to regulate this practice out of existence if it catches on.
[−] gmerc 45d ago
who is going to stop them? the consumer protection bureau?
[−] captainbland 45d ago
Depends what the political attitudes are where you live. The EU is unlikely to let it fly for example.
[−] cheschire 45d ago
Then your neighbor can charge you up to $1.65 to buy a burger on your behalf and you still get it cheaper.
[−] fcarraldo 45d ago
How can you compete when the algorithms are custom, individualized, and private? How would you even know that you should?
[−] cheschire 45d ago
Not competition, but more like an opportunity for a startup to build a solution that fits in the new gap. A marketplace for people to sell their discounts.
[−] symfoniq 45d ago
Who would voluntarily do business with a company that does this? Not me.
[−] fcarraldo 45d ago
Do you use Uber, Lyft, or Doordash?

What about airlines? https://fortune.com/2025/07/16/delta-moves-toward-eliminatin...

What about Staples or Home Depot? https://www.wsj.com/articles/SB10001424127887323777204578189...

[−] ryandrake 45d ago
Eventually, when all of them do this (and they will be effectively forced to in order to remain competitive), then we will not have a choice.
[−] nvch 45d ago
I will make burgers myself. I already take this approach with many goods and services that lack great suppliers. And I don't care if it's suboptimal because, in the long run, I'll have better skills and be protected from exactly this trend.
[−] thunderfork 45d ago
Everyone who uses Uber is voluntarily doing business with a company that does this. When was the last time you took an Uber?
[−] Tade0 45d ago
I'm worried that the present is actually living off a line of credit that will be spent/closed soon.
[−] gmerc 45d ago
That’s what you get when you sign contracts in airline reward miles
[−] nicce 45d ago
Are they going to pay back if the subscription was paid but the token limit was less than advertised? Or is there some fine print somewhere preventing you from just suing, or pulling the money back through your credit card?
[−] skywhopper 45d ago
Everyone on my team has been running into this, including the super users on the Max plan and the skeptics who only use it every few days. The quota is going way faster than it did before, sometimes a single prompt will eat up a third or more of the session quota.
[−] foxyv 45d ago
I suspect that Claude had a bug that undercounted tokens and they fixed it.
[−] tjoff 45d ago
Working as intended? They openly state that how quickly your limit is reached depends on many factors (that you don't know) as well as current load on their systems.

Could just be that usage has gone up.

[−] dinakernel 45d ago
This turned out to be a bug. https://x.com/om_patel5/status/2038754906715066444?s=20

One reddit user reverse engineered the binary and found that it was a cache invalidation issue.

They are doing some hidden string replacement if the claude code conversation talks about billing or tokens. Looks like that invalidates the cache at that point.

If that string appears anywhere in the conversation history, I think the starting text is replaced and your entire cache rebuilds from scratch.

So, nothing devious, just a bug.
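A rough sketch of why invalidating a cached prefix burns quota so fast. The rates below are made-up relative numbers purely for illustration, not Anthropic's actual pricing; the point is only that re-reading cached history is much cheaper than re-billing it all at the full input rate:

```python
# Hypothetical illustration: cost of one conversation turn, with and
# without a valid prompt cache. Rates are invented relative units.
CACHE_READ_COST = 0.1   # per cached input token (cheap re-read)
FRESH_COST = 1.0        # per uncached input token (full price)

def turn_cost(history_tokens: int, new_tokens: int, cache_valid: bool) -> float:
    """If the cache is valid, only the new prompt is billed at full rate;
    if it was invalidated, the entire history is re-billed from scratch."""
    if cache_valid:
        return history_tokens * CACHE_READ_COST + new_tokens * FRESH_COST
    return (history_tokens + new_tokens) * FRESH_COST

# A 200k-token conversation plus a 1k-token prompt:
print(turn_cost(200_000, 1_000, cache_valid=True))   # 21000.0
print(turn_cost(200_000, 1_000, cache_valid=False))  # 201000.0, roughly 10x
```

Under these toy numbers, one string substitution that breaks the cached prefix makes the same prompt cost about ten times as much, which lines up with people reporting single prompts eating big chunks of their session quota.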

[−] p2hari 45d ago
I cancelled my Pro plan last month. I was using Claude as my daily driver. In fact I had the API plan too and topped it up with $20 more, so it was around $40 each month. It has been like this since December last year. Sessions used to last a couple of hours, from deep boilerplate, DB queries and the like, through to architecture discussions and tool selection. Over the last two months it just runs out: one prompt, a few discussions of why this and not that, and it's done.
[−] aliljet 45d ago
There's a weird 'token anxiety' you get on these platforms. And you basically don't know how much of this 'limit' you may consume at any time. And you actually don't even know what the 'limit' is or how it's calculated. So far, people have just assumed Anthropic will do the kind thing and give you more than you could ever use...
[−] elephanlemon 45d ago
Yesterday (pro plan) I ran one small conversation in which Claude did one set of three web searches, a very small conversation with no web search, and I added a single prompt to an existing long conversation. I was shocked to see after the last prompt that I had somehow hit my limit until 5:00pm. This account is not connected to an IDE or Code, super confusing.
[−] midnightdiesel 45d ago
It seems like Anthropic is constantly changing the rules and pulling out rugs, and always entirely by surprise. I’m not sure if they’re incompetent or just careless, but I stopped paying them because of this a while ago, and my days are much more interesting and enjoyable using my own brain instead.
[−] p1necone 45d ago
I burn through the entire 5 hour limit in one or two "implement the feature outlined in this doc" requests with claude pro in a not even huge codebase (low tens of thousands of loc). If there were any reasonable alternatives I wouldn't even consider using it, but sonnet 4.6 (and presumably opus 4.6 - I don't use it as sonnet is faster and more than good enough) is the only model I've used that actually makes good decisions in complex codebases - anything else just gets stuck in the weeds and produces either non working code or tech debt (after churning for a long time).

I have seen more than one comment on this thread mentioning kimi though - I'll have to test it out.

qwen3-coder-next has been surprisingly capable as a local model too - needs to be used to make small changes where you know exactly what the final code should look like rather than implementing whole features, but it is free (except for the power bill).

[−] 0xbadcafebee 45d ago
I've found a lot of people are almost belligerently pro-Claude. They refuse to consider other providers or agents, and won't consider using any model other than the latest Opus. The most common reasons I hear are 1) they don't want to use anything other than the greatest model, afraid that anything else would waste their time, 2) they believe their experience is that it's far better than anything else.

Even if you show them benchmarks that show another model equally as good if not better, they refuse to use it. My suspicion is they've convinced themselves that Opus must be the best, because of reputation and price. They might've used a different model and didn't have a good experience, making them double down.

I hope a research institution will perform an experiment. My hypothesis is that if you swapped out a couple similar state-of-the-art models, even changing the "class" of model (Sonnet <-> Opus, GPT 5.4 <-> Sonnet), the user won't be able to tell which is which. This would show that the experience is subjective, and that bias is informing their decision, rather than rationality.

It's like wine tasting experiments. People rate a $100 bottle of wine higher than a $10 bottle. But if they actually taste the same, you should be buying the $10 bottle. But people don't, because they believe the $100 bottle is better. In the AI case, the problem is people won't stop buying the expensive bottle, because they've convinced themselves they must use the more expensive bottle.
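The experiment wouldn't need much machinery. A minimal sketch of a blind pairwise trial (the function names and the `judge` callback are hypothetical, not any existing benchmark harness):

```python
import random

def blind_preference(responses_a, responses_b, judge, seed=0):
    """Show each pair of responses in random order with model labels
    hidden; return the fraction of trials in which model A's response
    was preferred. `judge(first, second)` returns 0 or 1, the index of
    the preferred response."""
    rng = random.Random(seed)
    wins_a = 0
    for a, b in zip(responses_a, responses_b):
        pair = [("A", a), ("B", b)]
        rng.shuffle(pair)  # hide which model produced which response
        label, _ = pair[judge(pair[0][1], pair[1][1])]
        if label == "A":
            wins_a += 1
    return wins_a / len(responses_a)
```

If users genuinely can't tell two models apart, the preference rate should hover around 0.5 over enough trials; a consistent deviation would be evidence the quality difference is real rather than reputation-driven.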

[−] robviren 45d ago
I find Claude Code to be a token hog. No matter how confidently the papers say context rot is not an issue, I find curating context to be highly important to output quality. Manually managing this in the Claude web UI has helped my use cases more than freely tossing Claude Code at them. Likely I am using both "wrong", but the way I use it is easier for me to reason about and minimizes context rot.
[−] 1970-01-01 45d ago
This has been verified as a bug. Naturally, people should see some refunds or discounts, but I expect there won't be anything for you unless you make a stink.

https://old.reddit.com/r/ClaudeCode/comments/1s7zg7h/investi...

[−] ZeroCool2u 45d ago
I'm finishing my annual paid Pro Gemini plan, so I'm on the free plan for Claude and I asked one (1) single question, which admittedly was about a research plan, using the Sonnet 4.6 Extended thinking model and instantly hit my limit until 2 PM (it was around 8 or 9 AM).

Just a shockingly constrained service tier right now.

[−] bottomlessmug 34d ago
Definitely hitting limits insanely fast. Not sure if it's the same problem for everyone, or if they're cranking up usage for low users to subsidize others. I do bursts of work; a month ago I would write a lot of code every window and then pause for the reset, using a mix of chat and CLI. Recently I ran an Excel work session with code and burnt a lot. I asked it for a view on how to optimise the Excel work to burn less, got two paragraphs of answer burning 9% of the session, then copy-pasted those two paragraphs into chat (since planning there usually costs me less) and asked it to help me figure out the best way to set this up - 11% burnt for a couple of bullet-point options. It has never happened so fast.

This is a massive shift from my previous experience.

[−] kneel 45d ago
I asked it to complete ONE task:

You've hit your limit · resets 2am (America/Los_Angeles)

I waited until the next day to ask it to do it again, and then:

You've hit your limit · resets 1pm (America/Los_Angeles)

At which point I just gave up

[−] reenorap 45d ago
The only way AI will be profitable to companies like Anthropic or OpenAI is to make the cost $1000-2000/month or more for coding. Every programmer will be forced to pay for it because it's only a fraction of their salary (in the US anyway) and it's the only way the programmer will be competitive. Whether the company pays for it, or they pay for it themselves, it will need to be paid.

There's no other way these companies can compete against the likes of Google and Facebook unless they sell themselves to those companies. With AWS and GCP spending hundreds of billions of dollars per year, there's no way Anthropic or OpenAI can keep competing unless they make an absurd amount of money and throw it at resources like their own datacenters, and they can't do that at $20/month.

[−] npilk 45d ago
It seems pretty clear there is some sort of bug that only some people are experiencing (or, very cynically, perhaps an A/B test). My usage hasn't seemed to change much in the past few days, but then I see reports where people are hitting limits after one or two prompts. I doubt that could be user error or new limits.

Anthropic has said they are investigating. https://www.reddit.com/r/ClaudeAI/comments/1s7zgj0/investiga...

[−] Kim_Bruning 43d ago
Ok, I have part of the answer at least.

Claude recently upgraded Opus 4.6 to a 1M-token context. The cache normally invalidates after 5 minutes.

If you come back or --continue after a break (or 5 minutes), that's a MASSIVE hit to your session limit: 250,000 tokens at Max x5 will ding you 10% of your session for "Hi, I'm back".

So say you don't typically do /compact very often. And say you're not very chatty and "do the right thing" by only asking a question once in a while? You'll burn through context like crazy.

Meanwhile if you have ADHD and anthropomorphize the bleep out of your claude and chat with them all day long? Hardly a dent!

This trick seems to work for now (ymmv)

Tell your system to

   CronCreate
      cron: "*/4 * * * *"
      prompt: "heartbeat — no action needed"
And turn it back off at end of day.

I'm sure Anthropic will be thrilled by this, but I don't have a better solve at this time.

Context management is a thing. Unfortunately you're not allowed to use any tool other than claude code with the Anthropic Subscription, so I guess this is the solve they asked for. Allowing people to write their own tools with superior context management would seem to be a no-brainer to me, but what do I know?
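If your setup can run the CLI headless, a plain crontab does roughly the same trick. This is a sketch, assuming the `claude -p` (print mode) and `--continue` flags behave as documented for your version; remember to remove the entry at the end of the day:

```shell
# Hypothetical crontab entry: ping the most recent session every
# 4 minutes so the 5-minute prompt-cache TTL never lapses between
# real prompts. Remove when done for the day.
*/4 * * * * claude --continue -p "heartbeat — no action needed" >/dev/null 2>&1
```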

[−] techgnosis 45d ago
* Hardware will manage models more efficiently

* Models will manage tokens more efficiently

* Agents will manage models more efficiently

* Users will manage agents more efficiently

Why are we acting like technology is on pause?

[−] garrickvanburen 45d ago
Considering:

- Anthropic decides how much a token is worth.

- Users have no visibility into, or control over, how many tokens a given response will burn.

This is the only expected answer. https://forstarters.substack.com/p/for-starters-59-on-credit...

[−] giancarlostoro 45d ago
I'm guessing their newer models are taking way more compute than they can afford to give away. The biggest challenge of AI will eventually be how to bring down the compute a powerful model takes. I hope Claude puts more emphasis on making Haiku and Sonnet better; when I use them via JetBrains AI, it feels like only Opus is good enough, for whatever odd reason.
[−] nprateem 45d ago
I literally ran out of tokens on the antigravity top plan after 4 new questions the other day (opus). Total scam. Not impressed.
[−] stavros 45d ago
Anthropic went about this in a really dishonest way. They had increased demand, fine, but their response was to ban third-party clients (clients they were fine with before), and to semi-quietly reduce limits while keeping the price the same.

Unilaterally changing the deal to give customers less for the same price should not be legal, but companies have slowly boiled the frog in such a way that now we just go "welp, it's corporations, what can you do", and forget that we actually used to have some semblance of justice in the olden days.

[−] canada_dry 45d ago
I hit my limit on the project I've been working on (after I let "MAX" run out and moved to "PRO") after about only 2 hours!

TIP (YMMV): I've found that moving the current code base into a new 'project' after a dozen or so turns helps as I suspect the regurgitation of the old conversations chews up tokens.

[−] rglover 45d ago
If you haven't tried it yet, I'd recommend Cline as an alternative (with full support for Anthropic API). Tracks the current token spend on chats so you know when to do a /newchat. Really nice way to budget token spend on a task-by-task basis and your flow isn't interrupted by limits.
[−] Saline9515 45d ago
The way Anthropic prices its services is honestly dubious at best. You have no way to know what the real limits are, nor to verify what was actually consumed. For most people it's ok because it's likely heavily subsidized, however this won't last forever...
[−] edbern 45d ago
Yesterday I asked Claude to write up a simple plan adding some very basic features to a project I'm working on, and it took 20% of the 5-hour Pro plan limit. Meanwhile Codex somehow seems infinite. Is OpenAI just burning through way more cash, or are they more efficient?
[−] bjconlan 44d ago
While I haven't read all the posts here, I was wondering if anyone else noticed ~10% usage before their most recent week's usage even started? (Specifically over 2026-03-27/28.) I was seeing weird service outages over that time too. I suspect they're not being 100% truthful about how they record usage (it feels like they had an agent run a backfill approximation), and they blur it with weekend rates etc.

Anyway, I don't have the knowledge to audit this (Claude Pro) to confirm what feels like onboard-at-any-cost business behavior.

Is anyone currently auditing through openrouter/litellm and seeing any poor correlation to the session/weekly limit?

[−] pagecalm 45d ago
Hit this myself recently, along with a bunch of overloaded errors. I think it's growing pains for where we are with AI right now.

As the tooling matures I think we'll see better support for mixing models — local and cloud, picking the right one for the task. Run the cheap stuff locally, use the expensive cloud models only when you actually need them. That would go a long way toward managing costs.

There's also the dependency risk people aren't talking about enough. These providers can change pricing whenever they want. A tool you've built your entire workflow around can become inaccessible overnight just because the economics shifted. It's the vendor lock-in problem all over again but with less predictability.
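The model mixing could start as something as simple as a routing shim in front of both backends. A toy sketch (the threshold and the `[frontier]` escape hatch are invented for illustration, not any real tool's behavior):

```python
def pick_backend(prompt: str, max_local_chars: int = 2000) -> str:
    """Route short, routine prompts to a local model and escalate long
    or explicitly flagged ones to a paid cloud model. Hypothetical
    heuristic: prompt length as a cheap proxy for task complexity."""
    needs_cloud = len(prompt) > max_local_chars or "[frontier]" in prompt
    return "cloud" if needs_cloud else "local"
```

In practice you'd want a smarter signal than length (task type, repo size, past failure rate), but even a crude gate like this caps how often the expensive backend gets hit.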

[−] delphic-frog 45d ago
The token usage differs day to day - that's the most frustrating part. You can't effectively plan a development session if you aren't sure how far you'll likely get into a feature.
[−] bicepjai 43d ago
After spending some time on how Claude Code's (leaked) tools were written, it makes sense why we constantly hit limits. For all the amazing LLM capabilities, CC does not have an edit tool and always reads before writing (this makes sense), but I expected some surgical-precision magic software. It seems like other agents, like open code and pi, are way better than CC in token usage.
[−] therobots927 45d ago
There is both the opportunity for and an incentive towards these companies actively deceiving users, both by hiding the true amount of subsidies behind AI output and by shuffling users between high and low quality models in order to minimize said subsidies. It’s difficult for me to understand why most engineers here don’t seem to get this.

If you’re not listening to Ed Zitron you’d better start if you don’t want to get whiplash in the coming months.

[−] lukewarm707 45d ago
Please tell me if I'm crazy.

I just refuse to use OpenAI/Google/Anthropic subscriptions; I only use open-source models with ZDR tokens.

- I like privacy in my work, and I share when I wish. Somehow we accepted that our prompts and work may be read and moderated by employees. Would you accept people moderating what you write in Excel, Google Docs, or Apple Pages?

- I want a consistent tool, not something that is quantised one day, slow the next, running a different harness the day after, or stops randomly.

- Unless I am missing something, the closed-source models are too slow for me to watch what they are doing. I feel comfortable monitoring something at about 200-300 tps on GLM 5. Above that it might even be too fast!

[−] bradlannon 44d ago
I was literally planning a new feature using superpowers. I typed "continue" after it hit its limit. No joke: about a minute later I looked at the token % left and it said I was already at 20%. I literally typed "continue" and then took one minute to look at the usage. Something is seriously broken!
[−] sibtain1997 45d ago
Faced this too. Tried https://github.com/rtk-ai/rtk to compress cli output but some commands started failing and the savings were minimal. Ended up just being more deliberate about context size instead of adding more tooling on top
[−] _JoRo 45d ago
I've used Claude Max for a while now, and I usually only get to around 50% usage in a 4-5hr block (using medium effort). Yesterday, I switched from high to medium effort using the /model command, but afterwards it still felt like I was burning through tokens at the high-effort rate.
[−] shafyy 45d ago
What is the best way to get started with open-weight models? And are they a good alternative to Claude Code?
[−] rajadroit2026 44d ago
For the last couple of days I've been hitting the limit much earlier. Today Claude had hardly created one markdown file before complaining about hitting the limit. Any idea when this will return to normal? It has been causing a lot of delay.
[−] torginus 45d ago
I dunno; CC might give away tokens more cheaply, but when I used Opus standalone in Cursor, I definitely got way more mileage out of a token.

Considering how much progress I made versus how much I paid, I couldn't make a scientific assessment, but it felt pretty close.

[−] nitekode 45d ago
This could also be because of the recently introduced 1 million token buffer. I also saw my tokens drain away quickly; then I noticed I was pushing 750k tokens through for every prompt :) Sometimes it's hard to get into the habit of clearing.
[−] Asmod4n 45d ago
When asking it to write an HTTP library that can decode/parse/encode all three versions of the protocol, the daily usage limit gets hit with one sentence. On the Pro plan. Even when you hand it a library that does HPACK/Huffman.
[−] ryan42 45d ago
Claude automatically enabled "extra usage" on my Pro account (I had it disabled), and the total got to $49 extra before I noticed. I sent an email asking wtf, but I don't expect much.
[−] zackify 45d ago
After using it all week on pro plan it worked fine for me. Hit limits a couple times.

But if I was doing deep coding on pro plan it would have sucked.

You can't expect to use massive context windows for $20

[−] HDBaseT 45d ago
I asked Claude on the $20 plan to rewrite the Linux Kernel and ffmpeg in Rust (using Opus 4.6, Ultra Thinking) with high verbosity and it ran out of usage!
[−] mszczodrak 45d ago
I've been hitting the API limit errors over Claude CLI, yet the total usage was 0% on the claude.ai website. Changing the model fixed the problem.
[−] anon7000 45d ago
I think I ran into this yesterday, with Claude Code taking FOREVER on a lot of tasks. But using Claude within Cursor seems way faster
[−] paulbjensen 45d ago
I have found that:

- If I ask Claude to go and build a product idea out for me from scratch, it can get quite far, but then I will hit quota limits on the pro plan ($20pm).

- I have not drunk the Kool-aid and tried to indulge in ClaudeMaxxing (Max plan at $200pm). I need to sleep and touch grass from time to time.

- I don't bother with a Claude.md in my projects. I just raw-dog context.

- If I have a big codebase, and I'm very clear about what code changes I want to make Claude do, I can easily get a lot of changes made without getting near my quota. It's like Mr Miyagi making precision edits to that Bonsai Tree in Karate Kid.

My last bit of advice - use the tool, but don't let the tool use you.

[−] aperture_hq 45d ago
There are no transparent metrics on token usage; they just compare their plans against their own plans.
[−] sudo_and_pray 45d ago
I gave claude code a try at home ($20 sub), since we use it at work without any limits and I wanted to see how I can use it on some of my projects.

It was a big disappointment: it burned through tokens so fast that I hit the first limit after 30 minutes, while it was still gathering info on my project and doing web searches.

My experience was that when I wanted to use it, maybe 2-3 days per week, the Pro sub was not enough. On some days I did not use it at all. The daily and weekly token limits were really restrictive.