Anthropic downgraded cache TTL on March 6th

[−] sunaurus 33d ago

Has anybody else noticed a pretty significant shift in sentiment when discussing Claude/Codex with other engineers since even just a few months ago? Specifically because of the secret/hidden nature of these changes.

I keep getting the sense that people feel like they have no idea if they are getting the product that they originally paid for, or something much weaker, and this sentiment seems to be constantly spreading. Like when I hear Anthropic mentioned in the past few weeks, it's almost always in some negative context.

[−] foofloobar 33d ago

Claude Code and the subscription are now less useful than a few months ago. Claude Code and the service seem to pick up more and more issues as time goes by: more bugs, fast quota drain, reduced quota, poor model performance, cache invalidation problems, MCP related bugs, potential model quantization and other problems.

Claude Code was able to implement something in one shot. It was decent for a proof of concept initial implementation. It's barely able to do work now with full specs and detailed plans.

ChatGPT is also being watered down.

It seems obvious that Anthropic and OpenAI aren't the solution to any problem.

[−] cassianoleal 33d ago

The title should be changed. It makes it look like they upped the TTL from 1 h to 5 months.

The SI symbol for minutes is "min", not "M".

A compromise would be to use the OP notation "m".

[−] albert_e 33d ago

So a side effect of this is -- even at 1 hour caching -- ...

If you run out of session quota too quickly and need to wait more than an hour to resume your work ... you are paying even more penalty just to resume your work -- a penalty you wouldnt have needed if session quota was not so restrictive in first place, and which in turn causes you to burn through next session quota even faster.

Seems like a vicious cycle that made the UX very poor. I remember Claude Code with Pro became virtually unuseable in middle of March with session quota expiring within first hour or less for me -- which was wildly different experience from early March.

[−] disillusioned 33d ago

It's also routinely failing the car wash question across all models now, which wasn't the case a month ago. :-/

Seeing some things about how the effort selector isn't working as intended necessarily and the model is regressing in other ways: over-emphasizing how "difficult" a problem is to solve and choosing to avoid it because of the "time" it would take, but quoted in human effort, or suggesting the "easier" path forward even if it's a hack or kludge-filled solution.

[−] benced 33d ago

Anthropic responded: https://github.com/anthropics/claude-code/issues/46829#issue...

[−] davidkuennen 33d ago

On slightly off topic note: Codex is absolutely fantastic right now. I'm constantly in awe since switching from Claude a week ago.

[−] layer8 33d ago

From the recent-ish Dwarkesh podcast, Anthropic seems to be wary about buying/building too much compute [0]. That probably means that they have to attempt to minimize compute usage when there is a surge in demand. Following the argument in the podcast, throwing more money after them, as some in this thread are suggesting, won’t solve the issue, at least not in the short term.

[0] https://www.dwarkesh.com/i/187852154/004620-if-agi-is-immine...

[−] hirako2000 33d ago

There is a chef, he opens a restaurant. Delicious food.

It costs him more in ingredients alone than he charges. He even offers some pseudo unlimited buffet, combo sets, and happy hours.

He announced a new restaurant, apparently it will be even better, so good he's a bit worried. He makes sure to share his worries while he picks a few select enterprise for business parties and the likes.

In the meantime he cracks down on free buffet goers who happen to eat too much, and downgrades all ingredients without notice to finally hope to make a profit.

[−] Tarcroi 33d ago

This coincides with Anthropic's peak-hour announcement (March 26th). Could the throttling be partly a response to infrastructure load that was itself inflated by the TTL regression?

[−] perks_12 33d ago

Just give us the option to get the quality back, Anthropic. I get that even a $200 subscription is not possible eventually, but give us the option to sub the $1000 tier or tell us to use the API tier, but give us some consistency.

[−] bsaul 33d ago

could it be that anthropic is experiencing a massive shortage of compute capacity, and is desperately trying to find means to overcome it ?

All the news i hear about this company for the past weeks made it sound like they're really desperate.

[−] hattimaTim 33d ago

Classic scammer tactics: first, lure users in by promising a huge deal, then scam the hell out of them.

[−] throwaway2027 33d ago

I also noticed this, just resuming something eats up your entire session. The past two weeks also felt like a substantial downgrade and made me regret renewing my subscription, it sucks because I wish I kept my Codex subscription instead and renewed that.

[−] zoogeny 32d ago

As an aside, I built a tool to manage my own chat interface over the provider APIs. I added caching because the savings are quite significant and I have a little countdown timer that shows me how much time remaining until the cache is expired.

However, for the basic turn-based conversation the cache (at 5 minutes) is almost always insufficient. By the time I read the LLM response, consider my next question, write it out, etc. I frequently miss the cache.

I imagine it is much more useful if you have a tool that has a common prefix (like a system instruction, tool specs or common set of context across many users).

If you can get it to work frequently enough the savings are quite worth it.

[−] poly2it 33d ago

One of the largest AI companies on Earth cannot figure out an algorithm for when not to drop caches in long-running sessions?

[−] foobar10000 33d ago

So, this especially bites if your validation step (let’s say integration tests) take 1hr plus. The harness is just waiting, prefix caching should happily resume things with just a minor new prefill chunk of output from the harness, and bam - completely new prefill.

[−] par 33d ago

Claude code has gone down hill in a really bad way. It is often far too quick to make significant changes, and requires much higher level of hand-holding and explanation than I am used to. r/claudecode on reddit shows a litany of complaints!

[−] willworktill4pm 33d ago

This Friday CC wrote wall off gibberish text for me. No reason, happened twice with different gibberish text

https://ibb.co/4wcVQG5k

[−] zeckalpha 33d ago

I find similar happening with Gemini Pro. Despite paying for Pro, it regularly locks me out, without visibility into consumption. Nothing on the plan comparison page indicates limits. https://one.google.com/about/plans

Edit: I may have conflated these two threads. https://news.ycombinator.com/item?id=47739260

[−] throwaway2027 33d ago

It's absolutely ridiculous how stupid Claude is now. I sometimes notice it and last year too but it feels like it's just last year before December model.

[−] the_mitsuhiko 33d ago

Since I (until Anthropic decided to remove access for subs) used Anthropic models extensively with pi I explored the two caching options and the much higher cost of 1h caches is almost never a good tradeoff.

Since the caching really primarily is something they can be judged at scale from across many users I can only assume that Anthropic looked at their infra load and impact and made a very intentional change.

[−] azuanrb 33d ago

As a Pro user, even though these issues and bugs are “new,” the downgrade has been noticeable since January. I’ve unsubscribed because the Pro plan is no longer usable for me.

It’s only making the news now because it’s affecting Max users as well ($100/$200 plans). I understand the need for change, but having zero communication about it is just wrong.

[−] almog 32d ago

Given how the cache eviction policy is mismatched with the 5h usage window, it might make sense to just stop at say 97% of the session max usage and keep running a script every 4 min and 50 sec that consumes a minimal number of tokens whose entire purpose is to keep the cache. reply

[−] motbus3 33d ago

The TOS basically states you need to deal with whatever they want.

Meanwhile their 'best' competitor just announced they want to provide unreliable mass destruction guidance tools but they don't wanna feel said.

Honestly speaking, we are wrong whenever we do business with this sort of people

[−] simianwords 33d ago

There’s a case for intelligent caching: coarse grained 1h and 5min type TTls are not optimal.

[−] PunchyHamster 33d ago

Well, how entirely expected. The money man comes to collect and they are squeezing for money

[−] pkaye 33d ago

Actually I remember the change being reported in the Reddit /r/claueai chat back around that time frame. I was concerned that it would increase costs but nobody made a fuss so I presumed it was not a big deal.

[−] superxpro12 32d ago

If anyone thinks this situation doesnt end in a massive global rugpull, y'all are asleep at the wheel.

The very instant the AI suppliers lock in a dependency on their product, prices are going through the roof.

[−] jasonjmcghee 32d ago

All the weird stuff happening with anthropic / Claude aside- just talking about this post:

Looking at the table with February and April- I don't get it. What am I missing?

The cost and number of calls look pretty aligned on all rows

[−] ikekkdcjkfke 33d ago

If youre reading this claude, people are willing to pay extra if you want to make more money, just please stop doing this undermining, it devreases the trust of your platform to something that cannot be relied on

[−] espeed 33d ago

Does Anthropic's real time data ingestion effect its model behavior globally? Could a file read by your agent effect the behavior of mine?

[−] c16 33d ago

I’ve definitely noticed in evenings it stops trying as hard to solve the issue and suggests I go find the answer. Never the case in the morning.

[−] srsbzns 32d ago

Gotta use the API directly for cache control

[−] sscaryterry 33d ago

Anthropic is leaving so much evidence around… proving damages and a pattern is becoming trivial

[−] snowstormsun 33d ago

Well, the 10x promised revenue increase must come from somewhere...

[−] lordmoma 33d ago

Claude Code is not performing on par since September 2025, there was already a huge backlash then, and many people just keep cheering for CC every time it made some model upgrade or TUI change, it just feels so unreal.

[−] taffydavid 33d ago

This is the same shit openAI used to do last year, quietly downgrading their offerings while hyping the next big thing. I thought Anthropic were different but it seems they're playing the exact same long con with Mythos.

They can't really revolutionize AI again so they make the product worse and worse and then offer you a "better" one

[−] ares623 33d ago

AGI finding bugs again. Actual Guys/Gals Instead.

[−] yobid20 33d ago

i thought it was always 5 minutes? ive been telling people 5 minutes for months so i dont think this is anything new?

[−] mrdw 33d ago

I noticed another limitation: "An image in the conversation exceeds the dimension limit for many-image requests (2000px). Start a new session with fewer images."

So I can't continue my claude code session I started yesterday.

[−] computerex 33d ago

Good job anthropic. You had a clear lead with all devs singing the praises of Opus. Way to lose all that by Enshittifying the experience.

[−] idrdex 33d ago

[dead]

[−] a7om_com 32d ago

[dead]

[−] AlexSalikov 33d ago

[dead]

[−] EthanFrostHI 33d ago

[flagged]

[−] cameolkc 32d ago

[dead]

[−] GetBurnd 33d ago

[dead]

Anthropic downgraded cache TTL on March 6th (github.com)

420 comments