No amount of valuation can fix global supply issues for GPUs for inference unfortunately.
I suspect they're highly oversubscribed, thus the reason why we're seeing them do other things to cut down on inference cost (ie changing their default thinking length).
Wouldn't that be good? I remember back in the day you could only get Gmail thru an invite, it was an awesome strategy. "Currently closed for applications" creates FOMO. They'd just need to actually get the GPUs in relatively short supply. They could do it in bursts though, right? "Now accepting applications for a short time."
I'm not an internet marketer but that sounds like a win win to me. People feel special, they get extra hype, and the service isn't broken.
Are you sure it was fake scarcity for Gmail? IIRC they did it because they were worried about systems falling over if it grew too fast, and discovered the marketing benefits as a side effect.
I didn't. Anthropic and others followed the concept of scaling models up first and worrying about efficiency and availability later. Sam likely didn't invent the idea, but he talked about it.
maybe, but the response to GPU shortages being increased error rates is the concern imo. they could implement queuing or delayed response times. it's been long enough that they've had plenty of time to implement things like this, at least on their web-ui where they have full control. instead it still just errors with no further information.
i notice that as well. most of the time when i see those it has a retry counter also and i can see it trying and failing multiple requests haha. almost never succeeds in producing a response when i see those though, eventually just errors out completely.
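The retry counter described above is presumably some variant of the standard backoff loop; as a rough illustration (not Anthropic's actual client code), a retry-with-backoff pattern with full jitter looks like this:

```python
import random
import time

def retry_with_backoff(call, max_attempts=5, base=1.0, cap=30.0):
    """Retry `call` on server-side failures with exponential backoff
    plus full jitter, then give up and surface the error."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RuntimeError:  # stand-in for a 5xx/API error
            if attempt == max_attempts - 1:
                raise  # out of retries: the user finally sees the 500
            # sleep somewhere in [0, min(cap, base * 2^attempt)]
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

The jitter matters at this scale: without it, thousands of clients that failed at the same moment all retry at the same moment, which is exactly what an overloaded backend doesn't need.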
That implies that either the auth is too heavy (possible, ish) or their systems don't degrade gracefully enough and many different types of failures propagate up and out all the way to their outermost layer, ie. auth (more plausible).
Disclosure: I have scars from a distributed system where errors propagated outwards and took down auth...
> thus the reason why we're seeing them do other things to cut down on inference cost (ie changing their default thinking length).
The dynamic thinking and response length is funny enough the best upgrade I've experienced with the service for more than a year. I really appreciate that when I say or ask something simple the answer now just comes back as a single sentence without having to manually toggle "concise" mode on and off again.
I'm pretty sure ai-x writes sarcasm and skips the /s for pure fun. Personally, I'm amused and I like what he's doing. Others have done it before him though, it's not a new trick.
I literally just came to HN to ask if I was alone with the accurséd "API Error: 500 {"type":"error","error":{"type":"api_error","message":"Internal server error"},"request_id":"…"}" greeting me and telling me to get back to using my brain!
500-series errors are server-side, 400 series are client side.
A 500 error is almost never "just you".
( 404 is a client error, because it's the client requesting a file that does not exist, a problem with the client, not the server, who is _obviously_ blameless in the file not existing. )
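The split described above is just the HTTP status-code classes; a toy "whose fault is it" classifier, with the usual wrinkle that 429 is nominally client-side but really the server telling you to back off:

```python
def blame(status: int) -> str:
    """Rough fault assignment per HTTP status class.
    429 is technically a client error, but it usually means the
    server is shedding load, so it's the one 4xx worth retrying."""
    if 500 <= status <= 599:
        return "server"          # 500, 502, 503... not "just you"
    if status == 429:
        return "server (rate limit); retry with backoff"
    if 400 <= status <= 499:
        return "client"          # 404: you asked for something that isn't there
    return "no error"
```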
I know you added the defensive "almost" but if I had a dollar each time I saw a 500 due to the session cookies being sent by the client that made the backend explode - for whatever root cause - well, I would have a fatter wallet.
Indeed, and also there's a special circle of hell reserved for anyone who dares change the interface on a public API, and forgets about client caching leading to invalid requests but only for one or two confused users in particular.
Bonus points if due to the way that invalid requests are rejected, they are filtered out as invalid traffic and don't even show up as a spike in the application error logs.
I know that in principle this is true. However, I have seen claude shadow-throttle my ipv4 address (I am behind CGNAT), in line with their "VPN" policy -- so I do not trust it, frankly.
This is how I learn that they have a "VPN" policy. Thinking of it maybe it makes sense, that is if it's what I think it is, but seems scummy nonetheless.
Yep, daily haha. Well at least this time they aren't just silently reducing thinking on the server side, which ended up making a mess in my codebase when they did that last time. I'd rather a 500 than a silent rug-pull.
Funny that I just saw this after getting "Console temporarily unavailable". I'm currently at the stage where: 1) I think Claude Code is very impressive 2) I think pretty much everything else about them is terrible.
* Support really poor, raised a ticket last week and have heard nothing back at all
* Separation of claude.ai accounts and console accounts is super confusing
* Couldn't log into the platform since I had an old org in the process of deletion even though I was invited to a new one (had to wait 7 days!)
* Payments for more API credits were broken for about a week
* Claude chat has really gone to s*t unless it always was. Just getting back terrible answers to simple questions.
* The desktop app is a web app pretending to be a desktop app that doesn't always know it is a desktop app so you get things like, "this will only work in the desktop app". Yes I know, this is the desktop app! "Oh sorry about that but you need to use the desktop app".
* MCP integration and debugging is dreadful, just a combination of generic "an error occurred" and sometimes nothing at all
* MCP only supports OAuth for shared connectors, and auth keys don't work even with "local" servers, which aren't necessarily local (only the config is).
Anthropic has been avoiding the hard thing, but they just need to do SOME kind of pricing thing here to shift demand. I expect there is just no amount of tricks that can handle the few hours at peak load. They need "surge pricing".
This seems like a reasonable surge-pricing approach to me:
1. Implement surge pricing for everyone for the peak 2 hours of the day if possible. 2 hours you can work around, 5 hours is too hard.
2. Give existing customers a one-time credit for surge
3. Make sure the plans just consume credits at an accelerated rate (e.g. if on the Max plan, I just get 1/2 usage during peak hours).
4. Exempt sonnet/haiku from surge (so people can keep using)
5. Make "auto" settings in claude code etc automatically adapt during surge hours, so people don't get surprises by default.
6. For the first 90 days, unofficially waive the first $100 in surge for every user but notify them, to train users about the surge and get them used to it without having them actually pay.
7. (I don't think they would do this, but it would help) Allow users to fall back to using something like GLM 5.1 or Gemma 4 automatically in outages, with a partnership to handle it. It's not ideal, but I would prefer it in "partner mode" than not. IMO they can charge like 10% on top of the partner fees if this is used at other times, but during outages or surge, partner mode is free. It would be 100% managed by Anthropic, so users don't need to set things up and can just use the Anthropic harness.
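Point 3 above is easy to state concretely; a sketch with made-up numbers (the 2x multiplier and the 14:00-16:00 UTC window are illustrative, not anything Anthropic has announced):

```python
SURGE_HOURS = range(14, 16)   # hypothetical peak window, UTC
SURGE_MULTIPLIER = 2.0        # plan credits burn twice as fast at peak

def credits_consumed(tokens: int, hour_utc: int) -> float:
    """Credits deducted for a request: flat rate off-peak,
    accelerated during the surge window."""
    rate = SURGE_MULTIPLIER if hour_utc in SURGE_HOURS else 1.0
    return tokens * rate
```

The appeal of this over hard caps is that nobody gets cut off outright; heavy users just see their quota drain faster at peak, which is the demand-shifting signal.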
Has anyone found good techniques to get a session out of Claude Code, so that I can point another tool at it and pick up there?
This always seems to happen at the worst possible time, after having spent an hour getting deep into something – half finished edits across files, subagents running, etc.
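One approach, assuming your Claude Code version still stores session transcripts as JSONL files (commonly under `~/.claude/projects/`; check your local layout, since the format is undocumented and can change): parse the per-turn records and flatten the conversation into plain text that another tool can ingest.

```python
import json
from pathlib import Path

def export_transcript(jsonl_path: str) -> str:
    """Flatten a Claude Code-style JSONL transcript into plain text.
    Assumes each line is a JSON object whose `message` dict holds
    `role` and `content` (a string, or a list of {"type": "text"} parts).
    This schema is an assumption; inspect a real file first."""
    lines = []
    for raw in Path(jsonl_path).read_text().splitlines():
        rec = json.loads(raw)
        msg = rec.get("message")
        if not msg:
            continue  # skip meta records (summaries, tool bookkeeping)
        content = msg.get("content", "")
        if isinstance(content, list):  # content blocks: keep only text parts
            content = "\n".join(p.get("text", "") for p in content
                                if p.get("type") == "text")
        lines.append(f"{msg.get('role', '?')}: {content}")
    return "\n".join(lines)
```

Pasting the output into another agent's context won't restore running subagents or half-finished edits, but it at least preserves the reasoning trail you spent an hour building.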
- Anthropic introduced stringent limits at peak hours. By "introduced" I mean announced it on a random dev's Xitter account
- Users suddenly started burning through all of their tokens even on trivial tasks. Anthropic never truly acknowledged it, their random devs posted "we're working on it".
- One of the workarounds was to somewhat quietly reduce default reasoning to medium
- OpenClaw and "usage through other tools" banned
- Announce "redesigned Claude Code Desktop App that lets you run many parallel sessions"
- Availability is still circling down the drain
- Dario Amodei is in continuous "trust us we have AGI coding is solved we don't need programmers just give us more money" mode now
It seems like Claude has taken Github's place in terms of developer reaction to it being unavailable. It's like everyone forgot how they did things 18 months ago.
But I thought coding was solved? I guess having a single 9 of availability is something we need true AGI for, we should probably give OpenAI and Anthropic another gazillion dollars to burn through to figure this out!
I went back to the $20 plan and a single prompt maxed out my quota for the five hour window within 15 minutes. I used to be able to vibe code for over an hour before. This is really annoying.
OpenAI is much better than Anthropic at avoiding outages, but almost all of its products except Codex and the pro model are unimpressive; Anthropic has the opposite situation.
Good advertisement to shift the tide back to OpenAI, which just works. Honestly, Codex with GPT 5.4 is _surprisingly good_ currently: not nerfed, not forgetting half the tasks along the way, so far. Opus already got worse than Sonnet in recent weeks, beyond just the crazy token costs; now reliability goes to shit too, and Anthropic seems okay with it. Meanwhile the Codex desktop app is delightful; stuff seems to "just work" elegantly, with good quality.
Cross your fingers they're about to drop 4.7. 4.6 came out with a bang, now it seems all the compute bottlenecks just lead to customer frustration as they get closer to releasing next model. Balancing the books over there must be a nightmare, "Well we can piss off every single customer for a week, but we'll be able to release the next model 1 week faster"
Yea, it's peak time. They don't have enough compute. Why do you think they are banning external subscription use? They sell subscriptions. They don't need people to use CC. That doesn't matter. And yet they won't have people using their service outside of CC. Something is fishy.
Why would anyone assume a new model is dropping when their status page is showing elevated errors? Are they that sloppy that they just let their status systems report failures when they are the ones deploying new infrastructure / models / etc?
Basically pushed the button while staying up late finishing something; I didn't really factor a Claude outage into the middle of it. Here's to red eyes while I use my clumsy fingers and brain to complete the task the old-fashioned way.
While your developers are twiddling their thumbs waiting for Claude to come back online, your competitor is using alternatives to get work done right now and advancing on their go to market timeline.
I just had to upgrade my plan because I ran out of tokens because medium effort had dementia and things only worked on high. Good to know I'm getting my money's worth...
A few hours ago I noticed a considerable decline in code quality. It seemed the model got downgraded, so I switched to Codex. Anybody else notice this? It goes from deep reasoning and trying to fully grasp architectural changes to trying to solve things on a very ad hoc basis. Maybe that's just my imagination, or maybe that's Anthropic trying to balance the load before being fully overloaded.
Claude Code returning: API Error: 500 {"type":"error","error":{"type":"api_error","message":"Internal server error"},"request_id":"---"}
Over and over again!
In the case of Anthropic, it's fake availability.
Sam Altman explained that the idea is to scale the thing up and see what happens.
He never claimed to offer a solution to the supply problem that would unfold.
Engineer roles dead in 6 months.
> I edit it. I code around it.
You're never gonna guess what software engineers do.
B. Everything is down, even auth.
> A 500 error is almost never "just you".
Bad input should be a 4xx, but if the server can't cope with it, that's still a 5xx.
> Seems to be a very regular occurrence starting around this time of day (14:30 UTC)...
8.30am on the US west coast
You can put those on the health status!
https://mesmer.tools/random/is-it-peak-hours
> claude: "API Error: 500... check status.claude.com"
> status.claude.com: "All Systems Operational"
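Since the status page lags reality, it's easy to keep your own score: track a rolling window of your recent request outcomes and flag degradation yourself. A minimal sketch (the 20% threshold and window size are arbitrary choices):

```python
from collections import deque

class ErrorRateMonitor:
    """Track the last `window` request outcomes and flag degradation
    once the failure rate crosses `threshold`."""
    def __init__(self, window: int = 50, threshold: float = 0.2):
        self.outcomes = deque(maxlen=window)  # True = success, False = failure
        self.threshold = threshold

    def record(self, ok: bool) -> None:
        self.outcomes.append(ok)

    def degraded(self) -> bool:
        if not self.outcomes:
            return False
        failures = self.outcomes.count(False)
        return failures / len(self.outcomes) >= self.threshold
```

Feed it one `record()` per API call and you'll usually know the service is down well before "All Systems Operational" changes.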
Early Twitter showed the fail whale as often as it showed tweets, and yet it was an unstoppable juggernaut that people kept using.
[0] https://news.ycombinator.com/item?id=47753710
Android app is still responding but no-go on claude.ai and I can't login with email
status.claude.com has an update:
Investigating - We are seeing increased errors on Claude.ai, API, and Claude Code Apr 15, 2026 - 14:53 UTC
> time to break out Pro C# textbook
- downdetector comment