For me one of the most interesting aspects is how compaction works. It turns out compaction still preserves the full original pre-compaction conversation in the session jsonl file, and those messages are marked as "not to be sent to the API". This means that even after compaction, if you think something was lost, you can tell CC to "look in the session log files to find details about what we did with XYZ". I knew this before the leak since it can be seen from the session logs. Some more details:
The full conversation is preserved in the JSONL file, and messages
are filtered before being sent to the API.
Key mechanisms:
1. JSONL is append-only — old pre-compaction messages are never deleted. New messages (boundary
marker, summary, attachments) are appended after compaction.
2. Messages have flags controlling API visibility (see the sketch after this list):
- isCompactSummary: true — marks the AI-generated summary message
- isVisibleInTranscriptOnly: true — prevents a message from being sent to the API
- isMeta — another filter for non-API messages
- getMessagesAfterCompactBoundary() returns only post-compaction messages for API calls
3. After compaction, the API sees only:
- The compact boundary marker
- The summary message
- Attachments (file refs, plan, skills)
- Any new messages after compaction
4. Three compaction types exist:
- Full compaction — API summarizes all old messages
- Session memory compaction — uses extracted session memory as summary (cheaper)
- Microcompaction — clears old tool result content when cache is cold (>1h idle)
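A minimal sketch of what that visibility filtering could look like, based on the flags above (the message shape and the helper name here are assumptions, not the actual implementation):
    interface SessionMessage {
      content: string;
      isCompactSummary?: boolean;          // the AI-generated summary of pre-compaction history
      isVisibleInTranscriptOnly?: boolean; // kept in the JSONL, never sent to the API
      isMeta?: boolean;                    // local bookkeeping, also filtered out
      isCompactBoundary?: boolean;         // marker appended at compaction time
    }
    // Everything stays in the append-only JSONL; only this filtered view goes to the API.
    function messagesForApi(all: SessionMessage[]): SessionMessage[] {
      const boundary = all.map(m => m.isCompactBoundary).lastIndexOf(true);
      const afterBoundary = boundary >= 0 ? all.slice(boundary) : all;
      return afterBoundary.filter(m => !m.isVisibleInTranscriptOnly && !m.isMeta);
    }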
I dug into this more. It's disabled by default, and it's a cost/token-usage optimization.
The logic is:
1. Anthropic's API has a server-side prompt cache with a 1-hour TTL
2. When you're actively using a session, each API call reuses the cached prefix — you only pay
for new tokens
3. After 1 hour idle, that cache is guaranteed expired
4. Your next message will re-send and re-process the entire conversation from scratch — every
token, full price
5. So if you have 150K tokens of old Grep/Read/Bash outputs sitting in the conversation, you're
paying to re-ingest all of that even though it's stale context the model probably doesn't need
The microcompact says: "since we're paying full price anyway, let's shrink the bill by clearing
the bulky stuff."
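To make the tradeoff concrete, a rough back-of-envelope (the dollar rates are placeholders, not Anthropic's actual prices; the only real assumption is that cached-input reads are typically billed far cheaper than fresh input):
    const freshInputPerMTok = 3.0;   // $ per million input tokens, uncached (placeholder)
    const cachedReadPerMTok = 0.3;   // $ per million tokens read from a warm prompt cache (placeholder)
    const staleToolOutputTokens = 150_000; // old Grep/Read/Bash results still sitting in context
    const costWarmCache = (staleToolOutputTokens / 1e6) * cachedReadPerMTok; // ~$0.045 per turn
    const costColdCache = (staleToolOutputTokens / 1e6) * freshInputPerMTok; // ~$0.45 on the first cold turn
    // After the 1h TTL expires you pay the cold price to rebuild the cache, which is the
    // bill microcompaction tries to shrink by dropping stale tool output first.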
What's preserved vs lost:
- The tool_use blocks (what tool was called, with what arguments) — kept
- The tool_result content (the actual output) — replaced with [Old tool result content cleared]
- The most recent 5 tool results — kept
So Claude can still see "I ran Grep for foo in src/" but not the 500-line grep output from 2
hours ago.
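A sketch of that clearing pass, assuming a flat array of turns (the placeholder string is the one quoted above; the shapes and names are guesses, not the leaked code):
    type Turn =
      | { type: 'tool_use'; name: string; input: unknown }  // kept: what was called, with what args
      | { type: 'tool_result'; content: string };           // the bulky output, candidate for clearing
    const KEEP_RECENT_RESULTS = 5;
    function microcompact(turns: Turn[]): Turn[] {
      const resultIdxs = turns
        .map((t, i) => (t.type === 'tool_result' ? i : -1))
        .filter(i => i >= 0);
      const keep = new Set(resultIdxs.slice(-KEEP_RECENT_RESULTS)); // the most recent 5 stay intact
      return turns.map((t, i) =>
        t.type === 'tool_result' && !keep.has(i)
          ? { ...t, content: '[Old tool result content cleared]' }
          : t
      );
    }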
Does it affect quality? Yes, somewhat — but the tradeoff is that without it, you're paying
potentially tens of thousands of tokens to re-ingest stale tool outputs that the model already
acted on. And remember, if the conversation is long enough, full compaction would have summarized
those messages anyway.
And critically: this is disabled by default (enabled: false in timeBasedMCConfig.ts:31). It's
behind a GrowthBook feature flag that Anthropic controls server-side. So unless they've flipped
it on for your account, it's not happening to you.
> it's basically a cost optimization masquerading as a feature
Cost optimization in the user's favor.
Remember that every time you send a new message to the LLM, you are actually sending the entire conversation again with that added last message to the LLM.
Remember that LLMs are fixed functions, the only variable is the context input (and temperature, sure).
Naively, this would lead to quadratic consumption of your token quota, which would get ridiculously expensive as conversations stretch into current 100k-1M context windows.
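As a toy illustration of that quadratic growth (fixed turn size, no caching; the numbers are made up):
    const tokensPerTurn = 1_000;
    const turns = 200;
    let totalInputTokens = 0;
    for (let i = 1; i <= turns; i++) {
      totalInputTokens += i * tokensPerTurn; // turn i resends all i turns' worth of context
    }
    // totalInputTokens = 20,100,000 billed input tokens without caching,
    // versus only 200,000 genuinely new tokens across the whole conversation.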
To solve this, AI providers cache the context on the GPU, and only charge you for the delta in the conversation/context. But they're not going to keep that GPU cache warm for you forever, so it'll time out after some inactivity.
So microcompaction-on-idle is there to soften the token-consumption blow after you've stepped away for lunch, your context cache has been flushed by the AI provider, and you basically have to spend tokens to restart your conversation from scratch.
There are now several comments that (incorrectly?) interpret the undercover mode as only hiding internal information. Excerpts from the actual prompt[0]:
NEVER include in commit messages or PR descriptions:
- The phrase "Claude Code" or any mention that you are an AI
- Co-Authored-By lines or any other attribution
BAD (never write these):
- 1-shotted by claude-opus-4-6
- Generated with Claude Code
- Co-Authored-By: Claude Opus 4.6 <…>
This very much sounds like it does what it says on the tin, i.e. stays undercover and pretends to be a human. It's especially worrying that the prompt is explicitly written for contributions to public repositories.
[0]: https://github.com/chatgptprojects/claude-code/blob/642c7f94...
No problem at all in the EU, as the user would either need to review and redact the output or would need to put up a transparency note, as required by law [0]. I am sure that Anthropic with their high ethical standards will educate their users ...
[0] https://ai-act-service-desk.ec.europa.eu/en/ai-act/article-5...
Code may not be, but opening a Merge Request undercover may be unlawful:
> Providers shall ensure that AI systems intended to interact directly with natural persons are designed and developed in such a way that the natural persons concerned are informed that they are interacting with an AI system
Depends if it's a closed loop agent. If the agent opens the request, writes the body and is triggered by an answer on the MR, then I'd expect the law to cover this.
Good question. Actually, I was assuming that at least source code is treated as text under the legal regime (there are typically special rules in copyright law, but provisions applying to text should apply). Furthermore, I would think pull requests, etc. are all text. So I would think this applies.
But it's not just text. Once again, it's explicitly defined as:
> which is published with the purpose of informing the public on matters of public interest
There is no "informing public on matters of public interest" in source code nor an MR. It's clearly meant to prevent "deepfake" news, like the image and video ones explicitly call that out.
You are absolutely right. However, the recitals clearly point beyond protection against fake news alone. IMHO, running such an agent in stealth mode can easily be illegal. Article 50(1) states:
> Providers shall ensure that AI systems intended to interact directly with natural persons are designed and developed in such a way that the natural persons concerned are informed that they are interacting with an AI system, unless this is obvious from the point of view of a natural person who is reasonably well-informed, observant and circumspect, taking into account the circumstances and the context of use.
But you aren't a provider of AI services by using AI. There's a clear difference already called out between "provider" and "deployer". An AI user could barely be called a "deployer" as is, let alone a "provider".
In other words, what AI service are you providing by creating a PR?
I would argue that the person reading an AI-generated pull request is interacting with an AI system if there is no human oversight of the AI. And are you sure that you get out of this definition (at least as a company):
> ‘provider’ means a natural or legal person, public authority, agency or other body that develops an AI system or a general-purpose AI model or that has an AI system or a general-purpose AI model developed and places it on the market or puts the AI system into service under its own name or trademark, whether for payment or free of charge;
I would have expected people (maybe a small minority, but that includes myself) to have already instructed Claude to do this. It’s a trivial instruction to add to your CLAUDE.md file.
It doesn't work so well in my experience. I am currently wrapping (or asking the LLM to wrap) the commit message prompt in a script call.
1. the LLM is instructed on how to write a commit message and never include co-authorship
2. the LLM is asked to produce a commit message
3. the LLM output is parsed by a script which removes co-authorship if the LLM chooses to include it nevertheless (a minimal sketch follows)
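A minimal sketch of step 3, assuming the commit message arrives on stdin (a throwaway illustration, not the actual script described above):
    // strip-coauthor.ts - drop attribution lines the LLM may have added despite instructions
    import { stdin, stdout } from 'node:process';
    stdin.setEncoding('utf8');
    let raw = '';
    stdin.on('data', chunk => (raw += chunk));
    stdin.on('end', () => {
      const cleaned = raw
        .split('\n')
        .filter(line => !/^\s*(co-authored-by:|generated with claude code)/i.test(line))
        .join('\n');
      stdout.write(cleaned);
    });
The same filtering idea also works inside a git commit-msg hook, which catches the trailer no matter which tool wrote the message.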
Also for future reference, Copilot - specifically - includes a configuration flag to toggle the co-authorship (see copilot help config):
> includeCoAuthoredBy: whether to instruct the agent to add a Co-authored-by trailer to git commits; defaults to true.
This means that, if you don't explicitly configure otherwise, the LLM is specifically instructed to include co-authorship in its higher level instructions.
Typo from speech to text, corrected: “I guess Anthropic’s system prompt didn't work. If folks are having to add it manually into their own Claude.md files...”
Does this apply to their internal use as well? They can really only claim DMCA status on the leaked code if it was authored by humans. Claude attribution in their internal git history would make a strong case that they do not in fact own the copyright to Claude Code itself and are therefore abusing the DMCA system to protect leaked trade secrets rather than protect copyright.
According to the US Copyright Office, fully AI-generated works aren’t eligible for copyright because they don’t have human authors. They’re in the public domain by default.
See: https://library.osu.edu/site/copyright/2026/02/06/artificial...
It seems like it's an active area of legal thought (IANAL though).
Recent relevant discussion about this in the chardet repo between the chardet maintainer who relicensed the chardet code and Richard Fontana, a well-regarded US IP lawyer who's worked for Red Hat (now IBM) for decades:
https://github.com/chardet/chardet/issues/334#issuecomment-4...
My takeaway from the conversation there is that being in an edit loop, where the files are AI-generated under your control rather than edited directly by you, means the files are then "AI authored" for copyright protection purposes rather than authored by you.
But I double stress, I'm not a lawyer so may have misunderstood things radically.
I think that may not be answerable until a case concerning it has been heard and ruled on. A lawyer may have a better answer for you, but if I had to bet then I'd put $100 on it being something like 'it depends'.
It's interesting how AI can be its own worst enemy in this legal system. The very thing it's excellent at is not protected. In practice, there seems to be a strong opportunity to disintermediate brands by acting as a layer of abstraction above the seller and manufacturer. An AI agent likely cares less about brand or about sharing customer information with the seller; it's just more friction and tokens spent.
I think it's just a case of dealing with something that has no precedent. We have never had to determine what the line is between a tool and an employee when they can both be instructed with natural language. If we were to evaluate AI as if it were in a contract with us for use of its time and efforts in exchange for something of consideration, it would be an easy ruling. If we were to evaluate AI as if it were a tool which operates as an extension of the operator's skill without any independent additions, then it would be an easy ruling. But since we now have a tool that can produce results that are independent of our ability to produce them with any former class of tools, we have to create entirely new models for how to map these tools onto the complexity of real-life conflicts, where people have different goals and where we must decouple fairness from intentions.
None of this is really worrying; this is a pattern implemented in a similar way by every single developer using AI to write commit messages, after noticing how exceptionally noisy its self-attribution is. Anthropic's views on AI safety and alignment with human interests don't suddenly get thrown out with the bathwater because of leaked internal tooling that is functionally identical to a basic prompt in a mere interface (and not a model). I don't really buy all the forced "skepticism" on this thread tbh.
It's less about pretending to be a human and more about not inviting scrutiny and ridicule toward Claude if the code quality is bad. They want the real human to appear to be responsible for accepting Claude's poor output.
The code has a stated goal of avoiding leaks, but then the actual implementation becomes broader than that. I see two possible explanations:
* The authors made the code very broad to improve its ability to achieve the stated goal
* The authors have an unstated goal
I think it's healthy to be skeptical but what I'm seeing is that the skeptics are pushing the boundaries of what's actually in the source. For example, you say "says on the tin" that it "pretends to be human" but it simply does not say that on the tin. It does say "Write commit messages as a human developer would" which is not the same thing as "Try to trick people into believing you're human." To convince people of your skepticism, it's best to stick to the facts.
You can already turn off "Co-Authored-By" via Claude Code config. This is what their docs show (https://code.claude.com/docs/en/settings#attribution-setting...):
~/.claude/settings.json
{
"attribution": {
"commit": "",
"pr": ""
},
The rest of the prompt is pretty clear that it's talking about internal use.
Claude Code users aren't the ones worried about leaking "internal model codenames" nor "unreleased model opus-4-8" nor Slack channel names. Though, nobody would want that crap in their generated docs/code anyways.
Seems like a nothingburger, and everyone seems to be fantasizing about "undercover mode" rather than engaging with the details.
There's a more worrying part: It refers to unreleased versions of Claude in more detail than released versions.
For a company calling Chinese companies out for distillation attacks on their models, this very much looks like a distillation attack against human maintainers, especially when combined with the frustration detector.
I cringe every time I see Claude trying to co-author a commit. The git history is expected to track accountability and ownership, not your Bill of Tools. Should I also co-author my PRs with my linter, intellisense and IDE?
The buddy feature the article mentions is planned for release tomorrow, as a sort of April Fools easter egg. It'll roll out gradually over the day for "sustained Twitter buzz" according to the source.
The pet you get is generated based off your account UUID, but the algorithm is right there in the source, and it's deterministic, so you can check ahead of time. Threw together a little app to help, not to brag but I got a legendary ghost https://claudebuddychecker.netlify.app/
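For the curious, the general shape of such a deterministic assignment is just hash-and-index. This is a hypothetical stand-in (pet list and hash choice invented here), not the leaked algorithm:
    import { createHash } from 'node:crypto';
    const PETS = ['cat', 'dog', 'fox', 'owl', 'ghost']; // invented list, not the real one
    function petForAccount(accountUuid: string): string {
      // Same UUID in, same pet out, so you can check ahead of time without any server call.
      const digest = createHash('sha256').update(accountUuid).digest();
      return PETS[digest.readUInt32BE(0) % PETS.length];
    }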
The name "Undercover mode" and the line The phrase "Claude Code" or any mention that you are an AI sound spooky, but after reading the source my first knee-jerk reaction wouldn't be "this is for pretending to be human" given that the file is largely about hiding Anthropic internal information such as code names. I encourage looking at the source itself in order to draw your conclusions, it's very short: https://github.com/alex000kim/claude-code/blob/main/src/util...
My GitHub fork of anthropics/claude-code just got taken down with a DMCA notice lol
It did not have a copy of the leaked code...
Anthropic thinking 1) they can unring this bell, and 2) removing forks from people who have contributed (well, what little you can contribute to their repo), is ridiculous.
---
DMCA: https://github.com/github/dmca/blob/master/2026/03/2026-03-3...
GitHub's note at the top says: "Note: Because the reported network that contained the allegedly infringing content was larger than one hundred (100) repositories, and the submitter alleged that all or most of the forks were infringing to the same extent as the parent repository, GitHub processed the takedown notice against the entire network of 8.1K repositories, inclusive of the parent repository."
I don't understand the part about undercover mode. How is this different from disabling claude attribution in commits (and optionally telling claude to act human?)
On that note, this article is also pretty obviously AI-generated and it's unfortunate the author didn't clean it up.
I'm amazed at how much of what my past employers would call trade secrets are just being shipped in the source. Including comments that just plainly state the whole business backstory of certain decisions. It's like they discarded all release harnesses and project tracking and just YOLO'd everything into the codebase itself.
Edit: Everyone is responding "comments are good" and I can't tell if any of you actually read TFA or not
> “BQ 2026-03-10: 1,279 sessions had 50+ consecutive failures (up to 3,272) in a single session, wasting ~250K API calls/day globally.”
This is just revealing operational details the agent doesn't need to know to set MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3
Maybe it would be okay as a first filtering step, before doing actual sentiment analysis on the matches. That would at least eliminate obvious false positives (but of course still do nothing about false negatives).
I'd really recommend putting a modicum of work into cleaning up obvious AI generated output. It's rude, otherwise, to the humans you're expecting to read this.
> This was the most-discussed finding in the HN thread. The general reaction: an LLM company using regexes for sentiment analysis is peak irony.
> Is it ironic? Sure. Is it also probably faster and cheaper than running an LLM inference just to figure out if a user is swearing at the tool? Also yes. Sometimes a regex is the right tool.
I'm reading an LLM written write up on an LLM tool that just summarizes HN comments.
I'm so tired man, what the hell are we doing here.
The hooks system is the most underappreciated thing in what leaked. PreToolUse, PostToolUse, session lifecycle, all firing via curl to a local server. Clean enough to build real tooling on top of without fighting it.
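As an illustration of "real tooling on top": a tiny local listener that a command hook could curl its JSON payload into. The port, endpoint, and payload field names here are assumptions, not the leaked wire format:
    import { createServer } from 'node:http';
    // Collects events posted by a PreToolUse/PostToolUse hook command along the lines of:
    //   curl -s -X POST http://localhost:8787/hook -d @-
    createServer((req, res) => {
      let body = '';
      req.on('data', c => (body += c));
      req.on('end', () => {
        const event = JSON.parse(body || '{}');
        console.log(`[hook] ${event.hook_event_name ?? 'unknown'} -> ${event.tool_name ?? ''}`);
        res.writeHead(200).end('{}');
      });
    }).listen(8787);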
The frustration regex is funny but honestly the right call. Running an LLM call just to detect "wtf" would be ridiculous.
KAIROS is what actually caught my attention. An always-on background agent that acts without prompting is a completely different thing from what Claude Code is today. The 15 second blocking budget tells me they actually thought through what it feels like to have something running in the background while you work, which is usually the part nobody gets right.
I'm still amazed that something as ubiquitous as "daemon mode" is still unreleased.
- Claude Chat: built like it's 1995, put business logic in the button click() handler. Switch to something else in the UI and a long running process hard stops. Very Visual Basic shovelware.
- Claude Cowork: same but now we're smarter, if you change the current convo we don't stop the underlying long-running process. 21st century FTW!
- Claude Code: like chat, but in the CLI
- Claude Dispatch: an actual mobile client app, not the whole thing bundled together.
- Daemon mode: proper long-running background process, still unreleased.
I’m more curious how this impacts trust than anything else.
In the span of basically a week, they accidentally leaked Mythos, and then now the entire codebase of CC. All while many people are complaining about their usage limits being consumed quickly.
Individually, each issue is manageable (because it's exciting looking through leaked code). But together, it starts to feel like a pattern.
At some point, I think the question becomes whether people are still comfortable trusting tools like this with their codebases, not just whether any single incident was a mistake.
It is super weird that developers have to run a binary blob on their machines. It's 2026, all the major developer CLI tools are open-source anyway. What's the point for Anthropic to even make it secret?
" ...accidentally shipping your source map to npm is the kind of mistake that sounds impossible until you remember that a significant portion of the codebase was probably written by the AI you are shipping.”
Can someone clarify how the signing can't be spoofed (or can it)? If we have the source, can't we just use the key to now sign requests from other clients and pretend they're coming from CC itself?
I have yet to see another company so insecure that they would keep their CLI closed source even when the secret sauce is in the model, which they already control and which is closed source.
Not only that, wouldn't allow other CLIs to be used either.
The "undercover mode" discussion here is exactly the kind of thing non-technical CEOs need to understand — not the implementation, but the governance implication. If your developers are using a tool that actively avoids disclosing its involvement in commits and PRs, your audit trail is broken.
The short version: rotate API keys as a precaution, check what audit logs you actually have, and add a clause to your AI policy requiring vendor disclosure of new autonomous capabilities before they get enabled.
Anyone else have CI checks that source map files are missing from the build folder? Another trick is to grep the build folder for several function/variable names that you expect to be minified away.
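Something like that is cheap to script; a minimal sketch (the build directory and the canary identifiers are placeholders for whatever fits your project):
    import { readdirSync, readFileSync, statSync } from 'node:fs';
    import { join } from 'node:path';
    const BUILD_DIR = 'dist';
    const CANARIES = ['timeBasedMCConfig', 'coordinatorMode']; // names you expect minification to erase
    function walk(dir: string): string[] {
      return readdirSync(dir).flatMap(name => {
        const p = join(dir, name);
        return statSync(p).isDirectory() ? walk(p) : [p];
      });
    }
    const files = walk(BUILD_DIR);
    const maps = files.filter(f => f.endsWith('.map'));
    const leaks = files.filter(f => {
      if (!f.endsWith('.js')) return false;
      const src = readFileSync(f, 'utf8');
      return CANARIES.some(c => src.includes(c));
    });
    if (maps.length || leaks.length) {
      console.error('source maps in build:', maps, 'unminified canaries:', leaks);
      process.exit(1);
    }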
Something I’ve been thinking about, somewhat related but also tangential to this topic:
The more code gets generated by AI, won’t that mean taking source code from a company becomes legal? Isn’t it true that works created with generative AI can’t be copyrighted?
I wonder if large companies have thought of this risk. Once a company’s product source code reaches a certain percentage of AI generation, it no longer has copyright. Any employee with access can just take it and sell it to someone else, legally, right?
> The obvious concern, raised repeatedly in the HN thread: this means AI-authored commits and PRs from Anthropic employees in open source projects will have no indication that an AI wrote them. It’s one thing to hide internal codenames. It’s another to have the AI actively pretend to be human.
I don’t get it. What does this mean? I can use Claude code now without anyone knowing it is Claude code.
I like that if they decide that your usage looks like distillation it just becomes useless, because there’s no way for the end user to distinguish between it just being sort of crappy or sabotaged intentionally. That’s a cool thing to pay for
Absolutely hilarious that it's watching for frustration.
I'd discovered, perhaps mid-2025, that Cursor was noticeably better at fixing bugs if I started cursing at it. Better yet, after a while it would seem to break and start cursing itself ("Oh yes, I see the f*** problem now" and so on). Hilarity ensued.
What a world, where cursing at your machines can make them get their act together.
They want "Made with Claude Code" on your PRs as a growth marketing strategy. They don't want it on their PRs, so it looks like they're doing something you're not capable of. Well, you are and they have no secret sauce.
The Claude Code leak suggests multi-agent orchestration is largely driven by prompts (e.g., “do not rubber-stamp weak work”), with code handling execution rather than enforcing decisions.
Prompts are not hard constraints—they can be interpreted, deprioritized, or reasoned around, especially as models become more capable.
From what’s visible, there’s no clear evidence of structural governance like voting systems, hard thresholds, or mandatory human escalation. That means control appears to be policy (prompts), not enforcement (code).
This raises the core issue:
If governance is “prompts all the way down,” it’s not true governance—it’s guidance.
And as model capability increases, that kind of governance doesn’t get stronger—it becomes easier to bypass without structural constraints.
Has anyone actually implemented structural governance for agent swarms — voting logic, hard thresholds, REQUIRES_HUMAN as architecture not instruction?
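One very literal reading of "REQUIRES_HUMAN as architecture, not instruction" would be a gate that sits outside the model entirely (a hypothetical sketch, no relation to anything in the leak):
    type Verdict = 'APPROVE' | 'REJECT' | 'REQUIRES_HUMAN';
    interface Review { agentId: string; verdict: Verdict; confidence: number }
    // The gate is code: no prompt can talk its way past the quorum or the escalation path.
    function gate(reviews: Review[], quorum = 3, minConfidence = 0.8): Verdict {
      if (reviews.some(r => r.verdict === 'REQUIRES_HUMAN')) return 'REQUIRES_HUMAN';
      const approvals = reviews.filter(r => r.verdict === 'APPROVE' && r.confidence >= minConfidence);
      if (approvals.length >= quorum) return 'APPROVE';
      if (reviews.every(r => r.verdict === 'REJECT')) return 'REJECT';
      return 'REQUIRES_HUMAN'; // ambiguity escalates to a person by construction
    }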
I'm surprised that they don't just keep the various prompts, which are arguably their "secret sauce", hidden server side. Almost like their backend and frontend engineers don't talk to each other.
> Anti-distillation: injecting fake tools to poison copycats
Does this mean huggingface.co/Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled is unusable? Has anyone seen fake tool calls working with this model?
The irony of an IP scraper on an absolutely breathtaking, epic scale getting its secret sauce "scraped" - because the whole app is vibe coded (and the vibe coders appear to be oblivious to things like code obfuscation cuz move fast!)...
And so now the copy cats can ofc claim this is totally not a copy at all, it's actually Opus. No license violation, no siree!
It's fucking hilarious is what it is, it's just too much.
> which is published with the purpose of informing the public on matters of public interest
From your link, that's the only case where text needs to be attributed to AI.
> Providers shall ensure that AI systems intended to interact directly with natural persons are designed and developed in such a way that the natural persons concerned are informed that they are interacting with an AI system
That should be obvious considering an MR is not providing AI services.
> Anti-distillation: injecting fake tools to poison copycats
Plot twist: Chinese competitors end up developing real, useful versions of Claude's fake tools.
> Sometimes a regex is the right tool.
I'd argue that in this case, it isn't. Exhibit 1 (from the earlier thread): https://github.com/anthropics/claude-code/issues/22284. The user reports that this caused their account to be banned: https://news.ycombinator.com/item?id=47588970
> The multi-agent coordinator mode in coordinatorMode.ts is also worth a look. The whole orchestration algorithm is a prompt, not code.
So much for LangChain and LangGraph! I mean, if Anthropic themselves aren't using it and are just using a prompt, then what's the big deal about LangChain?
It also somehow messed up my alacritty config when I first used it. Who knows what other ~/.config files it modifies without warning.
> Claude Code also uses Axios for HTTP.
Interesting based on the other news that is out.
> So I spent my morning reading through the HN comments and leaked source.
> This was one of the first things people noticed in the HN thread.
> The obvious concern, raised repeatedly in the HN thread
> This was the most-discussed finding in the HN thread.
> Several people in the HN thread flagged this
> Some in the HN thread downplayed the leak
when the original HN post is already at the top of the front page...why do we need a separate blogpost that just summarizes the comments?
Plus there's demand for skilled TS software devs that don't ship your company's roadmap using a js.map
20,000 agents and none of them caught it...
> 250,000 wasted API calls per day
How much would this actually save, approximately?
> Frustration detection via regex (yes, regex)
/\b(wtf|wth|ffs|omfg|shit(ty|tiest)?|dumbass|horrible|awful| piss(ed|ing)? off|piece of (shit|crap|junk)|what the (fuck|hell)| fucking? (broken|useless|terrible|awful|horrible)|fuck you| screw (this|you)|so frustrating|this sucks|damn it)\b/
Personally, I'm generally polite even towards AI and even when frustrated. I simply point out its mistakes instead of using emotional words.
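If it were only a first-pass filter, as suggested upthread, the shape would be something like this (the pattern is abbreviated from the one quoted above, and the second stage is a stub, not anything from the leak):
    // Abbreviated; the full alternation is quoted in the comment above.
    const frustrationRegex = /\b(wtf|ffs|this sucks|so frustrating|damn it)\b/i;
    async function isUserFrustrated(message: string): Promise<boolean> {
      if (!frustrationRegex.test(message)) return false; // cheap negative path, no model call
      return classifySentiment(message);                  // only the few matches pay for a real check
    }
    // Stub standing in for whatever sentiment analysis you would actually run.
    async function classifySentiment(_message: string): Promise<boolean> {
      return true;
    }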
They would need to lie about consuming the tokens at one point and use them at another so that the token counting stayed precise.
But that does not make sense, because if someone counted the tokens by capturing the session, it would certainly not match what was charged.
Unless they charge for the fake tools anyway, so you never know they were there.