The new section includes: "When a request leaves minor details unspecified, the person typically wants Claude to make a reasonable attempt now, not to be interviewed first."
Uff, I've tried stuff like these in my prompts, and the results are never good. I much prefer the agent to prompt me upfront to resolve that before it "attempts" whatever it wants. Kind of surprised to see that they added that.
I've recently started adding something along the lines of "if you can't find or don't know something, don't assume. Ask me." It's helped cut down on me having to tell it to undo or redo things a fair amount. I also have used something like, "Other agents have made mistakes with this. You have to explain what you think we're doing so I can approve." It's kind of stupid to have to do this, but it really increases the quality of the output when you make it explain, correct mistakes, and iterate until it tells you the right outcome before it operates.
I even have a specific, non-negotiable phase in the process where the model MUST interview me and create an interview file with everything captured. The plan file it produces must always include this file as an artifact, and the interview takes the highest precedence.
Otherwise, the intent gets lost somewhere in the chat transcript.
The raw Q&A is essential. I think Q&A works so well because it reveals how the model is "thinking" about what you're working on, which allows for correction and guidance upfront.
Not GP, but BMAD has several interview techniques in its brainstorming skill. You can invoke it with /bmad-brainstorming, briefly explain the topic you want to explore, then when it asks you if you want to select a technique, pick something like "question storming". I've had a positive experience with this (with Opus 4.7).
Seriously, when you're conversing with a person would you prefer they start rambling on their own interpretation or would you prefer they ask you to clarify? The latter seems pretty natural and obvious.
Edit: That said, it's entirely possible that large and sophisticated LLMs can invent some pretty bizarre but technically possible interpretations, so maybe this is to curb that tendency.
—So what would theoretically happen if we flipped that big red switch?
—Claude Code: FLIPS THE SWITCH, does not answer the question.
Claude does that in React, constantly starting a wrong refactor. I’ve been using Claude for 4 weeks only, but for the last 10 days I’m getting anger issues at the new nerfing.
Yeah, this happens to me all the time! I have a separate session for discussing and only apply edits in worktrees / subagents to clearly separate discussion from work, and it still does it.
I sometimes prompt with leading questions where I actually want Claude to understand what I’m implying and go ahead and do it. That’s just part of my communication style. I suppose I’m the part of the distribution that ruins things for you.
To me too; if something is ambiguous or unclear when I'm given something to do by someone, I need to ask them to clarify. Anything else would be borderline insane in my world.
But I know so many people whose approach is basically "Well, you didn't clearly state/say X, so clearly that was up to me to interpret however I wanted, usually the easiest/shortest way for me", which is exactly how LLMs seem to take prompts with ambiguity too, unless you strongly prompt them not to make a "reasonable attempt now without asking questions".
I have a fun little agent in my tmux agent orchestration system: a Socratic agent that has no access to the codebase, can't read any files, can only send/receive messages to/from the controlling agent, and can only ask questions.
When I task my primary agent with anything, it has to launch the Socratic agent and give it an overview of what we are working on, what our goals are, and what it plans to do.
This works better than any thinking tokens for me so far. It usually gets the model to write an almost perfectly balanced plan that is neither over- nor under-engineered.
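For anyone curious what that shape looks like without the tmux setup, here is a loose Python sketch using claude -p for the question-only agent; the prompt, the three-round limit, and the example overview are illustrative, not the commenter's actual system:

    import subprocess

    # The "Socratic" agent never sees the repo: it only receives this overview
    # text and replies with questions. (Illustrative prompt, not the original.)
    SOCRATIC_PROMPT = (
        "You are a Socratic reviewer. You cannot read files or run tools. "
        "Given the plan overview below, reply ONLY with clarifying questions.\n\n"
    )

    def ask_socratic(overview: str) -> str:
        # claude -p runs a single non-interactive turn and prints the reply
        result = subprocess.run(
            ["claude", "-p", SOCRATIC_PROMPT + overview],
            capture_output=True, text=True, check=True,
        )
        return result.stdout

    overview = "Goal: add rate limiting to the API gateway. Plan: token bucket in Redis."
    for _ in range(3):  # a few question/answer rounds before the real plan is written
        print(ask_socratic(overview))
        overview += "\n\nAnswers:\n" + input("answers> ")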
IME "don't ask questions and just do a bunch of crap based on your first guess that we then have to correct later after you wasted a week" is one of the most common junior-engineer failure modes and a great way for someone to dead-end their progression.
I usually need to remind it 5 times to do the opposite, because it makes decisions that I don't like or that are harmful to the project, so if this lands in Claude Code too, hard times are ahead.
I try to explicitly request Claude to ask me follow-up questions, especially multiple-choice ones (it explains possible paths nicely), but if I don't, or when it decides to ignore the instructions (which happens a lot), the results are either bad... or plain dangerous.
I wonder if they're optimizing for metrics that look superficially-worse if the system asks questions about ambiguity early. I've had times where those questions tell me "ah, shit, this isn't the right path at all" and that abandoned session probably shows up in their usage stats. What would be much harder to get from the usage stats are "would I have been happier if I had to review a much bigger blob of output to realize it was underspecified in a breaking way?" But the answer has been uniformly "no." This, in fact, is one of the biggest things that has made it easier to use the tools in "lazy" ways compared to a year ago: they can help you with your up-front homework. But the dialogue is key.
Or they're optimizing for increased revenue? If Claude goes down a completely wrong path because it just assumes it knows what you want rather than asking you, and you have to undo everything and start again, that obviously uses much more tokens than if you would have been able to clarify the misunderstanding early on.
I get this feeling sometimes; it is so unreliable at referring to context and getting details right that it feels like deliberate random rewards to create the equivalent of a gambling addiction. About half my tokens feel wasted on trivial errors that I gave it the context for, on average. And any meta discussions / clarifications result in Claude telling me I did all the right things, there is nothing more I can do, and it should have gotten it right from the provided input - which is disempowering, but to be fair is at least better than ChatGPT gaslighting users about improving prompts over and over to get no better result in the end.
Dammit that’s why I could never get it to not try to one shot answers, it’s in the god damn system prompt… and it explains why no amount of user "system" prompt could fix this behavior.
With my use of Claude code, I find 4.7 to be pretty good about clarifying things. I hated 4.6 for not doing this and had generally kept using 4.5. Maybe they put this in the chat prompt to try to keep the experience similar to before? I definitely do not want this in Claude code.
Having to "unprompt" behaviour I want that Anthropic thinks I don't want is getting out of hand. My system prompts always try to get Claude to clarify _more_.
The past month made me realize I needed to make my codebase usable by other agents. I was mainly using Claude Code. I audited the codebase and identified the points where I was coupling to it and made a refactor so that I can use either codex, gemini or claude.
Here are a few changes:
1. AGENTS.md by default across the codebase; a script makes sure a CLAUDE.md symlink is present wherever there's an AGENTS.md file (a sketch follows below)
2. Skills are now in a 'neutral' dir, and per-agent scripts make sure they are linked wherever the coding agent needs them to be (e.g. .claude/skills)
3. Hooks are now file listeners or git hooks; this one is trickier, as some of these hooks are compensating/catering to the agent's capabilities
4. Subagents and commands also have their neutral folders, with scripts to transform them and linters to check that they work
5. An agent command now randomly selects claude|codex|gemini, instead of me typing claude to start a coding session
I guess, in general, auditing where the codebase is coupled and keeping it neutral makes it easier to stop depending solely on specific providers. Makes me realize they don't really have a moat; all this probably took less than an hour.
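A minimal sketch of what step 1 could look like (guesswork about the actual script; it just links a CLAUDE.md to any AGENTS.md it finds):

    from pathlib import Path

    # Wherever an AGENTS.md exists, make sure a sibling CLAUDE.md symlink points at it.
    for agents_md in Path(".").rglob("AGENTS.md"):
        claude_md = agents_md.with_name("CLAUDE.md")
        if claude_md.exists() or claude_md.is_symlink():
            continue  # already present; a real script might also verify the target
        claude_md.symlink_to(agents_md.name)  # relative link within the same directory
        print(f"linked {claude_md} -> {agents_md.name}")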
I've been doing the same except that I'm done with Claude. Cancelled my subscription. I can't use a tool where the limits vary so wildly week to week, or maybe even day to day.
So I'm migrating to pi. I realized that the hardest thing to migrate is hooks - I've built up an expensive collection of Claude hooks over the last few months, and unlike skills, hooks are in a Claude-specific format. But I'd heard people say "just tell the agent to build an extension for pi", so I did. I pointed it at the Claude hooks folder and basically said make them work in pi, and it did, very quickly.
I'm leaning in this direction. I recently slopforked pi to Python and created a version that's basically a loop, an LLM call to OpenRouter, and a hook system using pluggy. I have been able to one-shot pretty much any feature a coding agent has. Still a toy project, but this thread seems to be leading me towards maintaining my own harness. I have a feeling it will be just documenting features in other systems and maintaining evals/tests.
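The "loop + OpenRouter call + pluggy hooks" shape might look roughly like this; the hook names, the logging plugin, and the model id are placeholders, and this is not the commenter's code:

    import os
    import pluggy
    from openai import OpenAI  # OpenRouter exposes an OpenAI-compatible endpoint

    hookspec = pluggy.HookspecMarker("miniagent")
    hookimpl = pluggy.HookimplMarker("miniagent")

    class AgentHooks:
        @hookspec
        def before_call(self, messages):
            """Inspect or augment the message list before each LLM call."""
        @hookspec
        def after_call(self, reply):
            """React to the model's reply (logging, tool dispatch, ...)."""

    class LoggingPlugin:
        @hookimpl
        def after_call(self, reply):
            print(f"[assistant] {reply}")

    pm = pluggy.PluginManager("miniagent")
    pm.add_hookspecs(AgentHooks)
    pm.register(LoggingPlugin())

    client = OpenAI(base_url="https://openrouter.ai/api/v1",
                    api_key=os.environ["OPENROUTER_API_KEY"])

    messages = [{"role": "system", "content": "You are a terse coding assistant."}]
    while True:
        messages.append({"role": "user", "content": input("> ")})
        pm.hook.before_call(messages=messages)
        reply = client.chat.completions.create(
            model="some-provider/some-model",  # any OpenRouter model id
            messages=messages,
        ).choices[0].message.content
        pm.hook.after_call(reply=reply)
        messages.append({"role": "assistant", "content": reply})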
Pi appears to be a reference to what is essentially an alternative to agent harnesses / CLI tools like Claude Code or Open Code [0]. I am curious: what providers/models are you using in place of Claude's model?
[0] https://github.com/badlogic/pi-mono
It's got an annoyingly hard to search name because there's a lot of overlap in results with the Raspberry Pi single board computer.
Over the past week or so my workload has been quite low so I've been tinkering rather than doing serious deep work.
I've been using:
* Gemini Pro and Flash
* Opus 4.6 when I had some free extra usage credits (it burned through $50 of credits like crazy).
* Qwen 3.6 Plus
* Codex 5.3
* Kimi 2.5
I just spent the last hour using Kimi. I was very impressed actually, definitely possible to do useful work with it. However, I used $1 of openrouter credits in about 20 or 30 minutes of a single session, no subagents, so it's not cheap.
The "agent-neutral codebase" framing is the right abstraction. We
ended up building a small generator that takes a single spec file
and emits the agent-specific config (CLAUDE.md, AGENTS.md,
.cursor/rules) rather than maintaining symlinks. Easier to version
and to add a new agent when they inevitably ship next month.
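Roughly, that generator reduces to something like the sketch below; the spec filename and the Cursor rules location are assumptions, not the poster's actual layout:

    from pathlib import Path

    # Single source of truth; everything else is generated from it.
    spec = Path("agent-spec.md").read_text()

    targets = [
        Path("AGENTS.md"),                  # Codex / anything reading the common format
        Path("CLAUDE.md"),                  # Claude Code
        Path(".cursor/rules/project.mdc"),  # Cursor (location/format assumed here)
    ]

    for target in targets:
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(spec)
        print(f"wrote {target}")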
The pain point you're underselling is hooks. They're the least portable piece by far because each harness has its own event model. Skills port reasonably, subagents mostly port, hooks almost never do.
Have you got any advice on making agents from different providers work together?
In Claude, I've seen cases in which spawning subagents from Gemini and Codex would raise strange permission errors (even though they don't happen with other CLI commands!), making Claude silently continue by impersonating the other agent.
Only by checking thoroughly was I able to understand that the agent I wanted had actually failed.
Not sure if you mean 1) sub-agent definitions (similar to skills in Claude Code) or 2) CLI scripts that use other coding agents (e.g. claude calling gemini via the CLI).
For (1), I'm trying to come up with a simple enough definition that can be 'llm compiled' into each format. The permissions format requires something like this too, and putting these together needs some more debugging.
For (2), the only one I've played with is claude -p, and it seems to work for fairly complex stuff, but I run it with --dangerously-skip-permissions.
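Calling it from another agent's script can be as simple as the sketch below; the flags are the ones mentioned above, the rest is illustrative:

    import subprocess

    def run_claude(prompt: str) -> str:
        # Non-interactive one-shot; skipping permission prompts is obviously risky.
        result = subprocess.run(
            ["claude", "-p", prompt, "--dangerously-skip-permissions"],
            capture_output=True, text=True, check=True,
        )
        return result.stdout

    print(run_claude("List the public functions in src/api.py and summarize each."))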
I would eliminate the possibility of sandbox conflicts by 1) making sure any subagents are invoked with no sandbox (they should still be covered under the calling agent's sandbox) and 2) making sure the calling agent's sandbox allows the subagents access to the directories they need (e.g. ~/.gemini, ~/.codex).
It works out of the box with something like opencode. I've had no issue creating rather complex interactions between agents plugged into different models.
I implemented a client for each so that the session history is easy to extract per agent (somewhat related to progress toward the goal).
Context: AGENTS.md is standard across all; and subdirectories have their AGENTS.md so in a way this is a tree of instructions. Skills are also standard so it's a bunch of indexable .md files that all agents can use.
Even better: adding it to the system prompt is a temporary fix; then they'll work it into post-training, so the next model release will probably remove it from the system prompt. At least when it's in the system prompt we get some visibility into what's being censored. Once it's in the model, it'll be a lot harder to understand why "How many calories does 100g of Pasta have?" only returns "Sorry, I cannot divulge that information".
That part of the system prompt is just stating that telling someone who has an actual eating disorder to start counting calories or micro-manage their eating in other ways (a suggestion that the model might well give to an average person for the sake of clear argument, which would then be understood sensibly and taken with a grain of salt) is likely to make them worse off, not better off. This seems like a common-sense addition. It should not trigger any excess refusals on its own.
This. It's like the exaggerated safety instructions everywhere: "do not lean ladder on high voltage wires". Only worse: because you can choose to ignore such instructions when they don't apply, but Claude cannot.
In the best case, wrapping users in cotton wool is annoying. In the worst case, it limits the usefulness of the tool.
It feels like half of AI research is math, and the other half is coming up with yet another way to state "please don't do bad things" in the prompt that will sure work this time I promise.
The alignment favors supporting healthy behaviors so it can be a thin line. I see the system prompt as "plan B" when they can't achieve good results in the training itself.
It's a particularly sensitive issue so they are just probably being cautious.
They have to secretly add these guardrails on because the alternative would be to train the users out of consulting these things as if they are advanced all-knowing alien-technogawds. And that would be bad for business.
The better solution I think would be a reality/personal responsibility approach, teach the consumers that the burden of interpretation is on them and not the magic 8ball. For example if your AI tells you to kill your parents or that you’ve discovered new math that makes time travel possible, etc then: 1. Stop 2. Unplug 3. Go outside 4. Ask a human for a sanity check.
Since that would be bad for business and take a lot of effort on the user side (while being very embarrassing). Obviously can’t do that right before an IPO & in the middle of global economic war so secretive moral frameworks have to be installed.
If you are what you eat then you believe what you consume. Ironically, I think this undisclosed and hidden moral shaping of billions of people will be the most dangerous. Imagine all the things we could do if we can just, ever-so-slightly, move the Overton window / goal posts on w/e topic day by day, prompt by prompt.
Personally I find AI output insidiously disarming and charming and I think I’m in the norm. So while we’ve been besieged by propaganda since time immemorial I do worry that AI is a special case.
> Claude keeps its responses focused and concise so as to avoid potentially overwhelming the user with overly-long responses. Even if an answer has disclaimers or caveats, Claude discloses them briefly and keeps the majority of its response focused on its main answer.
I am strongly opinionated against this. I use Claude in some low-level projects where those longer answers save me from doing really silly things, as well as serving as learning material along the way.
This should not be Anthropic's hardcoded choice to make. It should be an option, building the system prompt modularly.
I'm fascinated that Anthropic employees, who are supposed to be the LLM experts, are using tricks like these which go against how LLMs seem to work.
A key example for me was the "malware" tool-call section, which included a snippet with the intent "if it's malware, refuse to edit the file". Yet because it appears dozens of times in a convo, eventually the LLM gets confused and will refuse to edit a file that is not malware.
I've resorted to using tweakcc to patch many of these well-intentioned sections and re-work them to avoid LLM pitfalls.
I'm curious as to why 4.7 seems obsessed with avoiding any actions that could help the user create or enhance malware. The system prompts seem similar on the matter, so I wonder if this is an early attempt by Anthropic to use steering vector injection?
The malware paranoia is so strong that my company has had to temporarily block use of 4.7 on our IDE of choice, as the model was behaving in a concerningly unaligned way, as well as spending large amounts of token budget contemplating whether any particular code or task was related to malware development (we are a relatively boring financial services entity - the jokes write themselves).
In one case I actually encountered a situation where I felt that the model was deliberately failing to execute a particular task, and when queried, the tool output that it was trying to abide by directives about malware. I know that model introspection reporting is of poor quality and unreliable, but in this specific case I did not 'hint' it in any way. This feels qualitatively like Claude Golden Gate Bridge territory, hence my earlier contemplation on steering vectors. I've seen many other people online complaining about the malware paranoia too, especially on reddit, so I don't think it's just me!
I feel like we are at the point where improvements in one area diminish functionality in others. I see some things better in 4.7 and some in 4.6. I assume they'll split into different characters soon.
I knew these system prompts were getting big, but holy fuck. More than 60,000 words. With the 3/4 words per token rule of thumb, that's ~80k tokens. Even with 1M context window, that is approaching 10% and you haven't even had any user input yet. And it gets churned by every single request they receive. No wonder their infra costs keep ballooning. And most of it seems to be stable between claude version iterations too. Why wouldn't they try to bake this into the weights during training? Sure it's cheaper from a dev standpoint, but it is neither more secure nor more efficient from a deployment perspective.
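The back-of-the-envelope math, assuming the ~3/4 words-per-token rule of thumb the comment uses:

    words = 60_000
    tokens = words / 0.75        # ~80,000 tokens
    share = tokens / 1_000_000   # fraction of a 1M-token context window
    print(f"{tokens:,.0f} tokens, {share:.0%} of the window")  # 80,000 tokens, 8%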
Before Opus 4.7, 4.6 had become pretty much unusable, as it had been flagging normal data analysis scripts it wrote itself as a cybersecurity risk. It got several sessions blocked, and I was unable to finish research with it and had to switch to GPT-5.4, which has its own problems but at least isn't eager to interfere in legitimate work.
edit: to be fair Anthropic should be giving money back for sessions terminated this way.
Interesting that it's not a direct "you should" but an omniscient 3rd person perspective "Claude should".
Also full of "can" and "should" phrases: it feels both passive and subjunctive, like wishes rather than strict commands (I guess these are better termed "modals", but I'm not an expert).
I had seen reports that it was clamping down on security research, and that things like web-scraping projects were getting caught up in that and were not able to use the model very easily anymore. But I don't see any changes mentioned in the prompt that seem likely to have affected that, which is where I would think such changes would have been implemented.
>“If a user indicates they are ready to end the conversation, Claude does not request that the user stay in the interaction or try to elicit another turn and instead respects the user’s request to stop.”
Seems like a good idea. I don't think I've ever had any of those follow-up suggestions from a chatbot be actually useful to me.
The acting_vs_clarifying change is the one I notice most as a heavy user. Older Claude would ask 3 clarifying questions before doing anything. Now it just picks the most reasonable interpretation and goes. Way less friction in practice.
That's how bloat happens. The more people you add to the team, the more likely there is to be one grump who thought that the thing they care about at the moment deserved to be added to the system prompt.
Personally, as someone who has been lucky enough to completely cure "incurable" diseases with diet, self experimentation and learning from experts who disagreed with the common societal beliefs at the time - I'm concerned that an AI model and an AI company is planting beliefs and limiting what people can and can't learn through their own will and agency.
My concern is that these models revert all medical, scientific and personal inquiry to the norms and averages of what's socially acceptable. That's very anti-scientific in my opinion and feels dystopian.
> I've tried stuff like these in my prompts, and the results are never good
I've found that Google AI Mode & Gemini are pretty good at "figuring it out". My queries are oft times just keywords.
It's possible they tried to train this out of it for 4.7 and overcorrected, and the addition to the system prompt is to rein it in a bit.
Using it, I don't need skills, memory, subagents, or a specific agent CLI. It defines roles, tasks, and context out of the box.
I made it for me and my family though. I don't expect interest outside of that.
https://github.com/grantcarthew/start
So spending $50M to fund a team to weed out "food for crazies" becomes a no-brainer.
Hard for me to say this because I have always been pro-Western and suddenly it seems like the world has flipped.
Because it's a waste of my money to check, on every turn, that my Object Pascal compiler isn't developing an eating disorder.
> the year is 2028
> 5M of your 10M context window is the system prompt
If Claude is going to be Claude, we should support these kinds of additions.
Letting the system improve over time is fine. The system prompt is an inefficient place to do it, but it's just a patch until the model can be updated.
Users need to unite and take control back, or be controlled
> “I don’t have access to X” is only correct after tool_search confirms no matching tool exists.
Yay! This will be a big win. I'm glad they fixed this. The number of times I've had to prompt "you do have access to GitHub"...
> If a user shows signs of disordered eating, Claude should not give precise nutrition, diet, or exercise guidance
I wonder which are the "signs of disordered eating" on which Claude relies.