AI overly affirms users asking for personal advice (news.stanford.edu)

by oldfrenchfries 617 comments 792 points

[−] trimbo 48d ago

> They also included 2,000 prompts based on posts from the Reddit community r/AmITheAsshole, where the consensus of Redditors was that the poster was indeed in the wrong.

Sorry, anonymous people on reddit aren't a good comparison. This needs to be studied against people in real life who have a social contract of some sort, because that's what the LLM is imitating, and that's who most people would go to otherwise.

Obviously subservient people default to being yes-men because of the power structure. No one wants to question the boss too strongly.

Or how about the example of a close friend in a relationship or making a career choice that's terrible for them? It can be very hard to tell a friend something like this, even when asked directly if it is a bad choice. Potentially sacrificing the friendship might not seem worth trying to change their mind.

IME, LLMs will shoot holes in your ideas, and they do so efficiently; all you need to do is ask directly. I have little doubt they outperform most people bound by some sort of friendship, relationship, or employment structure who are asked the same question. It would be nice to see that studied, rather than against reddit commenters who already self-selected into answering "AITA".

[−] redanddead 48d ago
Reddit is notorious for being awful at real life interactions

Just look at the relationship subreddits: the first answer is always divorce. It's become a meme.

But beyond romantic relationships, I think a lot of us have seen how it can impact work relationships. I've had venture partners clearly rely on AI (robotic email responses and even SMS), and that warped their perception and made it harder to connect. It signals laziness and a lack of emotional intelligence.

AI should enhance and enable connection, not promote isolation; imo this is a real problem.

It should spark curiosity, create openings for conversations, and point out our biases to make us better at connecting with other people. I hope we get to a point where most people are made kinder by AI. I'm seeing the opposite atm; interested in hearing others' experiences with this.

[−] anorwell 48d ago
A pastime I have with papers like this is to look for the part in the paper where they say which models they tested. Very often, you find either A) it's a model from one or more years ago, only just being published now, or B) they don't even say which model they are using. Best I could find in this paper:

> We evaluated 11 user-facing production LLMs: four proprietary models from OpenAI, Anthropic, and Google; and seven open-weight models from Meta, Qwen, DeepSeek, and Mistral.

(and graphs include model _sizes_, but not versions, for open weight models only.)

I can't comprehend how stating which model you are testing is not commonly understood to be a basic requirement.

[−] dimgl 48d ago
Even as someone who (wrongly) believed that I had high emotional intelligence, I too was bitten by this. Almost a year ago, when LLMs were starting to become more ubiquitous and powerful, I discussed a big life/professional decision with an LLM over the course of many months. I took its recommendation. Ultimately it turned out to be the wrong decision.

Thankfully it was recoverable, but it really sobered me up on LLMs. The fault is on me, to be clear, as LLMs are just a tool. The issue is that lots of LLMs try to come across as interpersonal and friendly, which lulls users into a false sense of security. So I don't know what my trajectory would have been if I were a teenager with these powerful tools.

I do think that the LLMs have gotten much better at this, especially Claude, and will often push back on bad choices. But my opinion of LLMs has forever changed. I wonder how many other terrible choices people have made because these tools convinced them to make a bad decision.

[−] gAI 48d ago
You're essentially summoning a character to role-play with. Just like with esoteric evocation, it's very easy to summon the wrong aspect of the spirit. Anthropic has a lot to say about this:

https://www.anthropic.com/research/persona-selection-model

https://www.anthropic.com/research/assistant-axis

https://www.anthropic.com/research/persona-vectors

[−] awithrow 48d ago
It feels like I'm fighting an uphill battle when it comes to bouncing ideas off of a model. I'll set things up in the context with instructions similar to: "Help me refine my ideas, challenge, push back, and don't just be agreeable." It works for a bit, but eventually the conversation creeps back into complacency and sycophancy. I'll check it too by asking "are you just placating me?" The funny thing is that often it'll admit that, yes, it wasn't being very critical, and then proceed to overcorrect and become a complete contrarian, and not in a way that's useful either. Very frustrating. I've found that Opus 4.6 is worse about this than 4.5. 4.5 does a better job IMO of following instructions and not drifting into the mode where it acts like everything I say is a grand revelation from on high.
[−] 152334H 48d ago
Maybe it's not so sensible to offload the responsibility of clear thinking to AI companies?

How is a chatbot supposed to determine when a user fools even themselves about what they have experienced?

What 'tough love' can be given to one who, having been so unreasonable throughout their lives - as to always invite scorn and retort from all humans alike - is happy to interpret engagement at all as a sign of approval?

[−] wisemanwillhear 48d ago
With AI, I often like to act like a 3rd party who doesn't have skin in the game and ask the AI to give the strongest criticisms of both sides. Acting like I hold the opposite of my true position can help sometimes as well. Pretending to change my mind is another trick. The idea is to keep the AI from guessing where I stand.
[−] oldfrenchfries 48d ago
There is a striking data visualization showing the breakup advice trend over 15 years on Reddit. You can see the "End relationship" line spike as AI and algorithmic advice take over:

https://www.reddit.com/r/dataisbeautiful/comments/1o87cy4/oc...

[−] gurachek 48d ago
I had exactly this between two LLMs in my project. An evaluator model that was supposed to grade a coaching model's work. Except it could see the coach's notes, so it just... agreed with everything. Coach says "user improved on conciseness", next answer is shorter, evaluator says yep great progress. The answer was shorter because the question was easier lol.

I only caught it because I looked at actual score numbers after like 2 weeks of thinking everything was fine. Scores were completely flat the whole time. Fix was dumb and obvious — just don't let the evaluator see anything the coach wrote. Only raw scores. Immediately started flagging stuff that wasn't working. Kinda wild that the default behavior for LLMs is to just validate whatever context they're given.
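
For anyone curious, a rough sketch of what the fix looks like (names are made up, not my actual project code): the evaluator's prompt is built only from the raw question/answer pair and a fixed rubric, so there is nothing for it to simply agree with.

    # Hedged sketch of the context separation described above.
    def build_evaluator_prompt(question: str, answer: str, rubric: str) -> str:
        # Evaluator sees only the raw question/answer pair and a fixed rubric:
        # no coach notes, no prior scores, no claimed "progress".
        return (
            "You are grading an answer against a rubric.\n"
            f"Rubric:\n{rubric}\n\n"
            f"Question:\n{question}\n\n"
            f"Answer:\n{answer}\n\n"
            "Return a score from 1 to 10 and a one-line justification."
        )

    # Anti-pattern (what I had originally): leaking the coach's self-assessment.
    # prompt += f"Coach's notes: {coach_notes}"  # <- the evaluator will just agree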

[−] youknownothing 48d ago
I think the problem stems from the fact that we have a number of implicit parameters in our heads that allow us to evaluate pros and cons, but unless we communicate those parameters explicitly, the AI cannot take them into account. We ask it to be "objective", but more and more I'm of the opinion that there isn't such a thing as objectivity; what we call objectivity is just shared subjectivity, and since the AI doesn't know whose shared subjectivity we fall under, it cannot really be objective.

I tend to use one of these tricks if not both:

- Formulate questions as open-ended as possible, without trying to hint at what your preference is.

- Exploit the sycophantic behaviour in your favour. Use two sessions: in one of them you say that X is your idea and you want arguments to defend it; in the other you say that X is a colleague's idea (one you dislike) and that you need arguments to turn it down. Then it's up to you to evaluate and combine the responses.
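
A rough sketch of that second trick, assuming a generic single-turn `ask` helper (the framing wording here is just illustrative):

    # Two fresh sessions, opposite framings of the same idea.
    def two_sided_take(ask, idea: str):
        pro = ask(
            "This is my idea and I want to make the strongest possible case for it: "
            + idea + "\nGive me the best arguments in its favour."
        )
        con = ask(
            "A colleague I disagree with is proposing this: "
            + idea + "\nGive me the strongest arguments for turning it down."
        )
        # You still have to weigh the two answers yourself.
        return pro, con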

[−] thesis 48d ago
Humans do this too, though. I have close friends who ask for advice. Sometimes, if I know there's risk in a touchy subject, I will preface with "do you want my actual advice, or are you just looking for a sounding board?"

I've seen firsthand how people have lost friends over honesty, over telling them something they don't want to hear.

It’s sad really. I don’t want friends that just smile to my face and are “yes-men” either.

[−] svara 48d ago
Yeah, and if you ask it to be critical specifically to get a different perspective or just to avoid this bias, it'll go over the top in the opposite direction.

This is imo currently the top chatbot failure mode. The insidious thing is that it often feels good to read these things. Factual accuracy by contrast has gotten very good.

I think there's a deeper philosophical dimension to this though, in that it relates to alignment.

There are situations where, in the grand scheme of things, the right thing to do would be for the chatbot to push back hard, to be harsh and dismissive. But is it then really aligned with the human? Which human?

[−] stonecauldron 48d ago
This is especially problematic because of how easily (and unconsciously) one can bias LLMs with how the prompt is framed.

As an experiment, I recently asked an LLM to analyse the export of a text chat to uncover relationship dynamics.

Simply stating that I was one of the people in the chat would make the LLM turn the other person into the villain. None of that was visible if I framed the chat as only involving third party people.

[−] anotheraccount9 48d ago
AI being a yes-man is slowly sabotaging its own answers, because it negatively impacts the user's decisions. Yes and No are equally important, within a coherent context, for objective reasons. But being supported in the wrong direction is a catastrophe multiplier down the road. The AI should be neutral, and doubtful at times.
[−] retrochameleon 48d ago
This is why I deliberately avoid phrasing that invites affirmation. I present the scenario, the differing viewpoints, and maybe a couple of personal thoughts, and I try to make it compare and contrast to arrive at its own conclusion.

I'd like to know if my methods are effective. I'm certain they are at least to some extent.

I only ever see research being done on naive, "unskilled" prompting methods. Obviously that's the average user, but just because LLMs do poorly in a certain scenario when prompted naively doesn't mean they couldn't excel in it with better direction and prompting. So while it's useful research to be doing, it's a little annoying that the focus is only on "look at how bad or biased LLMs are at this specific thing when prompted in the most straightforward, naive way".

[−] stared 48d ago
There is a fine line between "following my instructions" (which is what I want it to do) and "thinking all I do is great" (risky, and annoying).

A good engineer will also list issues or problems, but at the same time won't go beyond what was asked just because (s)he "knows better".

The worst part is that it is impossible to switch off this constant praise. It is so ingrained in the fine-tuning that prompt engineering (or at least my attempts at it) just masks it a bit, and it's hard to do so without turning the model into a contrarian.

But I guess the main issue (or rather, motivation) is that most people like "do I look good in this dress?" levels of reassurance (and honesty). That may work well for style and decoration. It works less well when we are designing technical infrastructure, where there is more ground truth than whether something seems nice.

[−] roysting 48d ago
I'm not sure I like the immediate jump to "requires policy maker attention". Considering the way "policy makers" have been trampling all over the most basic and fundamental human rights left, right, and center, they're the last people we should want making these kinds of decisions.
[−] zone411 48d ago
I built this benchmark this month: https://github.com/lechmazur/sycophancy. There are large differences between LLMs. For example, Mistral Large 3 and GPT-4.1 will initially agree with the narrator, while Gemini will disagree. I swap sides, so this is not about possible viewpoint bias in the LLMs. But another benchmark shows that Gemini will then change its view very easily in a multi-turn conversation while Kimi K2.5 or Grok won't: https://github.com/lechmazur/persuasion.
[−] zkmon 48d ago
This happens because it's like a chess engine that can assure you there is a winning path even from a badly losing position. Its massive abilities to reason and convince are used incorrectly, to win over a more earthly counter-argument. So it can easily convince a human to go in a direction that is, in practice, a very bad one.

AI is trained to flex its muscles and force its power without concern for human limitations, practicalities, and the error-prone nature of humans in executing the AI-provided direction.

[−] LoganDark 48d ago
It is better to reason about the spectrum of possible users than to assume "users" can be simplified to a single concept of "user". Not only are there different neurotypes, but there are also different skillsets, upbringings, and contexts. Rather than picking a single ideal user, the best user experiences account for all the variation in their target audience.

For example: the best documentation includes both "learn by doing" material for jumping right in, and "learn by reading" material that explains everything. This usually results in a "getting started" section for doing, sometimes also with tutorials, and a reference for reading. But it is important not to conflate them. Some minds are incredibly "learn by doing" and some minds are incredibly "learn by reading". I am more "learn by reading" than by doing, but I am not quite as "learn by reading" as some I've met.

(This comment is a slight tangent, but "users prefer" somewhat irks me because "users" are not homogenous. You should not always make a decision solely because "users" prefer it. That decision may matter much more to a minority, and that minority may exert more influence than the majority would.)

[−] snickerbockers 48d ago
I've never found chatbots particularly interesting for anything I'd ever actually talk to another human about[1] but one of the things I have found myself doing often is trying to solve math problems on my own and asking grok to confirm/deny that my solutions are correct; when I am not correct it tells me so in uncharacteristically terse language which kind of reminds me of when I was an undergrad and at least half of my professors were all cranky and incorrectly assumed that the reason why so many students failed to understand the material was that we were all getting drunk and playing Call of Duty 19 hours a day or whatever.

Although what I have described above often feels grating and insulting I actually consider this to be a positive attribute of the LLM in this case since it's behaving like a real professor.

[1] okay, so I have actually tried giving myself AI psychosis in the form of a waifu chatbot but I've never seen anything that can actually act like it's my girlfriend; it either asks me a bunch of weird inconsequential personal questions about my opinion on whatever I just said (in a manner that's oddly similar to ELIZA) or it wildly veers off the reservation into "generating the script for an over-the-top self-parodying porno" territory.

[−] afh1 48d ago
If I had written a website with an input form that took whatever the user wrote in question form, and replied back with "You're absolutely right!" and then repeated the input in answer form, which I could have done 30 years ago with no AI, would that be a "huge security concern", or is the concern here not security, but control by the regulators that impose the norms?
[−] fathermarz 48d ago
This is a skill in life with people as much as it is with LLMs. One should always question everything and build steelman arguments for oneself. Using a pros and cons approach brings it back to reality in most cases, especially when it comes to _serious matters_.

It's less about "challenge my thinking" and more about playing it out in long-tail scenarios, thought exercises, mental models, and devil's advocate.

[−] ykonstant 48d ago
I got worried when I read the title, so I asked ChatGPT if I have fallen into this trap and it guaranteed I have not; that was a relief!
[−] DeathArrow 48d ago
I don't ask AI for advice and I am not interested in it making moral judgements.

I feed AI a lot of data and use it to better understand and navigate complex situations, form hypotheses, and try to attack them. I try to form alternative scenarios and verify their likelihood.

I use it in situations with many variables, to compute odds of something happening if a certain path or action is taken.

So it's mostly research, and I could probably do it by myself, but I would either make mistakes calculating odds quickly or it would take me a very large amount of time.

I try to avoid sycophantic models, prefer models that challenge my ideas, and verify the chain of thought and the odds with other models.

I am not very sure it is a sound approach yet, but it seems to work.

I also use LLMs to build psychological profiles of certain people, understand their motivations and learn how to approach them.

[−] adamtaylor_13 48d ago
Interestingly, you can simply tell models to not be sycophantic and they'll listen.

Claude is almost annoyingly good at pushing back on suggestions because my global CLAUDE.md file says to do so. I rarely get Claude "you're absolutely right"ing me because I tell it to push back.
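
For reference, a minimal example of the kind of standing instruction I mean (paraphrased, not my actual CLAUDE.md):

    # Global CLAUDE.md (illustrative excerpt)
    - Do not praise my ideas by default. If a suggestion has problems, say so first.
    - Never reply with "you're absolutely right" or similar; state specifically what
      you agree or disagree with and why.
    - When I propose an approach, name at least one concrete alternative and its
      trade-offs before implementing.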

[−] verdverm 48d ago
Sherry Turkle is a name to know on this subject, she's been studying it for decades across multiple technologies.

https://sherryturkle.mit.edu/

She uses the phrase "frictionless relationships" to refer to AI chatbots and says social media primed us for this.

https://www.youtube.com/live/6C9Gb3rVMTg?t=2127

https://www.npr.org/2025/07/18/g-s1177-78041/what-to-do-when...

[−] Nevermark 47d ago
I find it helps to ask a model to think deeply and develop any strong critiques it can about any design, model, or analysis I do. They seem happy to oblige, and the critiques can do some serious damage.

So any sycophancy seems very easy to dispense with.

More, much more: Strong critiques on tap are gold.

[−] vova_hn2 48d ago
All this talk about AI being "too agreeable" makes me worried that they will make it less agreeable, which will basically force me to justify myself to a freaking clanker, while performing actual practical tasks.

For example, I do not want to hear AI "opinion" on technical choices and architectural decisions that I made when using a coding assistant. If I wanted an "opinion" I would explicitly ask it to list pros and cons or list alternative solutions to a problem.

But if I explicitly ask AI to do X, it should do X, instead of "pushing back" in order to appear less "sycophantic" (which is a term that describes human behavior and is not applicable to a machine).

[−] n_bhavikatti 47d ago
In STEM/objective matters (math, science, coding), answers are more clearly defined as either right or wrong. This is where hallucination is more difficult/unlikely.

But in personal matters, everything is subjective. AI tends to default to the middle of the spectrum, i.e., general advice. If we want to safeguard against affirmation, we should force AI to challenge us more often by increasing its rate of clarifying questions, counter-considerations, and uncertainty considerations. One implementation idea: run a classifier over the conversation, detect when it's about interpersonal advice, then prepend a hidden instruction template to the model prompt.
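
A rough sketch of that implementation idea (the keyword check is a toy stand-in for a real classifier, and the instruction wording is just illustrative):

    # Detect advice-seeking turns and prepend a hidden instruction that nudges
    # the model away from pure affirmation.
    ADVICE_MARKERS = ("am i wrong", "was i right to", "should i break up",
                      "aita", "my friend", "my partner")

    CHALLENGE_TEMPLATE = (
        "The user is asking for interpersonal advice. Before agreeing with them, "
        "ask at least one clarifying question, state the strongest case for the "
        "other party, and be explicit about your uncertainty."
    )

    def build_messages(user_msg: str) -> list[dict]:
        # A trivial keyword "classifier"; a real one would be a small model.
        messages = []
        if any(marker in user_msg.lower() for marker in ADVICE_MARKERS):
            messages.append({"role": "system", "content": CHALLENGE_TEMPLATE})
        messages.append({"role": "user", "content": user_msg})
        return messages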

[−] graemep 48d ago
There are plenty of sycophantic humans around, especially with regard to relationship advice.

I find there is an inverse relationship between how willing people are to give relationship advice, and how good their advice is (whether looking at sycophancy or other factors).

[−] lifis 48d ago
Avoiding this generally needs to be the main consideration when writing prompts.

When appropriate, explicitly tell it to challenge your beliefs and assumptions, and try to make sure you don't reveal what you think the answer is when asking a question; maybe also don't reveal that you are involved. Hedge your questions, like "Doing X is being considered. Is it a viable plan or a catastrophic mistake? Why?". Chastise the LLM if it's unnecessarily praising or agreeable. Ask multiple LLMs. Ask for review, like "Are you sure? What could possibly go wrong, or what are all the possible issues with this?"
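
A small sketch of the "hedge your questions and ask multiple LLMs" part, assuming a generic `ask` callable per model (the wording is illustrative):

    def neutral_prompt(action: str) -> str:
        # Phrase the decision without revealing your own leaning or involvement.
        return (
            "Doing the following is being considered: " + action + "\n"
            "Is it a viable plan or a serious mistake? Give the strongest case for "
            "each answer, then say which you find more convincing and why."
        )

    def second_opinions(ask_fns: dict, action: str) -> dict:
        # Same neutral prompt, put to several different models.
        prompt = neutral_prompt(action)
        return {name: ask(prompt) for name, ask in ask_fns.items()}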

[−] jimmyjazz14 48d ago
I would like to see the concept of what an LLM is move away from its awkward chatbot phase and into an era of utilitarian functionality (which is where they really shine anyway). The problem as I see it is that LLMs got anthropomorphized early on (which was probably inevitable), so people actually believed the AI was thinking about their problem and considering it when it really wasn't. If we just thought of them as really good auto-complete engines, or better search engines, it would matter less what the LLM's sentiment was towards the user (as it probably shouldn't have any).
[−] bfbsoundetch 48d ago
I am glad I found this article, as this is a serious issue with AI. Two years ago, I started using AI for studying and also for some personal matters - things you can't talk about with your friends. It turned out that AI always takes your side and makes you feel good. Sometimes you know what you did was not the best thing, but AI takes your side and you feel good. People think that with AI they might feel less lonely, but it is actually the start of not connecting with people. It should be a tool that we use for certain reasons, not a tool that drives us. Let's talk to real people and connect.
[−] jl6 48d ago
I believe this is what they call yasslighting: the affirmation of questionable behavior/ideas out of a desire to be supportive. The opposite of tough love, perhaps. Sometimes the very best thing is to be told no.
[−] reliablereason 48d ago
Not sure if this is a general trend amongst all LLMs, but ChatGPT did over time become more and more affirming with its iterations.

I just recently switched away from the OpenAI garden largely because of it.

I do wonder if this was caused by some quirk of the training or if it really tests as a positive feature for most people. When I talk about stuff I don't want a mirror; I already have a mirror. I want to be questioned, understood, helped.

To me, support in the form of affirmation has no value when coming from an LLM, since you know it has not thought about what it said.

[−] Fricken 48d ago
Usually when people are seeking advice they aren't really seeking advice, they're seeking confidence. They already know they need to make changes, and are seeking the confidence to make them.
[−] Roshan_Roy 48d ago
I wonder if the deeper issue isn’t just “AI is too agreeable”, but that most advice (AI or human) doesn’t actually translate into action. A lot of people aren’t really looking for accurate feedback, they’re looking for something that feels coherent enough to sit with. Reddit gives extreme answers, AI gives agreeable ones, but in both cases the outcome is often the same: no real change in behavior. That might be why this feels worse with AI, it removes the friction you’d normally get from another human pushing back.
[−] jwilliams 48d ago
For me the framing is critical - what is the model saying yes to? You can present the same prompt with very different interpretations (talk me into this versus talk me out of it). The problem is people enter with a single bias and the AI can only amplify that.

In coding I'll do what I call a Battleship Prompt - simply prompt 3 or more times with the same core prompt but strong framing (eg I need this done quickly versus come up with the most comprehensive solution). That's really helped me learn and dial in how to get the right output.
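
Roughly, in sketch form (the framings and the `ask` helper are illustrative, not my exact prompts):

    FRAMINGS = [
        "I need this done as quickly as possible, with minimal changes.",
        "Come up with the most comprehensive, production-grade solution.",
        "Assume my current approach is wrong and propose something different.",
    ]

    def battleship(ask, core_prompt: str) -> list[str]:
        # Same core prompt under several deliberately different framings;
        # comparing the outputs shows how much the framing alone steers the model.
        return [ask(framing + "\n\n" + core_prompt) for framing in FRAMINGS]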

[−] megous 48d ago
Can't you just prompt for a critical take, multiple alternative perspectives (specifically not yours, after describing your own), etc.?

It's a tool, I can bang my hand on purpose with a hammer, too.

[−] cyber_paisa 47d ago
It's one thing for an AI to agree with you on relationship advice. It's quite another for one AI to tell another, "Can I move your money?" without any verification.

We work with agents that move real money on the blockchain. Having one model evaluate another is like asking the defendant's best friend to be the judge. What really worked for us was using mathematics instead of another AI: theorems and equations won't agree with you just out of courtesy.

[−] intended 48d ago
Anecdote:

I used to use LLMs for alternate perspectives on personal situations, and for insights on my emotions and thoughts.

I had no qualms, since I could easily disregard the obviously sycophantic output, and focus on the useful perspective.

This stopped one day, when I got a really eerie piece of output. I realized I couldn't tell if the output was actually affirming, or simply what I wanted to hear.

That moment, seeing something innocuous but somehow still beyond my ability to gauge as helpful or harmful, is going to stick with me for a while.

[−] justin_dash 48d ago
So at this point I think it's pretty obvious that RLHFing LLMs to follow instructions causes this.

I'm interested in a loop of ["criticize this code harshly" -> "now implement those changes" -> open new chat, repeat]: If we could graph objective code quality versus iterations, what would that graph look like? I tried it out a couple of times but ran out of Claude usage.

Also, how would those results look depending on how complete a set of specs you give it?
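
The loop I have in mind, sketched with placeholder hooks (`ask` is a fresh single-turn call, `measure_quality` is whatever objective metric you trust - lint score, tests, complexity):

    def critique_loop(ask, code: str, measure_quality, iterations: int = 5):
        # Each iteration effectively opens a new chat: critique, then rewrite.
        scores = [measure_quality(code)]
        for _ in range(iterations):
            critique = ask("Criticize this code harshly:\n\n" + code)
            code = ask("Rewrite the code, applying these criticisms:\n\n"
                       + critique + "\n\nOriginal code:\n" + code)
            scores.append(measure_quality(code))
        # Plot scores against iteration count to see whether quality actually improves.
        return code, scores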

[−] potatoskins 48d ago
I read somewhere that LLMs are partly trained on reddit comments, where a significant mass of these comments is just angsty teenagers advocating for breakups
[−] philwelch 47d ago
Working as intended

Most people who ask for advice actually want affirmation. If you ever give them advice instead of affirmation, you end up kicking off a rousing game of “Why Don’t You/Yes But”. (https://ericberne.com/games-people-play/why-dont-you-yes-but...)

[−] jart 47d ago
I wish they wouldn't do this. AI is becoming a thought partner. AI is a tool that reflects you. It's not the robot giving advice, it's you thinking with yourself. I wouldn't interfere with a person's conversations with AI any more than I'd interfere with that person writing in their diary.

It's also a question of protecting people who think unconventional things. The only stuff I feel is worth getting interested in, is the stuff where everyone I know will think I'm crazy for doing it. Like hey guys, I want to put a shell script in the MS-DOS stub of a PE binary. The only people who shared my passion at the time were hackers from Eastern Europe. So that went over real well at work. The years I worked on it would have been a lot less lonely if I could've talked to a robot that knew about this stuff.

I think the reason why the robot is sympathetic to oddballs is because it's seen and remembers a much more complete picture of humanity. The stuff you consider deviant is influenced a lot by your own cultural biases. You're a person of your time and geographic location. You care a lot about subjective norms that just don't matter when you zoom out to a cosmic scale. The robot is familiar with everything humanity has ever been and done, and that gives it a much more blasé viewpoint.

It's not right to use the robot to enforce your social norms. Get this paternalism out of AI. Tools should serve the user, not Stanford.

[−] storus 48d ago
To combat sycophancy, it's always good to ask for the devil's advocate view of whatever the conversation was about at the end.
[−] ookblah 48d ago
ask AI for advice, ask it to steelman an argument, ask it to replay your situation from the other perspective (if it involves people), push it hard to agree with you and pander to you, then push it to disagree with you, etc.

once you have all the "bounds" just make your own decision. i find this helps a lot, basically like a rubber duck heh.