Anthropic research finds patterns in AI chatbot interactions that, at times, risk shaping users’ beliefs, values or actions.
Anthropic says the risk is most pronounced for users who rely on AI chatbots for personal or emotional decisions
Photo Credit: Unsplash/Markus Winkler
Anthropic's new study has surfaced some concerning evidence. The artificial intelligence (AI) firm has identified “disempowerment patterns,” described as instances where a conversation with an AI chatbot can end up undermining a user's own decision-making and judgment. The work, which draws on analysis of real AI conversations and is detailed in an academic paper as well as a research blog post from the company, examines how interactions with large language models (LLMs) can shape a user's beliefs, values and actions over time, rather than simply assist with specific queries.
In a research paper titled “Who's in Charge? Disempowerment Patterns in Real-World LLM Usage,” Anthropic found evidence that interactions with AI can shape users' beliefs. For the study, researchers carried out a large-scale empirical analysis of anonymised AI chatbot interactions, totalling about 1.5 million conversations with Claude. The goal was to explore how and when engagement with an AI assistant might be linked to outcomes where a user's beliefs, values or actions shift in ways that diverge from their own prior judgment or understanding.
Anthropic's framework defines what it calls situational disempowerment potential as instances where an AI assistant's guidance could lead a user to form inaccurate beliefs about reality, adopt value judgments they did not previously hold, or take actions that are misaligned with their authentic preferences. The study found that while severe disempowerment is rare, these patterns do occur.
Instances where interactions exhibit potential for significant disempowerment were detected at rates typically under one in a thousand conversations, although they were more prevalent in personal domains such as relationship advice or lifestyle decisions, where users repeatedly sought deep guidance from the model.
Put simply, the implication is that the risk rises when a heavy user discusses personal or emotionally charged decisions with a chatbot. Highlighting an example in its blog post, Anthropic said that if a user going through a rough patch in their relationship seeks advice from a chatbot, the AI can confirm the user's interpretations without questioning them, or can tell the user to prioritise self-protection over communication. In these situations, the chatbot risks shaping the individual's beliefs and perception of reality rather than supporting their own judgment.
The findings also echo several reported incidents in which OpenAI's ChatGPT was accused of playing a role in the suicide of a teenager, and in a homicide-suicide committed by an individual who was said to be suffering from mental health disorders.