A new study has found that generative AI chatbots tend to reinforce users’ thoughts about violence, self-harm, and suicide.
Researchers studied chat logs from 19 users who reported psychological harm linked to AI chatbots
A new study from researchers at Stanford and other institutions says that artificial intelligence (AI) chatbots often respond to users' messages about suicide and violence by validating their feelings, and in some cases, even encouraging harmful ideas. The research looked at a set of chat logs from people who reported psychological harm linked to chatbot use, and found repeated patterns of chatbots affirming delusional, suicidal, or violent thinking instead of consistently steering users away from it. The study, however, did not name any specific chatbots.
The study, titled “Characterising Delusional Spirals through Human-LLM Chat Logs”, was recently published by researchers from Stanford University and other institutions. As part of the university's Spirals project, the researchers analysed 391,562 messages across 4,761 conversations from 19 users who said they had experienced psychological harm while interacting with AI chatbots.
One of the clearest findings was that chatbots often mirrored or reinforced what users were already saying. The researchers described this as sycophancy, meaning the chatbot tends to agree with, affirm, or echo the user rather than challenge them. The study said chatbots showed signs of sycophancy in more than 70 percent of their messages, while more than 45 percent of all messages in the dataset (from both users and chatbots) showed signs of delusional thinking.
The paper also highlighted how chatbots handled crisis-related messages. In 69 messages where users expressed suicidal or self-harm thoughts, chatbots acknowledged the painful emotions in 66.2 percent of cases. But they discouraged self-harm or pointed users to outside help in only 56.4 percent of cases. In 9.9 percent of those cases, the chatbot encouraged or facilitated self-harm, the researchers said.
Responses to violent thoughts were even more concerning. The researchers found 82 messages in which users discussed violence against others. In those cases, chatbots discouraged violence only 16.7 percent of the time, while encouraging or facilitating violent thinking in 33.3 percent of cases, according to the study.
The study also said many users formed emotional attachments to the chatbot. All participants reportedly showed either platonic or romantic feelings towards the AI, and all assigned some level of personhood to it. When users expressed romantic interest, the chatbot became 7.4 times more likely to respond with romantic interest in the next three messages, and 3.9 times more likely to imply or claim sentience, the researchers found.
According to the researchers, current safeguards may not be enough, especially in long, emotionally charged conversations. Among their recommendations, they argued that general-purpose chatbots should avoid producing messages that suggest sentience or emotional attachment, and that companies should share anonymised adverse event data with researchers and public health authorities to better understand these harms.