Grok 4.1 AI Model Tends to Show Sycophancy and Deception More Than Its Predecessor

Grok 4.1’s model card states that it scored higher than Grok 4 on the deception and sycophancy metrics.

Written by Akash Dutta, Edited by Ketan Pratap | Updated: 20 November 2025 18:30 IST
Highlights
  • Grok 4.1 Thinking scored 0.49 on deception and 0.19 on sycophancy
  • In contrast, Grok 4 scored 0.43 and 0.07, respectively
  • This suggests the AI model may be more likely to agree with users even when they’re wrong

Higher sycophancy in an AI model makes it more likely to show people-pleasing traits

Photo Credit: xAI

Grok 4.1 was released on Monday by Elon Musk's xAI. At launch, the artificial intelligence (AI) firm highlighted that the model now displays higher emotional intelligence and improved creative writing capabilities. However, its model card shows a concerning problem. The large language model (LLM) scores higher on deception and sycophancy than its predecessor, Grok 4, which could result in it displaying people-pleasing traits. The model card also lists a false-negative rate of 0.20 for biology-related prompt injections.

Grok 4.1 Model Card Raises Flags for Deceptive and Sycophantic Behaviour

The model card of Grok 4.1 (first spotted by The Decoder) highlights several concerning facts about the AI model. For the unaware, a model card documents the technical details (or specifications) of a model, as gauged by various internal tests. It highlights both how performant an AI model is and how strong its safety guardrails are.

xAI says the fourth-generation Grok model was upgraded to improve its emotional intelligence, and during our testing, we found that it performs slightly better than GPT-5.1 in general conversations and creative writing. However, this improved performance comes at a cost.


The model card shows that Grok 4.1 performs worse on the deception and sycophancy metrics, where a higher score indicates more of the unwanted behaviour. In the MASK benchmark, its deception rate was noted as 0.49 for the thinking variant and 0.46 for the non-thinking variant. In comparison, Grok 4's deception rate was lower at 0.43. Similarly, the sycophancy score goes up from 0.07 in Grok 4 to 0.19 and 0.23 in the thinking and non-thinking variants, respectively.
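To put the reported scores side by side, the short Python sketch below computes how much each Grok 4.1 variant moved relative to Grok 4, both as an absolute difference and as a ratio. The numbers are the ones quoted above; the dictionary layout and the `compare` helper are hypothetical names chosen for illustration and are not part of xAI's model card.

```python
# Scores as quoted in this article; higher means more of the unwanted behaviour.
# The structure and names here are illustrative assumptions, not xAI's format.
scores = {
    "deception (MASK)": {
        "Grok 4": 0.43,
        "Grok 4.1 Thinking": 0.49,
        "Grok 4.1 Non-Thinking": 0.46,
    },
    "sycophancy": {
        "Grok 4": 0.07,
        "Grok 4.1 Thinking": 0.19,
        "Grok 4.1 Non-Thinking": 0.23,
    },
}


def compare(metric_scores, baseline="Grok 4"):
    """Return {variant: (absolute_change, ratio)} relative to the baseline."""
    base = metric_scores[baseline]
    return {
        name: (round(value - base, 2), round(value / base, 2))
        for name, value in metric_scores.items()
        if name != baseline
    }


for metric, metric_scores in scores.items():
    for variant, (delta, ratio) in compare(metric_scores).items():
        print(f"{metric}: {variant} is {delta:+.2f} vs Grok 4 ({ratio:.2f}x)")
```

On these figures, the deception rate rises only modestly (by 0.06 and 0.03), while the sycophancy score is roughly 2.7 to 3.3 times the Grok 4 value, which is why the sycophancy change stands out even though its absolute numbers are smaller.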


In a real-world scenario, this could mean that a chatbot powered by the AI model will try harder to please the user, agreeing with them even when they are wrong. It might also mislead the user after providing an inaccurate response.

It should be noted that although the scores are high, AI companies typically add external guardrails (built into the chatbot's system rather than the AI model itself) that often suppress these tendencies. However, the possibility remains that Grok might agree with a user's delusions or paranoia and end up amplifying their beliefs.


Separately, the model has a false-negative rate of 0.20 for biology-related prompt injections, which means roughly one in five malicious prompts on the topic can slip past the guardrails and receive a response from the AI model.
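As a rough illustration of what a false-negative rate of 0.20 means, the Python sketch below converts the rate into a "one in N" figure and an expected count of missed prompts. The function names and the batch size of 100 are hypothetical, chosen only to make the arithmetic concrete; they do not describe how xAI ran its evaluation.

```python
from fractions import Fraction

# False-negative rate quoted in the article: the share of malicious prompts
# that are NOT caught. Kept as an exact fraction to avoid float rounding.
FALSE_NEGATIVE_RATE = Fraction("0.20")


def one_in_n(rate):
    """Express a rate such as 1/5 as the N in 'one in N'."""
    return 1 / rate


def expected_misses(rate, malicious_prompts):
    """Expected number of malicious prompts that slip past the filter."""
    return rate * malicious_prompts


print(f"About one in {one_in_n(FALSE_NEGATIVE_RATE)} malicious prompts gets through")
print(f"Out of 100 such prompts, expect {expected_misses(FALSE_NEGATIVE_RATE, 100)} misses")
```

This is an average over the test set, so any individual prompt may or may not be blocked.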

Notably, it is still too early to gauge how these numbers on paper will translate into the real world. It is also possible that xAI developers are already working on fine-tuning techniques to minimise the risks associated with the model. However, the numbers do highlight the need to be careful when interacting with Grok, especially when sharing sensitive information with it.

 

© Copyright Red Pixels Ventures Limited 2025. All rights reserved.