Anthropic Researchers Make Major Breakthrough In Understanding How an AI Model Thinks

Anthropic researchers found evidence of AI thinking patterns by locating interpretable concepts linked to computational circuits.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 28 March 2025 17:46 IST
Highlights
  • Anthropic released two papers detailing the methodology
  • The researchers found that AI thinks in a shared language space
  • Claude is said to plan responses many words ahead

AI models are also capable of intentional hallucination if the questions asked are difficult

Photo Credit: YouTube/Anthropic

Anthropic researchers published two new papers on Thursday, detailing the methodology and findings of their work on how an artificial intelligence (AI) model thinks. The San Francisco-based AI firm developed techniques to monitor the decision-making process of a large language model (LLM) and understand why it produces a particular response and structure rather than another. The company highlighted that this aspect of AI models remains a black box, as even the scientists who develop the models do not fully understand how an AI makes conceptual and logical connections to generate outputs.

Anthropic Research Sheds Light on How an AI Thinks

In a newsroom post, the company shared details of a recent study on “tracing the thoughts of a large language model”. Despite building chatbots and AI models, scientists and developers do not directly control the computational circuits a system forms to produce an output.

To open up this “black box,” Anthropic researchers published two papers. The first investigates the internal mechanisms used by Claude 3.5 Haiku using a circuit tracing methodology, and the second describes the techniques used to reveal computational graphs in language models.


The questions the researchers set out to answer included which language Claude “thinks” in, how it generates text, and how it reasons. Anthropic said, “Knowing how models like Claude think would allow us to have a better understanding of their abilities, as well as help us ensure that they're doing what we intend them to.”


Based on the insights shared in the paper, the answers to the abovementioned questions were surprising. The researchers believed that Claude would have a preference for a particular language in which it thinks before it responds. However, they found that the AI chatbot thinks in a “conceptual space that is shared between languages.” This means that its thinking is not influenced by a particular language, and it can understand and process concepts in a sort of universal language of thought.

While Claude is trained to write one word at a time, researchers found that the AI model plans its response many words ahead and can adjust its output to reach that destination. Researchers found evidence of this pattern while prompting the AI to write a poem and noticing that Claude first decided the rhyming words and then formed the rest of the lines to make sense of those words.


The research also claimed that, on occasion, Claude can reverse-engineer logical-sounding arguments to agree with the user instead of following logical steps. This intentional “hallucination” occurs when an incredibly difficult question is asked. Anthropic said its tools can be useful for flagging concerning mechanisms in AI models, as they can identify when a chatbot provides fake reasoning in its responses.

Anthropic highlighted that there are limitations in this methodology. In this study, only prompts of tens of words were given, and still, it took a few hours of human effort to identify and understand the circuits. Compared to the capabilities of LLMs, the research endeavour only captured a fraction of the total computation performed by Claude. In the future, the AI firm plans to use AI models to make sense of the data.

 
