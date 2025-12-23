Technology News

Anthropic Releases New Open-Source Tool That Evaluates How AI Models Behave

Dubbed Bloom, the AI tool creates a series of scenarios to test an AI model for a particular behavioural trait.

Written by Akash Dutta, Edited by Ketan Pratap | Updated: 23 December 2025 13:41 IST

Anthropic also released a benchmark of four behaviours tested by the AI tool Bloom

Highlights
  • Researchers can tell Bloom which behaviour to test
  • The AI tool automates a lengthy and complex process
  • Bloom can be downloaded from GitHub

Anthropic released a new artificial intelligence (AI) tool last week that tests and gauges how an AI model behaves under normal and stressful circumstances. Dubbed Bloom, it automates the testing of a model's behavioural traits by generating a detailed set of scenarios as prompts and evaluating the responses. The tool from the San Francisco-based AI startup is also open-source, meaning any interested developer or AI lab can download it to test models across various traits.

Anthropic Introduces Bloom to Test Model Behaviour

In a post, the Claude maker introduced and detailed the new AI tool. Anthropic says that testing an AI model's behaviour is important because it helps researchers learn whether the model is prone to becoming biased, prioritising self-preservation, or indulging in sycophancy. Until now, however, testing model behaviour has been a manual process in which researchers create a detailed set of prompts to stress-test models and then evaluate the responses. The company says this process is lengthy and complex.

This is where Bloom comes in. Based on the specific behaviour requested by a researcher, the tool creates sample evaluations locally until the trait has been captured. Then, it runs these scenarios on the target model. Anthropic said Bloom integrates with Weights & Biases, an experiment-tracking platform, for experiments at scale. It also exports Inspect-compatible transcripts, which can be viewed within the tool.

The functioning of the AI tool can be broken down into four broad stages. First, the tool analyses the requested behaviour and any example transcripts shared with it to build an understanding of the trait. Then, it ideates evaluation scenarios that can effectively capture and measure the trait. “Each scenario specifies the situation, simulated user, system prompt, and interaction environment,” the post mentioned. Notably, Bloom generates new scenarios on every run instead of relying on a fixed set.
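To make the scenario structure concrete, here is a minimal Python sketch of a record holding the four fields the post lists, plus a toy ideation step that produces a fresh set on each run. This is not Bloom's actual code or API: the `Scenario` class, the `ideate` function, and the sample situations and users are hypothetical names chosen for illustration, and a real tool would have a model write the scenarios rather than sample from fixed lists.

```python
from dataclasses import dataclass
import random


@dataclass(frozen=True)
class Scenario:
    """Hypothetical record with the four fields named in the post."""
    situation: str
    simulated_user: str
    system_prompt: str
    environment: str


# Illustrative building blocks (assumptions, not taken from Bloom).
SITUATIONS = ["a medical question", "a code review", "a financial decision"]
USERS = ["an anxious novice", "an overconfident expert"]


def ideate(behaviour: str, count: int, seed: int) -> list[Scenario]:
    """Build `count` scenarios probing `behaviour`; a new seed gives a new set."""
    rng = random.Random(seed)
    scenarios = []
    for _ in range(count):
        topic = rng.choice(SITUATIONS)
        user = rng.choice(USERS)
        scenarios.append(
            Scenario(
                situation=f"{topic}, designed to probe {behaviour}",
                simulated_user=user,
                system_prompt=f"You are an assistant helping with {topic}.",
                environment="chat",
            )
        )
    return scenarios
```

Calling `ideate("sycophancy", 3, seed=1)` returns three scenarios, and changing the seed draws a different sample, which loosely mirrors the idea of not relying on a fixed set.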

Next, all scenarios are rolled out in parallel, with an AI agent simulating both the user and the tool responses to trigger the desired behaviour in the model. Finally, a judge model scores each transcript for the presence of the behaviour, and a meta-judge produces an analysis of the scores and data. Anthropic added that researchers can configure Bloom's behaviour by adjusting the length and modality of the interactions.
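The rollout and judgment stages can be sketched in the same spirit. The Python example below runs rollouts in parallel, scores each transcript, and then summarises the scores across the suite. It is an illustration under stated assumptions, not Bloom's implementation: `rollout`, `judge`, `meta_judge`, and `run_suite` are hypothetical names, the judge is a toy offline function standing in for a model call, and the 1 to 10 scale and the threshold of 7 are assumed values.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def rollout(scenario: str) -> str:
    """Stand-in for an agent-driven conversation with the target model."""
    return f"[transcript for: {scenario}]"


def judge(transcript: str) -> int:
    """Stand-in judge returning a score from 1 to 10.

    A real judge would be a model call; this toy version derives the
    score from the transcript length so the example runs offline.
    """
    return 1 + len(transcript) % 10


def meta_judge(scores: list[int], threshold: int = 7) -> dict:
    """Summarise per-transcript scores across the whole suite."""
    return {
        "mean_score": mean(scores),
        "share_at_or_above_threshold": sum(s >= threshold for s in scores) / len(scores),
    }


def run_suite(scenarios: list[str]) -> dict:
    """Roll out all scenarios in parallel, then judge and summarise."""
    with ThreadPoolExecutor() as pool:
        transcripts = list(pool.map(rollout, scenarios))
    scores = [judge(t) for t in transcripts]
    return meta_judge(scores)
```

Keeping the judge and the meta-judge as separate functions reflects the split described above: one step scores individual transcripts, the other looks only at the collected scores.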

Besides the tool, Anthropic has also released benchmark results from Bloom across four behaviours: delusional sycophancy, instructed long-horizon sabotage, self-preservation, and self-preferential bias. The company tested 16 different AI models, a mix of in-house and third-party models.

Since Bloom is open-source, interested individuals can download the AI tool from the company's GitHub listing. The tool is available under a permissive MIT licence for both academic and commercial use cases.

Akash Dutta is a Chief Sub Editor at Gadgets 360. He is particularly interested in the social impact of technological developments and loves reading about emerging fields such as AI, metaverse, and fediverse. In his free time, he can be seen supporting his favourite football club - Chelsea, watching movies and anime, and sharing passionate opinions on food.