Anthropic Tipped to Release Claude 4.5 Opus Soon, Said to Be Focused on Resisting Jailbreaks

Anthropic is tipped to have sent a new AI model to red-teamers for external safety evaluation.

Written by Akash Dutta, Edited by Ketan Pratap | Updated: 29 October 2025 13:32 IST
Highlights
  • The AI model is said to be codenamed Neptune V6
  • Anthropic is said to have issued a 10-day challenge for the model
  • The challenge is focused on finding universal jailbreaks, per the leak

Anthropic has already released Claude 4.5 Sonnet and Claude 4.5 Haiku models

Photo Credit: Unsplash/Markus Winkler

Anthropic might be close to releasing Claude 4.5 Opus, the frontier model of its Claude 4.5 family. According to a leak, the San Francisco-based artificial intelligence (AI) firm has shared a new large language model (LLM) with red-teamers. The model is said to be codenamed Neptune V6, and the evaluation reportedly focuses on resisting jailbreak attempts. Notably, the company has already released the other two models in the series, Claude 4.5 Sonnet and Claude 4.5 Haiku.

Anthropic Shares a New AI Model With Red-Teamers

In a post on X (formerly known as Twitter), Tibor Blaho, Lead Engineer at AIPRM, claimed that Anthropic sent the Neptune V6 LLM to red-teamers on Tuesday. The tipster also said that the AI firm has issued a 10-day challenge to the external safety evaluators: those who find confirmed universal jailbreaks within that window will receive extra bonuses.

If the claims are true, Anthropic appears to be concentrating on hardening its purported upcoming AI model against jailbreaks. The focus is notable, given that Anthropic's models are already considered to be among the safest when it comes to external attacks. The incentive suggests that the company is trying to surface more creative jailbreak techniques that can break the model, so that it can patch them and future-proof the model before release.


For the unaware, a universal jailbreak (in the context of AI models) is a general-purpose trick or prompt that reliably gets a large language model to ignore its safety rules and produce responses it would normally refuse, across a wide range of requests. Some of these jailbreaks also carry over between different models. Rather than being tailored to one specific request or system, they rely on patterns that exploit common weaknesses.


Jailbreaks work by confusing or persuading the model with clever framing. Common examples include asking it to roleplay, embedding instructions inside code or fake metadata, and appending unusual suffixes that slip past filters. They do not need access to the model's internals; many are simple text prompts or formats that the model interprets differently from the way its safety layers do.

Notably, Anthropic released Claude 4.5 Sonnet in September and made it available to all users, including those on the free tier. Earlier this month, it also released Claude 4.5 Haiku, the company's low-latency model aimed at near real-time responses.

 
