Anthropic Releases Claude 4 Series AI Models With Improved Coding Capability and Tool Use

Anthropic said Claude Sonnet 4 achieved a state-of-the-art (SOTA) score of 72.7 percent on the SWE-Bench benchmark.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 23 May 2025 12:18 IST

Photo Credit: Anthropic

Both Claude 4 models feature two modes: near-instant responses and an Extended Thinking mode

Highlights
  • Anthropic also made Claude Code generally available
  • Claude Sonnet 4 is available to those on the free tier
  • Opus 4 comes with improvements in memory and tool use

Anthropic introduced its Claude 4 artificial intelligence (AI) models at its inaugural developer conference on Thursday. The San Francisco-based AI firm unveiled the Claude Opus 4 and Claude Sonnet 4 models, and announced new capabilities including Extended Thinking with tool use. Opus 4 is said to be state-of-the-art (SOTA) in coding, tool use, and writing. Additionally, Claude Code is now generally available, with beta extensions for VS Code and JetBrains. A Claude Code software development kit (SDK) is also available in beta on GitHub.

Anthropic Unveils Claude 4 AI Models

In a newsroom post, the AI firm detailed the new models as well as the new features it is rolling out across its chatbot and application programming interface (API). Anthropic's latest large language models (LLMs) put a heavy focus on coding capabilities and agentic functions.

Both Opus 4 and Sonnet 4 are hybrid models with two modes: near-instant responses and Extended Thinking for deeper reasoning. Opus 4 is the company's flagship-tier AI model. Calling it “the best coding model in the world,” Anthropic claimed that it scored 72.5 percent on the SWE-Bench and 43.2 percent on the Terminal-Bench benchmarks. Both of these benchmarks measure the coding capabilities of a model.

Claude 4 models' performance on the SWE-Bench
Photo Credit: Anthropic

Similarly, Claude Sonnet 4 is said to be a significant improvement over its predecessor. Based on internal evaluation, the company claimed it scored 72.7 percent on SWE-Bench (SOTA). While it trails Opus 4 in other domains, Anthropic says the model balances performance and efficiency better than the flagship LLM.

Apart from performance-based improvements, Claude Opus 4 can maintain long-term task awareness thanks to improvements in its memory. Anthropic says it has also reduced the tendency of the models to take shortcuts or exploit loopholes to complete a task. During extended thinking, both models can use tools. This allows the models to alternate between native reasoning and gathering external information (such as through web search) to improve responses. Other improvements include the ability to use tools in parallel and greater prompt adherence.

Currently, the Opus 4 and Sonnet 4 models are available in both modes to Claude Pro, Max, Team, and Enterprise subscribers. Sonnet 4 is also available to free users. Additionally, developers can access these LLMs via the Anthropic API, as well as on Amazon Bedrock and Google Cloud's Vertex AI. The company said pricing remains the same as for the previous generation.

Opus 4 will cost developers $15 (roughly Rs. 1,290) per million input tokens and $75 (roughly Rs. 6,440) per million output tokens. Sonnet 4, on the other hand, is priced at $3 (roughly Rs. 260) per million input tokens and $15 (roughly Rs. 1,290) per million output tokens.
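
To make the per-token pricing concrete, here is a minimal Python sketch that estimates the cost of a workload in US dollars from the rates quoted above. The dictionary keys ("opus-4", "sonnet-4") and the function name are illustrative labels chosen for this example, not official API model identifiers, and the calculation assumes only the flat input and output rates, with no discounts or other charges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """US dollar price per one million tokens."""
    input_per_million: float
    output_per_million: float


# Rates as quoted in the article. The keys are hypothetical labels
# for this sketch, not official model identifiers.
PRICES = {
    "opus-4": Pricing(input_per_million=15.0, output_per_million=75.0),
    "sonnet-4": Pricing(input_per_million=3.0, output_per_million=15.0),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in US dollars for one workload."""
    price = PRICES[model]
    total = (input_tokens * price.input_per_million
             + output_tokens * price.output_per_million)
    return total / 1_000_000


if __name__ == "__main__":
    # Example workload: 200,000 input tokens and 50,000 output tokens.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 200_000, 50_000):.2f}")
```

For the example workload of 200,000 input tokens and 50,000 output tokens, this works out to $6.75 on Opus 4 and $1.35 on Sonnet 4, a fivefold difference that follows directly from the quoted rates.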

Beyond the new AI models, Anthropic also announced new features and made Claude Code generally available. First introduced in February as a research preview, it is an agentic coding tool that can perform a wide range of coding tasks. Beta extensions of the tool are now available in VS Code and JetBrains. Additionally, the company is releasing a Claude Code software development kit (SDK), which is available in beta on GitHub.

