Anthropic Accuses DeepSeek and Other Chinese AI Firms of Model Distillation Attempts

Anthropic said it is investing heavily in defences designed to make distillation attacks harder to execute and easier to identify.

Written by Shaurya Tomer, Edited by Ketan Pratap | Updated: 24 February 2026 09:00 IST
Photo Credit: Anthropic

Anthropic recently inaugurated its first Indian office, located in Bengaluru

Highlights
  • Anthropic alleged DeepSeek targeted Claude’s reasoning model
  • Moonshot and MiniMax were also named in the report
  • Anthropic built new detection systems as part of the countermeasures
Anthropic on Tuesday accused China-based artificial intelligence (AI) companies, including DeepSeek, of attempting to extract knowledge from its AI systems using a technique known as distillation. The US-based AI firm said it detected activity consistent with large-scale model distillation attempts targeting its systems. Anthropic claims the effort was aimed at using outputs from its models to train competing AI systems, and it says it has taken steps to block and prevent such activity.

What Are Distillation Attacks?

Distillation is a technique in machine learning where a smaller “student” model is trained to replicate the outputs of a larger “teacher” model. This is commonly used to create lightweight versions of powerful systems that can run more efficiently, the company explained in a blog post.
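As an illustration of the student-teacher idea, the sketch below shows the textbook form of a distillation loss, where the student is penalised for diverging from the teacher's softened output probabilities. It is a minimal example and not taken from Anthropic's blog post; the function names and the temperature value are assumptions chosen for illustration. When only generated text is available rather than raw scores, a student would typically be trained on the responses themselves instead.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    The temperature of 2.0 is an illustrative default, not a recommended value.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# A student that matches the teacher exactly incurs no loss;
# one that ranks the classes in the opposite order incurs a positive loss.
matching = distillation_loss([2.0, 1.0, 0.0], [2.0, 1.0, 0.0])
reversed_order = distillation_loss([2.0, 1.0, 0.0], [0.0, 1.0, 2.0])
```

Training then amounts to adjusting the student's parameters to reduce this loss across many inputs.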

Without explicit permission, however, distillation can become a form of intellectual property (IP) extraction. During a distillation attack, a party repeatedly queries a proprietary AI model through its public interface or API, collects large volumes of responses, and then uses that data to train a new model that mimics the original system's behaviour, as per Anthropic.

The AI firm explained that this type of activity can allow competitors to benefit from the performance, alignment work, and safety guardrails of frontier models without incurring the same research and training costs.

What Anthropic Alleged About DeepSeek and Others

Anthropic said it discovered industrial-scale campaigns by three AI laboratories — DeepSeek, Moonshot, and MiniMax — which illicitly attempted to “steal” Claude's capabilities. The AI firm also provided detailed breakdowns of three separate operations it says it identified.

DeepSeek was accused of carrying out more than 150,000 exchanges targeting Claude's reasoning capabilities across diverse tasks, including rubric-based grading that turned Claude into a reward model for reinforcement learning. Anthropic also alleged that DeepSeek generated censorship-safe alternatives to politically sensitive queries, most likely to train its own systems to avoid restricted topics.

According to Anthropic, DeepSeek used synchronised traffic across multiple accounts, with identical patterns, shared payment methods, and coordinated timing that suggested deliberate load balancing to increase throughput and evade detection. However, Anthropic said request metadata allowed it to trace this activity to specific researchers at the lab.
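To make the signals described above more concrete, the following is a toy sketch of how shared payment methods and coordinated timing could be checked across accounts. It is not Anthropic's detection system, whose details are not public; the record layout, function names, and thresholds are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def flag_shared_payment_clusters(accounts, min_accounts=3):
    """Group account IDs by payment fingerprint and keep groups of min_accounts or more.

    `accounts` is a hypothetical list of (account_id, payment_fingerprint) pairs.
    """
    clusters = defaultdict(set)
    for account_id, payment_fingerprint in accounts:
        clusters[payment_fingerprint].add(account_id)
    return {fp: sorted(ids) for fp, ids in clusters.items() if len(ids) >= min_accounts}

def synchronised_fraction(times_a, times_b, window_seconds=1.0):
    """Fraction of account A's requests with a request from account B within the window."""
    if not times_a:
        return 0.0
    hits = sum(
        1 for t in times_a
        if any(abs(t - u) <= window_seconds for u in times_b)
    )
    return hits / len(times_a)

# Illustrative data: three accounts share one payment fingerprint, one does not.
accounts = [("a1", "card_x"), ("a2", "card_x"), ("a3", "card_x"), ("b1", "card_y")]
suspicious = flag_shared_payment_clusters(accounts)

# Two of account A's three requests have a counterpart from account B within one second.
overlap = synchronised_fraction([0.0, 10.0, 20.0], [0.2, 10.5, 40.0])
```

In practice such heuristics would only be one input among many, since legitimate organisations also share billing details and run scheduled jobs at the same time.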

The company also accused Moonshot AI of conducting over 3.4 million exchanges focused on agentic reasoning, coding, tool use, computer-use agent development, and computer vision tasks. Anthropic claims Moonshot employed hundreds of fraudulent accounts across multiple access pathways to obscure coordination.

Lastly, MiniMax is alleged to have conducted more than 13 million exchanges centred on agentic coding and tool orchestration. According to Anthropic, attribution was made using request metadata and infrastructure indicators. The AI firm claimed it detected this campaign while it was still active, before MiniMax's in-training model was released.

Anthropic's Response

To prevent future attacks, Anthropic said it is investing heavily in defences designed to make distillation attacks harder to execute and easier to identify. It claims to have built multiple detection systems, including classifiers and behavioural fingerprinting tools, to flag patterns consistent with distillation in API traffic.

The company is also sharing technical indicators with other AI labs, cloud providers, and relevant authorities in an effort to highlight the issue of distillation. It has also strengthened access controls, particularly around educational accounts, security research programmes, and startup pathways that it says are often exploited to create fraudulent accounts.

Lastly, Anthropic is developing countermeasures at the product, API, and model levels to reduce the effectiveness of its outputs for illicit distillation, without hampering the customer experience. The company said it published the details to make the evidence available to stakeholders with an interest in safeguarding advanced AI systems.

Shaurya Tomer is a Sub Editor at Gadgets 360 with 2 years of experience across a diverse spectrum of topics, with a particular focus on smartphones, gadgets and the ever-evolving landscape of artificial intelligence (AI).
