Anthropic said it is investing heavily in defences designed to make distillation attacks harder to execute and easier to identify.
Anthropic on Tuesday accused China-based artificial intelligence (AI) companies, including DeepSeek, of attempting to extract knowledge from its AI systems using a technique known as distillation. The US-based AI firm said it detected activity consistent with large-scale model distillation attempts targeting its systems. Anthropic claims the effort was aimed at using outputs from its models to train competing AI systems, and it says it has taken steps to block and prevent such activity.
Distillation is a technique in machine learning where a smaller “student” model is trained to replicate the outputs of a larger “teacher” model. This is commonly used to create lightweight versions of powerful systems that can run more efficiently, the company explained in a blog post.
Without explicit permission, however, distillation can become a form of intellectual property (IP) extraction. During a distillation attack, a party repeatedly queries a proprietary AI model through its public interface or API, collects large volumes of responses, and then uses that data to train a new model that mimics the original system's behaviour, as per Anthropic.
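The mechanics described above can be illustrated with a toy sketch: a "student" fitted purely to a "teacher's" query/response pairs, with no access to the teacher's internals. This is an illustrative simplification (a linear model fitted by least squares), not a depiction of Anthropic's systems or any real attack.

```python
# Toy sketch of model distillation: the student only ever sees the
# teacher's outputs, mirroring how an attacker queries a public API.

def teacher(x: float) -> float:
    """Stand-in for a proprietary model: internals hidden, outputs observable."""
    return 3.0 * x + 1.0

# Step 1: query the teacher repeatedly and collect the responses.
queries = [float(i) for i in range(100)]
responses = [teacher(x) for x in queries]

# Step 2: train a student on the collected (query, response) pairs.
# Here the "training" is a simple least-squares linear fit.
n = len(queries)
mean_x = sum(queries) / n
mean_y = sum(responses) / n
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(queries, responses))
    / sum((x - mean_x) ** 2 for x in queries)
)
intercept = mean_y - slope * mean_x

def student(x: float) -> float:
    """Mimics the teacher using only its observed behaviour."""
    return slope * x + intercept
```

On unseen inputs the student now reproduces the teacher's behaviour almost exactly, which is the point of contention: the imitator never paid the cost of building the original.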
The AI firm explained that this type of activity can allow competitors to benefit from the performance, alignment work, and safety guardrails of frontier models without incurring the same research and training costs.
Anthropic said it discovered industrial-scale campaigns by three AI laboratories — DeepSeek, Moonshot, and MiniMax — which illicitly attempted to “steal” Claude's capabilities. The AI firm also provided detailed breakdowns of three separate operations it says it identified.
DeepSeek was accused of carrying out more than 150,000 exchanges targeting Claude's reasoning capabilities across diverse tasks, including rubric-based grading that turned Claude into a reward model for reinforcement learning. Anthropic also alleged that DeepSeek generated censorship-safe alternatives to politically sensitive queries, most likely to train its own systems to avoid restricted topics.
According to Anthropic, DeepSeek used synchronised traffic across multiple accounts, with identical patterns, shared payment methods, and coordinated timing that suggested deliberate load balancing to increase throughput and evade detection. However, request metadata allowed the company to trace these activities to specific researchers at the lab.
The company also accused Moonshot AI of conducting over 3.4 million exchanges focused on agentic reasoning, coding, tool use, computer-use agent development, and computer vision tasks. Anthropic claims Moonshot employed hundreds of fraudulent accounts across multiple access pathways to obscure coordination.
Lastly, MiniMax is alleged to have conducted more than 13 million exchanges centred on agentic coding and tool orchestration. According to Anthropic, attribution was made using request metadata and infrastructure indicators. The AI firm claimed it detected this campaign while it was still active, before MiniMax's in-training model was released.
To prevent future attacks, Anthropic said it is investing heavily in defences designed to make distillation attacks harder to execute and easier to identify. The company said it has built multiple detection systems, including classifiers and behavioural fingerprinting tools, to flag patterns consistent with distillation in API traffic.
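Anthropic has not published how its detection works, but the coordinated, load-balanced traffic it describes hints at the kind of signal such tools could look for. The sketch below is purely hypothetical: it groups accounts whose inter-request timing histograms are identical, a crude stand-in for behavioural fingerprinting; account names and the bucketing scheme are invented for illustration.

```python
# Hypothetical behavioural-fingerprinting sketch (not Anthropic's tooling).
# Accounts whose request cadences match exactly are grouped together,
# echoing the "synchronised traffic across multiple accounts" described
# in the reporting.
from collections import Counter

def fingerprint(timestamps, bucket=10):
    """Summarise an account's traffic as a histogram of inter-request
    gaps, rounded into `bucket`-second bins."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return frozenset(Counter(round(g / bucket) for g in gaps).items())

def find_synchronised_accounts(traffic):
    """Return groups of accounts sharing an identical gap histogram --
    a crude proxy for coordinated, load-balanced querying."""
    groups = {}
    for account, times in traffic.items():
        groups.setdefault(fingerprint(times), []).append(account)
    return [accounts for accounts in groups.values() if len(accounts) > 1]

# Invented example traffic (timestamps in seconds):
traffic = {
    "acct_a": [0, 60, 120, 180, 240],   # metronomic 60 s cadence
    "acct_b": [5, 65, 125, 185, 245],   # same cadence, offset start
    "acct_c": [0, 7, 300, 310, 900],    # organic-looking traffic
}
flagged = find_synchronised_accounts(traffic)
```

A real system would combine many such signals (payloads, payment data, per-model classifiers) rather than timing alone; the point is only that coordinated automation leaves statistical fingerprints that individual accounts cannot easily hide.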
The company is also sharing technical indicators with other AI labs, cloud providers, and relevant authorities in an effort to highlight the issue of distillation. It has also strengthened access controls, particularly around educational accounts, security research programmes, and startup pathways that it says are often exploited to create fraudulent accounts.
Lastly, Anthropic is developing countermeasures at the product, API, and model levels to reduce the effectiveness of its outputs for illicit distillation, without hampering the customer experience. The company said it published the details to make the evidence available to stakeholders with an interest in safeguarding advanced AI systems.