Anthropic Accuses DeepSeek, Chinese AI Firms of Data Distillation Amid Online Backlash

US artificial intelligence company Anthropic has accused several Chinese AI developers, including DeepSeek, MiniMax and Moonshot AI, of improperly extracting data from its Claude chatbot to improve their own AI models.

In a recent blog post, the company claimed the firms conducted large-scale interactions with Claude using thousands of accounts to gather outputs for training purposes. According to Anthropic, these conversations generated millions of responses that were later used to fine-tune competing AI systems.

The practice, often called model distillation, involves training a smaller or new model using outputs from a more advanced system. While distillation can be legitimate under controlled conditions, Anthropic argues it becomes unlawful when performed without permission.

Allegations of large-scale data extraction

Anthropic said it identified “industrial-scale campaigns” involving roughly 24,000 accounts that produced over 16 million exchanges with Claude.

The company alleges the extracted outputs focused on high-value areas such as coding, reasoning, and tool usage — capabilities that define the performance of advanced AI models.

Anthropic warned that models built through unauthorised distillation may lack safeguards, potentially allowing harmful capabilities to spread without protective controls.

Internet reaction turns critical

Despite the seriousness of the allegations, much of the online reaction has been critical of Anthropic. Social media users and technology commentators argued that data scraping and large-scale data collection are common across the AI industry.

Some users referenced past legal disputes involving Anthropic over training data usage, while others claimed AI companies broadly rely on publicly available or scraped information.

Tech entrepreneur Elon Musk also criticised Anthropic, alleging hypocrisy and arguing that most AI systems are trained using vast datasets collected from the internet.

AI industry practices under scrutiny

The controversy highlights broader debates about training data ethics and intellectual property in artificial intelligence development.

Critics note that modern AI models require enormous datasets, raising questions about consent, copyright, and transparency. Supporters of Anthropic's position argue that distillation without safeguards could weaken safety protections and enable misuse.

Geopolitical tensions add to AI rivalry

The dispute comes amid intensifying technological competition between the United States and China, where advanced AI is increasingly viewed as a national security priority.

Recent reports have also raised questions about Chinese access to advanced semiconductor technology and the rapid development of next-generation AI models.

Chinese AI companies have gained global attention by releasing open-weight models, allowing developers worldwide to use and adapt them. In contrast, many US AI systems remain proprietary, creating competitive and commercial tensions.

Growing stakes in the global AI race

With new models expected from Chinese firms and continued advances from US developers, the race to build powerful AI systems is accelerating.

The dispute underscores the urgent need for global standards on data usage, model training practices, and AI safety — issues likely to shape the future of the industry.
