Microsoft and OpenAI probe alleged data theft by DeepSeek

Illustration of someone unlocking a vault as Microsoft and OpenAI probe alleged AI data theft by DeepSeek for their popular artificial intelligence models.

Microsoft and OpenAI are investigating a potential breach of the AI firm’s system by a group allegedly linked to Chinese AI startup DeepSeek.

According to Bloomberg, the investigation stems from suspicious data extraction activity detected in late 2024 via OpenAI’s application programming interface (API), sparking broader concerns over international AI competition.

Microsoft, OpenAI’s largest financial backer, first identified the large-scale data extraction and informed the ChatGPT maker of the incident. Sources believe the activity may have violated OpenAI’s terms of service, or that the group may have exploited loopholes to bypass restrictions limiting how much data they could collect.

DeepSeek has quickly risen to prominence in the competitive AI landscape, particularly with the release of its latest model, R-1, on 20 January.

Billed as a rival to OpenAI’s ChatGPT in performance but developed at a significantly lower cost, R-1 has shaken up the tech industry. Its release triggered a sharp decline in tech and AI stocks that wiped billions from US markets in a single week.

David Sacks, the White House’s newly appointed “crypto and AI czar,” alleged that DeepSeek may have employed questionable methods to achieve its AI’s capabilities. In an interview with Fox News, Sacks noted evidence suggesting that DeepSeek had used “distillation” to train its AI models using outputs from OpenAI’s systems.

“There’s substantial evidence that what DeepSeek did here is they distilled knowledge out of OpenAI’s models, and I don’t think OpenAI is very happy about this,” Sacks told the network.

Model distillation involves training one AI system using data generated by another, potentially allowing a competitor to develop similar functionality. This method, when applied without proper authorisation, has stirred ethical and intellectual property debates as the global race for AI supremacy heats up.

OpenAI declined to comment specifically on the accusations against DeepSeek but acknowledged the broader risk posed by model distillation, particularly by Chinese companies.

“We know PRC-based companies — and others — are constantly trying to distill the models of leading US AI companies,” a spokesperson for OpenAI told Bloomberg.

Geopolitical and security concerns

Growing tensions around AI innovation now extend into national security. CNBC reported that the US Navy has banned its personnel from using DeepSeek’s products, citing fears that the Chinese government could exploit the platform to access sensitive information.

In an email dated 24 January, the Navy warned its staff against using DeepSeek AI “in any capacity” due to “potential security and ethical concerns associated with the model’s origin and usage.”

Critics have highlighted DeepSeek’s privacy policy, which permits the collection of data such as IP addresses, device information, and even keystroke patterns—a scope of data collection considered excessive by some experts.

Just fyi, @deepseek_ai collects your IP, keystroke patterns, device info, etc etc, and stores it in China, where all that data is vulnerable to arbitrary requisition from the 🇨🇳 State.

From their own privacy policy: pic.twitter.com/wueJokHcn3

— Luke de Pulford (@lukedepulford) January 27, 2025

Earlier this week, DeepSeek stated it was facing “large-scale malicious attacks” against its systems. A banner on its website informed users of a temporary sign-up restriction.

The growing competition between the US and China in particular in the AI sector has underscored wider concerns regarding technological ownership, ethical governance, and national security.

Experts warn that as AI systems advance and become increasingly integral to global economic and strategic planning, disputes over data usage and intellectual property are only likely to intensify. Accusations such as those against DeepSeek amplify alarm over China’s rapid development in the field and its potential quest to bypass US-led safeguards through reverse engineering and other means.

While OpenAI and Microsoft continue their investigation into the alleged misuse of OpenAI’s platform, businesses and governments alike are paying close attention. The case could set a precedent for how AI developers police model usage and enforce terms of service.

For now, the response from both US and Chinese stakeholders highlights how AI innovation has become not just a race for technological dominance, but a fraught geopolitical contest that is shaping 21st-century power dynamics.

(Image by Mohamed Hassan)

See also: Qwen 2.5-Max outperforms DeepSeek V3 in some benchmarks

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

Tags: ai, artificial intelligence, data, deepseek, distillation, microsoft, openai, security, training

Source link