OpenAI and Microsoft revealed on Tuesday that they have evidence suggesting that last year, China’s DeepSeek exploited their AI models. The new Chinese chatbot, which has shaken the AI market over the past week and triggered a sharp drop in tech stock prices, reportedly relied on a well-known cost-cutting technique. The technique, called “distillation,” involves leveraging a large, established model by querying it and using its responses to train a smaller, student model.
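In practice, API-based distillation of this kind can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration, not DeepSeek’s actual pipeline: the endpoint, model name, prompts, and output file are all assumptions, and the final fine-tuning step on the student model is omitted.

```python
# Minimal sketch of API-based distillation: query a large "teacher" model
# through a paid API and save its answers as training data for a smaller
# "student" model. All names and prompts here are illustrative.
import json
import requests

API_URL = "https://api.openai.com/v1/chat/completions"  # teacher's public API
API_KEY = "sk-..."  # placeholder; a real key would be required

prompts = [
    "Explain photosynthesis in one paragraph.",
    "Summarize the causes of World War I.",
]

training_pairs = []
for prompt in prompts:
    # 1. Query the teacher model, paying per token like any API customer.
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": "gpt-4o",  # illustrative model name
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=60,
    )
    answer = resp.json()["choices"][0]["message"]["content"]
    # 2. Keep the prompt/response pair as a supervised example.
    training_pairs.append({"prompt": prompt, "completion": answer})

# 3. The collected pairs become fine-tuning data for the student model,
#    standing in for far more expensive human-labeled examples.
with open("distillation_data.jsonl", "w") as f:
    for pair in training_pairs:
        f.write(json.dumps(pair) + "\n")
```

Run at scale over millions of prompts, this kind of harvesting is what the companies’ terms of service prohibit: the API is sold for building applications on top of the model, not for training a competing one.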
Microsoft admitted to Bloomberg that it is investigating whether DeepSeek’s bots took advantage of its services last fall. Meanwhile, OpenAI confirmed to the Financial Times that it has evidence of such activity. Both companies’ terms of service prohibit this practice, but it is difficult to detect when it occurs. DeepSeek allegedly connected to the companies’ APIs, which let other firms use the AI models for a fee, to build its own tool. “The problem arises when [you take it off a platform and] use it to create your own model with your own objectives,” a source from OpenAI told the FT.
By doing this, DeepSeek presumably bypassed the expensive human-feedback reinforcement these models require, giving it a competitive advantage. The new AI czar in the Trump administration, David Sacks, told Fox News that there was “substantial evidence” that DeepSeek had engaged in this intellectual property theft, although he offered no proof. “There’s a technique called distillation, where one model learns from another and siphons knowledge,” he said, adding, “I don’t think OpenAI is very happy about that.”
In a statement following Sacks’ comments, OpenAI confirmed its suspicions: “We know that companies based in China (and other countries) are constantly attempting to distill from leading U.S. companies.” The company called on the new Trump administration, already engaged in a global trade war, for support in this new battle.
At the same time, OpenAI continues to fight accusations of copyright infringement in court, in cases brought by artists and organizations such as The New York Times. These accusations stem from OpenAI’s initial training of its models on vast amounts of content scraped from the internet, including protected works, articles, and books.