Earlier this week, DeepSeek, a well-funded Chinese AI lab, released an “open” AI model that beats many rivals on popular benchmarks. The model, DeepSeek V3, is large but efficient, handling text-based tasks like coding and writing essays with ease.
It also seems to think it’s ChatGPT.
Posts on X, and TechCrunch’s own tests, show that DeepSeek V3 identifies itself as ChatGPT, OpenAI’s AI-powered chatbot platform. Asked to elaborate, DeepSeek V3 insists it’s a version of OpenAI’s GPT-4 model released in 2023.
This actually reproduces as of today. In 5 out of 8 generations, DeepSeekV3 claims to be ChatGPT (v4), while claiming to be DeepSeekV3 only 3 times.
Gives you a rough idea of some of their training data distribution. https://t.co/Zk1KUppBQM pic.twitter.com/ptIByn0lcv
— Lucas Beyer (bl16) (@giffmana) December 27, 2024
The delusions run deep. If you ask DeepSeek V3 a question about DeepSeek’s API, it’ll give you instructions on how to use OpenAI’s API. DeepSeek V3 even tells some of the same jokes as GPT-4, right down to the punchlines.
So what’s going on?
Models like ChatGPT and DeepSeek V3 are statistical systems. Trained on billions of examples, they learn patterns in those examples to make predictions, such as how “to whom” in an email typically precedes “it may concern.”
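To make the idea of pattern-based prediction concrete, here is a minimal sketch in Python. It is a toy word-pair counter, not how ChatGPT or DeepSeek V3 actually work (those are large neural networks), and the three-sentence corpus and the function names `train_bigrams` and `predict_next` are invented purely for illustration.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration only.
corpus = [
    "to whom it may concern",
    "to whom it may concern",
    "to whom do i send this",
]

def train_bigrams(sentences):
    """Count, for every word, which words follow it and how often."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

model = train_bigrams(corpus)
print(predict_next(model, "whom"))
```

The prediction depends entirely on what the corpus contains. If the training text is full of one system's phrasing, that phrasing is what the counter hands back.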
DeepSeek hasn’t revealed much about the source of DeepSeek V3’s training data. But there’s no shortage of public datasets containing text generated by GPT-4 via ChatGPT. If DeepSeek V3 was trained on these, the model might’ve memorized some of GPT-4’s outputs and is now regurgitating them verbatim.
“Obviously, the model is seeing raw responses from ChatGPT at some point, but it’s not clear where that is,” Mike Cook, a research fellow at King’s College London specializing in AI, told TechCrunch. “It could be ‘accidental’ … but unfortunately, we have seen instances of people directly training their models on the outputs of other models to try and piggyback off their knowledge.”
Cook noted that the practice of training models on outputs from rival AI systems can be “very bad” for model quality, because it can lead to hallucinations and misleading answers like the above. “Like taking a photocopy of a photocopy, we lose more and more information and connection to reality,” Cook said.
It might also be against those systems’ terms of service.
OpenAI’s terms prohibit users of its products, including ChatGPT customers, from using outputs to develop models that compete with OpenAI’s own.
OpenAI and DeepSeek didn’t immediately respond to requests for comment. However, OpenAI CEO Sam Altman posted what appeared to be a dig at DeepSeek and other competitors on X Friday.
“It’s (relatively) easy to copy something that works,” Altman wrote. “It is extremely hard to do something new, risky, and difficult when you don’t know if it’ll work.”
Granted, DeepSeek V3 is far from the first model to misidentify itself. Google’s Gemini and others sometimes claim to be competing models. For example, prompted in Mandarin, Gemini says that it’s Chinese company Baidu’s Wenxinyiyan chatbot.
And that’s because the web, which is where AI companies source the bulk of their training data, is becoming littered with AI slop. Content farms are using AI to create clickbait. Bots are flooding Reddit and X. By one estimate, 90% of the web could be AI-generated by 2026.
This “contamination,” if you will, has made it quite difficult to thoroughly filter AI outputs from training datasets.
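A rough sketch shows why such filtering is hard. The Python below drops documents that contain telltale chatbot phrases. The phrase list and the function names `looks_ai_generated` and `filter_documents` are assumptions made up for this example; nothing here reflects a filter that DeepSeek, OpenAI or any other lab is known to use.

```python
import re

# Hypothetical telltale phrases, assumed for illustration only.
TELLTALE_PATTERNS = [
    r"\bas an ai language model\b",
    r"\bi am chatgpt\b",
    r"\bi['’]m chatgpt\b",
    r"\btrained by openai\b",
]
_PATTERN = re.compile("|".join(TELLTALE_PATTERNS), re.IGNORECASE)

def looks_ai_generated(text):
    """Return True if the text contains one of the telltale phrases."""
    return _PATTERN.search(text) is not None

def filter_documents(docs):
    """Keep only documents that do not trip the phrase check."""
    return [doc for doc in docs if not looks_ai_generated(doc)]

samples = [
    "As an AI language model, I cannot browse the web.",
    "The committee meets on Tuesdays at noon.",
    # May well be machine-written, but carries no telltale phrase.
    "Here are five tips for better sleep.",
]
kept = filter_documents(samples)
print(len(kept), "of", len(samples), "documents kept")
```

A check like this only catches text that announces its origin. Machine-written text without such a phrase, like the third sample, passes straight through, which is the heart of the contamination problem.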
It’s certainly possible that DeepSeek trained DeepSeek V3 directly on ChatGPT-generated text. Google was once accused of doing the same, after all.
Heidy Khlaaf, chief AI scientist at the nonprofit AI Now Institute, said the cost savings from “distilling” an existing model’s knowledge can be attractive to developers, regardless of the risks.
“Even with internet data now brimming with AI outputs, other models that would accidentally train on ChatGPT or GPT-4 outputs would not necessarily demonstrate outputs reminiscent of OpenAI customized messages,” Khlaaf said. “If it is the case that DeepSeek carried out distillation partially using OpenAI models, it would not be surprising.”
More likely, however, is that a lot of ChatGPT/GPT-4 data made its way into the DeepSeek V3 training set. That means the model can’t be trusted to self-identify, for one. But what’s more concerning is the possibility that DeepSeek V3, by uncritically absorbing and iterating on GPT-4’s outputs, could exacerbate some of the model’s biases and flaws.