DeepSeek Allegedly Uses Google Gemini Output To Train Its AI, Triggers Ethics Controversy

JAKARTA – DeepSeek’s latest AI update has drawn sharp attention after reports emerged that the company used output from Google’s Gemini model to train its own model. The technique, known as distillation, is considered efficient but raises serious ethical questions, especially because OpenAI’s service policies prohibit such practices and DeepSeek has previously been accused of doing the same with ChatGPT output.

DeepSeek has been the talk of the town since early this year, when it suddenly appeared with an AI model claimed to compete with industry giants. Following its latest update, a number of tech observers suspect that DeepSeek used Google Gemini’s output as training material, rather than relying on raw data alone.

The suspicion arose after Sam Paech, a user of the platform X, pointed out that the word choices and traces (the record of a model’s reasoning as it works toward an answer) of DeepSeek’s latest model resemble Gemini’s. The developer of SpeechMap reinforced this view, also noting similar response patterns between DeepSeek and Gemini.

It’s Happened Before

This is not the first accusation against DeepSeek. When its model first launched, DeepSeek was accused of using ChatGPT output from OpenAI in its training process. That alleged practice has been cited as a reason DeepSeek’s reported training costs are much lower than its competitors’.

The distillation technique attributed to DeepSeek is not new. The method resembles a teacher-student relationship: knowledge already condensed by a teacher (an advanced AI model such as Gemini) is used to train a student (a new model such as DeepSeek’s). The result can be very efficient in both resources and time.
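To make the teacher-student idea concrete, here is a minimal toy sketch in Python. It is an illustration of distillation in general, not a description of how DeepSeek or any other company trains its models. The function names (`softmax`, `kl_divergence`, `distill`), the three-way answer distribution, and the learning rate are all assumptions chosen for the example. A "teacher" supplies a probability distribution over possible answers, and a "student" adjusts its own scores until its distribution matches.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """Measure how far distribution q is from distribution p (0 means identical)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distill(teacher_probs, steps=1000, lr=0.5):
    """Train a student to imitate the teacher's output distribution.

    Hypothetical toy setup: the student is just a list of scores, updated by
    gradient descent on the cross-entropy against the teacher's probabilities.
    """
    student_logits = [0.0] * len(teacher_probs)  # the student starts with no preference
    for _ in range(steps):
        student_probs = softmax(student_logits)
        # Gradient of cross-entropy with respect to the student's scores
        grads = [s - t for s, t in zip(student_probs, teacher_probs)]
        student_logits = [w - lr * g for w, g in zip(student_logits, grads)]
    return softmax(student_logits)


# The "teacher" here is a fixed distribution over three candidate answers.
teacher_probs = softmax([2.0, 0.5, -1.0])
student_probs = distill(teacher_probs)

print("teacher:", [round(p, 4) for p in teacher_probs])
print("student:", [round(p, 4) for p in student_probs])
print("remaining gap (KL):", kl_divergence(teacher_probs, student_probs))
```

In this sketch the student never sees the data the teacher learned from. It only sees the teacher's output, which is why the approach can save so much compute.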

However, this efficiency carries legal and ethical risks. OpenAI explicitly prohibits using its models’ output to train competing models. If the accusations against DeepSeek prove true, the company has violated that policy.

Although the practice is considered an ethical violation, a number of experts say DeepSeek’s approach is reasonable from a strategic perspective. Nathan Lambert, a researcher at AI2, a non-profit research institute, said the approach makes sense under certain conditions.

“If I were DeepSeek, I would definitely make a lot of synthetic data from the best available model APIs. They lack GPUs but have a lot of cash. This basically gives them more computational power indirectly,” Lambert explained, quoted by VOI from Android Headlines.

It is important to note that the trade conflict between the US and China has widened the technology gap, especially through restrictions on China’s access to advanced semiconductors and other export-controlled technologies. This forces Chinese technology companies such as DeepSeek to find alternative paths for AI development.

Facing geopolitical pressure, limited resources, and the need to compete globally, DeepSeek appears to have chosen the pragmatic path despite the ethical and legal risks.

The controversy surrounding DeepSeek exposes a gap in regulation of the fast-growing AI industry, where technical efficiency sometimes clashes with ethical principles and intellectual property rights. Whether such practices will become the new norm or instead prompt stricter regulation remains an open question for the global AI industry.

