DeepSeek’s latest AI release takes another swipe at industry giants


Released earlier this week, DeepSeek-V3-0324 takes everything that made the initial model impressive and builds on it with enhanced reasoning abilities, improved code handling, and better medium-to-long-form writing capabilities.

The updated model also significantly improved its performance on popular benchmark tests, as the Chinese startup looks to go beyond its initial hit.

DeepSeek said the new V3-0324 produces better-quality translations and improves function calling, a capability that lets AI models fetch data and resources from external systems and that was a weak point of the previous version.

DeepSeek described its latest AI model as demonstrating “notable improvements over its predecessor,” highlighted by a score of 81.2 on the MMLU-Pro benchmark, which evaluates a model’s language understanding across broader and more challenging tasks.

By comparison, the original DeepSeek V3 scored 75.9. According to the benchmark’s leaderboard on Hugging Face, however, that score was self-reported, meaning third parties have not evaluated the model on that particular test.

Evaluations from Artificial Analysis suggest the new DeepSeek V3-0324 is the top-performing non-reasoning model.

“DeepSeek are now driving the frontier of non-reasoning open weights models, eclipsing all proprietary non-reasoning models, including Gemini 2.0 Pro, Claude 3.7 Sonnet and Llama 3.3 70B,” the firm said in a post on X (formerly Twitter).

Graphic displaying non-reasoning AI model performance, with the new DeepSeek V3-0324 outscoring AIs from the likes of OpenAI, Google, and Anthropic

DeepSeek sent shockwaves across the AI and wider tech space in early January, causing stock prices to tumble after the startup’s innovative yet controversial approaches to training and development costs raised investor concerns over value for money.

In the weeks since, DeepSeek has aimed to broaden access to its models, notably switching its latest model to an MIT licence, a move away from the startup’s previously more restrictive licence.

The new V3-0324 model can also run locally, meaning it can potentially power edge applications. Running the model locally requires software from DeepSeek’s V3 repository on GitHub.

However, the announcement did not address whether V3-0324 has improved safety features, after the initial release was repeatedly criticised for its inadequate safety measures.

Beyond developing V3-0324, DeepSeek has been working behind the scenes on R1’s successor, aptly named R2.

Capacity previously reported that DeepSeek was fast-tracking the release of R2 after its breakout success earlier this year, with the model potentially going live this spring.

“[The V3-0324] release is arguably even more impressive than R1 – and potentially indicates that R2 is going to be another significant leap forward,” Artificial Analysis said.
