Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Chinese AI startup Deep

DeepSeek-V3, ultra-large open-source AI, outperforms Llama and Qwen on launch

submited by
Style Pass
2024-12-26 21:00:04

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More

Chinese AI startup DeepSeek, known for challenging leading AI vendors with its innovative open-source technologies, today released a new ultra-large model: DeepSeek-V3.

Available via Hugging Face under the company’s license agreement, the new model comes with 671B parameters but uses a mixture-of-experts architecture to activate only select parameters, in order to handle given tasks accurately and efficiently. According to benchmarks shared by DeepSeek, the offering is already topping the charts, outperforming leading open-source models, including Meta’s Llama 3.1-405B, and closely matching the performance of closed models from Anthropic and OpenAI.

The release marks another major development closing the gap between closed and open-source AI. Ultimately, DeepSeek, which started as an offshoot of Chinese quantitative hedge fund High-Flyer Capital Management, hopes these developments will pave the way for artificial general intelligence (AGI), where models will have the ability to understand or learn any intellectual task that a human being can.

Leave a Comment