DeepSeek's surprisingly cheap AI model is challenging industry giants. The Chinese startup claims to have trained its DeepSeek V3 neural network for just $6 million on only 2,048 GPUs, far less than its competitors report spending. That headline figure, however, conceals a far more substantial investment.
Image: ensigame.com
DeepSeek V3's architecture contributes to its efficiency. Its key technologies are Multi-token Prediction (MTP), which predicts several upcoming tokens at once instead of one at a time; Mixture of Experts (MoE), which divides the model into 256 expert networks, only some of which process any given token, to accelerate training; and Multi-head Latent Attention (MLA), which improves how the model extracts relevant information from its input.
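To make the Mixture of Experts idea more concrete, here is a minimal sketch of generic top-k expert routing in Python. It is not DeepSeek's implementation: the 256 experts come from the figure above, while `TOP_K = 8`, the toy "experts" that simply scale a number, the random router scores, and the helper names `route` and `moe_layer` are all assumptions made for illustration.

```python
import math
import random

NUM_EXPERTS = 256  # expert count quoted in the article
TOP_K = 8          # assumption for illustration; not stated in the article


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_scores, top_k=TOP_K):
    """Pick the top_k highest-scoring experts and weight them (hypothetical helper)."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:top_k]
    weights = softmax([router_scores[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_layer(x, experts, router_scores, top_k=TOP_K):
    """Combine the outputs of the selected experts only; the rest are never run."""
    return sum(w * experts[i](x) for i, w in route(router_scores, top_k))


rng = random.Random(0)
# Toy experts: each one just scales its input by a fixed factor in [0.5, 1.5].
experts = [lambda x, s=rng.uniform(0.5, 1.5): s * x for _ in range(NUM_EXPERTS)]
# In a real model these scores come from a learned router; here they are random.
router_scores = [rng.gauss(0, 1) for _ in range(NUM_EXPERTS)]

selected = route(router_scores)
output = moe_layer(2.0, experts, router_scores)
print(f"experts run for this token: {len(selected)} of {NUM_EXPERTS}")
```

The point of the sketch is the cost saving: for each token only `TOP_K` of the 256 experts do any work, so the compute per token is a small fraction of what running the whole model would require.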
However, a SemiAnalysis report describes a much larger infrastructure: approximately 50,000 Nvidia GPUs costing around $1.6 billion, with operating expenses of nearly $944 million. This contrasts sharply with the publicized $6 million training cost, which reflects only the GPU time used for pre-training and excludes research, model refinement, data processing, and the overall infrastructure.
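The gap between these figures is easier to appreciate with a little arithmetic. The sketch below uses only the numbers quoted above. Adding hardware cost and operating expenses into one total is a simplification for illustration, not a figure from the SemiAnalysis report, and the variable names are hypothetical.

```python
# Figures as quoted in the article, in US dollars.
PUBLICIZED_TRAINING_COST = 6e6   # pre-training GPU usage only
GPU_HARDWARE_COST = 1.6e9        # approx. 50,000 Nvidia GPUs
OPERATING_EXPENSES = 944e6

# Simplifying assumption: treat hardware and operating costs as one total.
infrastructure_total = GPU_HARDWARE_COST + OPERATING_EXPENSES

# How many times larger the infrastructure spend is than the headline figure.
ratio = infrastructure_total / PUBLICIZED_TRAINING_COST

# The headline figure as a share of the combined infrastructure spend.
share = PUBLICIZED_TRAINING_COST / infrastructure_total

print(f"combined infrastructure spend: ${infrastructure_total / 1e9:.3f} billion")
print(f"ratio to publicized cost: {ratio:.0f}x")
print(f"publicized cost as share of total: {share:.2%}")
```

Under that simplification, the combined spend is about $2.5 billion, roughly 424 times the publicized training cost, which in turn amounts to about a quarter of one percent of the total.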
DeepSeek, a subsidiary of the High-Flyer hedge fund, owes much of its success to its independence and efficient structure. Owning its own data centers allows the company to optimize model development and innovate rapidly, and its self-funding and lean organization add to its agility. High salaries, exceeding $1.3 million annually for some researchers, attract top talent from Chinese universities.
While DeepSeek's $6 million claim is misleading, its actual investment of more than $500 million still represents a significant cost advantage over competitors. The company's R1 model cost $5 million to train, while ChatGPT-4 reportedly cost $100 million. DeepSeek's success highlights the competitive potential of a well-funded, independent AI company, though its "budget-friendly" narrative needs qualification.
In conclusion, DeepSeek's competitive edge stems from a combination of substantial investment, technological innovation, and a highly skilled team, rather than solely from a remarkably low training budget. However, even with the corrected figures, its costs remain significantly lower than those of its competitors.