What is DeepSeek? How It Outpaced ChatGPT on a Budget

DeepSeek has become the talk of the AI town, overtaking OpenAI’s ChatGPT in Apple’s App Store as the #1 free app in Canada and beyond within the span of a week. The open-source AI model powering the new chatbot, DeepSeek R1, has also become the top-trending model on Hugging Face with over 109,000 downloads (via VentureBeat).
But what exactly is this new bot on the block, and where did it come from? DeepSeek emerged in China in 2023 as an offshoot of the hedge fund High-Flyer Quant. The company set to work developing AI models for its own chatbot, an effort that culminated in the DeepSeek app and web client it released last week.
DeepSeek R1, which underpins the company’s chatbot, matches or exceeds the performance of OpenAI’s o1 model. It was developed with staggering speed and a (seemingly) shoestring budget, reportedly costing only 3-5% as much as OpenAI’s o1.
The current consensus is that DeepSeek got much of the way to its chatbot by leveraging existing, publicly available AI technologies like Meta’s Llama models and the PyTorch ML library. This significantly cut down on DeepSeek’s time to market.
DeepSeek trained the prototype version of its AI model, DeepSeek-R1-Zero, entirely using reinforcement learning, rewarding it both for correct answers and for the logical processes used to reach them. The approach appears to have worked, given that users have been raving about how accurate and impressive its responses are.
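To make that idea concrete, here is a minimal sketch of what a rule-based reward of this kind could look like, scoring a completion on both its final answer and whether it laid out its reasoning. The `<think>`/`<answer>` tags and the weights are illustrative assumptions, not DeepSeek’s actual implementation.

```python
import re

def reward(completion: str, reference_answer: str) -> float:
    """Score a model completion on format (did it show its reasoning?)
    and accuracy (did it get the right answer?)."""
    score = 0.0
    # Format reward: reasoning and answer separated into tagged sections.
    if re.search(r"<think>.*?</think>\s*<answer>.*?</answer>",
                 completion, flags=re.DOTALL):
        score += 0.5
    # Accuracy reward: the final answer matches the reference.
    match = re.search(r"<answer>(.*?)</answer>", completion, flags=re.DOTALL)
    if match and match.group(1).strip() == reference_answer.strip():
        score += 1.0
    return score

# A well-formatted, correct completion earns the full reward.
sample = "<think>2 + 2 = 4</think><answer>4</answer>"
print(reward(sample, "4"))  # 1.5
```

A reinforcement learning loop then nudges the model toward completions that earn higher scores, which is how a model can learn to reason without step-by-step human-labelled examples.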
But you still need GPUs to train AI models — and a lot of them, at that. DeepSeek started with just 10,000 Nvidia GPUs that it acquired before U.S.-China trade restrictions went into effect. Scale AI CEO Alexandr Wang claimed during a recent segment on CNBC that the company also has around 50,000 of Nvidia’s H100 GPUs that it can’t publicly disclose due to the trade ban.
“Obviously,” Elon Musk (@elonmusk) tweeted in response on January 27, 2025.
Granted, that’s still pocket change compared to AI incumbents like OpenAI, Google, and Anthropic, each of which boasts compute capacity of more than 500,000 GPUs. DeepSeek trained its base model, V3, in a little over two months for just $5.58 million USD, according to Nvidia engineer Jim Fan. In contrast, Meta plans to spend $65 billion USD on its AI ambitions this year.
Early analysis from researchers suggests DeepSeek also made several GPU and infrastructure innovations to cut costs and development time. These include multi-token prediction to boost efficiency during inference, mixed-precision training to reduce per-GPU memory requirements (and therefore use fewer GPUs), and algorithmic improvements to maximize GPU utilization.
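For a sense of what mixed-precision training looks like in practice, here is a generic training step in PyTorch (the library the article notes DeepSeek builds on). The model and data are placeholders; this sketches the general technique, not DeepSeek’s own recipe, which differs in detail.

```python
import torch
import torch.nn as nn

model = nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()  # rescales gradients to avoid fp16 underflow

inputs = torch.randn(32, 1024, device="cuda")
targets = torch.randn(32, 1024, device="cuda")

optimizer.zero_grad()
# Run the forward pass in half precision to cut per-GPU memory use.
with torch.autocast(device_type="cuda", dtype=torch.float16):
    loss = nn.functional.mse_loss(model(inputs), targets)
scaler.scale(loss).backward()  # backward pass on the scaled loss
scaler.step(optimizer)         # unscale gradients, then take the optimizer step
scaler.update()                # adjust the scale factor for the next step
```

Halving the precision of activations roughly halves the memory they consume, which is why the same model can be trained on fewer GPUs.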
DeepSeek has clearly turned AI as we know it on its head, causing companies and investors alike to reevaluate how they view AI development and the capital expenditure required to power it. However, whether the new chatbot is as much of an underdog story as it appears to be remains unclear.
Some are still skeptical about DeepSeek’s claim to fame, including billionaire Elon Musk, who founded rival company xAI. In response to Salesforce CEO Marc Benioff lauding DeepSeek for surpassing ChatGPT without “Nvidia supercomputers or $100M” but with data instead, Musk took to X to cast doubt, tweeting out a succinct yet cryptic “Lmao no.”
You can try DeepSeek out for yourself today on Apple’s App Store.