Since the launch of ChatGPT, the world has begun to witness the transformative power of AI. While much of the spotlight remains on ChatGPT, a Chinese AI model has quietly emerged on the global stage: DeepSeek, which claims a training cost of just one-twentieth of OpenAI’s and one-tenth of Meta’s. DeepSeek has recently drawn recognition from tech leaders including Apple’s Tim Cook, NVIDIA’s Jensen Huang, Microsoft’s Satya Nadella, Amazon’s Andy Jassy, Alphabet’s Sundar Pichai, and Meta’s Mark Zuckerberg.
DeepSeek, also known as “Shendu Qiusuo” (meaning “deep exploration”), is a Chinese AI startup founded in May 2023 by the prominent quantitative trading firm High-Flyer Quant. As a rising star in the AI landscape, DeepSeek has quickly drawn attention for its rapid development and cost efficiency. To better understand its potential, let’s take a closer look at how it compares with industry leader OpenAI and its product, ChatGPT.
DeepSeek’s models include V3, released in December 2024, and R1, launched in January 2025. The V3 model utilizes a Mixture-of-Experts (MoE) architecture, combined with Multi-Head Latent Attention (MLA) technology. This design significantly reduces the KV cache needed per query, minimizing computational resource consumption during inference and striking an optimal balance between performance and cost. V3 is well-suited for large-scale natural language processing (NLP) applications.
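To illustrate the kind of saving MLA targets, the back-of-envelope sketch below compares per-token KV-cache sizes for standard multi-head attention against an MLA-style compressed latent. The layer count, head count, and dimensions are hypothetical placeholders chosen only to show the ratio, not DeepSeek’s published configuration.

```python
# Illustrative estimate of per-token KV-cache size. Standard multi-head
# attention (MHA) caches full key and value vectors for every head, while
# Multi-Head Latent Attention (MLA) caches one compressed latent vector
# per layer. All dimensions below are hypothetical.

def mha_kv_bytes_per_token(n_layers, n_heads, head_dim, bytes_per_elem=2):
    # Keys and values (factor of 2) for every head in every layer.
    return 2 * n_layers * n_heads * head_dim * bytes_per_elem

def mla_kv_bytes_per_token(n_layers, latent_dim, bytes_per_elem=2):
    # One shared compressed latent per layer instead of per-head K/V.
    return n_layers * latent_dim * bytes_per_elem

mha = mha_kv_bytes_per_token(n_layers=60, n_heads=128, head_dim=128)
mla = mla_kv_bytes_per_token(n_layers=60, latent_dim=512)
print(f"MHA cache/token: {mha / 1024:.0f} KiB, "
      f"MLA cache/token: {mla / 1024:.0f} KiB, "
      f"ratio ≈ {mha / mla:.0f}x")
```

Even with these made-up numbers, caching one shared latent per layer instead of per-head keys and values cuts the cache by more than an order of magnitude, which is why each query consumes far less memory at inference time.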
The R1 model builds upon the V3 framework and further enhances reasoning capabilities, and is designed specifically for advanced reasoning tasks. According to media reports, the company’s upcoming R2 model, originally planned for release in May, may launch even earlier.
On the other hand, OpenAI’s models include GPT-3.5, GPT-4, and newer iterations such as GPT-4o and o3-mini. Unlike OpenAI’s closed-source approach, DeepSeek primarily operates as an open-source platform. Interestingly, OpenAI CEO Sam Altman recently admitted it was a mistake to keep reasoning processes opaque and expressed interest in learning from DeepSeek’s practice of open-sourcing model reasoning, suggesting that more leading AI firms may reconsider their current strategies.
In terms of pricing, DeepSeek appears to offer significantly lower prices. For instance, DeepSeek-R1 charges approximately $0.55 per million input tokens and $2.19 per million output tokens. In comparison, ChatGPT’s o3-mini charges $1.10 per million input tokens and $4.40 per million output tokens.
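The gap is easy to quantify. The sketch below prices a hypothetical workload of 10 million input and 2 million output tokens at the per-million-token rates quoted above; the workload size is an assumption chosen only for illustration.

```python
# Compare API costs at the quoted per-million-token rates for a
# hypothetical workload (10M input tokens, 2M output tokens).

def api_cost(input_tokens, output_tokens, in_price, out_price):
    # Prices are quoted in USD per million tokens.
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

deepseek_r1 = api_cost(10_000_000, 2_000_000, in_price=0.55, out_price=2.19)
o3_mini = api_cost(10_000_000, 2_000_000, in_price=1.10, out_price=4.40)
print(f"DeepSeek-R1: ${deepseek_r1:.2f}, o3-mini: ${o3_mini:.2f}")
```

At these list prices the same workload costs roughly half as much on DeepSeek-R1 as on o3-mini.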
Regarding training costs, DeepSeek claims it used around 2,048 H800 GPUs and accumulated 2.788 million H800 GPU hours during training, at a total cost of only $5.58 million. This is reportedly one-tenth the cost of Meta’s LLaMA model and one-twentieth that of OpenAI’s GPT-4o, fueling market concerns over U.S. AI spending and talk of a potential “Jevons paradox” in U.S. AI development.
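The claimed budget can be sanity-checked by dividing the total cost by the GPU hours, which implies a rental-style rate of roughly $2 per H800 GPU hour:

```python
# Sanity check on the claimed training budget: total cost divided by
# accumulated GPU hours gives the implied cost per H800 GPU hour.
total_cost_usd = 5_580_000
gpu_hours = 2_788_000
rate = total_cost_usd / gpu_hours
print(f"Implied cost: ${rate:.2f} per H800 GPU hour")
```

Note this figure only prices the GPU time for the final training run; as discussed below, it excludes hardware capital expenditure and staff costs.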
However, doubts persist about DeepSeek’s cost calculations. Its parent company, High-Flyer Quant, reportedly stockpiled around 10,000 NVIDIA A100 GPUs as early as 2021, investing over $500 million in GPUs alone. Total server capital expenditure is estimated at $1.6 billion, and the figure also omits compensation for top AI talent, some of whom reportedly earn tens of millions annually, which may further understate the model’s true development cost.
Table 1. Comparison of DeepSeek and ChatGPT
Name | DeepSeek | ChatGPT |
---|---|---|
Developer | DeepSeek (China) | OpenAI (U.S.) |
Established | 2023 | 2015 |
Main Models | V3 (released Dec 2024), R1 (released Jan 2025), R2 (expected before May 2025) | GPT-3.5 (released Nov 2022), GPT-4 (released Mar 2023), GPT-4o, o3-mini |
Open Source Status | R1 (open) | GPT-2 (open); GPT-3, GPT-4 (closed) |
Price | DeepSeek-R1: approx. $0.55 per million input tokens, $2.19 per million output tokens | o3-mini: $1.10 per million input tokens, $4.40 per million output tokens |
Training Cost | $5.58 million (DeepSeek-V3), though many believe the actual cost is significantly higher | $100 million (GPT-4o) |
Source: Artificial Analysis, TEJ
Global demand for AI chips has long been driven largely by cloud service providers (CSPs), which accounted for nearly 80% of AI chip shipments in 2024. CSPs’ capital expenditure is therefore a clear indicator of the AI chip market’s outlook. In recent years, CSPs have aggressively procured AI chips to build the computing power needed to train large-parameter models, following the principle of scaling laws.
However, DeepSeek’s recent breakthroughs in training methods and algorithms—particularly its distillation techniques—have significantly reduced training costs. This has led to market skepticism regarding the necessity of large-scale compute stacking and the sustainability of CSPs’ high capital expenditure strategies.
Nonetheless, data shows that capital expenditure among the four major CSPs is expected to continue rising in 2025, reaching a combined total of USD 320 billion—an annual growth rate of 40.1%. Individually, Amazon, Microsoft, Google, and Meta have projected capital expenditures of USD 100 billion, 80 billion, 75 billion, and 65 billion respectively, with corresponding annual growth rates of 20.5%, 43.9%, 42.9%, and 74.3%. These figures indicate that CSPs remain highly committed to AI infrastructure investment, unwilling to fall behind in the AI arms race—providing reassurance for AI chip demand in 2025.
Table 2. Capital Expenditure Plans of the Four Major CSPs in 2025 (Unit: USD Billion; %)
| Amazon | Microsoft | Google | Meta | Sum |
---|---|---|---|---|---|
2024 | 83.0 | 55.6 | 52.5 | 37.3 | 228.4 |
2025 (F) | 100.0 | 80.0 | 75.0 | 65.0 | 320.0 |
YoY | 20.5% | 43.9% | 42.9% | 74.3% | 40.1% |
Source: TEJ, 2025/02
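The growth rates above can be reproduced from the capital expenditure figures, expressed in USD billion (the 2024 bases are those implied by the stated 2025 plans and growth rates):

```python
# Reproduce the year-over-year growth rates from the 2024 and 2025(F)
# capital expenditure figures of the four major CSPs (USD billion).
capex = {
    "Amazon":    (83.0, 100.0),
    "Microsoft": (55.6, 80.0),
    "Google":    (52.5, 75.0),
    "Meta":      (37.3, 65.0),
}
for name, (y2024, y2025) in capex.items():
    print(f"{name}: {(y2025 / y2024 - 1) * 100:.1f}%")

total_2024 = sum(v[0] for v in capex.values())
total_2025 = sum(v[1] for v in capex.values())
print(f"Sum: {(total_2025 / total_2024 - 1) * 100:.1f}%")
```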
Taiwan plays a critical role in the global AI supply chain. From semiconductor manufacturing and IC design to electronics manufacturing services (EMS) and high-performance computing (HPC) equipment, Taiwan companies contribute significantly across all segments. For instance, in AI development, TSMC (2330.TW) provides cutting-edge wafer foundry services to tech giants like NVIDIA, AMD, and Google. IC design firms such as MediaTek (2454.TW) and Alchip (3661.TW) are actively developing AI computing chips. Major EMS companies including HON HAI (2317.TW), Quanta (2382.TW), and Wistron (3231.TW) are key suppliers of AI servers and terminal devices. Furthermore, Taiwan’s optical communication sector has also benefited from the increase in high-speed data transmission driven by large-scale AI computations.
The emergence of DeepSeek highlights the rapid evolution of AI technology and sheds light on several trends observed in recent news and corporate developments related to Taiwan’s core AI sectors.
In addition, DeepSeek’s advancement underscores the shifting dynamics in the AI inference chip market, with rising demand for custom AI ASICs. Taiwan’s robust ecosystem in foundry services, IC design, and server manufacturing is expected to sustain its pivotal role amid global AI industry transformations.
As the cost of AI applications continues to decline and capital expenditures are expected to peak in 2025, the current wave of infrastructure deployment is gradually meeting short-term compute needs. While cloud service providers (CSPs) may slow down capital spending after 2026, the principle of scaling laws still applies. According to Jevons Paradox, falling AI costs could actually stimulate overall AI adoption, allowing the industry to maintain steady growth in the long run.
Additionally, CSPs are actively developing reasoning-based large language models (LLMs) to improve deep thinking capabilities. This shift is expected to drive greater compute demand. NVIDIA’s CFO Colette Kress recently noted that reasoning-type models, which generate additional information to “think through” answers, could require up to 100 times more compute resources per task compared to one-shot inference models.
While major CSPs still dominated AI demand in 2024, declining costs and increasingly diverse AI use cases may soon attract broader enterprise adoption. For instance, media reports indicate that DeepSeek has already been deployed across various sectors in China, including smartphones, automotive, asset management, telecommunications, and government. This has driven a spike in demand among Chinese enterprises, riding the new wave of AI enthusiasm, for secondary GPUs such as the H20.
Moreover, the emergence of DeepSeek is also reshaping the AI chip landscape. Historically, the market has been dominated by general-purpose GPUs like NVIDIA’s H100 and A100. However, as DeepSeek lowers the barriers to AI adoption, the growing diversity of edge applications and the need for low power consumption will likely fuel demand for custom ASICs. Still, ASICs come with longer development cycles and high upfront costs. Furthermore, future inference tasks will demand greater interconnectivity—an area where NVIDIA’s mature CUDA architecture still holds an advantage. As such, the future of AI compute will likely be one where GPUs and ASICs coexist.
As AI applications become more widespread and reasoning capabilities continue to improve, global demand for computing power will keep rising. Taiwan, with its robust industrial ecosystem in semiconductors, server manufacturing, and optical communication, is positioned to play a pivotal role in this transformation. Moreover, the rise of DeepSeek highlights how advances in AI inference and cost optimization may reshape the future of computing infrastructure. Looking ahead, if Taiwan enterprises continue to strengthen their R&D capabilities and deepen collaborations with international AI leaders, they are likely to maintain a competitive edge in the global AI market and further solidify their position in high-end AI computing and applications.
However, AI trends continue to evolve, and follow-up developments in the semiconductor sector are worth monitoring. The AI boom may also face delays from the impact of Trump’s reciprocal tariffs. We should therefore remain cautiously optimistic and prudent, guarding against credit losses.
Explore the latest insights into Taiwan’s corporate landscape with the TEJ Watchdog Database. Our analysis of major events affecting Taiwan’s tech industry provides a clear and actionable perspective. Through a detailed evaluation of recent corporate news, we quantitatively assess the impact of these events on credit conditions, offering an event intensity rating from -3 to +3.
Stay informed about the dynamic changes in Taiwan’s technology sector, understand the factors influencing corporate credit risks, and make timely adjustments to your investment strategies. Visit the TEJ Watchdog Database today and gain the edge in navigating Taiwan’s evolving business environment.
➤ AI Infrastructure — Another Growth Opportunity For Electronic Manufacturing Services Industry!
➤ NVIDIA Sparks AI Revolution, Do AI Server Supply Chain Keep Up?