DeepSeek slashes API prices tenfold to escalate global AI competition
DeepSeek has cut the cost of its API services by up to 90 percent, intensifying price competition across the global artificial intelligence industry. The move follows the release of its latest large language model, V4, and includes a 75 percent promotional discount on its flagship V4 Pro model through early May.
The new pricing structure reduces input caching costs for V4 Pro to 0.025 yuan per million tokens, equivalent to roughly $0.0036. Standard promotional rates place input costs at 3 yuan and output costs at 6 yuan per million tokens. These levels position DeepSeek far below major Western competitors. Comparable output pricing for models such as Claude Opus by Anthropic, GPT-5.4 by OpenAI, and Gemini 3.1 Pro from Google ranges between $12 and $25 per million tokens.
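The dollar figures above follow from a straightforward currency conversion. As a minimal sketch, assuming an exchange rate of roughly 7.0 yuan per US dollar (the rate implied by the article's own figures, since 0.025 yuan ≈ $0.0036):

```python
# Rough USD equivalents for DeepSeek's published yuan prices.
# Assumption: an exchange rate of about 7.0 yuan per US dollar.
CNY_PER_USD = 7.0  # assumed exchange rate, not from the source

def yuan_to_usd(yuan: float) -> float:
    """Convert a yuan amount to US dollars at the assumed rate."""
    return yuan / CNY_PER_USD

# Published V4 Pro promotional rates, per million tokens
input_cache_yuan = 0.025
input_yuan = 3.0
output_yuan = 6.0

print(f"input caching: ${yuan_to_usd(input_cache_yuan):.4f} per million tokens")
print(f"input:         ${yuan_to_usd(input_yuan):.2f} per million tokens")
print(f"output:        ${yuan_to_usd(output_yuan):.2f} per million tokens")
```

At this assumed rate, the promotional input and output prices work out to roughly $0.43 and $0.86 per million tokens, an order of magnitude below the $12–$25 output range quoted for Western rivals.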
The company introduced V4 Pro and V4 Flash in late April, marking its first major release since V3.2 in December. V4 Pro features a total of 1.6 trillion parameters, with 49 billion active during inference, making it one of the largest open-weight models available. V4 Flash offers a lighter alternative with 284 billion parameters, targeting applications that require lower computational overhead.
Even before the latest discounts, DeepSeek’s pricing was significantly lower than its rivals. Previous rates for V4 Pro were estimated at $1.74 for input and $3.48 for output per million tokens, already undercutting competing systems by a wide margin. The latest reductions widen this gap further, reinforcing a broader trend of aggressive pricing strategies in the AI sector as companies compete for developers and enterprise adoption.
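The new promotional rates are consistent with the advertised 75 percent discount against the earlier dollar estimates. A quick sanity check, again assuming an exchange rate of about 7.0 yuan per US dollar:

```python
# Sanity check: do the new promotional yuan prices correspond to roughly
# a 75 percent cut from the earlier per-million-token dollar estimates?
# Assumption: about 7.0 yuan per US dollar (not stated in the source).
CNY_PER_USD = 7.0

old_usd = {"input": 1.74, "output": 3.48}   # previous V4 Pro estimates
new_yuan = {"input": 3.0, "output": 6.0}    # promotional rates

for kind in ("input", "output"):
    new_usd = new_yuan[kind] / CNY_PER_USD
    cut = 1 - new_usd / old_usd[kind]
    print(f"{kind}: ${new_usd:.2f}/M tokens, about {cut:.0%} below the old estimate")
```

Both input and output come out roughly 75 percent below the earlier estimates, matching the promotional discount described for V4 Pro.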
A notable aspect of the model is its reliance on Huawei Ascend chips instead of hardware from Nvidia. This shift reduces dependence on US semiconductor supply chains and could accelerate domestic AI deployment in China. Analysts say the use of alternative hardware platforms may also influence global competition by diversifying the infrastructure supporting advanced AI systems.
Efficiency gains also play a central role in the model's positioning. For a context window of one million tokens, V4 Pro requires only about 27 percent of the computing power used by its predecessor. Despite these improvements, the model remains slightly behind leading systems from OpenAI and Google, with an estimated gap of three to six months in cutting-edge performance.
The pricing move signals a new phase in the AI industry, where cost competitiveness is becoming as critical as model capability. As companies race to expand adoption, lower pricing could reshape market dynamics and accelerate the global deployment of advanced language models.