DeepSeek rolled out preview versions of its highly anticipated V4 AI models today, once again narrowing the gap with leading AI models from the world’s largest tech companies.
The Chinese startup released two open-weight models, a high-performance V4 Pro and a smaller, cheaper V4 Flash. The company is pitching both as competitive with frontier systems, highlighting strong coding performance, improved reasoning, and more advanced agentic capabilities.
One of the more eye-catching upgrades is the jump to a 1-million-token context window, which allows the models to process entire codebases or extremely long documents in a single prompt.
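For a rough sense of what 1 million tokens buys you, the sketch below estimates whether a source tree fits in the window using the common ~4-characters-per-token rule of thumb. The heuristic and the file extensions are illustrative assumptions, not figures DeepSeek has published:

```python
import os

# Rough heuristic: ~4 characters per token for English text and code.
# This ratio is an assumption for illustration, not a DeepSeek figure.
CHARS_PER_TOKEN = 4
CONTEXT_WINDOW = 1_000_000

def estimate_tokens(root: str, extensions=(".py", ".js", ".ts", ".go")) -> int:
    """Walk a source tree and crudely estimate its total token count."""
    total_chars = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                try:
                    with open(path, encoding="utf-8", errors="ignore") as f:
                        total_chars += len(f.read())
                except OSError:
                    continue
    return total_chars // CHARS_PER_TOKEN

tokens = estimate_tokens(".")
print(f"~{tokens:,} tokens; fits in 1M window: {tokens <= CONTEXT_WINDOW}")
```

By that crude measure, a mid-sized repository of a few million characters lands comfortably inside the window.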
But what really makes these models stand out is their focus on efficiency.
The V4 models rely on a mixture-of-experts (MoE) architecture, a design that activates only a subset of the model’s parameters for each token it processes. While the system may have trillions of parameters in total, only a fraction fire on any given token, which keeps inference costs low.
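To make that concrete, here is a minimal toy sketch of top-k MoE routing in plain NumPy. The layer sizes, the router, and the expert count are illustrative assumptions, not DeepSeek’s actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MoE layer: 8 experts, but only the top 2 run per token.
NUM_EXPERTS, TOP_K, D = 8, 2, 16

# Each "expert" is just a weight matrix here; in a real model it is a full FFN.
experts = [rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((D, NUM_EXPERTS)) / np.sqrt(D)  # learned gate in practice

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route a single token vector to its top-k experts and mix the outputs."""
    logits = x @ router
    top = np.argsort(logits)[-TOP_K:]   # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()            # softmax over the chosen experts only
    # Only TOP_K of NUM_EXPERTS weight matrices are touched for this token,
    # which is why total parameter count and per-token compute can diverge.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(D)
print(moe_forward(token).shape)  # (16,) -- same output shape as a dense layer
```

The output has the same dimensionality a dense layer would produce, even though six of the eight expert matrices were never touched for that token.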
The new models arrive just over a year after DeepSeek first made headlines with its R1 reasoning model. That system rivaled advanced models from companies like OpenAI and Google, but was reportedly built at a fraction of the cost and trained on fewer AI chips. The news even triggered a trillion-dollar selloff on Wall Street, with Nvidia shedding nearly $600 billion in market value in a single day.
In a technical paper, the company says its latest models are competitive, while acknowledging a small performance gap.
“Through the expansion of reasoning tokens, DeepSeek-V4-Pro-Max demonstrates superior performance relative to GPT-5.2 and Gemini-3.0-Pro on standard reasoning benchmarks,” the company said. “Nevertheless, its performance falls marginally short of GPT-5.4 and Gemini-3.1-Pro, suggesting a developmental trajectory that trails state-of-the-art frontier models by approximately 3 to 6 months.”
Still, for many users, the cost savings may outweigh any slight performance shortfall.
Datasette creator Simon Willison compared token pricing across major models on his blog and found DeepSeek to be the cheapest in its class.
DeepSeek is charging $0.14 per million input tokens and $0.28 per million output tokens for its V4 Flash model. For comparison, GPT-5.4 Nano costs $0.20 per million input tokens and $1.25 per million output tokens, while Claude Haiku 4.5 is priced at $1 and $5 per million input and output tokens, respectively.
The gap is even starker for the pro-tier models. DeepSeek is charging $1.74 per million input tokens and $3.48 per million output tokens for its V4 Pro model. By comparison, Gemini 3.1 Pro costs $2 per million input tokens and $12 per million output tokens, while GPT-5.5 is priced at $5 and $30 per million input and output tokens, respectively.
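Using the quoted per-million-token rates, a quick back-of-the-envelope comparison shows how the differences compound. The 50M-in/10M-out monthly workload below is made up purely for illustration:

```python
# Prices in USD per million tokens (input, output), as quoted above.
PRICING = {
    "DeepSeek V4 Flash": (0.14, 0.28),
    "GPT-5.4 Nano":      (0.20, 1.25),
    "Claude Haiku 4.5":  (1.00, 5.00),
    "DeepSeek V4 Pro":   (1.74, 3.48),
    "Gemini 3.1 Pro":    (2.00, 12.00),
    "GPT-5.5":           (5.00, 30.00),
}

# Hypothetical monthly workload, chosen only for illustration.
input_m, output_m = 50, 10  # millions of tokens in and out

for model, (p_in, p_out) in PRICING.items():
    cost = input_m * p_in + output_m * p_out
    print(f"{model:18s} ${cost:,.2f}/month")
```

On those rates, the Flash model comes out to $9.80 for the sample workload versus $22.50 for GPT-5.4 Nano, and the Pro model to $121.80 versus $220 for Gemini 3.1 Pro.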
And of course, in keeping with DeepSeek’s previous releases, V4 is MIT-licensed and open-weight, so if you have the resources to run it, it’s “free” in the way a movie on Netflix is “free.” Nobody is charging you at the moment you press play, but the meter is still running somewhere. In this case, it’s your electricity bill.