Model releases | VentureBeat | May 28, 2026

DeepSeek makes its V4 price cut permanent and pressures Western token margins

DeepSeek made a 75% V4 Pro price cut permanent, and VentureBeat argues that its cache and open-weight economics challenge closed-model API pricing.

DeepSeek's V4 Pro pricing is a major cost-curve story. VentureBeat reports that DeepSeek made its 75% price cut permanent and that V4 Pro is now far cheaper than comparable Western workhorse models, especially for cached context. According to the article, V4 Pro is 7x cheaper on inputs and 17x cheaper on outputs than some Western frontier alternatives, and cache-read pricing can be dramatically lower (a claimed 87x) when the model is hosted natively in China. The article also reports that V4 Flash reached No. 1 on OpenRouter over the prior week, with DeepSeek's top three models processing nearly 6T tokens there. The precise market-share implications are uncertain because OpenRouter is only a proxy for overall usage, but the pricing signal is real: autonomous agents consume huge volumes of cached context, and cheap open weights put pressure on premium API margins.

Key details: DeepSeek, V4 Pro, V4 Flash, 75% permanent price cut, 7x cheaper inputs claim, 17x cheaper outputs claim, 87x cache-read pricing claim, OpenRouter No. 1 V4 Flash.

