How much does the DeepSeek V4 API cost?

DeepSeek-V4-Flash costs $0.14 per 1M input tokens (cache miss), $0.0028 per 1M input tokens (cache hit), and $0.28 per 1M output tokens. DeepSeek-V4-Pro is $0.435 / $0.003625 / $0.87 during its limited-time 75% promo (valid through 2026/05/31 15:59 UTC); regular post-promo pricing is $1.74 / $0.0145 / $3.48.

How does DeepSeek V4 pricing compare to other providers?

DeepSeek-V4-Flash is one of the most affordable frontier-class APIs available. At $0.14 per 1M input tokens (cache miss) and $0.28 per 1M output tokens, it is significantly lower than GPT-5.5 ($5/$30), Claude Opus 4.7 ($5/$25), and Claude Sonnet 4.6 ($3/$15). With context caching enabled, input costs drop to $0.0028 per 1M — about 98% off cache-miss pricing. See the comparison table on this page for current side-by-side numbers.

What is DeepSeek context caching?

DeepSeek caches repeated prefixes on a best-effort basis, calculated in 64-token increments. When a request shares the same prefix as a previous one, those cached input tokens cost only $0.0028 per 1M (V4-Flash) — roughly 98% off the $0.14 cache-miss price. With V4, the cache-hit price was reduced to one-tenth of the launch price. Caching works automatically with no configuration needed.

What is the difference between deepseek-v4-flash and deepseek-v4-pro?

Both V4 models offer a 1M context window, 384K max output, and support both thinking and non-thinking modes. V4-Flash ($0.14 / $0.28 in/out per 1M) is optimized for speed and cost on general tasks. V4-Pro ($1.74 / $3.48 regular; $0.435 / $0.87 during the limited-time promo) is the advanced tier for the hardest reasoning, math, and code workloads. The legacy aliases deepseek-chat and deepseek-reasoner now map to V4-Flash non-thinking and thinking modes respectively, and will be deprecated.

Does DeepSeek offer free API credits?

Yes, DeepSeek provides free trial tokens to new users upon registration. No credit card is required. The free chat at chat.deepseek.com is also unlimited with no subscription.

How do I calculate my DeepSeek API costs?

Use the formula: Cost = (input_tokens / 1,000,000 × input_price) + (output_tokens / 1,000,000 × output_price). Factor in cache hit rate for input tokens to get a more accurate estimate. Our calculator above does this automatically.

DeepSeek API Cost Calculator

Estimate DeepSeek API costs before you build. This calculator helps you forecast per-request, daily, monthly, and yearly spend for both deepseek-v4-flash and deepseek-v4-pro using current DeepSeek V4 API pricing, including cache-hit discounts. The legacy deepseek-chat and deepseek-reasoner aliases now map to V4-Flash non-thinking and thinking modes.

V4-Flash and V4-Pro share a 1M context window and 384K max output, but pricing differs sharply: V4-Flash is the low-cost workhorse, while V4-Pro targets the hardest reasoning, math, and code workloads (currently 75% off through 2026/05/31 15:59 UTC). Use the calculator to compare workloads before you ship.

Model Fast tier — supports both non-thinking and thinking modes

Input Tokens per Request

Output Tokens per Request

Cache Hit Rate: 50%

0% (No Caching) 100% (Full Cache)

Caching applies to repeated prefixes only, is best-effort, and is calculated in 64-token increments. Learn more

Requests per Day

Requests per Month

Estimated Costs

Per Request $0.000

Daily Cost $0.00

Monthly Cost $0.00

Yearly Cost $0.00

Cost Breakdown per Request

Input (Cache Hit) $0.000

Input (Cache Miss) $0.000

Output $0.000

Total per Request $0.000

Cache-hit input is billed at 1/10 of the launch price on V4 — roughly a 98% discount versus cache-miss input on V4-Flash. You save $0.000/mo with your current cache rate.

All prices per 1 million tokens (USD). Last verified: May 1, 2026 — Source: DeepSeek Models & Pricing | DeepSeek Platform

Model	Input (Cache Hit)(2)	Input (Cache Miss)	Output	Context	Max Output
DeepSeek-V4-Flash(1) Fast tier — supports both non-thinking and thinking modes	$0.0028	$0.140	$0.28	1M	384K
DeepSeek-V4-Pro Advanced tier — 75% off until 2026/05/31 15:59 UTC; regular: $1.74 / $3.48	$0.003625(3)	$0.435(3)	$0.87(3)	1M	384K

1 The model names deepseek-chat and deepseek-reasoner will be deprecated in the future. For compatibility, they correspond to the non-thinking mode and thinking mode of deepseek-v4-flash, respectively.
2 For all models, the input cache hit price has been reduced to 1/10 of the launch price.
3 The deepseek-v4-pro model is currently offered at a limited-time 75% discount, valid until 2026/05/31 15:59 UTC.

💡 Cache Hit Discount: DeepSeek caches repeated prefixes on a best-effort basis, calculated in 64-token increments. With V4 the cache-hit price was reduced to 1/10 of the launch price: V4-Flash cache hits are billed at $0.0028/1M (about 98% off the $0.14 cache-miss rate). See official caching guide

DeepSeek-V4-Flash

JSON Output
Tool Calls
Chat Prefix (Beta)
FIM Completion (Beta)
Thinking Mode

DeepSeek-V4-Pro

JSON Output
Tool Calls
Chat Prefix (Beta)
FIM Completion (Beta)
Thinking Mode

Paste or type your text

0 Characters

0 Words

0 Est. Tokens

$0.000 Input Cost

Token count is an approximation (~3.3 characters per token for English, per DeepSeek's tokenization guide). Actual tokenization may vary by language and content type.

Need setup help? Read our DeepSeek API guide. Need the bigger picture? Start with our main DeepSeek AI guide.

DeepSeek API Cost Calculator