DeepSeek V4 Guide: Pro vs Flash, 1M Context & API Pricing

DeepSeek V4 Preview is the current official DeepSeek V4 release line. It includes two current API model IDs, deepseek-v4-pro and deepseek-v4-flash, with a 1M-token context window, 384K maximum output, thinking and non-thinking modes, OpenAI- and Anthropic-compatible API access, and official open-weight repositories.

Last verified against official DeepSeek sources: April 26, 2026.

Independent site notice: Chat-Deep.ai is an independent DeepSeek guide and browser access site. It is not affiliated with DeepSeek, DeepSeek.com, chat.deepseek.com, the official DeepSeek app, or the official DeepSeek developer platform. For production decisions, always verify model names, prices, limits, feature support, status, and deprecation notices in the official DeepSeek API documentation.

Current official V4 status: DeepSeek-V4 Preview is live, API-available, and open-sourced. New API integrations should use deepseek-v4-pro or deepseek-v4-flash.

Pricing rule for this page: this guide does not publish fixed token prices. API prices and promotions can change, so all pricing references point readers to the official DeepSeek Models & Pricing page.

Quick Answer: Is DeepSeek V4 Released?

Yes. DeepSeek V4 has launched as DeepSeek-V4 Preview. The safest wording is Preview Release, not final release. DeepSeek describes the preview as officially live, open-sourced, and available through the API using deepseek-v4-pro and deepseek-v4-flash.

DeepSeek-V4-Pro is the larger flagship V4 model with 1.6T total parameters and 49B active parameters. DeepSeek-V4-Flash is the faster and more economical V4 model with 284B total parameters and 13B active parameters. Both current V4 API models are documented with a 1M-token context window and 384K maximum output.

DeepSeek V4 Key Takeaways

  • Release status: DeepSeek V4 is live as DeepSeek-V4 Preview.
  • Current API model names: use deepseek-v4-pro or deepseek-v4-flash for new integrations.
  • API base URLs: use https://api.deepseek.com for OpenAI-compatible requests and https://api.deepseek.com/anthropic for Anthropic-compatible requests.
  • Context and output: both V4 API models are listed with 1M context and 384K maximum output.
  • Model roles: V4-Flash is the fast and economical model; V4-Pro is the stronger model for harder reasoning, coding, long-context analysis, and agentic workflows.
  • Pricing: do not hardcode prices from this guide. Use the official DeepSeek Models & Pricing page.
  • Legacy aliases: deepseek-chat and deepseek-reasoner currently route to V4-Flash non-thinking and thinking modes and are scheduled to be retired after July 24, 2026, 15:59 UTC.
  • Open weights: DeepSeek published V4 model repositories through the official DeepSeek-V4 Hugging Face collection.

DeepSeek V4 Guide Contents

  1. Release Date and Current Status
  2. DeepSeek V4 at a Glance
  3. Is DeepSeek V4 a 1T-Parameter Model?
  4. DeepSeek V4 Pro vs Flash
  5. Which DeepSeek V4 Model Should You Use?
  6. API Model Names and Base URLs
  7. deepseek-chat and deepseek-reasoner Migration
  8. API Examples
  9. DeepSeek V4 API Pricing Source
  10. Usage Tracking and Cost Control Without Hardcoded Prices
  11. 1M Context and 384K Output
  12. Open Weights and MIT License
  13. Architecture and Benchmark Highlights
  14. Thinking Modes, JSON, Tool Calls and Agents
  15. What Changed from DeepSeek V3.2 to V4?
  16. Developer Migration Checklist
  17. What Not to Overclaim
  18. FAQ
  19. Sources

DeepSeek V4 Release Date and Current Status

DeepSeek V4 Preview was announced on April 24, 2026. DeepSeek’s official release note says DeepSeek-V4 Preview is live and open-sourced, and that the API is available now. It also says developers can keep the same base URL and update the model parameter to deepseek-v4-pro or deepseek-v4-flash.

For fast-moving launch updates, outages, or later API changes, check the official DeepSeek change log, the official DeepSeek Service Status page, and the local DeepSeek Status guide.

DeepSeek V4 at a Glance

Topic | Current DeepSeek V4 Detail
Release name | DeepSeek-V4 Preview
Release date | April 24, 2026
Main API models | deepseek-v4-pro and deepseek-v4-flash
OpenAI-compatible base URL | https://api.deepseek.com
Anthropic-compatible base URL | https://api.deepseek.com/anthropic
V4-Pro size | 1.6T total parameters / 49B active parameters
V4-Flash size | 284B total parameters / 13B active parameters
Context length | 1M tokens
Maximum output | 384K tokens
Modes | Thinking and non-thinking modes
Supported API features | JSON Output, Tool Calls, Chat Prefix Completion (Beta), and FIM Completion (Beta, non-thinking mode only)
Pricing source | Official DeepSeek Models & Pricing
Open weights | Published through DeepSeek’s official Hugging Face collection
Legacy aliases | deepseek-chat and deepseek-reasoner route to V4-Flash compatibility modes during the transition period

Is DeepSeek V4 a 1T-Parameter Model?

You may see DeepSeek V4 described online as a “1T parameter model,” but that is only rough shorthand. The official V4 details are more precise:

  • DeepSeek-V4-Pro: 1.6T total parameters and 49B active parameters.
  • DeepSeek-V4-Flash: 284B total parameters and 13B active parameters.

For accurate technical content, avoid saying “DeepSeek V4 is a 1T model” as if there is only one V4 model. The better wording is: DeepSeek V4 Preview includes V4-Pro at 1.6T total parameters and V4-Flash at 284B total parameters.

DeepSeek V4 Pro vs Flash: Key Differences

Feature | DeepSeek-V4-Pro | DeepSeek-V4-Flash
API model name | deepseek-v4-pro | deepseek-v4-flash
Total parameters | 1.6T | 284B
Activated parameters | 49B | 13B
Context length | 1M tokens | 1M tokens
Maximum output | 384K tokens | 384K tokens
Thinking modes | Non-thinking, Think High, and Think Max through effort controls | Non-thinking, Think High, and Think Max through effort controls
Best for | Advanced reasoning, difficult coding, long-context analysis, agentic workflows, and higher-value production tests | Fast chat, summaries, extraction, support assistants, simpler agents, and high-volume workloads
Pricing source | Verify current official pricing | Verify current official pricing
Simple rule | Use when answer quality and reasoning depth matter most. | Use as the default starting point, then escalate difficult tasks to Pro.

The safest practical summary is: DeepSeek-V4-Pro is the stronger flagship model; DeepSeek-V4-Flash is the faster and more economical model. Start with Flash for normal workloads, then route more complex reasoning, coding, agent, and long-context tasks to Pro when quality justifies the switch.

Which DeepSeek V4 Model Should You Use?

Use this decision table before choosing between deepseek-v4-pro and deepseek-v4-flash.

Use Case | Recommended Model | Why
Everyday chat, quick answers, rewriting, basic explanations | deepseek-v4-flash | Fast and economical enough for most routine outputs.
Customer support bot with many conversations | deepseek-v4-flash | Good default for high-volume workflows where speed and efficiency matter.
Extraction, classification, structured summaries | deepseek-v4-flash | Start with the efficient model and validate output quality against your schema.
Large document summarization | Start with deepseek-v4-flash, escalate to deepseek-v4-pro for complex synthesis | Both support 1M context, but Pro may be better for harder reasoning across documents.
Code review, debugging, complex software planning | deepseek-v4-pro | Better fit for higher-value coding and reasoning tasks.
Agentic coding, tool use, multi-step workflows | deepseek-v4-pro for hard tasks; Flash for simple agent steps | This balances capability, latency, and budget control.
Production system with mixed complexity | Use a router: Flash first, Pro for escalation | Route by task difficulty instead of forcing every request through one model.
Strict budget control | Use deepseek-v4-flash first and verify official pricing before scale-up | Do not estimate budget from old or copied pricing snippets.
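The router row in the table above can be sketched in a few lines. The keyword heuristic and the 200K-token escalation threshold below are hypothetical placeholders, not official DeepSeek guidance; a production router might classify by task type, prompt length, or a lightweight classifier model instead.

```python
# Minimal model-router sketch: default to V4-Flash, escalate hard tasks to V4-Pro.
# The keyword list and token threshold are illustrative assumptions only.

HARD_TASK_HINTS = ("debug", "refactor", "prove", "architecture", "multi-step")

def choose_model(prompt: str, doc_tokens: int = 0) -> str:
    """Return a DeepSeek V4 model name based on a rough difficulty heuristic."""
    looks_hard = any(hint in prompt.lower() for hint in HARD_TASK_HINTS)
    heavy_context = doc_tokens > 200_000  # escalate heavy long-context synthesis
    if looks_hard or heavy_context:
        return "deepseek-v4-pro"
    return "deepseek-v4-flash"

print(choose_model("Summarize this support ticket"))           # deepseek-v4-flash
print(choose_model("Debug this race condition in the queue"))  # deepseek-v4-pro
```

The point of keeping this logic in one function is that the escalation policy can be tuned (or feature-flagged) without touching individual call sites.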

DeepSeek V4 API Model Names and Base URLs

The official V4 API model names are:

  • deepseek-v4-pro
  • deepseek-v4-flash

The OpenAI-compatible base URL remains:

https://api.deepseek.com

The Anthropic-compatible base URL is:

https://api.deepseek.com/anthropic

For most OpenAI-compatible tooling, you can keep your base URL and update the model name. For Claude Code or Anthropic-compatible ecosystems, use the Anthropic-compatible base URL and a current V4 model name. See the local DeepSeek API guide for a broader integration walkthrough.

What Happens to deepseek-chat and deepseek-reasoner?

The legacy names deepseek-chat and deepseek-reasoner are no longer the best model names for new V4 API integrations. During the current transition period:

  • deepseek-chat currently routes to the non-thinking mode of DeepSeek-V4-Flash.
  • deepseek-reasoner currently routes to the thinking mode of DeepSeek-V4-Flash.

DeepSeek’s V4 release note says these two legacy API model names will be fully retired and inaccessible after July 24, 2026, 15:59 UTC. New integrations should use deepseek-v4-pro or deepseek-v4-flash directly.

Name | Status | Current Mapping | Recommended Action
deepseek-v4-pro | Current V4 API model | DeepSeek-V4-Pro | Use for stronger reasoning, coding, long-context, and agentic workloads.
deepseek-v4-flash | Current V4 API model | DeepSeek-V4-Flash | Use for fast, economical, high-volume workloads.
deepseek-chat | Legacy compatibility alias | V4-Flash non-thinking mode | Replace with deepseek-v4-flash unless you intentionally need temporary compatibility.
deepseek-reasoner | Legacy compatibility alias | V4-Flash thinking mode | Replace with deepseek-v4-flash or deepseek-v4-pro plus thinking settings.

DeepSeek V4 API Examples

OpenAI-Compatible cURL Example

This example uses the current V4 model name directly:

curl https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_DEEPSEEK_API_KEY" \
  -d '{
    "model": "deepseek-v4-pro",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain the difference between DeepSeek V4 Pro and Flash."}
    ],
    "reasoning_effort": "high",
    "thinking": {"type": "enabled"},
    "stream": false
  }'

OpenAI SDK Python Example

When using the OpenAI SDK, DeepSeek-specific fields such as thinking should be passed through extra_body.

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",
    base_url="https://api.deepseek.com"
)

response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[
        {"role": "user", "content": "Create a migration checklist for DeepSeek V4."}
    ],
    reasoning_effort="high",
    extra_body={"thinking": {"type": "enabled"}}
)

print(response.choices[0].message.content)

Anthropic-Compatible Example

DeepSeek also supports an Anthropic-compatible API format through a separate base URL:

First point the Anthropic SDK at DeepSeek's compatible endpoint (shell):

export ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic
export ANTHROPIC_API_KEY=YOUR_DEEPSEEK_API_KEY

Then make the request (Python):

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="deepseek-v4-pro",
    max_tokens=1000,
    system="You are a helpful assistant.",
    messages=[
        {
            "role": "user",
            "content": [{"type": "text", "text": "Summarize DeepSeek V4 in 5 bullets."}]
        }
    ]
)

print(message.content)

For fast, economical workloads, replace deepseek-v4-pro with deepseek-v4-flash. Always keep API keys server-side and out of public code, screenshots, analytics, or browser bundles.

DeepSeek V4 API Pricing Source

This page intentionally does not publish fixed DeepSeek token prices. DeepSeek API pricing is a live billing detail, and it can change because of launch pricing, promotions, discounts, product updates, or future model changes.

Official pricing source: DeepSeek Models & Pricing.

Use that official page for current token prices, any limited-time promotions, context length, maximum output, feature matrix, and model availability. Use this Chat-Deep.ai page for model selection and migration guidance, not as the final billing source.

If you are planning an API product, you can also use the local DeepSeek API pricing guide and DeepSeek API cost calculator for explanation and budgeting workflow, then confirm the actual rates against DeepSeek’s official pricing page before shipping.

Usage Tracking and Cost Control Without Hardcoded Prices

Even without hardcoding prices in this article, developers still need a reliable way to estimate and control API spend. The safe approach is to track token categories and apply the current official rates from DeepSeek’s pricing page at the time of calculation.

The general calculation pattern is:

Estimated request cost =
  cache_hit_input_tokens × current official cache-hit input rate
+ cache_miss_input_tokens × current official cache-miss input rate
+ output_tokens × current official output rate

Track these fields in production where available:

  • usage.prompt_tokens
  • usage.prompt_cache_hit_tokens
  • usage.prompt_cache_miss_tokens
  • usage.completion_tokens
  • usage.completion_tokens_details.reasoning_tokens
  • usage.total_tokens

Cost control is not only a pricing-table issue. It also depends on model routing, prompt length, retrieved context size, cache-hit rate, output limits, thinking effort, retries, tool loops, and how often you escalate from V4-Flash to V4-Pro.
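The calculation pattern above can be expressed as a small helper. Nothing here hardcodes a price: the rates dict is loaded at calculation time from your own configuration after checking the official pricing page, and the example rates shown below are made up purely for illustration.

```python
# Sketch of the cost formula above. Rates are supplied at calculation time,
# never baked into code or documentation.

def estimate_request_cost(usage: dict, rates: dict) -> float:
    """usage: token counts from the API response; rates: current official per-token prices."""
    cache_hit = usage.get("prompt_cache_hit_tokens", 0)
    cache_miss = usage.get("prompt_cache_miss_tokens",
                           usage.get("prompt_tokens", 0) - cache_hit)
    output = usage.get("completion_tokens", 0)
    return (cache_hit * rates["cache_hit_input"]
            + cache_miss * rates["cache_miss_input"]
            + output * rates["output"])

# Made-up per-token rates, for illustration only -- load real ones from config.
rates = {"cache_hit_input": 1e-7, "cache_miss_input": 5e-7, "output": 2e-6}
usage = {"prompt_tokens": 12_000, "prompt_cache_hit_tokens": 8_000,
         "prompt_cache_miss_tokens": 4_000, "completion_tokens": 1_500}
print(estimate_request_cost(usage, rates))
```

Logging this estimate per request, alongside the raw usage fields, lets you reconcile against the provider's bill later even if rates change mid-month.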

DeepSeek V4 Context Length and Max Output

The official V4 API documentation lists a 1M-token context length and a 384K maximum output for the current V4 API models.

A 1M-token context window can help with long-document analysis, transcript processing, codebase review, research workflows, legal or technical document review, and agents that need a large working memory. However, a larger context window does not automatically guarantee better answers. You should still test retrieval accuracy, prompt structure, latency, cost, tool calls, and structured output behavior before moving production workloads to V4.
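As a minimal sketch, a pre-flight check can reject requests that cannot fit the documented limits before you pay for a failed call. The 4-characters-per-token estimate below is a rough stand-in for a real tokenizer, and since providers differ on whether generated output counts against the context window, this sketch conservatively reserves the output budget inside the 1M limit.

```python
# Pre-flight budget check against the documented V4 limits
# (1M-token context, 384K maximum output). Assumptions: crude 4-chars-per-token
# estimate, and output reserved inside the context window (conservative).

CONTEXT_LIMIT = 1_000_000
MAX_OUTPUT_LIMIT = 384_000

def fits_context(prompt_text: str, reserved_output_tokens: int) -> bool:
    """Return True if the request plausibly fits the documented V4 limits."""
    if reserved_output_tokens > MAX_OUTPUT_LIMIT:
        return False
    est_prompt_tokens = len(prompt_text) // 4  # replace with a real tokenizer
    return est_prompt_tokens + reserved_output_tokens <= CONTEXT_LIMIT
```

In production you would swap the character-count estimate for an actual tokenizer count and log near-limit requests, since those are the ones most likely to degrade in quality or latency.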

DeepSeek V4 Open Weights and MIT License

DeepSeek published an official DeepSeek-V4 Hugging Face collection. The DeepSeek-V4-Pro model card states that the repository and model weights are licensed under the MIT License.

Model | Total Parameters | Activated Parameters | Context Length | Precision | Official Repository
DeepSeek-V4-Flash-Base | 284B | 13B | 1M | FP8 Mixed | Hugging Face
DeepSeek-V4-Flash | 284B | 13B | 1M | FP4 + FP8 Mixed | Hugging Face
DeepSeek-V4-Pro-Base | 1.6T | 49B | 1M | FP8 Mixed | Hugging Face
DeepSeek-V4-Pro | 1.6T | 49B | 1M | FP4 + FP8 Mixed | Hugging Face

The official model card explains that FP4 + FP8 mixed precision uses FP4 for MoE expert parameters and FP8 for most other parameters. Open weights are useful for research and infrastructure teams, but local deployment still requires serious hardware, careful inference setup, and close reading of the official model card.

DeepSeek V4 Architecture and Benchmark Highlights

DeepSeek describes V4 as a preview series of Mixture-of-Experts language models. The V4 model card lists several architecture and optimization upgrades:

  • Mixture-of-Experts architecture: only a subset of parameters is activated per token, while the model keeps large total capacity.
  • Hybrid Attention Architecture: DeepSeek combines Compressed Sparse Attention and Heavily Compressed Attention to improve long-context efficiency.
  • Manifold-Constrained Hyper-Connections: mHC is used to strengthen residual connections and improve training stability.
  • Muon optimizer: DeepSeek says it uses the Muon optimizer for faster convergence and stability.
  • Large-scale pretraining: DeepSeek says V4 models were pretrained on more than 32T diverse and high-quality tokens.

DeepSeek-Reported Benchmark Highlights

The numbers below are DeepSeek-reported model-card results. Treat them as vendor-reported benchmarks, not a replacement for your own production tests.

Benchmark / Metric | DeepSeek-V4-Pro Max | Why It Matters
GPQA Diamond | 90.1 | Advanced reasoning and science-heavy QA.
LiveCodeBench | 93.5 | Coding performance under benchmark conditions.
SWE Verified | 80.6 | Software engineering task resolution.
Terminal Bench 2.0 | 67.9 | Agentic command-line and terminal workflows.
MRCR 1M | 83.5 | Long-context reasoning at 1M context scale.

For real applications, test DeepSeek V4 against your own prompts, data, safety constraints, latency requirements, tool-calling patterns, budget limits, and acceptance criteria.

Thinking Modes, Agentic Coding, JSON Output and Tool Calls

DeepSeek V4 supports both thinking and non-thinking modes. In the OpenAI-compatible format, the thinking toggle uses {"thinking": {"type": "enabled"}} or {"thinking": {"type": "disabled"}}. Thinking effort can be controlled with reasoning_effort.

Feature | DeepSeek V4 Support | Implementation Note
Thinking mode | Supported | Use thinking and reasoning_effort settings.
Non-thinking mode | Supported | Useful for faster routine tasks.
JSON Output | Supported | Set response_format, include the word “json” in the prompt, and provide the target shape.
Tool / Function calling | Supported | Useful for agents, external APIs, and structured workflows. Validate tool arguments before execution.
Chat Prefix Completion | Supported (Beta) | Use official Beta documentation before production.
FIM Completion | Supported in non-thinking mode only (Beta) | Useful for code completion workflows.
Anthropic API format | Supported | Use https://api.deepseek.com/anthropic.

Important implementation note: DeepSeek’s thinking guide says thinking mode does not support temperature, top_p, presence_penalty, or frequency_penalty. For compatibility, setting these parameters may not raise an error, but they may simply have no effect in thinking mode.
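The rules above can be enforced client-side with a small payload builder. This is an illustrative sketch, not an official SDK helper: it assumes the field names documented in this guide (response_format and the thinking toggle) and drops the sampling parameters that thinking mode ignores, so requests never silently carry dead settings.

```python
# Payload-builder sketch enforcing this guide's JSON Output and thinking-mode
# rules client-side. Field names follow the article; the builder itself is
# an assumption, not part of any official SDK.

UNSUPPORTED_IN_THINKING = {"temperature", "top_p", "presence_penalty", "frequency_penalty"}

def build_payload(model: str, prompt: str, thinking: bool, json_output: bool, **sampling):
    if json_output and "json" not in prompt.lower():
        raise ValueError('JSON Output mode expects the word "json" in the prompt')
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "thinking": {"type": "enabled" if thinking else "disabled"},
    }
    if json_output:
        payload["response_format"] = {"type": "json_object"}
    for key, value in sampling.items():
        if thinking and key in UNSUPPORTED_IN_THINKING:
            continue  # ignored by thinking mode, so drop client-side
        payload[key] = value
    return payload
```

Stripping the unsupported keys locally keeps request logs honest: what you send is exactly what the model can act on.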

What Changed from DeepSeek V3.2 to DeepSeek V4 Preview?

DeepSeek V3.2 remains historically important, but it should no longer be described as the current hosted API model after the V4 Preview update. Current V4 API content should use the V4 model names directly.

Topic | DeepSeek V3.2 Historical Position | DeepSeek V4 Preview Position
Release status | Released in December 2025 as a prior open-weight model generation. | Released on April 24, 2026 as DeepSeek-V4 Preview.
Hosted API mapping | Historically mapped to deepseek-chat and deepseek-reasoner after the V3.2 release. | Current V4 API models are deepseek-v4-pro and deepseek-v4-flash.
Legacy aliases | Older articles may describe aliases as V3.2. | The aliases now route to V4-Flash modes during the transition period.
Context length | Often discussed around earlier 128K-context API content. | Official V4 API documentation lists 1M context.
Pricing | Older V3.2 pricing snippets should be treated as historical. | Use the current official V4 pricing page before production use.

Developer Migration Checklist for DeepSeek V4

  • Replace new production uses of deepseek-chat with deepseek-v4-flash where Flash is suitable.
  • Use deepseek-v4-pro for higher-value reasoning, coding, long-context, and agentic workloads.
  • Audit documentation, SDK wrappers, environment variables, examples, and internal dashboards for old model names.
  • Remove static pricing snippets from evergreen pages and link to the official DeepSeek pricing page instead.
  • Track cache-hit input tokens, cache-miss input tokens, output tokens, and reasoning tokens where available.
  • Test thinking mode and reasoning_effort before production rollout.
  • Validate JSON Output and tool/function calling behavior with your own schemas.
  • Measure latency separately for short prompts, long prompts, and 1M-context workflows.
  • Use feature flags or staged rollout so you can compare V4-Flash and V4-Pro safely.
  • Check the official DeepSeek change log before the July 24, 2026 legacy-alias retirement date.
  • Update internal content so no current page says V3.2 is the current hosted API model.

Copy-Paste Migration Mapping

Old model: deepseek-chat
Current recommended replacement: deepseek-v4-flash
Reason: deepseek-chat is a legacy alias currently routing to V4-Flash non-thinking mode.

Old model: deepseek-reasoner
Current recommended replacement: deepseek-v4-flash or deepseek-v4-pro with thinking enabled
Reason: deepseek-reasoner is a legacy alias currently routing to V4-Flash thinking mode.

High-value reasoning / coding / long-context tasks:
Use: deepseek-v4-pro

High-volume chat / support / summaries / cheaper agent steps:
Use: deepseek-v4-flash
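The mapping above can be captured as a lookup helper so old call sites migrate mechanically. This is an illustrative sketch; the alias behavior it encodes comes from the tables in this guide, and the helper names are assumptions, not DeepSeek APIs.

```python
# The copy-paste migration mapping above as code: resolve a legacy alias to
# its current V4 replacement, preserving the alias's thinking-mode behavior.

LEGACY_ALIASES = {
    "deepseek-chat":     {"model": "deepseek-v4-flash", "thinking": False},
    "deepseek-reasoner": {"model": "deepseek-v4-flash", "thinking": True},
}
CURRENT_MODELS = ("deepseek-v4-pro", "deepseek-v4-flash")

def migrate_model_name(name: str) -> dict:
    """Resolve a legacy alias; current V4 names pass through unchanged."""
    if name in CURRENT_MODELS:
        return {"model": name, "thinking": None}  # caller chooses thinking mode
    if name in LEGACY_ALIASES:
        return dict(LEGACY_ALIASES[name])
    raise ValueError(f"Unknown model name: {name}")
```

Running this at client startup (rather than per request) makes it easy to log every place a legacy alias still appears before the retirement date.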

What Not to Overclaim About DeepSeek V4

Avoid This Claim | Use This Safer Wording
“DeepSeek V4 final version has launched.” | “DeepSeek V4 has launched as a Preview Release.”
“DeepSeek V4 is a 1T-parameter model.” | “DeepSeek V4 Preview includes V4-Pro at 1.6T total parameters and V4-Flash at 284B total parameters.”
“deepseek-chat currently means DeepSeek V3.2.” | “deepseek-chat is now a legacy alias that currently routes to V4-Flash non-thinking mode.”
“deepseek-reasoner currently means DeepSeek V3.2.” | “deepseek-reasoner is now a legacy alias that currently routes to V4-Flash thinking mode.”
“DeepSeek-V4-Pro-Max is a separate API model.” | “DeepSeek-V4-Pro-Max is best described as the maximum reasoning effort mode of DeepSeek-V4-Pro unless DeepSeek documents it as a separate API model name.”
“DeepSeek V4 pricing will not change.” | “Verify the official DeepSeek Models & Pricing page before production budgeting.”
“Every open DeepSeek model has the same license.” | “Check the exact official model card or repository for the model you plan to use.”
“The hosted API, web app, and open weights are always identical.” | “Treat hosted API behavior, web/app behavior, and local open-weight deployment as separate surfaces.”

For more context, see the DeepSeek AI guide, DeepSeek Chat page, DeepSeek API guide, DeepSeek API pricing guide, DeepSeek API cost calculator, DeepSeek Models hub, DeepSeek V3.2 historical guide, and DeepSeek Status guide.

DeepSeek V4 FAQ

Is DeepSeek V4 officially released?

Yes. DeepSeek V4 has officially launched as DeepSeek-V4 Preview on April 24, 2026. The correct wording is Preview Release, not final release.

What are the official DeepSeek V4 API model names?

The official V4 API model names are deepseek-v4-pro and deepseek-v4-flash. New integrations should use these names directly.

What is DeepSeek-V4-Pro?

DeepSeek-V4-Pro is the larger V4 Preview model. DeepSeek lists it as 1.6T total parameters with 49B activated parameters and positions it for advanced reasoning, coding, knowledge-heavy, long-context, and agentic workloads.

What is DeepSeek-V4-Flash?

DeepSeek-V4-Flash is the smaller, faster, and more economical V4 Preview model. DeepSeek lists it as 284B total parameters with 13B activated parameters.

Is DeepSeek V4 a 1T-parameter model?

Not exactly. The precise official figures are 1.6T total parameters for V4-Pro and 284B total parameters for V4-Flash. Calling DeepSeek V4 simply a “1T model” is less accurate.

What is the DeepSeek V4 context length?

The official V4 API documentation lists a 1M-token context length for the current V4 API models.

What is the DeepSeek V4 maximum output limit?

The official DeepSeek Models & Pricing page lists a maximum output limit of 384K tokens for the current V4 API models.

Where should I verify DeepSeek V4 API pricing?

Verify current prices on the official DeepSeek Models & Pricing page. This guide avoids fixed prices because rates and promotions can change.

Is DeepSeek V4 open source?

DeepSeek describes V4 Preview as open-sourced and published official V4 model repositories through Hugging Face. The DeepSeek-V4-Pro model card states that the repository and model weights are licensed under the MIT License. Always check the exact model card before self-hosting or commercial deployment.

Does deepseek-chat still mean DeepSeek V3.2?

No. After the V4 Preview update, deepseek-chat is a legacy compatibility alias that currently routes to DeepSeek-V4-Flash non-thinking mode.

Does deepseek-reasoner still mean DeepSeek V3.2?

No. After the V4 Preview update, deepseek-reasoner is a legacy compatibility alias that currently routes to DeepSeek-V4-Flash thinking mode.

Is DeepSeek-V4-Pro-Max a separate API model?

For API documentation, the safe wording is that DeepSeek-V4-Pro-Max is the maximum reasoning effort mode of DeepSeek-V4-Pro unless DeepSeek documents it as a separate API model name. For new API calls, use deepseek-v4-pro or deepseek-v4-flash.

Is Chat-Deep.ai the official DeepSeek website?

No. Chat-Deep.ai is an independent DeepSeek guide and browser access site. It is not affiliated with DeepSeek, DeepSeek.com, chat.deepseek.com, the official DeepSeek app, or the official DeepSeek developer platform.