DeepSeek V4 Preview is the current official DeepSeek V4 release line. It includes two current API model IDs, deepseek-v4-pro and deepseek-v4-flash, each with a 1M-token context window, 384K maximum output, thinking and non-thinking modes, OpenAI- and Anthropic-compatible API access, and official open-weight repositories.
Last verified against official DeepSeek sources: April 26, 2026.
Independent site notice: Chat-Deep.ai is an independent DeepSeek guide and browser access site. It is not affiliated with DeepSeek, DeepSeek.com, chat.deepseek.com, the official DeepSeek app, or the official DeepSeek developer platform. For production decisions, always verify model names, prices, limits, feature support, status, and deprecation notices in the official DeepSeek API documentation.
Current official V4 status: DeepSeek-V4 Preview is live, API-available, and open-sourced. New API integrations should use deepseek-v4-pro or deepseek-v4-flash.
Pricing rule for this page: this guide does not publish fixed token prices. API prices and promotions can change, so all pricing references point readers to the official DeepSeek Models & Pricing page.
Quick Answer: Is DeepSeek V4 Released?
Yes. DeepSeek V4 has launched as DeepSeek-V4 Preview. The safest wording is Preview Release, not final release. DeepSeek describes the preview as officially live, open-sourced, and available through the API using deepseek-v4-pro and deepseek-v4-flash.
DeepSeek-V4-Pro is the larger flagship V4 model with 1.6T total parameters and 49B active parameters. DeepSeek-V4-Flash is the faster and more economical V4 model with 284B total parameters and 13B active parameters. Both current V4 API models are documented with a 1M-token context window and 384K maximum output.
DeepSeek V4 Key Takeaways
- Release status: DeepSeek V4 is live as DeepSeek-V4 Preview.
- Current API model names: use deepseek-v4-pro or deepseek-v4-flash for new integrations.
- API base URLs: use https://api.deepseek.com for OpenAI-compatible requests and https://api.deepseek.com/anthropic for Anthropic-compatible requests.
- Context and output: both V4 API models are listed with 1M context and 384K maximum output.
- Model roles: V4-Flash is the fast and economical model; V4-Pro is the stronger model for harder reasoning, coding, long-context analysis, and agentic workflows.
- Pricing: do not hardcode prices from this guide. Use the official DeepSeek Models & Pricing page.
- Legacy aliases: deepseek-chat and deepseek-reasoner currently route to V4-Flash non-thinking and thinking modes and are scheduled to be retired after July 24, 2026, 15:59 UTC.
- Open weights: DeepSeek published V4 model repositories through the official DeepSeek-V4 Hugging Face collection.
DeepSeek V4 Guide Contents
- Release Date and Current Status
- DeepSeek V4 at a Glance
- Is DeepSeek V4 a 1T-Parameter Model?
- DeepSeek V4 Pro vs Flash
- Which DeepSeek V4 Model Should You Use?
- API Model Names and Base URLs
- deepseek-chat and deepseek-reasoner Migration
- API Examples
- DeepSeek V4 API Pricing Source
- Usage Tracking and Cost Control Without Hardcoded Prices
- 1M Context and 384K Output
- Open Weights and MIT License
- Architecture and Benchmark Highlights
- Thinking Modes, JSON, Tool Calls and Agents
- What Changed from DeepSeek V3.2 to V4?
- Developer Migration Checklist
- What Not to Overclaim
- FAQ
- Sources
DeepSeek V4 Release Date and Current Status
DeepSeek V4 Preview was announced on April 24, 2026. DeepSeek’s official release note says DeepSeek-V4 Preview is live and open-sourced, and that the API is available now. It also says developers can keep the same base URL and update the model parameter to deepseek-v4-pro or deepseek-v4-flash.
For fast-moving launch updates, outages, or later API changes, check the official DeepSeek change log, the official DeepSeek Service Status page, and the local DeepSeek Status guide.
DeepSeek V4 at a Glance
| Topic | Current DeepSeek V4 Detail |
|---|---|
| Release name | DeepSeek-V4 Preview |
| Release date | April 24, 2026 |
| Main API models | deepseek-v4-pro and deepseek-v4-flash |
| OpenAI-compatible base URL | https://api.deepseek.com |
| Anthropic-compatible base URL | https://api.deepseek.com/anthropic |
| V4-Pro size | 1.6T total parameters / 49B active parameters |
| V4-Flash size | 284B total parameters / 13B active parameters |
| Context length | 1M tokens |
| Maximum output | 384K tokens |
| Modes | Thinking and non-thinking modes |
| Supported API features | JSON Output, Tool Calls, Chat Prefix Completion (Beta), and FIM Completion (Beta, non-thinking mode only) |
| Pricing source | Official DeepSeek Models & Pricing |
| Open weights | Published through DeepSeek’s official Hugging Face collection |
| Legacy aliases | deepseek-chat and deepseek-reasoner route to V4-Flash compatibility modes during the transition period |
Is DeepSeek V4 a 1T-Parameter Model?
You may see DeepSeek V4 described online as a “1T parameter model,” but that is only rough shorthand. The official V4 details are more precise:
- DeepSeek-V4-Pro: 1.6T total parameters and 49B active parameters.
- DeepSeek-V4-Flash: 284B total parameters and 13B active parameters.
For accurate technical content, avoid saying “DeepSeek V4 is a 1T model” as if there is only one V4 model. The better wording is: DeepSeek V4 Preview includes V4-Pro at 1.6T total parameters and V4-Flash at 284B total parameters.
DeepSeek V4 Pro vs Flash: Key Differences
| Feature | DeepSeek-V4-Pro | DeepSeek-V4-Flash |
|---|---|---|
| API model name | deepseek-v4-pro | deepseek-v4-flash |
| Total parameters | 1.6T | 284B |
| Activated parameters | 49B | 13B |
| Context length | 1M tokens | 1M tokens |
| Maximum output | 384K tokens | 384K tokens |
| Thinking modes | Non-thinking, Think High, and Think Max through effort controls | Non-thinking, Think High, and Think Max through effort controls |
| Best for | Advanced reasoning, difficult coding, long-context analysis, agentic workflows, and higher-value production tests | Fast chat, summaries, extraction, support assistants, simpler agents, and high-volume workloads |
| Pricing source | Verify current official pricing | Verify current official pricing |
| Simple rule | Use when answer quality and reasoning depth matter most. | Use as the default starting point, then escalate difficult tasks to Pro. |
The safest practical summary is: DeepSeek-V4-Pro is the stronger flagship model; DeepSeek-V4-Flash is the faster and more economical model. Start with Flash for normal workloads, then route more complex reasoning, coding, agent, and long-context tasks to Pro when quality justifies the switch.
Which DeepSeek V4 Model Should You Use?
Use this decision table before choosing between deepseek-v4-pro and deepseek-v4-flash.
| Use Case | Recommended Model | Why |
|---|---|---|
| Everyday chat, quick answers, rewriting, basic explanations | deepseek-v4-flash | Fast and economical enough for most routine outputs. |
| Customer support bot with many conversations | deepseek-v4-flash | Good default for high-volume workflows where speed and efficiency matter. |
| Extraction, classification, structured summaries | deepseek-v4-flash | Start with the efficient model and validate output quality against your schema. |
| Large document summarization | Start with deepseek-v4-flash, escalate to deepseek-v4-pro for complex synthesis | Both support 1M context, but Pro may be better for harder reasoning across documents. |
| Code review, debugging, complex software planning | deepseek-v4-pro | Better fit for higher-value coding and reasoning tasks. |
| Agentic coding, tool use, multi-step workflows | deepseek-v4-pro for hard tasks; Flash for simple agent steps | This balances capability, latency, and budget control. |
| Production system with mixed complexity | Use a router: Flash first, Pro for escalation | Route by task difficulty instead of forcing every request through one model. |
| Strict budget control | Use deepseek-v4-flash first and verify official pricing before scale-up | Do not estimate budget from old or copied pricing snippets. |
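The Flash-first routing rule in the table above can be sketched as a minimal Python router. The keyword heuristic and the 200K-token escalation threshold are illustrative assumptions for this sketch, not DeepSeek features; replace them with your own difficulty classifier or task metadata.

```python
# Minimal sketch of a Flash-first model router using the official
# V4 model names. The difficulty heuristic is a placeholder.

FLASH = "deepseek-v4-flash"
PRO = "deepseek-v4-pro"

# Illustrative signals that a request may need the stronger model.
ESCALATE_KEYWORDS = ("debug", "refactor", "prove", "multi-step", "architecture")


def pick_model(prompt: str, context_tokens: int = 0) -> str:
    """Route to Flash by default; escalate hard or long-context tasks to Pro."""
    if context_tokens > 200_000:  # large working set: long-context analysis
        return PRO
    if any(word in prompt.lower() for word in ESCALATE_KEYWORDS):
        return PRO
    return FLASH


print(pick_model("Summarize this support ticket"))           # deepseek-v4-flash
print(pick_model("Debug this race condition in the queue"))  # deepseek-v4-pro
```

In production, a router like this usually sits behind a feature flag so Flash and Pro results can be compared on the same traffic before escalation rules are locked in.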
DeepSeek V4 API Model Names and Base URLs
The official V4 API model names are:
- deepseek-v4-pro
- deepseek-v4-flash
The OpenAI-compatible base URL remains:
https://api.deepseek.com
The Anthropic-compatible base URL is:
https://api.deepseek.com/anthropic
For most OpenAI-compatible tooling, you can keep your base URL and update the model name. For Claude Code or Anthropic-compatible ecosystems, use the Anthropic-compatible base URL and a current V4 model name. See the local DeepSeek API guide for a broader integration walkthrough.
What Happens to deepseek-chat and deepseek-reasoner?
The legacy names deepseek-chat and deepseek-reasoner are no longer the best model names for new V4 API integrations. During the current transition period:
- deepseek-chat currently routes to the non-thinking mode of DeepSeek-V4-Flash.
- deepseek-reasoner currently routes to the thinking mode of DeepSeek-V4-Flash.
DeepSeek’s V4 release note says these two legacy API model names will be fully retired and inaccessible after July 24, 2026, 15:59 UTC. New integrations should use deepseek-v4-pro or deepseek-v4-flash directly.
| Name | Status | Current Mapping | Recommended Action |
|---|---|---|---|
| deepseek-v4-pro | Current V4 API model | DeepSeek-V4-Pro | Use for stronger reasoning, coding, long-context, and agentic workloads. |
| deepseek-v4-flash | Current V4 API model | DeepSeek-V4-Flash | Use for fast, economical, high-volume workloads. |
| deepseek-chat | Legacy compatibility alias | V4-Flash non-thinking mode | Replace with deepseek-v4-flash unless you intentionally need temporary compatibility. |
| deepseek-reasoner | Legacy compatibility alias | V4-Flash thinking mode | Replace with deepseek-v4-flash or deepseek-v4-pro plus thinking settings. |
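The alias mapping above can be expressed as a small migration helper. This is an illustrative pattern, not an official DeepSeek tool: the dict mirrors the OpenAI-compatible request body, and the thinking values follow the documented {"type": "enabled"} / {"type": "disabled"} toggle.

```python
# Illustrative migration helper for the legacy-alias table above.
# It rewrites a request body that still uses a legacy model name
# into the equivalent current V4 model plus thinking toggle.

LEGACY_ALIAS_MAP = {
    "deepseek-chat": ("deepseek-v4-flash", {"type": "disabled"}),
    "deepseek-reasoner": ("deepseek-v4-flash", {"type": "enabled"}),
}


def migrate_request(body: dict) -> dict:
    """Return a copy of the request body with legacy aliases replaced."""
    model = body.get("model")
    if model not in LEGACY_ALIAS_MAP:
        return dict(body)  # already a current model name
    new_model, thinking = LEGACY_ALIAS_MAP[model]
    migrated = dict(body)
    migrated["model"] = new_model
    migrated.setdefault("thinking", thinking)  # keep any explicit override
    return migrated


old = {"model": "deepseek-reasoner", "messages": []}
print(migrate_request(old)["model"])  # deepseek-v4-flash
```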
DeepSeek V4 API Examples
OpenAI-Compatible cURL Example
This example uses the current V4 model name directly:
curl https://api.deepseek.com/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_DEEPSEEK_API_KEY" \
-d '{
"model": "deepseek-v4-pro",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Explain the difference between DeepSeek V4 Pro and Flash."}
],
"reasoning_effort": "high",
"thinking": {"type": "enabled"},
"stream": false
}'
OpenAI SDK Python Example
When using the OpenAI SDK, DeepSeek-specific fields such as thinking should be passed through extra_body.
from openai import OpenAI
client = OpenAI(
api_key="YOUR_DEEPSEEK_API_KEY",
base_url="https://api.deepseek.com"
)
response = client.chat.completions.create(
model="deepseek-v4-pro",
messages=[
{"role": "user", "content": "Create a migration checklist for DeepSeek V4."}
],
reasoning_effort="high",
extra_body={"thinking": {"type": "enabled"}}
)
print(response.choices[0].message.content)
Anthropic-Compatible Example
DeepSeek also supports an Anthropic-compatible API format through a separate base URL:
export ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic
export ANTHROPIC_API_KEY=YOUR_DEEPSEEK_API_KEY
Then, in Python:
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
model="deepseek-v4-pro",
max_tokens=1000,
system="You are a helpful assistant.",
messages=[
{
"role": "user",
"content": [{"type": "text", "text": "Summarize DeepSeek V4 in 5 bullets."}]
}
]
)
print(message.content)
For fast, economical workloads, replace deepseek-v4-pro with deepseek-v4-flash. Always keep API keys server-side and out of public code, screenshots, analytics, or browser bundles.
DeepSeek V4 API Pricing Source
This page intentionally does not publish fixed DeepSeek token prices. DeepSeek API pricing is a live billing detail, and it can change because of launch pricing, promotions, discounts, product updates, or future model changes.
Official pricing source: DeepSeek Models & Pricing.
Use that official page for current token prices, any limited-time promotions, context length, maximum output, feature matrix, and model availability. Use this Chat-Deep.ai page for model selection and migration guidance, not as the final billing source.
If you are planning an API product, you can also use the local DeepSeek API pricing guide and DeepSeek API cost calculator for explanation and budgeting workflow, then confirm the actual rates against DeepSeek’s official pricing page before shipping.
Usage Tracking and Cost Control Without Hardcoded Prices
Even without hardcoding prices in this article, developers still need a reliable way to estimate and control API spend. The safe approach is to track token categories and apply the current official rates from DeepSeek’s pricing page at the time of calculation.
The general calculation pattern is:
Estimated request cost =
cache_hit_input_tokens × current official cache-hit input rate
+ cache_miss_input_tokens × current official cache-miss input rate
+ output_tokens × current official output rate
Track these fields in production where available:
- usage.prompt_tokens
- usage.prompt_cache_hit_tokens
- usage.prompt_cache_miss_tokens
- usage.completion_tokens
- usage.completion_tokens_details.reasoning_tokens
- usage.total_tokens
Cost control is not only a pricing-table issue. It also depends on model routing, prompt length, retrieved context size, cache-hit rate, output limits, thinking effort, retries, tool loops, and how often you escalate from V4-Flash to V4-Pro.
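The three-term pattern above can be implemented as a small helper that takes the current official rates as arguments rather than hardcoding them. The per-1M-token unit and the example rates below are assumptions for illustration, not real DeepSeek prices; match whatever unit and figures the official pricing page shows at calculation time.

```python
# Sketch of a request-cost estimator. Rates are passed in (per 1M
# tokens, an assumption) instead of being hardcoded, so the current
# official figures can be supplied at calculation time.

def estimate_request_cost(
    cache_hit_input_tokens: int,
    cache_miss_input_tokens: int,
    output_tokens: int,
    cache_hit_rate: float,   # current official cache-hit input rate per 1M tokens
    cache_miss_rate: float,  # current official cache-miss input rate per 1M tokens
    output_rate: float,      # current official output rate per 1M tokens
) -> float:
    """Apply the three-term calculation pattern from the section above."""
    per_million = 1_000_000
    return (
        cache_hit_input_tokens / per_million * cache_hit_rate
        + cache_miss_input_tokens / per_million * cache_miss_rate
        + output_tokens / per_million * output_rate
    )


# Example with PLACEHOLDER rates, not real DeepSeek prices:
cost = estimate_request_cost(
    cache_hit_input_tokens=40_000,
    cache_miss_input_tokens=10_000,
    output_tokens=2_000,
    cache_hit_rate=0.10,
    cache_miss_rate=1.00,
    output_rate=2.00,
)
print(f"{cost:.6f}")  # 0.018000
```

The token counts come from the usage fields listed above; only the rates need refreshing when the official pricing page changes.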
DeepSeek V4 Context Length and Max Output
The official V4 API documentation lists a 1M-token context length and a 384K maximum output for the current V4 API models.
A 1M-token context window can help with long-document analysis, transcript processing, codebase review, research workflows, legal or technical document review, and agents that need a large working memory. However, a larger context window does not automatically guarantee better answers. You should still test retrieval accuracy, prompt structure, latency, cost, tool calls, and structured output behavior before moving production workloads to V4.
DeepSeek V4 Open Weights and MIT License
DeepSeek published an official DeepSeek-V4 Hugging Face collection. The DeepSeek-V4-Pro model card states that the repository and model weights are licensed under the MIT License.
| Model | Total Parameters | Activated Parameters | Context Length | Precision | Official Repository |
|---|---|---|---|---|---|
| DeepSeek-V4-Flash-Base | 284B | 13B | 1M | FP8 Mixed | Hugging Face |
| DeepSeek-V4-Flash | 284B | 13B | 1M | FP4 + FP8 Mixed | Hugging Face |
| DeepSeek-V4-Pro-Base | 1.6T | 49B | 1M | FP8 Mixed | Hugging Face |
| DeepSeek-V4-Pro | 1.6T | 49B | 1M | FP4 + FP8 Mixed | Hugging Face |
The official model card explains that FP4 + FP8 mixed precision uses FP4 for MoE expert parameters and FP8 for most other parameters. Open weights are useful for research and infrastructure teams, but local deployment still requires serious hardware, careful inference setup, and close reading of the official model card.
DeepSeek V4 Architecture and Benchmark Highlights
DeepSeek describes V4 as a preview series of Mixture-of-Experts language models. The V4 model card lists several architecture and optimization upgrades:
- Mixture-of-Experts architecture: only a subset of parameters is activated per token, while the model keeps large total capacity.
- Hybrid Attention Architecture: DeepSeek combines Compressed Sparse Attention and Heavily Compressed Attention to improve long-context efficiency.
- Manifold-Constrained Hyper-Connections: mHC is used to strengthen residual connections and improve training stability.
- Muon optimizer: DeepSeek says it uses the Muon optimizer for faster convergence and stability.
- Large-scale pretraining: DeepSeek says V4 models were pretrained on more than 32T diverse and high-quality tokens.
DeepSeek-Reported Benchmark Highlights
The numbers below are DeepSeek-reported model-card results. Treat them as vendor-reported benchmarks, not a replacement for your own production tests.
| Benchmark / Metric | DeepSeek-V4-Pro Max | Why It Matters |
|---|---|---|
| GPQA Diamond | 90.1 | Advanced reasoning and science-heavy QA. |
| LiveCodeBench | 93.5 | Coding performance under benchmark conditions. |
| SWE Verified | 80.6 | Software engineering task resolution. |
| Terminal Bench 2.0 | 67.9 | Agentic command-line and terminal workflows. |
| MRCR 1M | 83.5 | Long-context reasoning at 1M context scale. |
For real applications, test DeepSeek V4 against your own prompts, data, safety constraints, latency requirements, tool-calling patterns, budget limits, and acceptance criteria.
Thinking Modes, Agentic Coding, JSON Output and Tool Calls
DeepSeek V4 supports both thinking and non-thinking modes. In the OpenAI-compatible format, the thinking toggle uses {"thinking": {"type": "enabled"}} or {"thinking": {"type": "disabled"}}. Thinking effort can be controlled with reasoning_effort.
| Feature | DeepSeek V4 Support | Implementation Note |
|---|---|---|
| Thinking mode | Supported | Use thinking and reasoning_effort settings. |
| Non-thinking mode | Supported | Useful for faster routine tasks. |
| JSON Output | Supported | Set response_format, include the word “json” in the prompt, and provide the target shape. |
| Tool / Function calling | Supported | Useful for agents, external APIs, and structured workflows. Validate tool arguments before execution. |
| Chat Prefix Completion | Supported, Beta | Use official Beta documentation before production. |
| FIM Completion | Supported in non-thinking mode only, Beta | Useful for code completion workflows. |
| Anthropic API format | Supported | Use https://api.deepseek.com/anthropic. |
Important implementation note: DeepSeek’s thinking guide says thinking mode does not support temperature, top_p, presence_penalty, or frequency_penalty. For compatibility, setting those parameters may not raise an error, but they may silently have no effect in thinking mode.
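The JSON Output support listed above can be sketched as a request body plus a validation step. The key names in the target shape are illustrative assumptions, and the example assumes the OpenAI-compatible json_object response_format shape with V4-Flash in non-thinking mode; always parse and validate the reply before trusting its structure downstream.

```python
import json

# Sketch of the JSON Output pattern: include the word "json" and the
# target shape in the prompt, set response_format, then validate the
# model reply. The required keys here are illustrative, not official.

REQUIRED_KEYS = {"model", "context_tokens"}

# Request body for the OpenAI-compatible endpoint (non-thinking mode).
request_body = {
    "model": "deepseek-v4-flash",
    "messages": [
        {
            "role": "user",
            "content": (
                'Reply as json with keys "model" and "context_tokens": '
                "DeepSeek-V4-Flash supports a 1M-token context window."
            ),
        }
    ],
    "response_format": {"type": "json_object"},
}


def validate_reply(raw: str) -> dict:
    """Parse the model reply and check the expected keys are present."""
    data = json.loads(raw)
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


# Example with a sample reply string standing in for the API response:
sample = '{"model": "DeepSeek-V4-Flash", "context_tokens": 1000000}'
print(validate_reply(sample)["context_tokens"])  # 1000000
```

In a real integration, validate_reply would run on response.choices[0].message.content, with a retry or fallback path when validation fails.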
What Changed from DeepSeek V3.2 to DeepSeek V4 Preview?
DeepSeek V3.2 remains important historically, but it should not be described as the current hosted API mapping after the V4 Preview update. Current V4 API content should use the V4 model names directly.
| Topic | DeepSeek V3.2 Historical Position | DeepSeek V4 Preview Position |
|---|---|---|
| Release status | Released in December 2025 as a prior open-weight model generation. | Released on April 24, 2026 as DeepSeek-V4 Preview. |
| Hosted API mapping | Historically mapped to deepseek-chat and deepseek-reasoner after the V3.2 release. | Current V4 API models are deepseek-v4-pro and deepseek-v4-flash. |
| Legacy aliases | Older articles may describe aliases as V3.2. | The aliases now route to V4-Flash modes during the transition period. |
| Context length | Often discussed around earlier 128K-context API content. | Official V4 API documentation lists 1M context. |
| Pricing | Older V3.2 pricing snippets should be treated as historical. | Use the current official V4 pricing page before production use. |
Developer Migration Checklist for DeepSeek V4
- Replace new production uses of deepseek-chat with deepseek-v4-flash where Flash is suitable.
- Use deepseek-v4-pro for higher-value reasoning, coding, long-context, and agentic workloads.
- Audit documentation, SDK wrappers, environment variables, examples, and internal dashboards for old model names.
- Remove static pricing snippets from evergreen pages and link to the official DeepSeek pricing page instead.
- Track cache-hit input tokens, cache-miss input tokens, output tokens, and reasoning tokens where available.
- Test thinking mode and reasoning_effort before production rollout.
- Validate JSON Output and tool/function calling behavior with your own schemas.
- Measure latency separately for short prompts, long prompts, and 1M-context workflows.
- Use feature flags or staged rollout so you can compare V4-Flash and V4-Pro safely.
- Check the official DeepSeek change log before the July 24, 2026 legacy-alias retirement date.
- Update internal content so no current page says V3.2 is the current hosted API model.
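Two checklist items above, measuring latency separately by prompt size and comparing V4-Flash and V4-Pro safely, can be combined into a small measurement harness. The bucket thresholds below are illustrative assumptions; tune them to your own short, long, and 1M-context workloads.

```python
import time

# Sketch of a latency-by-bucket harness for the checklist above.
# Bucket thresholds are illustrative; adjust them to your workloads.

def bucket_for(prompt_tokens: int) -> str:
    """Classify a request so latency is tracked separately per size class."""
    if prompt_tokens < 2_000:
        return "short"
    if prompt_tokens < 100_000:
        return "long"
    return "1m-context"


def timed_call(fn, *args, **kwargs):
    """Run an API call and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


latencies: dict[str, list[float]] = {"short": [], "long": [], "1m-context": []}

# Usage sketch: a stub lambda stands in for a real client request here.
result, elapsed = timed_call(lambda: "ok")
latencies[bucket_for(1_200)].append(elapsed)
```

Recording per-bucket latencies for both models on the same traffic makes the Flash-vs-Pro escalation decision an observed trade-off rather than a guess.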
Copy-Paste Migration Mapping
Old model: deepseek-chat
Current recommended replacement: deepseek-v4-flash
Reason: deepseek-chat is a legacy alias currently routing to V4-Flash non-thinking mode.
Old model: deepseek-reasoner
Current recommended replacement: deepseek-v4-flash or deepseek-v4-pro with thinking enabled
Reason: deepseek-reasoner is a legacy alias currently routing to V4-Flash thinking mode.
High-value reasoning / coding / long-context tasks:
Use: deepseek-v4-pro
High-volume chat / support / summaries / cheaper agent steps:
Use: deepseek-v4-flash
What Not to Overclaim About DeepSeek V4
| Avoid This Claim | Use This Safer Wording |
|---|---|
| “DeepSeek V4 final version has launched.” | “DeepSeek V4 has launched as a Preview Release.” |
| “DeepSeek V4 is a 1T-parameter model.” | “DeepSeek V4 Preview includes V4-Pro at 1.6T total parameters and V4-Flash at 284B total parameters.” |
| “deepseek-chat currently means DeepSeek V3.2.” | “deepseek-chat is now a legacy alias that currently routes to V4-Flash non-thinking mode.” |
| “deepseek-reasoner currently means DeepSeek V3.2.” | “deepseek-reasoner is now a legacy alias that currently routes to V4-Flash thinking mode.” |
| “DeepSeek-V4-Pro-Max is a separate API model.” | “DeepSeek-V4-Pro-Max is best described as the maximum reasoning effort mode of DeepSeek-V4-Pro unless DeepSeek documents it as a separate API model name.” |
| “DeepSeek V4 pricing will not change.” | “Verify the official DeepSeek Models & Pricing page before production budgeting.” |
| “Every open DeepSeek model has the same license.” | “Check the exact official model card or repository for the model you plan to use.” |
| “The hosted API, web app, and open weights are always identical.” | “Treat hosted API behavior, web/app behavior, and local open-weight deployment as separate surfaces.” |
Related DeepSeek Resources
For more context, see the DeepSeek AI guide, DeepSeek Chat page, DeepSeek API guide, DeepSeek API pricing guide, DeepSeek API cost calculator, DeepSeek Models hub, DeepSeek V3.2 historical guide, and DeepSeek Status guide.
DeepSeek V4 FAQ
Is DeepSeek V4 officially released?
Yes. DeepSeek V4 officially launched as DeepSeek-V4 Preview on April 24, 2026. The correct wording is Preview Release, not final release.
What are the official DeepSeek V4 API model names?
The official V4 API model names are deepseek-v4-pro and deepseek-v4-flash. New integrations should use these names directly.
What is DeepSeek-V4-Pro?
DeepSeek-V4-Pro is the larger V4 Preview model. DeepSeek lists it as 1.6T total parameters with 49B activated parameters and positions it for advanced reasoning, coding, knowledge-heavy, long-context, and agentic workloads.
What is DeepSeek-V4-Flash?
DeepSeek-V4-Flash is the smaller, faster, and more economical V4 Preview model. DeepSeek lists it as 284B total parameters with 13B activated parameters.
Is DeepSeek V4 a 1T-parameter model?
Not exactly. The precise official figures are 1.6T total parameters for V4-Pro and 284B total parameters for V4-Flash. Calling DeepSeek V4 simply a “1T model” is less accurate.
What is the DeepSeek V4 context length?
The official V4 API documentation lists a 1M-token context length for the current V4 API models.
What is the DeepSeek V4 maximum output limit?
The official DeepSeek Models & Pricing page lists a maximum output limit of 384K tokens for the current V4 API models.
Where should I verify DeepSeek V4 API pricing?
Verify current prices on the official DeepSeek Models & Pricing page. This guide avoids fixed prices because rates and promotions can change.
Is DeepSeek V4 open source?
DeepSeek describes V4 Preview as open-sourced and published official V4 model repositories through Hugging Face. The DeepSeek-V4-Pro model card states that the repository and model weights are licensed under the MIT License. Always check the exact model card before self-hosting or commercial deployment.
Does deepseek-chat still mean DeepSeek V3.2?
No. After the V4 Preview update, deepseek-chat is a legacy compatibility alias that currently routes to DeepSeek-V4-Flash non-thinking mode.
Does deepseek-reasoner still mean DeepSeek V3.2?
No. After the V4 Preview update, deepseek-reasoner is a legacy compatibility alias that currently routes to DeepSeek-V4-Flash thinking mode.
Is DeepSeek-V4-Pro-Max a separate API model?
For API documentation, the safe wording is that DeepSeek-V4-Pro-Max is the maximum reasoning effort mode of DeepSeek-V4-Pro unless DeepSeek documents it as a separate API model name. For new API calls, use deepseek-v4-pro or deepseek-v4-flash.
Is Chat-Deep.ai the official DeepSeek website?
No. Chat-Deep.ai is an independent DeepSeek guide and browser access site. It is not affiliated with DeepSeek, DeepSeek.com, chat.deepseek.com, the official DeepSeek app, or the official DeepSeek developer platform.
