Last updated: May 12, 2026
DeepSeek-R1-0528 is the May 2025 upgraded version of DeepSeek’s R1 reasoning model. Compared with the original DeepSeek R1, it improves benchmark performance, reasoning depth, coding ability, front-end generation, JSON output, and function calling. It is no longer DeepSeek’s newest model in 2026 because DeepSeek V4 Preview is now live, but it remains important for developers, researchers, and AI teams studying open reasoning models, R1-compatible workflows, and distilled reasoning models. DeepSeek’s official release notes also highlight reduced hallucinations, while the Hugging Face model card reports major gains on AIME 2025, GPQA-Diamond, LiveCodeBench, SWE Verified, and Aider-Polyglot.
What Is DeepSeek-R1-0528?
DeepSeek-R1-0528 is a minor-version upgrade to DeepSeek R1, DeepSeek’s first-generation reasoning model family. It was released on May 28, 2025, as an updated R1 checkpoint with stronger reasoning, coding, and benchmark performance. DeepSeek’s release page says the update improves benchmark performance, enhances front-end capabilities, reduces hallucinations, and supports JSON output and function calling.
In simple terms, DeepSeek-R1-0528 is designed to spend more computation on difficult tasks before producing a final answer. Reasoning models are especially useful when the task requires multi-step logic, math, competitive programming, code debugging, technical planning, or structured analysis rather than a fast one-paragraph response. The Hugging Face model card says the update improved reasoning depth through increased computational resources and algorithmic optimization during post-training.
DeepSeek-R1-0528 is available through the official Hugging Face model page, third-party API providers, and supported local inference stacks such as vLLM and SGLang. The model card also notes that users can access DeepSeek R1 through DeepSeek’s chat site with the “DeepThink” option and through an OpenAI-compatible API, although official DeepSeek API model routing changed later in 2026 with the V4 Preview rollout.
What Changed in DeepSeek-R1-0528?
The main improvement is reasoning quality. According to the Hugging Face model card, DeepSeek-R1-0528 raised AIME 2025 accuracy from 70.0% in the original R1 to 87.5% in R1-0528. The same model card says the model’s average thinking length on AIME increased from about 12K tokens per question to about 23K tokens per question, which helps explain why it performs better on complex math and reasoning tasks.
DeepSeek-R1-0528 also improved coding and software-engineering performance. The official benchmark table reports gains on LiveCodeBench, Codeforces-Div1, SWE Verified, and Aider-Polyglot, which makes the model more useful for competitive programming, code repair, code review, and multi-step development tasks.
The update also matters for application developers because DeepSeek’s release notes explicitly list JSON output and function calling support. In practice, this means DeepSeek-R1-0528 is more relevant for tool-using agents, structured outputs, and workflows that need predictable response formats. However, developers should verify feature behavior with the exact provider they use because DeepSeek’s current 2026 API documentation now centers on V4 models and marks older deepseek-chat and deepseek-reasoner names for future deprecation.
Another practical change is prompting behavior. Compared with earlier R1 guidance, the Hugging Face page says system prompts are now supported and users no longer need to force a thinking pattern by adding `<think>\n` at the beginning of the output.
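Under the updated guidance, a request can be sketched as a plain OpenAI-style message list with an ordinary system prompt and no forced thinking prefix. This is an illustrative payload, not official DeepSeek code; the model identifier and sampling values are assumptions to be replaced with your provider's values.

```python
# Sketch of an R1-0528 request under the updated prompting guidance:
# a normal system prompt is allowed, and no "<think>\n" prefix is forced.
def build_request(system_prompt: str, user_prompt: str, model: str) -> dict:
    """Build an OpenAI-style chat payload (illustrative only)."""
    return {
        "model": model,  # a provider's R1-0528 identifier (assumption)
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        # DeepSeek's benchmark setup used temperature 0.6 and top-p 0.95.
        "temperature": 0.6,
        "top_p": 0.95,
    }

payload = build_request(
    "You are a careful math assistant.",
    "What is 17 * 24?",
    "deepseek-ai/DeepSeek-R1-0528",
)
print(payload["messages"][0]["role"])  # "system"
```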
DeepSeek-R1-0528 Technical Specs
The specs below combine the most useful verified details from DeepSeek’s official pages, Hugging Face, GitHub, Azure AI Foundry, Together AI, and current DeepSeek API documentation. Some values vary by provider, especially context length, pricing, and model aliases.
| Spec | DeepSeek-R1-0528 |
|---|---|
| Model name | DeepSeek-R1-0528 |
| Developer | DeepSeek |
| Release date | May 28, 2025 |
| Model family | DeepSeek R1 |
| Model type | Open reasoning model / text-generation model |
| Architecture | Mixture-of-Experts lineage based on DeepSeek R1 / DeepSeek V3 architecture |
| Model size | Hugging Face lists 685B params; original DeepSeek R1 GitHub docs list 671B total / 37B activated for R1 |
| Active parameters | 37B active is documented for original DeepSeek R1; provider pages may not always restate it for R1-0528 |
| Max generation length | 64K tokens in the R1-0528 benchmark setup |
| Context length | Provider-specific: Azure lists 163,840; Together lists 160K; OpenRouter has listed 164K |
| License | MIT on Hugging Face / DeepSeek R1 series supports commercial use and distillation |
| Access methods | Hugging Face, local inference, DeepSeek Chat, provider APIs, Azure AI Foundry, Together AI, OpenRouter, GitHub Models |
| API compatibility | OpenAI-compatible APIs are available through DeepSeek and providers |
| JSON output | Supported in the R1-0528 release; provider behavior should be tested |
| Function calling / tool calls | R1-0528 release says function calling is supported; current V4 DeepSeek API docs use tool calls terminology |
There is a common parameter-count confusion. Hugging Face displays 685B params for deepseek-ai/DeepSeek-R1-0528, while the original DeepSeek R1 GitHub repository lists DeepSeek-R1 as 671B total parameters and 37B activated parameters with a 128K context length. The safest way to write about the model is to say that Hugging Face currently lists R1-0528 as 685B, while the original R1 architecture documentation and some providers refer to the 671B/37B-active MoE structure.
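The practical meaning of the 671B-total / 37B-active MoE figures is that only a small fraction of parameters participate in each token's forward pass. The arithmetic below is illustrative back-of-envelope math based on the figures above, not an official DeepSeek calculation.

```python
# Back-of-envelope: fraction of MoE parameters active per token,
# using the 671B total / 37B active figures from the original R1 docs.
total_params = 671e9
active_params = 37e9

active_fraction = active_params / total_params
print(f"Active per token: {active_fraction:.1%}")  # roughly 5.5%
```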
For licensing, Hugging Face states that the code repository and DeepSeek-R1 models are subject to the MIT License and support commercial use and distillation. The original GitHub repository also says the R1 series supports commercial use, modifications, derivative works, and distillation.
DeepSeek-R1-0528 Benchmarks
The following benchmark table, taken from the official model card, compares the original DeepSeek R1 with DeepSeek R1 0528. DeepSeek says the benchmark setup used a maximum generation length of 64K tokens, temperature 0.6, top-p 0.95, and 16 responses per query to estimate pass@1.
| Category | Benchmark | DeepSeek R1 | DeepSeek R1 0528 |
|---|---|---|---|
| General | MMLU-Redux (EM) | 92.9 | 93.4 |
| General | MMLU-Pro (EM) | 84.0 | 85.0 |
| General | GPQA-Diamond (Pass@1) | 71.5 | 81.0 |
| General | Humanity’s Last Exam / HLE (Pass@1) | 8.5 | 17.7 |
| Code | LiveCodeBench 2408–2505 (Pass@1) | 63.5 | 73.3 |
| Code | Codeforces-Div1 (Rating) | 1530 | 1930 |
| Code | SWE Verified (Resolved) | 49.2 | 57.6 |
| Code | Aider-Polyglot (Acc.) | 53.3 | 71.6 |
| Math | AIME 2024 (Pass@1) | 79.8 | 91.4 |
| Math | AIME 2025 (Pass@1) | 70.0 | 87.5 |
| Math | HMMT 2025 (Pass@1) | 41.7 | 79.4 |
| Math | CNMO 2024 (Pass@1) | 78.8 | 86.9 |
| Tools | BFCL_v3_MultiTurn (Acc.) | — | 37.0 |
| Tools | Tau-Bench (Pass@1) | — | 53.5 Airline / 63.9 Retail |
The strongest story in the DeepSeek-R1-0528 benchmarks is not a tiny incremental gain; it is the size of the improvements in math, reasoning, and coding. AIME 2025 rose by 17.5 percentage points, GPQA-Diamond rose by 9.5 points, LiveCodeBench rose by 9.8 points, and Aider-Polyglot rose by 18.3 points. These results make DeepSeek-R1-0528 particularly interesting for complex reasoning, competitive programming, code modification, and benchmark-focused research.
Benchmarks still need context. They do not prove that the model is always better in production, faster in real-world apps, safer in high-risk domains, or cheaper for every workload. They show that, under DeepSeek’s stated benchmark setup, DeepSeek-R1-0528 significantly improved over the original R1 on several difficult reasoning and coding tests.
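DeepSeek's stated setup (16 responses per query, averaged to estimate pass@1) corresponds to the standard unbiased pass@k estimator; with k = 1 it reduces to the per-question success rate. The snippet below is a generic sketch of that estimator, not DeepSeek's actual evaluation code.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples, c correct, k drawn.

    pass@k = 1 - C(n - c, k) / C(n, k); for k = 1 this is simply c / n.
    """
    if n - c < k:
        return 1.0  # not enough failures to fill a draw of k
    return 1.0 - comb(n - c, k) / comb(n, k)

# With 16 samples per question, pass@1 is the fraction of correct samples:
print(pass_at_k(16, 14, 1))  # 0.875
```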
DeepSeek-R1-0528 vs Original DeepSeek R1
DeepSeek-R1-0528 is best understood as a stronger post-training update to the original DeepSeek R1 rather than a completely separate product line. The original R1 was released in January 2025 as an open reasoning model with MIT licensing, open weights, and performance positioned as comparable to OpenAI o1 across math, code, and reasoning tasks.
| Area | DeepSeek R1 | DeepSeek-R1-0528 |
|---|---|---|
| Release timing | January 2025 | May 2025 update |
| Reasoning depth | Strong first-generation R1 reasoning | Deeper reasoning with longer average thinking on AIME |
| AIME 2025 | 70.0 | 87.5 |
| LiveCodeBench | 63.5 | 73.3 |
| SWE Verified | 49.2 | 57.6 |
| Aider-Polyglot | 53.3 | 71.6 |
| Function calling | Less emphasized in original R1 release | Explicitly mentioned in R1-0528 release |
| JSON output | Less emphasized in original R1 release | Explicitly mentioned in R1-0528 release |
| Hallucination behavior | Strong but not the main update theme | DeepSeek says hallucinations are reduced |
| Prompting | Older R1 usage sometimes required special thinking prompts | System prompts supported; no need to force `<think>` |
| Best fit | Open reasoning, math, code, distillation | Stronger reasoning, coding, structured output, front-end generation |
For most users, the reason to choose DeepSeek-R1-0528 over the original R1 is straightforward: it has better official benchmark numbers, broader feature support, and more mature prompting recommendations. The original R1 is still historically important, but R1-0528 is the more relevant checkpoint for most R1-line evaluations.
DeepSeek-R1-0528 vs DeepSeek V4: Should You Still Use It in 2026?
DeepSeek-R1-0528 is not DeepSeek’s newest model in 2026. DeepSeek V4 Preview was announced on April 24, 2026, with two models: DeepSeek-V4-Pro at 1.6T total parameters and 49B active parameters, and DeepSeek-V4-Flash at 284B total parameters and 13B active parameters. DeepSeek also says V4 has 1M context as the default across official DeepSeek services and supports both OpenAI Chat Completions and Anthropic APIs.
Current DeepSeek API docs also say deepseek-chat and deepseek-reasoner are being deprecated and, for compatibility, correspond to non-thinking and thinking modes of deepseek-v4-flash. That means developers who specifically need exact DeepSeek-R1-0528 behavior should verify the model behind their endpoint and may prefer a provider that explicitly lists DeepSeek-R1-0528.
| Choose this | When it makes sense |
|---|---|
| DeepSeek-R1-0528 | You need the R1 reasoning line, open R1 benchmarks, R1-compatible workflows, or reproducible comparisons against the original R1. |
| DeepSeek V4-Pro | You want DeepSeek’s newer flagship direction, 1M context, stronger agentic coding, and current official API support. |
| DeepSeek V4-Flash | You need a newer, faster, lower-cost official DeepSeek API model with thinking and non-thinking modes. |
| DeepSeek-R1-0528-Qwen3-8B | You want a smaller distilled reasoning model for local experiments, research, or limited-resource inference. |
DeepSeek-R1-0528 is still relevant in 2026 because it remains a major reference point for open reasoning models. It is useful when the task is model research, benchmark comparison, reasoning trace analysis, distillation, or existing R1-based tooling. But for new production builds on the official DeepSeek API, V4 should be evaluated first because it is the current DeepSeek platform direction.
How to Use DeepSeek-R1-0528
DeepSeek Chat / DeepThink
DeepSeek’s model card says users can chat with DeepSeek R1 on the official DeepSeek website and switch on “DeepThink.” This is the simplest option for manual testing, prompt exploration, and comparing reasoning outputs before building an API workflow.
DeepSeek API or OpenAI-Compatible API
Historically, the original R1 release told API users to use `model=deepseek-reasoner`, and the R1-0528 release said there was no change to API usage. However, current 2026 DeepSeek API docs now list V4 model names and note that `deepseek-chat` and `deepseek-reasoner` will be deprecated on July 24, 2026, mapping to V4-Flash non-thinking and thinking modes for compatibility.
For exact R1-0528 behavior, use a provider that explicitly identifies the R1-0528 checkpoint, or verify with the provider that its endpoint is still serving the updated R1-0528 weights.
Python API Example
Together AI states that its DeepSeek R1 endpoint was updated on May 28, 2025, to use the improved DeepSeek-R1-0528 weights, and its documented endpoint is `deepseek-ai/DeepSeek-R1`.
```python
from together import Together

client = Together()
response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[
        {
            "role": "user",
            "content": "Explain the tradeoffs between RAG and long-context prompting."
        }
    ],
)
print(response.choices[0].message.content)
```
cURL Example
```shell
curl -X POST "https://api.together.xyz/v1/chat/completions" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/DeepSeek-R1",
    "messages": [
      {
        "role": "user",
        "content": "Solve this math problem and put the final answer in boxed notation."
      }
    ]
  }'
```
Hugging Face
Hugging Face lists deepseek-ai/DeepSeek-R1-0528 as a text-generation model with MIT licensing and provides Transformers examples for loading the tokenizer and model. Because the full model is extremely large, this path is usually for research labs, inference providers, or teams with serious GPU infrastructure rather than casual local use.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "deepseek-ai/DeepSeek-R1-0528"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

messages = [{"role": "user", "content": "Who are you?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
vLLM
Hugging Face provides a vLLM serving example for the model.
```shell
pip install vllm
vllm serve "deepseek-ai/DeepSeek-R1-0528"
```
```shell
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "deepseek-ai/DeepSeek-R1-0528",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
SGLang
Hugging Face also provides an SGLang launch example for DeepSeek-R1-0528.
```shell
pip install sglang
python3 -m sglang.launch_server \
  --model-path "deepseek-ai/DeepSeek-R1-0528" \
  --host 0.0.0.0 \
  --port 30000
```
Provider APIs
Provider access is often the most practical route. Azure AI Foundry, Together AI, OpenRouter, GitHub Models, and other catalogs have listed DeepSeek-R1-0528 or R1 endpoints. But pricing, context length, quantization, and routing change over time. For example, Azure lists a 163,840-token context length for DeepSeek-R1-0528, while Together lists 160K, and OpenRouter has listed 164K for its R1-0528 page. Always verify current provider metadata before publishing production docs or cost calculations.
DeepSeek-R1-0528-Qwen3-8B
DeepSeek-R1-0528-Qwen3-8B is a distilled model created by transferring chain-of-thought behavior from DeepSeek-R1-0528 into a Qwen3 8B Base model. The goal is to make some of the reasoning behavior of the larger model available in a smaller and more practical checkpoint.
The model card says the Qwen3-8B distilled version reaches 86.0 on AIME 2024, 76.3 on AIME 2025, 61.5 on HMMT February 2025, 61.1 on GPQA Diamond, and 60.5 on LiveCodeBench 2408–2505. It also says the model architecture is identical to Qwen3-8B but uses the same tokenizer configuration as DeepSeek-R1-0528.
Use the Qwen3-8B distill if you want smaller-scale local experimentation, lower serving cost, faster iteration, or research into reasoning distillation. Do not expect it to fully match the full DeepSeek-R1-0528 model on difficult tasks; the point of the distill is practicality, not complete parity.
Best Use Cases
DeepSeek-R1-0528 is best for tasks where slower, deeper reasoning is acceptable and accuracy matters more than instant latency. It fits complex math, competitive programming, algorithm explanation, code review, debugging, scientific reasoning, multi-step technical planning, and difficult question answering.
It is also useful for front-end generation and “vibe coding” workflows because DeepSeek specifically highlighted enhanced front-end capabilities and the model card mentions a better vibe coding experience.
For developers, the most practical use cases are structured JSON output, tool/function-calling agents, reasoning-heavy coding assistants, and benchmark comparison systems. For researchers, the model is important because its chain-of-thought behavior was used to distill the smaller DeepSeek-R1-0528-Qwen3-8B model, making it relevant for studying how reasoning can be transferred into smaller open models.
Limitations and Risks
DeepSeek-R1-0528 is powerful, but it is not a universal default model. Reasoning models can be slower and more expensive per task because they may generate long intermediate reasoning before producing a final answer. Provider latency and pricing vary significantly, so production teams should benchmark the exact endpoint they plan to use.
Benchmarks are also not production guarantees. A high AIME or LiveCodeBench score does not automatically mean the model will follow every internal policy, avoid every hallucination, or integrate smoothly with your agent stack.
Safety evaluation matters. Microsoft’s Azure AI Foundry page warns that DeepSeek R1 may be less aligned than some other models and recommends that customers use safety systems and conduct their own production evaluations. It also warns that reasoning output can contain more harmful content than the final answer and may need to be suppressed in production.
Function calling should also be tested carefully. DeepSeek’s R1-0528 release says function calling is supported, but current DeepSeek V4 thinking-mode docs include specific requirements around tool calls and passing reasoning content back to the API after tool use. That means agent workflows should be tested end-to-end rather than assumed to work because a provider page lists tool support.
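Before wiring an agent end-to-end, it can help to validate the tool schema you plan to send. The snippet below sketches an OpenAI-style function-calling `tools` array with a minimal structural check; the tool name and parameters are hypothetical, and real tool-call behavior must still be verified against your exact provider and endpoint.

```python
import json

# Hypothetical tool definition in the OpenAI-style "tools" format that
# OpenAI-compatible endpoints generally accept (verify per provider).
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical example tool
            "description": "Get current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]

def validate_tools(tools: list) -> bool:
    """Minimal structural check before sending the request."""
    for t in tools:
        fn = t.get("function", {})
        if t.get("type") != "function" or "name" not in fn:
            return False
        json.dumps(t)  # must be JSON-serializable
    return True

print(validate_tools(tools))  # True
```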
Finally, running the full model locally is hardware-intensive. Hugging Face lists the model size as 685B parameters, and although vLLM, SGLang, Docker Model Runner, quantizations, and local app integrations are available, most teams will use hosted inference unless they already operate large-scale GPU infrastructure.
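To see why local deployment is demanding, a rough weights-only memory estimate helps: parameter count times bytes per parameter, ignoring KV cache, activations, and framework overhead. The numbers below are illustrative approximations, not measured requirements.

```python
# Rough weights-only memory footprint for a 685B-parameter checkpoint.
# Real deployments also need KV cache, activations, and runtime overhead.
params = 685e9

for name, bytes_per_param in [("FP8", 1), ("BF16", 2)]:
    gb = params * bytes_per_param / 1e9
    print(f"{name}: ~{gb:,.0f} GB just for weights")
```

Even at 8-bit precision, the weights alone far exceed a single GPU's memory, which is why multi-node serving or hosted inference is the usual path.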
Practical Prompting Tips for DeepSeek-R1-0528
Be explicit about the final answer format. For math problems, ask the model to reason carefully and put the final answer in a clear format such as boxed notation. Azure’s model page includes similar benchmark-oriented advice for mathematical prompts.
Use JSON output only when you need structured data. DeepSeek’s JSON Output documentation says users should set `response_format` to `{"type": "json_object"}`, include the word “json” in the prompt, provide an example of the desired JSON, and set `max_tokens` carefully to avoid truncation.
Do not overcomplicate the system prompt. The Hugging Face page says system prompts are now supported for R1-0528, but Azure’s benchmarking guidance recommends putting instructions in the user prompt for expected performance. In production, test both styles with your own tasks.
For benchmark-like tasks, run multiple trials and average results. DeepSeek’s benchmark setup used sampling parameters and multiple responses per query, so a single casual run may not reproduce official scores.
For code, always validate outputs. DeepSeek-R1-0528 may perform well on coding benchmarks, but generated code can still contain bugs, missing imports, unsafe assumptions, or dependency mismatches.
Frequently Asked Questions
What is DeepSeek-R1-0528?
DeepSeek-R1-0528 is the May 2025 upgraded version of DeepSeek R1, designed for stronger reasoning, coding, math, structured output, and reduced hallucination behavior.
Is DeepSeek-R1-0528 open source?
It is open-weight and released under MIT terms according to Hugging Face and the DeepSeek R1 repository. The R1 series supports commercial use and distillation.
Is DeepSeek-R1-0528 better than DeepSeek R1?
On DeepSeek’s reported benchmark table, yes. It improves over the original R1 on GPQA-Diamond, LiveCodeBench, Codeforces-Div1, SWE Verified, Aider-Polyglot, AIME 2024, AIME 2025, HMMT 2025, and CNMO 2024.
Does DeepSeek-R1-0528 support function calling?
DeepSeek’s May 2025 release says DeepSeek-R1-0528 supports function calling. However, providers may expose function calling differently, and current DeepSeek API docs have shifted to V4 model names and tool-call terminology, so test your exact endpoint before production.
Does DeepSeek-R1-0528 support JSON output?
Yes. DeepSeek’s release note says R1-0528 supports JSON output, and DeepSeek’s JSON Output docs explain how to enable valid JSON using the response_format parameter.
Can I run DeepSeek-R1-0528 locally?
Yes, but the full model is very large. Hugging Face provides vLLM, SGLang, Docker Model Runner, and Transformers examples, but most practical local experimentation will use quantized versions or the smaller DeepSeek-R1-0528-Qwen3-8B distill.
What is DeepSeek-R1-0528-Qwen3-8B?
It is a distilled 8B model based on Qwen3 8B Base, trained from chain-of-thought generated by DeepSeek-R1-0528. It is designed for smaller-scale reasoning experiments and local deployment.
Is DeepSeek-R1-0528 the latest DeepSeek model?
No. As of May 2026, DeepSeek V4 Preview is newer. DeepSeek announced V4-Pro and V4-Flash on April 24, 2026, with 1M context across official DeepSeek services.
What is the best use case for DeepSeek-R1-0528?
Its best use cases are complex reasoning, math, competitive programming, code review, debugging, front-end generation, structured outputs, and research into open reasoning models.
How does DeepSeek-R1-0528 compare with DeepSeek V4?
DeepSeek-R1-0528 is still valuable for R1-specific reasoning research and compatibility, while DeepSeek V4 is newer and better aligned with DeepSeek’s current official API direction, long-context workflows, and agentic capabilities.
Conclusion
DeepSeek-R1-0528 remains one of the most important open reasoning model updates in the DeepSeek R1 family. It significantly improves over the original R1 on math, coding, reasoning, and tool-oriented benchmarks, while adding practical features such as JSON output and function calling. It is especially relevant for developers who need strong reasoning, researchers comparing open models, and teams maintaining R1-based workflows.
In 2026, however, DeepSeek-R1-0528 should not be described as DeepSeek’s newest model. DeepSeek V4 Preview is newer, has 1M context on official services, and is the current direction for DeepSeek’s API platform.
The right choice depends on the job: use DeepSeek-R1-0528 for R1-line reasoning, benchmarks, and open-model research; use V4 for current official API development and long-context agent workflows; and use DeepSeek-R1-0528-Qwen3-8B when you need a smaller distilled reasoning model.
