Last verified against official DeepSeek API documentation: April 27, 2026.
Current API status: DeepSeek’s POST /chat/completions endpoint documents deepseek-v4-flash and deepseek-v4-pro as the current V4 chat completion model IDs.
The older names deepseek-chat and deepseek-reasoner are legacy compatibility aliases. During the transition period, deepseek-chat maps to DeepSeek-V4-Flash non-thinking mode, while deepseek-reasoner maps to DeepSeek-V4-Flash thinking mode. DeepSeek says these aliases are scheduled to become inaccessible after July 24, 2026, 15:59 UTC.
This guide focuses on integration behavior, request structure, model selection, limits, and implementation patterns.
The DeepSeek Chat Completions API is the OpenAI-compatible interface for building chatbots, copilots, assistants, extraction tools, coding agents, and backend automation with DeepSeek models. You send a model and a messages array to POST /chat/completions, then receive either a normal chat completion object or streamed Server-Sent Event chunks.
This guide covers the current V4 model names, request fields, message roles, thinking mode, streaming, JSON Output, tool calls, strict mode, Chat Prefix Completion, stateless multi-turn conversations, usage fields, context caching, common errors, and practical production patterns.
Independent site notice: Chat-Deep.ai is an independent DeepSeek guide and browser access site. It is not affiliated with DeepSeek, DeepSeek.com, chat.deepseek.com, the official DeepSeek app, or the official DeepSeek developer platform. For production decisions, verify current model names, limits, feature support, deprecation notices, and service status in the official DeepSeek documentation.
Contents
- Quick Answer
- Quick Reference
- Current DeepSeek Chat Completion Models
- Endpoint, Base URLs and Authentication
- Minimal API Examples
- Request Fields and Message Roles
- Thinking Mode
- Streaming Responses
- JSON Output
- Tool Calls
- Strict Tool Mode Beta
- Chat Prefix Completion Beta
- Multi-turn Conversations Are Stateless
- Response Object and Usage Fields
- Context Caching
- Errors, Rate Limits and Keep-alives
- Security and Production Best Practices
- Migration from Legacy Aliases
- FAQ
- Official Sources
Quick Answer
Use POST https://api.deepseek.com/chat/completions for OpenAI-compatible DeepSeek chat requests. At minimum, the JSON body needs:
- model — normally deepseek-v4-flash or deepseek-v4-pro.
- messages — an array containing at least one chat message.
Choose deepseek-v4-flash for fast, efficient, high-volume chat and automation. Choose deepseek-v4-pro for harder reasoning, coding, long-context analysis, and agentic workflows. For simple responses, disable thinking mode. For reasoning-heavy work, enable thinking mode and set reasoning_effort to high or max.
Quick Reference
| Item | Current Official Detail |
|---|---|
| Endpoint | POST /chat/completions |
| OpenAI-compatible base URL | https://api.deepseek.com |
| Anthropic-compatible base URL | https://api.deepseek.com/anthropic |
| Required request fields | model and messages |
| Current V4 model IDs | deepseek-v4-flash and deepseek-v4-pro |
| Legacy aliases | deepseek-chat and deepseek-reasoner |
| Legacy alias retirement | After July 24, 2026, 15:59 UTC |
| Thinking mode | Supported; default is enabled |
| Thinking effort | high or max |
| Context length | 1M tokens for current V4 API models |
| Supported chat features | Streaming, JSON Output, tool calls, strict tool mode Beta, and Chat Prefix Completion Beta |
Current DeepSeek Chat Completion Models
The safest current model IDs for new DeepSeek Chat Completions integrations are deepseek-v4-flash and deepseek-v4-pro. The legacy aliases still exist for compatibility during the transition period, but they should not be treated as long-term model IDs.
| Model ID | Status | Best Fit | Implementation Note |
|---|---|---|---|
| deepseek-v4-flash | Current V4 API model | Fast chat, summarization, extraction, routing, support assistants, and high-volume workflows. | Good default starting point for most applications. |
| deepseek-v4-pro | Current V4 API model | Advanced reasoning, coding, long-context analysis, complex agents, and high-value tasks. | Use when answer quality, synthesis, or reasoning depth matters more than speed. |
| deepseek-chat | Legacy compatibility alias | Existing integrations that still depend on the older alias. | Currently maps to V4-Flash non-thinking mode during the transition period. |
| deepseek-reasoner | Legacy compatibility alias | Existing reasoning-mode integrations that still depend on the older alias. | Currently maps to V4-Flash thinking mode during the transition period. |
Endpoint, Base URLs and Authentication
For direct HTTP requests, call:
POST https://api.deepseek.com/chat/completions
For OpenAI-compatible SDKs, use:
base_url = "https://api.deepseek.com"
For Anthropic-compatible SDKs and tools, use:
https://api.deepseek.com/anthropic
Authentication uses a DeepSeek API key as a Bearer token:
Authorization: Bearer YOUR_DEEPSEEK_API_KEY
Keep API keys server-side. Do not place them in browser JavaScript, mobile app bundles, public repositories, screenshots, analytics logs, or client-visible configuration.
Minimal API Examples
Minimal cURL Example
This example uses deepseek-v4-flash with thinking mode disabled for a straightforward chat response:
curl https://api.deepseek.com/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer ${DEEPSEEK_API_KEY}" \
-d '{
"model": "deepseek-v4-flash",
"messages": [
{"role": "system", "content": "You are a concise assistant."},
{"role": "user", "content": "Explain DeepSeek Chat Completions in one paragraph."}
],
"thinking": {"type": "disabled"},
"stream": false
}'
Reasoning cURL Example
For harder tasks, use deepseek-v4-pro, enable thinking mode, and choose a reasoning effort:
curl https://api.deepseek.com/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer ${DEEPSEEK_API_KEY}" \
-d '{
"model": "deepseek-v4-pro",
"messages": [
{"role": "system", "content": "You are a careful technical assistant."},
{"role": "user", "content": "Design a safe retry strategy for an API client."}
],
"thinking": {"type": "enabled"},
"reasoning_effort": "high",
"stream": false
}'
Python Example with the OpenAI SDK
DeepSeek supports OpenAI-compatible SDK usage. In the OpenAI Python SDK, pass the DeepSeek-specific thinking object through extra_body.
import os
from openai import OpenAI
client = OpenAI(
api_key=os.environ["DEEPSEEK_API_KEY"],
base_url="https://api.deepseek.com",
)
response = client.chat.completions.create(
model="deepseek-v4-flash",
messages=[
{"role": "system", "content": "You are a concise assistant."},
{"role": "user", "content": "Give me three DeepSeek API integration tips."},
],
stream=False,
extra_body={"thinking": {"type": "disabled"}},
)
print(response.choices[0].message.content)
Node.js Example with the OpenAI SDK
import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.DEEPSEEK_API_KEY,
baseURL: "https://api.deepseek.com",
});
async function main() {
const completion = await client.chat.completions.create({
model: "deepseek-v4-flash",
messages: [
{ role: "system", content: "You are a concise assistant." },
{ role: "user", content: "Give me three DeepSeek Chat Completions tips." }
],
thinking: { type: "disabled" },
stream: false,
});
console.log(completion.choices[0].message.content);
}
main();
Request Fields and Message Roles
The request body must include model and messages. The messages array must contain at least one message and can use the roles system, user, assistant, and tool.
Common Request Fields
| Field | Required? | Purpose | Example |
|---|---|---|---|
| model | Yes | The model ID to call. | deepseek-v4-flash |
| messages | Yes | The current conversation history. | [{"role":"user","content":"Hello"}] |
| thinking | No | Enables or disables thinking mode. | {"type":"enabled"} |
| reasoning_effort | No | Controls reasoning effort when thinking is enabled. | high or max |
| stream | No | Returns partial deltas over Server-Sent Events. | true |
| max_tokens | No | Caps generated output tokens. | 1024 |
| stop | No | Stops generation at one or more sequences. | ["\nEND"] |
| response_format | No | Requests text or JSON output. | {"type":"json_object"} |
| tools | No | Defines function tools the model may call. | Array of function definitions |
| tool_choice | No | Controls whether or which tool is called. | none, auto, required, or a named function |
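As an illustration, a request body that combines several of the optional fields above might look like this (the values are arbitrary examples, not recommendations):

```json
{
  "model": "deepseek-v4-flash",
  "messages": [
    {"role": "system", "content": "You are a support assistant."},
    {"role": "user", "content": "Summarize this ticket in two sentences."}
  ],
  "thinking": {"type": "disabled"},
  "stream": false,
  "max_tokens": 512,
  "stop": ["\nEND"]
}
```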
Message Roles
| Role | When to Use It | Important Fields |
|---|---|---|
| system | Set behavior, scope, tone, or task rules. | role, content |
| user | Send the user request or application input. | role, content |
| assistant | Preserve prior assistant messages, tool-call proposals, or a Beta prefix. | role, nullable content, optional tool_calls, optional reasoning_content, optional prefix |
| tool | Return the result of a function executed by your application. | role, content, tool_call_id |
Thinking Mode
DeepSeek V4 chat models support thinking and non-thinking modes. Thinking mode can return reasoning_content before the final content. The official default is enabled, so disable it explicitly when you want a simpler fast chat response.
Disable thinking mode:
{
"thinking": {"type": "disabled"}
}
Enable thinking mode with effort control:
{
"thinking": {"type": "enabled"},
"reasoning_effort": "high"
}
DeepSeek documents high and max as reasoning effort values. For compatibility, low and medium map to high, while xhigh maps to max.
In thinking mode, DeepSeek says temperature, top_p, presence_penalty, and frequency_penalty do not affect behavior. Passing those parameters for compatibility may not raise an error, but they should not be treated as thinking-mode controls.
Thinking Mode in Multi-turn and Tool-call Flows
If a thinking-mode assistant message does not involve tool calls, previously returned reasoning_content does not need to be included in the next request and is ignored if sent. If the assistant message does involve tool calls, the relevant reasoning_content must be passed back in subsequent requests for that tool-call flow.
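A minimal sketch of this rule, with the assistant turn handled as a plain dict (adapt the field access to your SDK's message objects; the helper name is illustrative):

```python
# Sketch: carrying reasoning_content through a tool-call flow.
# Assumes the assistant turn is available as a plain dict.

def assistant_turn_for_history(assistant_message: dict) -> dict:
    """Build the assistant message to append before sending tool results."""
    turn = {
        "role": "assistant",
        "content": assistant_message.get("content"),
    }
    tool_calls = assistant_message.get("tool_calls")
    if tool_calls:
        turn["tool_calls"] = tool_calls
        # In a tool-call flow, reasoning_content must travel with the turn.
        if assistant_message.get("reasoning_content") is not None:
            turn["reasoning_content"] = assistant_message["reasoning_content"]
    # Without tool calls, reasoning_content can be dropped; the API ignores it.
    return turn
```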
Streaming Responses
Set stream to true to receive partial message deltas over Server-Sent Events. The stream ends with data: [DONE]. In streaming mode, read choices[0].delta instead of waiting for a final choices[0].message.
from openai import OpenAI
client = OpenAI(
api_key="YOUR_DEEPSEEK_API_KEY",
base_url="https://api.deepseek.com",
)
stream = client.chat.completions.create(
model="deepseek-v4-flash",
messages=[
{"role": "system", "content": "You are a concise assistant."},
{"role": "user", "content": "Give me three deployment tips."},
],
stream=True,
extra_body={"thinking": {"type": "disabled"}},
)
for chunk in stream:
    delta = chunk.choices[0].delta
    if getattr(delta, "content", None):
        print(delta.content, end="")
When stream_options includes {"include_usage": true}, DeepSeek streams one additional usage chunk before [DONE]. That chunk has an empty choices array and contains request-level usage data.
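The chunk-handling logic can be sketched with simplified dict-shaped chunks (real chunks come from your SDK or SSE parser; the helper name is illustrative):

```python
# Sketch: separating streamed content deltas from the trailing usage chunk.
# Chunk shapes are simplified dicts here, mirroring the structure described above.

def collect_stream(chunks):
    """Return (full_text, usage) from a streamed sequence of chunk dicts."""
    parts = []
    usage = None
    for chunk in chunks:
        if not chunk["choices"]:
            # The usage chunk has an empty choices array.
            usage = chunk.get("usage")
            continue
        delta = chunk["choices"][0].get("delta", {})
        if delta.get("content"):
            parts.append(delta["content"])
    return "".join(parts), usage
```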
JSON Output
Use JSON Output when your application needs machine-readable structured data. Set response_format to {"type":"json_object"}, but also instruct the model to produce JSON in the prompt. DeepSeek recommends including the word “json”, providing an example of the desired shape, and setting max_tokens high enough to avoid truncation.
import json
from openai import OpenAI
client = OpenAI(
api_key="YOUR_DEEPSEEK_API_KEY",
base_url="https://api.deepseek.com",
)
response = client.chat.completions.create(
model="deepseek-v4-flash",
messages=[
{
"role": "system",
"content": (
"Return valid json only. "
"Use this shape: {\"question\": string, \"answer\": string}."
),
},
{
"role": "user",
"content": "Convert this into json: What is the capital of Egypt? Cairo.",
},
],
response_format={"type": "json_object"},
max_tokens=256,
extra_body={"thinking": {"type": "disabled"}},
)
data = json.loads(response.choices[0].message.content)
print(data)
Always parse and validate the returned JSON before using it in business logic. If finish_reason is length, the JSON may be incomplete even when JSON mode is enabled.
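A hedged sketch of that defensive parsing, assuming you already have the message content and finish_reason in hand (the function name is illustrative):

```python
import json

# Sketch: defensive parsing of a JSON-mode completion before it reaches
# business logic. Refuses truncated or malformed payloads.

def parse_json_output(content: str, finish_reason: str) -> dict:
    """Parse JSON output, rejecting truncated or invalid payloads."""
    if finish_reason == "length":
        raise ValueError("Output hit max_tokens; JSON may be truncated.")
    try:
        data = json.loads(content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model returned invalid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level.")
    return data
```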
Tool Calls
Tool calls let the model propose function calls, but the model does not execute your real functions. Your application defines available tools, the model may return tool_calls, your code executes the selected function, and then your app sends a tool message containing the result and the matching tool_call_id.
import json
from openai import OpenAI
client = OpenAI(
api_key="YOUR_DEEPSEEK_API_KEY",
base_url="https://api.deepseek.com",
)
tools = [
{
"type": "function",
"function": {
"name": "get_delivery_status",
"description": "Get delivery status for an order.",
"parameters": {
"type": "object",
"properties": {
"order_id": {
"type": "string",
"description": "The customer order ID"
}
},
"required": ["order_id"]
}
}
}
]
messages = [
{"role": "user", "content": "Where is order A12345?"}
]
response = client.chat.completions.create(
model="deepseek-v4-flash",
messages=messages,
tools=tools,
tool_choice="auto",
extra_body={"thinking": {"type": "disabled"}},
)
assistant_message = response.choices[0].message
messages.append(assistant_message)
if assistant_message.tool_calls:
    call = assistant_message.tool_calls[0]
    args = json.loads(call.function.arguments)
    # Your application executes the real function here.
    tool_result = f"Order {args['order_id']} is out for delivery."
    messages.append({
        "role": "tool",
        "tool_call_id": call.id,
        "content": tool_result,
    })
final_response = client.chat.completions.create(
model="deepseek-v4-flash",
messages=messages,
tools=tools,
extra_body={"thinking": {"type": "disabled"}},
)
print(final_response.choices[0].message.content)
tool_choice can be none, auto, required, or a named function object that forces a specific function. DeepSeek currently supports function tools and documents a maximum of 128 functions. Function names must use letters, numbers, underscores, or dashes, with a maximum length of 64 characters.
Important: tool_calls[].function.arguments is returned as JSON-format text, but the model may still produce invalid JSON or arguments outside your schema. Validate arguments before executing real code, database writes, transactions, or external API calls.
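A sketch of that validation for the get_delivery_status example above (the allowed-key set and checks are illustrative; substitute your real tool's parameter schema):

```python
import json

# Sketch: validating tool-call arguments before executing anything real.
# The expected keys here match the hypothetical get_delivery_status tool above.

ALLOWED_KEYS = {"order_id"}

def validate_order_args(raw_arguments: str) -> dict:
    """Parse and check arguments; raise on anything outside the schema."""
    args = json.loads(raw_arguments)  # may raise json.JSONDecodeError
    if not isinstance(args, dict):
        raise ValueError("Arguments must be a JSON object.")
    unexpected = set(args) - ALLOWED_KEYS
    if unexpected:
        raise ValueError(f"Unexpected argument keys: {sorted(unexpected)}")
    order_id = args.get("order_id")
    if not isinstance(order_id, str) or not order_id.strip():
        raise ValueError("order_id must be a non-empty string.")
    return args
```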
Strict Tool Mode Beta
Strict mode is a Beta tool-calling feature that makes the model follow your JSON Schema more closely. To use it, set the base URL to https://api.deepseek.com/beta and set strict: true on every function in the tools list.
from openai import OpenAI
client = OpenAI(
api_key="YOUR_DEEPSEEK_API_KEY",
base_url="https://api.deepseek.com/beta",
)
tools = [
{
"type": "function",
"function": {
"name": "get_delivery_status",
"strict": True,
"description": "Get delivery status for an order.",
"parameters": {
"type": "object",
"properties": {
"order_id": {
"type": "string",
"description": "The customer order ID"
}
},
"required": ["order_id"],
"additionalProperties": False
}
}
}
]
DeepSeek documents a subset of JSON Schema for strict mode, including object, string, number, integer, boolean, array, enum, and anyOf. For object schemas, every property must be listed in required, and additionalProperties must be false.
Chat Prefix Completion Beta
Chat Prefix Completion lets your app provide the beginning of the assistant’s answer and ask the model to continue it. The last message must be an assistant message with prefix: true, and the request must use the Beta base URL.
from openai import OpenAI
client = OpenAI(
api_key="YOUR_DEEPSEEK_API_KEY",
base_url="https://api.deepseek.com/beta",
)
response = client.chat.completions.create(
model="deepseek-v4-pro",
messages=[
{"role": "user", "content": "Write a Python function that checks if a string is a palindrome."},
{"role": "assistant", "content": "```python\n", "prefix": True},
],
stop=["```"],
)
print(response.choices[0].message.content)
Use this feature for controlled continuation, code-completion-like behavior, and cases where your application needs the output to start from a known assistant prefix. For ordinary chat requests, use the standard base URL instead of the Beta base URL.
Multi-turn Conversations Are Stateless
The DeepSeek /chat/completions API is stateless. The server does not automatically remember previous turns. Your application must append the relevant prior messages and send them again with each request.
from openai import OpenAI
client = OpenAI(
api_key="YOUR_DEEPSEEK_API_KEY",
base_url="https://api.deepseek.com",
)
messages = [
{"role": "user", "content": "What is the highest mountain in the world?"}
]
first = client.chat.completions.create(
model="deepseek-v4-flash",
messages=messages,
extra_body={"thinking": {"type": "disabled"}},
)
messages.append(first.choices[0].message)
messages.append({"role": "user", "content": "What is the second highest?"})
second = client.chat.completions.create(
model="deepseek-v4-flash",
messages=messages,
extra_body={"thinking": {"type": "disabled"}},
)
print(second.choices[0].message.content)
Send only the conversation history your task actually needs. For long sessions, use summarization, retrieval, or application-side memory to control prompt size while preserving the information required for the next answer.
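One possible trimming sketch, using a rough character budget for simplicity (a production version should count tokens with a real tokenizer; the helper name and budget are illustrative):

```python
# Sketch: trimming history to a rough character budget while keeping the
# system message and the most recent turns. Characters stand in for tokens
# only to keep the example self-contained.

def trim_history(messages, budget_chars=8000):
    """Keep the first system message plus as many recent turns as fit."""
    system = [m for m in messages if m["role"] == "system"][:1]
    rest = [m for m in messages if m["role"] != "system"]
    kept, used = [], sum(len(m["content"] or "") for m in system)
    for message in reversed(rest):
        cost = len(message["content"] or "")
        if kept and used + cost > budget_chars:
            break
        kept.append(message)
        used += cost
    return system + list(reversed(kept))
```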
Response Object and Usage Fields
A non-streamed response is a chat.completion object. Most apps read choices[0].message.content, but production systems should also inspect finish_reason, tool_calls, reasoning_content, system_fingerprint, and usage where relevant.
| Field | Meaning | Why It Matters |
|---|---|---|
| choices | Candidate completions. | Most applications use choices[0]. |
| message.content | The final assistant answer. | Main output text for non-streaming requests. |
| message.reasoning_content | Reasoning content in thinking mode. | Needed for advanced thinking-mode workflows and some tool-call flows. |
| message.tool_calls | Function call proposals. | Your app executes the real tool and returns a tool message. |
| finish_reason | Why generation stopped. | Useful for retries, truncation handling, safety handling, and tool-call routing. |
| system_fingerprint | Backend configuration fingerprint. | Helpful for debugging and reproducibility tracking. |
| usage | Token accounting for the request. | Useful for monitoring, capacity planning, cache analysis, and output-limit tuning. |
DeepSeek documents these finish_reason values: stop, length, content_filter, tool_calls, and insufficient_system_resource.
Usage Fields to Log
The response usage object can include:
- prompt_tokens
- prompt_cache_hit_tokens
- prompt_cache_miss_tokens
- completion_tokens
- completion_tokens_details.reasoning_tokens
- total_tokens
DeepSeek states that prompt_tokens equals prompt_cache_hit_tokens + prompt_cache_miss_tokens. Track cache-hit and cache-miss tokens separately so you can understand whether stable prompt prefixes are being reused successfully.
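A sketch of turning the usage object into log-friendly metrics, treating usage as a plain dict (field names follow the list above; the helper name is illustrative):

```python
# Sketch: summarizing the usage object into metrics worth logging,
# including the cache hit rate implied by the identity
# prompt_tokens = prompt_cache_hit_tokens + prompt_cache_miss_tokens.

def usage_metrics(usage: dict) -> dict:
    hit = usage.get("prompt_cache_hit_tokens", 0)
    miss = usage.get("prompt_cache_miss_tokens", 0)
    prompt = usage.get("prompt_tokens", hit + miss)
    details = usage.get("completion_tokens_details") or {}
    return {
        "prompt_tokens": prompt,
        "cache_hit_rate": hit / prompt if prompt else 0.0,
        "completion_tokens": usage.get("completion_tokens", 0),
        "reasoning_tokens": details.get("reasoning_tokens", 0),
        "total_tokens": usage.get("total_tokens", 0),
    }
```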
Context Caching
DeepSeek Context Caching is enabled by default. When later requests reuse an already persisted prompt prefix, the overlapping prefix can count as a cache hit. This can improve efficiency for repeated system prompts, long shared documents, repeated few-shot examples, and multi-turn conversations.
The practical rule is to keep reusable prompt prefixes stable. Put stable system instructions, shared documents, schemas, and examples before changing user questions, session IDs, timestamps, and request-specific metadata when your application design allows it.
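A sketch of that ordering, with illustrative constants standing in for your stable instructions and shared documents:

```python
# Sketch: assembling messages so the stable prefix comes first.
# The constants and helper name are illustrative, not part of the DeepSeek API.

STABLE_SYSTEM_PROMPT = "You are a support assistant for ExampleCo."  # identical across requests
SHARED_POLICY_DOC = "Refund policy: items may be returned within 30 days."  # also stable

def build_messages(user_question: str, request_id: str):
    return [
        # Stable prefix: byte-identical across requests, so it can count as a cache hit.
        {"role": "system", "content": STABLE_SYSTEM_PROMPT + "\n\n" + SHARED_POLICY_DOC},
        # Changing material goes last so it does not break the shared prefix.
        {"role": "user", "content": f"[request {request_id}] {user_question}"},
    ]
```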
Context caching is best-effort. The official guide says cache construction can take seconds, matching depends on persisted prefix units, and unused cache entries are usually cleared within a few hours to a few days.
Errors, Rate Limits and Keep-alives
DeepSeek dynamically limits concurrency based on server load. When the limit is reached, the API returns HTTP 429. While a request is waiting to be scheduled, non-streaming requests may return empty lines and streaming requests may return SSE keep-alive comments such as : keep-alive. If you parse HTTP manually, handle these keep-alives instead of treating them as malformed output.
| Status / Symptom | Common Cause | First Check |
|---|---|---|
| 400 Invalid Format | Malformed JSON, invalid message structure, or incompatible thinking/tool state. | Rebuild the request from a minimal known-good example. |
| 401 Authentication Fails | Wrong or missing API key. | Check the Bearer token and server-side environment variables. |
| 402 Insufficient Balance | No usable account balance. | Check account status in the official platform. |
| 422 Invalid Parameters | Unsupported field, invalid value, or invalid strict schema. | Remove optional fields, then add them back one by one. |
| 429 Rate Limit Reached | Requests sent too quickly or current concurrency limit reached. | Add backoff, reduce concurrency, and avoid retry storms. |
| 500 Server Error | Server-side issue. | Retry with backoff and check whether the request is idempotent. |
| 503 Server Overloaded | High traffic or overload. | Retry with backoff and monitor service status. |
| JSON response appears stuck | JSON Output enabled without clear prompt instructions to output JSON. | Add explicit JSON instructions, an example shape, and a reasonable max_tokens. |
| Tool-call arguments are unsafe | Arguments may be invalid JSON or include unexpected fields. | Parse, validate, authorize, and sanitize before execution. |
| Thinking + tool flow fails | Relevant reasoning_content was not passed back during the tool-call flow. | Follow the official thinking-mode tool-call pattern. |
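The backoff guidance in the table above can be sketched as follows; send_request is a placeholder for your real HTTP call, and the parameter defaults are illustrative:

```python
import random
import time

# Sketch: exponential backoff with full jitter for transient 429/500/503
# responses. `send_request` is a placeholder that returns (http_status, body).

RETRYABLE = {429, 500, 503}

def backoff_delays(max_retries, base, cap=30.0):
    """Yield one sleep duration per retry: full jitter under a growing cap."""
    for attempt in range(max_retries):
        yield random.uniform(0, min(cap, base * (2 ** attempt)))

def call_with_retries(send_request, max_retries=5, base=0.5):
    status, body = send_request()
    for delay in backoff_delays(max_retries, base):
        if status not in RETRYABLE:
            return status, body
        time.sleep(delay)
        status, body = send_request()
    if status in RETRYABLE:
        raise RuntimeError(f"Giving up after {max_retries} retries (status {status})")
    return status, body
```

Keep retried requests idempotent in your own application, since a 500 may have partially succeeded server-side.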
Security and Production Best Practices
- Use deepseek-v4-flash or deepseek-v4-pro directly for new Chat Completions integrations.
- Use deepseek-v4-flash first for routine, high-volume, latency-sensitive workloads.
- Route harder reasoning, coding, long-context, and agentic tasks to deepseek-v4-pro.
- Disable thinking mode when you want fast, simple chat responses.
- Enable thinking mode for reasoning-heavy tasks and handle reasoning_content correctly.
- Manage conversation history in your own application because /chat/completions is stateless.
- Validate JSON Output before using it in backend logic.
- Validate tool-call arguments before executing functions, database writes, purchases, emails, or external API calls.
- Store API keys in server-side secrets management, not in client-visible code.
- Log usage, cache-hit tokens, cache-miss tokens, completion tokens, and reasoning tokens for observability.
- Keep reusable prompt prefixes stable when you want context caching benefits.
- Add retry logic with exponential backoff and jitter for transient 429, 500, and 503 responses.
- Use idempotency protections in your own application when a retry could repeat an external action.
- Test streaming, JSON Output, tool calls, and strict mode separately before combining them in one workflow.
- Check the official DeepSeek change log before the legacy-alias retirement deadline on July 24, 2026.
Migration from deepseek-chat and deepseek-reasoner
The legacy aliases are useful for temporary compatibility, but the current V4 model IDs are clearer and safer for long-term integrations.
| Legacy Name | Current Mapping During Transition | Recommended Replacement |
|---|---|---|
| deepseek-chat | DeepSeek-V4-Flash non-thinking mode | deepseek-v4-flash with {"thinking":{"type":"disabled"}} |
| deepseek-reasoner | DeepSeek-V4-Flash thinking mode | deepseek-v4-flash or deepseek-v4-pro with {"thinking":{"type":"enabled"}} and reasoning_effort |
A practical migration path is to start by replacing routine deepseek-chat calls with deepseek-v4-flash, then route difficult reasoning or coding tasks to deepseek-v4-pro. Test output quality, latency, token usage, context-cache behavior, JSON validity, and tool-call reliability before switching production traffic.
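A sketch of a thin migration shim that rewrites legacy alias requests into the V4 form from the table above (the helper name is illustrative; the alias-to-mode mapping follows the transition table):

```python
# Sketch: rewriting legacy alias requests into explicit V4 model IDs plus a
# thinking setting, per the transition mapping documented above.

LEGACY_MAP = {
    "deepseek-chat": ("deepseek-v4-flash", {"type": "disabled"}),
    "deepseek-reasoner": ("deepseek-v4-flash", {"type": "enabled"}),
}

def migrate_request(body: dict) -> dict:
    """Return a copy of the request body with legacy aliases rewritten."""
    model = body.get("model")
    if model not in LEGACY_MAP:
        return body
    new_model, thinking = LEGACY_MAP[model]
    migrated = dict(body)
    migrated["model"] = new_model
    # Respect an explicit thinking setting if the caller already sent one.
    migrated.setdefault("thinking", thinking)
    return migrated
```

When migrating deepseek-reasoner traffic, you may also want to set reasoning_effort explicitly and evaluate deepseek-v4-pro for the hardest tasks.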
Common DeepSeek Chat Completions Mistakes
| Mistake | Better Approach |
|---|---|
| Treating deepseek-chat as a current model ID. | Describe it as a legacy alias that currently maps to V4-Flash non-thinking mode during the transition period. |
| Forgetting that /chat/completions is stateless. | Send the relevant conversation history with each request. |
| Enabling JSON Output without prompting for JSON. | Use response_format, include the word “json”, and provide an example schema or shape. |
| Executing tool-call arguments without validation. | Parse, validate, authorize, and sanitize arguments before any real action. |
| Expecting temperature to control thinking-mode behavior. | Use thinking and reasoning_effort for thinking-mode behavior. |
| Putting changing metadata at the beginning of every prompt. | Keep stable instructions, schemas, documents, and examples early when context caching matters. |
| Treating Beta features as ordinary production behavior. | Use the Beta base URL only when you intentionally need features such as strict tool mode or Chat Prefix Completion. |
DeepSeek Chat Completions FAQ
What is the DeepSeek Chat Completions API?
It is DeepSeek’s main chat-generation endpoint: POST /chat/completions. You send a model and message history, and DeepSeek returns either a normal chat completion object or streamed chunks.
What fields are required in a DeepSeek chat completion request?
The body requires model and messages. The messages array must include at least one message.
Which model should I use: deepseek-v4-flash or deepseek-v4-pro?
Use deepseek-v4-flash for fast, efficient, high-volume workflows. Use deepseek-v4-pro for advanced reasoning, coding, long-context analysis, and complex agentic tasks.
Should I still use deepseek-chat or deepseek-reasoner?
Only for temporary compatibility. They are legacy aliases during the V4 transition period. New integrations should use deepseek-v4-flash or deepseek-v4-pro directly.
How do I enable or disable thinking mode?
Use {"thinking":{"type":"enabled"}} to enable thinking mode, or {"thinking":{"type":"disabled"}} to disable it. In the OpenAI Python SDK, pass the thinking object through extra_body.
What is reasoning_effort?
reasoning_effort controls thinking effort when thinking mode is enabled. DeepSeek documents high and max.
How do I stream DeepSeek chat completions?
Set stream to true. The API sends partial deltas over Server-Sent Events and terminates the stream with data: [DONE].
How do I get JSON output?
Set response_format to {"type":"json_object"}, include the word “json” in the prompt, provide an example of the target JSON shape, and set max_tokens high enough to avoid truncation.
How do tool calls work?
Your application defines function tools. The model may return tool_calls. Your code executes the real function, then returns the result as a tool message with the matching tool_call_id.
Is the DeepSeek Chat Completions API stateless?
Yes. The server does not automatically remember previous conversation turns. Your application must resend the relevant message history with each request.
Official Sources
- DeepSeek API Reference: Create Chat Completion
- DeepSeek API Docs: Your First API Call
- DeepSeek V4 Preview Release
- DeepSeek Thinking Mode Guide
- DeepSeek Multi-round Conversation Guide
- DeepSeek JSON Output Guide
- DeepSeek Tool Calls / Function Calling Guide
- DeepSeek Chat Prefix Completion Guide
- DeepSeek Context Caching Guide
- DeepSeek Rate Limit Guide
- DeepSeek Error Codes
- DeepSeek Anthropic API Guide
Related Chat-Deep.ai resources: DeepSeek API guide, DeepSeek Context Caching guide, DeepSeek V4 guide, DeepSeek Models hub, and DeepSeek Status guide.
