Last updated: June 1, 2026
Yes, you can run or invoke DeepSeek from an AWS Private VPC. The most managed path is Amazon Bedrock with AWS PrivateLink. The most customizable path is Amazon SageMaker AI or self-hosted inference in private subnets with VPC endpoints, IAM, KMS, CloudWatch, CloudTrail, VPC Flow Logs, and strict egress controls.
As of this update, AWS documentation lists DeepSeek models in Amazon Bedrock, including DeepSeek V3.2, DeepSeek-V3.1, and DeepSeek-R1.
Important model-name note: DeepSeek model names differ by deployment channel. DeepSeek’s direct API currently uses model IDs such as deepseek-v4-flash and deepseek-v4-pro, while Amazon Bedrock exposes AWS-managed DeepSeek models with Bedrock-specific names and IDs such as DeepSeek V3.2, DeepSeek-V3.1, and DeepSeek-R1. Do not copy model IDs between DeepSeek’s direct API and Amazon Bedrock without checking the official documentation for the deployment path you are using.
AWS also documents Amazon Bedrock interface VPC endpoint support for bedrock, bedrock-runtime, bedrock-mantle, bedrock-agent, and bedrock-agent-runtime, which is the core network primitive for a private Bedrock integration.
What “DeepSeek on AWS Private VPC” Means
“DeepSeek on AWS Private VPC” can mean three different implementation patterns.
The first pattern is private API access to a managed DeepSeek model. Your application runs in a private subnet and calls DeepSeek through Amazon Bedrock using AWS PrivateLink. In this pattern, you do not host the model weights, provision GPUs, or operate inference containers. Your main responsibilities are IAM, endpoint policies, network routing, observability, data governance, and application security. AWS states that AWS PrivateLink lets you create a private connection between your VPC and Amazon Bedrock without using an internet gateway, NAT device, VPN connection, or Direct Connect connection.
The second pattern is a DeepSeek deployment on SageMaker AI inside a VPC. You deploy DeepSeek or a distilled DeepSeek-compatible model to a SageMaker AI endpoint, usually from model artifacts stored in Amazon S3 or from SageMaker JumpStart where available. Your application then invokes SageMaker Runtime through an interface VPC endpoint. SageMaker AI supports interface VPC endpoints for the SageMaker API and SageMaker Runtime, and AWS documents that this lets VPC resources communicate with those services without connecting to the public internet.
The third pattern is self-hosting DeepSeek on EC2 or EKS in private subnets. This is the most flexible option, but also the most operationally demanding. You manage GPU capacity, serving software, scaling, patching, model artifacts, container scanning, network egress, observability, and incident response.
Private subnet does not always mean “no internet.” A workload in a private subnet can still reach the internet through a NAT Gateway. A stricter deployment with no internet egress removes NAT routes from the private route tables and uses VPC endpoints for AWS services such as Bedrock, SageMaker Runtime, S3, CloudWatch Logs, STS, ECR, Secrets Manager, and KMS where required.
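One way to check whether a "private" subnet still has a path to the internet is to look for default routes in its route table. The sketch below assumes its input is one item from the RouteTables list returned by the EC2 DescribeRouteTables API; the function name and the sample IDs are hypothetical.

```python
from typing import Any


def internet_default_routes(route_table: dict[str, Any]) -> list[str]:
    """Return targets of default routes that point at a NAT or internet gateway.

    `route_table` is assumed to be one item from the "RouteTables" list
    returned by EC2 DescribeRouteTables.
    """
    targets: list[str] = []
    for route in route_table.get("Routes", []):
        is_default = (
            route.get("DestinationCidrBlock") == "0.0.0.0/0"
            or route.get("DestinationIpv6CidrBlock") == "::/0"
        )
        if not is_default:
            continue
        gateway_id = route.get("GatewayId") or ""
        if route.get("NatGatewayId"):
            targets.append(route["NatGatewayId"])
        elif gateway_id.startswith("igw-"):
            targets.append(gateway_id)
        elif route.get("EgressOnlyInternetGatewayId"):
            targets.append(route["EgressOnlyInternetGatewayId"])
    return targets


# Hypothetical sample data shaped like a DescribeRouteTables item.
nat_table = {
    "RouteTableId": "rtb-example-nat",
    "Routes": [
        {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
        {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-0123456789abcdef0"},
    ],
}
strict_table = {
    "RouteTableId": "rtb-example-strict",
    "Routes": [{"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"}],
}
```

An empty result for every route table associated with the workload subnets is a reasonable first signal of a no-egress design, although it does not cover other paths such as peering, Transit Gateway, or VPN routes.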
Recommended Architecture
For most production teams, start with Amazon Bedrock + AWS PrivateLink if the required DeepSeek model, Region, API family, quota, and governance requirements are supported. Use SageMaker when you need custom containers, model artifact control, endpoint-level hosting control, or fine-tuning/deployment workflows. Use EC2/EKS only when you have a strong reason to own the full serving stack.

AWS documents bedrock-runtime as the runtime endpoint for inference operations and bedrock-mantle as the endpoint that serves OpenAI-compatible Responses API, Chat Completions API, and Anthropic Messages API traffic in Bedrock.
Model support note: Not every DeepSeek model in Bedrock supports every Mantle API. DeepSeek V3.2 and DeepSeek-V3.1 support Chat Completions through bedrock-mantle, while DeepSeek-R1 currently supports bedrock-runtime but not bedrock-mantle. Always check the model card before selecting an API family.
Choosing the Right Deployment Path
| Option | Best for | Private VPC pattern | Operational complexity | Customization level | Security controls | Cost model | Trade-offs |
|---|---|---|---|---|---|---|---|
| Amazon Bedrock + AWS PrivateLink | Teams that want managed DeepSeek API access | Private app calls bedrock-runtime or bedrock-mantle through interface endpoints | Low | Low to medium | IAM, endpoint policies, private DNS, CloudTrail, CloudWatch, Guardrails | Bedrock model pricing + PrivateLink | Least infrastructure control, dependent on model/API/Region availability |
| Bedrock fully managed DeepSeek | Production apps that want serverless DeepSeek access | Same as Bedrock PrivateLink | Low | Low | Bedrock security, IAM, Guardrails, logging | Model-dependent Bedrock pricing | Model support varies by Region and API family |
| SageMaker JumpStart or SageMaker endpoint inside VPC | ML teams that need hosted model control | App calls SageMaker Runtime through PrivateLink | Medium | Medium | VPC config, security groups, IAM, KMS, network isolation | Instance or endpoint-based pricing | More configuration and lifecycle management |
| SageMaker with TGI/vLLM and S3 artifacts | Custom model packaging and serving stack | Model artifacts in S3, container in private subnets, runtime endpoint private | High | High | S3 endpoint policies, model provenance, container scanning | SageMaker hosting instances + storage/logging | You own container behavior and model serving tuning |
| EC2/EKS self-hosted DeepSeek | Maximum control, custom GPU scheduling, custom runtime isolation | Private nodes, private load balancers, private ECR/S3/logging endpoints | Very high | Very high | Full network, host, container, IAM, KMS, runtime security | GPU compute + storage + networking | Highest operational burden |
AWS’s own Bedrock vs. SageMaker decision guide describes Bedrock as a managed service for integrating foundation models through API calls, while SageMaker AI is aimed at building, training, deploying, and customizing ML models with more infrastructure control.
Option 1 — Use DeepSeek through Amazon Bedrock PrivateLink
This is the recommended first option for most teams that want private access to DeepSeek through Amazon Bedrock and AWS PrivateLink.
Step 1: Confirm model and Region support
Before creating infrastructure, confirm the exact DeepSeek model, API family, service tier, quotas, and Region. AWS currently lists DeepSeek V3.2, DeepSeek-V3.1, and DeepSeek-R1 in Amazon Bedrock documentation, but the available model ID, inference profile ID, endpoint URL, and supported API can vary by model and Region.
DeepSeek V3.2 is documented as a DeepSeek mixture-of-experts model with support for bedrock-runtime and bedrock-mantle endpoints. DeepSeek-R1 is documented with bedrock-runtime support, while its model card shows bedrock-mantle is not supported for that model.
Step 2: Choose the endpoint family
Use:
- bedrock-runtime for InvokeModel, InvokeModelWithResponseStream, Converse, and ConverseStream.
- bedrock-mantle for OpenAI-compatible API patterns where the chosen model and Region support it.
- bedrock for control-plane actions such as model discovery or management operations.
AWS documents interface endpoint service names including com.amazonaws.region.bedrock, com.amazonaws.region.bedrock-runtime, and com.amazonaws.region.bedrock-mantle. It also documents private DNS names such as bedrock-runtime.region.amazonaws.com and bedrock-mantle.region.api.aws.
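The naming patterns above can be generated per Region rather than typed by hand. This is a small sketch that only encodes the patterns quoted in this section; the helper names are hypothetical, and the resulting strings should still be checked against the endpoint service list for your Region.

```python
# Private DNS patterns quoted in this section. The control-plane "bedrock"
# family is left out because no DNS pattern is given for it here.
PRIVATE_DNS_TEMPLATES = {
    "bedrock-runtime": "bedrock-runtime.{region}.amazonaws.com",
    "bedrock-mantle": "bedrock-mantle.{region}.api.aws",
}

ENDPOINT_FAMILIES = ("bedrock", "bedrock-runtime", "bedrock-mantle")


def service_name(region: str, family: str) -> str:
    """Build the interface endpoint service name for one Bedrock endpoint family."""
    if family not in ENDPOINT_FAMILIES:
        raise ValueError(f"unknown endpoint family: {family}")
    return f"com.amazonaws.{region}.{family}"


def private_dns_name(region: str, family: str) -> str:
    """Build the private DNS name for a family whose pattern is listed above."""
    return PRIVATE_DNS_TEMPLATES[family].format(region=region)
```

For example, service_name("us-west-2", "bedrock-runtime") gives the value to pass as the service name when creating the interface endpoint.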
Step 3: Create interface VPC endpoints
Use private subnets in at least two Availability Zones and attach a security group that allows HTTPS from your application security group only.
REGION="us-west-2"
VPC_ID="vpc-xxxxxxxxxxxxxxxxx"
SUBNET_A="subnet-aaaaaaaaaaaaaaaaa"
SUBNET_B="subnet-bbbbbbbbbbbbbbbbb"
VPCE_SG_ID="sg-xxxxxxxxxxxxxxxxx"
# Bedrock Runtime endpoint
aws ec2 create-vpc-endpoint \
--region "$REGION" \
--vpc-id "$VPC_ID" \
--vpc-endpoint-type Interface \
--service-name "com.amazonaws.${REGION}.bedrock-runtime" \
--subnet-ids "$SUBNET_A" "$SUBNET_B" \
--security-group-ids "$VPCE_SG_ID" \
--private-dns-enabled
# Optional: Bedrock control-plane endpoint
aws ec2 create-vpc-endpoint \
--region "$REGION" \
--vpc-id "$VPC_ID" \
--vpc-endpoint-type Interface \
--service-name "com.amazonaws.${REGION}.bedrock" \
--subnet-ids "$SUBNET_A" "$SUBNET_B" \
--security-group-ids "$VPCE_SG_ID" \
--private-dns-enabled
Verify the exact VPC endpoint service name in your AWS Region before deployment. Amazon Bedrock endpoint availability, endpoint families, and model support can vary by Region and may change over time.
If your selected model supports OpenAI-compatible Bedrock Mantle APIs in your target Region, create the Mantle endpoint as well:
aws ec2 create-vpc-endpoint \
--region "$REGION" \
--vpc-id "$VPC_ID" \
--vpc-endpoint-type Interface \
--service-name "com.amazonaws.${REGION}.bedrock-mantle" \
--subnet-ids "$SUBNET_A" "$SUBNET_B" \
--security-group-ids "$VPCE_SG_ID" \
--private-dns-enabled
Always verify the service name in the VPC console or AWS CLI for your Region before deploying. Bedrock endpoint support evolves, and not every model supports every API family.
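The verification step can be scripted. The sketch below assumes boto3 is installed and that the caller is allowed to call EC2 DescribeVpcEndpointServices; the function names are hypothetical. The comparison logic is kept separate from the API call so it can be checked without AWS access.

```python
def missing_endpoint_services(
    available: list[str], region: str, families: list[str]
) -> list[str]:
    """Return the expected Bedrock service names that are not in `available`."""
    available_set = set(available)
    wanted = [f"com.amazonaws.{region}.{family}" for family in families]
    return [name for name in wanted if name not in available_set]


def list_available_services(region: str) -> list[str]:
    """Collect endpoint service names visible in a Region (requires AWS credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    names: list[str] = []
    token = None
    while True:
        kwargs = {"NextToken": token} if token else {}
        page = ec2.describe_vpc_endpoint_services(**kwargs)
        names.extend(page.get("ServiceNames", []))
        token = page.get("NextToken")
        if not token:
            return names


# Usage sketch, run from a host with AWS credentials:
# available = list_available_services("us-west-2")
# print(missing_endpoint_services(available, "us-west-2",
#                                 ["bedrock-runtime", "bedrock-mantle"]))
```

An empty list means every requested endpoint family is offered in that Region; it does not confirm that a particular DeepSeek model supports the corresponding API.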
Step 4: Attach restrictive endpoint policies
The default endpoint policy allows full access through the interface endpoint. AWS recommends using a custom endpoint policy when you need to control what can be accessed from your VPC.
Example policy skeleton:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "InvokeApprovedDeepSeekModelsOnly",
"Effect": "Allow",
"Principal": "*",
"Action": [
"bedrock:InvokeModel",
"bedrock:InvokeModelWithResponseStream",
"bedrock:Converse",
"bedrock:ConverseStream"
],
"Resource": [
"arn:aws:bedrock:REGION::foundation-model/APPROVED_MODEL_ID",
"arn:aws:bedrock:REGION:ACCOUNT_ID:inference-profile/APPROVED_PROFILE_ID",
"arn:aws:bedrock:REGION:ACCOUNT_ID:application-inference-profile/APPROVED_PROFILE_ID"
]
}
]
}
Replace the resource ARN with the exact foundation model ARN, system inference profile ARN, or application inference profile ARN returned by AWS for your Region and account. Foundation model ARNs and inference profile ARNs use different formats.
Use IAM identity policies as the primary control for who can invoke which model, then use endpoint policies as an additional network boundary. For Bedrock Mantle, AWS’s example endpoint policy uses the bedrock-mantle:CreateInference action.
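If the approved model list lives in configuration, the endpoint policy can be generated instead of edited by hand. The following is a minimal sketch that reproduces the policy skeleton above from a list of ARNs; the function name is hypothetical, and the action list is copied from that skeleton rather than independently verified.

```python
import json

# Actions copied from the policy skeleton in this section.
INVOKE_ACTIONS = (
    "bedrock:InvokeModel",
    "bedrock:InvokeModelWithResponseStream",
    "bedrock:Converse",
    "bedrock:ConverseStream",
)


def build_endpoint_policy(approved_arns: list[str], actions=INVOKE_ACTIONS) -> dict:
    """Build an endpoint policy that allows invocation of approved ARNs only."""
    if not approved_arns:
        raise ValueError("at least one approved ARN is required")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "InvokeApprovedDeepSeekModelsOnly",
                "Effect": "Allow",
                "Principal": "*",
                "Action": list(actions),
                "Resource": list(approved_arns),
            }
        ],
    }


# Placeholder ARN; replace with a value returned by AWS for your Region.
policy = build_endpoint_policy(
    ["arn:aws:bedrock:us-west-2::foundation-model/APPROVED_MODEL_ID"]
)
policy_document = json.dumps(policy, indent=2)

# Applying it would look roughly like this (requires boto3 and credentials):
# boto3.client("ec2").modify_vpc_endpoint(
#     VpcEndpointId="vpce-xxxxxxxxxxxxxxxxx", PolicyDocument=policy_document)
```

Generating the document from one approved list makes it easier to keep the endpoint policy and the IAM identity policies aligned.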
Step 5: Test from a private workload
The following Boto3 example uses bedrock-runtime. Replace the model ID or inference profile ID with a value verified for your Region and account. AWS documentation currently shows us.deepseek.r1-v1:0 as an example inference profile ID for DeepSeek-R1 in the US, but production code should not assume the same value across Regions, accounts, or model versions.
import boto3
from botocore.config import Config
REGION = "us-west-2"
MODEL_ID = "REPLACE_WITH_VERIFIED_DEEPSEEK_MODEL_OR_INFERENCE_PROFILE_ID"
client = boto3.client(
"bedrock-runtime",
region_name=REGION,
config=Config(connect_timeout=30, read_timeout=300)
)
response = client.converse(
modelId=MODEL_ID,
messages=[
{
"role": "user",
"content": [
{"text": "Summarize the security benefits of private LLM inference on AWS."}
]
}
],
inferenceConfig={
"maxTokens": 1024,
"temperature": 0.2
}
)
print(response["output"]["message"]["content"][0]["text"])
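Before trusting a successful call, it can help to confirm that the runtime hostname resolves to private addresses inside the VPC, which is what private DNS on the interface endpoint is expected to produce. This is a sketch using only the standard library; the helper names are hypothetical, and the hostname follows the private DNS pattern quoted earlier in this guide.

```python
import ipaddress
import socket


def all_private(addresses: list[str]) -> bool:
    """True when the list is non-empty and every address is in a private range."""
    return bool(addresses) and all(
        ipaddress.ip_address(addr).is_private for addr in addresses
    )


def resolve(hostname: str, port: int = 443) -> list[str]:
    """Resolve a hostname to a sorted list of unique IP addresses."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


# Usage sketch, run from the private workload:
# ips = resolve("bedrock-runtime.us-west-2.amazonaws.com")
# print(ips, "private" if all_private(ips) else "NOT private: check private DNS")
```

If the hostname resolves to public addresses, requests are probably not using the interface endpoint, even when they succeed through a NAT route.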
Step 6: Add monitoring
Enable CloudTrail, CloudWatch metrics, and model invocation logging according to your privacy policy. Amazon Bedrock integrates with CloudTrail, and CloudTrail records Bedrock API activity as events. Bedrock invocation logging can send supported request, response, and metadata logs to CloudWatch Logs or S3, but AWS notes that invocation logging is currently supported for bedrock-runtime calls and not for calls made through the bedrock-mantle Responses API endpoint.
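As an illustration, invocation logging to CloudWatch Logs can be configured through the Bedrock control-plane API. The sketch below builds the loggingConfig structure for put_model_invocation_logging_configuration; the function name, log group name, and role ARN are placeholders, and the choice to log text only is an assumption that your privacy policy may not share.

```python
def build_invocation_logging_config(
    log_group_name: str, role_arn: str, log_text: bool = True
) -> dict:
    """Build a loggingConfig that delivers invocation logs to CloudWatch Logs."""
    return {
        "cloudWatchConfig": {
            "logGroupName": log_group_name,
            "roleArn": role_arn,
        },
        # Set log_text=False if prompts and responses must not be stored.
        "textDataDeliveryEnabled": log_text,
        "imageDataDeliveryEnabled": False,
        "embeddingDataDeliveryEnabled": False,
    }


# Placeholder names for illustration only.
logging_config = build_invocation_logging_config(
    "/example/bedrock/invocations",
    "arn:aws:iam::ACCOUNT_ID:role/ExampleBedrockLoggingRole",
)

# Applying it would look roughly like this (requires boto3 and credentials):
# boto3.client("bedrock", region_name="us-west-2") \
#     .put_model_invocation_logging_configuration(loggingConfig=logging_config)
```

In a no-NAT VPC this call goes to the bedrock control-plane endpoint, so it needs the com.amazonaws.region.bedrock interface endpoint or has to run from a network that can reach the service another way.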
Option 2 — Deploy DeepSeek on SageMaker inside a Private VPC
Choose this path when you need more control over how DeepSeek is hosted on SageMaker inside your VPC: custom model packaging, custom containers, controlled model artifact storage, endpoint-level scaling, or model variants.
How SageMaker private access works
There are two related but separate network concerns:
- Your application invoking SageMaker Runtime privately. Create com.amazonaws.region.sagemaker.runtime as an interface endpoint and enable private DNS.
- Your SageMaker-hosted model accessing required resources. Configure the model with VpcConfig, private subnets, security groups, S3 access, and KMS encryption.
AWS documents that SageMaker API and SageMaker Runtime can be accessed through interface VPC endpoints using AWS PrivateLink, and that the default runtime hostname resolves to the VPC endpoint when private DNS is enabled.
Required endpoints for a no-internet SageMaker deployment
At minimum, plan for:
- com.amazonaws.region.sagemaker.api
- com.amazonaws.region.sagemaker.runtime
- com.amazonaws.region.s3 as a Gateway endpoint
- com.amazonaws.region.sts
- com.amazonaws.region.logs
- com.amazonaws.region.monitoring
- com.amazonaws.region.ecr.api and com.amazonaws.region.ecr.dkr if using private containers
- com.amazonaws.region.secretsmanager if your container reads secrets
- KMS access through AWS APIs according to your key policy and architecture
AWS specifically notes that when SageMaker model containers do not have internet access, they cannot connect to S3 buckets containing data unless you create a VPC endpoint for S3. AWS also recommends custom policies that allow only requests from your private VPC to access the relevant S3 buckets.
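A bucket policy that limits access to one VPC endpoint typically denies every request whose aws:SourceVpce does not match. The sketch below builds such a policy; the function name, bucket name, and endpoint ID are placeholders, and a blanket deny like this also blocks console and administrative access from outside the VPC, so treat it as a starting point to test in a non-production account.

```python
import json


def build_vpce_only_bucket_policy(bucket: str, vpce_id: str) -> dict:
    """Build a bucket policy that denies S3 access from outside one VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessOutsideApprovedVpce",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }


# Placeholder bucket and endpoint ID for illustration only.
bucket_policy = build_vpce_only_bucket_policy(
    "your-private-model-bucket", "vpce-xxxxxxxxxxxxxxxxx"
)
bucket_policy_json = json.dumps(bucket_policy, indent=2)
```

Many teams add exceptions for specific administrative roles, which is outside the scope of this sketch.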
SageMaker Python SDK example
from sagemaker.huggingface import HuggingFaceModel
from sagemaker.session import Session
import boto3
REGION = "us-west-2"
ROLE_ARN = "arn:aws:iam::ACCOUNT_ID:role/SageMakerExecutionRole"
MODEL_ARTIFACT = "s3://your-private-model-bucket/deepseek/model.tar.gz"
SUBNETS = ["subnet-private-a", "subnet-private-b"]
SECURITY_GROUPS = ["sg-sagemaker-endpoint"]
session = Session(boto_session=boto3.Session(region_name=REGION))
model = HuggingFaceModel(
role=ROLE_ARN,
model_data=MODEL_ARTIFACT,
transformers_version="REPLACE_WITH_SUPPORTED_VERSION",
pytorch_version="REPLACE_WITH_SUPPORTED_VERSION",
py_version="py310",
sagemaker_session=session,
# Place SageMaker-hosted model resources in private subnets.
vpc_config={
"Subnets": SUBNETS,
"SecurityGroupIds": SECURITY_GROUPS
},
# Use only if the selected container and deployment path support it.
# Network isolation prevents the container from making outbound network calls.
enable_network_isolation=True
)
predictor = model.deploy(
initial_instance_count=1,
instance_type="ml.g5.12xlarge",
endpoint_name="deepseek-private-vpc-endpoint"
)
SageMaker network isolation can be enabled when creating a model or training job. AWS documents that when network isolation is enabled, training and inference containers cannot make outbound network calls, and SageMaker handles required S3 download/upload operations using the SageMaker execution role outside the container runtime.
For SageMaker JumpStart, AWS states that JumpStart models run in network isolation mode and that a selected VPC does not need public internet access, but it does need S3 access.
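Once the endpoint is in service, an application in the VPC can invoke it through the SageMaker Runtime interface endpoint. The sketch below assumes a Hugging Face style JSON payload with "inputs" and "parameters" keys, which depends on the serving container you chose; the function names are hypothetical, and the endpoint name matches the deployment example in this section.

```python
import json


def build_invoke_request(endpoint_name: str, prompt: str, max_new_tokens: int = 256) -> dict:
    """Build keyword arguments for the SageMaker Runtime InvokeEndpoint call."""
    # Payload shape is an assumption; check the schema your container expects.
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        "Accept": "application/json",
        "Body": json.dumps(payload),
    }


def invoke(region: str, request: dict) -> object:
    """Send the request through SageMaker Runtime (requires boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    runtime = boto3.client("sagemaker-runtime", region_name=region)
    response = runtime.invoke_endpoint(**request)
    return json.loads(response["Body"].read())


request = build_invoke_request(
    "deepseek-private-vpc-endpoint",
    "Summarize the security benefits of private LLM inference on AWS.",
)
# result = invoke("us-west-2", request)
```

With private DNS enabled on the sagemaker.runtime interface endpoint, the default client configuration should resolve to the VPC endpoint without any custom endpoint URL.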
Option 3 — Self-host DeepSeek on EC2 or EKS
Self-hosting is appropriate when you need maximum runtime control, custom GPU topology, custom quantization, custom serving software, or strict artifact provenance controls that cannot be satisfied by managed Bedrock or SageMaker.
Typical serving stacks include:
- vLLM
- Hugging Face TGI
- NVIDIA Triton
- Ollama for smaller or local-style deployments
- Custom FastAPI or gRPC inference services
A production EC2/EKS design should include private subnets, no public IPs, private ECR image pulls, S3 model artifact access, CloudWatch/Prometheus monitoring, KMS encryption, Secrets Manager, image scanning, AMI patching, GPU driver lifecycle management, and autoscaling policies. For a strict no-NAT design, every external AWS service dependency must be reachable through a VPC endpoint or private connectivity path.
This path gives you the most control, but it also shifts the highest responsibility to your team. You own GPU utilization, model loading, warmup, autoscaling, overload behavior, patching, vulnerability management, data retention, and incident response.
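As one example of what calling a self-hosted stack looks like, vLLM can expose an OpenAI-compatible HTTP server with a /v1/chat/completions route. The sketch below builds such a request with the standard library only; the internal hostname, port, and model name are hypothetical, and it assumes the server sits behind an internal load balancer reachable from the application subnet.

```python
import json
import urllib.request


def build_chat_request(
    base_url: str, model: str, prompt: str, max_tokens: int = 256
) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible chat completions route."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical internal address and model name.
chat_request = build_chat_request(
    "http://deepseek.internal.example:8000/", "example-deepseek-model", "Hello"
)

# Sending it from inside the VPC would look roughly like this:
# with urllib.request.urlopen(chat_request, timeout=60) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

In production, the same request would normally also carry authentication and go over TLS, both of which this sketch leaves out.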
Security Checklist for DeepSeek on AWS Private VPC
Use this checklist before moving a secure DeepSeek deployment on AWS into production.
| Control | Recommended implementation |
|---|---|
| Private subnets | Run application and inference workloads in private subnets across at least two Availability Zones. |
| No public IPs | Disable public IP assignment for EC2, EKS nodes, ECS tasks, and SageMaker-related workloads where applicable. |
| No inbound internet | Avoid public load balancers unless explicitly required. Prefer internal ALB/NLB and private connectivity. |
| No internet egress | Remove NAT Gateway routes for strict environments and use VPC endpoints instead. |
| Bedrock PrivateLink | Create bedrock-runtime; add bedrock-mantle only when the model and Region support it. |
| SageMaker PrivateLink | Create sagemaker.api and sagemaker.runtime endpoints. |
| S3 Gateway endpoint | Required for private model artifacts, logs, datasets, and no-internet SageMaker patterns. |
| Endpoint policies | Restrict allowed actions and resources at the VPC endpoint layer. |
| IAM least privilege | Restrict model invocation, endpoint creation, S3 access, KMS usage, and logging permissions. |
| KMS encryption | Use customer-managed keys for sensitive S3 buckets, EBS volumes, logs, and model artifacts where required. |
| CloudTrail | Record Bedrock, SageMaker, IAM, KMS, and infrastructure API activity. |
| CloudWatch metrics | Monitor invocation count, latency, errors, token usage, and delivery failures. Bedrock supports CloudWatch metrics with dimensions such as ModelId. |
| VPC Flow Logs | Capture IP traffic metadata for network troubleshooting and egress monitoring. AWS states VPC Flow Logs capture information about IP traffic to and from network interfaces. |
| Guardrails | Use Amazon Bedrock Guardrails where applicable for content filters, denied topics, word filters, and sensitive information controls. |
| S3 bucket policies | Require access through approved VPC endpoints using conditions such as aws:SourceVpce. |
| Prompt logging policy | Decide whether prompts and responses may be logged; apply masking, retention, and access controls. |
| Container security | Scan images, pin versions, validate model artifact provenance, and restrict package downloads. |
| Organizations controls | Use SCPs to restrict unapproved Regions, model families, public networking, or unmanaged inference paths. |
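The Organizations control in the last row could take the form of a service control policy that denies Bedrock actions outside approved Regions. This is a minimal sketch using the aws:RequestedRegion condition key; the function name and Region list are placeholders. Cross-Region inference profiles may route requests to other Regions, so a restriction like this might block them and should be tested first.

```python
import json


def build_region_restriction_scp(approved_regions: list[str]) -> dict:
    """Build an SCP that denies Bedrock actions outside the approved Regions."""
    if not approved_regions:
        raise ValueError("at least one approved Region is required")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyBedrockOutsideApprovedRegions",
                "Effect": "Deny",
                "Action": ["bedrock:*"],
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(approved_regions)
                    }
                },
            }
        ],
    }


# Placeholder Region list for illustration only.
scp = build_region_restriction_scp(["us-west-2", "us-east-1"])
scp_json = json.dumps(scp, indent=2)
```

SCPs set permission limits for member accounts; they do not grant access, so IAM identity policies are still required.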
Cost Considerations
Do not estimate production cost from model choice alone. A private LLM deployment includes model inference, endpoint infrastructure, networking, logging, storage, and security tooling.
Amazon Bedrock costs depend on the provider, model, modality, and service tier. AWS’s Bedrock pricing page states that model pricing depends on modality, provider, and model, and it lists DeepSeek as one of the model providers.
AWS PrivateLink costs include endpoint hours and data processing. AWS’s PrivateLink pricing examples show data processing charges and hourly endpoint ENI charges.
NAT Gateway costs matter if you keep NAT for package downloads, external APIs, or public endpoints. AWS states that NAT Gateway pricing includes each hour the NAT Gateway is available and each gigabyte of data processed. AWS also recommends considering interface or gateway endpoints when most traffic is to AWS services that support VPC endpoints.
SageMaker costs depend on the instance type, inference option, storage, and duration. AWS states that SageMaker AI uses pay-as-you-go pricing, and that real-time inference is charged based on the instance type used.
Also budget for S3 storage, CloudWatch Logs ingestion and retention, VPC Flow Logs, KMS requests, ECR storage, image scanning, multi-AZ duplication, and security monitoring.
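The PrivateLink part of the bill can be approximated with simple arithmetic: an hourly charge per endpoint per Availability Zone plus a per-gigabyte data processing charge. The sketch below uses placeholder rates that are not AWS prices; take the real rates from the PrivateLink pricing page for your Region.

```python
def privatelink_monthly_cost(
    endpoints: int,
    azs_per_endpoint: int,
    hourly_rate_per_az: float,
    gb_processed: float,
    rate_per_gb: float,
    hours_per_month: float = 730.0,
) -> float:
    """Estimate monthly interface endpoint cost: hourly charges plus data processing."""
    hourly_part = endpoints * azs_per_endpoint * hours_per_month * hourly_rate_per_az
    data_part = gb_processed * rate_per_gb
    return hourly_part + data_part


# Placeholder rates (0.01 per AZ-hour, 0.01 per GB) chosen only to show the math:
# 2 endpoints x 2 AZs x 730 h x 0.01 = 29.20, plus 100 GB x 0.01 = 1.00
estimate = privatelink_monthly_cost(2, 2, 0.01, 100, 0.01)
```

The same structure extends to NAT Gateway estimates, which AWS also prices per hour and per gigabyte processed.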
Common Mistakes and Troubleshooting
| Problem | Likely cause | Fix |
|---|---|---|
| Bedrock calls still go through NAT | Private DNS disabled or endpoint missing | Enable private DNS and verify DNS resolution from the private subnet. |
| AccessDeniedException | IAM policy, endpoint policy, model access, or SCP blocks invocation | Check identity policy, endpoint policy, Organizations SCPs, and model access. |
| UnknownEndpoint or DNS failure | Wrong endpoint service name or unsupported Region | Verify service name and Region in AWS docs or VPC endpoint service list. |
| HTTPS timeout | Security group blocks TCP 443 | Allow outbound 443 from app SG to VPCE SG and inbound 443 on VPCE SG from app SG. |
| Bedrock Mantle call fails | Model or Region does not support bedrock-mantle | Use bedrock-runtime or select a model documented for Mantle. |
| Model ID not found | Hardcoded model ID or wrong inference profile | Query current Bedrock model metadata and use the right Region/profile. |
| SageMaker endpoint cannot load artifacts | Missing S3 endpoint or bucket policy blocks VPCE access | Add S3 Gateway endpoint and update bucket policy. |
| Container cannot pull image | Missing ECR endpoints in no-NAT architecture | Add ecr.api, ecr.dkr, S3 endpoint, and required IAM permissions. |
| No CloudWatch logs | Missing logs endpoint, IAM permissions, or logging config | Add logs endpoint, allow logs actions, and configure log groups. |
| Unexpected internet egress | NAT route remains in private route table | Remove NAT route for strict no-egress environments and monitor with Flow Logs. |
| Invocation logs missing for Mantle | Bedrock invocation logging does not currently capture bedrock-mantle Responses API calls | Use CloudTrail and Mantle-specific monitoring; check AWS docs for updates. |
| SageMaker no-internet deployment fails | Model container requires external downloads | Pre-stage model artifacts and dependencies in S3/ECR and validate offline startup. |
Best-practice Reference Architecture
For most enterprises, the best production pattern is:
- Use Amazon Bedrock + AWS PrivateLink for managed DeepSeek inference when model, API, Region, quota, and governance requirements match.
- Use SageMaker private endpoints when the team needs model artifact control, custom containers, endpoint-level tuning, or private model lifecycle management.
- Use S3 as the controlled artifact boundary for self-hosted or SageMaker models.
- Remove NAT for strict no-internet-egress workloads and replace service dependencies with VPC endpoints.
- Apply IAM and endpoint policies together rather than relying on network controls alone.
- Encrypt artifacts, logs, volumes, and outputs using KMS according to data classification.
- Enable CloudTrail, CloudWatch, and VPC Flow Logs for auditability and troubleshooting.
- Use Bedrock Guardrails where applicable for content filtering, sensitive information filtering, and policy alignment.
- Deploy with Infrastructure as Code using Terraform, AWS CDK, or CloudFormation.
- Separate dev, stage, and prod accounts and enforce model, Region, and network restrictions with AWS Organizations.
This design helps reduce public exposure and supports compliance goals, but it does not make the workload automatically compliant. Final compliance depends on your architecture, data classification, logging configuration, retention policy, access model, security review, and legal requirements.
Limitations
- Model IDs, inference profile IDs, API support, quotas, prices, and Regions change. Always verify immediately before deployment.
- AWS PrivateLink keeps traffic between your VPC and supported AWS services on the AWS network, but it does not replace IAM, encryption, logging, or application-layer controls.
- No-egress deployments require dependency discipline. Any runtime dependency that tries to download from the internet can break startup or create a governance exception.
- Bedrock invocation logging may not cover every endpoint family. AWS currently notes that model invocation logging is supported for bedrock-runtime calls and not for Responses API calls through bedrock-mantle.
- Self-hosting DeepSeek requires significant ML infrastructure maturity.
FAQ
1. Can DeepSeek run inside an AWS Private VPC?
Yes. You can invoke managed DeepSeek models from private subnets through Amazon Bedrock PrivateLink, deploy DeepSeek-related models on SageMaker AI inside a VPC, or self-host on EC2/EKS private subnets. The right option depends on how much model and infrastructure control you need.
2. Is Amazon Bedrock private when accessed through AWS PrivateLink?
AWS PrivateLink lets your VPC connect privately to Amazon Bedrock without using an internet gateway, NAT device, VPN, or Direct Connect connection, and instances in the VPC do not need public IP addresses to access Bedrock. You still need IAM, endpoint policies, encryption, and logging controls.
3. Should I use Bedrock or SageMaker for DeepSeek on AWS?
Use Bedrock when you want managed API access with less infrastructure management. Use SageMaker when you need custom model hosting, custom containers, private model artifacts, endpoint control, or ML lifecycle customization. AWS’s decision guide positions Bedrock for simplified foundation model API integration and SageMaker AI for more customized ML workflows.
4. Which VPC endpoints are needed for DeepSeek on Bedrock?
For basic private inference, create com.amazonaws.region.bedrock-runtime. Add com.amazonaws.region.bedrock for control-plane operations. Add com.amazonaws.region.bedrock-mantle only when your selected model, API pattern, and Region support Bedrock Mantle. AWS also documents Bedrock agent endpoint suffixes for agent use cases.
5. Which VPC endpoints are needed for DeepSeek on SageMaker?
For private SageMaker access, use com.amazonaws.region.sagemaker.api and com.amazonaws.region.sagemaker.runtime. For no-internet deployments, add S3, STS, CloudWatch Logs, CloudWatch monitoring, ECR, Secrets Manager, and any other service endpoints your container or deployment pipeline requires. AWS documents sagemaker.runtime as required for endpoint invocations in no-internet Studio/VPC configurations.
6. Can I deploy DeepSeek with no internet egress?
Yes, but you must remove NAT routes and provide private alternatives for every dependency. Bedrock calls can use PrivateLink. SageMaker model artifacts can come from S3 through a Gateway endpoint. Containers can be pulled from ECR through ECR interface endpoints. Logs and metrics should go through CloudWatch endpoints.
7. Does AWS PrivateLink remove the need for NAT Gateway?
For supported AWS service traffic, often yes. Bedrock and SageMaker API/Runtime calls can use PrivateLink. But NAT might still be needed for unsupported services or external internet destinations. In a strict no-egress design, avoid NAT and redesign dependencies around VPC endpoints and private connectivity.
8. How do I restrict access to only DeepSeek models?
Use IAM policies that allow invocation only for approved model ARNs, model IDs, or inference profile ARNs where applicable. Add endpoint policies as an extra boundary. In multi-account environments, use SCPs to deny unapproved Regions, models, or public network paths.
9. Can I use DeepSeek-R1, DeepSeek-V3.1, or DeepSeek V3.2 on AWS?
AWS documentation currently lists DeepSeek-R1, DeepSeek-V3.1, and DeepSeek V3.2 in Amazon Bedrock. However, model availability, endpoint support, service tier, model ID, and Region can vary, so verify the current Bedrock model card and Region list before deployment.
10. Is self-hosting DeepSeek on EC2/EKS better than Bedrock?
Not usually for teams that want managed private inference. Self-hosting is better only when you need maximum runtime control, custom serving stacks, specialized GPU scheduling, or custom isolation requirements. It creates more work in patching, scaling, monitoring, capacity planning, and security operations.
11. How do I monitor DeepSeek inference traffic?
For Bedrock, use CloudTrail, CloudWatch metrics, and invocation logging where supported. For VPC-level visibility, use VPC Flow Logs. For SageMaker, use endpoint metrics, container logs, CloudWatch, CloudTrail, and network logs. Bedrock metrics can be viewed in CloudWatch by searching for the model ID.
12. What are the biggest security risks in private LLM deployments?
The biggest risks are over-permissive IAM, accidental NAT egress, unapproved model usage, prompt/response leakage through logs, weak S3 bucket policies, unscanned containers, unverified model artifacts, missing KMS controls, and lack of audit visibility.
Conclusion
DeepSeek on AWS Private VPC is best implemented as a private, governed inference architecture rather than a simple model deployment task.
For most teams, start with Amazon Bedrock + AWS PrivateLink because it gives managed DeepSeek access without operating model infrastructure. Use SageMaker private endpoints when you need custom model hosting, model artifacts in S3, custom containers, or endpoint-level control. Use EC2/EKS self-hosting only when your team is ready to operate the full LLM serving stack.
Need help designing a private LLM architecture on AWS? Start with a security review of your model path, VPC endpoints, IAM policies, logging, and data classification before production deployment.
