DeepSeek on AWS Private VPC: Secure Deployment Guide

Last updated: June 1, 2026

Yes, you can run or invoke DeepSeek from an AWS Private VPC. The most managed path is Amazon Bedrock with AWS PrivateLink. The most customizable path is Amazon SageMaker AI or self-hosted inference in private subnets with VPC endpoints, IAM, KMS, CloudWatch, CloudTrail, VPC Flow Logs, and strict egress controls.

As of this update, AWS documentation lists DeepSeek models in Amazon Bedrock, including DeepSeek V3.2, DeepSeek-V3.1, and DeepSeek-R1.

Important model-name note: DeepSeek model names differ by deployment channel. DeepSeek’s direct API currently uses model IDs such as deepseek-v4-flash and deepseek-v4-pro, while Amazon Bedrock exposes AWS-managed DeepSeek models with Bedrock-specific names and IDs such as DeepSeek V3.2, DeepSeek-V3.1, and DeepSeek-R1. Do not copy model IDs between DeepSeek’s direct API and Amazon Bedrock without checking the official documentation for the deployment path you are using.

AWS also documents Amazon Bedrock interface VPC endpoint support for bedrock, bedrock-runtime, bedrock-mantle, bedrock-agent, and bedrock-agent-runtime, which is the core network primitive for a private Bedrock integration.

What “DeepSeek on AWS Private VPC” Means

“DeepSeek on AWS Private VPC” can mean three different implementation patterns.

The first pattern is private API access to a managed DeepSeek model. Your application runs in a private subnet and calls DeepSeek through Amazon Bedrock using AWS PrivateLink. In this pattern, you do not host the model weights, provision GPUs, or operate inference containers. Your main responsibilities are IAM, endpoint policies, network routing, observability, data governance, and application security. AWS states that AWS PrivateLink lets you create a private connection between your VPC and Amazon Bedrock without using an internet gateway, NAT device, VPN connection, or Direct Connect connection.

The second pattern is a DeepSeek deployment on SageMaker AI inside a VPC. You deploy DeepSeek or a distilled DeepSeek-compatible model to a SageMaker AI endpoint, usually from model artifacts stored in Amazon S3 or from SageMaker JumpStart where available. Your application then invokes SageMaker Runtime through an interface VPC endpoint. SageMaker AI supports interface VPC endpoints for the SageMaker API and SageMaker Runtime, and AWS documents that this lets VPC resources communicate with those services without connecting to the public internet.

The third pattern is self-hosting DeepSeek on EC2 or EKS in private subnets. This is the most flexible option, but also the most operationally demanding. You manage GPU capacity, serving software, scaling, patching, model artifacts, container scanning, network egress, observability, and incident response.

A private subnet does not always mean “no internet.” A workload in a private subnet can still reach the internet through a NAT Gateway. A stricter no-internet-egress LLM deployment removes NAT routes and uses VPC endpoints for AWS services such as Bedrock, SageMaker Runtime, S3, CloudWatch Logs, STS, ECR, Secrets Manager, and KMS where required.
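
To make the NAT point concrete, here is a minimal sketch that scans route tables for routes leaving the VPC through a NAT gateway or an internet gateway. It assumes the input has the shape of the `RouteTables` list returned by the EC2 `describe_route_tables` call; the function name and sample IDs are hypothetical.

```python
from typing import Any


def find_internet_routes(route_tables: list[dict[str, Any]]) -> list[tuple[str, str, str]]:
    """Return (route_table_id, destination, target) for NAT or internet gateway routes."""
    findings = []
    for table in route_tables:
        for route in table.get("Routes", []):
            destination = (
                route.get("DestinationCidrBlock")
                or route.get("DestinationIpv6CidrBlock")
                or ""
            )
            nat_id = route.get("NatGatewayId")
            gateway_id = route.get("GatewayId") or ""
            if nat_id:
                findings.append((table["RouteTableId"], destination, nat_id))
            elif gateway_id.startswith("igw-"):
                findings.append((table["RouteTableId"], destination, gateway_id))
    return findings


# Hypothetical sample: a "private" route table that still has a NAT default route.
sample_tables = [
    {
        "RouteTableId": "rtb-private",
        "Routes": [
            {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
            {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-0123"},
            {"DestinationPrefixListId": "pl-s3", "GatewayId": "vpce-0abc"},
        ],
    }
]

for table_id, destination, target in find_internet_routes(sample_tables):
    print(f"{table_id}: {destination} -> {target}")
```

In a real account the input could come from `boto3.client("ec2").describe_route_tables(...)["RouteTables"]`. The sketch does not cover egress-only internet gateways, transit gateways, or peering paths, which a strict review would also need to consider.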

Recommended Architecture

For most production teams, start with Amazon Bedrock + AWS PrivateLink if the required DeepSeek model, Region, API family, quota, and governance requirements are supported. Use SageMaker when you need custom containers, model artifact control, endpoint-level hosting control, or fine-tuning/deployment workflows. Use EC2/EKS only when you have a strong reason to own the full serving stack.

AWS documents bedrock-runtime as the runtime endpoint for inference operations and bedrock-mantle as the endpoint that serves OpenAI-compatible Responses API, Chat Completions API, and Anthropic Messages API traffic in Bedrock.

Model support note: Not every DeepSeek model in Bedrock supports every Mantle API. DeepSeek V3.2 and DeepSeek-V3.1 support Chat Completions through bedrock-mantle, while DeepSeek-R1 currently supports bedrock-runtime but not bedrock-mantle. Always check the model card before selecting an API family.

Choosing the Right Deployment Path

Option	Best for	Private VPC pattern	Operational complexity	Customization level	Security controls	Cost model	Trade-offs
Amazon Bedrock + AWS PrivateLink	Teams that want managed DeepSeek API access	Private app calls `bedrock-runtime` or `bedrock-mantle` through interface endpoints	Low	Low to medium	IAM, endpoint policies, private DNS, CloudTrail, CloudWatch, Guardrails	Bedrock model pricing + PrivateLink	Least infrastructure control, dependent on model/API/Region availability
Bedrock fully managed DeepSeek	Production apps that want serverless DeepSeek access	Same as Bedrock PrivateLink	Low	Low	Bedrock security, IAM, Guardrails, logging	Model-dependent Bedrock pricing	Model support varies by Region and API family
SageMaker JumpStart or SageMaker endpoint inside VPC	ML teams that need hosted model control	App calls SageMaker Runtime through PrivateLink	Medium	Medium	VPC config, security groups, IAM, KMS, network isolation	Instance or endpoint-based pricing	More configuration and lifecycle management
SageMaker with TGI/vLLM and S3 artifacts	Custom model packaging and serving stack	Model artifacts in S3, container in private subnets, runtime endpoint private	High	High	S3 endpoint policies, model provenance, container scanning	SageMaker hosting instances + storage/logging	You own container behavior and model serving tuning
EC2/EKS self-hosted DeepSeek	Maximum control, custom GPU scheduling, custom runtime isolation	Private nodes, private load balancers, private ECR/S3/logging endpoints	Very high	Very high	Full network, host, container, IAM, KMS, runtime security	GPU compute + storage + networking	Highest operational burden

AWS’s own Bedrock vs. SageMaker decision guide describes Bedrock as a managed service for integrating foundation models through API calls, while SageMaker AI is aimed at building, training, deploying, and customizing ML models with more infrastructure control.

Option 1 — Use DeepSeek through Amazon Bedrock PrivateLink

This is the recommended first option for most teams that need private access to DeepSeek through Amazon Bedrock and AWS PrivateLink.

Step 1: Confirm model and Region support

Before creating infrastructure, confirm the exact DeepSeek model, API family, service tier, quotas, and Region. AWS currently lists DeepSeek V3.2, DeepSeek-V3.1, and DeepSeek-R1 in Amazon Bedrock documentation, but the available model ID, inference profile ID, endpoint URL, and supported API can vary by model and Region.

DeepSeek V3.2 is documented as a DeepSeek mixture-of-experts model with support for bedrock-runtime and bedrock-mantle endpoints. DeepSeek-R1 is documented with bedrock-runtime support, while its model card shows bedrock-mantle is not supported for that model.

Step 2: Choose the endpoint family

Use:

bedrock-runtime for InvokeModel, InvokeModelWithResponseStream, Converse, and ConverseStream.
bedrock-mantle for OpenAI-compatible API patterns where the chosen model and Region support it.
bedrock for control-plane actions such as model discovery or management operations.

AWS documents interface endpoint service names including com.amazonaws.region.bedrock, com.amazonaws.region.bedrock-runtime, and com.amazonaws.region.bedrock-mantle. It also documents private DNS names such as bedrock-runtime.region.amazonaws.com and bedrock-mantle.region.api.aws.
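
The naming patterns above can be captured in a small helper so that Region substitution is not done by hand. This is a sketch that only encodes the patterns quoted in this guide; the function names are hypothetical, and the resulting names should still be checked against the VPC endpoint service list for your Region.

```python
BEDROCK_ENDPOINT_FAMILIES = ("bedrock", "bedrock-runtime", "bedrock-mantle")

# Private DNS patterns quoted in this guide; other families are deliberately omitted.
PRIVATE_DNS_PATTERNS = {
    "bedrock-runtime": "bedrock-runtime.{region}.amazonaws.com",
    "bedrock-mantle": "bedrock-mantle.{region}.api.aws",
}


def endpoint_service_name(region: str, family: str) -> str:
    """Build the interface endpoint service name for a Bedrock endpoint family."""
    if family not in BEDROCK_ENDPOINT_FAMILIES:
        raise ValueError(f"unknown Bedrock endpoint family: {family}")
    return f"com.amazonaws.{region}.{family}"


def private_dns_name(region: str, family: str) -> str:
    """Build the private DNS name for families whose pattern is listed above."""
    if family not in PRIVATE_DNS_PATTERNS:
        raise ValueError(f"no private DNS pattern recorded for: {family}")
    return PRIVATE_DNS_PATTERNS[family].format(region=region)


print(endpoint_service_name("us-west-2", "bedrock-runtime"))
print(private_dns_name("us-west-2", "bedrock-mantle"))
```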

Step 3: Create interface VPC endpoints

Use private subnets in at least two Availability Zones and attach a security group that allows HTTPS from your application security group only.

REGION="us-west-2"
VPC_ID="vpc-xxxxxxxxxxxxxxxxx"

SUBNET_A="subnet-aaaaaaaaaaaaaaaaa"
SUBNET_B="subnet-bbbbbbbbbbbbbbbbb"

VPCE_SG_ID="sg-xxxxxxxxxxxxxxxxx"

# Bedrock Runtime endpoint
aws ec2 create-vpc-endpoint \
  --region "$REGION" \
  --vpc-id "$VPC_ID" \
  --vpc-endpoint-type Interface \
  --service-name "com.amazonaws.${REGION}.bedrock-runtime" \
  --subnet-ids "$SUBNET_A" "$SUBNET_B" \
  --security-group-ids "$VPCE_SG_ID" \
  --private-dns-enabled

# Optional: Bedrock control-plane endpoint
aws ec2 create-vpc-endpoint \
  --region "$REGION" \
  --vpc-id "$VPC_ID" \
  --vpc-endpoint-type Interface \
  --service-name "com.amazonaws.${REGION}.bedrock" \
  --subnet-ids "$SUBNET_A" "$SUBNET_B" \
  --security-group-ids "$VPCE_SG_ID" \
  --private-dns-enabled

Verify the exact VPC endpoint service name in your AWS Region before deployment. Amazon Bedrock endpoint availability, endpoint families, and model support can vary by Region and may change over time.

If your selected model supports OpenAI-compatible Bedrock Mantle APIs in your target Region, create the Mantle endpoint as well:

aws ec2 create-vpc-endpoint \
  --region "$REGION" \
  --vpc-id "$VPC_ID" \
  --vpc-endpoint-type Interface \
  --service-name "com.amazonaws.${REGION}.bedrock-mantle" \
  --subnet-ids "$SUBNET_A" "$SUBNET_B" \
  --security-group-ids "$VPCE_SG_ID" \
  --private-dns-enabled

Not every model supports every API family, and Bedrock endpoint support evolves, so recheck the model card and the Mantle service name for your Region before deploying.
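
Step 3 asks for a security group that allows HTTPS only from the application security group. A minimal sketch of that rule, expressed as the keyword arguments for the EC2 `authorize_security_group_ingress` call, might look like this. The function name and the security group IDs are placeholders, not values from your account.

```python
def https_ingress_from_app_sg(vpce_sg_id: str, app_sg_id: str) -> dict:
    """Build an ingress rule: TCP 443 into the endpoint SG, sourced from the app SG only."""
    return {
        "GroupId": vpce_sg_id,
        "IpPermissions": [
            {
                "IpProtocol": "tcp",
                "FromPort": 443,
                "ToPort": 443,
                # Referencing a security group instead of a CIDR keeps the rule
                # tied to the application tier rather than to a subnet range.
                "UserIdGroupPairs": [
                    {
                        "GroupId": app_sg_id,
                        "Description": "HTTPS from application security group",
                    }
                ],
            }
        ],
    }


rule = https_ingress_from_app_sg("sg-xxxxxxxxxxxxxxxxx", "sg-yyyyyyyyyyyyyyyyy")
print(rule["IpPermissions"][0]["UserIdGroupPairs"][0]["GroupId"])
```

With Boto3 the dictionary can be passed as `ec2.authorize_security_group_ingress(**rule)`. The application security group also needs outbound TCP 443 toward the endpoint security group.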

Step 4: Attach restrictive endpoint policies

The default endpoint policy allows full access through the interface endpoint. AWS recommends using a custom endpoint policy when you need to control what can be accessed from your VPC.

Example policy skeleton:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "InvokeApprovedDeepSeekModelsOnly",
      "Effect": "Allow",
      "Principal": "*",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream",
        "bedrock:Converse",
        "bedrock:ConverseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:REGION::foundation-model/APPROVED_MODEL_ID",
        "arn:aws:bedrock:REGION:ACCOUNT_ID:inference-profile/APPROVED_PROFILE_ID",
        "arn:aws:bedrock:REGION:ACCOUNT_ID:application-inference-profile/APPROVED_PROFILE_ID"
      ]
    }
  ]
}

Replace the resource ARN with the exact foundation model ARN, system inference profile ARN, or application inference profile ARN returned by AWS for your Region and account. Foundation model ARNs and inference profile ARNs use different formats.

Use IAM identity policies as the primary control for who can invoke which model, then use endpoint policies as an additional network boundary. For Bedrock Mantle, AWS’s example endpoint policy uses the bedrock-mantle:CreateInference action.

Step 5: Test from a private workload

The following Boto3 example uses bedrock-runtime. Replace the model ID or inference profile ID with a value verified for your Region and account. AWS documentation currently shows us.deepseek.r1-v1:0 as an example inference profile ID for DeepSeek-R1 in the US, but production code should not assume the same value across Regions, accounts, or model versions.

import boto3
from botocore.config import Config

REGION = "us-west-2"
MODEL_ID = "REPLACE_WITH_VERIFIED_DEEPSEEK_MODEL_OR_INFERENCE_PROFILE_ID"

client = boto3.client(
    "bedrock-runtime",
    region_name=REGION,
    config=Config(connect_timeout=30, read_timeout=300)
)

response = client.converse(
    modelId=MODEL_ID,
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "Summarize the security benefits of private LLM inference on AWS."}
            ]
        }
    ],
    inferenceConfig={
        "maxTokens": 1024,
        "temperature": 0.2
    }
)

print(response["output"]["message"]["content"][0]["text"])
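
The final `print` in the example above assumes the first content block is text. As a hedged sketch, assuming a reasoning model may return a non-text block (such as a reasoning block) before the answer, a helper like the hypothetical `extract_text` below collects only the text blocks from a Converse-style response.

```python
def extract_text(response: dict) -> str:
    """Join every text block in a Converse-style response, skipping other block types."""
    blocks = response.get("output", {}).get("message", {}).get("content", [])
    return "".join(block["text"] for block in blocks if "text" in block)


# Hypothetical response shape used only to exercise the helper offline.
sample_response = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [
                {"reasoningContent": {"reasoningText": {"text": "internal reasoning"}}},
                {"text": "Private inference keeps traffic off the public internet."},
            ],
        }
    }
}

print(extract_text(sample_response))
```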

Step 6: Add monitoring

Enable CloudTrail, CloudWatch metrics, and model invocation logging according to your privacy policy. Amazon Bedrock integrates with CloudTrail, and CloudTrail records Bedrock API activity as events. Bedrock invocation logging can send supported request, response, and metadata logs to CloudWatch Logs or S3, but AWS notes that invocation logging is currently supported for bedrock-runtime calls and not for calls made through the bedrock-mantle Responses API endpoint.
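
As an illustration of the logging decision, the sketch below builds a CloudWatch-only invocation logging configuration. It assumes the `loggingConfig` structure accepted by the Bedrock control-plane `put_model_invocation_logging_configuration` operation; the function name, log group name, and role ARN are placeholders, and the field names should be confirmed against the current API reference before use.

```python
def build_invocation_logging_config(
    log_group_name: str, role_arn: str, log_text: bool = True
) -> dict:
    """Build a CloudWatch Logs destination config for Bedrock invocation logging."""
    return {
        "cloudWatchConfig": {
            "logGroupName": log_group_name,
            "roleArn": role_arn,
        },
        # Text delivery includes prompts and responses, so gate it on privacy policy.
        "textDataDeliveryEnabled": log_text,
        "imageDataDeliveryEnabled": False,
        "embeddingDataDeliveryEnabled": False,
    }


logging_config = build_invocation_logging_config(
    "/example/bedrock/invocations",
    "arn:aws:iam::111122223333:role/ExampleBedrockLoggingRole",
    log_text=False,
)
print(logging_config)
```

With Boto3 this would be passed as `boto3.client("bedrock").put_model_invocation_logging_configuration(loggingConfig=logging_config)`. Setting `log_text=False` in the sketch is one way to keep prompt and response bodies out of the logs while the privacy policy is still being settled.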

Option 2 — Deploy DeepSeek on SageMaker inside a Private VPC

Choose this path when you need more control over how DeepSeek is hosted inside your VPC: custom model packaging, custom containers, controlled model artifact storage, endpoint-level scaling, or model variants.

How SageMaker private access works

There are two related but separate network concerns:

Your application invoking SageMaker Runtime privately: create com.amazonaws.region.sagemaker.runtime as an interface endpoint and enable private DNS.
Your SageMaker-hosted model accessing required resources: configure the model with VpcConfig, private subnets, security groups, S3 access, and KMS encryption.

AWS documents that SageMaker API and SageMaker Runtime can be accessed through interface VPC endpoints using AWS PrivateLink, and that the default runtime hostname resolves to the VPC endpoint when private DNS is enabled.

Required endpoints for a no-internet SageMaker deployment

At minimum, plan for:

com.amazonaws.region.sagemaker.api
com.amazonaws.region.sagemaker.runtime
com.amazonaws.region.s3 as a Gateway endpoint
com.amazonaws.region.sts
com.amazonaws.region.logs
com.amazonaws.region.monitoring
com.amazonaws.region.ecr.api and com.amazonaws.region.ecr.dkr if using private containers
com.amazonaws.region.secretsmanager if your container reads secrets
KMS access through AWS APIs according to your key policy and architecture
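
The list above can be turned into a per-Region plan that feeds an Infrastructure as Code template or a review checklist. This is a sketch under the assumptions of this guide; the function name and the two flags are hypothetical.

```python
def sagemaker_endpoint_plan(
    region: str, private_containers: bool = True, reads_secrets: bool = False
) -> list[tuple[str, str]]:
    """Return (service_name, endpoint_type) pairs for a no-internet SageMaker design."""
    plan = [
        (f"com.amazonaws.{region}.sagemaker.api", "Interface"),
        (f"com.amazonaws.{region}.sagemaker.runtime", "Interface"),
        (f"com.amazonaws.{region}.s3", "Gateway"),
        (f"com.amazonaws.{region}.sts", "Interface"),
        (f"com.amazonaws.{region}.logs", "Interface"),
        (f"com.amazonaws.{region}.monitoring", "Interface"),
    ]
    if private_containers:
        plan.append((f"com.amazonaws.{region}.ecr.api", "Interface"))
        plan.append((f"com.amazonaws.{region}.ecr.dkr", "Interface"))
    if reads_secrets:
        plan.append((f"com.amazonaws.{region}.secretsmanager", "Interface"))
    return plan


for service_name, endpoint_type in sagemaker_endpoint_plan("us-west-2", reads_secrets=True):
    print(f"{endpoint_type:<9} {service_name}")
```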

AWS specifically notes that when SageMaker model containers do not have internet access, they cannot connect to S3 buckets containing data unless you create a VPC endpoint for S3. AWS also recommends custom policies that allow only requests from your private VPC to access the relevant S3 buckets.
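
One common way to express "only requests from your private VPC" is a bucket policy that denies requests not arriving through approved VPC endpoints, using the `aws:SourceVpce` condition key. The sketch below is illustrative; the function name, bucket name, and endpoint ID are placeholders. A broad deny like this can also lock out administrators and pipelines that do not use the endpoint, so test it in a non-production account first.

```python
import json


def build_vpce_only_bucket_policy(bucket: str, vpce_ids: list[str]) -> dict:
    """Deny all S3 actions on the bucket unless the request uses an approved VPC endpoint."""
    if not vpce_ids:
        raise ValueError("at least one approved VPC endpoint ID is required")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessOutsideApprovedVpcEndpoints",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"StringNotEquals": {"aws:SourceVpce": list(vpce_ids)}},
            }
        ],
    }


bucket_policy = build_vpce_only_bucket_policy(
    "your-private-model-bucket", ["vpce-0123456789abcdef0"]
)
print(json.dumps(bucket_policy, indent=2))
```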

SageMaker Python SDK example

from sagemaker.huggingface import HuggingFaceModel
from sagemaker.session import Session
import boto3

REGION = "us-west-2"
ROLE_ARN = "arn:aws:iam::ACCOUNT_ID:role/SageMakerExecutionRole"
MODEL_ARTIFACT = "s3://your-private-model-bucket/deepseek/model.tar.gz"

SUBNETS = ["subnet-private-a", "subnet-private-b"]
SECURITY_GROUPS = ["sg-sagemaker-endpoint"]

session = Session(boto_session=boto3.Session(region_name=REGION))

model = HuggingFaceModel(
    role=ROLE_ARN,
    model_data=MODEL_ARTIFACT,
    transformers_version="REPLACE_WITH_SUPPORTED_VERSION",
    pytorch_version="REPLACE_WITH_SUPPORTED_VERSION",
    py_version="py310",
    sagemaker_session=session,

    # Place SageMaker-hosted model resources in private subnets.
    vpc_config={
        "Subnets": SUBNETS,
        "SecurityGroupIds": SECURITY_GROUPS
    },

    # Use only if the selected container and deployment path support it.
    # Network isolation prevents the container from making outbound network calls.
    enable_network_isolation=True
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",
    endpoint_name="deepseek-private-vpc-endpoint"
)

SageMaker network isolation can be enabled when creating a model or training job. AWS documents that when network isolation is enabled, training and inference containers cannot make outbound network calls, and SageMaker handles required S3 download/upload operations using the SageMaker execution role outside the container runtime.

For SageMaker JumpStart, AWS states that JumpStart models run in network isolation mode and that a selected VPC does not need public internet access, but it does need S3 access.
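
Once the endpoint is in service, the application invokes it through SageMaker Runtime. The sketch below separates request construction from the network call so the request can be inspected offline. The payload shape (`inputs` plus `parameters`) is an assumption that matches common Hugging Face text-generation containers; your container may expect a different schema, and the function names are hypothetical.

```python
import json


def build_invoke_request(endpoint_name: str, prompt: str, max_new_tokens: int = 256) -> dict:
    """Build keyword arguments for the SageMaker Runtime invoke_endpoint call."""
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        "Accept": "application/json",
        "Body": json.dumps(payload),
    }


def invoke(runtime_client, request: dict):
    """Send the request with a 'sagemaker-runtime' client and decode the JSON body."""
    response = runtime_client.invoke_endpoint(**request)
    return json.loads(response["Body"].read())


request = build_invoke_request("deepseek-private-vpc-endpoint", "Explain VPC endpoints.")
print(request["Body"])
```

In a private subnet, `runtime_client` would be `boto3.client("sagemaker-runtime", region_name=...)`; with private DNS enabled, the default hostname resolves to the interface endpoint and no code change is needed.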

Option 3 — Self-host DeepSeek on EC2 or EKS

Self-hosting is appropriate when you need maximum runtime control, custom GPU topology, custom quantization, custom serving software, or strict artifact provenance controls that cannot be satisfied by managed Bedrock or SageMaker.

Typical serving stacks include:

vLLM
Hugging Face TGI
NVIDIA Triton
Ollama for smaller or local-style deployments
Custom FastAPI or gRPC inference services

A production EC2/EKS design should include private subnets, no public IPs, private ECR image pulls, S3 model artifact access, CloudWatch/Prometheus monitoring, KMS encryption, Secrets Manager, image scanning, AMI patching, GPU driver lifecycle management, and autoscaling policies. For a strict no-NAT design, every external AWS service dependency must be reachable through a VPC endpoint or private connectivity path.

This path gives you the most control, but it also shifts the highest responsibility to your team. You own GPU utilization, model loading, warmup, autoscaling, overload behavior, patching, vulnerability management, data retention, and incident response.
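
For self-hosted stacks, one cheap guardrail is to reject container images that do not come from the approved private ECR registry or that are not pinned by digest. The sketch below assumes the standard commercial-Region ECR URI format; the function name, account ID, and repository names are hypothetical.

```python
import re

# <account>.dkr.ecr.<region>.amazonaws.com/<repository>[:tag | @digest]
ECR_IMAGE_PATTERN = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repo>[^@:]+)(?P<ref>[:@].+)?$"
)


def image_problems(image_uri: str, account_id: str, region: str) -> list[str]:
    """Return a list of policy problems for a container image URI (empty means OK)."""
    match = ECR_IMAGE_PATTERN.match(image_uri)
    if match is None:
        return ["not a private ECR image URI"]
    problems = []
    if match["account"] != account_id:
        problems.append("image is in an unapproved account")
    if match["region"] != region:
        problems.append("image is in an unapproved Region")
    if not (match["ref"] or "").startswith("@sha256:"):
        problems.append("image is not pinned by digest")
    return problems


pinned = "123456789012.dkr.ecr.us-west-2.amazonaws.com/inference/vllm@sha256:" + "a" * 64
print(image_problems(pinned, "123456789012", "us-west-2"))
print(image_problems("vllm/vllm-openai:latest", "123456789012", "us-west-2"))
```

A check like this fits naturally into a CI step or an admission controller, before images ever reach private nodes.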

Security Checklist for DeepSeek on AWS Private VPC

Use this checklist before moving a secure DeepSeek deployment on AWS into production.

Control	Recommended implementation
Private subnets	Run application and inference workloads in private subnets across at least two Availability Zones.
No public IPs	Disable public IP assignment for EC2, EKS nodes, ECS tasks, and SageMaker-related workloads where applicable.
No inbound internet	Avoid public load balancers unless explicitly required. Prefer internal ALB/NLB and private connectivity.
No internet egress	Remove NAT Gateway routes for strict environments and use VPC endpoints instead.
Bedrock PrivateLink	Create `bedrock-runtime`; add `bedrock-mantle` only when the model and Region support it.
SageMaker PrivateLink	Create `sagemaker.api` and `sagemaker.runtime` endpoints.
S3 Gateway endpoint	Required for private model artifacts, logs, datasets, and no-internet SageMaker patterns.
Endpoint policies	Restrict allowed actions and resources at the VPC endpoint layer.
IAM least privilege	Restrict model invocation, endpoint creation, S3 access, KMS usage, and logging permissions.
KMS encryption	Use customer-managed keys for sensitive S3 buckets, EBS volumes, logs, and model artifacts where required.
CloudTrail	Record Bedrock, SageMaker, IAM, KMS, and infrastructure API activity.
CloudWatch metrics	Monitor invocation count, latency, errors, token usage, and delivery failures. Bedrock supports CloudWatch metrics with dimensions such as `ModelId`.
VPC Flow Logs	Capture IP traffic metadata for network troubleshooting and egress monitoring. AWS states VPC Flow Logs capture information about IP traffic to and from network interfaces.
Guardrails	Use Amazon Bedrock Guardrails where applicable for content filters, denied topics, word filters, and sensitive information controls.
S3 bucket policies	Require access through approved VPC endpoints using conditions such as `aws:SourceVpce`.
Prompt logging policy	Decide whether prompts and responses may be logged; apply masking, retention, and access controls.
Container security	Scan images, pin versions, validate model artifact provenance, and restrict package downloads.
Organizations controls	Use SCPs to restrict unapproved Regions, model families, public networking, or unmanaged inference paths.
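
As one example of the Organizations control in the checklist, the sketch below builds a service control policy that denies Bedrock actions outside approved Regions using the `aws:RequestedRegion` condition key. The function name is hypothetical, and the policy is intentionally narrow. If you use cross-Region inference profiles, the destination Regions may also need to be allowed, so verify the behavior before attaching it.

```python
import json


def build_bedrock_region_scp(approved_regions: list[str]) -> dict:
    """Build an SCP that denies Bedrock actions in any Region not explicitly approved."""
    if not approved_regions:
        raise ValueError("at least one approved Region is required")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyBedrockOutsideApprovedRegions",
                "Effect": "Deny",
                "Action": ["bedrock:*"],
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"aws:RequestedRegion": sorted(approved_regions)}
                },
            }
        ],
    }


scp = build_bedrock_region_scp(["us-west-2", "us-east-1"])
print(json.dumps(scp, indent=2))
```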

Cost Considerations

Do not estimate production cost from model choice alone. A private LLM deployment includes model inference, endpoint infrastructure, networking, logging, storage, and security tooling.

Amazon Bedrock costs depend on the provider, model, modality, and service tier. AWS’s Bedrock pricing page states that model pricing depends on modality, provider, and model, and it lists DeepSeek as one of the model providers.

AWS PrivateLink costs include endpoint hours and data processing. AWS’s PrivateLink pricing examples show data processing charges and hourly endpoint ENI charges.

NAT Gateway costs matter if you keep NAT for package downloads, external APIs, or public endpoints. AWS states that NAT Gateway pricing includes each hour the NAT Gateway is available and each gigabyte of data processed. AWS also recommends considering interface or gateway endpoints when most traffic is to AWS services that support VPC endpoints.

SageMaker costs depend on the instance type, inference option, storage, and duration. AWS states that SageMaker AI uses pay-as-you-go pricing, and that real-time inference is charged based on the instance type used.

Also budget for S3 storage, CloudWatch Logs ingestion and retention, VPC Flow Logs, KMS requests, ECR storage, image scanning, multi-AZ duplication, and security monitoring.
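
The PrivateLink portion of the bill is simple arithmetic: endpoint hours (per endpoint, per Availability Zone) plus data processed. The sketch below shows the calculation with clearly hypothetical rates; the function name and all numbers are placeholders, so substitute current prices for your Region from the AWS pricing pages.

```python
def privatelink_monthly_cost(
    endpoint_count: int,
    az_count: int,
    hourly_rate_per_az: float,
    gb_processed: float,
    rate_per_gb: float,
    hours_per_month: float = 730.0,
) -> float:
    """Estimate monthly interface endpoint cost: endpoint-hours plus data processing."""
    endpoint_hours_cost = endpoint_count * az_count * hours_per_month * hourly_rate_per_az
    data_processing_cost = gb_processed * rate_per_gb
    return round(endpoint_hours_cost + data_processing_cost, 2)


# Hypothetical rates for illustration only; they are NOT actual AWS prices.
estimate = privatelink_monthly_cost(
    endpoint_count=2,
    az_count=2,
    hourly_rate_per_az=0.01,
    gb_processed=100,
    rate_per_gb=0.01,
)
print(f"Estimated monthly PrivateLink cost: {estimate}")
```

With these placeholder inputs, the endpoint-hours term is 2 × 2 × 730 × 0.01 and the data term is 100 × 0.01. Each additional endpoint or Availability Zone adds to the first term regardless of how much traffic it carries, which is why endpoint count matters in low-traffic environments.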

Common Mistakes and Troubleshooting

Problem	Likely cause	Fix
Bedrock calls still go through NAT	Private DNS disabled or endpoint missing	Enable private DNS and verify DNS resolution from the private subnet.
`AccessDeniedException`	IAM policy, endpoint policy, model access, or SCP blocks invocation	Check identity policy, endpoint policy, Organizations SCPs, and model access.
`UnknownEndpoint` or DNS failure	Wrong endpoint service name or unsupported Region	Verify service name and Region in AWS docs or VPC endpoint service list.
HTTPS timeout	Security group blocks TCP 443	Allow outbound 443 from app SG to VPCE SG and inbound 443 on VPCE SG from app SG.
Bedrock Mantle call fails	Model or Region does not support `bedrock-mantle`	Use `bedrock-runtime` or select a model documented for Mantle.
Model ID not found	Hardcoded model ID or wrong inference profile	Query current Bedrock model metadata and use the right Region/profile.
SageMaker endpoint cannot load artifacts	Missing S3 endpoint or bucket policy blocks VPCE access	Add S3 Gateway endpoint and update bucket policy.
Container cannot pull image	Missing ECR endpoints in no-NAT architecture	Add `ecr.api`, `ecr.dkr`, S3 endpoint, and required IAM permissions.
No CloudWatch logs	Missing logs endpoint, IAM permissions, or logging config	Add `logs` endpoint, allow logs actions, and configure log groups.
Unexpected internet egress	NAT route remains in private route table	Remove NAT route for strict no-egress environments and monitor with Flow Logs.
Invocation logs missing for Mantle	Bedrock invocation logging does not currently capture `bedrock-mantle` Responses API calls	Use CloudTrail and Mantle-specific monitoring; check AWS docs for updates.
SageMaker no-internet deployment fails	Model container requires external downloads	Pre-stage model artifacts and dependencies in S3/ECR and validate offline startup.
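
For the first row of the table, a quick way to confirm private DNS is working is to resolve the service hostname from the private workload and check that every returned address is private. The sketch below uses only the standard library; the function names are hypothetical, and the resolver is injectable so the logic can be exercised without network access.

```python
import ipaddress
import socket


def resolved_addresses(hostname: str, resolver=socket.getaddrinfo) -> list[str]:
    """Resolve a hostname and return the unique IP addresses it maps to."""
    return sorted({info[4][0] for info in resolver(hostname, 443)})


def resolves_privately(hostname: str, resolver=socket.getaddrinfo) -> bool:
    """Return True only if the hostname resolves and every address is private."""
    addresses = resolved_addresses(hostname, resolver)
    return bool(addresses) and all(
        ipaddress.ip_address(address).is_private for address in addresses
    )


def fake_private_resolver(host, port):
    """Offline stand-in for getaddrinfo that returns one private address."""
    return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", ("10.0.1.25", port))]


print(resolves_privately("bedrock-runtime.us-west-2.amazonaws.com", fake_private_resolver))
```

Run from the private subnet with the default resolver, a `False` result suggests the hostname is still resolving to public service addresses, which usually points to a missing endpoint or disabled private DNS.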

Best-practice Reference Architecture

For most enterprises, the best production pattern is:

Use Amazon Bedrock + AWS PrivateLink for managed DeepSeek inference when model, API, Region, quota, and governance requirements match.
Use SageMaker private endpoints when the team needs model artifact control, custom containers, endpoint-level tuning, or private model lifecycle management.
Use S3 as the controlled artifact boundary for self-hosted or SageMaker models.
Remove NAT for strict no-internet-egress workloads and replace service dependencies with VPC endpoints.
Apply IAM and endpoint policies together rather than relying on network controls alone.
Encrypt artifacts, logs, volumes, and outputs using KMS according to data classification.
Enable CloudTrail, CloudWatch, and VPC Flow Logs for auditability and troubleshooting.
Use Bedrock Guardrails where applicable for content filtering, sensitive information filtering, and policy alignment.
Deploy with Infrastructure as Code using Terraform, AWS CDK, or CloudFormation.
Separate dev, stage, and prod accounts and enforce model, Region, and network restrictions with AWS Organizations.

This design helps reduce public exposure and supports compliance goals, but it does not make the workload automatically compliant. Final compliance depends on your architecture, data classification, logging configuration, retention policy, access model, security review, and legal requirements.

Limitations

Model IDs, inference profile IDs, API support, quotas, prices, and Regions change. Always verify immediately before deployment.
AWS PrivateLink keeps traffic between your VPC and supported AWS services on the AWS network, but it does not replace IAM, encryption, logging, or application-layer controls.
No-egress deployments require dependency discipline. Any runtime dependency that tries to download from the internet can break startup or create a governance exception.
Bedrock invocation logging may not cover every endpoint family. AWS currently notes that model invocation logging is supported for bedrock-runtime calls and not for Responses API calls through bedrock-mantle.
Self-hosting DeepSeek requires significant ML infrastructure maturity.

FAQ

1. Can DeepSeek run inside an AWS Private VPC?

Yes. You can invoke managed DeepSeek models from private subnets through Amazon Bedrock PrivateLink, deploy DeepSeek-related models on SageMaker AI inside a VPC, or self-host on EC2/EKS private subnets. The right option depends on how much model and infrastructure control you need.

2. Is Amazon Bedrock private when accessed through AWS PrivateLink?

AWS PrivateLink lets your VPC connect privately to Amazon Bedrock without using an internet gateway, NAT device, VPN, or Direct Connect connection, and instances in the VPC do not need public IP addresses to access Bedrock. You still need IAM, endpoint policies, encryption, and logging controls.

3. Should I use Bedrock or SageMaker for DeepSeek on AWS?

Use Bedrock when you want managed API access with less infrastructure management. Use SageMaker when you need custom model hosting, custom containers, private model artifacts, endpoint control, or ML lifecycle customization. AWS’s decision guide positions Bedrock for simplified foundation model API integration and SageMaker AI for more customized ML workflows.

4. Which VPC endpoints are needed for DeepSeek on Bedrock?

For basic private inference, create com.amazonaws.region.bedrock-runtime. Add com.amazonaws.region.bedrock for control-plane operations. Add com.amazonaws.region.bedrock-mantle only when your selected model, API pattern, and Region support Bedrock Mantle. AWS also documents Bedrock agent endpoint suffixes for agent use cases.

5. Which VPC endpoints are needed for DeepSeek on SageMaker?

For private SageMaker access, use com.amazonaws.region.sagemaker.api and com.amazonaws.region.sagemaker.runtime. For no-internet deployments, add S3, STS, CloudWatch Logs, CloudWatch monitoring, ECR, Secrets Manager, and any other service endpoints your container or deployment pipeline requires. AWS documents sagemaker.runtime as required for endpoint invocations in no-internet Studio/VPC configurations.

6. Can I deploy DeepSeek with no internet egress?

Yes, but you must remove NAT routes and provide private alternatives for every dependency. Bedrock calls can use PrivateLink. SageMaker model artifacts can come from S3 through a Gateway endpoint. Containers can be pulled from ECR through ECR interface endpoints. Logs and metrics should go through CloudWatch endpoints.

7. Does AWS PrivateLink remove the need for NAT Gateway?

For supported AWS service traffic, often yes. Bedrock and SageMaker API/Runtime calls can use PrivateLink. But NAT might still be needed for unsupported services or external internet destinations. In a strict no-egress design, avoid NAT and redesign dependencies around VPC endpoints and private connectivity.

8. How do I restrict access to only DeepSeek models?

Use IAM policies that allow invocation only for approved model ARNs, model IDs, or inference profile ARNs where applicable. Add endpoint policies as an extra boundary. In multi-account environments, use SCPs to deny unapproved Regions, models, or public network paths.

9. Can I use DeepSeek-R1, DeepSeek-V3.1, or DeepSeek V3.2 on AWS?

AWS documentation currently lists DeepSeek-R1, DeepSeek-V3.1, and DeepSeek V3.2 in Amazon Bedrock. However, model availability, endpoint support, service tier, model ID, and Region can vary, so verify the current Bedrock model card and Region list before deployment.

10. Is self-hosting DeepSeek on EC2/EKS better than Bedrock?

Not usually for teams that want managed private inference. Self-hosting is better only when you need maximum runtime control, custom serving stacks, specialized GPU scheduling, or custom isolation requirements. It creates more work in patching, scaling, monitoring, capacity planning, and security operations.

11. How do I monitor DeepSeek inference traffic?

For Bedrock, use CloudTrail, CloudWatch metrics, and invocation logging where supported. For VPC-level visibility, use VPC Flow Logs. For SageMaker, use endpoint metrics, container logs, CloudWatch, CloudTrail, and network logs. Bedrock metrics can be viewed in CloudWatch by searching for the model ID.

12. What are the biggest security risks in private LLM deployments?

The biggest risks are over-permissive IAM, accidental NAT egress, unapproved model usage, prompt/response leakage through logs, weak S3 bucket policies, unscanned containers, unverified model artifacts, missing KMS controls, and lack of audit visibility.

Conclusion

DeepSeek on AWS Private VPC is best implemented as a private, governed inference architecture rather than a simple model deployment task.

For most teams, start with Amazon Bedrock + AWS PrivateLink because it gives managed DeepSeek access without operating model infrastructure. Use SageMaker private endpoints when you need custom model hosting, model artifacts in S3, custom containers, or endpoint-level control. Use EC2/EKS self-hosting only when your team is ready to operate the full LLM serving stack.

Need help designing a private LLM architecture on AWS? Start with a security review of your model path, VPC endpoints, IAM policies, logging, and data classification before production deployment.
