MCP Server Kubernetes Deployment
Deploy and orchestrate MCP servers on Kubernetes with auto-scaling, health checks, and production-grade configurations
MCPgee Team
MCP Expert
Introduction
Kubernetes provides the orchestration layer needed to run MCP servers at scale in production. It gives you self-healing, service discovery, rolling updates, and secrets management as core features, and auto-scaling once a metrics source such as metrics-server is installed. This tutorial covers deploying MCP servers to Kubernetes, from basic deployments to production-grade configurations.
Before starting, ensure your MCP server is containerized. If not, follow our Docker deployment tutorial first.
Basic Deployment
Step 1: Create the Deployment Manifest
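A minimal sketch of a Deployment manifest follows. The image name, the container port 3000, and the /health endpoint are assumptions for illustration; substitute whatever your containerized MCP server actually exposes.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server
  labels:
    app: mcp-server
spec:
  replicas: 2
  selector:
    matchLabels:
      app: mcp-server
  template:
    metadata:
      labels:
        app: mcp-server
    spec:
      containers:
        - name: mcp-server
          # Hypothetical image reference; use your own registry and tag.
          image: registry.example.com/mcp-server:1.0.0
          ports:
            - name: http
              containerPort: 3000   # assumed listening port
          # Requests are required for CPU-based autoscaling later on.
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi
          # /health is an assumed endpoint, not part of the MCP specification.
          readinessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 15
            periodSeconds: 20
```

The readiness probe keeps traffic away from pods that are still starting, while the liveness probe restarts containers that stop responding.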
Step 2: Create the Service
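A Service gives the pods a stable in-cluster name and load-balances across them. The sketch below assumes the pods carry the label app: mcp-server and expose a container port named http; both names are illustrative.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mcp-server
  labels:
    app: mcp-server
spec:
  type: ClusterIP          # reachable only from inside the cluster
  selector:
    app: mcp-server        # must match the pod template labels
  ports:
    - name: http
      port: 80             # port other workloads connect to
      targetPort: http     # named container port on the pods
```

Inside the cluster, other workloads in the same namespace can then reach the server at http://mcp-server on port 80.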
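Step 3: Deploy
Apply the Deployment and Service manifests with kubectl apply -f <file>, then confirm that the pods reach the Running state with kubectl get pods.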
Exposing MCP Servers
Ingress with TLS
For external access, use an Ingress controller with TLS:
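The following is a sketch for the ingress-nginx controller specifically; other controllers use different annotations. The hostname, the TLS secret name mcp-tls, and the backing Service mcp-server on port 80 are assumptions.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mcp-server
  annotations:
    # Raise the 60-second default so long-lived streams are not cut off.
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
    # Pass streamed responses through without buffering them.
    nginx.ingress.kubernetes.io/proxy-buffering: "off"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - mcp.example.com      # hypothetical hostname
      secretName: mcp-tls      # assumed to already hold the certificate and key
  rules:
    - host: mcp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: mcp-server
                port:
                  number: 80
```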
Note the proxy timeout and buffering annotations. They matter because MCP's Streamable HTTP transport can hold connections open for long-lived streams. Without these settings, the proxy may buffer streamed responses or terminate connections prematurely.
LoadBalancer Service
For cloud providers, use a LoadBalancer service:
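A minimal sketch, assuming pods labeled app: mcp-server that listen on port 3000. The Service name is illustrative, and provider-specific annotations (for example for TLS termination or internal-only load balancers) are deliberately left out.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mcp-server-lb
spec:
  type: LoadBalancer       # asks the cloud provider to provision an external load balancer
  selector:
    app: mcp-server
  ports:
    - name: http
      port: 80
      targetPort: 3000     # assumed container port
```

As written, this exposes plain HTTP. For anything beyond testing, terminate TLS at the load balancer or in front of it.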
Secrets Management
Kubernetes Secrets
Store sensitive configuration securely:
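A sketch of a Secret holding two values. The key names and placeholder values are hypothetical. The stringData field accepts plain text, and Kubernetes stores it base64-encoded.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: mcp-server-secrets
type: Opaque
stringData:
  # Placeholder values; never commit real credentials to version control.
  API_KEY: "replace-me"
  DATABASE_URL: "postgres://user:password@db.example.com:5432/mcp"
```

Base64 is an encoding, not encryption. Secrets are only encrypted in etcd if encryption at rest is enabled for the cluster, so restrict access to them with RBAC as well.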
Reference secrets in your deployment:
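The fragment below shows only the relevant part of a Deployment's pod template. It assumes a Secret named mcp-server-secrets with a key API_KEY; both names are illustrative.

```yaml
# Fragment of a Deployment spec, not a complete manifest.
spec:
  template:
    spec:
      containers:
        - name: mcp-server
          env:
            # Inject a single key from the Secret as an environment variable.
            - name: API_KEY
              valueFrom:
                secretKeyRef:
                  name: mcp-server-secrets
                  key: API_KEY
```

To expose every key in the Secret at once, an envFrom entry with a secretRef can be used instead of listing keys individually.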
External Secrets Operator
For production, use the External Secrets Operator to sync secrets from AWS Secrets Manager, HashiCorp Vault, or other providers into Kubernetes Secrets:
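A sketch of an ExternalSecret resource, assuming the operator is installed and a ClusterSecretStore named aws-secrets-manager has already been configured. The store name, remote key, and property are hypothetical, and the apiVersion depends on the operator release you run.

```yaml
# apiVersion varies by operator release (newer releases also serve v1).
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: mcp-server-secrets
spec:
  refreshInterval: 1h              # how often to re-read the external value
  secretStoreRef:
    name: aws-secrets-manager      # hypothetical, pre-configured store
    kind: ClusterSecretStore
  target:
    name: mcp-server-secrets       # Kubernetes Secret the operator creates
    creationPolicy: Owner
  data:
    - secretKey: API_KEY           # key in the resulting Kubernetes Secret
      remoteRef:
        key: prod/mcp-server       # hypothetical entry in the external provider
        property: api_key          # field inside that entry
```

The workload keeps referencing an ordinary Kubernetes Secret, so the Deployment does not need to change when the source of truth moves to an external provider.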
Auto-Scaling
Horizontal Pod Autoscaler
Scale MCP server pods based on CPU, memory, or custom metrics:
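A sketch of a CPU-based autoscaler, assuming a Deployment named mcp-server whose containers declare CPU requests. The replica bounds and the 70% target are illustrative values, not recommendations.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mcp-server
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mcp-server
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the CPU request, averaged across pods
  behavior:
    scaleDown:
      # Wait before removing pods so open client connections are not dropped on brief dips.
      stabilizationWindowSeconds: 300
```

Utilization is measured relative to the request: with a CPU request of 100m and a 70% target, the autoscaler starts adding pods once average usage per pod rises above roughly 70m.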
Vertical Pod Autoscaler
Automatically adjust resource requests and limits:
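A sketch of a VerticalPodAutoscaler, assuming the VPA components are installed in the cluster (they are not part of a default Kubernetes installation). The target name and the resource bounds are illustrative.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: mcp-server
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mcp-server
  updatePolicy:
    # "Recreate" applies recommendations by evicting pods; use "Off" to only record them.
    updateMode: "Recreate"
  resourcePolicy:
    containerPolicies:
      - containerName: mcp-server
        minAllowed:
          cpu: 50m
          memory: 64Mi
        maxAllowed:
          cpu: "1"
          memory: 1Gi
```

Avoid letting a VPA and an HPA both act on CPU or memory for the same workload, since the two controllers would work against each other.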
ConfigMaps for Runtime Configuration
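ConfigMaps hold non-sensitive settings separately from the container image. The sketch below stores a JSON file; the file name and the settings inside it are hypothetical and depend on what your server reads.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: mcp-server-config
data:
  # Each key becomes a file name when the ConfigMap is mounted as a volume.
  config.json: |
    {
      "logLevel": "info",
      "maxConnections": 100
    }
```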
Mount as a file in your pod:
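The fragment below shows only the volume-related part of a Deployment's pod template. It assumes a ConfigMap named mcp-server-config; the mount path /etc/mcp is an arbitrary choice.

```yaml
# Fragment of a Deployment spec, not a complete manifest.
spec:
  template:
    spec:
      containers:
        - name: mcp-server
          volumeMounts:
            - name: config
              mountPath: /etc/mcp    # files appear as /etc/mcp/<key>
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: mcp-server-config
```

Mounted files are refreshed after the ConfigMap changes, but the server only picks up new values if it re-reads the file or the pods are restarted.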
Rolling Updates and Rollbacks
Update Strategy
Configure rolling updates for zero-downtime deployments:
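A sketch of the relevant Deployment fields. The specific numbers are illustrative and should be tuned to how long your server takes to start and to drain connections.

```yaml
# Fragment of a Deployment spec, not a complete manifest.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # start one extra pod before removing an old one
      maxUnavailable: 0    # never drop below the desired replica count
  minReadySeconds: 10      # a new pod must stay ready this long before it counts as available
  template:
    spec:
      # Time allowed between SIGTERM and SIGKILL for in-flight requests to finish.
      terminationGracePeriodSeconds: 60
```

With maxUnavailable set to 0, the rollout only proceeds as new pods pass their readiness probe, so a broken release stalls instead of replacing healthy pods.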
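Perform an Update
Change the image with kubectl set image deployment/<name> <container>=<image>:<tag>, or edit the manifest and re-apply it. Watch progress with kubectl rollout status deployment/<name>, and revert with kubectl rollout undo deployment/<name> if the new version misbehaves.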
Monitoring and Observability
Prometheus Metrics
Add a metrics endpoint to your MCP server (by convention, /metrics in the Prometheus text format) so that Prometheus can scrape it.
ServiceMonitor for Prometheus Operator
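A ServiceMonitor tells the Prometheus Operator which Services to scrape. The sketch below assumes a Service labeled app: mcp-server with a port named http, and a server that serves metrics at /metrics. The release label is hypothetical and has to match whatever selector your Prometheus instance is configured with.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: mcp-server
  labels:
    release: prometheus    # hypothetical; must match the Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: mcp-server      # selects Services (not pods) carrying this label
  endpoints:
    - port: http           # name of the Service port to scrape
      path: /metrics
      interval: 30s
```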
Network Policies
Restrict network access to your MCP servers:
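A sketch of a policy that admits traffic only from the ingress controller and limits outbound traffic to DNS and HTTPS. It assumes the controller runs in a namespace named ingress-nginx, cluster DNS runs in kube-system, and the pods listen on port 3000. Policies are only enforced if the cluster's network plugin supports them.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mcp-server
spec:
  podSelector:
    matchLabels:
      app: mcp-server
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow inbound traffic only from the ingress controller's namespace.
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - protocol: TCP
          port: 3000
  egress:
    # DNS lookups against cluster DNS.
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    # Outbound HTTPS to any destination, e.g. external APIs the tools call.
    - ports:
        - protocol: TCP
          port: 443
```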
Multi-Server Deployment
Deploy multiple MCP servers with shared infrastructure:
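One way to share infrastructure is to place several servers in a common namespace behind a single Ingress and TLS certificate. The sketch below routes by hostname; the namespace, hostnames, secret name, and the two Service names are all hypothetical, and the annotations again assume ingress-nginx.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: mcp
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mcp-servers
  namespace: mcp
  annotations:
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-buffering: "off"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - files.mcp.example.com
        - db.mcp.example.com
      secretName: mcp-tls            # one certificate covering both hostnames
  rules:
    - host: files.mcp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: mcp-filesystem   # hypothetical Service for the first server
                port:
                  number: 80
    - host: db.mcp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: mcp-database     # hypothetical Service for the second server
                port:
                  number: 80
```

Each server keeps its own Deployment and Service, so it can be scaled and updated independently while sharing the ingress controller, certificate, and namespace-level policies.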
Security Hardening
For comprehensive MCP security guidance, see our security fundamentals and authentication tutorials.
Key Kubernetes-specific security practices:
- Pod Security Standards: Use restricted security context
- Network Policies: Limit pod-to-pod communication
- RBAC: Minimal service account permissions
- Image scanning: Scan container images for vulnerabilities
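The first and third practices above can be sketched as a pod template fragment that aims to satisfy the restricted Pod Security Standard. The user ID is arbitrary, and disabling the service account token is only appropriate if the server never calls the Kubernetes API.

```yaml
# Fragment of a Deployment spec, not a complete manifest.
spec:
  template:
    spec:
      # Skip mounting API credentials; assumes the server does not talk to the Kubernetes API.
      automountServiceAccountToken: false
      securityContext:
        runAsNonRoot: true
        runAsUser: 10001             # arbitrary non-root UID
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: mcp-server
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true   # mount an emptyDir if the server needs scratch space
            capabilities:
              drop: ["ALL"]
```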
Conclusion
Kubernetes provides everything you need to run MCP servers at production scale. From auto-scaling and self-healing to secrets management and network policies, K8s handles the infrastructure so you can focus on building great MCP tools. Start with a simple deployment and add features as your needs grow.
For more deployment options, explore serverless deployment with AWS Lambda or browse our Kubernetes server examples.
Key Takeaways
- Kubernetes provides auto-scaling, self-healing, and service discovery for MCP servers
- Configure proxy timeouts in Ingress for Streamable HTTP long-lived connections
- Use Kubernetes Secrets and External Secrets Operator for credential management
- HorizontalPodAutoscaler scales MCP servers based on CPU, memory, or custom metrics
- Network Policies and Pod Security Standards harden your MCP deployment
Troubleshooting
Pods keep restarting with CrashLoopBackOff
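Check the logs of the crashed container with kubectl logs <pod-name> --previous and review recent events with kubectl describe pod <pod-name>. Common causes: missing environment variables, an image tag that points to the wrong build, or a liveness probe endpoint that does not respond. (A tag that does not exist at all shows up as ImagePullBackOff rather than CrashLoopBackOff.) Ensure your MCP server starts correctly in the container locally before deploying to Kubernetes.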
Streamable HTTP connections are being terminated
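Add the proxy-read-timeout and proxy-buffering annotations to your Ingress (with ingress-nginx, these carry the nginx.ingress.kubernetes.io/ prefix). The default nginx read timeout of 60 seconds is too short for MCP streaming connections. Set it to at least 3600 seconds, and turn proxy buffering off so streamed responses are forwarded immediately.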
Auto-scaler not scaling up under load
Verify the metrics-server is installed in your cluster (kubectl top pods). Check that resource requests are defined in your deployment, as the HPA needs these to calculate utilization percentages.
Next Steps
- Set up monitoring with Prometheus and Grafana
- Implement CI/CD pipelines for automated deployments
- Explore serverless alternatives with AWS Lambda
- Add service mesh for advanced traffic management