Your MCP server works locally. Tests pass. Time to run it in production. This module covers the operational concerns that separate a development prototype from a production service: packaging, process management, logging, monitoring, and scaling.
Docker Packaging
Docker is the standard way to package and distribute MCP servers, especially for HTTP-based transports (SSE, Streamable HTTP).
```dockerfile
# Dockerfile for an MCP server
FROM node:20-slim AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM node:20-slim
WORKDIR /app
COPY --from=builder /app/package*.json ./
# Install production dependencies only (the builder kept dev deps for the build)
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist

# MCP servers should NOT run as root
RUN addgroup --system mcp && adduser --system --group mcp
USER mcp

# Environment variables for configuration
ENV NODE_ENV=production
ENV PORT=3000

# Health check (slim images ship without curl, so use Node's built-in fetch)
HEALTHCHECK --interval=30s --timeout=3s \
  CMD node -e "fetch('http://localhost:3000/health').then(r => process.exit(r.ok ? 0 : 1)).catch(() => process.exit(1))"

EXPOSE 3000
CMD ["node", "dist/index.js"]
```

Key Docker considerations for MCP:
- Multi-stage build to keep the image small (no dev dependencies)
- Non-root user for security
- Health check endpoint for container orchestration
- For stdio servers: the container is invoked per connection, not long-running. Use `docker run --rm -i` in the client config (`-i` keeps stdin open for the stdio transport).
Claude Desktop config using Docker for stdio:

```json
{
  "mcpServers": {
    "my-server": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "--env", "API_KEY=...",
        "my-mcp-server:latest"
      ]
    }
  }
}
```

Process Management
For HTTP-based MCP servers running on a VM or bare metal, you need a process manager to handle restarts, log rotation, and graceful shutdowns.
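A process manager can only request a graceful shutdown; the server itself must handle SIGTERM by refusing new connections and letting in-flight requests drain. A minimal Node sketch (illustrative names, not MCP SDK APIs):

```typescript
// Graceful-shutdown sketch for an HTTP MCP server.
import { createServer } from "node:http";

const server = createServer((_req, res) => {
  res.end("ok");
});

function shutdown(): Promise<void> {
  // close() stops accepting new connections; in-flight requests
  // are allowed to finish before the callback fires.
  return new Promise((resolve) => {
    server.close(() => resolve());
  });
}

// The process manager sends SIGTERM first; its kill timeout bounds
// how long this handler may run before the process is force-killed.
process.on("SIGTERM", async () => {
  await shutdown();
  process.exit(0);
});
```

PM2's `kill_timeout` and Docker's stop grace period both cap how long this drain phase gets before SIGKILL arrives.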
```javascript
// PM2 ecosystem config
// ecosystem.config.js
module.exports = {
  apps: [{
    name: "mcp-server",
    script: "dist/index.js",
    instances: 1,            // Or "max" for cluster mode (HTTP only)
    exec_mode: "fork",
    env: {
      NODE_ENV: "production",
      PORT: 3000,
    },
    // Restart on failure
    max_restarts: 10,
    min_uptime: "10s",
    // Graceful shutdown
    kill_timeout: 5000,
    listen_timeout: 10000,
    // Log management
    error_file: "/var/log/mcp-server/error.log",
    out_file: "/var/log/mcp-server/out.log",
    log_date_format: "YYYY-MM-DD HH:mm:ss",
    merge_logs: true,
  }],
};
```

An alternative to PM2 is a systemd service:
```ini
# /etc/systemd/system/mcp-server.service
[Unit]
Description=MCP Server
After=network.target

[Service]
Type=simple
User=mcp
WorkingDirectory=/opt/mcp-server
ExecStart=/usr/bin/node dist/index.js
Restart=on-failure
RestartSec=5
StandardOutput=journal
StandardError=journal
Environment=NODE_ENV=production
Environment=PORT=3000

[Install]
WantedBy=multi-user.target
```

Structured Logging
Remember: stdout is reserved for MCP protocol messages. All application logging must go to stderr, files, or a logging service.
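One defensive sketch for stdio servers (an optional trick, not something the SDK does for you): rebind `console.log` so a stray call in your code or a dependency cannot write to stdout and corrupt the protocol stream.

```typescript
// For stdio transports only: route any stray console.log to stderr so
// nothing but protocol messages can reach stdout.
console.log = (...args: unknown[]) => console.error(...args);
```

For HTTP-based transports this is unnecessary, since stdout is not the protocol channel there.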
```typescript
// Structured logging with pino, routed to stderr
// (pino writes to stdout by default, so the destination must be set explicitly)
import pino from "pino";
import { z } from "zod";

const logger = pino({
  level: process.env.LOG_LEVEL || "info",
  // Write to stderr, never stdout
  transport: {
    target: "pino/file",
    options: { destination: 2 }, // fd 2 = stderr
  },
  // Structured fields for every log line
  base: {
    service: "mcp-weather-server",
    version: "1.2.0",
  },
});

// In tool handlers:
server.tool("get_weather", { city: z.string() }, async ({ city }) => {
  const startTime = Date.now();
  logger.info({ tool: "get_weather", city }, "Tool called");
  try {
    const result = await fetchWeather(city);
    logger.info({
      tool: "get_weather",
      city,
      duration_ms: Date.now() - startTime,
      success: true,
    }, "Tool completed");
    return { content: [{ type: "text", text: JSON.stringify(result) }] };
  } catch (e) {
    logger.error({
      tool: "get_weather",
      city,
      duration_ms: Date.now() - startTime,
      error: (e as Error).message,
    }, "Tool failed");
    throw e;
  }
});
```

Structured logging lets you query logs by tool name, duration, success/failure, and any other field. This is critical for debugging production issues.
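The call/complete/fail pattern repeats for every tool, so it is worth factoring out. A hypothetical wrapper (not part of the MCP SDK; a minimal JSON-to-stderr logger stands in for pino so the sketch is self-contained):

```typescript
// Minimal structured logger: one JSON object per line, to stderr.
const log = (level: string, fields: Record<string, unknown>, msg: string) =>
  process.stderr.write(JSON.stringify({ level, msg, ...fields }) + "\n");

// Wrap any async tool handler with the call/complete/fail logging pattern.
function withToolLogging<A, R>(
  tool: string,
  handler: (args: A) => Promise<R>,
): (args: A) => Promise<R> {
  return async (args: A) => {
    const start = Date.now();
    log("info", { tool }, "Tool called");
    try {
      const result = await handler(args);
      log("info", { tool, duration_ms: Date.now() - start, success: true }, "Tool completed");
      return result;
    } catch (e) {
      log("error", { tool, duration_ms: Date.now() - start, error: (e as Error).message }, "Tool failed");
      throw e; // logging must never swallow the failure
    }
  };
}
```

Each handler is then registered via `withToolLogging("get_weather", handler)`, so every tool emits the same queryable fields without repeating the try/catch.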
Health Checks & Monitoring
```typescript
// Health check endpoint for HTTP MCP servers
app.get("/health", async (req, res) => {
  const checks = {
    server: "ok",
    uptime: process.uptime(),
    memory: process.memoryUsage(),
    dependencies: {} as Record<string, string>,
  };

  // Check external dependencies
  try {
    await db.query("SELECT 1");
    checks.dependencies.database = "ok";
  } catch {
    checks.dependencies.database = "error";
  }

  try {
    // fetch() only rejects on network errors, so check the status too
    const upstream = await fetch("https://api.weather.com/health");
    checks.dependencies.weather_api = upstream.ok ? "ok" : "error";
  } catch {
    checks.dependencies.weather_api = "error";
  }

  const allHealthy = Object.values(checks.dependencies)
    .every(v => v === "ok");
  res.status(allHealthy ? 200 : 503).json(checks);
});
```
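The metrics endpoint below reads from a `metrics` object that the snippet leaves undefined. A minimal in-process registry could look like this (the field names are assumptions chosen to match the snippet; a real deployment would more likely use a client library such as prom-client):

```typescript
// Minimal in-process counters backing a Prometheus-style /metrics endpoint.
const metrics = {
  getWeatherCount: 0,
  getWeatherErrors: 0,
  getWeatherP99: 0,      // would be computed from a rolling window of durations
  activeConnections: 0,
};

// Called from the tool handler after each invocation.
function recordToolCall(success: boolean): void {
  metrics.getWeatherCount += 1;
  if (!success) metrics.getWeatherErrors += 1;
}
```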
```typescript
// Metrics endpoint (Prometheus-compatible exposition format)
app.get("/metrics", (req, res) => {
  res.set("Content-Type", "text/plain");
  res.send([
    `mcp_tool_calls_total{tool="get_weather"} ${metrics.getWeatherCount}`,
    `mcp_tool_errors_total{tool="get_weather"} ${metrics.getWeatherErrors}`,
    `mcp_tool_duration_seconds{tool="get_weather",quantile="0.99"} ${metrics.getWeatherP99}`,
    `mcp_active_connections ${metrics.activeConnections}`,
  ].join("\n"));
});
```

Horizontal Scaling
Stdio MCP servers are inherently single-connection: one process per client. HTTP-based servers can scale horizontally.
Scaling patterns:

- Stdio: one process per client, spawned and managed by the client. Scale by letting clients start their own processes; there is no shared state between instances and no load balancer needed. This is the simplest model.
- SSE / Streamable HTTP: run multiple server instances behind a load balancer. SSE connections are stateful, so sticky sessions are required; share state via Redis or a database.
- Serverless (Lambda, Cloud Functions): each request is a new invocation, stateless by design. This works for Streamable HTTP (stateless request/response) but does NOT work for SSE, which requires a persistent connection.

nginx config for SSE load balancing:

```nginx
upstream mcp_servers {
    ip_hash;  # Sticky sessions based on client IP
    server 127.0.0.1:3001;
    server 127.0.0.1:3002;
    server 127.0.0.1:3003;
}

server {
    location /mcp {
        proxy_pass http://mcp_servers;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_buffering off;       # Required for SSE
        proxy_read_timeout 86400;  # Keep SSE connections alive
    }
}
```

Exercise: Production-Ready Deployment
Take your MCP server and make it production-ready:
- Create a Dockerfile with multi-stage build, non-root user, and health check
- Add structured logging with pino (stderr only, structured fields)
- Add a `/health` endpoint that checks all dependencies
- Create a PM2 ecosystem config with restart policies and log rotation
- Write a `docker-compose.yml` that runs the server with environment variables
Check Your Understanding
- Why should Docker containers for MCP servers run as non-root?
- What is the difference between PM2 fork mode and cluster mode for MCP servers?
- Why must all logging go to stderr in an MCP server?
- What should a health check endpoint verify beyond “the server is running”?
- Why can't SSE-based MCP servers run on serverless platforms?
Key Takeaway
Production MCP servers need the same operational rigor as any other service: Docker for packaging, process managers for reliability, structured logging for debugging, health checks for monitoring, and a clear scaling strategy. The unique constraint is stdout — it belongs to the protocol, so everything else goes to stderr. Get these fundamentals right and your MCP server will run reliably at any scale.