In the previous module, you learned what MCP is and why it exists. Now we need to understand how it is structured. MCP defines three distinct roles, a capability negotiation handshake, and a transport layer. Once you understand these pieces, you will be able to reason about any MCP system you encounter.
The Three Roles
Every MCP system has exactly three types of participants. Understanding what each one does — and what it does not do — is the foundation for everything else.
Architecture Diagram:
┌──────────────────────────────────────────────┐
│                     HOST                     │
│     (Claude Desktop, VS Code, your app)      │
│                                              │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐    │
│  │ Client A │  │ Client B │  │ Client C │    │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘    │
└───────│─────────────│─────────────│──────────┘
        │             │             │
  ┌─────▼─────┐ ┌─────▼─────┐ ┌─────▼─────┐
  │ Server A  │ │ Server B  │ │ Server C  │
  │  (files)  │ │ (GitHub)  │ │ (database)│
  └───────────┘ └───────────┘ └───────────┘
One host contains multiple clients.
Each client manages one connection to one server.
Servers run as separate processes.
The Host
The host is the AI application that the user interacts with. Claude Desktop is a host. VS Code with the Copilot extension is a host. A custom chat application you build is a host.
The host is responsible for:
- Managing the AI model — sending prompts, receiving responses, handling the conversation loop.
- Creating and managing clients — one client per MCP server connection.
- Enforcing security boundaries — deciding which servers to trust, what permissions to grant, and whether to show user-consent prompts before tool execution.
- Aggregating capabilities — collecting all the tools, resources, and prompts from every connected server and passing them to the AI model.
Think of the host as the orchestrator. It sits between the user and the servers, deciding what the model can see and do.
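The aggregation and routing described above can be sketched in a few lines of TypeScript. This is an illustrative model, not SDK code — `Host`, `McpClient`, and `ToolInfo` are made-up names standing in for whatever your application actually uses:

```typescript
// Hypothetical sketch of a host aggregating tools from its clients.
interface ToolInfo { name: string; description: string }

interface McpClient {
  serverName: string;
  listTools(): ToolInfo[];
  callTool(name: string, args: unknown): unknown;
}

class Host {
  // Remember which client (and therefore which server) owns each tool.
  private toolOwners = new Map<string, McpClient>();

  constructor(private clients: McpClient[]) {
    for (const client of clients) {
      for (const tool of client.listTools()) {
        this.toolOwners.set(tool.name, client);
      }
    }
  }

  // The AI model sees one flat list, regardless of how many servers exist.
  allTools(): ToolInfo[] {
    return this.clients.flatMap((c) => c.listTools());
  }

  // Route a model-issued tool call to the client that owns the tool.
  route(toolName: string, args: unknown): unknown {
    const owner = this.toolOwners.get(toolName);
    if (!owner) throw new Error(`No server provides tool: ${toolName}`);
    return owner.callTool(toolName, args);
  }
}
```

Note that the routing table is the only place where tool names meet server identities — the model itself never learns which server a tool belongs to.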
The Client
The client lives inside the host. It is a protocol-level construct — you do not usually see it as a separate application. Each client manages exactly one connection to one server.
The client is responsible for:
- Establishing the connection — launching the server process (for stdio) or connecting to a URL (for HTTP).
- Performing the initialization handshake — exchanging protocol versions and negotiating capabilities.
- Routing messages — forwarding tool calls from the host to the server, and results from the server back to the host.
- Maintaining connection state — tracking what capabilities the server advertised, handling reconnection if the server crashes.
A useful mental model: the client is a bidirectional pipe with protocol awareness. It knows how to speak MCP, and it keeps the conversation going with exactly one server.
Key Takeaway: The host-client relationship is one-to-many. One host creates many clients. Each client connects to exactly one server. The client is not a separate application — it is an internal component of the host.
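The handshake a client performs can be sketched as below, where `send` and `receive` are placeholders for a real transport. The method names (`initialize`, `notifications/initialized`) come from the MCP specification; everything else here is illustrative:

```typescript
// Placeholder transport functions, not a real SDK.
type Send = (msg: object) => void;
type Receive = () => any;

function initialize(send: Send, receive: Receive) {
  // 1. Client proposes a protocol version and declares its capabilities.
  send({
    jsonrpc: "2.0",
    id: 1,
    method: "initialize",
    params: {
      protocolVersion: "2025-03-26",
      capabilities: { roots: { listChanged: true }, sampling: {} },
      clientInfo: { name: "example-client", version: "0.1.0" },
    },
  });

  // 2. Server answers with its own capabilities.
  const response = receive();

  // 3. Client confirms the handshake is complete (a notification, so no id).
  send({ jsonrpc: "2.0", method: "notifications/initialized" });

  return response.result.capabilities;
}
```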
The Server
The server is the component you will build throughout this course. It is a separate process that exposes capabilities to the AI model through the MCP protocol.
The server is responsible for:
- Declaring capabilities — telling the client what tools, resources, and prompts it offers.
- Executing tool calls — receiving a tool name and parameters, doing the actual work (query a database, call an API, read a file), and returning the result.
- Serving resources — providing data that the AI model can read without calling a tool.
- Sending notifications — proactively informing the client about changes (e.g., a resource was updated) without being asked.
A server knows nothing about the AI model. It does not know what model is being used, what the user asked, or what conversation is happening. It just receives tool calls and returns results. This separation is intentional — it keeps servers simple and reusable.
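A dependency-free sketch of that request/response loop makes the point concrete. The method names (`tools/list`, `tools/call`) and JSON-RPC error codes follow the spec, but the dispatch structure is simplified for illustration — notice that nothing in it references a model or a conversation:

```typescript
type JsonRpcRequest = { jsonrpc: "2.0"; id: number; method: string; params?: any };
type JsonRpcResponse = {
  jsonrpc: "2.0";
  id: number;
  result?: any;
  error?: { code: number; message: string };
};

// The server's entire world: named functions it knows how to run.
const tools: Record<string, (params: any) => string> = {
  read_file: (params: { path: string }) => `contents of ${params.path}`,
};

function handleRequest(req: JsonRpcRequest): JsonRpcResponse {
  switch (req.method) {
    case "tools/list":
      return {
        jsonrpc: "2.0",
        id: req.id,
        result: { tools: Object.keys(tools).map((name) => ({ name })) },
      };
    case "tools/call": {
      const fn = tools[req.params.name];
      if (!fn) {
        return { jsonrpc: "2.0", id: req.id, error: { code: -32602, message: "Unknown tool" } };
      }
      const text = fn(req.params.arguments);
      return { jsonrpc: "2.0", id: req.id, result: { content: [{ type: "text", text }] } };
    }
    default:
      return { jsonrpc: "2.0", id: req.id, error: { code: -32601, message: "Method not found" } };
  }
}
```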
How They Connect
Let us trace the lifecycle of a connection from start to finish.
Connection Lifecycle:
1. HOST starts up
└─> Creates Client A for "filesystem server"
└─> Creates Client B for "github server"
2. CLIENT A initializes
└─> Spawns server process: node filesystem-server.js
└─> Sends: initialize request (protocol version, client capabilities)
└─> Receives: initialize response (server capabilities)
└─> Sends: initialized notification (handshake complete)
3. CLIENT B initializes (same process, different server)
4. HOST aggregates
└─> Client A reports: [read_file, write_file, list_directory]
└─> Client B reports: [search_repos, create_issue]
└─> Host passes all 5 tools to the AI model
5. USER sends a message
└─> AI model decides to call "read_file"
└─> Host routes to Client A (which owns that tool)
└─> Client A sends tool call to Server A
└─> Server A executes, returns result
└─> Client A forwards result to Host
└─> Host feeds result back to AI model
└─> AI model formulates response to user
Notice that the user never interacts with clients or servers directly. The host is the single point of contact. And the AI model sees a flat list of tools — it does not know or care which server each tool comes from.
Capability Negotiation
When a client connects to a server, they do not just start sending tool calls. First, they perform a capability negotiation. This is critical because not every server supports every feature, and not every client supports every feature either.
Here is what happens during initialization:
// Client sends:
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-03-26",
    "capabilities": {
      "roots": { "listChanged": true },
      "sampling": {}
    },
    "clientInfo": {
      "name": "claude-desktop",
      "version": "1.5.0"
    }
  }
}
// Server responds:
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2025-03-26",
    "capabilities": {
      "tools": { "listChanged": true },
      "resources": { "subscribe": true },
      "prompts": { "listChanged": true }
    },
    "serverInfo": {
      "name": "filesystem-server",
      "version": "0.3.0"
    }
  }
}
Both sides declare what they support. The client says “I support roots and sampling.” The server says “I offer tools, resources, and prompts.” Now both sides know exactly what is available.
Why does this matter?
- Forward compatibility. When the protocol adds new capabilities, old servers that do not support them still work. The client simply does not use features the server did not advertise.
- Minimal surface area. A simple server that only offers two tools does not need to implement resources, prompts, or sampling. It advertises only what it supports.
- Version negotiation. Client and server agree on a protocol version. If they are incompatible, the connection fails cleanly instead of silently breaking.
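One way a client might record the negotiated state and gate feature use on it — a sketch with hypothetical names, where the capability shapes mirror the JSON exchanged above:

```typescript
// Capability shapes mirror the server's initialize response.
interface ServerCapabilities {
  tools?: { listChanged?: boolean };
  resources?: { subscribe?: boolean };
  prompts?: { listChanged?: boolean };
}

class NegotiatedSession {
  constructor(
    readonly protocolVersion: string,
    readonly capabilities: ServerCapabilities,
  ) {}

  // Only subscribe to resource updates if the server advertised support.
  canSubscribeToResources(): boolean {
    return this.capabilities.resources?.subscribe === true;
  }

  // A server that never advertised prompts should never receive prompt requests.
  offersPrompts(): boolean {
    return this.capabilities.prompts !== undefined;
  }
}
```

The key discipline: every feature check goes through the negotiated capabilities, never through assumptions about what the spec allows.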
The Six Capabilities
MCP defines six capability types. Three are server-provided (the server offers them to the client) and three are client-provided (the client offers them to the server).
Server Capabilities (things servers offer)
- Tools — Actions the AI can execute. Think verbs: “search,” “create,” “delete,” “calculate.” Each tool has a name, a description, and an input schema (JSON Schema). When the AI calls a tool, the server executes it and returns a result.
- Resources — Data the AI can read. Think nouns: “file contents,” “database schema,” “log entries.” Resources have URIs (like file:///path/to/readme.md) and MIME types. The AI reads them for context without side effects.
- Prompts — Reusable prompt templates. Think recipes: “summarize this code,” “review this PR,” “debug this error.” Prompts can accept arguments and produce multi-message prompt sequences that guide the AI through a task.
Client Capabilities (things clients offer)
- Sampling — The server can ask the AI model to generate text. This is the reverse direction: instead of the AI calling a tool, a tool asks the AI to think. Useful for servers that need AI reasoning as part of their execution.
- Roots — The client tells the server which filesystem directories it is allowed to access. This sets security boundaries — the server knows its sandbox.
- Elicitation — The server can ask the user a question through the client. For example, if a tool needs a confirmation (“Are you sure you want to delete this?”), the server can elicit a response from the human.
Key Takeaway: You do not need to support all six capabilities. Most servers you build will start with just Tools. You add Resources and Prompts when your use case calls for them. Sampling, Roots, and Elicitation are advanced features you will encounter later in the course.
The Transport Layer
MCP does not care how bytes move between client and server — it just needs a reliable, ordered, bidirectional channel. The specification defines three transport options:
Transport Options:
┌─────────────────┬──────────────────────┬────────────────────────┐
│ Transport       │ When to Use          │ How it Works           │
├─────────────────┼──────────────────────┼────────────────────────┤
│ stdio           │ Local servers,       │ Client spawns server   │
│                 │ Claude Desktop,      │ as child process.      │
│                 │ CLI tools            │ Messages on stdin/out. │
├─────────────────┼──────────────────────┼────────────────────────┤
│ SSE + HTTP      │ Remote servers,      │ Server runs on a URL.  │
│                 │ web applications,    │ SSE for server->client │
│                 │ shared servers       │ POST for client->server│
├─────────────────┼──────────────────────┼────────────────────────┤
│ Streamable HTTP │ Modern deployments,  │ Single HTTP endpoint.  │
│                 │ serverless, CDN      │ Bidirectional via      │
│                 │ compatible           │ streaming responses.   │
└─────────────────┴──────────────────────┴────────────────────────┘
For this course, we will primarily use stdio. It is the simplest: the client spawns your server as a child process and communicates over standard input/output. No networking, no HTTP, no ports. When you test with Claude Desktop or the MCP Inspector, you are using stdio.
We will cover HTTP-based transports in Phase 1 when we discuss transports and lifecycle in detail.
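The stdio transport's framing is simple: each JSON-RPC message is serialized as a single line of JSON, delimited by newlines (per the spec, a message must not contain embedded newlines, which JSON.stringify guarantees). A minimal sketch of the encode/decode helpers — the function names are illustrative:

```typescript
// One message per line on stdin/stdout.
function encodeMessage(msg: object): string {
  return JSON.stringify(msg) + "\n";
}

// Split a received buffer back into individual messages,
// skipping any blank lines.
function* decodeMessages(buffer: string): Generator<object> {
  for (const line of buffer.split("\n")) {
    if (line.trim().length > 0) yield JSON.parse(line);
  }
}
```

A real implementation also has to handle partial reads (a line split across two chunks), but the framing itself is just newline-delimited JSON.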
Try It Yourself: Identify the Architecture
For each scenario below, identify the host, the client(s), and the server(s). Write your answers down before checking.
Scenario 1:
A developer uses Claude Desktop with a configured filesystem server and a PostgreSQL server. They ask Claude to “read the schema.sql file and create the tables in my database.”
Scenario 2:
A custom chat app (built with Next.js) connects to a weather API server over HTTP. A user asks “What is the weather in Chicago?”
Scenario 3:
VS Code with an AI extension connects to three MCP servers: a code analysis server, a documentation search server, and a terminal execution server.
Reveal Answers
Scenario 1: Host = Claude Desktop. Two clients (one per server). Server A = filesystem server, Server B = PostgreSQL server. Claude reads the file via Server A, then executes SQL via Server B.
Scenario 2: Host = the Next.js chat app. One client (for the weather server). Server = weather API server. The client connects over HTTP, sends tool calls, gets weather data back.
Scenario 3: Host = VS Code + AI extension. Three clients (one per server). Three servers, each specializing in a different capability. The AI model sees all tools from all three servers as a flat list.
Common Mistakes
- Confusing client and host. The client is not a separate application. It is an internal component of the host. Claude Desktop is the host; it contains clients internally.
- Thinking servers talk to each other. In standard MCP, servers do not communicate directly. They only talk to their connected client. If you need cross-server coordination, that happens in the host.
- Assuming one client per host. A host creates as many clients as it needs — one per server. If you have 5 MCP servers configured, you have 5 clients.
- Forgetting capability negotiation. Just because a capability exists in the spec does not mean every server supports it. Always check what was negotiated.
You now have a mental map of the full MCP architecture: hosts manage clients, clients connect to servers, capabilities are negotiated at startup, and transports carry the messages. In the next module, we will zoom into the primitives — Tools, Resources, Prompts, and Sampling — and understand exactly what each one does and when to use it.