What is MCP?

The Model Context Protocol explained — why it exists, what problems it solves, and how it fits into the AI tooling landscape.

Key Concepts

  • N x M integration problem
  • Universal protocol for AI-tool connectivity
  • Stateful, bidirectional connections
  • JSON-RPC 2.0 message format
  • Host / Client / Server roles
  • Tools, Resources, and Prompts

The Problem Before MCP

Imagine you are building an AI assistant. You want it to read files from a hard drive, query a database, search the web, send emails, and check a calendar. Without a shared protocol, every single one of those integrations is custom work. The file system connector speaks a different language than the database connector, which speaks a different language than the email connector.

This is the N x M problem. If you have N AI applications (Claude, ChatGPT, Copilot, your custom agent) and M data sources (PostgreSQL, Notion, GitHub, Slack, your internal API), you end up building N x M individual integrations. Each AI app needs a bespoke adapter for each tool. Every combination is a unique piece of glue code.

Without MCP:

  Claude ----custom code----> PostgreSQL
  Claude ----custom code----> GitHub
  Claude ----custom code----> Slack
  ChatGPT ---custom code----> PostgreSQL
  ChatGPT ---custom code----> GitHub
  ChatGPT ---custom code----> Slack
  Your App --custom code----> PostgreSQL
  Your App --custom code----> GitHub
  Your App --custom code----> Slack

  3 AI apps x 3 tools = 9 integrations
  Add one more tool? 3 more integrations.
  Add one more AI app? 3 more integrations.

This does not scale. Worse, each integration has its own error handling, its own authentication model, its own data format. You spend more time writing plumbing than building actual intelligence.

The USB-C for AI Analogy

Before USB-C, you had a different cable for every device. Micro-USB for the phone, Lightning for the tablet, barrel plugs for the laptop, HDMI for the monitor. Travel with five cables, lose one, and something does not charge.

USB-C solved this by defining one physical connector and one protocol that handles power, data, and video. The cable does not care what device is on either end. The device does not care what cable you use. They both speak the same protocol.

MCP is USB-C for AI. It defines one protocol that any AI application (the host) can use to connect to any data source or tool (the server). Build your tool once as an MCP server, and every MCP-compatible AI app can use it. Build your AI app once as an MCP host, and it can connect to every MCP server.

With MCP:

  Claude  ---MCP---> PostgreSQL MCP Server
  ChatGPT ---MCP---> PostgreSQL MCP Server
  Your App ---MCP--> PostgreSQL MCP Server

  Same server, three clients. No custom code per app.
  Add a new AI app? It connects instantly.
  Add a new tool server? All apps can use it.

  N + M integrations instead of N x M.

Key Takeaway: MCP turns an N x M problem into an N + M problem. Build once, connect everywhere.
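The arithmetic is worth making concrete. A few lines of Python (for illustration only) show how the two integration counts diverge as the ecosystem grows:

```python
# Point-to-point integration needs one adapter per (app, tool) pair;
# a shared protocol needs one connector per app plus one server per tool.
def integrations_without_mcp(n_apps: int, m_tools: int) -> int:
    return n_apps * m_tools

def integrations_with_mcp(n_apps: int, m_tools: int) -> int:
    return n_apps + m_tools

# The 3 x 3 example from the diagram above:
print(integrations_without_mcp(3, 3))   # 9
print(integrations_with_mcp(3, 3))      # 6

# At ecosystem scale the gap is dramatic:
print(integrations_without_mcp(10, 50))  # 500
print(integrations_with_mcp(10, 50))     # 60
```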

What MCP Actually Defines

MCP is an open specification published by Anthropic. It defines:

  • A message format — built on JSON-RPC 2.0. Every request, response, and notification follows the same structure.
  • Three roles — Host, Client, and Server. These define who initiates connections, who manages them, and who provides capabilities.
  • Six capability types — Tools (actions the AI can take), Resources (data the AI can read), Prompts (reusable templates), Sampling (asking the AI to generate text), Roots (filesystem boundaries), and Elicitation (asking the user questions).
  • A lifecycle — how connections are initialized, how capabilities are negotiated, and how connections are terminated.
  • Transport mechanisms — stdio for local processes, HTTP with Server-Sent Events for older remote servers, and Streamable HTTP, which supersedes it for modern remote deployments.

What MCP does not define: it does not tell you what language to write your server in, what database to use, or how to design your tool's logic. It only defines the communication contract between the AI application and your server. Everything else is up to you.
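To make the message format concrete, here is what one tool invocation looks like on the wire. The `tools/call` method name comes from the MCP specification; the tool name and arguments are hypothetical, and Python dicts stand in for the JSON:

```python
import json

# A JSON-RPC 2.0 request: the client asks the server to run a tool.
# "tools/call" is the MCP method; "query_database" is a hypothetical tool.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "query_database",
        "arguments": {"sql": "SELECT count(*) FROM users"},
    },
}

# The matching response carries the same id, so the client can pair
# replies with requests even when several are in flight.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [{"type": "text", "text": "42"}]},
}

print(json.dumps(request, indent=2))
```

Every MCP message — initialization, discovery, tool calls, notifications — follows this same JSON-RPC envelope.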

MCP vs REST APIs

You might be thinking: “Why not just use REST APIs? I already know how to build those.” Fair question. Here is why MCP is different:

REST APIs are designed for applications talking to applications. A frontend calls a backend. A service calls another service. The caller knows exactly which endpoint to hit, what parameters to send, and what the response shape will be. The caller is a program following hardcoded instructions.

MCP is designed for AI models talking to tools. The AI does not have hardcoded instructions about your API. Instead, it discovers what tools are available at runtime, reads their descriptions and parameter schemas, and decides which tool to call based on the user's natural language request. This is fundamentally different.

REST API mindset:
  Developer writes: fetch("/api/users/123")
  The code knows the endpoint. Always.

MCP mindset:
  User says: "Find John's email address"
  AI reads available tools, sees "lookup_user" with schema,
  decides to call it with { name: "John" },
  gets the result, formulates an answer.

Other key differences:

  • Stateful vs stateless — REST is stateless by convention. MCP maintains a persistent connection with capability negotiation at startup.
  • Bidirectional — REST is request/response. MCP supports notifications from server to client (e.g., “a resource changed”) without the client asking.
  • Discovery — REST requires API docs or OpenAPI specs that developers read. MCP servers describe their capabilities programmatically so AI models can discover and use them autonomously.
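The bidirectional point is easiest to see in a notification: unlike a request, it carries no `id` and expects no reply. `notifications/tools/list_changed` is the MCP method a server sends when its toolset changes; the helper below is a sketch of how a client might tell the two apart:

```python
# A JSON-RPC notification pushed from server to client, unprompted.
# No "id" field: notifications are fire-and-forget, with no paired response.
notification = {
    "jsonrpc": "2.0",
    "method": "notifications/tools/list_changed",
}

# A client can distinguish notifications from requests by the missing id:
def is_notification(message: dict) -> bool:
    return "id" not in message

print(is_notification(notification))  # True
```

A plain REST backend has no equivalent: it cannot call the client at all, let alone announce "my capabilities just changed."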

MCP vs Function Calling

If you have used OpenAI's function calling or Anthropic's tool use, you have seen something similar: you describe tools with JSON schemas, the model chooses which to call, and you execute the call. So where does MCP fit in?

Function calling is the mechanism inside a single AI request. You pass tool definitions alongside the conversation, the model returns a tool call, your code executes it, you feed the result back. This happens within one API call cycle.

MCP is the protocol that provides those tool definitions. Instead of hardcoding tool schemas in your application, MCP servers dynamically expose them. The host application connects to MCP servers, discovers their tools, and passes those tool definitions to the model's function calling interface.

The relationship:

  MCP Server ──exposes tools──> MCP Client ──passes tools──> AI Model
                                                              │
                                                    model calls a tool
                                                              │
  MCP Server <──executes tool── MCP Client <──tool call────── AI Model

MCP and function calling are complementary, not competing.
MCP feeds tools INTO the function calling system.

Think of it this way: function calling is the AI model's ability to choose and invoke a tool. MCP is the protocol that puts tools on the table for the model to choose from.
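In practice, the host's glue between the two is mostly a reshaping step: take the tool definitions an MCP server exposes and hand them to the model's function-calling interface. A minimal sketch, assuming an Anthropic-style tool shape (`input_schema`) on the model side — other providers use a slightly different envelope:

```python
# Tool definitions as an MCP server exposes them (MCP field: inputSchema).
mcp_tools = [
    {
        "name": "read_file",
        "description": "Read a file from disk.",
        "inputSchema": {"type": "object",
                        "properties": {"path": {"type": "string"}},
                        "required": ["path"]},
    },
]

def to_function_calling_format(tools: list[dict]) -> list[dict]:
    """Reshape MCP tool definitions for a model API that expects
    'input_schema' (Anthropic-style; OpenAI's shape differs slightly)."""
    return [
        {
            "name": t["name"],
            "description": t["description"],
            "input_schema": t["inputSchema"],
        }
        for t in tools
    ]

model_tools = to_function_calling_format(mcp_tools)
print(model_tools[0]["input_schema"]["required"])  # ['path']
```

The host aggregates this list across every connected server, which is why the model sees one flat toolset no matter how many servers are behind it.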

Real-World Example: Claude Desktop

Let us trace how this works in practice with Claude Desktop, which is an MCP host.

  1. You configure Claude Desktop's settings file to point at MCP servers. For example, you add a filesystem server and a GitHub server.
  2. When Claude Desktop starts, it launches each MCP server as a separate process (using stdio transport) and performs a handshake: “What capabilities do you support?”
  3. The filesystem server responds: “I have tools for read_file, write_file, list_directory. Here are their schemas.”
  4. The GitHub server responds: “I have tools for search_repos, create_issue, list_pull_requests. Here are their schemas.”
  5. Now Claude Desktop has a combined toolset. When you ask Claude “Read the README in my project and create a GitHub issue summarizing it,” the model can call read_file from one server and create_issue from another, seamlessly.
  6. The user never thinks about which server does what. The AI orchestrates across multiple MCP servers transparently.

This is the power of a universal protocol. Each server is independently developed, independently maintained, and independently configured. But from the model's perspective, they all look the same: a set of tools with descriptions and schemas.
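Step 1 above, concretely: Claude Desktop reads a configuration file whose mcpServers section lists each server and the command that launches it. The paths, tokens, and package names below are illustrative, not prescriptive:

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/projects"]
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "<your-token>" }
    }
  }
}
```

Each entry becomes a child process launched over stdio transport; the handshake in step 2 happens per entry.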

Key Takeaway

MCP is a standardized protocol that lets any AI application connect to any tool or data source through a common interface. It replaces custom integrations with a universal contract: servers expose capabilities, clients discover and use them, and the AI model orchestrates everything through natural language.

Check Your Understanding

Before moving on, make sure you can answer these questions. Try to recall from memory before scanning back through the text:

  1. What is the N x M problem, and how does MCP reduce it to N + M?
  2. Name the three roles in the MCP architecture.
  3. What is the difference between how a REST API and an MCP server are consumed? Who decides which endpoint to call?
  4. How do MCP and function calling relate to each other? Are they alternatives or complements?
  5. When Claude Desktop connects to an MCP server at startup, what information does the server provide?

If you got stuck on any of these, re-read the relevant section above. These concepts are the foundation for everything that follows.


Common Mistakes at This Stage

  • Confusing MCP with an API framework. MCP does not replace Express, FastAPI, or tRPC. It is a protocol for AI-to-tool communication, not a general-purpose API framework.
  • Thinking MCP replaces function calling. It does not. MCP feeds tools into the function calling system. They work together.
  • Assuming MCP is Anthropic-only. MCP is an open spec. Any AI provider or application can implement it. It is not tied to Claude.
  • Overcomplicating it. At its core, MCP is just: “Here are the tools I offer” + “Call this tool with these parameters” + “Here is the result.” The rest is details.