You have built an MCP server. You can register tools, validate inputs, and return results. Now comes the harder question: how do you design tools well? A poorly-designed tool suite confuses the AI model, wastes tokens, and leads to unreliable behavior. A well-designed tool suite feels invisible — the model picks the right tool on the first try, every time.
This module covers four design patterns that separate amateur MCP servers from production-grade ones. Each pattern includes a before/after comparison so you can see the difference in practice.
The Polymorphic Pattern
The problem: you have an entity (say, GitHub issues) and you need tools for creating, reading, updating, listing, and closing them. The naive approach is to register five separate tools.
```typescript
// BAD: Five separate tools for one entity
server.tool("create_issue", { title: z.string(), body: z.string() }, ...);
server.tool("get_issue", { number: z.number() }, ...);
server.tool("update_issue", { number: z.number(), title: z.string().optional() }, ...);
server.tool("list_issues", { state: z.enum(["open", "closed"]) }, ...);
server.tool("close_issue", { number: z.number() }, ...);
```

This works, but it has problems. Each tool definition is sent to the AI model as part of its context. Five tools means five JSON schemas the model must read and reason about. Multiply this across every entity in your server and you quickly hit 20, 30, 50 tools. The model's accuracy in selecting the right tool drops sharply after roughly 15 tools.
The fix: use a single polymorphic tool with an `action` parameter.
```typescript
// GOOD: One polymorphic tool for issue management
server.tool("manage_issues", {
  action: z.enum(["create", "get", "update", "list", "close"]),
  number: z.number().optional().describe("Issue number (required for get/update/close)"),
  title: z.string().optional().describe("Issue title (required for create, optional for update)"),
  body: z.string().optional().describe("Issue body"),
  state: z.enum(["open", "closed"]).optional().describe("Filter for list action"),
}, async ({ action, number, title, body, state }) => {
  switch (action) {
    case "create":
      return await createIssue(title!, body);
    case "get":
      return await getIssue(number!);
    case "update":
      return await updateIssue(number!, { title, body });
    case "list":
      return await listIssues(state);
    case "close":
      return await closeIssue(number!);
  }
});
```

Token efficiency analysis: five tools with their schemas might consume ~800 tokens of context. One polymorphic tool with the same capabilities uses ~250 tokens. That is a 3x reduction — and the model only needs to match one tool name instead of choosing between five similar ones.
Smell test: if you have three or more tools that operate on the same noun (issues, users, files), consider merging them into a polymorphic tool.
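One wrinkle the polymorphic pattern introduces: fields that are optional in the schema become conditionally required per action, and the non-null assertions (`title!`, `number!`) will throw an unhelpful error if the model omits one. A minimal fail-fast validation sketch — the `REQUIRED_FIELDS` map and `missingFields` helper are illustrative names, not part of the MCP SDK:

```typescript
type Action = "create" | "get" | "update" | "list" | "close";

// Which optional schema fields each action actually requires
const REQUIRED_FIELDS: Record<Action, string[]> = {
  create: ["title"],
  get: ["number"],
  update: ["number"],
  list: [],
  close: ["number"],
};

// Return the names of required fields that are missing for this action
function missingFields(action: Action, args: Record<string, unknown>): string[] {
  return REQUIRED_FIELDS[action].filter((field) => args[field] === undefined);
}

// Inside the handler, before the switch:
// const missing = missingFields(action, { number, title, body, state });
// if (missing.length > 0) {
//   return { content: [{ type: "text",
//     text: `Missing parameters for "${action}": ${missing.join(", ")}` }] };
// }
```

Returning a clear "missing parameters" message lets the model correct itself on the next call instead of receiving an opaque runtime exception.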
The Workflow Pattern
The problem: some operations require multiple steps that always happen together. Deploying an application means validating the config, running tests, building artifacts, and pushing to production. If you expose each step as a separate tool, the AI model has to orchestrate the sequence itself — and it will sometimes skip steps, reorder them, or forget error handling between stages.
```typescript
// BAD: Four tools that MUST be called in sequence
server.tool("validate_config", ...);
server.tool("run_tests", ...);
server.tool("build_artifacts", ...);
server.tool("push_to_production", ...);

// The AI has to figure out the right order and handle
// failures at each step. It often gets this wrong.
```

The fix: create a workflow tool that encapsulates the entire multi-step process. The server handles sequencing, error propagation, and rollback internally.
```typescript
// GOOD: One workflow tool that handles the full process
server.tool("deploy", {
  environment: z.enum(["staging", "production"]),
  skip_tests: z.boolean().default(false).describe("Skip test step (use with caution)"),
}, async ({ environment, skip_tests }) => {
  const steps = [];

  // Step 1: Validate
  const config = await validateConfig(environment);
  steps.push({ step: "validate", status: "passed" });

  // Step 2: Test (optional)
  if (!skip_tests) {
    const testResult = await runTests();
    if (!testResult.passed) {
      return {
        content: [{ type: "text", text: JSON.stringify({
          status: "failed",
          failed_at: "tests",
          completed_steps: steps,
          errors: testResult.errors,
        }, null, 2) }],
      };
    }
    steps.push({ step: "tests", status: "passed" });
  }

  // Step 3: Build
  const artifacts = await buildArtifacts(config);
  steps.push({ step: "build", status: "passed", artifact_id: artifacts.id });

  // Step 4: Deploy
  const deployment = await pushToProduction(artifacts, environment);
  steps.push({ step: "deploy", status: "passed", url: deployment.url });

  return {
    content: [{ type: "text", text: JSON.stringify({
      status: "success",
      completed_steps: steps,
      deployment_url: deployment.url,
    }, null, 2) }],
  };
});
```

The workflow pattern gives the AI a single entry point. The model says "deploy to staging" and gets back a structured result showing what happened at each step. If something fails, the response tells it exactly where and why — no guesswork.
Smell test: if you find yourself writing prompt instructions like “always call tool A before tool B,” you need a workflow tool instead.
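The sequencing logic inside a workflow tool tends to repeat across workflows, so it can be factored into a generic step runner that records progress and stops at the first failure. A sketch under assumed types — `Step`, `StepResult`, and `runSteps` are illustrative, not part of the MCP SDK:

```typescript
type Step = { name: string; run: () => Promise<void> };
type StepResult = { step: string; status: "passed" | "failed"; error?: string };

// Run steps in order; record each outcome and stop at the first failure
async function runSteps(steps: Step[]): Promise<{ ok: boolean; results: StepResult[] }> {
  const results: StepResult[] = [];
  for (const step of steps) {
    try {
      await step.run();
      results.push({ step: step.name, status: "passed" });
    } catch (err) {
      // Report exactly which step failed and why, then halt the workflow
      results.push({ step: step.name, status: "failed", error: String(err) });
      return { ok: false, results };
    }
  }
  return { ok: true, results };
}
```

A workflow handler like `deploy` could then declare its steps as a list and serialize `results` into the structured response, so every step gets uniform error capture instead of hand-written checks between stages.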
The Idempotent Pattern
AI agents retry. Network hiccups, timeouts, and model reasoning loops all lead to the same tool being called multiple times with the same arguments. If your tool creates a new record every time, you get duplicates. If it charges money every time, you have a billing nightmare.
Idempotent tools produce the same result regardless of how many times they are called with the same input.
```typescript
// BAD: Creates duplicate entries on retry
server.tool("add_todo", {
  title: z.string(),
}, async ({ title }) => {
  // Every call creates a new row, even with the same title
  const todo = await db.todos.create({ title });
  return { content: [{ type: "text", text: `Created todo #${todo.id}` }] };
});
```

```typescript
// GOOD: Idempotent — uses upsert with a natural key
server.tool("add_todo", {
  title: z.string(),
  idempotency_key: z.string().optional()
    .describe("Unique key to prevent duplicates on retry"),
}, async ({ title, idempotency_key }) => {
  const key = idempotency_key || generateKeyFromTitle(title);
  const todo = await db.todos.upsert({
    where: { idempotency_key: key },
    create: { title, idempotency_key: key },
    update: {}, // No-op if already exists
  });
  return {
    content: [{ type: "text", text: JSON.stringify({
      id: todo.id,
      title: todo.title,
      already_existed: !todo._created, // assumes the data layer flags newly created rows
    }, null, 2) }],
  };
});
```

Three strategies for idempotency:
- Upsert with natural keys — use a unique constraint on a business-meaningful field (email, slug, title + date).
- Idempotency keys — accept an optional key parameter. The AI can send the same key on retries.
- Check-then-act — before creating, check if the resource already exists. Return the existing one if it does.
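For the natural-key strategy, the key derivation itself must be deterministic: the same logical input has to map to the same key on every retry. A minimal sketch using Node's built-in crypto module — the `naturalKey` helper and its normalization rules are assumptions for this example:

```typescript
import { createHash } from "node:crypto";

// Derive a stable key from business-meaningful fields (here: title + date).
// Normalizing first means trivial variations ("Buy milk" vs "buy MILK ")
// still map to the same key, so retries hit the same row.
function naturalKey(title: string, date: string): string {
  const normalized = title.trim().toLowerCase();
  return createHash("sha256")
    .update(`${normalized}|${date}`)
    .digest("hex")
    .slice(0, 16); // short, index-friendly key
}
```

The derived value would go into the `where: { idempotency_key: key }` clause of the upsert, giving the BAD/GOOD example above a concrete `generateKeyFromTitle`-style implementation.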
Smell test: any tool that creates, deletes, or charges something should be idempotent. Read-only tools are inherently idempotent.
Semantic Naming
The AI model selects tools based on their name and description. Names that describe implementation details confuse the model. Names that describe intent guide it.
```typescript
// BAD: Named by implementation
"query_postgres_fts"     // What if you switch to Elasticsearch?
"call_openai_gpt4"       // What if you switch models?
"fetch_rest_endpoint"    // Every REST call would match this
"run_sql_select"         // The AI doesn't think in SQL

// GOOD: Named by intent
"search_knowledge_base"  // Clear what it does, not how
"generate_summary"       // Intent-based, implementation-agnostic
"get_user_profile"       // Domain-specific, unambiguous
"find_similar_documents" // Describes the outcome
```

Naming rules:
- Use the user's language — name tools the way a user would describe the action, not the way a developer would implement it.
- Use the user's language — name tools the way a user would describe the action, not the way a developer would implement it.
- Be specific — `search_knowledge_base` is better than `search`. The model sees tools from all connected servers; generic names collide.
- Verb + noun — `create_invoice`, `list_contacts`, `analyze_sentiment`. Consistent structure helps the model pattern-match.
- Avoid implementation leaks — no database names, no API provider names, no internal service names.
Descriptions matter even more than names. A good description tells the model when to use the tool, not just what it does:
```typescript
// BAD description
"Queries the database for matching records"

// GOOD description
"Search the knowledge base for articles matching a query.
Use this when the user asks about a topic, wants to find
documentation, or needs help with a specific feature.
Returns the top 5 most relevant results with snippets."
```

Decision Matrix: When to Use Which Pattern
Use this matrix to decide which pattern fits your situation:
| Situation | Pattern | Example |
|----------------------------------------|---------------|----------------------------------|
| Multiple CRUD operations on one entity | Polymorphic | manage_issues, manage_users |
| Multi-step process with fixed order | Workflow | deploy, onboard_customer |
| Tool creates/modifies external state | Idempotent | add_todo, create_subscription |
| Tool reads data (no side effects) | Idempotent | Already idempotent by nature |
| Tool name describes "how" not "what" | Rename | query_pg -> search_knowledge |
| >15 tools on your server | Polymorphic | Merge related tools |
| AI calls tools in wrong order | Workflow | Combine into one orchestrated |
| Duplicates appear on retries | Idempotent | Add upsert or idempotency keys |

These patterns are not mutually exclusive. A workflow tool should also be idempotent. A polymorphic tool should use semantic naming. Layer them together for the best results.
Exercise: Redesign a Poorly-Designed Tool Suite
Here is a badly-designed MCP server for a project management app. It has 12 tools:
```typescript
// The mess you inherited:
"create_project"
"get_project"
"update_project"
"delete_project"
"create_task"
"get_task"
"update_task"
"delete_task"
"assign_task"
"move_task_to_column"
"add_comment_to_task"
"get_task_comments"
```

Your challenge:
- Apply the polymorphic pattern to reduce the tool count. Which tools should be merged?
- Identify any operations that should be a workflow tool. (Hint: does creating a project always require setting up default columns?)
- Which tools need idempotency? What natural keys would you use?
- Rename any tools that leak implementation details.
Target: reduce from 12 tools to 3-4 well-designed ones. Write out the tool names, their action enums (if polymorphic), and one-sentence descriptions. Compare your answer to the solution pattern: `manage_projects` (polymorphic), `manage_tasks` (polymorphic with assign/move/comment as actions), and `setup_project` (workflow that creates project + default columns + invites team).
Check Your Understanding
- What problem does the polymorphic pattern solve? When would you not use it?
- Explain why the workflow pattern is safer than exposing individual steps as separate tools.
- Name three strategies for making a tool idempotent.
- A tool is named `pg_full_text_search`. What is wrong with this name, and what would you rename it to?
- Your server has 22 tools and the AI frequently calls the wrong one. Which patterns would you apply to fix this?
Key Takeaway
Good tool design is not about exposing every operation — it is about giving the AI model the fewest, clearest choices that cover the most ground. Merge related operations with the polymorphic pattern, hide complexity with workflows, protect against retries with idempotency, and name everything by intent. The best MCP servers feel like they only have a handful of tools, but each one is remarkably capable.