MCP Server Development

Adapted from anthropics/skills/mcp-builder. MCP-server quality is measured by how well it lets LLMs accomplish real-world tasks — not by endpoint count.

Stack default for our projects

Language: TypeScript (matches our stack; static typing + Zod schemas + good LLM code-gen)
Transport: stdio for local tools, Streamable HTTP (stateless JSON) for remote
SDK: @modelcontextprotocol/sdk
Package manager: pnpm (never npm/yarn in our repos)

Phase 1 — Research & Plan

1.1 Design principles

API coverage vs. workflow tools. Balance comprehensive endpoint coverage with specialized workflow shortcuts. Default to coverage unless you have a clear reason — agents compose basic tools well; workflow tools ossify.

Tool naming & discoverability. Consistent prefix + action verb. Examples:

github_create_issue, github_list_repos
gitlab_search_issues, gitlab_close_mr
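
The naming convention above can be checked mechanically, for example in a unit test that walks every registered tool. The following is a minimal sketch, assuming snake_case names of the form service_verb_object. `isValidToolName` and the `ACTION_VERBS` list are hypothetical and not part of any SDK.

```typescript
// Hypothetical allow-list of action verbs; extend it to match your own tools.
const ACTION_VERBS: string[] = ['create', 'list', 'get', 'search', 'update', 'close', 'delete'];

// Returns true when `toolName` looks like `<service>_<verb>_<object>`,
// e.g. `github_create_issue`.
function isValidToolName(toolName: string, service: string): boolean {
  const parts = toolName.split('_');
  if (parts.length < 3) return false; // need prefix, verb, and at least one object word
  const [prefix, verb, ...rest] = parts;
  return (
    prefix === service &&
    ACTION_VERBS.indexOf(verb) !== -1 &&
    rest.every((word) => /^[a-z0-9]+$/.test(word))
  );
}
```

With this check, `isValidToolName('gitlab_close_mr', 'gitlab')` passes, while a name such as `createIssue` is rejected because it has no service prefix.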

Context management. Return focused, paginated data. Agents suffer when a single tool call floods context.

Actionable error messages. Errors must guide the next action:

❌ "Invalid input"
✅ "Field 'project_id' is required. Call gitlab_list_projects to enumerate available IDs."
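
One way to keep error messages actionable is to make the next step a required argument of the helper that builds them. The sketch below assumes the MCP tool-result shape with an `isError` flag and text `content`. The name `actionableError` is hypothetical.

```typescript
// Shape of an MCP tool result carrying an error (text content only, for brevity).
interface ToolErrorResult {
  isError: true;
  content: { type: 'text'; text: string }[];
}

// Builds an error the model can act on: what went wrong, then what to do next.
function actionableError(problem: string, nextStep: string): ToolErrorResult {
  return {
    isError: true,
    content: [{ type: 'text', text: `${problem} ${nextStep}` }],
  };
}

const missingProject = actionableError(
  "Field 'project_id' is required.",
  'Call gitlab_list_projects to enumerate available IDs.',
);
```

Because `nextStep` is not optional, a bare "Invalid input" cannot be produced by accident.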

1.2 Read the spec

Sitemap: https://modelcontextprotocol.io/sitemap.xml
Append .md to any page URL for markdown (e.g. https://modelcontextprotocol.io/specification/draft.md)

Focus on: tool definitions, resource definitions, transport mechanisms.

1.3 Load SDK docs

TS SDK README: https://raw.githubusercontent.com/modelcontextprotocol/typescript-sdk/main/README.md
Python SDK README: https://raw.githubusercontent.com/modelcontextprotocol/python-sdk/main/README.md

Fetch via WebFetch only when needed — don't dump entire docs into context upfront.

1.4 Plan implementation

Review the target service's API docs (auth, core endpoints, data models)
List endpoints by priority — most-common operations first
Identify destructive vs. read-only operations (matters for tool annotations)

Tool-Hosting Pattern — In-Process vs Stdio MCP

Before writing a line of implementation code, choose a hosting pattern. The wrong choice cannot be refactored cheaply once tooling is wired.

Decision tree

≤ 5 tools AND latency-critical (<50ms tool resolution)?
│
├─ Yes → tools share the SDK process AND no external auth required?
│        │
│        ├─ Yes → In-process @tool decorator (single-process, sub-ms resolution)
│        └─ No  → Stdio MCP Server
│
└─ No  → Stdio MCP Server
         (≥ 6 tools, external auth, language/runtime mismatch, long-lived process)
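
The decision tree can also be written down as a small function, which makes the choice explicit in a design review. This is a sketch only: the `HostingInputs` shape and the `chooseHostingPattern` name are hypothetical, with one field per question in the tree.

```typescript
type HostingPattern = 'in-process' | 'stdio-mcp-server';

// Hypothetical input shape: one field per question in the decision tree.
interface HostingInputs {
  toolCount: number;
  latencyCritical: boolean;   // needs < 50 ms tool resolution
  sharesSdkProcess: boolean;  // tools live inside the SDK process
  needsExternalAuth: boolean;
}

function chooseHostingPattern(inputs: HostingInputs): HostingPattern {
  const smallAndFast = inputs.toolCount <= 5 && inputs.latencyCritical;
  if (smallAndFast && inputs.sharesSdkProcess && !inputs.needsExternalAuth) {
    return 'in-process';
  }
  // Everything else: 6+ tools, external auth, or no hard latency requirement.
  return 'stdio-mcp-server';
}
```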

In-process @tool decorator (Python — anthropics/claude-agent-sdk-python)

Use create_sdk_mcp_server when your tools live entirely inside the SDK process and you need the lowest possible latency. Source reference: examples/mcp_calculator.py L11–99.

from claude_agent_sdk import tool, create_sdk_mcp_server

@tool(name="add", description="Add two numbers", input_schema={"a": int, "b": int})
async def add(args):
    return {"content": [{"type": "text", "text": str(args["a"] + args["b"])}]}

server = create_sdk_mcp_server(name="calc", version="1.0.0", tools=[add])

In-process registration (TypeScript — @modelcontextprotocol/sdk)

Our default stack uses McpServer.registerTool() from @modelcontextprotocol/sdk. The inline Zod schema is parsed at registration time — no separate schema file needed for small tool sets.

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { z } from 'zod';

const server = new McpServer({ name: 'calc', version: '1.0.0' });

server.registerTool(
  'add',
  {
    title: 'Add two numbers',
    description: 'Return the sum of two numbers.',
    inputSchema: { a: z.number(), b: z.number() },
  },
  async ({ a, b }) => ({
    content: [{ type: 'text', text: String(a + b) }],
  }),
);

Tool annotations — `readOnlyHint` and `destructiveHint`

Annotations are first-class SDK metadata that Claude and downstream hooks use for permission decisions. Set them on every tool:

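import { unlink } from 'node:fs/promises';

server.registerTool(
  'delete-file',
  {
    title: 'Delete a file',
    description: 'Permanently delete the file at the given path.',
    inputSchema: { path: z.string() },
    // Not read-only, and the deletion cannot be undone.
    annotations: { readOnlyHint: false, destructiveHint: true },
  },
  // Illustrative handler; replace with your own implementation.
  async ({ path }) => {
    await unlink(path);
    return { content: [{ type: 'text', text: `Deleted ${path}` }] };
  },
);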

readOnlyHint: true signals that the tool only reads state. Annotations are advisory hints, so a client may skip the permission prompt for such a tool but is not obliged to.
destructiveHint: true signals potentially irreversible side effects, and is only meaningful when readOnlyHint is false. Our pre-bash-destructive-guard hook and agents/security-reviewer.md both elevate review priority for tools carrying this flag. Any tool that deletes, overwrites, or mutates shared state must set this.
Missing destructiveHint: true on a destructive tool is a known pitfall; see the "Common pitfalls" table below.

Pattern comparison

Aspect	In-Process @tool	Stdio MCP Server
Tool count	≤ 5	6+
Latency	Sub-ms resolution	5–50 ms IPC overhead
Auth complexity	Shares SDK auth	Separate auth context
Language constraint	Must match SDK	Any runtime
Process isolation	None (in-SDK)	Full (separate child)
Lifecycle	Bound to SDK session	Long-lived independent

For the stdio MCP server implementation path (≥ 6 tools, external auth, or language mismatch), continue with Phase 2 — Implementation below, which covers project structure, core infrastructure, and the full TypeScript stdio setup.

Phase 2 — Implementation

2.1 Project structure (TypeScript)

mcp-server-name/
├── package.json
├── tsconfig.json
├── src/
│   ├── index.ts            (server entry, transport wiring)
│   ├── tools/              (one file per tool or tool group)
│   ├── schemas.ts          (shared Zod schemas)
│   └── client.ts           (API client with auth + error handling)
└── README.md               (setup + config)

2.2 Core infrastructure

Build once, reuse everywhere:

API client with auth (env-var-driven, never hardcoded)
Error-handler helper that returns actionable MCP error responses
Pagination helper (most APIs paginate; most tools forget)
Response formatter (JSON for structured, Markdown for human-readable where agents benefit from it)
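
As an illustration of the pagination helper in the list above, here is a minimal sketch. It assumes the cursor is a stringified offset, which is a simplification, since real APIs often hand out opaque tokens. `PageResult` and `paginate` are hypothetical names.

```typescript
interface PageResult<T> {
  items: T[];
  nextCursor?: string; // absent on the last page
}

// Slices `all` into pages. The cursor is a stringified offset (an assumption
// made for this sketch); the default page size keeps responses small.
function paginate<T>(all: T[], cursor: string | undefined, pageSize: number = 20): PageResult<T> {
  const start = cursor === undefined ? 0 : parseInt(cursor, 10);
  if (isNaN(start) || start < 0) {
    throw new Error(`Invalid cursor '${cursor}'. Omit it to start from the first page.`);
  }
  const end = start + pageSize;
  const page: PageResult<T> = { items: all.slice(start, end) };
  if (end < all.length) {
    page.nextCursor = String(end); // tell the agent how to fetch the next page
  }
  return page;
}
```

Returning `nextCursor` only when more data exists gives the agent an unambiguous stop signal.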

2.3 Implement tools

For each tool:

Input schema — Zod, with descriptions per field:

z.object({
  projectId: z.string().describe("GitLab project ID. Call gitlab_list_projects to discover."),
  state: z.enum(["opened", "closed", "all"]).default("opened"),
});

Output schema — define outputSchema where possible; use structuredContent in tool responses (TS SDK feature). This helps downstream agents parse results.
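
A tool result that carries structured data might look like the sketch below. It returns the typed object in `structuredContent` and repeats it as JSON text, so clients that ignore `structuredContent` still receive the data. The `IssueCount` type and the `toStructuredResult` helper are hypothetical.

```typescript
// Hypothetical payload type; in a real tool it would mirror the outputSchema.
interface IssueCount {
  project: string;
  openIssues: number;
}

// Result with both representations: typed data for clients that understand
// `structuredContent`, and the same data as JSON text for those that do not.
function toStructuredResult(data: IssueCount) {
  return {
    structuredContent: data,
    content: [{ type: 'text' as const, text: JSON.stringify(data) }],
  };
}

const issueResult = toStructuredResult({ project: 'demo', openIssues: 3 });
```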

Annotations — set all four:

readOnlyHint: true/false
destructiveHint: true/false
idempotentHint: true/false
openWorldHint: true/false

These inform Claude's hook decisions (destructive-guard, permission prompts).
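
When wrapping a REST API, a first draft of the four annotations can be derived from the HTTP method and then reviewed by hand. The mapping below is an assumption made for illustration (a POST endpoint can still delete data, for instance), and `defaultAnnotations` is a hypothetical helper.

```typescript
interface ToolAnnotations {
  readOnlyHint: boolean;
  destructiveHint: boolean;
  idempotentHint: boolean;
  openWorldHint: boolean;
}

type HttpMethod = 'GET' | 'POST' | 'PUT' | 'PATCH' | 'DELETE';

// Rough defaults from HTTP semantics. Treat the output as a starting point
// and correct it per tool; the mapping is an assumption, not a rule.
function defaultAnnotations(method: HttpMethod): ToolAnnotations {
  return {
    readOnlyHint: method === 'GET',
    // Overwrites and deletions are flagged; plain creation (POST) is not.
    destructiveHint: method === 'PUT' || method === 'PATCH' || method === 'DELETE',
    idempotentHint: method === 'GET' || method === 'PUT' || method === 'DELETE',
    openWorldHint: true, // the tool talks to an external service
  };
}
```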

Implementation — async/await for I/O; errors must surface with enough context for the LLM to fix them.
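
To make sure errors always surface, each handler can be wrapped once rather than adding a try/catch by hand in every tool. This is a minimal sketch with simplified types: `ToolResult`, `ToolHandler`, and `withErrorHandling` are hypothetical names, not SDK exports.

```typescript
interface ToolResult {
  isError?: boolean;
  content: { type: 'text'; text: string }[];
}

type ToolHandler<A> = (args: A) => Promise<ToolResult>;

// Wraps a handler so a thrown error becomes an error result the model can read,
// instead of an unhandled rejection. `toolLabel` is only used in the message.
function withErrorHandling<A>(toolLabel: string, handler: ToolHandler<A>): ToolHandler<A> {
  return async (args: A): Promise<ToolResult> => {
    try {
      return await handler(args);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      return {
        isError: true,
        content: [{ type: 'text', text: `${toolLabel} failed: ${reason}` }],
      };
    }
  };
}
```

The wrapped function has the same signature as the original, so it can be passed wherever the bare handler was used.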

Phase 3 — Review & Test

3.1 Code quality

DRY — no duplicated API-call logic
Consistent error handling (one helper, not ad-hoc throws)
Full TypeScript coverage — tsgo --noEmit or tsc --noEmit clean
Clear tool descriptions

3.2 Build & test

pnpm build                                 # or npm run build in non-pnpm projects
pnpm dlx @modelcontextprotocol/inspector   # interactive testing UI (npx works outside our repos)

Walk through every tool in the Inspector. If a tool can fail, trigger the failure and verify the error message is actionable.

Phase 4 — Evaluations

Create 10 evaluation questions. An MCP server without evals is a guess, not a deliverable.

Each question must be:

Independent — doesn't depend on a previous question's answer
Read-only — no destructive side effects
Complex — requires multiple tool calls, not a single lookup
Realistic — a real user would actually ask this
Verifiable — has a single correct answer checkable by string comparison
Stable — answer doesn't change over time

Output format

<evaluation>
  <qa_pair>
    <question>Which GitLab project in group 'X' has the highest number of open issues labeled 'bug'?</question>
    <answer>project-name-here</answer>
  </qa_pair>
</evaluation>

Run the eval by connecting Claude to the MCP server, asking each question, and comparing its output to the expected answer. An accuracy below 80% (fewer than 8 of the 10 questions correct) signals tool-design problems, usually unclear descriptions, missing pagination, or bad error messages.
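
Scoring can be automated with a few lines. The sketch below assumes the flat XML format shown above and exact string comparison after trimming. It reads the pairs with a regex, which is adequate for this format but not a general XML parser. `parseQaPairs`, `accuracy`, and `PASS_THRESHOLD` are hypothetical names, and collecting the model's answers is left out.

```typescript
interface QaPair {
  question: string;
  answer: string;
}

// Extracts <qa_pair> entries from the evaluation file with a regex.
// Good enough for this flat format; use a real XML parser for anything richer.
function parseQaPairs(xml: string): QaPair[] {
  const pairs: QaPair[] = [];
  const re = /<qa_pair>\s*<question>([\s\S]*?)<\/question>\s*<answer>([\s\S]*?)<\/answer>\s*<\/qa_pair>/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(xml)) !== null) {
    pairs.push({ question: m[1].trim(), answer: m[2].trim() });
  }
  return pairs;
}

// Fraction of questions whose model output equals the expected answer
// after trimming whitespace. `actual` is keyed by question text.
function accuracy(pairs: QaPair[], actual: { [question: string]: string }): number {
  if (pairs.length === 0) return 0;
  let correct = 0;
  for (const pair of pairs) {
    if ((actual[pair.question] || '').trim() === pair.answer) correct++;
  }
  return correct / pairs.length;
}

const PASS_THRESHOLD = 0.8; // the 80% bar described above
```

A run passes when `accuracy(...) >= PASS_THRESHOLD`. Anything lower is a prompt to revisit tool descriptions, pagination, and error messages before touching the questions.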

Common pitfalls

Pitfall	Fix
Tool returns 10k rows, agent context blows up	Add pagination + default page size
Agent can't figure out auth failure	Error message: "Set ENV_VAR_NAME — current value is empty"
Tool name collision across MCP servers	Always prefix with service name
Destructive tools without `destructiveHint: true`	Set `destructiveHint: true` on every tool that deletes or overwrites; omitting it breaks our destructive-guard hook
Async errors swallowed	Wrap every handler in a try/catch that returns a structured error
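
The auth-failure row above can be handled at startup with a small guard. This is a sketch under stated assumptions: `requireEnv` is a hypothetical helper, and it takes the environment as a plain record (pass `process.env` in Node) so it stays easy to test. `GITLAB_TOKEN` in the usage note is an example variable name.

```typescript
// Reads a required setting from an env-like record.
// Throws with the variable name so the agent knows exactly what to set.
function requireEnv(env: { [key: string]: string | undefined }, key: string): string {
  const value = env[key];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Set ${key}: current value is empty.`);
  }
  return value;
}
```

Calling `requireEnv(process.env, 'GITLAB_TOKEN')` when the API client is constructed fails fast with a message that names the missing variable, rather than surfacing later as an opaque 401.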

References

Upstream reference material (worth reading once, not mirroring here):