To build a custom MCP server, define tools as typed functions, validate inputs against a JSON Schema, and expose them over a transport (stdio, Streamable HTTP, or the new stateless HTTP shipping in the June 2026 spec). This guide builds the same server twice -- in Python with FastMCP and in TypeScript with the official @modelcontextprotocol/sdk -- exposing one tool that searches a Notion database. By the end you will have a server registered in Claude Desktop and Cursor, tested with the MCP Inspector, and ready to deploy.
What is an MCP server and why build one?
An MCP (Model Context Protocol) server is a process that exposes tools, resources, and prompts to an LLM client over JSON-RPC. The client (Claude Desktop, Cursor, ChatGPT) calls your server when the model needs to take an action -- query a database, hit an API, read a file -- and your server returns structured results.
MCP is now the default integration standard for agents. According to MCP Manager's 2026 adoption report, the public MCP server registry grew from 1,200 servers in Q1 2025 to 9,400+ by April 2026, with 78% of enterprise AI teams running at least one MCP-backed agent in production.
You build a custom MCP server when:
- An off-the-shelf server in the official MCP servers list doesn't cover your internal API.
- You want to expose a private database, queue, or service to Claude or Cursor without writing a one-off integration per client.
- You need fine-grained auth, logging, or schema control that public servers don't offer.
If you are still deciding between MCP and OpenAI-style function calling, read MCP vs function calling first.
How does the MCP protocol work?
MCP uses JSON-RPC 2.0 over a transport. The client and server perform an initialize handshake, negotiate capabilities, then exchange messages: tools/list, tools/call, resources/read, prompts/get. The server can stream partial results back as Server-Sent Events when using HTTP.
Three primitives matter:
- Tools -- functions the LLM can call. Each has a name, description, JSON Schema for inputs, and a handler. This is the most-used primitive.
- Resources -- file-like blobs the client can read (database schemas, file contents, docs).
- Prompts -- reusable templates the client can surface to users.
For the deeper protocol walkthrough, see our Model Context Protocol explained breakdown. For everything else in this guide, you only need tools.
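The request/response shapes behind tools/call are plain JSON-RPC 2.0 objects. A sketch of one round trip (the tool name and arguments are illustrative, matching the server built later in this guide):

```python
import json

# Illustrative JSON-RPC 2.0 frames for one tools/call round trip.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_notion",
        "arguments": {"query": "roadmap", "page_size": 5},
    },
}

# A successful result wraps content blocks; a failed tool call sets isError.
response = {
    "jsonrpc": "2.0",
    "id": 1,  # must echo the request id
    "result": {
        "content": [{"type": "text", "text": "[{\"id\": \"...\"}]"}],
        "isError": False,
    },
}

# On the stdio transport, frames are newline-delimited JSON.
frame = json.dumps(request)
print(frame)
```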
What's the difference between FastMCP and the official MCP SDK?
FastMCP is a higher-level Python framework; the official MCP SDK is the lower-level reference implementation. FastMCP 1.0 was merged into the official Python SDK in 2024, but FastMCP continued as a separately maintained project and reached 3.0 on January 19, 2026 with component versioning, granular auth, and OpenTelemetry support, per the Apigene FastMCP 3.0 review.
| Feature | FastMCP (Python) | Official MCP SDK (Python / TS) |
|---|---|---|
| Abstraction level | High -- decorators, type-hint inference | Low -- explicit handlers, manual schemas |
| Boilerplate for one tool | ~10 lines | ~30 lines |
| Transport detection | Automatic via fastmcp run | Manual |
| Schema validation | Auto from type hints | Manual via Zod (TS) or pydantic (Py) |
| Best for | Servers with <50 tools, prototyping | Custom transports, protocol-level control |
| Auth, OTel, versioning | Built-in (3.0+) | Build it yourself |
Use FastMCP for Python unless you have a specific reason to drop down. Use the official @modelcontextprotocol/sdk for TypeScript -- there's no equivalent high-level TS framework with the same maturity.
How do I build an MCP server in Python with FastMCP?
Build a FastMCP server in five steps: install the package, define a tool with a decorator, validate inputs via type hints, call the Notion API, and run with mcp.run(). The whole server is under 50 lines.
1. Install dependencies
```bash
uv add "fastmcp>=3.0" httpx python-dotenv
```
2. Define the server and a tool
Create server.py:
```python
import os

import httpx
from fastmcp import FastMCP
from dotenv import load_dotenv

load_dotenv()

mcp = FastMCP("notion-search")

NOTION_TOKEN = os.environ["NOTION_TOKEN"]
NOTION_API = "https://api.notion.com/v1"
HEADERS = {
    "Authorization": f"Bearer {NOTION_TOKEN}",
    "Notion-Version": "2025-09-03",
    "Content-Type": "application/json",
}


@mcp.tool
async def search_notion(
    query: str,
    page_size: int = 10,
) -> list[dict]:
    """Search a Notion workspace by query string.

    Args:
        query: Free-text query to match titles and content.
        page_size: Max rows to return (1-100).
    """
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    async with httpx.AsyncClient(timeout=15) as client:
        r = await client.post(
            f"{NOTION_API}/search",
            headers=HEADERS,
            json={"query": query, "page_size": page_size},
        )
        r.raise_for_status()
        data = r.json()
    return [
        {
            "id": row["id"],
            "title": _title(row),
            "url": row.get("url"),
            "last_edited": row.get("last_edited_time"),
        }
        for row in data.get("results", [])
    ]


def _title(row: dict) -> str:
    props = row.get("properties", {})
    for prop in props.values():
        if prop.get("type") == "title":
            parts = prop.get("title", [])
            return "".join(p.get("plain_text", "") for p in parts)
    return row.get("id", "untitled")


if __name__ == "__main__":
    mcp.run()
```
3. What FastMCP does for you
- Reads the type hints (query: str, page_size: int = 10) and generates the JSON Schema automatically.
- Pulls the docstring and Args: block into the tool description Claude sees.
- Handles async vs sync transparently.
- Picks a transport based on the run mode (stdio by default).
4. Run it
```bash
uv run server.py
```
That's the whole server. We add error handling and Streamable HTTP transport in the deploy section.
How do I build the same MCP server in TypeScript?
Use @modelcontextprotocol/sdk plus Zod for input validation. The TypeScript version is more verbose than FastMCP because you wire schemas explicitly, but you get end-to-end type safety and zero runtime dependencies beyond the SDK and Zod.
1. Project setup
```bash
mkdir notion-mcp && cd notion-mcp
npm init -y
npm i @modelcontextprotocol/sdk zod
npm i -D typescript tsx @types/node
npx tsc --init
```
2. The server
Create src/server.ts:
```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const NOTION_TOKEN = process.env.NOTION_TOKEN;
if (!NOTION_TOKEN) throw new Error("NOTION_TOKEN is required");

const server = new McpServer({
  name: "notion-search",
  version: "1.0.0",
});

const SearchInput = z.object({
  query: z.string().min(1).describe("Free-text query"),
  page_size: z.number().int().min(1).max(100).default(10),
});

server.registerTool(
  "search_notion",
  {
    description: "Search a Notion workspace by query string.",
    inputSchema: SearchInput.shape,
  },
  async ({ query, page_size }) => {
    const res = await fetch("https://api.notion.com/v1/search", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${NOTION_TOKEN}`,
        "Notion-Version": "2025-09-03",
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ query, page_size }),
    });
    if (!res.ok) {
      const body = await res.text();
      return {
        isError: true,
        content: [{ type: "text", text: `Notion API ${res.status}: ${body}` }],
      };
    }
    const data = await res.json();
    const rows = (data.results ?? []).map((r: any) => ({
      id: r.id,
      url: r.url,
      last_edited: r.last_edited_time,
    }));
    return { content: [{ type: "text", text: JSON.stringify(rows, null, 2) }] };
  },
);

const transport = new StdioServerTransport();
await server.connect(transport);
console.error("notion-search MCP server ready on stdio");
```
3. Critical stdio gotcha
Never use console.log() in a stdio server. It writes to stdout, which is reserved for JSON-RPC frames. Use console.error() (stderr) for all logs. The official MCP TypeScript SDK docs flag this as the most common bug in new servers.
4. Run
```bash
NOTION_TOKEN=ntn_xxx npx tsx src/server.ts
```
How do you validate tool inputs and handle errors?
Validate inputs at the schema layer, return MCP-shaped error responses for runtime failures. The protocol distinguishes two error categories: invalid arguments (rejected before your handler runs) and tool errors (your handler ran but the operation failed).
Schema-level validation
- Python (FastMCP): type hints generate the schema. Add Annotated[int, Field(ge=1, le=100)] from pydantic for ranges.
- TypeScript: Zod schemas convert to JSON Schema automatically. Use .min(), .max(), .regex(), .describe().
If the client sends an invalid argument, the SDK rejects it with -32602 before your handler is called. You don't write code for this case.
Tool-level errors
When the Notion API returns a 401 or 429, return an MCP error response, not a thrown exception. The TS example above sets isError: true and includes the error text in content. In FastMCP:
```python
from fastmcp.exceptions import ToolError

if r.status_code == 401:
    raise ToolError("Notion token rejected. Check NOTION_TOKEN.")
```
FastMCP converts ToolError into a structured MCP error the client can render. Plain Exception instances become opaque internal errors -- the model can't recover from them.
Three error-handling rules
- Never leak stack traces to the client. Catch and re-raise as ToolError (Python) or isError: true content (TS).
- Surface remediation in the message. "Notion token rejected. Check NOTION_TOKEN." beats "401 Unauthorized."
- Log to stderr, never stdout in stdio servers. Stdout corruption is the #1 cause of "server not connecting" reports.
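Rule two in practice: translate upstream status codes into remediation-first messages before raising ToolError or setting isError. A minimal sketch, with illustrative wording:

```python
# Map Notion status codes to remediation-first error messages.
# The wording below is illustrative, not Notion's official guidance.
REMEDIATION = {
    401: "Notion token rejected. Check NOTION_TOKEN.",
    403: "Token valid but lacks access. Share the page with your integration.",
    429: "Notion rate limit hit. Retry after the Retry-After header elapses.",
}

def tool_error_message(status: int, body: str = "") -> str:
    hint = REMEDIATION.get(status)
    if hint:
        return hint
    # Fall back to a truncated body; never dump a stack trace.
    return f"Notion API returned {status}: {body[:200]}"

print(tool_error_message(401))
```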
Should I use stdio, SSE, or Streamable HTTP transport?
Use stdio for local tools, Streamable HTTP for remote and production. SSE is deprecated. The MCP transports spec and MCPcat's transport comparison align: SSE is being removed, and Streamable HTTP is the single remote transport going forward.
| Transport | When to use | Latency | Auth | Status |
|---|---|---|---|---|
| stdio | Local CLI tools, single user, Claude Desktop / Cursor on dev machine | Microsecond (no network stack) | Inherited from process | Stable |
| Streamable HTTP | Remote access, teammates, CI/CD, production | Network RTT | Standard HTTP auth (bearer, OAuth) | Recommended (2025-03-26 spec) |
| SSE | Legacy clients only | Network RTT | HTTP | Deprecated -- migrate to Streamable HTTP |
| Stateless Streamable HTTP | Horizontally-scaled production behind a load balancer | Network RTT | HTTP | New in June 2026 spec (2026 roadmap) |
Why stateless HTTP matters in 2026
The current Streamable HTTP transport keeps session state on a specific server instance. That fights load balancers and breaks horizontal scaling without sticky sessions. The official MCP 2026 roadmap targets a Q1 2026 SEP finalization for stateless Streamable HTTP, slated for the June 2026 spec release.
If you are starting a new production server today, design it to be stateless: no in-memory session state, all session info passed in headers or request bodies. Migration to the June 2026 spec will be a config change, not a rewrite.
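What "stateless" looks like in code: every request reconstructs its context from headers instead of from server memory, so any replica can serve it. A sketch, with hypothetical header names:

```python
# Stateless pattern: session/user context travels with each request.
# The x-workspace-id header is a hypothetical example.
def session_context(headers: dict[str, str]) -> dict:
    auth = headers.get("authorization", "")
    if not auth.lower().startswith("bearer "):
        raise PermissionError("missing bearer token")
    return {
        "token": auth.split(" ", 1)[1],
        # Routing info comes from the request, not in-memory state.
        "workspace": headers.get("x-workspace-id", "default"),
    }

ctx = session_context({"authorization": "Bearer ntn_xxx", "x-workspace-id": "eng"})
print(ctx)
```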
Adding Streamable HTTP to the FastMCP server
```python
if __name__ == "__main__":
    mcp.run(transport="streamable-http", host="0.0.0.0", port=8000)
```
Adding it to the TypeScript server
```typescript
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import express from "express";

const app = express();
app.use(express.json());

const transport = new StreamableHTTPServerTransport({ sessionIdGenerator: undefined });
await server.connect(transport);

app.post("/mcp", (req, res) => transport.handleRequest(req, res, req.body));
app.listen(8000);
```
Setting sessionIdGenerator: undefined runs the transport in stateless mode -- the recommended default for new builds.
How do you test an MCP server?
Test with the MCP Inspector before connecting any client. The Inspector is the official testing tool, runs via npx with no install, and exposes a UI at http://localhost:6274 for calling tools, reading resources, and watching the JSON-RPC trace.
Launch it
For the Python server:
```bash
npx @modelcontextprotocol/inspector uv run server.py
```
For the TypeScript server:
```bash
npx @modelcontextprotocol/inspector npx tsx src/server.ts
```
For a deployed Streamable HTTP server:
```bash
npx @modelcontextprotocol/inspector
# then point the UI at https://your-server.com/mcp
```
What to verify
- Tools panel lists search_notion with description and schema.
- Calling the tool with valid args returns rows. Check the JSON shape matches what your handler returns.
- Invalid args (e.g. page_size: 999) get rejected by schema validation, not your handler.
- The log panel at the bottom shows the JSON-RPC tools/call request and response with no protocol errors.
The Inspector spawns two processes: the React UI on port 6274 and a proxy on 6277. Both bind to localhost only. If you see "server not connecting," 90% of the time it's a stray console.log() in stdio mode or a missing env var.
How do you register an MCP server in Claude Desktop and Cursor?
Both clients use the same mcpServers JSON format. Drop in a command and args for stdio servers, or a url for remote Streamable HTTP servers, then restart the client.
Claude Desktop
Open Settings -> Developer -> Edit Config. The file lives at:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
For the Python server (stdio):
```json
{
  "mcpServers": {
    "notion-search": {
      "command": "uv",
      "args": ["--directory", "/path/to/project", "run", "server.py"],
      "env": { "NOTION_TOKEN": "ntn_xxx" }
    }
  }
}
```
For the TypeScript server (stdio):
```json
{
  "mcpServers": {
    "notion-search": {
      "command": "npx",
      "args": ["tsx", "/path/to/project/src/server.ts"],
      "env": { "NOTION_TOKEN": "ntn_xxx" }
    }
  }
}
```
For a deployed Streamable HTTP server, use Claude Desktop's UI: Search and Tools -> Manage Connectors -> Add Custom Connector, then paste the URL.
Cursor
Create .cursor/mcp.json in your project root or open Cursor Settings -> Tools & MCP. The format is identical:
```json
{
  "mcpServers": {
    "notion-search": {
      "command": "uv",
      "args": ["--directory", "/path/to/project", "run", "server.py"],
      "env": { "NOTION_TOKEN": "ntn_xxx" }
    }
  }
}
```
Restart the client after every config edit. Claude Desktop won't reload MCP configs hot. Cursor sometimes does, but a full restart is the safe bet.
How do you deploy an MCP server in production?
For production, deploy the Streamable HTTP transport behind your existing API gateway, with bearer-token or OAuth auth, and design the server to be stateless. The deployment shape depends on your traffic and team:
Option 1: Single VM or container
Fastest path. Drop the server in a Docker container, expose port 8000, terminate TLS at a reverse proxy (Caddy, nginx). Good for internal tools with <100 QPS.
```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install uv && uv sync
CMD ["uv", "run", "server.py"]
```
Option 2: Serverless (Cloud Run, Lambda, Workers)
Works well when your server is stateless. Cloud Run scales to zero, charges per request, and natively supports Streamable HTTP. This is the deployment target the June 2026 stateless HTTP spec is designed for.
Option 3: Kubernetes with horizontal scaling
For servers handling >1k QPS or shared across teams. Run multiple replicas behind a load balancer. Avoid sticky sessions -- design every request to carry its own context. The MCP team's transport future post recommends moving session info to the data model layer (e.g. signed cookies or request headers) rather than the transport.
Production checklist
- [ ] Auth. Bearer tokens for service-to-service, OAuth for end-user clients.
- [ ] Rate limiting. Per-token, not per-IP -- agent traffic comes from a few cloud egress IPs.
- [ ] Structured logging to stderr (Python) or your logger of choice (TS). Never stdout in stdio mode.
- [ ] Tracing. FastMCP 3.0 ships OpenTelemetry instrumentation; for TS, wrap handlers manually.
- [ ] Schema versioning. When you change a tool's input schema, bump the server version so clients can detect drift.
- [ ] MCP Inspector smoke test in CI. Boot the server, call every tool, assert the response shape.
| Feature | FastMCP (Python) | Official SDK (TypeScript) |
|---|---|---|
| Lines of code for one tool | ~20 | ~40 |
| Schema generation | Auto from type hints | Manual via Zod |
| Async support | Built-in | Native (async/await) |
| Transport detection | Auto via fastmcp run | Explicit (stdio, HTTP) |
| Auth, OTel, versioning | Built-in (3.0+, Jan 2026) | Build it yourself |
| Best for | Prototyping, <50 tools | Custom transports, full control |
| Stateless HTTP ready | Yes (transport=streamable-http) | Yes (sessionIdGenerator: undefined) |
| Active maintenance | PrefectHQ + community | Anthropic + community |