To build a custom MCP server, define tools as typed functions, validate inputs against a JSON Schema, and expose them over a transport (stdio, Streamable HTTP, or the new stateless HTTP shipping in the June 2026 spec). This guide builds the same server twice -- in Python with FastMCP and in TypeScript with the official @modelcontextprotocol/sdk -- exposing one tool that searches a Notion database. By the end you will have a server registered in Claude Desktop and Cursor, tested with the MCP Inspector, and ready to deploy.
What is an MCP server and why build one?
An MCP (Model Context Protocol) server is a process that exposes tools, resources, and prompts to an LLM client over JSON-RPC. The client (Claude Desktop, Cursor, ChatGPT) calls your server when the model needs to take an action -- query a database, hit an API, read a file -- and your server returns structured results.
MCP is now the default integration standard for agents. According to MCP Manager's 2026 adoption report, the public MCP server registry grew from 1,200 servers in Q1 2025 to 9,400+ by April 2026, with 78% of enterprise AI teams running at least one MCP-backed agent in production.
You build a custom MCP server when:
- An off-the-shelf server in the official MCP servers list doesn't cover your internal API.
- You want to expose a private database, queue, or service to Claude or Cursor without writing a one-off integration per client.
- You need fine-grained auth, logging, or schema control that public servers don't offer.
If you are still deciding between MCP and OpenAI-style function calling, read MCP vs function calling first.
How does the MCP protocol work?
MCP uses JSON-RPC 2.0 over a transport. The client and server perform an initialize handshake, negotiate capabilities, then exchange messages: tools/list, tools/call, resources/read, prompts/get. The server can stream partial results back as Server-Sent Events when using HTTP.
Three primitives matter:
- Tools -- functions the LLM can call. Each has a name, description, JSON Schema for inputs, and a handler. This is the most-used primitive.
- Resources -- file-like blobs the client can read (database schemas, file contents, docs).
- Prompts -- reusable templates the client can surface to users.
For the deeper protocol walkthrough, see our Model Context Protocol explained breakdown. For everything else in this guide, you only need tools.
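The request/response shapes behind tools/call are plain JSON-RPC 2.0 objects. A sketch of one round trip (the tool name and arguments are illustrative, matching the server built later in this guide):

```python
import json

# Illustrative JSON-RPC 2.0 frames for one tools/call round trip.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_notion",
        "arguments": {"query": "roadmap", "page_size": 5},
    },
}

# A successful result wraps content blocks; a failed tool call sets isError.
response = {
    "jsonrpc": "2.0",
    "id": 1,  # must echo the request id
    "result": {
        "content": [{"type": "text", "text": "[{\"id\": \"...\"}]"}],
        "isError": False,
    },
}

# On the stdio transport, frames are newline-delimited JSON.
frame = json.dumps(request)
print(frame)
```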
What's the difference between FastMCP and the official MCP SDK?
FastMCP is a higher-level Python framework; the official MCP SDK is the lower-level reference implementation. FastMCP 1.0 was merged into the official Python SDK in 2024, but FastMCP continued as a separately maintained project and reached 3.0 on January 19, 2026 with component versioning, granular auth, and OpenTelemetry support, per the Apigene FastMCP 3.0 review.
| Feature | FastMCP (Python) | Official MCP SDK (Python / TS) |
|---|---|---|
| Abstraction level | High -- decorators, type-hint inference | Low -- explicit handlers, manual schemas |
| Boilerplate for one tool | ~10 lines | ~30 lines |
| Transport detection | Automatic via fastmcp run | Manual |
| Schema validation | Auto from type hints | Manual via Zod (TS) or pydantic (Py) |
| Best for | Servers with <50 tools, prototyping | Custom transports, protocol-level control |
| Auth, OTel, versioning | Built-in (3.0+) | Build it yourself |
Use FastMCP for Python unless you have a specific reason to drop down. Use the official @modelcontextprotocol/sdk for TypeScript -- there's no equivalent high-level TS framework with the same maturity.
How do I build an MCP server in Python with FastMCP?
Build a FastMCP server in five steps: install the package, define a tool with a decorator, validate inputs via type hints, call the Notion API, and run with mcp.run(). The whole server is under 50 lines.
1. Install dependencies
```bash
uv add "fastmcp>=3.0" httpx python-dotenv
```
2. Define the server and a tool
Create server.py:
```python
import os

import httpx
from fastmcp import FastMCP
from dotenv import load_dotenv

load_dotenv()

mcp = FastMCP("notion-search")

NOTION_TOKEN = os.environ["NOTION_TOKEN"]
NOTION_API = "https://api.notion.com/v1"
HEADERS = {
    "Authorization": f"Bearer {NOTION_TOKEN}",
    "Notion-Version": "2025-09-03",
    "Content-Type": "application/json",
}


@mcp.tool
async def search_notion(
    query: str,
    page_size: int = 10,
) -> list[dict]:
    """Search a Notion workspace by query string.

    Args:
        query: Free-text query to match titles and content.
        page_size: Max rows to return (1-100).
    """
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    async with httpx.AsyncClient(timeout=15) as client:
        r = await client.post(
            f"{NOTION_API}/search",
            headers=HEADERS,
            json={"query": query, "page_size": page_size},
        )
        r.raise_for_status()
        data = r.json()
    return [
        {
            "id": row["id"],
            "title": _title(row),
            "url": row.get("url"),
            "last_edited": row.get("last_edited_time"),
        }
        for row in data.get("results", [])
    ]


def _title(row: dict) -> str:
    props = row.get("properties", {})
    for prop in props.values():
        if prop.get("type") == "title":
            parts = prop.get("title", [])
            return "".join(p.get("plain_text", "") for p in parts)
    return row.get("id", "untitled")


if __name__ == "__main__":
    mcp.run()
```
3. What FastMCP does for you
- Reads the type hints (query: str, page_size: int = 10) and generates the JSON Schema automatically.
- Pulls the docstring and Args: block into the tool description Claude sees.
- Handles async vs sync transparently.
- Picks a transport based on the run mode (stdio by default).
4. Run it
```bash
uv run server.py
```
That's the whole server. We add error handling and Streamable HTTP transport in the deploy section.
How do I build the same MCP server in TypeScript?
Use @modelcontextprotocol/sdk plus Zod for input validation. The TypeScript version is more verbose than FastMCP because you wire schemas explicitly, but you get end-to-end type safety and zero runtime dependencies beyond the SDK and Zod.
1. Project setup
```bash
mkdir notion-mcp && cd notion-mcp
npm init -y
npm i @modelcontextprotocol/sdk zod
npm i -D typescript tsx @types/node
npx tsc --init
```
2. The server
Create src/server.ts:
```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const NOTION_TOKEN = process.env.NOTION_TOKEN;
if (!NOTION_TOKEN) throw new Error("NOTION_TOKEN is required");

const server = new McpServer({
  name: "notion-search",
  version: "1.0.0",
});

const SearchInput = z.object({
  query: z.string().min(1).describe("Free-text query"),
  page_size: z.number().int().min(1).max(100).default(10),
});

server.registerTool(
  "search_notion",
  {
    description: "Search a Notion workspace by query string.",
    inputSchema: SearchInput.shape,
  },
  async ({ query, page_size }) => {
    const res = await fetch("https://api.notion.com/v1/search", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${NOTION_TOKEN}`,
        "Notion-Version": "2025-09-03",
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ query, page_size }),
    });
    if (!res.ok) {
      const body = await res.text();
      return {
        isError: true,
        content: [{ type: "text", text: `Notion API ${res.status}: ${body}` }],
      };
    }
    const data = await res.json();
    const rows = (data.results ?? []).map((r: any) => ({
      id: r.id,
      url: r.url,
      last_edited: r.last_edited_time,
    }));
    return { content: [{ type: "text", text: JSON.stringify(rows, null, 2) }] };
  },
);

const transport = new StdioServerTransport();
await server.connect(transport);
console.error("notion-search MCP server ready on stdio");
```
3. Critical stdio gotcha
Never use console.log() in a stdio server. It writes to stdout, which is reserved for JSON-RPC frames. Use console.error() (stderr) for all logs. The official MCP TypeScript SDK docs flag this as the most common bug in new servers.
4. Run
```bash
NOTION_TOKEN=ntn_xxx npx tsx src/server.ts
```
How do you validate tool inputs and handle errors?
Validate inputs at the schema layer, return MCP-shaped error responses for runtime failures. The protocol distinguishes two error categories: invalid arguments (rejected before your handler runs) and tool errors (your handler ran but the operation failed).
Schema-level validation
- Python (FastMCP): type hints generate the schema. Add Annotated[int, Field(ge=1, le=100)] from pydantic for ranges.
- TypeScript: Zod schemas convert to JSON Schema automatically. Use .min(), .max(), .regex(), .describe().
If the client sends an invalid argument, the SDK rejects it with -32602 before your handler is called. You don't write code for this case.
Tool-level errors
When the Notion API returns a 401 or 429, return an MCP error response, not a thrown exception. The TS example above sets isError: true and includes the error text in content. In FastMCP:
```python
from fastmcp.exceptions import ToolError

if r.status_code == 401:
    raise ToolError("Notion token rejected. Check NOTION_TOKEN.")
```
FastMCP converts ToolError into a structured MCP error the client can render. Plain Exception instances become opaque internal errors -- the model can't recover from them.
Three error-handling rules
- Never leak stack traces to the client. Catch and re-raise as ToolError (Python) or isError: true content (TS).
- Surface remediation in the message. "Notion token rejected. Check NOTION_TOKEN." beats "401 Unauthorized."
- Log to stderr, never stdout in stdio servers. Stdout corruption is the #1 cause of "server not connecting" reports.
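Rule two in practice: translate upstream status codes into remediation-first messages before raising ToolError or setting isError. A minimal sketch, with illustrative wording:

```python
# Map Notion status codes to remediation-first error messages.
# The wording below is illustrative, not Notion's official guidance.
REMEDIATION = {
    401: "Notion token rejected. Check NOTION_TOKEN.",
    403: "Token valid but lacks access. Share the page with your integration.",
    429: "Notion rate limit hit. Retry after the Retry-After header elapses.",
}

def tool_error_message(status: int, body: str = "") -> str:
    hint = REMEDIATION.get(status)
    if hint:
        return hint
    # Fall back to a truncated body; never dump a stack trace.
    return f"Notion API returned {status}: {body[:200]}"

print(tool_error_message(401))
```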
Should I use stdio, SSE, or Streamable HTTP transport?
Use stdio for local tools, Streamable HTTP for remote and production. SSE is deprecated. The MCP transports spec and MCPcat's transport comparison align: SSE is being removed, and Streamable HTTP is the single remote transport going forward.
| Transport | When to use | Latency | Auth | Status |
|---|---|---|---|---|
| stdio | Local CLI tools, single user, Claude Desktop / Cursor on dev machine | Microsecond (no network stack) | Inherited from process | Stable |
| Streamable HTTP | Remote access, teammates, CI/CD, production | Network RTT | Standard HTTP auth (bearer, OAuth) | Recommended (2025-03-26 spec) |
| SSE | Legacy clients only | Network RTT | HTTP | Deprecated -- migrate to Streamable HTTP |
| Stateless Streamable HTTP | Horizontally-scaled production behind a load balancer | Network RTT | HTTP | New in June 2026 spec (2026 roadmap) |
Why stateless HTTP matters in 2026
The current Streamable HTTP transport keeps session state on a specific server instance. That fights load balancers and breaks horizontal scaling without sticky sessions. The official MCP 2026 roadmap targets a Q1 2026 SEP finalization for stateless Streamable HTTP, slated for the June 2026 spec release.
If you are starting a new production server today, design it to be stateless: no in-memory session state, all session info passed in headers or request bodies. Migration to the June 2026 spec will be a config change, not a rewrite.
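What "stateless" looks like in code: every request reconstructs its context from headers instead of from server memory, so any replica can serve it. A sketch, with hypothetical header names:

```python
# Stateless pattern: session/user context travels with each request.
# The x-workspace-id header is a hypothetical example.
def session_context(headers: dict[str, str]) -> dict:
    auth = headers.get("authorization", "")
    if not auth.lower().startswith("bearer "):
        raise PermissionError("missing bearer token")
    return {
        "token": auth.split(" ", 1)[1],
        # Routing info comes from the request, not in-memory state.
        "workspace": headers.get("x-workspace-id", "default"),
    }

ctx = session_context({"authorization": "Bearer ntn_xxx", "x-workspace-id": "eng"})
print(ctx)
```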
Adding Streamable HTTP to the FastMCP server
```python
if __name__ == "__main__":
    mcp.run(transport="streamable-http", host="0.0.0.0", port=8000)
```
Adding it to the TypeScript server
```typescript
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import express from "express";

const app = express();
app.use(express.json());

const transport = new StreamableHTTPServerTransport({ sessionIdGenerator: undefined });
await server.connect(transport);

app.post("/mcp", (req, res) => transport.handleRequest(req, res, req.body));
app.listen(8000);
```
Setting sessionIdGenerator: undefined runs the transport in stateless mode -- the recommended default for new builds.
How do you test an MCP server?
Test with the MCP Inspector before connecting any client. The Inspector is the official testing tool, runs via npx with no install, and exposes a UI at http://localhost:6274 for calling tools, reading resources, and watching the JSON-RPC trace.
Launch it
For the Python server:
```bash
npx @modelcontextprotocol/inspector uv run server.py
```
For the TypeScript server:
```bash
npx @modelcontextprotocol/inspector npx tsx src/server.ts
```
For a deployed Streamable HTTP server:
```bash
npx @modelcontextprotocol/inspector
# then point the UI at https://your-server.com/mcp
```
What to verify
- Tools panel lists search_notion with description and schema.
- Calling the tool with valid args returns rows. Check the JSON shape matches what your handler returns.
- Invalid args (e.g. page_size: 999) get rejected by schema validation, not your handler.
- The log panel at the bottom shows the JSON-RPC tools/call request and response with no protocol errors.
The Inspector spawns two processes: the React UI on port 6274 and a proxy on 6277. Both bind to localhost only. If you see "server not connecting," 90% of the time it's a stray console.log() in stdio mode or a missing env var.
How do you register an MCP server in Claude Desktop and Cursor?
Both clients use the same mcpServers JSON format. Drop in a command and args for stdio servers, or a url for remote Streamable HTTP servers, then restart the client.
Claude Desktop
Open Settings -> Developer -> Edit Config. The file lives at:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
For the Python server (stdio):
```json
{
  "mcpServers": {
    "notion-search": {
      "command": "uv",
      "args": ["--directory", "/path/to/project", "run", "server.py"],
      "env": { "NOTION_TOKEN": "ntn_xxx" }
    }
  }
}
```
For the TypeScript server (stdio):
```json
{
  "mcpServers": {
    "notion-search": {
      "command": "npx",
      "args": ["tsx", "/path/to/project/src/server.ts"],
      "env": { "NOTION_TOKEN": "ntn_xxx" }
    }
  }
}
```
For a deployed Streamable HTTP server, use Claude Desktop's UI: Search and Tools -> Manage Connectors -> Add Custom Connector, then paste the URL.
Cursor
Create .cursor/mcp.json in your project root or open Cursor Settings -> Tools & MCP. The format is identical:
```json
{
  "mcpServers": {
    "notion-search": {
      "command": "uv",
      "args": ["--directory", "/path/to/project", "run", "server.py"],
      "env": { "NOTION_TOKEN": "ntn_xxx" }
    }
  }
}
```
Restart the client after every config edit. Claude Desktop won't reload MCP configs hot. Cursor sometimes does, but a full restart is the safe bet.
How do you deploy an MCP server in production?
For production, deploy the Streamable HTTP transport behind your existing API gateway, with bearer-token or OAuth auth, and design the server to be stateless. The deployment shape depends on your traffic and team:
Option 1: Single VM or container
Fastest path. Drop the server in a Docker container, expose port 8000, terminate TLS at a reverse proxy (Caddy, nginx). Good for internal tools with <100 QPS.
```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install uv && uv sync
CMD ["uv", "run", "server.py"]
```
Option 2: Serverless (Cloud Run, Lambda, Workers)
Works well when your server is stateless. Cloud Run scales to zero, charges per request, and natively supports Streamable HTTP. This is the deployment target the June 2026 stateless HTTP spec is designed for.
Option 3: Kubernetes with horizontal scaling
For servers handling >1k QPS or shared across teams. Run multiple replicas behind a load balancer. Avoid sticky sessions -- design every request to carry its own context. The MCP team's transport future post recommends moving session info to the data model layer (e.g. signed cookies or request headers) rather than the transport.
Production checklist
- [ ] Auth. Bearer tokens for service-to-service, OAuth for end-user clients.
- [ ] Rate limiting. Per-token, not per-IP -- agent traffic comes from a few cloud egress IPs.
- [ ] Structured logging to stderr (Python) or your logger of choice (TS). Never stdout in stdio mode.
- [ ] Tracing. FastMCP 3.0 ships OpenTelemetry instrumentation; for TS, wrap handlers manually.
- [ ] Schema versioning. When you change a tool's input schema, bump the server version so clients can detect drift.
- [ ] MCP Inspector smoke test in CI. Boot the server, call every tool, assert the response shape.
| Feature | FastMCP (Python) | Official SDK (TypeScript) |
|---|---|---|
| Lines of code for one tool | ~20 | ~40 |
| Schema generation | Auto from type hints | Manual via Zod |
| Async support | Built-in | Native (async/await) |
| Transport detection | Auto via fastmcp run | Explicit (stdio, HTTP) |
| Auth, OTel, versioning | Built-in (3.0+, Jan 2026) | Build it yourself |
| Best for | Prototyping, <50 tools | Custom transports, full control |
| Stateless HTTP ready | Yes (transport=streamable-http) | Yes (sessionIdGenerator: undefined) |
| Active maintenance | PrefectHQ + community | Anthropic + community |