Fetch MCP Server
Reference MCP server that fetches web content from URLs and converts HTML to markdown for efficient LLM consumption.
Fetch is an official reference MCP server maintained by the Model Context Protocol project. It lets an AI agent retrieve content from a URL and returns it as markdown, stripping boilerplate HTML so the result fits well into an LLM context window. It is one of the simplest and most widely used MCP servers and is often paired with research or browsing workflows.
The server exposes a single fetch tool that accepts a required url plus three optional parameters: max_length, start_index, and raw. Long pages can be read in chunks by advancing start_index, which is useful when content exceeds the default 5000-character limit. Command-line flags set a custom user agent, set a proxy URL, or disable robots.txt enforcement.
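To make the tool's parameters concrete, here is a minimal sketch of the JSON-RPC `tools/call` message an MCP client would send to invoke fetch. The helper name `build_fetch_call` and the example URL are hypothetical. The defaults of 0 for start_index and False for raw are assumptions, since the text above only states the 5000-character default for max_length. In practice an MCP client library builds this envelope for you.

```python
import json


def build_fetch_call(url, max_length=5000, start_index=0, raw=False, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for the fetch tool.

    Hypothetical helper for illustration only.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "fetch",
            "arguments": {
                "url": url,
                "max_length": max_length,    # maximum characters to return
                "start_index": start_index,  # character offset to start from
                "raw": raw,                  # True skips HTML-to-markdown conversion
            },
        },
    }


if __name__ == "__main__":
    # First window of the page, then the next window starting at offset 5000.
    first = build_fetch_call("https://example.com/long-article")
    second = build_fetch_call(
        "https://example.com/long-article", start_index=5000, request_id=2
    )
    print(json.dumps(first, indent=2))
    print(json.dumps(second, indent=2))
```

The second request differs from the first only in its start_index and id, which is all that reading the next chunk requires.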
It is distributed as a Python package (mcp-server-fetch) and is most commonly run with uvx, but it can also be installed via pip or run from Docker. Node.js is an optional dependency that improves the HTML simplification step. The README notes that the server can reach local and internal IP addresses, which is a security consideration when exposing it to untrusted prompts.
Tools
| Tool | Description |
|---|---|
| fetch | Retrieves a URL and returns its content converted to markdown. Supports chunked reading of long pages via start_index, character limits via max_length, and a raw mode that skips HTML-to-markdown conversion. |
Installation
The server is published as the Python package mcp-server-fetch. The recommended way to run it is with uvx, which fetches and runs it without a manual install.
Using uvx (recommended)
{
  "mcpServers": {
    "fetch": {
      "command": "uvx",
      "args": ["mcp-server-fetch"]
    }
  }
}
Using pip
pip install mcp-server-fetch
Then configure:
{
  "mcpServers": {
    "fetch": {
      "command": "python",
      "args": ["-m", "mcp_server_fetch"]
    }
  }
}
VS Code (.vscode/mcp.json)
{
  "servers": {
    "fetch": {
      "command": "uvx",
      "args": ["mcp-server-fetch"]
    }
  }
}
Optional command-line arguments
- --ignore-robots-txt: skip robots.txt checks
- --user-agent=VALUE: override the default user agent
- --proxy-url=VALUE: route requests through a proxy
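As a sketch of how these flags fit into the configuration, the snippet below generates an mcpServers entry with the optional arguments appended after the package name. The helper name `fetch_server_config`, the user agent string, and the proxy address are placeholders made up for illustration.

```python
import json


def fetch_server_config(ignore_robots_txt=False, user_agent=None, proxy_url=None):
    """Return an mcpServers config dict that runs mcp-server-fetch via uvx.

    Hypothetical helper: it only assembles the flags listed above.
    """
    args = ["mcp-server-fetch"]
    if ignore_robots_txt:
        args.append("--ignore-robots-txt")
    if user_agent is not None:
        args.append(f"--user-agent={user_agent}")
    if proxy_url is not None:
        args.append(f"--proxy-url={proxy_url}")
    return {"mcpServers": {"fetch": {"command": "uvx", "args": args}}}


if __name__ == "__main__":
    config = fetch_server_config(
        user_agent="ExampleAgent/1.0",           # placeholder value
        proxy_url="http://proxy.example:8080",   # placeholder value
    )
    print(json.dumps(config, indent=2))
```

The printed JSON can be pasted into a client configuration in place of the plain uvx entry shown earlier.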
Notes
- Node.js is not required, but installing it improves HTML to markdown conversion robustness.
- On Windows, set PYTHONIOENCODING=utf-8 if you hit timeouts caused by character encoding issues.
- Debug with:
npx @modelcontextprotocol/inspector uvx mcp-server-fetch
Use cases
- Pull a documentation page or blog post into the conversation and have the agent summarize, extract code samples, or compare it to another source.
- Read a long article in chunks by repeatedly calling fetch with an advancing start_index when content exceeds the default 5000-character window.
- Inspect raw HTML for debugging or scraping work by setting raw: true instead of returning markdown.
- Route fetches through a proxy or custom user agent for accessing region-restricted or rate-limited pages in agent workflows.
- Provide lightweight web grounding for LLM responses without needing a full browser automation stack.
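The chunked-reading use case can be sketched as a simple loop. In this sketch, `fetch` is a hypothetical callable standing in for a real MCP tool call, and `make_fake_fetch` simulates it by slicing an in-memory string. The stop condition (an empty or short chunk means the page is finished) is an assumption: real responses may carry extra text, such as a truncation notice, that a client would need to account for.

```python
def read_all(fetch, url, max_length=5000, max_chunks=20):
    """Collect a page by calling `fetch` with an advancing start_index.

    `fetch` is a hypothetical callable (url, max_length, start_index) -> str.
    `max_chunks` caps the number of calls so a huge page cannot loop forever.
    """
    chunks = []
    start_index = 0
    for _ in range(max_chunks):
        chunk = fetch(url=url, max_length=max_length, start_index=start_index)
        if not chunk:
            break  # nothing left to read
        chunks.append(chunk)
        if len(chunk) < max_length:
            break  # short chunk: assume this was the end of the page
        start_index += len(chunk)
    return "".join(chunks)


def make_fake_fetch(page):
    """Return a stand-in for the fetch tool that slices an in-memory string."""
    def fake_fetch(url, max_length, start_index):
        return page[start_index:start_index + max_length]
    return fake_fetch


if __name__ == "__main__":
    page = "x" * 12345
    text = read_all(make_fake_fetch(page), "https://example.com/long")
    print(len(text))
```

An agent usually does not need the whole page at once. The same loop can stop early once it has found what it was looking for, which keeps context usage down.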
Example prompts
- "Fetch https://modelcontextprotocol.io/introduction and give me a 5 bullet summary."
- "Read the changelog at https://example.com/releases and tell me what changed in the last 3 versions."
- "Fetch this page in raw mode so I can inspect the HTML structure: https://example.com"
- "The page is long. Fetch the next 5000 characters starting at index 5000 and continue summarizing."
- "Compare the pricing pages at https://a.com/pricing and https://b.com/pricing and tell me which is cheaper for 10 seats."
Pros
- Official reference implementation maintained by the Model Context Protocol project, so behavior tracks the spec closely.
- Zero setup: with uvx, the server runs without an install step and needs no API keys.
- HTML-to-markdown conversion plus chunked reading via start_index keeps long pages usable within context limits.
- Simple, focused surface area (one tool) makes it easy to combine with other MCP servers.
Cons
- Only retrieves static HTML. JavaScript-rendered pages, logins, and dynamic content are not handled. A headless browser server is needed for those.
- Can reach local and internal IP addresses by default, which the README flags as a security risk for untrusted prompts.
- No caching, search, or crawling. Each call fetches one URL and you must orchestrate pagination yourself.
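Because the server can reach local and internal addresses, a deployment that handles untrusted prompts may want to screen URLs before they are passed to fetch. The sketch below is one possible client-side guard, not a feature of the server. The function names are hypothetical, and a pre-check like this does not rule out DNS answers changing between the check and the actual request.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_public_address(ip_text):
    """Return True only for addresses outside private, loopback, and similar ranges."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        return False  # unparseable addresses are rejected
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
    )


def is_url_allowed(url, resolve=socket.getaddrinfo):
    """Allow only http(s) URLs whose host resolves exclusively to public addresses.

    `resolve` is injectable so the check can be exercised without network access.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        port = parsed.port or (443 if parsed.scheme == "https" else 80)
        infos = resolve(parsed.hostname, port)
    except (OSError, ValueError):
        return False  # resolution failure or malformed port
    # Each getaddrinfo entry ends with a sockaddr tuple whose first item is the IP.
    return bool(infos) and all(is_public_address(info[4][0]) for info in infos)


if __name__ == "__main__":
    # A numeric loopback host resolves locally, so this needs no network access.
    print(is_url_allowed("http://127.0.0.1:8000/admin"))
```

Rejecting a host when any of its resolved addresses is non-public is deliberately strict. A looser policy would need an explicit allowlist.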
Alternatives
- Firecrawl MCP: scraping, crawling, and search with JS rendering.
- Puppeteer MCP or Playwright-based servers: JavaScript-heavy pages and interaction.
- Tavily MCP: web search and extraction tuned for LLM agents.