Docling MCP Server
MCP server that turns Docling into an agent-accessible toolkit for converting PDFs to structured documents and generating DoclingDocument files programmatically.
Docling MCP is the official Model Context Protocol server for Docling, the open-source document processing library originally created at IBM Research and now hosted by the LF AI & Data Foundation. It exposes Docling's PDF parsing and document generation capabilities as MCP tools, allowing AI agents to convert PDFs into a structured DoclingDocument JSON representation and to author new documents programmatically (titles, headings, paragraphs, lists) before exporting to formats like Markdown.
The v2.0 release introduces a hybrid architecture with three operational modes: a lightweight remote mode (default, around 50MB) that delegates conversion to a hosted Docling service, a local mode that installs the full conversion stack for offline use, and a hybrid mode that uses a remote service with local fallback. The server supports stdio, SSE, and streamable-HTTP transports, and includes a document cache plus optional Milvus integration for RAG workflows.
Because Docling is purpose-built for GenAI pipelines, this MCP server is most useful when an agent needs to ingest source PDFs, extract layout-aware structure (tables, sections, lists), and feed clean Markdown into downstream LLM steps or vector stores.
Tools
| Tool | Description |
|---|---|
| convert_pdf_document | Converts a PDF document from a file path or URL into a structured DoclingDocument JSON. |
| create_new_docling_document | Creates a new empty DoclingDocument in the cache for incremental authoring. |
| add_title_to_docling_document | Adds a title element to an existing DoclingDocument. |
| add_section_heading_to_docling_document | Adds a section heading at a given level to a DoclingDocument. |
| add_paragraph_to_docling_document | Appends a paragraph of body text to a DoclingDocument. |
| open_list_in_docling_document | Opens a new list block (ordered or unordered) in a DoclingDocument. |
| add_listitem_to_list_in_docling_document | Adds a list item to the currently open list in a DoclingDocument. |
| close_list_in_docling_document | Closes the currently open list in a DoclingDocument. |
| export_docling_document_to_markdown | Exports a cached DoclingDocument to Markdown format. |
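Because the authoring tools are meant to be called in sequence against one cached document, it can help to see the order laid out. The sketch below only builds the list of calls an agent might issue after create_new_docling_document has returned a document key; the tool names come from the table above, but every argument name (`document_key`, `title`, `section_level`, and so on) is an assumption, so check the server's published tool schemas before relying on them.

```python
from typing import Any

# (tool name, arguments) pairs; the argument names are hypothetical.
ToolCall = tuple[str, dict[str, Any]]


def authoring_plan(
    doc_key: str,
    title: str,
    heading: str,
    paragraphs: list[str],
    items: list[str],
) -> list[ToolCall]:
    """Return the ordered tool calls for a simple one-section document."""
    calls: list[ToolCall] = [
        ("add_title_to_docling_document",
         {"document_key": doc_key, "title": title}),
        ("add_section_heading_to_docling_document",
         {"document_key": doc_key, "section_heading": heading, "section_level": 1}),
    ]
    for text in paragraphs:
        calls.append(("add_paragraph_to_docling_document",
                      {"document_key": doc_key, "paragraph": text}))
    if items:
        # A list must be opened before items are added and closed afterwards.
        calls.append(("open_list_in_docling_document", {"document_key": doc_key}))
        for item in items:
            calls.append(("add_listitem_to_list_in_docling_document",
                          {"document_key": doc_key, "list_item_text": item}))
        calls.append(("close_list_in_docling_document", {"document_key": doc_key}))
    calls.append(("export_docling_document_to_markdown", {"document_key": doc_key}))
    return calls


plan = authoring_plan("doc-1", "Q2 Report", "Summary", ["Revenue grew."], ["A", "B"])
for name, _args in plan:
    print(name)
```

An MCP client would send each pair as a tool call in this order; the sketch stops short of that so it runs without a server.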
Prerequisites
- Python 3.10+ and uv/uvx or pip
- For remote mode: access to a hosted Docling conversion service and API key
- For local mode: extra dependencies installed via the [local] extra
Installation
Remote mode (default, lightweight):
pip install docling-mcp
export DOCLING_SERVICE_URL=https://your-docling-service.example.com
export DOCLING_SERVICE_API_KEY=your-api-key-here
export DOCLING_CONVERSION_MODE=remote
Local mode (offline, full conversion stack):
pip install "docling-mcp[local]"
export DOCLING_CONVERSION_MODE=local
Hybrid mode (remote with local fallback):
pip install "docling-mcp[local]"
export DOCLING_SERVICE_URL=https://your-docling-service.example.com
export DOCLING_CONVERSION_MODE=remote
export DOCLING_FALLBACK_TO_LOCAL=true
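When the server is launched from a script rather than a shell, the same settings can be assembled as a dictionary. The sketch below mirrors the three export blocks above; the variable names are taken from them, while the `build_env` helper and its `"hybrid"` label are hypothetical conveniences (note that hybrid mode is expressed as remote mode plus the fallback flag, as in the commands above).

```python
from typing import Optional

MODES = ("remote", "local", "hybrid")


def build_env(
    mode: str,
    service_url: Optional[str] = None,
    api_key: Optional[str] = None,
) -> dict[str, str]:
    """Build the environment variables for one of the three setups."""
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {MODES}")
    if mode == "local":
        return {"DOCLING_CONVERSION_MODE": "local"}
    if not service_url:
        raise ValueError(f"{mode} mode needs a service URL")
    env = {
        "DOCLING_SERVICE_URL": service_url,
        "DOCLING_CONVERSION_MODE": "remote",
    }
    if api_key:
        env["DOCLING_SERVICE_API_KEY"] = api_key
    if mode == "hybrid":
        env["DOCLING_FALLBACK_TO_LOCAL"] = "true"
    return env


print(build_env("hybrid", "https://your-docling-service.example.com"))
```

The result can be merged into `os.environ` or passed as the `env` argument of `subprocess.run` when starting the server.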
MCP Client Configuration
Add to claude_desktop_config.json (or your MCP client's equivalent):
{
  "mcpServers": {
    "docling": {
      "command": "uvx",
      "args": ["--from=docling-mcp", "docling-mcp-server"]
    }
  }
}
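If the configuration is generated rather than edited by hand, a few lines of Python can produce the same entry. This is a sketch: the `mcp_client_config` helper is hypothetical, and the optional `env` block assumes the client passes an `env` mapping through to the server process, as Claude Desktop's configuration format does.

```python
import json
from typing import Optional


def mcp_client_config(env: Optional[dict[str, str]] = None) -> dict:
    """Return an mcpServers entry matching the JSON shown above."""
    server: dict = {
        "command": "uvx",
        "args": ["--from=docling-mcp", "docling-mcp-server"],
    }
    if env:
        # Assumption: the MCP client forwards this mapping to the server.
        server["env"] = dict(env)
    return {"mcpServers": {"docling": server}}


config = mcp_client_config({"DOCLING_CONVERSION_MODE": "local"})
print(json.dumps(config, indent=2))
```

Merge the printed object into any existing `mcpServers` section rather than overwriting the whole file, so other configured servers are preserved.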
Transport Options
The server can be launched with different transports:
uvx --from docling-mcp docling-mcp-server --transport stdio
uvx --from docling-mcp docling-mcp-server --transport sse
uvx --from docling-mcp docling-mcp-server --transport streamable-http
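For scripted launches, the three commands above differ only in the transport name, so a small helper can validate the choice before anything is started. The `server_command` function is a hypothetical wrapper; it uses only the flag and values shown in the commands above.

```python
TRANSPORTS = ("stdio", "sse", "streamable-http")


def server_command(transport: str = "stdio") -> list[str]:
    """Return the argv list for launching the server with a transport."""
    if transport not in TRANSPORTS:
        raise ValueError(
            f"unsupported transport {transport!r}; expected one of {TRANSPORTS}"
        )
    return [
        "uvx", "--from", "docling-mcp",
        "docling-mcp-server", "--transport", transport,
    ]


print(" ".join(server_command("sse")))
```

The returned list can be handed directly to `subprocess.Popen`, which avoids shell quoting issues.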
Compatible with Claude for Desktop, LM Studio, and other MCP-supporting clients.
Use Cases
- Convert a batch of research PDFs into structured Markdown so an agent can summarize them or feed them into a RAG index
- Extract layout-aware content (sections, lists, tables) from contracts or reports for downstream LLM analysis
- Programmatically generate a new DoclingDocument (title, headings, paragraphs, lists) and export it to Markdown as part of an agent workflow
- Run document ingestion offline in local mode for sensitive material that cannot leave a controlled environment
- Pair with Milvus to build a RAG pipeline that ingests PDFs and serves retrieval to an MCP-aware agent
Example Prompts
- "Convert this PDF at https://arxiv.org/pdf/2408.09869 into a DoclingDocument and export it to Markdown."
- "Create a new DoclingDocument titled 'Q2 Report', add a section heading 'Summary', and append three paragraphs from the source PDF."
- "Parse the attached contract PDF and give me a clean Markdown version, preserving the section structure."
- "Build a structured document with a numbered list of the five key findings from this whitepaper, then export it to Markdown."
- "Ingest these PDFs into the Milvus collection so I can query them later."
Pros
- Official MCP server for Docling, maintained under the docling-project GitHub org and the LF AI & Data Foundation
- Flexible deployment: lightweight remote mode (~50MB), full offline local mode, or hybrid with automatic fallback
- Supports multiple MCP transports (stdio, SSE, streamable-HTTP) and integrates with Milvus for RAG
- Exposes both ingestion (PDF to structured JSON) and document authoring tools in one server
Cons
- Remote mode requires a separately hosted Docling conversion service plus API key, which the project does not provide for free
- Local mode pulls in heavy ML dependencies and model downloads (~500MB)
- Document authoring tools are stateful and operate on cached document IDs, which can be awkward for one-shot agent calls
Alternatives
- zanetworker/mcp-docling: community MCP wrapper around Docling with similar conversion features
- saleemh/doc-ingestor: Docling-based MCP server focused on PDF/DOCX/image/audio ingestion to Markdown, optimized for Apple Silicon
- kgand/document-parser-mcp: lightweight MCP server for universal document-to-Markdown ingestion