Docling MCP Server

MCP server that turns Docling into an agent-accessible toolkit for converting PDFs to structured documents and generating DoclingDocument files programmatically.

Collaboration by Docling Project (LF AI & Data Foundation) | API Key | active

Overview

Docling MCP is the official Model Context Protocol server for Docling, the open-source document processing library originally created at IBM Research and now hosted by the LF AI & Data Foundation. It exposes Docling's PDF parsing and document generation capabilities as MCP tools, allowing AI agents to convert PDFs into a structured DoclingDocument JSON representation and to author new documents programmatically (titles, headings, paragraphs, lists) before exporting to formats like Markdown.

The v2.0 release introduces a hybrid architecture with three operational modes: a lightweight remote mode (default, around 50MB) that delegates conversion to a hosted Docling service, a local mode that installs the full conversion stack for offline use, and a hybrid mode that uses a remote service with local fallback. The server supports stdio, SSE, and streamable-HTTP transports, and includes a document cache plus optional Milvus integration for RAG workflows.

Because Docling is purpose-built for GenAI pipelines, this MCP server is most useful when an agent needs to ingest source PDFs, extract layout-aware structure (tables, sections, lists), and feed clean Markdown into downstream LLM steps or vector stores.

Tools

Tool	Description
`convert_pdf_document`	Converts a PDF document from a file path or URL into a structured DoclingDocument JSON.
`create_new_docling_document`	Creates a new empty DoclingDocument in the cache for incremental authoring.
`add_title_to_docling_document`	Adds a title element to an existing DoclingDocument.
`add_section_heading_to_docling_document`	Adds a section heading at a given level to a DoclingDocument.
`add_paragraph_to_docling_document`	Appends a paragraph of body text to a DoclingDocument.
`open_list_in_docling_document`	Opens a new list block (ordered or unordered) in a DoclingDocument.
`add_listitem_to_list_in_docling_document`	Adds a list item to the currently open list in a DoclingDocument.
`close_list_in_docling_document`	Closes the currently open list in a DoclingDocument.
`export_docling_document_to_markdown`	Exports a cached DoclingDocument to Markdown format.
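
The authoring tools are meant to be called in sequence against one cached document: create, add elements, open and close any list, then export. The Python sketch below shows that ordering through a caller-supplied `call_tool` function rather than a real MCP client. The `call_tool` signature, the argument names (`prompt`, `document_key`, `title`, `section_heading`, `section_level`, `paragraph`, `list_item_text`), and the result keys (`document_key`, `markdown`) are assumptions made for illustration; check the tool schemas your MCP client reports before relying on them.

```python
from typing import Any, Callable, Dict, List

# Hypothetical signature: call_tool(tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def author_document(call_tool: ToolCaller, title: str, heading: str,
                    paragraphs: List[str], items: List[str]) -> str:
    """Create a document, fill it in order, and return its Markdown export."""
    # Assumption: the create call returns a key identifying the cached document.
    created = call_tool("create_new_docling_document", {"prompt": title})
    key = created["document_key"]

    call_tool("add_title_to_docling_document",
              {"document_key": key, "title": title})
    call_tool("add_section_heading_to_docling_document",
              {"document_key": key, "section_heading": heading, "section_level": 1})
    for text in paragraphs:
        call_tool("add_paragraph_to_docling_document",
                  {"document_key": key, "paragraph": text})

    # A list is opened before its items are added and closed afterwards.
    if items:
        call_tool("open_list_in_docling_document", {"document_key": key})
        for item in items:
            call_tool("add_listitem_to_list_in_docling_document",
                      {"document_key": key, "list_item_text": item})
        call_tool("close_list_in_docling_document", {"document_key": key})

    exported = call_tool("export_docling_document_to_markdown",
                         {"document_key": key})
    return exported["markdown"]


def recording_caller(log: List[str]) -> ToolCaller:
    """Stand-in for a real MCP client: records tool names, returns stub results."""
    def call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
        log.append(name)
        if name == "create_new_docling_document":
            return {"document_key": "doc-1"}
        if name == "export_docling_document_to_markdown":
            return {"markdown": "# stub"}
        return {}
    return call_tool


if __name__ == "__main__":
    calls: List[str] = []
    author_document(recording_caller(calls), "Q2 Report", "Summary",
                    ["First paragraph."], ["Finding one", "Finding two"])
    print("\n".join(calls))
```

Passing the tool caller in as a parameter keeps the sketch independent of any particular MCP client library; a real client's tool-invocation method would take the place of `recording_caller`.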

Setup Guide

Prerequisites

Python 3.10+ and either uv/uvx or pip
For remote mode: access to a hosted Docling conversion service and an API key for it
For local mode: the additional dependencies pulled in by the [local] extra

Installation

Remote mode (default, lightweight):

pip install docling-mcp
export DOCLING_SERVICE_URL=https://your-docling-service.example.com
export DOCLING_SERVICE_API_KEY=your-api-key-here
export DOCLING_CONVERSION_MODE=remote

Local mode (offline, full conversion stack):

pip install "docling-mcp[local]"
export DOCLING_CONVERSION_MODE=local

Hybrid mode (remote with local fallback):

pip install "docling-mcp[local]"
export DOCLING_SERVICE_URL=https://your-docling-service.example.com
export DOCLING_SERVICE_API_KEY=your-api-key-here
export DOCLING_CONVERSION_MODE=remote
export DOCLING_FALLBACK_TO_LOCAL=true
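
To make the interaction between the mode and fallback variables concrete, here is a minimal Python sketch of one way the selection logic could work. It is not the server's actual implementation: the function name `convert_with_fallback` and the `remote` and `local` callables are hypothetical, and only the environment variable names come from the commands above.

```python
import os
from typing import Callable, Mapping, Optional

Converter = Callable[[str], dict]


def convert_with_fallback(source: str,
                          remote: Converter,
                          local: Optional[Converter],
                          env: Mapping[str, str] = os.environ) -> dict:
    """Pick a converter from the environment; fall back to local if allowed."""
    # Remote is treated as the default mode, as described for v2.0.
    mode = env.get("DOCLING_CONVERSION_MODE", "remote").lower()
    fallback = env.get("DOCLING_FALLBACK_TO_LOCAL", "false").lower() == "true"

    if mode == "local":
        if local is None:
            raise RuntimeError("local mode requested but no local converter is available")
        return local(source)

    try:
        return remote(source)
    except Exception:
        # Hybrid behaviour: retry locally only when fallback is enabled
        # and the local conversion stack is installed.
        if fallback and local is not None:
            return local(source)
        raise
```

With `DOCLING_FALLBACK_TO_LOCAL` unset, a remote failure propagates to the caller, which matches the plain remote mode; setting it to `true` is what turns the same configuration into hybrid mode in this sketch.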

MCP Client Configuration

Add the following to claude_desktop_config.json (or your MCP client's equivalent):

{
  "mcpServers": {
    "docling": {
      "command": "uvx",
      "args": ["--from=docling-mcp", "docling-mcp-server"]
    }
  }
}
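
Desktop clients typically launch the server as a child process and may not see variables exported in your shell, so the settings from the Installation section may need to travel with the config entry. The Python sketch below generates such an entry using an `env` map on the server definition. The helper name `build_client_config` is hypothetical, and whether your client honors an `env` key should be confirmed in its documentation.

```python
import json


def build_client_config(service_url: str, api_key: str, mode: str = "remote") -> dict:
    """Return an mcpServers entry that carries the Docling settings with it."""
    return {
        "mcpServers": {
            "docling": {
                "command": "uvx",
                "args": ["--from=docling-mcp", "docling-mcp-server"],
                # Assumption: the client passes these to the server process.
                "env": {
                    "DOCLING_SERVICE_URL": service_url,
                    "DOCLING_SERVICE_API_KEY": api_key,
                    "DOCLING_CONVERSION_MODE": mode,
                },
            }
        }
    }


if __name__ == "__main__":
    config = build_client_config(
        "https://your-docling-service.example.com", "your-api-key-here"
    )
    print(json.dumps(config, indent=2))
```

Keep in mind that this places the API key in a plain-text config file, so the file's permissions matter.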

Transport Options

The server can be launched with different transports:

uvx --from docling-mcp docling-mcp-server --transport stdio
uvx --from docling-mcp docling-mcp-server --transport sse
uvx --from docling-mcp docling-mcp-server --transport streamable-http

The server is compatible with Claude for Desktop, LM Studio, and other clients that support MCP.
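
When the server is started from a script rather than by hand, the launch commands above can be assembled programmatically. The following Python sketch builds the same argument list and rejects transport names other than the three listed; the helper name `server_command` is hypothetical and not part of docling-mcp.

```python
import shlex
from typing import List

# The three transports named in this section.
SUPPORTED_TRANSPORTS = ("stdio", "sse", "streamable-http")


def server_command(transport: str) -> List[str]:
    """Build the uvx argument list for launching the server with a transport."""
    if transport not in SUPPORTED_TRANSPORTS:
        raise ValueError(
            f"unsupported transport {transport!r}; expected one of {SUPPORTED_TRANSPORTS}"
        )
    return ["uvx", "--from", "docling-mcp", "docling-mcp-server",
            "--transport", transport]


if __name__ == "__main__":
    for name in SUPPORTED_TRANSPORTS:
        # shlex.join renders the list as a copy-pasteable shell command.
        print(shlex.join(server_command(name)))
```

The returned list can be handed directly to `subprocess.Popen`, which avoids shell quoting issues.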

Use Cases

Convert a batch of research PDFs into structured Markdown so an agent can summarize them or feed them into a RAG index
Extract layout-aware content (sections, lists, tables) from contracts or reports for downstream LLM analysis
Programmatically generate a new DoclingDocument (title, headings, paragraphs, lists) and export it to Markdown as part of an agent workflow
Run document ingestion offline in local mode for sensitive material that cannot leave a controlled environment
Pair with Milvus to build a RAG pipeline that ingests PDFs and serves retrieval to an MCP-aware agent

Example Prompts

"Convert the PDF at https://arxiv.org/pdf/2408.09869 into a DoclingDocument and export it to Markdown."
"Create a new DoclingDocument titled 'Q2 Report', add a section heading 'Summary', and append three paragraphs from the source PDF."
"Parse the attached contract PDF and give me a clean Markdown version, preserving the section structure."
"Build a structured document with a numbered list of the five key findings from this whitepaper, then export it to Markdown."
"Ingest these PDFs into the Milvus collection so I can query them later."

Pros

Official MCP server for Docling, maintained under the docling-project GitHub org and the LF AI & Data Foundation
Flexible deployment: lightweight remote mode (~50MB), full offline local mode, or hybrid with automatic fallback
Supports multiple MCP transports (stdio, SSE, streamable-HTTP) and integrates with Milvus for RAG
Exposes both ingestion (PDF to structured JSON) and document authoring tools in one server

Limitations

Remote mode requires a separately hosted Docling conversion service plus API key, which the project does not provide for free
Local mode pulls in heavy ML dependencies and model downloads (~500MB)
Document authoring tools are stateful and operate on cached document IDs, which can be awkward for one-shot agent calls

Alternatives

zanetworker/mcp-docling: community MCP wrapper around Docling with similar conversion features
saleemh/doc-ingestor: Docling-based MCP server focused on PDF/DOCX/image/audio ingestion to Markdown, optimized for Apple Silicon
kgand/document-parser-mcp: lightweight MCP server for universal document-to-Markdown ingestion