Qdrant MCP Server

Official MCP server from Qdrant that turns the vector database into a semantic memory layer for storing and retrieving information by meaning.

Database | by Qdrant | API Key | active

Overview

The Qdrant MCP server is the official Model Context Protocol implementation maintained by the Qdrant team. It acts as a semantic memory layer on top of the Qdrant vector search engine, letting LLM agents store snippets of information along with metadata and retrieve them later through natural language queries rather than exact keyword matches.

The server exposes two simple tools, qdrant-store and qdrant-find, that wrap upserts and similarity search against a Qdrant collection. Embeddings are generated locally via FastEmbed (default model sentence-transformers/all-MiniLM-L6-v2), so the server works without an external embedding API. It can run against Qdrant Cloud, a self-hosted Qdrant instance, or a fully local on-disk database via QDRANT_LOCAL_PATH.

It supports multiple transports (stdio, SSE, and streamable HTTP), automatic collection creation, a read-only mode for safe retrieval-only deployments, and customizable tool descriptions so the same server can be repurposed as, for example, a code snippet memory or a personal note store.

Tools

  • qdrant-store: Store a piece of information (and optional JSON metadata) in a Qdrant collection. The server embeds the text and upserts it as a point.
  • qdrant-find: Retrieve relevant information from a Qdrant collection using semantic similarity search over embeddings.
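To make the store/find pattern concrete, here is a minimal in-memory sketch of the same idea. It is not the server's implementation: `ToyMemory`, `embed`, and `cosine` are hypothetical names, and the bag-of-words vector is only a stand-in for a real FastEmbed model, so it matches on shared words rather than on meaning.

```python
import math
import re
from collections import Counter
from typing import Optional


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyMemory:
    """In-memory stand-in for a single collection (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.points = []  # list of (vector, text, metadata) tuples

    def store(self, information: str, metadata: Optional[dict] = None) -> None:
        # Analogous to qdrant-store: embed the text and keep it with its metadata.
        self.points.append((embed(information), information, metadata or {}))

    def find(self, query: str, limit: int = 10) -> list:
        # Analogous to qdrant-find: rank stored entries by similarity to the query.
        q = embed(query)
        scored = [(cosine(q, vec), text, meta) for vec, text, meta in self.points]
        scored = [hit for hit in scored if hit[0] > 0]  # drop entries with no overlap
        return sorted(scored, key=lambda hit: hit[0], reverse=True)[:limit]


memory = ToyMemory()
memory.store("Production database uses Postgres 15 on AWS RDS", {"topic": "infra"})
memory.store("Public API rate limiting uses a token bucket", {"topic": "api"})
top_hits = memory.find("rate limiting for the API", limit=1)
print(top_hits[0][1])
```

With a real embedding model, a query such as "throttling requests" could also match the rate-limiting entry even though it shares no words with it, which is the point of searching by meaning.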
Setup Guide

Prerequisites

  • A running Qdrant instance (Qdrant Cloud, self-hosted, or local on-disk mode)
  • uv / uvx installed (or Docker)
  • For remote Qdrant: a QDRANT_API_KEY

Run with uvx

QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="my-collection" \
uvx mcp-server-qdrant

Claude Desktop config (remote Qdrant)

{
  "mcpServers": {
    "qdrant": {
      "command": "uvx",
      "args": ["mcp-server-qdrant"],
      "env": {
        "QDRANT_URL": "https://xyz-example.eu-central.aws.cloud.qdrant.io:6333",
        "QDRANT_API_KEY": "your_api_key",
        "COLLECTION_NAME": "your-collection-name",
        "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2"
      }
    }
  }
}

Claude Desktop config (local on-disk mode)

{
  "mcpServers": {
    "qdrant": {
      "command": "uvx",
      "args": ["mcp-server-qdrant"],
      "env": {
        "QDRANT_LOCAL_PATH": "/path/to/qdrant/database",
        "COLLECTION_NAME": "your-collection-name",
        "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2"
      }
    }
  }
}

Cursor / VS Code (SSE transport)

QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="code-snippets" \
uvx mcp-server-qdrant --transport sse

Then point your MCP client at http://localhost:8000/sse.

Docker

# Run from a checkout of the mcp-server-qdrant repository (the build needs its Dockerfile)
docker build -t mcp-server-qdrant .
docker run -p 8000:8000 \
  -e QDRANT_URL="http://your-qdrant-server:6333" \
  -e COLLECTION_NAME="your-collection" \
  mcp-server-qdrant

Key environment variables

  • QDRANT_URL: remote Qdrant endpoint
  • QDRANT_LOCAL_PATH: path for local on-disk Qdrant (mutually exclusive with QDRANT_URL)
  • QDRANT_API_KEY: API key for authenticated Qdrant instances
  • COLLECTION_NAME: default collection used by the tools
  • EMBEDDING_MODEL: FastEmbed model, defaults to sentence-transformers/all-MiniLM-L6-v2
  • EMBEDDING_PROVIDER: defaults to fastembed
  • QDRANT_SEARCH_LIMIT: max results returned, default 10
  • QDRANT_READ_ONLY: set to true to disable qdrant-store
  • FASTMCP_SERVER_PORT: HTTP/SSE port, default 8000
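The variables above can be assembled into a client config programmatically. The following is a rough sketch, assuming the `mcpServers` layout shown in the Claude Desktop examples; `build_server_entry` is a hypothetical helper, and its rule that exactly one of the URL or local path must be given is a choice of this sketch (the list above only states that the two are mutually exclusive).

```python
import json
from typing import Optional


def build_server_entry(
    collection_name: str,
    qdrant_url: Optional[str] = None,
    qdrant_local_path: Optional[str] = None,
    api_key: Optional[str] = None,
    read_only: bool = False,
) -> dict:
    """Build one `mcpServers` entry from the environment variables listed above."""
    # QDRANT_URL and QDRANT_LOCAL_PATH are mutually exclusive. This helper also
    # rejects the case where neither is given (a choice of this sketch).
    if (qdrant_url is None) == (qdrant_local_path is None):
        raise ValueError("Set exactly one of qdrant_url or qdrant_local_path")

    env = {"COLLECTION_NAME": collection_name}
    if qdrant_url is not None:
        env["QDRANT_URL"] = qdrant_url
        if api_key is not None:
            env["QDRANT_API_KEY"] = api_key
    else:
        env["QDRANT_LOCAL_PATH"] = qdrant_local_path
    if read_only:
        # Disables qdrant-store, leaving only retrieval available.
        env["QDRANT_READ_ONLY"] = "true"

    return {"command": "uvx", "args": ["mcp-server-qdrant"], "env": env}


config = {
    "mcpServers": {
        "qdrant": build_server_entry(
            "your-collection-name",
            qdrant_local_path="/path/to/qdrant/database",
            read_only=True,
        )
    }
}
print(json.dumps(config, indent=2))
```

Failing early on a conflicting URL and local path keeps a misconfiguration from surfacing only when the MCP client tries to start the server.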
Use Cases
  • Give an agent a long-term semantic memory: store user preferences, prior decisions, and conversation summaries, then recall them by meaning across sessions.
  • Build a code snippet librarian: index reusable code blocks with language and project metadata, then fetch the right snippet by intent rather than filename.
  • Internal knowledge retrieval: index runbooks, ADRs, or meeting notes and let an LLM answer questions by pulling the most relevant chunks.
  • Personal note search across Markdown or Obsidian vaults using semantic similarity instead of keyword search.
  • Deploy a read-only retrieval endpoint (QDRANT_READ_ONLY=true) that lets multiple agents query an existing curated collection without writing to it.
Example Prompts
  • "Remember that our production database uses Postgres 15 on AWS RDS with point-in-time recovery enabled."
  • "Find anything we previously stored about rate limiting strategies for the public API."
  • "Store this snippet under collection python-utils with metadata {\"language\": \"python\", \"topic\": \"retry\"}."
  • "Search the meeting-notes collection for decisions about the Q3 roadmap."
  • "What do we know about the customer Acme Corp from past notes?"
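Behind prompts like these, the MCP client sends a `tools/call` request to the server. The sketch below builds such JSON-RPC 2.0 payloads by hand to show roughly what travels over the wire. The argument names `information`, `metadata`, `query`, and `collection_name` are assumptions about the tool schemas and should be checked against what the server reports via `tools/list`; `tool_call`, `store_request`, and `find_request` are hypothetical helpers.

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # JSON-RPC requests need a unique id per request


def tool_call(name: str, arguments: dict) -> dict:
    """Wrap tool arguments in an MCP `tools/call` JSON-RPC 2.0 request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def store_request(
    information: str,
    metadata: Optional[dict] = None,
    collection_name: Optional[str] = None,
) -> dict:
    # Argument names are assumed, not confirmed by the text above.
    arguments = {"information": information}
    if metadata is not None:
        arguments["metadata"] = metadata
    if collection_name is not None:
        arguments["collection_name"] = collection_name
    return tool_call("qdrant-store", arguments)


def find_request(query: str, collection_name: Optional[str] = None) -> dict:
    # Argument names are assumed, not confirmed by the text above.
    arguments = {"query": query}
    if collection_name is not None:
        arguments["collection_name"] = collection_name
    return tool_call("qdrant-find", arguments)


store = store_request(
    "def retry(fn, attempts=3): ...",
    metadata={"language": "python", "topic": "retry"},
    collection_name="python-utils",
)
find = find_request("decisions about the Q3 roadmap", collection_name="meeting-notes")
print(json.dumps(store, indent=2))
print(json.dumps(find, indent=2))
```

In practice the LLM client generates these calls itself from the tool descriptions, so writing them by hand is mainly useful for debugging or testing a deployment.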
Pros
  • Official server maintained by the Qdrant team, kept in sync with the database
  • Can run without external API calls: embeddings are computed locally by FastEmbed and Qdrant can run in local on-disk mode (the embedding model still has to be downloaded once on first use)
  • Supports stdio, SSE, and streamable HTTP transports, so it plugs into Claude Desktop, Cursor, VS Code, and remote agent setups
  • Customizable tool descriptions and a read-only mode let you repurpose the same binary for very different memory use cases
Limitations
  • Only two tools (qdrant-store and qdrant-find); no built-in operations for deleting points, listing collections, or managing payload indexes
  • Embedding provider is effectively limited to FastEmbed models, so using OpenAI or Cohere embeddings requires custom work
  • Requires a Qdrant instance (cloud, self-hosted, or local path) and some understanding of collections and embedding models to use well
Alternatives