Dremio MCP Server

Official Dremio MCP server for querying lakehouse data via natural language. Hosted by Dremio Cloud with OAuth, or self-hosted from the open source repo.

Category: Database | By: Dremio | Auth: OAuth2 | Status: active

Overview

The Dremio MCP server connects AI agents to the Dremio lakehouse platform, letting LLMs explore datasets, inspect schemas, and execute SQL through the Model Context Protocol. It is built and maintained by Dremio and ships in two flavors: a Dremio-hosted endpoint for web clients like Claude.ai and ChatGPT that authenticate via OAuth 2.0, and a self-hosted open source server for local clients like Claude Desktop, Claude Code, Cursor, and VS Code that authenticate via a Personal Access Token.

The hosted variant exposes a focused tool set centered on data discovery and SQL execution: GetUsefulSystemTableNames, GetSchemaOfTable, and RunSqlQuery. The self-hosted variant from the dremio/dremio-mcp repo additionally exposes operational tooling such as job failure analysis, usage reporting, lineage, semantic search, and Prometheus metrics integration. Tools are gated by server modes (FOR_DATA_PATTERNS, FOR_SELF, FOR_PROMETHEUS) that can be combined.

Agent actions are bound by existing Dremio access controls, so an agent only sees data the authenticated user is allowed to see. SQL execution is read-only by default; statements that create or modify objects (such as view creation) must be explicitly enabled (the allow_dml setting in the self-hosted server config). The hosted endpoint is unique per project and is found in the Dremio admin console under Project Overview.

Tools

Tool                       Description
RunSqlQuery                Executes SELECT queries against the Dremio cluster.
GetSchemaOfTable           Retrieves schema information for a table or view.
GetUsefulSystemTableNames  Lists important Dremio system tables that can be queried for metadata and operational information.
GetTableOrViewLineage      Returns upstream and downstream lineage for a table or view (self-hosted).
SemanticSearch             Performs semantic search across catalog objects in the cluster (self-hosted, optional).
GetFailedJobDetails        Analyzes failed or canceled jobs over the past 7 days (self-hosted, FOR_SELF mode).
BuildUsageReport           Generates usage reports grouped by engines or projects (self-hosted, FOR_SELF mode).
GetNameOfJobsRecentTable   Returns the system table name used for recent job information (self-hosted).
GetRelevantMetrics         Retrieves Prometheus metrics relevant to the Dremio cluster (self-hosted, FOR_PROMETHEUS mode).
GetMetricSchema            Returns metric labels and sample values for Prometheus metrics (self-hosted, FOR_PROMETHEUS mode).
RunPromQL                  Executes a PromQL query against the configured Prometheus endpoint (self-hosted, FOR_PROMETHEUS mode).
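
MCP clients invoke tools like the ones listed above by sending a JSON-RPC 2.0 `tools/call` request. The sketch below only builds that message so its shape is visible; in practice an MCP client library sends it for you. The helper name `build_tool_call` and the argument key `table_name` are illustrative assumptions, not taken from the Dremio server: the real argument names come from the tool schema the server returns for `tools/list`.

```python
import json
from itertools import count

# JSON-RPC ids only need to be unique within one session.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "table_name" is a hypothetical argument key; check the tool's input
# schema from `tools/list` before relying on it.
request = build_tool_call("GetSchemaOfTable", {"table_name": "analytics.fact_events"})
print(json.dumps(request, indent=2))
```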

Setup Guide

Option A: Dremio Cloud hosted MCP server (OAuth)

Best for web clients like Claude.ai and ChatGPT.

Prerequisites:

  • Dremio Cloud account with admin access to the target project
  • A Claude (Pro/Team/Enterprise/Max) or ChatGPT (Plus/Enterprise) subscription that supports remote MCP

Steps:

  • In Dremio admin console, create a Native OAuth Application for your project.
  • Open Admin > Project > Project Overview and copy the MCP endpoint URL.
  • In your AI client, add a custom MCP connector that uses the endpoint URL and the client ID of the OAuth application you created.

Endpoint URL pattern:

US:  https://mcp.dremio.cloud/mcp/{project_id}
EU:  https://mcp.eu.dremio.cloud/mcp/{project_id}
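
As a minimal sketch, the URL pattern above can be assembled with a small helper. The function name `hosted_mcp_url` and the region keys `us` and `eu` are illustrative assumptions; the value shown in the Dremio admin console remains the authoritative one, so treat this as a way to sanity-check a URL rather than to derive it.

```python
# Hosts taken from the endpoint pattern above; the dictionary keys are
# illustrative labels, not official region identifiers.
MCP_HOSTS = {
    "us": "mcp.dremio.cloud",
    "eu": "mcp.eu.dremio.cloud",
}


def hosted_mcp_url(project_id: str, region: str = "us") -> str:
    """Return the hosted MCP endpoint URL for a project and region."""
    host = MCP_HOSTS.get(region.lower())
    if host is None:
        raise ValueError(f"unknown region {region!r}; expected one of {sorted(MCP_HOSTS)}")
    project_id = project_id.strip()
    if not project_id:
        raise ValueError("project_id must not be empty")
    return f"https://{host}/mcp/{project_id}"
```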

Option B: Self-hosted from GitHub (PAT)

Best for local clients like Claude Desktop, Claude Code, Cursor, and VS Code.

Prerequisites:

  • Python 3.11+
  • uv package manager
  • Git
  • A Dremio Personal Access Token (PAT)

Install and configure:

git clone https://github.com/dremio/dremio-mcp.git
cd dremio-mcp
uv sync

uv run dremio-mcp-server config create dremioai \
  --uri https://api.dremio.cloud \
  --pat @/path/to/token.file \
  --project-id <project_id>

URI values:

  • https://api.dremio.cloud or prod for Dremio Cloud US
  • https://api.eu.dremio.cloud or prodemea for Dremio Cloud EMEA
  • https://<host>:<port> for self-managed Dremio
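
If you script the setup, the accepted URI values listed above can be checked before they reach the CLI. The following is a sketch only: `resolve_uri` is a hypothetical helper, and accepting plain `http` for self-managed hosts is an assumption made here, not something the listing states.

```python
from urllib.parse import urlparse

# Shorthand names and their URLs, as listed above.
URI_ALIASES = {
    "prod": "https://api.dremio.cloud",
    "prodemea": "https://api.eu.dremio.cloud",
}


def resolve_uri(value: str) -> str:
    """Expand a shorthand name or validate a full Dremio URI."""
    value = value.strip()
    if value in URI_ALIASES:
        return URI_ALIASES[value]
    parsed = urlparse(value)
    # Assumption: self-managed deployments may use http as well as https.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not a valid Dremio URI: {value!r}")
    return value.rstrip("/")
```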

Generate the Claude Desktop config:

uv run dremio-mcp-server config create claude

Or add manually to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "Dremio": {
      "command": "uv",
      "args": [
        "run",
        "--directory", "/absolute/path/to/dremio-mcp",
        "dremio-mcp-server",
        "run"
      ]
    }
  }
}
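
When editing the file by hand, it is easy to overwrite other servers already registered under `mcpServers`. The sketch below merges the entry shown above into an existing config instead of replacing the file. `add_dremio_server` is a hypothetical helper, not part of the Dremio tooling, and the paths passed to it are yours to supply.

```python
import json
from pathlib import Path


def add_dremio_server(config_path: Path, repo_dir: str) -> dict:
    """Add or update the "Dremio" entry while keeping other servers intact."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        config = json.loads(text) if text.strip() else {}

    servers = config.setdefault("mcpServers", {})
    servers["Dremio"] = {
        "command": "uv",
        "args": ["run", "--directory", repo_dir, "dremio-mcp-server", "run"],
    }

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Example call on macOS (commented out so nothing is written on import):
# add_dremio_server(
#     Path.home() / "Library/Application Support/Claude/claude_desktop_config.json",
#     "/absolute/path/to/dremio-mcp",
# )
```

Restart Claude Desktop after changing the file so it picks up the new server entry.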

The server config file lives at $HOME/.config/dremioai/config.yaml. As with the --pat flag, a pat value prefixed with @ points to a file that holds the token:

dremio:
  uri: https://api.dremio.cloud
  pat: "@~/.dremio/token"
  project_id: <project_id>
  enable_search: false
  allow_dml: false
tools:
  server_mode: FOR_DATA_PATTERNS

Combine modes with commas, for example FOR_DATA_PATTERNS,FOR_SELF,FOR_PROMETHEUS.
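
A comma-separated mode string is easy to mistype, so a setup script might validate it before writing config.yaml. This is a minimal sketch: `parse_server_mode` is a hypothetical helper, and it only knows the three mode names mentioned on this page, so the real server may accept values it rejects.

```python
# Mode names as they appear on this page; the server may support others.
KNOWN_MODES = ("FOR_DATA_PATTERNS", "FOR_SELF", "FOR_PROMETHEUS")


def parse_server_mode(value: str) -> list[str]:
    """Split a comma-separated server_mode string and validate each name."""
    modes: list[str] = []
    for part in value.split(","):
        name = part.strip()
        if not name:
            continue  # tolerate stray commas such as "FOR_SELF,"
        if name not in KNOWN_MODES:
            raise ValueError(f"unknown server mode: {name!r}")
        if name not in modes:  # drop duplicates, keep first-seen order
            modes.append(name)
    if not modes:
        raise ValueError("at least one server mode is required")
    return modes


print(",".join(parse_server_mode("FOR_DATA_PATTERNS, FOR_SELF")))
```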

Use Cases

  • Let an analyst chat with their Dremio lakehouse, asking questions in natural language and having the agent generate and run SQL via RunSqlQuery.
  • Discover unfamiliar datasets by asking the agent to list system tables and pull schemas with GetUsefulSystemTableNames and GetSchemaOfTable before writing a query.
  • Trace data lineage across views and sources using GetTableOrViewLineage to support impact analysis or audit reviews.
  • Diagnose performance issues by asking the agent to summarize failed jobs and usage with GetFailedJobDetails and BuildUsageReport in FOR_SELF mode.
  • Correlate query workload with cluster health by combining Dremio job data with Prometheus metrics via RunPromQL and GetRelevantMetrics.

Example Prompts

  • "What are the top 10 customers by revenue in the sales.orders table over the last 90 days?"
  • "Show me the schema of analytics.fact_events and suggest a query to count events per user per day."
  • "List the useful Dremio system tables and tell me which one tracks recent jobs."
  • "Find every job that failed in the past week and group the errors by root cause."
  • "Build a usage report grouped by engine for this project and flag anything over 80% utilization."

Pros

  • Official, maintained by Dremio under the dremio GitHub org with both a hosted endpoint and an open source repo.
  • Honors existing Dremio access controls so agents inherit the authenticated user's permissions.
  • Tool modes (FOR_DATA_PATTERNS, FOR_SELF, FOR_PROMETHEUS) let you scope what an agent can do.
  • Hosted variant supports OAuth 2.0 directly from Claude.ai and ChatGPT without local installation.

Limitations

  • Hosted MCP endpoint requires a paid Claude or ChatGPT subscription that supports remote MCP connectors.
  • DML, semantic search, and Prometheus features are opt-in and need extra configuration.
  • Self-hosted setup requires Python 3.11+, the uv package manager, and manual config files, which is heavier than a single npm install.

Alternatives

  • Snowflake MCP server for natural language queries against Snowflake warehouses.
  • Databricks MCP server for SQL and notebook access on the Databricks lakehouse.
  • Generic Trino or DuckDB MCP servers when you need ad-hoc SQL over a lakehouse without Dremio specifics.