
ElevenLabs MCP Server

Official ElevenLabs MCP server for text-to-speech, voice cloning, transcription, sound effects, music generation, and conversational AI agents.

Communication · by ElevenLabs · API Key · active

Overview

The ElevenLabs MCP server is the official Model Context Protocol (MCP) integration for ElevenLabs. It exposes the company's audio and voice AI APIs to MCP clients such as Claude Desktop, Cursor, Windsurf, and OpenAI Agents. Through a single server, agents can generate speech, clone and design voices, transcribe recordings, isolate speech from noisy audio, compose music, and manage conversational AI agents, which can place outbound phone calls via Twilio or SIP.

The server is implemented in Python and distributed via PyPI as elevenlabs-mcp. It is typically launched with uvx and authenticated using an ElevenLabs API key. Generated audio output can be saved to disk, returned as base64-encoded MCP resources, or both, configurable via environment variables. The default output directory is the user's Desktop.
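
When the output mode returns resources instead of files, the client receives the audio as base64 text and has to decode it. The following is a minimal sketch of that step. It assumes the tool result uses the general MCP embedded-resource shape (a content item with "type": "resource" whose resource carries a base64 "blob"); whether this server's responses match that shape exactly is an assumption, and the sample payload and its URI are made up for illustration.

```python
import base64


def extract_audio_blobs(tool_result: dict) -> list[tuple[str, bytes]]:
    """Return (uri, raw_bytes) pairs for every embedded binary resource."""
    blobs = []
    for item in tool_result.get("content", []):
        if item.get("type") != "resource":
            continue  # skip plain text and other content types
        resource = item.get("resource", {})
        if "blob" in resource:
            raw = base64.b64decode(resource["blob"])
            blobs.append((resource.get("uri", ""), raw))
    return blobs


# Hypothetical tool result: the URI scheme and bytes are placeholders.
sample_result = {
    "content": [
        {"type": "text", "text": "Audio generated."},
        {
            "type": "resource",
            "resource": {
                "uri": "elevenlabs://example.mp3",
                "mimeType": "audio/mpeg",
                "blob": base64.b64encode(b"ID3fake").decode("ascii"),
            },
        },
    ]
}

for uri, data in extract_audio_blobs(sample_result):
    print(uri, len(data), "bytes")
```

The decoded bytes can then be written to any path the client chooses, which is useful when the server runs somewhere that has no access to the user's Desktop.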

Notable capabilities include conversational agent management (create agents, attach knowledge bases, list conversations), outbound calling, instant voice cloning from sample audio, and music composition with multi-step planning. The free ElevenLabs tier provides 10,000 credits per month, sufficient for evaluation.

Tools

Tool                         Description
text_to_speech               Convert text to speech using a chosen voice and model with parameters such as stability and speed.
speech_to_text               Transcribe an audio file to text with optional speaker diarization.
text_to_sound_effects        Generate a sound effect from a text description, 0.5 to 5 seconds.
search_voices                Search the user's voice library by name, description, labels, and category.
list_models                  List available speech synthesis models and their supported languages.
get_voice                    Retrieve detailed information about a specific voice by ID.
voice_clone                  Create an instant voice clone from provided audio sample files.
isolate_audio                Extract and isolate speech from mixed audio containing background noise.
check_subscription           Return current ElevenLabs subscription status and API usage.
create_agent                 Create a conversational AI agent with system prompt, voice, and language settings.
add_knowledge_base_to_agent  Attach documents, URLs, or text to an agent's knowledge base.
list_agents                  List all conversational AI agents on the account.
get_agent                    Retrieve configuration and metadata for a specific agent.
get_conversation             Fetch a conversation record including transcript and analysis.
list_conversations           List agent conversations filtered by date, agent, and pagination.
speech_to_speech             Convert audio from one voice to another while preserving content.
text_to_voice                Generate three voice preview variations from a text description.
create_voice_from_preview    Save a generated preview voice to the permanent voice library.
make_outbound_call           Place an outbound phone call using an agent via Twilio or SIP trunk.
search_voice_library         Search ElevenLabs' shared voice library by gender, accent, and language.
list_phone_numbers           List phone numbers connected to the account and their assigned agents.
play_audio                   Play a WAV or MP3 audio file locally.
compose_music                Generate instrumental music from a text prompt or composition plan.
create_composition_plan      Build a structured music generation plan without consuming credits.
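
MCP clients invoke tools like these by sending JSON-RPC 2.0 requests with the method tools/call. The sketch below shows roughly what such a request could look like for text_to_speech. The argument names ("text", "voice_name") are assumptions chosen for illustration; the authoritative parameter names come from the input schema the server reports via tools/list.

```python
import json
from itertools import count

# JSON-RPC request ids should be unique within a session.
_request_ids = count(1)


def build_tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical arguments: check the server's tool schema for the real names.
request = build_tool_call(
    "text_to_speech",
    {"text": "Hello from MCP.", "voice_name": "Rachel"},
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library builds and sends this message for you; the sketch only makes the wire format visible.
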
Setup Guide

Prerequisites

  • An ElevenLabs API key from elevenlabs.io/app/settings/api-keys. The free tier includes 10,000 credits per month.
  • uv installed: curl -LsSf https://astral.sh/uv/install.sh | sh

Claude Desktop

Add the following to your claude_desktop_config.json:

{
  "mcpServers": {
    "ElevenLabs": {
      "command": "uvx",
      "args": ["elevenlabs-mcp"],
      "env": {
        "ELEVENLABS_API_KEY": "<your-api-key>"
      }
    }
  }
}
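
If the config file already lists other servers, the ElevenLabs entry has to be merged in rather than pasted over the whole file. A minimal sketch of that merge follows. The helper name add_elevenlabs_server is hypothetical, and the caller supplies the config path, since its location depends on the operating system and client.

```python
import json
from pathlib import Path


def add_elevenlabs_server(config_path: Path, api_key: str) -> dict:
    """Merge an ElevenLabs entry into an MCP client config, keeping other servers."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():  # tolerate an empty file
            config = json.loads(text)

    # Create the "mcpServers" section if it is missing, then add our entry.
    servers = config.setdefault("mcpServers", {})
    servers["ElevenLabs"] = {
        "command": "uvx",
        "args": ["elevenlabs-mcp"],
        "env": {"ELEVENLABS_API_KEY": api_key},
    }

    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, add_elevenlabs_server(Path("claude_desktop_config.json"), "<your-api-key>") updates a config file in the current directory. Restart the client afterwards so it picks up the change.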

Windows users may need to enable Developer Mode in Claude Desktop via the Help menu so the server can write audio files.

Other MCP clients (Cursor, Windsurf, OpenAI Agents)

pip install elevenlabs-mcp
python -m elevenlabs_mcp --api-key=YOUR_KEY --print

The --print flag outputs the JSON snippet you can paste into your client's MCP config.

Optional environment variables

  • ELEVENLABS_MCP_BASE_PATH: directory where audio files are written (default: ~/Desktop)
  • ELEVENLABS_MCP_OUTPUT_MODE: files, resources, or both
  • ELEVENLABS_API_RESIDENCY: data region (enterprise plans only)
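
These variables go in the same "env" block as the API key. As a rough sketch, a small helper can assemble that block and reject a mistyped output mode before the client ever starts the server. The function name build_server_env is hypothetical; the variable names and allowed mode values are the ones listed above.

```python
from pathlib import Path

# Allowed values for ELEVENLABS_MCP_OUTPUT_MODE, as listed in this guide.
VALID_OUTPUT_MODES = {"files", "resources", "both"}


def build_server_env(api_key, base_path=None, output_mode=None) -> dict:
    """Assemble the "env" mapping for the server's MCP config entry."""
    if not api_key:
        raise ValueError("an ElevenLabs API key is required")

    env = {"ELEVENLABS_API_KEY": api_key}
    if base_path is not None:
        # Expand "~" so the server receives an absolute home-relative path.
        env["ELEVENLABS_MCP_BASE_PATH"] = str(Path(base_path).expanduser())
    if output_mode is not None:
        if output_mode not in VALID_OUTPUT_MODES:
            raise ValueError(f"output_mode must be one of {sorted(VALID_OUTPUT_MODES)}")
        env["ELEVENLABS_MCP_OUTPUT_MODE"] = output_mode
    return env


print(build_server_env("<your-api-key>", output_mode="both"))
```

Variables that are left unset are simply omitted, so the server falls back to its own defaults.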

Local development

git clone https://github.com/elevenlabs/elevenlabs-mcp
cd elevenlabs-mcp
uv venv && source .venv/bin/activate
uv pip install -e ".[dev]"
cp .env.example .env
./scripts/test.sh

Use Cases

  • Generate narrated audio (voiceovers, podcast intros, audiobook samples) from drafted scripts directly inside an editor like Cursor.
  • Clone a stakeholder's or actor's voice from sample recordings, then produce multilingual voiced content with the cloned voice.
  • Transcribe meeting recordings or interviews to text with speaker diarization for downstream summarization.
  • Build and manage ElevenLabs conversational AI agents, attach knowledge bases, and place outbound phone calls via Twilio.
  • Compose background music and sound effects on demand for video edits, game prototypes, or ad creatives.

Example Prompts

  • "Read this blog post aloud in Rachel's voice and save the MP3 to my Desktop."
  • "Clone a voice from the three WAV files in ~/samples and name it 'CEO Voice'."
  • "Transcribe meeting.mp3 with speaker diarization and give me a summary."
  • "Create a conversational agent named 'Support Bot' with a friendly system prompt and attach our docs site as a knowledge base."
  • "Compose a 30-second upbeat electronic track for a product launch video."

Pros

  • Official server maintained by ElevenLabs covering the full product surface: TTS, STT, voice cloning, voice design, sound effects, music, and conversational agents.
  • Broad client support: Claude Desktop, Cursor, Windsurf, and OpenAI Agents are explicitly documented.
  • Flexible output handling (files, base64 resources, or both) and configurable output directory.
  • The free tier's 10,000 credits per month are enough to evaluate the server without paying.

Limitations

  • Long-running operations such as voice design and audio isolation can hit timeouts, especially in development mode.
  • Generation consumes ElevenLabs credits, so heavy usage requires a paid subscription.
  • Outbound calling features require additional setup (Twilio or SIP trunk plus a configured phone number).
Alternatives