ElevenLabs Player MCP Server

Official ElevenLabs MCP server for text-to-speech, voice cloning, audio transcription, sound effects, and conversational AI agents.

Category: AI/ML. Maintainer: ElevenLabs. Auth: API key. Status: active.

Overview

The ElevenLabs MCP server is the official Model Context Protocol integration for ElevenLabs' audio AI platform. It exposes the ElevenLabs APIs to MCP clients such as Claude Desktop, Cursor, Windsurf, and OpenAI Agents. Through it, an LLM can generate speech from text, clone voices from audio samples, transcribe audio with speaker diarization, isolate voices from noisy recordings, generate sound effects, and design new voices from text descriptions.

Beyond core audio generation, the server also covers ElevenLabs' Conversational AI stack: creating voice agents, attaching knowledge bases, listing conversations, retrieving transcripts, and even initiating outbound phone calls through Twilio or SIP trunk integrations. Output can be saved to disk, returned as base64 resources, or both, controlled via environment variables.
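
When output is returned as base64 resources rather than files, the client has to decode the payload itself. A minimal Python sketch, assuming the audio arrives as a plain base64 string; the helper name save_base64_audio is hypothetical and not part of the server's API, and the exact shape of the MCP response object is not covered here.

```python
import base64
from pathlib import Path


def save_base64_audio(blob_b64: str, out_path: Path) -> int:
    """Decode a base64 audio payload and write it to out_path.

    Returns the number of bytes written. Assumption: the caller has already
    extracted the base64 string from the MCP response.
    """
    # validate=True rejects characters outside the base64 alphabet
    audio_bytes = base64.b64decode(blob_b64, validate=True)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_bytes(audio_bytes)
    return len(audio_bytes)
```

Writing the decoded bytes yourself is only needed in resources mode; in files mode the server saves the audio for you.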

The server is maintained directly by ElevenLabs (the elevenlabs GitHub org) and published to PyPI as elevenlabs-mcp. It uses an API key for authentication and works with the free tier, which includes 10,000 credits per month.

Tools

Tool                         Description
text_to_speech               Convert text to speech using a specified voice and model.
speech_to_text               Transcribe an audio file to text, with optional speaker diarization.
text_to_sound_effects        Generate a sound effect from a text description.
text_to_voice                Create voice previews from a text prompt, returning three variations.
voice_clone                  Create an instant voice clone from one or more audio files.
speech_to_speech             Transform audio from one voice to another while preserving delivery.
isolate_audio                Isolate voice from background noise in an audio file.
search_voices                Search voices in the user's library.
get_voice                    Retrieve details about a specific voice by ID.
list_models                  List all available ElevenLabs synthesis models.
play_audio                   Play an audio file locally on the host machine.
create_agent                 Create a Conversational AI agent with custom voice, prompt, and LLM config.
list_agents                  List all Conversational AI agents in the account.
get_agent                    Get details about a specific Conversational AI agent.
add_knowledge_base_to_agent  Attach a knowledge base (URL, file, or text) to an agent.
list_conversations           List conversations from agents with filtering options.
get_conversation             Retrieve full conversation details including transcript.
make_outbound_call           Initiate an outbound phone call using an ElevenLabs agent over Twilio or SIP.
list_phone_numbers           List phone numbers available in the ElevenLabs account.
check_subscription           Check current ElevenLabs subscription status and credit usage.
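
MCP clients invoke these tools with JSON-RPC 2.0 tools/call requests. The Python sketch below builds such a request for text_to_speech. The helper build_tool_call and the argument name text are illustrative assumptions; the real parameter names come from the input schema the server reports through tools/list.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical arguments: check the tool's schema for the real names.
request = build_tool_call(1, "text_to_speech", {"text": "Hello from MCP."})
```

In practice an MCP client or SDK assembles and sends this message for you; the sketch only shows what travels over the wire.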

Setup Guide

Prerequisites

  • ElevenLabs account and API key (free tier includes 10,000 credits per month)
  • uv Python package manager installed
  • An MCP-compatible client (Claude Desktop, Cursor, Windsurf, etc.)

Installation

For Claude Desktop, add the following to claude_desktop_config.json:

{
  "mcpServers": {
    "ElevenLabs": {
      "command": "uvx",
      "args": ["elevenlabs-mcp"],
      "env": {
        "ELEVENLABS_API_KEY": "<insert-your-api-key-here>"
      }
    }
  }
}
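
If the config file already lists other MCP servers, the entry has to be merged in rather than pasted over the whole file. A minimal Python sketch of that merge; the function name add_elevenlabs_server is hypothetical, and the path to claude_desktop_config.json is passed in explicitly because its location depends on the operating system.

```python
import json
from pathlib import Path


def add_elevenlabs_server(config_path: Path, api_key: str) -> dict:
    """Add the ElevenLabs entry to an MCP client config, keeping other servers."""
    config = {}
    if config_path.exists():
        # Treat an empty file the same as an empty JSON object
        config = json.loads(config_path.read_text(encoding="utf-8") or "{}")
    servers = config.setdefault("mcpServers", {})
    servers["ElevenLabs"] = {
        "command": "uvx",
        "args": ["elevenlabs-mcp"],
        "env": {"ELEVENLABS_API_KEY": api_key},
    }
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

Restart the client after editing the file so that it picks up the new server entry.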

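For Cursor, Windsurf, or other clients, install the package with pip, then run it with --print to output a configuration you can paste into the client's MCP settings: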

pip install elevenlabs-mcp
python -m elevenlabs_mcp --api-key=YOUR_KEY --print

Environment variables

  • ELEVENLABS_API_KEY (required): your ElevenLabs API key
  • ELEVENLABS_MCP_BASE_PATH: default directory for saved audio output (defaults to ~/Desktop)
  • ELEVENLABS_MCP_OUTPUT_MODE: files, resources, or both (default files)
  • ELEVENLABS_API_RESIDENCY: data residency region for enterprise accounts (default us)
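
These variables go into the env block of the server entry. The Python sketch below assembles that block and rejects an output mode other than the three listed above. The function build_env and its parameter names are hypothetical; only the variable names and allowed values come from the list.

```python
from typing import Dict, Optional

VALID_OUTPUT_MODES = ("files", "resources", "both")


def build_env(api_key: str,
              base_path: Optional[str] = None,
              output_mode: str = "files",
              residency: Optional[str] = None) -> Dict[str, str]:
    """Build the env mapping for the server entry, validating the output mode."""
    if not api_key:
        raise ValueError("ELEVENLABS_API_KEY is required")
    if output_mode not in VALID_OUTPUT_MODES:
        raise ValueError(f"output_mode must be one of {VALID_OUTPUT_MODES}")
    env = {
        "ELEVENLABS_API_KEY": api_key,
        "ELEVENLABS_MCP_OUTPUT_MODE": output_mode,
    }
    # Optional settings are left out so the server falls back to its defaults
    if base_path is not None:
        env["ELEVENLABS_MCP_BASE_PATH"] = base_path
    if residency is not None:
        env["ELEVENLABS_API_RESIDENCY"] = residency
    return env
```

Omitting the optional keys, instead of writing the defaults explicitly, keeps the config short and leaves the default behavior to the server.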

Windows notes

Enable Developer Mode in Claude Desktop. If uvx fails to resolve, use the absolute path to the executable in the command field.
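
To find the absolute path for the command field, one option is to look uvx up on PATH. A small Python sketch using the standard library; the helper name resolve_command is hypothetical.

```python
import shutil


def resolve_command(name: str = "uvx") -> str:
    """Return the absolute path of an executable on PATH, or the bare name if not found."""
    found = shutil.which(name)
    return found if found is not None else name


if __name__ == "__main__":
    # Paste the printed value into the "command" field of the server entry
    print(resolve_command())
```

Windows paths contain backslashes, which must be escaped as \\ when written by hand into a JSON config file.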

Use Cases

  • Generate narrated audio for videos, podcasts, or documentation directly from a script in the editor
  • Clone a voice from a short sample, then synthesize new lines in that voice for prototyping voiceovers
  • Transcribe meeting or interview recordings to text with speaker diarization
  • Spin up and configure a Conversational AI agent with a knowledge base, then trigger outbound phone calls
  • Pull conversation transcripts and analytics from existing ElevenLabs voice agents for review

Example Prompts

  • "Generate a 30 second narration of this paragraph using the voice 'Rachel' and save it to my Desktop."
  • "Transcribe ~/recordings/interview.mp3 with speaker diarization and save the transcript to a file."
  • "Clone my voice from these three wav files and call the new voice 'Demo Clone'."
  • "Create a Conversational AI agent named 'Support Bot' with this system prompt, attach our docs site as a knowledge base, and list the phone numbers available in my account."
  • "Show me the last 5 conversations from agent abc123 and pull the full transcript for the most recent one."

Pros

  • Officially maintained by ElevenLabs under the elevenlabs GitHub org
  • Broad coverage spanning TTS, STT, voice cloning, sound effects, and Conversational AI
  • Flexible output modes (files, base64 resources, or both) with configurable save path
  • Works with all major MCP clients and supports outbound phone calls via Twilio or SIP

Limitations

  • All operations consume ElevenLabs credits; heavy usage requires a paid plan
  • Voice design and audio isolation can time out in MCP Inspector even though the underlying job succeeds
  • Windows users may hit uvx path resolution issues and need to hardcode the executable path
Alternatives