Appium MCP Server

Mobile test automation MCP server for iOS and Android. Drive simulators, emulators, and real devices through Appium with AI-assisted element finding and test generation.

Developer Tools, by Appium

Overview

Appium MCP is an official Model Context Protocol server from the Appium project that lets AI assistants automate mobile apps on iOS (XCUITest) and Android (UiAutomator2). It runs as a local stdio MCP server via npx appium-mcp@latest and exposes a broad part of the Appium API: session management, element discovery, gestures, app lifecycle, permissions, screenshots, screen recording, geolocation, and more.

Beyond standard Appium bindings, the server adds AI-specific capabilities: a vision-based element finder (appium_ai) that locates UI elements from natural language descriptions, automatic locator generation (generate_locators), and a test code generator (appium_generate_tests) that converts plain-English scenarios into TestNG/Java with Page Object Model templates. A built-in appium_documentation_query tool provides RAG-style search over Appium docs.

The server supports simulators, emulators, and physical devices, runs in headless NO_UI mode for CI/CD, and integrates with Cursor, Claude Code, Gemini CLI, and any other MCP client. Vision features are opt-in and require an OpenAI-compatible vision model endpoint (Qwen3-VL or Gemini Flash are recommended in the README).

Tools

Tool                            Description
select_device                   Discover and select available iOS or Android devices
prepare_ios_simulator           Boot an iOS simulator and install WebDriverAgent
appium_prepare_ios_real_device  Prepare a physical iOS device with provisioning profiles
appium_session_management       Create, attach, delete, list, or switch Appium sessions
appium_mobile_device_control    Lock/unlock screen, shake device, manage notifications
appium_driver_settings          Read or update Appium driver session settings
appium_context                  List and switch between NATIVE_APP and WEBVIEW contexts
appium_orientation              Get or set device screen orientation
appium_geolocation              Get or spoof GPS coordinates on the device
appium_find_element             Locate elements using accessibility id, id, platform-native strategies, or xpath
appium_ai                       Vision-based element finding from natural language (requires AI_VISION_ENABLED)
generate_locators               Generate intelligent locators for interactive elements on the current screen
appium_gesture                  Execute touch actions: tap, swipe, scroll, pinch zoom, scroll-to-element
appium_drag_and_drop            Drag and drop between elements or coordinates
appium_perform_actions          Execute raw W3C Actions API sequences for complex multi-touch
appium_set_value                Enter text into an input field
appium_get_text                 Extract text content from an element
appium_mobile_keyboard          Show or hide the on-screen keyboard
appium_mobile_clipboard         Read or write the device clipboard
appium_alert                    Handle system alerts: accept, dismiss, or read text
appium_screenshot               Capture a screen or element screenshot
appium_get_page_source          Retrieve the XML page source of the current screen
appium_screen_recording         Start or stop MP4 video recording of the device screen
appium_get_window_size          Get the current display dimensions
appium_mobile_device_info       Query device specs, battery, and system time
appium_app_lifecycle            Install, launch, terminate, clear data, or query app state
appium_mobile_permissions       Get, update, or reset Android runtime and iOS privacy permissions
appium_generate_tests           Generate TestNG/Java test code from natural language scenarios with Page Object Model templates
appium_documentation_query      Semantic RAG search over Appium documentation
appium_skills                   Retrieve ordered setup and troubleshooting guides
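
The appium_perform_actions tool takes raw W3C Actions sequences. As a rough sketch of what such a sequence looks like, the Python helper below builds a one-finger swipe in the W3C WebDriver Actions format. The helper name swipe_actions, the pointer id, and the coordinates are illustrative assumptions, and how the tool expects the sequence to be passed is not stated on this page, so check the tool's input schema before relying on this exact shape.

```python
import json


def swipe_actions(start, end, duration_ms=300, pointer_id="finger1"):
    """Build a one-finger swipe as a W3C WebDriver Actions sequence.

    start and end are (x, y) viewport coordinates; swipe_actions and
    pointer_id are hypothetical names used only for this illustration.
    """
    (x1, y1), (x2, y2) = start, end
    return [
        {
            "type": "pointer",
            "id": pointer_id,
            "parameters": {"pointerType": "touch"},
            "actions": [
                # Move to the start point without animating.
                {"type": "pointerMove", "duration": 0, "x": x1, "y": y1},
                # Touch down, drag to the end point, then lift.
                {"type": "pointerDown", "button": 0},
                {"type": "pointerMove", "duration": duration_ms, "x": x2, "y": y2},
                {"type": "pointerUp", "button": 0},
            ],
        }
    ]


if __name__ == "__main__":
    # Example: swipe up on a 1080x1920 screen (coordinates are made up).
    print(json.dumps(swipe_actions((540, 1600), (540, 400)), indent=2))
```

For common gestures such as tap or scroll, the higher-level appium_gesture tool is likely the simpler choice; a raw sequence like this is mainly useful for multi-finger or precisely timed input.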

Setup Guide

Prerequisites

  • Node.js v22+
  • Java Development Kit v8+
  • Android SDK with USB debugging enabled (set ANDROID_HOME)
  • Xcode 16+ with Command Line Tools (macOS, for iOS testing)
  • Provisioning profile for real iOS devices

Install via npx

npx appium-mcp@latest

Claude Desktop / generic MCP config

{
  "mcpServers": {
    "appium-mcp": {
      "disabled": false,
      "timeout": 100,
      "type": "stdio",
      "command": "npx",
      "args": ["appium-mcp@latest"],
      "env": {
        "ANDROID_HOME": "/path/to/android/sdk",
        "CAPABILITIES_CONFIG": "/path/to/capabilities.json"
      }
    }
  }
}
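
If an MCP config file already lists other servers, the appium-mcp entry has to be merged in rather than pasted over the whole file. The Python sketch below shows one way to do that on an already-parsed config. The function name add_appium_mcp is hypothetical, and the entry it writes only mirrors the fields shown in the example above; the optional "disabled" and "timeout" fields are left out.

```python
import copy
import json


def add_appium_mcp(config, android_home, capabilities_path=None):
    """Return a copy of an MCP config dict with an appium-mcp entry added.

    Other servers under "mcpServers" are left untouched. add_appium_mcp
    is an illustrative helper, not part of appium-mcp itself.
    """
    merged = copy.deepcopy(config)
    env = {"ANDROID_HOME": android_home}
    if capabilities_path:
        env["CAPABILITIES_CONFIG"] = capabilities_path
    merged.setdefault("mcpServers", {})["appium-mcp"] = {
        "type": "stdio",
        "command": "npx",
        "args": ["appium-mcp@latest"],
        "env": env,
    }
    return merged


if __name__ == "__main__":
    # "some-other-server" stands in for whatever is already configured.
    existing = {"mcpServers": {"some-other-server": {"command": "node"}}}
    updated = add_appium_mcp(existing, "/path/to/android/sdk")
    print(json.dumps(updated, indent=2))
```

The function returns a new dict instead of mutating its input, so the original config can be kept as a backup before the result is written to disk.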

Claude Code CLI

claude mcp add appium-mcp -- npx -y appium-mcp@latest

Gemini CLI

gemini mcp add appium-mcp npx -y appium-mcp@latest

Capabilities file example

{
  "android": {
    "appium:app": "/path/to/app.apk",
    "appium:platformVersion": "11.0"
  },
  "ios": {
    "appium:app": "/path/to/app.ipa",
    "appium:platformVersion": "17.0"
  }
}
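
Capability names other than the W3C standard ones (such as platformName) need a vendor prefix like appium:, and a missing prefix is an easy mistake in a hand-written file. The Python sketch below flags such names before the file is handed to the server. It assumes the per-platform layout shown in the example above; the function name unprefixed_capabilities is hypothetical, and the set of standard names is taken from the W3C WebDriver specification.

```python
import json

# Capability names defined by the W3C WebDriver spec; these need no prefix.
W3C_STANDARD = {
    "browserName", "browserVersion", "platformName", "acceptInsecureCerts",
    "pageLoadStrategy", "proxy", "setWindowRect", "timeouts",
    "strictFileInteractability", "unhandledPromptBehavior",
}


def unprefixed_capabilities(presets):
    """Return 'platform.name' for every capability lacking a vendor prefix.

    presets maps a platform key ("android", "ios") to a capabilities dict,
    as in the example file; this layout is assumed from that example.
    """
    problems = []
    for platform, caps in presets.items():
        for name in caps:
            if name not in W3C_STANDARD and ":" not in name:
                problems.append(f"{platform}.{name}")
    return sorted(problems)


if __name__ == "__main__":
    sample = json.loads("""
    {
      "android": {"appium:app": "/path/to/app.apk", "platformVersion": "11.0"},
      "ios": {"appium:app": "/path/to/app.ipa", "appium:platformVersion": "17.0"}
    }
    """)
    print(unprefixed_capabilities(sample))
```

Here the Android preset is deliberately written with a bare platformVersion, so the check reports android.platformVersion.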

Useful environment variables

  • ANDROID_HOME: Path to Android SDK
  • CAPABILITIES_CONFIG: Path to JSON file with device capability presets
  • SCREENSHOTS_DIR: Directory for screenshots and recordings
  • NO_UI=true: Disables the HTML UI, for a reported 60-90% reduction in token usage (recommended for CI/CD)
  • AI_VISION_ENABLED=true: Enables the appium_ai vision tool
  • AI_VISION_API_BASE_URL: OpenAI-compatible vision endpoint
  • AI_VISION_API_KEY: Vision service API key
  • APPIUM_MCP_ON_CLIENT_DISCONNECT: What to do with open sessions when the MCP client disconnects: delete_all (default) or skip
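
The variables above are typically set together in the "env" block of the MCP config. The Python sketch below assembles such a block and refuses to enable vision unless both the endpoint and the API key are supplied. The function name build_env is hypothetical, and the assumption that the server needs both vision values is inferred from the descriptions above, not from the server's code.

```python
def build_env(android_home, no_ui=False, vision_base_url=None, vision_api_key=None):
    """Assemble an env dict for the appium-mcp server entry.

    build_env is an illustrative helper. Vision is switched on only when
    both the endpoint and the key are given; a half-configured vision
    setup raises ValueError instead of being passed through.
    """
    env = {"ANDROID_HOME": android_home}
    if no_ui:
        env["NO_UI"] = "true"
    if vision_base_url or vision_api_key:
        if not (vision_base_url and vision_api_key):
            raise ValueError(
                "vision needs both AI_VISION_API_BASE_URL and AI_VISION_API_KEY"
            )
        env["AI_VISION_ENABLED"] = "true"
        env["AI_VISION_API_BASE_URL"] = vision_base_url
        env["AI_VISION_API_KEY"] = vision_api_key
    return env


if __name__ == "__main__":
    # A CI-style setup: no HTML UI, vision left disabled.
    print(build_env("/path/to/android/sdk", no_ui=True))
```

Environment values are kept as strings ("true", not True) because that is how they reach the server process.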

Use Cases

  • Drive an iOS simulator or Android emulator from a chat prompt to run smoke tests against a freshly built app binary
  • Convert plain-English QA scenarios into ready-to-commit TestNG/Java tests with Page Object Model scaffolding via appium_generate_tests
  • Use vision-based element finding (appium_ai) to automate flows in apps that lack accessibility identifiers
  • Reproduce a customer bug by spoofing geolocation, granting/revoking permissions, and recording video evidence
  • Run unattended mobile regression suites in CI/CD using NO_UI=true mode to minimize token and bandwidth usage

Example Prompts

  • "Launch my Android app on the connected emulator, log in as test@example.com, and screenshot the home screen."
  • "Find the yellow search button at the bottom of the screen and tap it." (uses appium_ai)
  • "Generate a TestNG test that signs up a new user, verifies the welcome email screen, and logs out."
  • "Spoof the device location to 37.7749, -122.4194 and verify the 'Nearby' tab shows San Francisco results."
  • "Record a video while running the checkout flow on iOS, then save the MP4 to ./recordings."

Pros

  • Official server maintained by the Appium project, with broad coverage of the Appium API surface
  • Cross-platform support for iOS and Android on simulators, emulators, and physical devices
  • AI-native extras: vision-based element finding, locator generation, and natural-language test code generation
  • NO_UI mode and session persistence options make it practical for CI/CD pipelines

Limitations

  • Heavy local setup: Node 22+, a JDK, and the Android SDK must be installed first, plus Xcode on macOS for iOS testing
  • Vision features require an external OpenAI-compatible vision model endpoint and API key (not bundled)
  • iOS real-device testing needs a paid Apple developer account and provisioning profiles

Alternatives

  • mobile-mcp: community MCP server for mobile automation across iOS, Android, and simulators
  • Playwright MCP: browser automation for mobile web flows when a native driver is not required
  • Native Appium scripts driven by Claude Code without an MCP layer, for teams already invested in Java/Python Appium clients