Appium MCP Server
Mobile test automation MCP server for iOS and Android. Drive simulators, emulators, and real devices through Appium with AI-assisted element finding and test generation.
Appium MCP is an official Model Context Protocol server from the Appium project that lets AI assistants drive real mobile app automation across iOS (XCUITest) and Android (UiAutomator2). It runs as a local stdio MCP server via npx appium-mcp@latest and exposes the full Appium surface area: session management, element discovery, gestures, app lifecycle, permissions, screenshots, screen recording, geolocation, and more.
Beyond standard Appium bindings, the server adds AI-specific capabilities: a vision-based element finder (appium_ai) that locates UI elements from natural language descriptions, automatic locator generation (generate_locators), and a test code generator (appium_generate_tests) that converts plain-English scenarios into TestNG/Java with Page Object Model templates. A built-in appium_documentation_query tool provides RAG-style search over Appium docs.
The server supports simulators, emulators, and physical devices, runs in headless NO_UI mode for CI/CD, and integrates with Cursor, Claude Code, Gemini CLI, and any other MCP client. Vision features are opt-in and require an OpenAI-compatible vision model endpoint (Qwen3-VL or Gemini Flash are recommended in the README).
Tools
| Tool | Description |
|---|---|
| select_device | Discover and select available iOS or Android devices |
| prepare_ios_simulator | Boot an iOS simulator and install WebDriverAgent |
| appium_prepare_ios_real_device | Prepare a physical iOS device with provisioning profiles |
| appium_session_management | Create, attach, delete, list, or switch Appium sessions |
| appium_mobile_device_control | Lock/unlock screen, shake device, manage notifications |
| appium_driver_settings | Read or update Appium driver session settings |
| appium_context | List and switch between NATIVE_APP and WEBVIEW contexts |
| appium_orientation | Get or set device screen orientation |
| appium_geolocation | Get or spoof GPS coordinates on the device |
| appium_find_element | Locate elements using accessibility id, id, platform-native strategies, or xpath |
| appium_ai | Vision-based element finding from natural language (requires AI_VISION_ENABLED) |
| generate_locators | Generate intelligent locators for interactive elements on the current screen |
| appium_gesture | Execute touch actions: tap, swipe, scroll, pinch zoom, scroll-to-element |
| appium_drag_and_drop | Drag and drop between elements or coordinates |
| appium_perform_actions | Execute raw W3C Actions API sequences for complex multi-touch |
| appium_set_value | Enter text into an input field |
| appium_get_text | Extract text content from an element |
| appium_mobile_keyboard | Show or hide the on-screen keyboard |
| appium_mobile_clipboard | Read or write the device clipboard |
| appium_alert | Handle system alerts: accept, dismiss, or read text |
| appium_screenshot | Capture a screen or element screenshot |
| appium_get_page_source | Retrieve the XML page source of the current screen |
| appium_screen_recording | Start or stop MP4 video recording of the device screen |
| appium_get_window_size | Get the current display dimensions |
| appium_mobile_device_info | Query device specs, battery, and system time |
| appium_app_lifecycle | Install, launch, terminate, clear data, or query app state |
| appium_mobile_permissions | Get, update, or reset Android runtime and iOS privacy permissions |
| appium_generate_tests | Generate TestNG/Java test code from natural language scenarios with Page Object Model templates |
| appium_documentation_query | Semantic RAG search over Appium documentation |
| appium_skills | Retrieve ordered setup and troubleshooting guides |
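
For orientation, an MCP client invokes any of the tools above with a JSON-RPC 2.0 `tools/call` request, and over stdio each message travels as a single line of JSON. The sketch below builds such a request in Python. The tool name comes from the table, but the argument names (`action`, `latitude`, `longitude`) are hypothetical placeholders rather than the server's documented schema; a real client should read the input schema returned by `tools/list`.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids only need to be unique per connection


def tools_call(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as one newline-terminated JSON line."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # The stdio transport is newline-delimited, so the payload must not contain
    # raw newlines; json.dumps escapes any that appear inside string values.
    return json.dumps(request, separators=(",", ":")) + "\n"


# Hypothetical arguments: the real appium_geolocation schema may differ.
line = tools_call(
    "appium_geolocation",
    {"action": "set", "latitude": 37.7749, "longitude": -122.4194},
)
print(line, end="")
```

In practice an MCP client such as Cursor or Claude Code builds these messages for you; the sketch is only meant to show what crosses the stdio pipe.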
Prerequisites
- Node.js v22+
- Java Development Kit v8+
- Android SDK with USB debugging enabled (set ANDROID_HOME)
- Xcode 16+ with Command Line Tools (macOS, for iOS testing)
- Provisioning profile for real iOS devices
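
Because a missing prerequisite tends to surface only as an opaque startup failure, a small preflight check can help. The following is a minimal sketch, not part of Appium MCP: it assumes the tools are exposed on PATH as `node`, `npx`, and `java`, and it only verifies the Node.js major version from the list above.

```python
import re
import shutil
import subprocess


def node_major(version_output: str) -> int:
    """Extract the major version from `node --version` output such as 'v22.3.0'."""
    match = re.match(r"v?(\d+)\.", version_output.strip())
    if match is None:
        raise ValueError(f"unrecognized Node.js version string: {version_output!r}")
    return int(match.group(1))


def missing_tools(required=("node", "npx", "java")) -> list:
    """Return the required executables that are not found on PATH."""
    return [tool for tool in required if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    if "node" not in missing:
        result = subprocess.run(["node", "--version"], capture_output=True, text=True)
        ok = node_major(result.stdout) >= 22
        print("Node.js OK" if ok else "Node.js v22+ required")
```

The Android SDK and Xcode checks are left out on purpose, since how they are installed and located varies by machine.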
Install via npx
npx appium-mcp@latest
Claude Desktop / generic MCP config
{
  "mcpServers": {
    "appium-mcp": {
      "disabled": false,
      "timeout": 100,
      "type": "stdio",
      "command": "npx",
      "args": ["appium-mcp@latest"],
      "env": {
        "ANDROID_HOME": "/path/to/android/sdk",
        "CAPABILITIES_CONFIG": "/path/to/capabilities.json"
      }
    }
  }
}
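
If the same entry has to be written for several machines or CI runners, it can be generated instead of edited by hand. The sketch below reproduces the config above from two paths. The helper name `appium_mcp_entry` is made up for illustration, and treating CAPABILITIES_CONFIG as optional is an assumption.

```python
import json
from typing import Optional


def appium_mcp_entry(android_home: str, capabilities_config: Optional[str] = None) -> dict:
    """Build the mcpServers config for appium-mcp from machine-specific paths."""
    env = {"ANDROID_HOME": android_home}
    if capabilities_config:  # assumption: the variable may be left unset
        env["CAPABILITIES_CONFIG"] = capabilities_config
    return {
        "mcpServers": {
            "appium-mcp": {
                "disabled": False,
                "timeout": 100,
                "type": "stdio",
                "command": "npx",
                "args": ["appium-mcp@latest"],
                "env": env,
            }
        }
    }


config = appium_mcp_entry("/path/to/android/sdk", "/path/to/capabilities.json")
print(json.dumps(config, indent=2))
```

Where the resulting JSON should be saved depends on the MCP client, so the sketch prints it rather than writing a file.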
Claude Code CLI
claude mcp add appium-mcp -- npx -y appium-mcp@latest
Gemini CLI
gemini mcp add appium-mcp npx -y appium-mcp@latest
Capabilities file example
{
  "android": {
    "appium:app": "/path/to/app.apk",
    "appium:platformVersion": "11.0"
  },
  "ios": {
    "appium:app": "/path/to/app.ipa",
    "appium:platformVersion": "17.0"
  }
}
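
Note that every capability in the example carries the `appium:` vendor prefix, which Appium expects on capabilities that are not defined by the W3C WebDriver specification. A small linter can catch a forgotten prefix before a session fails to start. This is a sketch under two assumptions: that the only top-level keys are `android` and `ios`, as in the example, and that the set of standard capability names below (which is partial) is sufficient for your files.

```python
import json

# Partial list of capability names defined by the W3C WebDriver spec;
# these are written without a vendor prefix.
W3C_STANDARD = {
    "browserName", "browserVersion", "platformName", "acceptInsecureCerts",
    "pageLoadStrategy", "proxy", "setWindowRect", "timeouts",
    "unhandledPromptBehavior", "strictFileInteractability",
}


def lint_capabilities(text: str) -> list:
    """Return human-readable problems found in a capabilities JSON document."""
    problems = []
    presets = json.loads(text)
    for platform, caps in presets.items():
        if platform not in ("android", "ios"):
            problems.append(f"unexpected top-level key: {platform}")
            continue
        for key in caps:
            # Vendor-prefixed keys contain a colon, e.g. "appium:app".
            if key not in W3C_STANDARD and ":" not in key:
                problems.append(
                    f"{platform}.{key} is missing a vendor prefix such as 'appium:'"
                )
    return problems


sample = '{"android": {"appium:app": "/path/to/app.apk", "platformVersion": "11.0"}}'
for problem in lint_capabilities(sample):
    print(problem)
```

The sample deliberately drops the prefix from `platformVersion`, so the linter reports that one key.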
Useful environment variables
- ANDROID_HOME: Path to Android SDK
- CAPABILITIES_CONFIG: Path to JSON file with device capability presets
- SCREENSHOTS_DIR: Directory for screenshots and recordings
- NO_UI=true: Disables HTML UI for 60-90% token reduction (recommended for CI/CD)
- AI_VISION_ENABLED=true: Enables the appium_ai vision tool
- AI_VISION_API_BASE_URL: OpenAI-compatible vision endpoint
- AI_VISION_API_KEY: Vision service API key
- APPIUM_MCP_ON_CLIENT_DISCONNECT: delete_all (default) or skip
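
When the server is launched from a script, for example in CI, the variables listed above can be assembled in one place so that a half-configured vision setup fails early. The sketch below is one possible approach; the helper `build_env` and its `ci` and `vision` flags are hypothetical names, and the check simply mirrors the stated requirement that vision needs both an endpoint and an API key.

```python
import os


def build_env(base, *, ci=False, vision=False):
    """Return a copy of `base` with Appium MCP variables set for this run."""
    env = dict(base)
    if ci:
        env["NO_UI"] = "true"  # headless mode, recommended for CI/CD
    if vision:
        required = ("AI_VISION_API_BASE_URL", "AI_VISION_API_KEY")
        missing = [name for name in required if not env.get(name)]
        if missing:
            raise ValueError("vision enabled but missing: " + ", ".join(missing))
        env["AI_VISION_ENABLED"] = "true"
    return env


ci_env = build_env(os.environ, ci=True)
print("NO_UI =", ci_env["NO_UI"])
```

The returned dictionary can then be passed as the `env` argument of `subprocess.Popen` when starting `npx -y appium-mcp@latest`.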
- Drive an iOS simulator or Android emulator from a chat prompt to run smoke tests against a freshly built app binary
- Convert plain-English QA scenarios into ready-to-commit TestNG/Java tests with Page Object Model scaffolding via appium_generate_tests
- Use vision-based element finding (appium_ai) to automate flows in apps that lack accessibility identifiers
- Reproduce a customer bug by spoofing geolocation, granting/revoking permissions, and recording video evidence
- Run unattended mobile regression suites in CI/CD using NO_UI=true mode to minimize token and bandwidth usage
- "Launch my Android app on the connected emulator, log in as test@example.com, and screenshot the home screen."
- "Find the yellow search button at the bottom of the screen and tap it." (uses appium_ai)
- "Generate a TestNG test that signs up a new user, verifies the welcome email screen, and logs out."
- "Spoof the device location to 37.7749, -122.4194 and verify the 'Nearby' tab shows San Francisco results."
- "Record a video while running the checkout flow on iOS, then save the MP4 to ./recordings."
- Official server maintained by the Appium project, with broad coverage of the Appium API surface
- Cross-platform support for iOS and Android on simulators, emulators, and physical devices
- AI-native extras: vision-based element finding, locator generation, and natural-language test code generation
- NO_UI mode and session persistence options make it practical for CI/CD pipelines
- Heavy local setup: Node 22+, JDK, Android SDK, and Xcode are required before anything works
- Vision features require an external OpenAI-compatible vision model endpoint and API key (not bundled)
- iOS real-device testing needs a paid Apple developer account and provisioning profiles
- mobile-mcp: community MCP server for mobile automation across iOS, Android, and simulators
- Playwright MCP: browser automation for mobile web flows when a native driver is not required
- Native Appium scripts driven by Claude Code without an MCP layer, for teams already invested in Java/Python Appium clients