Windows-MCP MCP Server
Lightweight open-source MCP server that lets AI agents control Windows: UI automation, app launching, PowerShell, file ops, and browser DOM access.
Windows-MCP is an open-source MCP server (MIT licensed, maintained by CursorTouch) that bridges LLM agents and the Windows operating system. It exposes native Windows UI automation, keyboard and mouse input, application and window management, file system operations, PowerShell execution, clipboard access, registry operations, and browser DOM extraction across Chrome, Edge, and Firefox. Unlike vision-only computer-use agents, Windows-MCP works without specialized vision models by providing structured snapshots of the desktop with interactive element IDs, while still offering screenshots when needed.
The server runs as a local Python process via the uv package manager. It supports stdio (default for desktop clients), Server-Sent Events, and streamable HTTP transports, with optional bearer token auth, IP allowlisting, TLS, and OAuth 2.0 with PKCE for networked deployments. Action latency is typically 0.2 to 0.5 seconds, and the project ships an installable background service.
Windows-MCP is community-maintained, not an official Microsoft project, but it has seen wide adoption (reportedly 2M+ users via Claude Desktop Extensions) and is listed in the official MCP Registry. It is compatible with Claude Desktop, Claude Code, Cursor, Perplexity Desktop, Gemini CLI, and Qwen Code.
Tools
| Tool | Description |
|---|---|
| Click | Click on screen coordinates or UI elements |
| Type | Type text into focused or specified UI element |
| Scroll | Perform vertical or horizontal scrolling |
| Move | Move the pointer or perform drag operations |
| Shortcut | Execute keyboard shortcut combinations (e.g., Ctrl+C) |
| Wait | Pause execution for a specified duration |
| Screenshot | Capture a fast visual screenshot with cursor and window data |
| Snapshot | Return full desktop state with interactive element IDs and DOM extraction for browsers |
| App | Launch applications and resize/move windows |
| PowerShell | Execute PowerShell commands on the host |
| FileSystem | Read, write, copy, move, delete, list, and search files |
| Scrape | Extract content from a webpage (with SSRF protection) |
| MultiSelect | Select multiple items in bulk |
| MultiEdit | Enter text across multiple input fields in one call |
| Clipboard | Read from or write to the Windows clipboard |
| Process | List and manage running processes |
| Notification | Send Windows toast notifications |
| Registry | Read and modify Windows Registry keys and values |
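An MCP client invokes any of the tools above by sending a JSON-RPC 2.0 `tools/call` request. The sketch below only builds such messages to show their shape; it does not talk to a server. The argument names (`loc`, `text`) are hypothetical placeholders, and the real input schema for each tool should be taken from the server's `tools/list` response.

```python
import json
from itertools import count

_ids = count(1)  # each JSON-RPC request in a session needs its own id


def tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical argument names; check tools/list for the actual schema.
click_msg = tool_call("Click", {"loc": [640, 360]})
type_msg = tool_call("Type", {"text": "hello"})
print(click_msg)
```

In practice an MCP client library handles this framing, so the sketch is mainly useful for reading logs or debugging a transport.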
Prerequisites
- Windows 7, 8, 8.1, 10, or 11
- Python 3.13 or newer
- uv package manager: pip install uv
- English as the default system language (recommended for the App tool)
Quick start
Run the server directly from PyPI:
uvx windows-mcp serve
Claude Desktop configuration
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "windows-mcp": {
      "command": "uvx",
      "args": ["windows-mcp", "serve"]
    }
  }
}
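If claude_desktop_config.json already lists other servers, the entry has to be merged in rather than pasted over the file. A minimal sketch of that merge, assuming the file is plain JSON with a top-level `mcpServers` object; the helper name `add_server` is hypothetical, and the demo writes to a temporary directory instead of the real config location.

```python
import json
import tempfile
from pathlib import Path


def add_server(config_path: Path, name: str, command: str, args: list) -> dict:
    """Add or replace one entry under "mcpServers", keeping other entries intact."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    config.setdefault("mcpServers", {})[name] = {"command": command, "args": args}
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


# Demo against a throwaway file that already contains another server.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "claude_desktop_config.json"
    path.write_text('{"mcpServers": {"other": {"command": "x", "args": []}}}',
                    encoding="utf-8")
    result = add_server(path, "windows-mcp", "uvx", ["windows-mcp", "serve"])
    on_disk = json.loads(path.read_text(encoding="utf-8"))
```

Claude Desktop generally needs a restart before it picks up a changed configuration.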
Run as a background service
windows-mcp install
This registers Windows-MCP as a scheduled task so it starts automatically.
Transports
- stdio (default, recommended for local desktop clients)
- sse (Server-Sent Events for network access)
- streamable-http (recommended for production network deployments)
Optional config file location: ~/.windows-mcp/config.toml. CLI flags and environment variables override file settings.
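The override rule can be pictured as a layered merge. The sketch below is an illustration only, not the project's loader: it assumes CLI flags beat environment variables (the text above does not state their relative order), and the `WINDOWS_MCP_` prefix and the `transport`/`port` keys are hypothetical names.

```python
def resolve_settings(file_settings: dict, env: dict, cli: dict) -> dict:
    """Merge settings: file values first, then environment, then CLI flags."""
    settings = dict(file_settings)

    prefix = "WINDOWS_MCP_"  # hypothetical prefix, assumed for illustration
    for key, value in env.items():
        if key.startswith(prefix):
            # Environment values arrive as strings; no type coercion here.
            settings[key[len(prefix):].lower()] = value

    # A flag the user did not pass is represented as None and is skipped.
    settings.update({k: v for k, v in cli.items() if v is not None})
    return settings


resolved = resolve_settings(
    {"transport": "stdio", "port": 8000},          # from config.toml
    {"WINDOWS_MCP_TRANSPORT": "sse"},              # from the environment
    {"transport": "streamable-http", "port": None},  # from the command line
)
print(resolved)
```

Here the CLI value wins for `transport`, while `port` falls through to the file because neither higher layer sets it.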
Security (for non-stdio transports)
Supports bearer token auth, IP allowlist/CIDR restrictions, TLS/HTTPS, OAuth 2.0 with PKCE, CORS origin configuration, and tool-level access control. SSRF protection is enabled for the Scrape tool.
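To make two of these controls concrete, here is a generic sketch of how a bearer-token check and an IP allowlist with CIDR support are commonly written with the Python standard library. It is not Windows-MCP's implementation; the function names and sample values are hypothetical.

```python
import hmac
import ipaddress


def token_ok(auth_header: str, expected_token: str) -> bool:
    """Accept only 'Bearer <token>' and compare the token in constant time."""
    scheme, _, presented = auth_header.partition(" ")
    return scheme.lower() == "bearer" and hmac.compare_digest(
        presented.encode(), expected_token.encode()
    )


def ip_allowed(client_ip: str, allowlist: list) -> bool:
    """True when client_ip falls inside any allowlisted address or CIDR block."""
    addr = ipaddress.ip_address(client_ip)
    return any(
        addr in ipaddress.ip_network(entry, strict=False) for entry in allowlist
    )


allow = ["127.0.0.1", "10.0.0.0/8"]
checks = (
    ip_allowed("10.1.2.3", allow),
    ip_allowed("192.168.1.5", allow),
    token_ok("Bearer s3cret", "s3cret"),
    token_ok("Basic s3cret", "s3cret"),
)
```

The constant-time comparison avoids leaking how many leading characters of a guessed token were correct, and a single address such as 127.0.0.1 is treated as a one-host network.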
Source install
Clone the repo and point your MCP client at the local uv invocation:
git clone https://github.com/CursorTouch/Windows-MCP
cd Windows-MCP
uv sync
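Pointing a client at the checkout usually means replacing the uvx command with a uv invocation that runs inside the cloned directory. The sketch below prints such an entry under two assumptions that should be checked against the repository's README: that the checkout exposes the same `windows-mcp serve` entry point used in the quick start, and that uv's `--directory` option is used to select the project. The path is a placeholder.

```python
import json


def local_uv_entry(repo_dir: str) -> dict:
    """Build an MCP client entry that runs the server from a local checkout."""
    return {
        "command": "uv",
        # Assumed invocation: run the project's entry point from repo_dir.
        "args": ["--directory", repo_dir, "run", "windows-mcp", "serve"],
    }


# Placeholder path; substitute the directory created by git clone.
config = {"mcpServers": {"windows-mcp": local_uv_entry(r"C:\path\to\Windows-MCP")}}
print(json.dumps(config, indent=2))
```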
Use cases
- Drive end-to-end QA tests across native Windows apps and browsers using Snapshot + Click + Type instead of brittle coordinate-based scripts.
- Automate repetitive desktop workflows like opening Outlook, copying values from Excel, and pasting into a line-of-business app.
- Let an AI agent triage files: search the file system, read documents, and move or rename them based on content.
- Run PowerShell-based system administration tasks (service restarts, disk checks, registry tweaks) through an LLM chat interface.
- Scrape a webpage in Chrome/Edge/Firefox via DOM mode, then trigger a Windows toast Notification when conditions are met.
Example prompts
- "Open Excel, create a new workbook, and fill A1:A5 with the numbers 1 through 5."
- "Take a Snapshot of the active window, find the Submit button, and click it."
- "Use PowerShell to list all services that are stopped and report their names."
- "Search my Documents folder for any PDF containing the word 'invoice' and move them to D:\Archive\Invoices."
- "Scrape the top story from news.ycombinator.com and send me a Windows notification with the headline."
Pros
- Broad capability surface: UI input, screenshots, structured snapshots, PowerShell, files, registry, and DOM all in one server.
- Works with non-vision LLMs thanks to Snapshot returning interactive element IDs, reducing model cost and latency.
- Multiple transports (stdio, SSE, streamable HTTP) plus production security options (bearer tokens, TLS, OAuth 2.0 PKCE, IP allowlists).
- Active project with PyPI distribution, uvx one-line install, and listing in the official MCP Registry.
Cons
- Community-maintained, not an official Microsoft or Anthropic project, so production support is community-driven.
- Requires Python 3.13 or newer, which is ahead of many corporate baselines, and works only on Windows.
- Highly privileged: PowerShell, Registry, and FileSystem tools give broad host access, so misconfiguration or prompt injection can be dangerous.
Alternatives
- mario-andreschak/mcp-windows-desktop-automation: TypeScript MCP server wrapping AutoIt for Windows automation.
- SecretiveShell/mcp-windows: Lighter MCP server focused on the Windows API.
- mukul975/mcp-windows-automation: Community Windows automation MCP with a large tool catalog.