Kapture MCP Server
Browser automation MCP server that drives Chrome via a DevTools extension and WebSocket bridge, letting AI agents navigate, click, fill, and inspect pages.
Kapture is a browser automation MCP server that controls Chrome (or Brave) through a Chrome DevTools extension rather than through a separately launched, often headless, browser driven by a library such as Playwright or Puppeteer. It uses a three-layer architecture: an MCP server that AI clients connect to, a Chrome extension running inside DevTools, and a WebSocket bridge between them on port 61822. Because automation happens inside an existing DevTools session, agents can drive whatever tab the developer already has open, including pages that require an authenticated user session.
The server exposes a small set of action tools (navigate, back, forward, reload, click, hover, fill, select, keypress, elements) plus MCP resources under the kapture:// URI scheme for listing tabs, fetching console logs, screenshots, DOM content, and elements at a given coordinate. Selectors accept both CSS and XPath. Multiple MCP clients (Claude Desktop, Cline, etc.) can connect to the same server instance and share access to the same tabs.
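As a rough illustration of how the kapture:// resources are addressed, the sketch below builds resource URIs from the paths listed in this document and wraps one in a standard MCP `resources/read` request (JSON-RPC 2.0). The helper names and the tab ID `tab-1` are hypothetical. Any query parameters the resources may accept are not covered here, because this document does not specify them.

```python
import json

# Sub-resource names taken from the resource list in this document.
TAB_RESOURCES = {"console", "screenshot", "dom", "elementsFromPoint", "elements"}


def tab_resource_uri(tab_id: str, resource: str = "") -> str:
    """Build a kapture:// URI for a tab, or for one of its sub-resources."""
    if not resource:
        return f"kapture://tab/{tab_id}"
    if resource not in TAB_RESOURCES:
        raise ValueError(f"unknown tab resource: {resource}")
    return f"kapture://tab/{tab_id}/{resource}"


def read_resource_request(uri: str, request_id: int = 1) -> str:
    """Serialize an MCP resources/read request (JSON-RPC 2.0) for the URI."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": uri},
    }
    return json.dumps(payload)


# "tab-1" is a made-up tab ID used only for illustration.
print(read_resource_request(tab_resource_uri("tab-1", "console")))
```

In practice an MCP client library would send this request for you; the sketch only shows the shape of what goes over the wire.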
Kapture is a community project maintained by William Kapke, released under MIT and distributed as the kapture-mcp npm package. It targets developers who want lightweight, visible browser automation tied to a real Chrome profile rather than a sandboxed headless browser.
Tools
| Tool | Description |
|---|---|
| navigate | Navigate the active tab to a URL. |
| back | Navigate back in browser history. |
| forward | Navigate forward in browser history. |
| reload | Reload the current page. |
| click | Click the first element matching a CSS selector or XPath. |
| hover | Hover over the first element matching a selector or XPath. |
| fill | Fill text into an input or textarea matching the selector. |
| select | Select an option in an HTML <select> element. |
| keypress | Send a keyboard event, optionally targeting a specific element. Supports modifier keys. |
| elements | Query all elements matching a CSS selector or XPath, with optional visibility filtering. |
| kapture://tabs | Resource: list all connected browser tabs. |
| kapture://tab/{tabId} | Resource: detailed info about a specific tab. |
| kapture://tab/{tabId}/console | Resource: paginated console logs for a tab. |
| kapture://tab/{tabId}/screenshot | Resource: capture a screenshot of the tab or a specific element. |
| kapture://tab/{tabId}/dom | Resource: HTML content of the tab or a specific element. |
| kapture://tab/{tabId}/elementsFromPoint | Resource: elements located at given x,y coordinates. |
| kapture://tab/{tabId}/elements | Resource: query elements in a tab by selector or XPath. |
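To make the action tools above more concrete, here is a minimal sketch of the MCP `tools/call` request an agent's client would issue for one of them. The tool names come from the table. The argument names (`tabId`, `selector`, `value`) are assumptions for illustration only, since this document does not list each tool's input schema; check the schema the server reports via `tools/list` before relying on them.

```python
import json

# Action tool names as listed in the table above.
ACTION_TOOLS = {
    "navigate", "back", "forward", "reload", "click",
    "hover", "fill", "select", "keypress", "elements",
}


def tool_call_request(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build an MCP tools/call request (JSON-RPC 2.0) for a Kapture action tool."""
    if name not in ACTION_TOOLS:
        raise ValueError(f"not a listed Kapture tool: {name}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical argument names and values; the real schema may differ.
request = tool_call_request(
    "fill",
    {"tabId": "tab-1", "selector": "#email", "value": "user@example.com"},
)
print(json.dumps(request, indent=2))
```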
Prerequisites
- Chrome or Brave browser
- Node.js with npx available
Install the Chrome extension
Either install from the Chrome Web Store, or load unpacked:
- Open chrome://extensions/
- Enable Developer mode
- Click "Load unpacked" and select the extension folder from the repo
Configure your MCP client
Recommended config for Claude Desktop (claude_desktop_config.json):
{
  "mcpServers": {
    "kapture": {
      "command": "npx",
      "args": ["-y", "kapture-mcp@latest", "bridge"]
    }
  }
}
The bridge command auto-starts a server on port 61822 if one is not already running, and reuses an existing server otherwise.
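If your client config already defines other MCP servers, the kapture entry has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge on an in-memory config. The function name and the `other-server` entry are hypothetical, and reading or writing the actual claude_desktop_config.json file is left out because its location varies by platform.

```python
import json


def add_kapture_server(config: dict) -> dict:
    """Return a copy of an MCP client config with the kapture entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    # Entry copied from the recommended config in this document.
    servers["kapture"] = {
        "command": "npx",
        "args": ["-y", "kapture-mcp@latest", "bridge"],
    }
    merged["mcpServers"] = servers
    return merged


# "other-server" stands in for whatever servers are already configured.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
print(json.dumps(add_kapture_server(existing), indent=2))
```

The function returns a new dictionary and leaves the input untouched, so the original config can be kept as a backup before anything is written to disk.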
Connect a browser tab
- Open DevTools (F12) on the page you want to automate
- Switch to the "Kapture" panel
- The extension auto-connects to ws://localhost:61822/mcp
Each connected tab gets a tabId that the agent uses to address it via kapture://tab/{tabId}/... resources.
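If the Kapture panel does not connect, a quick first check is whether anything is listening on port 61822 at all. The sketch below does a plain TCP connection test with the standard library. The function name is hypothetical, and a successful connection only shows that some process has the port open, not that it is the Kapture server or that the WebSocket handshake at /mcp would succeed.

```python
import socket


def is_listening(host: str = "localhost", port: int = 61822, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


print("Port 61822 reachable:", is_listening())
```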
Direct WebSocket (advanced)
{
  "mcpServers": {
    "kapture": {
      "transport": "websocket",
      "url": "ws://localhost:61822/mcp"
    }
  }
}
No API key or auth is required. The extension runs in the DevTools sandbox.
Use cases
- Drive a logged-in web app using your real Chrome profile so the agent inherits existing cookies and SSO sessions
- Have an agent fill out and submit forms while you watch in a visible browser tab, useful for QA and acceptance testing
- Capture page screenshots, console logs, and DOM snapshots from a running tab for bug triage or scraping
- Let multiple MCP clients (for example, Claude Desktop and Cline) collaborate on the same browser session via the shared WebSocket server
- Build lightweight scripted workflows (navigate, click, fill, keypress) without provisioning a headless browser
Example prompts
- "Navigate the active Kapture tab to example.com/login, fill #email with my address, fill #password, and click the submit button."
- "List all Kapture tabs, then return the console logs for the tab whose URL contains 'staging.app'."
- "Take a screenshot of the .pricing-table element on the current tab and describe what you see."
- "Find every visible link on this page using the elements tool and group them by section."
- "Use keypress to send Ctrl+F on the active tab, then fill the find box with 'invoice'."
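The second prompt above (find the tab whose URL contains 'staging.app', then fetch its console logs) boils down to a filter followed by a resource lookup. A minimal sketch of that step is shown below. The shape of the tab records (`tabId` and `url` keys) and the sample data are assumptions made for illustration; the actual payload returned by kapture://tabs may be structured differently.

```python
def console_uris_for(tabs: list[dict], url_fragment: str) -> list[str]:
    """Return console-log resource URIs for tabs whose URL contains url_fragment."""
    return [
        f"kapture://tab/{tab['tabId']}/console"
        for tab in tabs
        if url_fragment in tab.get("url", "")
    ]


# Made-up tab records standing in for the contents of kapture://tabs.
tabs = [
    {"tabId": "tab-1", "url": "https://staging.app/dashboard"},
    {"tabId": "tab-2", "url": "https://example.com/"},
]
print(console_uris_for(tabs, "staging.app"))
```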
Strengths
- Runs inside a real Chrome session via a DevTools extension, so authenticated state and extensions work naturally
- Multi-client support: Claude Desktop, Cline, and other MCP clients can share the same server and tabs
- Small, focused tool surface that is easy for an agent to use correctly
- Open source under MIT with a published kapture-mcp npm package
Limitations
- Community project maintained by a single author, not an official product from Google or Anthropic
- Requires manually opening DevTools and the Kapture panel on each tab you want to automate
- Screenshot, DOM, and console access are exposed as MCP resources rather than tools, which some clients handle less ergonomically
Alternatives
- chrome-devtools-mcp by the Chrome DevTools team, a Node-based MCP server that drives Chrome via CDP
- Playwright MCP by Microsoft for headless and headed browser automation
- browser-tools-mcp by AgentDesk for monitoring browser logs and network traffic from MCP-compatible IDEs