Introducing Hubcap: Chrome DevTools Protocol CLI for AI Agents

17 February 2025

The Chrome DevTools Protocol is one of the most powerful interfaces in the browser. It's what powers Chrome's developer tools, covering everything from network inspection to performance profiling to DOM manipulation. But until now, there hasn't been a good way for AI agents to use all of it.

The journey to Hubcap

It started with the DevTools MCP. The Model Context Protocol has a DevTools integration, but it covers only a subset of what CDP can do: basic navigation, clicking, and form filling. That's fine for simple automation, but CDP has dozens of domains covering network interception, performance profiling, heap snapshots, device emulation, and more. MCP also consumes a lot of context. Every tool has to be described to the model: its purpose, its parameters, its response format. For something as rich as browser automation, this overhead adds up fast.

Browser extensions like Claude in Chrome take a different approach, using vision to see the page and interact like a human. That works well for end-user workflows, but it operates at the UI level. It can't intercept network requests, capture heap snapshots, export HAR files, or do the kind of low-level inspection and automation that the protocol makes possible.

So I started working directly with the Chrome DevTools Protocol. Claude Code can connect to the WebSocket endpoint and send raw CDP commands. This worked surprisingly well. The protocol is comprehensive, with a command for virtually everything the browser can do. But there was friction. Every session involved some trial and error: writing little scripts, installing WebSocket libraries. It worked, but it wasn't smooth.

Then I found cdp-cli, which wraps CDP in discrete shell commands, explicitly as an alternative to the MCP approach. This was closer to what I wanted, but it covers a similar scope to the MCP server: navigation, clicking, screenshots, console output. The full breadth of CDP (network interception, heap snapshots, device emulation, performance traces) wasn't there.

I considered contributing, but I also wanted something that ships as a single binary with no runtime dependencies. No Node.js, no npm. Just brew install and go. That pointed to Go, and at that point it made more sense to start fresh than to try to extend an npm package into something fundamentally different. The cost of building from scratch, including an extensive test suite, is much lower in an agentic coding world.

What Hubcap is

Hubcap is a Go binary that wraps the entire Chrome DevTools Protocol in 118 composable commands. Every command:

Takes simple arguments on the command line
Outputs structured JSON to stdout
Uses semantic exit codes (0 success, 1 error, 2 connection failed, 3 timeout)
Works with standard Unix tools (jq, pipes, && chaining)

This makes it trivial for an AI agent to invoke. No SDK, no library imports, no WebSocket management. Just shell commands that return JSON.

hubcap goto https://example.com
hubcap click '#login-button'
hubcap fill '#email' 'user@example.com'
hubcap screenshot --output page.png

Why this matters for AI agents

The Chrome DevTools Protocol is a superpower. Here's a fraction of what you can do:

Navigate and interact. Go to URLs, click elements, fill forms, press keys, handle dialogs, drag and drop. The basics, but reliable and fast.

Extract anything. Get page title, full HTML source, text from any selector, all links, all images, all tables (parsed into rows and columns), form structures, meta tags. An agent can understand a page without vision.

Wait intelligently. Wait for an element to appear, for text to show up, for an element to disappear, for a JavaScript expression to become truthy, for the network to go idle, for a specific request or response. No arbitrary sleeps.

Capture state. Screenshot the page or a specific element. Export to PDF. Capture the full accessibility tree. Take a DOM snapshot. Get computed styles for any element.

Debug and monitor. Stream console output. Capture JavaScript errors. Monitor network requests in real time. Export HAR files. Get performance metrics. Profile memory with heap snapshots. Record performance traces.

Intercept and modify. Block specific URLs (ads, tracking). Intercept requests and modify responses. Throttle network to simulate slow connections. Mock API responses.

Emulate environments. Pretend to be an iPhone or Pixel. Set geolocation. Change user agent. Go offline. Emulate CSS media features (dark mode, reduced motion).

Assert and verify. Built-in assertions for title, URL, element existence, visibility, text content, element count. Return proper exit codes for test automation.
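
Because assertions report their result through the exit code, a test script needs no output parsing. Here is a minimal sketch of a check runner, assuming only that each command exits 0 on success and non-zero otherwise. run_checks is a hypothetical helper, not part of Hubcap, and the only Hubcap command it mentions is the hubcap assert url form used in the login example in this post.

```shell
# run_checks CMD...: run each argument as a shell command line and tally
# passes and failures purely by exit code. Hypothetical helper, not part of Hubcap.
run_checks() {
  rc_pass=0
  rc_fail=0
  for rc_cmd in "$@"; do
    if sh -c "$rc_cmd" >/dev/null 2>&1; then
      rc_pass=$((rc_pass + 1))
    else
      rc_fail=$((rc_fail + 1))
      echo "FAIL: $rc_cmd" >&2
    fi
  done
  echo "passed=$rc_pass failed=$rc_fail"
  # Succeed only if every check succeeded.
  [ "$rc_fail" -eq 0 ]
}

# Usage (assumes Chrome is running and the page under test is loaded):
#   run_checks "hubcap assert url '/dashboard'"
```

The runner itself returns non-zero when any check fails, so it can sit in a CI step or an agent loop the same way a single assertion would.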

Agentic Experience

Hubcap is designed with AX (Agentic Experience) as the primary concern. Every decision optimises for how an AI agent will use it:

JSON output by default. Agents parse JSON reliably. No regex needed.

Semantic exit codes. An agent can distinguish between "element not found" (1), "Chrome not running" (2), and "operation timed out" (3) without parsing error messages.

Progressive verbosity. Basic commands are terse. Add flags for more detail. An agent can start simple and drill down when needed.

Self-describing. Running hubcap --list shows all commands, hubcap --describe click explains a single command, and hubcap --search network finds relevant ones. The agent can explore the tool's capabilities without external documentation.

No shared state. Each command connects, acts, and disconnects. There is no session to manage and nothing to track between commands beyond the browser itself. Statelessness is agent-friendly.
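
The exit-code contract lets an agent harness decide what to do next without reading error text. Here is a sketch, assuming the four codes listed earlier (0, 1, 2, 3). classify_exit and retry_on_timeout are hypothetical helpers, not Hubcap features, and retrying only on timeouts is just one illustrative policy.

```shell
# classify_exit CODE: name the outcome for an exit code, using the codes listed in this post.
classify_exit() {
  case "$1" in
    0) echo "success" ;;
    1) echo "error" ;;
    2) echo "connection-failed" ;;
    3) echo "timeout" ;;
    *) echo "unknown" ;;
  esac
}

# retry_on_timeout MAX CMD [ARGS...]: run CMD up to MAX times, retrying only
# when it exits with code 3. Any other exit code, including success, is
# returned immediately.
retry_on_timeout() {
  rt_max=$1
  shift
  rt_try=1
  while :; do
    rt_status=0
    "$@" || rt_status=$?
    if [ "$rt_status" -ne 3 ] || [ "$rt_try" -ge "$rt_max" ]; then
      return "$rt_status"
    fi
    rt_try=$((rt_try + 1))
  done
}

# Usage:
#   retry_on_timeout 3 hubcap click '#login-button'
#   echo "outcome: $(classify_exit $?)"
```

A connection failure (2) is deliberately not retried here, since it usually means Chrome needs launching rather than waiting.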

Some things you can do

Scrape a page intelligently

hubcap goto --wait https://news.ycombinator.com
hubcap tables | jq '.tables[0].rows[:10]'

Fill out and submit a form

hubcap goto --wait https://example.com/login
hubcap fill '#username' 'admin'
hubcap fill '#password' 'secret'
hubcap click '#submit'
hubcap waitnav
hubcap assert url '/dashboard'
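
The same steps can be chained with && so the flow stops at the first failing command and its exit code reaches the caller. This sketch uses only the commands shown above. The login_flow wrapper and its parameters are illustrative, and the selectors are specific to the example page.

```shell
# login_flow URL USER PASSWORD: run the login steps in order, stopping at the
# first failure and returning that step's exit code.
# Selectors and the '/dashboard' target come from the example above.
login_flow() {
  hubcap goto --wait "$1" &&
    hubcap fill '#username' "$2" &&
    hubcap fill '#password' "$3" &&
    hubcap click '#submit' &&
    hubcap waitnav &&
    hubcap assert url '/dashboard'
}

# Usage:
#   login_flow https://example.com/login admin secret || echo "login failed with exit code $?"
```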

Monitor what an SPA is doing

hubcap goto --wait https://my-app.com
hubcap network --duration 10s &
hubcap click '#load-data'
wait
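
The background-and-wait pattern above can be wrapped so the captured output lands in a file and both exit codes are checked. This sketch uses only the network and click commands shown. capture_click is a hypothetical wrapper, and the pattern does not guarantee that the monitor has attached before the click fires.

```shell
# capture_click SELECTOR OUTFILE: record network activity to OUTFILE while
# clicking SELECTOR. Hypothetical wrapper around the commands shown above.
capture_click() {
  # Start the monitor in the background and remember its process ID.
  hubcap network --duration 10s > "$2" &
  cc_pid=$!
  cc_click=0
  hubcap click "$1" || cc_click=$?
  # Block until the monitor finishes, keeping its exit code.
  cc_monitor=0
  wait "$cc_pid" || cc_monitor=$?
  # Report a click failure first, otherwise the monitor's status.
  [ "$cc_click" -eq 0 ] || return "$cc_click"
  return "$cc_monitor"
}

# Usage:
#   capture_click '#load-data' requests.json
```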

Test responsive design

hubcap emulate "iPhone 12"
hubcap goto --wait https://example.com
hubcap screenshot --output mobile.png
hubcap emulate "iPad"
hubcap screenshot --output tablet.png
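
With more than a couple of viewports, the sequence generalises to a loop. This sketch uses only the commands above. shoot_devices and its file-naming scheme are illustrative, and the device names must be ones hubcap emulate recognises (only "iPhone 12" and "iPad" appear in this post).

```shell
# shoot_devices URL DEVICE...: for each device, emulate it, load URL, and save
# a screenshot named after the device. Stops at the first failing command.
shoot_devices() {
  sd_url=$1
  shift
  for sd_device in "$@"; do
    # Build a file-friendly name: "iPhone 12" becomes "iphone-12".
    sd_slug=$(printf '%s' "$sd_device" | tr 'A-Z ' 'a-z-')
    hubcap emulate "$sd_device" &&
      hubcap goto --wait "$sd_url" &&
      hubcap screenshot --output "$sd_slug.png" || return $?
  done
}

# Usage:
#   shoot_devices https://example.com "iPhone 12" "iPad"
```

Unlike the sequence above, this reloads the page for every device, which costs a little time but means each screenshot reflects a fresh load under that device's settings.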

Debug performance issues

hubcap goto --wait https://slow-site.com
hubcap metrics | jq '.metrics | {jsHeap: .JSHeapUsedSize, domNodes: .Nodes}'
hubcap trace --duration 2s --output trace.json

Block distractions while testing

hubcap block '*.ads.js' '*.tracking.com' '*.analytics.*'
hubcap goto --wait https://example.com

Getting Hubcap

Install via Homebrew:

brew install tomyan/tap/hubcap

Or with Go:

go install github.com/tomyan/hubcap/cmd/hubcap@latest

Launch Chrome with debugging enabled:

hubcap setup launch

Then use Hubcap:

hubcap tabs      # List open tabs
hubcap goto https://example.com

Using Hubcap with Claude Code

Hubcap includes a Claude Code skill that gives Claude contextual knowledge of all 118 commands. Install the skill after installing Hubcap:

/plugin marketplace add tomyan/claude-skills
/plugin install hubcap@tomyan-skills

Then just ask Claude to do things with the browser:

Open a new tab, go to Hacker News, and screenshot the front page

Claude picks the right Hubcap commands for each step (opening the tab, navigating, taking the screenshot) without you needing to know them.

Try it out

Hubcap is open source and ready to use. If you're working with AI agents and browsers, I'd love to hear how it goes. Feature requests, bug reports, and feedback are all welcome via GitHub issues.

Full documentation and a command reference are at hubcap.tomyandell.dev.