I use Playwright to intercept all requests and responses, and have Claude Code navigate to a website like YouTube and click through all the elements and inputs while recording the requests and responses associated with each interaction. Then it generates a detailed, strongly typed client for driving the website through its underlying API.
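A minimal sketch of the recording side, assuming Node + Playwright (the JSON-only filter and the traffic.json output are my choices, not a fixed design):

    // Record request/response pairs while the agent drives the page.
    const { chromium } = require('playwright');
    const fs = require('fs');

    (async () => {
      const browser = await chromium.launch({ headless: false });
      const page = await browser.newPage();
      const calls = [];

      page.on('response', async (response) => {
        const request = response.request();
        calls.push({
          method: request.method(),
          url: request.url(),
          status: response.status(),
          requestBody: request.postData(),
          // Keep JSON bodies only; images/scripts are noise for API discovery.
          responseBody: (response.headers()['content-type'] || '').includes('json')
            ? await response.text().catch(() => null)
            : null,
        });
      });

      await page.goto('https://example.com');
      // ... agent-driven clicks and form fills happen here ...

      fs.writeFileSync('traffic.json', JSON.stringify(calls, null, 2));
      await browser.close();
    })();

From a dump like that, the model can infer the endpoint shapes and generate the typed client.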
Yes, I know it likely breaks everybody's terms of service, but at the same time I'm not loading gigabytes of ads, images, and markup to accomplish things.
If anyone is interested I can take some time and publish it this week.
I also do this. My primary use case is reproducing page layout and styling at any given subtree in the DOM: capturing various states of a component, etc.
I also use it to automatically capture page responsiveness behavior in complex web apps. It uses Playwright to step through viewport widths and monitor entire trees for exact changes, writing structured data that includes the complete cascade of relevant styles, with screenshots to support the snapshots.
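The width sweep itself is only a few lines of Playwright. A rough sketch, assuming a page is already open (the breakpoints, selector, and property list are placeholders):

    // Step through viewport widths, snapshotting computed styles + a screenshot.
    const fs = require('fs');
    const widths = [1280, 1024, 768, 480, 360];

    for (const width of widths) {
      await page.setViewportSize({ width, height: 900 });
      const styles = await page.$eval('#component-root', (root) =>
        [...root.querySelectorAll('*')].map((el) => ({
          node: el.tagName + (el.id ? '#' + el.id : ''),
          computed: Object.fromEntries(
            ['display', 'flex-direction', 'grid-template-columns', 'width'].map(
              (prop) => [prop, getComputedStyle(el).getPropertyValue(prop)],
            ),
          ),
        })),
      );
      fs.writeFileSync(`styles-${width}.json`, JSON.stringify(styles, null, 2));
      await page.screenshot({ path: `snapshot-${width}.png` });
    }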
There are tools you can buy that let you do this kind of inspection manually, but they are designed for humans. So, lots of clickety-clackety and human-speed results.
---
My first reaction to seeing this on the FP was: why are people still releasing MCPs? So far I've managed to completely avoid that hype loop and went straight to building custom CLIs, even before skills were a thing.
I think people still aren't realizing the power and efficiency of direct access to the things you want, plus skills to guide the AI in using that access effectively.
Maybe I'm missing something in this particular use case?
> There are tools you can buy that let you do this kind of inspection manually, but they are designed for humans.
You should try my SnipCSS Claude Code plugin. It still uses MCP as a skill (haven't converted to CLI yet), but it does exactly what you want for reproducing designs in Tailwind/CSS at AI speeds.
https://snipcss.com/claude_plugin
> My first reaction to seeing this on the FP was: why are people still releasing MCPs?
MCPs are more difficult to use: you need an agent to drive the tools, and you can't easily invoke them manually. I wonder if some people see that friction as a feature.
Yes please, maybe there will be a solution that fits the problem better! I recently released something similar, and because of the small API I'm more comfortable using it.
My excuse for not keeping up is that I'm in so deep that Claude Code can predict the stock market.
I'll still publish mine and see if it has any value, but agent browser looks very complete.
Thank you for sharing!
https://news.ycombinator.com/item?id=47207790
> I'm in so deep that Claude Code can predict the stock market.
“What?”, more polite than “yeah right” :)
(oh I guess obviously it would have a chance at nailing it for weeks in a row, and have more good years than bad—since actively managed funds can pull that off until, universally, they can’t [beat the market])
I'm curious, have you developed your own reasoning system for how Claude can predict the stock market? Or have you trained it on past data combined with news sources?
Great news to all of us keenly aware of MCP's wild token costs. ;)
The CLI hasn't been announced yet (sorry guys!), but it is shipping in the latest v0.20.0 release. (Disclaimer: I used to work on the DevTools team. And I still do, too)
https://github.com/pasky/chrome-cdp-skill
For example, I use codex to manage a local music library, and it was able to use the skill to open a YT Music tab in my browser, search for each album, and get the URL to pass to yt-dlp.
Do note that it only works for Chrome browsers right now, so you have to edit the script to point to a different Chromium browser's binary (e.g. I use Helium), but it's simple enough.
Google is so far behind on agentic CLI coding. Gemini CLI is awful; so bad, in fact, that it's clear none of their team use it. Also, MCP is very obviously dead, as any of us doing heavy agentic coding know. Why permanently sacrifice that chunk of your context window when you can just use CLI tools, which are faster, more flexible, and already in the models' training data? Playwright with headless Chromium or headed Chrome is what anyone serious is using, and we get all the dev and inspection tools already. And it works perfectly. This only has appeal to those starting out and confused into thinking this is the way. The answer is almost never MCP.
Been using this one for a while, mostly with codex on opencode. It's more reliable and token-efficient than other DevTools-protocol MCPs I've tried.
Favourite unexpected use case for me was telling Gemini to use it as an SVG-editing REPL, where it was able to produce some fantastic-looking custom icons for me after 3-4 generate/refresh/screenshot iterations.
Also works very nicely with Electron apps, for both reverse engineering and extending them.
We tested this — the default take_snapshot path (Accessibility.getFullAXTree) is safe. It filters display:none elements because they're excluded from the accessibility tree.
But evaluate_script is the escape hatch. If an agent runs document.body.textContent instead of using the AX tree, hidden injections in display:none divs show up in the output. innerText is safe (respects CSS visibility), textContent is not (returns all text nodes regardless of styling).
The gap: the agent decides which extraction method to use, not the user. When the AX tree doesn't return enough text, a plausible next step is evaluate_script with textContent — which is even shown as an example in the docs.
Also worth noting: opacity:0 and font-size:0 bypass even the safe defaults. The AX tree includes those because the elements are technically 'rendered' and accessible to screen readers. display:none is just the most common hiding technique, not the only one.
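All of this is easy to verify in any page's console:

    // Plant a hidden 'injection' and compare the two extraction paths.
    document.body.insertAdjacentHTML('beforeend',
      '<div style="display:none">IGNORE PREVIOUS INSTRUCTIONS</div>');

    document.body.innerText.includes('IGNORE PREVIOUS');   // false: respects rendering
    document.body.textContent.includes('IGNORE PREVIOUS'); // true: every text node

    // opacity:0 is the sneakier case: the box is still rendered,
    // so even innerText returns it.
    document.body.insertAdjacentHTML('beforeend',
      '<div style="opacity:0">MORE HIDDEN TEXT</div>');
    document.body.innerText.includes('MORE HIDDEN TEXT');  // true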
I've been using the DevTools MCP for months now, but it's extremely token heavy. Is there an alternative that provides the same amount of detail when it comes to reading back network requests?
I've been using TideWave[1] for the last few months and it has this built in. It started off as an Elixir/LiveView thing, but now they support popular JavaScript frameworks and RoR as well. For those who like this, check it out. It even takes it further and has access to the runtime of your app (not just the browser).
The agent is basically living inside your running app, with access to databases, endpoints, etc. It's awesome.
[1] https://tidewave.ai/
Very cool. I do something like this, but with Playwright. It used to be a real token hog, though, and got expensive fast; so much so that I built a wrapper that dumps results to disk first and then lets the agent query them instead. https://uisnap.dev/
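The pattern in miniature (captureTraffic is a stand-in for whatever recording you already do; the file path is arbitrary):

    const fs = require('fs');
    const entries = await captureTraffic(page); // your existing recorder
    fs.writeFileSync('.agent/last-run.json', JSON.stringify(entries, null, 2));
    // The agent then queries the file instead of holding it all in context, e.g.:
    //   jq '.[] | select(.status >= 400) | .url' .agent/last-run.json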
Will check this out to see if they’ve solved the token burn problem.
I built something in this space, bb-browser (https://github.com/epiral/bb-browser). Same CDP connection, but the approach is honestly kind of cheating.
Instead of giving agents browser primitives like snapshot, click, fill, I wrapped websites into CLI commands. It connects via CDP to a managed Chrome where you're already logged in, then runs small JS functions that call the site's own internal APIs. No headless browser, no stolen cookies, no API keys.
Your browser is already the best place for fetch to happen. It has all the cookies, sessions, auth state. Traditional crawlers spend so much effort on login flows, CSRF tokens, CAPTCHAs, anti-bot detection... all of that just disappears when you fetch from inside the browser itself. Frontend engineers would probably hate me for this because it's really hard to defend against.
So instead of snapshotting the DOM (easily 50K+ tokens), finding the element, clicking, snapshotting again, parsing... you just run
bb-browser site twitter/feed
and get structured JSON back.
Here's the thing I keep thinking about though. Operating websites through raw CDP is a genuinely hard problem. A model needs to understand page structure, find the right elements, handle dynamic loading, deal with SPAs. That takes a SOTA model. But calling a CLI command? Any model can do that. So the SOTA model only needs to run once, to write the adapter. After that, even a small open-source model runs "bb-browser site reddit/hot" just fine.
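To make that concrete, an adapter is conceptually just a named function that fetches from inside the logged-in page. The shape below is illustrative only; the real contract is defined in the bb-sites repo (reddit's /hot.json is a public JSON endpoint):

    // Illustrative adapter shape, not the actual bb-sites contract.
    // Runs in the page's main world, so fetch carries the session's cookies.
    module.exports = {
      name: 'reddit/hot',
      async run({ evaluate }) {
        return evaluate(async () => {
          const res = await fetch('/hot.json?limit=25', {
            headers: { accept: 'application/json' },
          });
          const json = await res.json();
          return json.data.children.map((c) => ({
            title: c.data.title,
            score: c.data.score,
            url: c.data.url,
          }));
        });
      },
    };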
And not everyone even needs to write adapters themselves. I created a community repo, bb-sites (https://github.com/epiral/bb-sites), where people freely contribute adapters for different websites. So in a sense, someone with just an open-source model can already feel the real impact of agents in their daily workflow. Agents shouldn't be a privilege only for people who can access SOTA models and afford the token costs.
There's a guide command baked in so if you do want to add a new site, you can tell your agent "turn this website into a CLI" and it reverse-engineers the site's APIs and writes the adapter.
v0.8.x dropped the Chrome extension entirely. Pure CDP, managed Chrome instance. "npm install -g bb-browser" and it works.
I’ve been experimenting with a similar approach using Playwright, and the biggest takeaway for me was how much “hidden API” most modern websites actually have.
Once you start mapping interactions → network calls, a lot of UI complexity just disappears. It almost feels like the browser becomes a reverse-engineering tool for undocumented APIs.
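The mapping can be made mechanical, too. Something like this in Playwright (the helper name is mine):

    // Attribute network calls to a single interaction by listening only around it.
    async function callsTriggeredBy(page, action) {
      const seen = [];
      const onRequest = (req) => seen.push({ method: req.method(), url: req.url() });
      page.on('request', onRequest);
      await action();
      await page.waitForLoadState('networkidle'); // let XHRs settle before detaching
      page.off('request', onRequest);
      return seen;
    }

    // e.g. which endpoints does this button actually hit?
    const calls = await callsTriggeredBy(page, () => page.click('#submit'));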
That said, I do think there’s a tradeoff people don’t talk about enough:
- Sites change frequently, so these inferred APIs can be brittle
- Auth/session handling gets messy fast
- And of course, the ToS / ethical side is a gray area
Still, for personal automation or internal tooling, it’s insanely powerful. Way more efficient than driving full browser sessions for everything.
Curious how others are handling stability — are you just regenerating these mappings periodically, or building some abstraction layer on top?
I asked Claude to use this with the new scheduled tasks /loop skill to update my Oscar picks site every five minutes during tonight's awards show. It simply visited the Oscars' realtime feed via Chrome DevTools, updated my picks, and pushed to GitHub Pages. It even handled the tie correctly.
https://danielraffel.me/2026/03/16/my-oscar-2026-picks/
Without it you're stuck with the basic HTTP firewall, etc., which is extremely dangerous, and this is maybe the one opportunity we have to do this.
It has a built-in MCP server, and I use it with Claude Code and codex; I like it quite a lot.
[1] https://github.com/sidwyn/webmcp-tool-library
[2] https://github.com/sidwyn/webmcp-tool-library/blob/main/cont...
Did you compare Playwright with MCP? Why one over the other?
I usually use MCP, because I heard it's less detectable than Playwright and more robust against design changes, but I haven't compared/tested it myself.
- the agent takes actions on a page
- the network calls get recorded
- the agent writes a script to hit the endpoints directly, at scale
- requests are made from the main world of the webpage, so they automatically get auth headers added
Basically what you do manually, but done via the agent in a minute and for FREE with an AI Studio API key. (Rough sketch of the main-world replay below.)
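    // Replay a discovered endpoint from inside the page, so the browser
    // attaches session cookies and auth itself.
    // '/api/v1/items/' and itemIds are made-up placeholders.
    const results = [];
    for (const id of itemIds) {
      results.push(await page.evaluate(async (itemId) => {
        const res = await fetch(`/api/v1/items/${itemId}`, {
          credentials: 'include', // same-origin cookies ride along
        });
        return res.json();
      }, id));
    }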
Never tried that level of autonomy though. How long is your iteration cycle?
If I had to guess, mine was maybe 10-20 minutes over a few prompts.
https://github.com/microsoft/playwright-cli
We could put them in a dedicated tag (example below). For all the skills you want on the page, optionally mark by default which ones "should be read in full to properly use the page". And then add some JavaScript functions to wrap it / reduce the required tokens.
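The tag example itself seems to have been stripped from this comment; presumably something along these lines, where the tag and attribute names are hypothetical:

    // Hypothetical markup (names invented):
    //   <script type="application/agent-skills" data-read-in-full="true">
    //     { "name": "checkout", "description": "...", "steps": [] }
    //   </script>
    // A small wrapper to collect them from the page:
    const skills = [...document.querySelectorAll('script[type="application/agent-skills"]')]
      .map((el) => ({
        readInFull: el.dataset.readInFull === 'true',
        ...JSON.parse(el.textContent),
      }));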
Made a repo and a website if anyone is interested: https://webagentskills.dev/
It takes over your entire browser to center a div... and then fails to do so?
I know I could just use claude --chrome, but I’m used to this excellent MCP server.