Browser Use

The open-source default for browser-controlling AI agents.

50k+ GitHub stars · In three months from launch · MIT licensed

14% of YC W25 in production · Default browser-agent harness inside the batch

89.1% WebVoyager benchmark · SOTA, ahead of Operator's 87%

Adopted by

OpenAI: Cited in research releases

Anthropic Computer Use: Companion harness

Mastra: Agent framework · YC W25

Autumn: Pricing infra · YC S25

14% of YC W25

In production

5k+ Discord members, 500+ contributors, and growing weekly inside the W25 cohort.[2] [10]

Thesis

The browser is the universal computer-use surface for AI. Every agent that does meaningful work — research, booking, shopping, filing, ops — ends up controlling a browser. Browser Use is the open-source default for browser-controlling agents, and it won distribution the only way an agent framework can: 50,000 GitHub stars in three months.[1] [2] The longer-term bet: Browser Use becomes the Playwright of the agent era — the primitive every AI lab, startup, and enterprise builds on when they need their model to actually do something.
  1. The browser is the universal computer-use surface. Operating systems fragment; the web unifies. Every workflow that matters (Salesforce, Google Workspace, Kayak, Notion, LinkedIn, every internal admin tool) already runs in a browser. When Anthropic shipped Computer Use and OpenAI shipped Operator within three months of each other, they were both betting on the same thesis.[3] [4] The browser is where agents go to do work because the work is already there.

  2. OSS won this category in three months. Browser Use went from launch to 50k+ stars by being the cleanest implementation of the thing every agent dev was hacking together. 14% of YC's W25 batch was already running it in production at memo time.[5] OpenAI Operator is the closed, managed answer; Browser Use is the open one, and 14% of YC W25 chose open.[2]

  3. The next agent stack is built on Browser Use the same way the last one was built on Playwright. Selenium → Playwright → Browser Use is the same arc: each step abstracts away the layer underneath. Browser Use is model-agnostic (Claude, GPT, Gemini, DeepSeek, Qwen, and local Ollama all work the same way), so the bet is on the harness, not the model.[6] [13]

  4. The founders are exactly the right team. Magnus Müller (ETH Zürich, ML researcher, repeat founder, web-scraping hobbyist since he learned to code) and Gregor Žunič (ETH Zürich Data Science MSc, Physics BSc) built the open-source default for what every AI lab is now scrambling to ship. They pivoted from an SEO startup the moment the bigger surface became visible, and shipped Browser Use Cloud within 72 hours of OpenAI's Operator announcement.[7] [9]

Problem

LLMs can think. The web is where they have to act. Every workflow that matters runs behind a UI built for humans.

Every AI agent that does paid work ends up needing to use a browser. Booking a flight, filing a form, reading a SaaS dashboard, posting on LinkedIn, scraping a competitor's pricing page, running a research workflow across ten sites — all of it lives behind interfaces designed for humans, not APIs.

Building that capability in-house is brutal. Selectors break every time a site ships a redesign. Bot-detection systems fingerprint your headless Chrome and block you. Authentication adds friction at every step. Rate limits, parsing errors, and API-key sprawl turn every agent into a maintenance project that nobody on the team wants to own.

The 61% of the web with no public API is exactly the surface where the highest-value agent work happens — and it's the surface that breaks the fastest. The category has been waiting for a Playwright-equivalent primitive — something built for LLM-driven control rather than scripted automation. Browser Use is that primitive.[13]

61% of the web has no public API · The high-value workflow surface, unaddressed by API-first tools

$5B → $30B Web RPA TAM (2024 → 2030) · Agent-driven control expands it another 10×

3 months: Browser Use launch → 50k stars · Operator's January 2025 launch added 25k stars in four days

Browser Use GitHub[2] · YC W25 launch post[7]

Why Now

Anthropic shipped Computer Use in October. OpenAI shipped Operator in January. The category opened in front of our eyes.

Three preconditions converged in the same six months: frontier labs validated the browser surface, the harness caught up with the model, and OSS distribution beat closed distribution.

Operator can go to the web and do tasks for you. We think this will be a really big deal — agents that actually do things on the internet.

Sam Altman[4]

CEO · OpenAI

Instead of making specific tools to help Claude complete individual tasks, we're teaching it general computer skills — allowing it to use a wide range of standard tools and software programs designed for people.

Anthropic[3]

Claude 3.5 Sonnet Computer Use

Our vision is simple: tell your computer what to do, and it gets it done.

Magnus Müller[7]

Co-founder · Browser Use

Frontier labs validated computer use. Anthropic shipped Computer Use with Claude 3.5 Sonnet in October 2024.[3] OpenAI shipped Operator on January 23, 2025.[4] Two of the three biggest labs placed the same bet inside a single quarter: the browser is where agents work. The category that didn't exist on YC application day was suddenly the most-discussed surface in the AI stack.

The harness caught up with the model. Browser Use hit 89.1% on the WebVoyager benchmark — state-of-the-art at the time, ahead of OpenAI Operator's 87%.[8] A year ago, none of this worked reliably. Now the bottleneck has moved from "can the model see the page?" to "what's the cleanest abstraction on top?" That second question is the entire venture opportunity, and it's the question Browser Use answers.

OSS distribution beat closed distribution. When Operator launched in a closed beta available only to $200/month ChatGPT Pro subscribers, Browser Use gained 25k+ stars in four days as developers searched for the open alternative.[2] [7] Magnus and Gregor were building in public on X the entire time, shipping the cloud version within 72 hours of the announcement. The closed launch became Browser Use's largest distribution event.

How It Works

Three layers. One Python import. Any LLM controls a browser.

Step 01

Perception — see the page

Browser Use builds a structured representation of the page from DOM inspection and visual understanding. Interactive elements are identified with high precision; the agent sees what a human would click. The same perception layer works whether the page is a Salesforce admin panel or a Kayak booking flow.
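To make the perception step concrete, here is a minimal sketch of the DOM half of the idea: walk the markup, keep only interactive elements, and give each one a numeric index an agent can refer to. It is an illustration under stated assumptions, not Browser Use's implementation. The class name `InteractiveElementIndexer`, the helper `describe_page`, the tag list, and the sample page are all hypothetical, and the visual-understanding half is left out entirely.

```python
from html.parser import HTMLParser

# Tags treated as interactive in this sketch (an assumption, not Browser Use's actual list).
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}
VOID_TAGS = {"input"}  # elements with no closing tag and no inner text


class InteractiveElementIndexer(HTMLParser):
    """Collects interactive elements in document order and numbers them."""

    def __init__(self):
        super().__init__()
        self.elements = []  # every interactive element seen so far
        self._open = []     # interactive elements still collecting inner text

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS:
            return
        element = {"index": len(self.elements), "tag": tag,
                   "attrs": dict(attrs), "text": ""}
        self.elements.append(element)
        if tag not in VOID_TAGS:
            self._open.append(element)

    def handle_data(self, data):
        text = data.strip()
        if self._open and text:
            current = self._open[-1]
            current["text"] = (current["text"] + " " + text).strip()

    def handle_endtag(self, tag):
        if self._open and self._open[-1]["tag"] == tag:
            self._open.pop()


def describe_page(html: str) -> list[str]:
    """Return one line per interactive element, e.g. '[2] <button> Search flights'."""
    indexer = InteractiveElementIndexer()
    indexer.feed(html)
    lines = []
    for el in indexer.elements:
        label = el["text"] or el["attrs"].get("placeholder") or el["attrs"].get("name") or ""
        lines.append(f'[{el["index"]}] <{el["tag"]}> {label}'.rstrip())
    return lines


# Hypothetical page fragment standing in for a booking flow.
SAMPLE_PAGE = """
<form>
  <input name="origin" placeholder="From">
  <input name="destination" placeholder="To">
  <button type="submit">Search flights</button>
</form>
<p>Not interactive.</p>
<a href="/deals">Deals</a>
"""

for line in describe_page(SAMPLE_PAGE):
    print(line)
```

The indexed lines are what a model would reason over: it can answer with "click element 2" instead of guessing a CSS selector.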

Step 02

Reasoning — pick the next action

Any LLM (Claude, GPT, Gemini, Qwen, DeepSeek-R1, local Ollama) reasons over the perception output to decide the next action. Model-agnostic by design — when a new frontier model ships, Browser Use users get the upgrade for free. No vendor lock-in to GPT-4o (Operator) or Claude 3.5 (Computer Use).
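One way to picture "model-agnostic by design" is a narrow interface that any model client can satisfy. The sketch below is a rough illustration, not Browser Use's API: `ChatModel`, `ScriptedModel`, `choose_action`, and the JSON action format are hypothetical names and conventions, and the "models" are offline stand-ins that return canned replies.

```python
import json
from typing import Protocol


class ChatModel(Protocol):
    """Anything that turns a prompt into text (hypothetical interface for this sketch)."""

    def complete(self, prompt: str) -> str: ...


class ScriptedModel:
    """Stand-in for a real LLM client: returns a canned reply so the sketch runs offline."""

    def __init__(self, name: str, reply: dict):
        self.name = name
        self._reply = reply

    def complete(self, prompt: str) -> str:
        return json.dumps(self._reply)


# Action vocabulary assumed for illustration only.
ALLOWED_ACTIONS = {"click", "type", "done"}


def choose_action(model: ChatModel, goal: str, elements: list[str]) -> dict:
    """Ask the model for the next action and validate the reply before acting on it."""
    prompt = (
        f"Goal: {goal}\n"
        "Interactive elements:\n" + "\n".join(elements) + "\n"
        'Reply with JSON like {"action": "click", "index": 0}.'
    )
    action = json.loads(model.complete(prompt))
    if action.get("action") not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    return action


elements = ["[0] <input> From", "[1] <input> To", "[2] <button> Search flights"]
model_a = ScriptedModel("model-a", {"action": "click", "index": 2})
model_b = ScriptedModel("model-b", {"action": "type", "index": 0, "text": "ZRH"})

# The calling code is identical regardless of which backend sits behind the interface.
for model in (model_a, model_b):
    print(model.name, choose_action(model, "Find flights from Zurich", elements))
```

Because `choose_action` depends only on `complete`, swapping in a new frontier model means writing one adapter rather than touching the agent loop.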

Step 03

Execution — drive the browser

Built on Playwright. Human-like interaction patterns to evade bot detection. Auto-retry engine for transient errors. 30-day session persistence for workflows that need to stay logged in. Cloud handles proxy rotation, parallelism, and observability on top.
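The memo does not describe how the auto-retry engine works, so the following is only a generic sketch of the pattern it names: retry an action on transient failures with exponential backoff, and give up after a fixed number of attempts. `TransientError`, `run_with_retry`, and the default values are hypothetical.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, elements not ready)."""


def run_with_retry(action, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call action(); on TransientError wait base_delay, then 2x, 4x, ... and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last failure to the caller
            sleep(base_delay * 2 ** (attempt - 1))


# Simulated browser action that fails twice before succeeding.
calls = {"count": 0}


def flaky_click():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("element not ready")
    return "clicked"


delays = []  # record the waits instead of actually sleeping
result = run_with_retry(flaky_click, sleep=delays.append)
print(result, delays)  # clicked [0.5, 1.0]
```

Passing `sleep` as a parameter keeps the sketch testable; only errors explicitly marked transient are retried, so genuine bugs still fail fast.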

Open core, managed cloud. Same shape that won Postgres, Next.js, and Kafka their categories.

The OSS core is the install. Run pip install browser-use, point it at the LLM of your choice, and you're running. MIT licensed, no telemetry tax, no vendor lock-in. The reason 14% of YC W25 already runs it in production: the integration takes an afternoon.[2] [5]

Browser Use Cloud is the renewal. Proxy rotation, persistent sessions, parallel execution, and a managed control plane at $30/month — vs. $200/month for ChatGPT Pro (the only way to get Operator at launch).[9] The cloud absorbs the operational complexity the OSS version can't.

Model neutrality is the structural moat. Every closed competitor — Operator, Computer Use — is locked to a single model vendor. Every AI engineer building a real product needs fallbacks, cost ceilings, and the freedom to swap models when a new SOTA lands. Browser Use is the only harness that respects that.[6]
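The "fallbacks and cost ceilings" requirement can be sketched as an ordered list of backends with a per-call budget. This is an illustration of the engineering need, not a Browser Use feature: `Backend`, `complete_with_fallback`, the backend names, and the cost figures are all made up for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Backend:
    name: str
    cost_per_call: float        # illustrative numbers, not real pricing
    call: Callable[[str], str]


def complete_with_fallback(backends, prompt, budget):
    """Try backends in order, skipping any that cost more than the per-call budget."""
    errors = []
    for backend in backends:
        if backend.cost_per_call > budget:
            errors.append(f"{backend.name}: over budget")
            continue
        try:
            return backend.name, backend.call(prompt)
        except Exception as exc:  # any provider failure falls through to the next backend
            errors.append(f"{backend.name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


def unavailable(prompt):
    raise ConnectionError("provider unavailable")


def local_model(prompt):
    return f"local answer to: {prompt}"


backends = [
    Backend("premium", cost_per_call=0.50, call=lambda p: "premium answer"),
    Backend("hosted", cost_per_call=0.02, call=unavailable),
    Backend("local", cost_per_call=0.0, call=local_model),
]

used, answer = complete_with_fallback(backends, "next action?", budget=0.05)
print(used, "->", answer)
```

Here the premium backend is skipped on cost, the hosted one fails, and the local one answers. A harness tied to a single vendor has no second or third entry to fall back to.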

The OSS Distribution Story

OpenAI launched Operator. Browser Use gained 25k stars in four days.

The textbook example of how OSS captures demand created by a closed competitor's launch — and the reason Browser Use is now the default browser-agent harness inside the AI cohort.

The Operator launch was Browser Use's largest distribution event.

January 23, 2025 — Operator ships behind a paywall. OpenAI announces Operator: closed beta, $200/month ChatGPT Pro requirement, GPT-4o locked, no self-hosting, no enterprise API at launch.[4] Every developer who wanted to ship a browser agent against a non-OpenAI model — or didn't want to pay $200/month — went looking for an alternative.

72 hours later — Browser Use Cloud ships. Magnus and Gregor shipped the managed cloud version of Browser Use inside the news cycle. $30/month entry pricing.[9] Model neutral. Self-hostable. Every Operator limitation became a Browser Use feature on the launch page.

Four days later — 25k stars. The GitHub star curve absorbed the entire Operator news cycle.[2] The W25 batch picked Browser Use as the default browser-agent harness. Three months later: 50k stars, 5k Discord members, 500+ contributors. The OSS distribution flywheel is the same one Vercel rode with Next.js, Supabase with Postgres, and Mastra with TypeScript agents — except Browser Use compressed that arc into a quarter.[10]

Market

The densest buyer pool is every AI startup shipping an agent.

Every AI startup shipping an agent ends up needing browser control. YC's W25 batch already adopted Browser Use as the default — 14% in production at memo time.[5] The next ring is the broader Seed-to-Series-B AI cohort: research agents, sales agents, ops agents, coding agents that need to browse. The agent framework boom is mid-cycle; Browser Use sits underneath it as the primitive everyone reaches for.

The long pole is every business workflow that runs in a browser. Web RPA sits at ~$5B today growing to ~$30B by 2030 — and that's just the existing automation pool. The agent layer expands it 10× by making automation viable for the 61% of the web with no API. When agents replace BPO, when every SaaS company ships an agent SKU, when every enterprise has a fleet of agents doing knowledge-worker tasks — Browser Use is the harness underneath.
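As a quick check on what the market figures imply, the sketch below computes the compound annual growth rate between the two endpoints quoted above. Treating 2024 to 2030 as six compounding periods, and applying the 10× expansion to the 2030 figure, are both assumptions; the memo does not state either.

```python
def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate that takes start_value to end_value over `years`."""
    return (end_value / start_value) ** (1 / years) - 1


START_TAM_B = 5.0    # ~$5B web RPA market in 2024 (from the memo)
END_TAM_B = 30.0     # ~$30B by 2030 (from the memo)
YEARS = 2030 - 2024  # assumption: six compounding periods

rpa_cagr = implied_cagr(START_TAM_B, END_TAM_B, YEARS)
print(f"Implied web RPA CAGR: {rpa_cagr:.1%}")  # Implied web RPA CAGR: 34.8%

# Assumption: the 10x agent-layer expansion is applied to the 2030 figure.
expanded_tam_b = END_TAM_B * 10
print(f"Agent-expanded surface: ~${expanded_tam_b:.0f}B")
```

A sixfold increase over six years works out to roughly 35% a year, which is the growth rate the base-case market figure is quietly assuming.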

Near term — AI-native agent stack

YC current and recent cohorts plus the broader AI-native seed-to-Series-A pool. Dense network, technical buyers, OSS-friendly. Browser Use is already the default for 14% of W25 and the OSS pip-install of choice for every new agent project that needs browser control.[5]

Long term — the agent layer for the web

Every browser-based workflow is addressable. Web RPA market: ~$5B (2024) → ~$30B (2030), with agent-driven control expanding the surface another order of magnitude by addressing the 61% of high-value workflows with no public API. Browser Use is the layer underneath every "Operator-for-X" startup that ships in the next three years.[13]

Every AI agent project is a browser-control project in waiting. Browser Use should be the answer by default — and it is, three months into the curve.
Orange Collective

Competitive landscape

Four categories of competition. Browser Use is positioned against all of them.

Each category has a structural limitation — model lock-in, missing harness, missing OSS, or missing AI-native abstraction. Model-agnostic + OSS + Python + Cloud is the answer to all four.

OpenAI Operator

Closed · managed

GPT-4o only, $200/month behind ChatGPT Pro, no self-hosting and no broad API at launch.[4] [9] The exact shape that pushed 25k developers to Browser Use in four days. Operator's existence is bullish for the category and validates the exact wedge Browser Use is built around.

Anthropic Computer Use

Model-native

Claude 3.5 Sonnet ships with computer-use capability natively.[3] Powerful primitive but Claude-only — the developer still has to build the harness around it (virtualization, action execution, retry logic, session management). Browser Use is the harness Claude users reach for when they ship to production.

Playwright / Selenium

Previous generation

The current default for scripted browser automation.[13] Brittle by design — every selector is a contract that the next site redesign breaks. No LLM-first abstractions. Browser Use is built on Playwright for execution and replaces the layer above it — the layer where humans currently write and maintain selector logic.

Multi-On · Skyvern · Adept

Closed agent frameworks

Closed-source agent platforms with their own model harnesses. Smaller communities, narrower model support, less mind-share among AI developers. None matched Browser Use's GitHub growth curve; none matched its YC penetration. The OSS + model-agnostic stance is the structural moat against this cohort.

The closed labs ship the agent. The OSS default ships the harness underneath it. Browser Use is the harness — and the harness compounds across model generations the agent never will.
Orange Collective

Founder deep dive

Two ETH Zürich engineers who built bots before it was a category — and pivoted the moment the bigger surface arrived.

Why Magnus built it. Magnus has been building bots and web automations since he learned to code. ML researcher with publications for Cambridge CARES. Worked R&D in process automation before YC. Saw firsthand that every interesting workflow ended in a browser — and that LLMs had no clean primitive to drive one. "How hard could it be to build the interface between LLMs and the web?"[7] The repeat-founder pattern shows up in execution speed: Browser Use shipped Cloud inside 72 hours of Operator's launch.

Why Gregor built it. Gregor is a hard-problems guy — Data Science MSc at ETH, Physics BSc, previous startups in SEO automation and image generation. He pairs with Magnus on the technical-depth side. The team's first startup was SEO-adjacent; Browser Use is the pivot they made when they realized the deeper surface was browser control itself, not the workflow on top.

Why this team is the right team. Both founders have been inside the brittle web-automation problem for years. They built the OSS solution because they had built every brittle part of this stack at least twice before. The combination — ML research + scraping muscle + ETH technical depth + a repeat-founder bias toward shipping — is the exact profile that wins an OSS category.

Why velocity is a feature. 72 hours from Operator announcement to Browser Use Cloud launch. 25k GitHub stars in four days. Building in public on X the entire time.[11] [12] The market noticed before the press cycle did. The OSS adopt-rate curve and the W25 batch penetration are both downstream of the same founding instinct: ship fast, ship in public, let the community do the marketing.

The long arc. Browser Use becomes the Playwright of the agent era — the primitive every AI lab, startup, and enterprise builds on when they need their model to actually do something. The OSS core wins distribution; Browser Use Cloud captures the operational value; the long-term moat is the operational memory of every business workflow that runs through it. The category is open; the incumbent is being chosen now.

Founder & team

Magnus Müller

Repeat Founder

Co-founder & CEO

Repeat founder. ML researcher and scraping hobbyist since he learned to code. MSc Data Science at ETH Zürich. Published for Cambridge CARES. Built bots, web automations, and process-automation R&D before Browser Use. Saw the gap between LLMs and the web and built the bridge.

Gregor Žunič

Co-founder

MSc Data Science at ETH Zürich; BSc Physics. Previous startups in SEO automation and image generation. Loves solving hard problems. Pivoted with Magnus from an SEO startup the moment the bigger browser-agent surface became obvious.

Risks & mitigations

Risk

Frontier labs absorb the harness — OpenAI Operator, Anthropic Computer Use, or Google Project Mariner ship native browser-control SDKs that obsolete a third-party framework.

Mitigation

The labs are model-locked by design: Operator is GPT-4o only; Computer Use is Claude only. Every AI engineer building a real product needs model neutrality (cost, latency, fallback). Browser Use's structural advantage is that it is model-agnostic: Claude, GPT, Gemini, DeepSeek, Qwen, and local Ollama work the same way. The OSS distribution moat is already five figures of stars deep; absorbing it would require either acquiring the team or shipping a strictly superior open-source product, which the labs structurally avoid.

Risk

Bot detection and authentication friction. Cloudflare, Datadome, and ReCAPTCHA keep raising the floor; sites lock down content and break automation.

Mitigation

Browser Use already invests heavily in undetectable browsers, residential proxy rotation, and human-like interaction patterns — the same problem space the company's founders have been working in since before YC. The cloud product captures the willingness-to-pay generated by exactly this difficulty: the harder the problem gets, the more customers move from OSS to managed. The risk is also the wedge.

Risk

OSS monetization — the perennial question. Will the cloud product and enterprise tier compound, or will the OSS core absorb all the upside?

Mitigation

Browser Use Cloud is positioned where the OSS version is structurally weak: parallelism, proxy rotation, persistent sessions, observability, and the control plane. Same playbook as Vercel + Next.js, Supabase + Postgres, and Mastra + agents. The OSS core builds distribution; the managed plane scales ARR per customer. $30/month entry pricing is intentionally low — the trajectory is enterprise SKUs, fleet management, and an observability layer.

Risk

Web APIs eventually cover the surface — every site ships a public API, browser automation becomes unnecessary.

Mitigation

The trend runs the other way. 61% of the high-value web has no public API today, and the LLM era has, if anything, accelerated the lock-down (Reddit, Stack Overflow, X, and news sites are all closing or pricing access). Browser-level access is the open access path, the same way Selenium and Playwright became infrastructure because the web kept being the web. Agent-driven browser control is the LLM-era restatement of the same bet.

What we're watching

  • Star count past 100k and the next 10× of YC penetration — does the OSS curve hold as the agent stack matures?
  • Cloud revenue ramp — paid conversion of the OSS base, enterprise SKUs, fleet management features.
  • Benchmark trajectory — does Browser Use stay ahead of Operator and Computer Use on WebVoyager and successor benchmarks?
  • Strategic positioning vs. OpenAI / Anthropic / Google — partnership, acquisition offer, or independent compounding?

References

  [1] Browser Use: Product homepage
  [2] GitHub: browser-use/browser-use (MIT license, 50k+ stars at memo time, 98k+ today)
  [3] Anthropic: Introducing computer use, Claude 3.5 Sonnet (Oct 22, 2024)
  [4] OpenAI: Introducing Operator (Jan 23, 2025)
  [5] Y Combinator: Browser Use company profile (W25)
  [6] Browser Use: Supported models documentation
  [7] Y Combinator Launches: "Browser Use: Open-Source Alternative to OpenAI Operator"
  [8] Browser Use: State-of-the-art on WebVoyager (89.1% success across 586 web tasks)
  [9] Browser Use Cloud: Hosted version ($30/month vs. $200 ChatGPT Pro for Operator)
  [10] Browser Use Discord: Community (5k+ active members)
  [11] Magnus Müller on X: @mamagnus00 (building in public)
  [12] Gregor Žunič on X: @gregpr07 (building in public)
  [13] Microsoft Playwright: the previous-generation browser-automation primitive
  [14] WebVoyager: Benchmark for vision-language web agents