{{< resource-info >}}
GitHub: browser-use/browser-use | Stars: 94,731 | License: MIT | Version: 0.12.7
## Introduction

Writing and maintaining Selenium scripts for modern web automation is a slow death by a thousand selectors. A class name changes, a button moves, and your entire pipeline collapses at 3 AM. Browser Use, an open-source Python framework launched in late 2024 by Magnus Müller and Gregor Žunič, takes a different approach: it hands the browser controls to a large language model and lets the AI figure out what to click, type, and read. With 94,731 GitHub stars, 319 contributors, and an 89.1% success rate on the WebVoyager benchmark, it has become the de facto open-source standard for AI-driven browser automation. This tutorial covers the setup, real benchmark data, integration with popular LLMs, and a head-to-head comparison against Selenium, Puppeteer, and Scrapy.

---

## What Is Browser Use?

Browser Use is a Python library (≥3.11) that connects any LangChain-compatible LLM to a real web browser via Playwright. Instead of hardcoding CSS selectors or XPath expressions, you describe the task in natural language, for example “find the cheapest flight from NYC to SFO next Friday”, and the agent handles navigation, form filling, clicking, and data extraction autonomously.

### Key Design Principles

- Model-agnostic: Works with OpenAI GPT-4o/5.1, Anthropic Claude Sonnet 4, Google Gemini 3 Flash, and local models via LiteLLM.
- DOM distillation: Strips pages to essential interactive elements before sending them to the LLM, cutting token consumption by up to 60%.
- Multi-tab support: Agents can operate across multiple browser tabs simultaneously.
- Persistent memory: Maintains context and conversation history across navigation steps.
- Built on Playwright: Inherits all Playwright features, including stealth mode, proxy support, network interception, and video recording.

---

## How Browser Use Works

Browser Use operates on a continuous observe → plan → act → verify loop.

### Architecture Overview

```
┌─────────────┐    DOM + Screenshot     ┌─────────────┐
│   Browser   │ ──────────────────────> │     LLM     │
│ (Playwright)│                         │(Claude/GPT/)│
│             │ <────────────────────── │   Gemini    │
└─────────────┘   Action (click/type)   └─────────────┘
       ↑                                       │
       └────────── Page State Change ──────────┘
```

1. **Observe**: Browser Use captures the current page state (DOM and a screenshot) from the Playwright-controlled browser.
2. **Distill**: The DOM is filtered to only interactive elements (buttons, inputs, links), reducing noise.
3. **Reason**: The LLM receives the distilled page state and the user's goal, then plans the next action.
4. **Execute**: Browser Use translates the LLM's decision into Playwright API calls (`page.click()`, `page.fill()`).
5. **Verify**: The loop repeats until the task is complete or `max_steps` is reached.

A minimal agent that runs this loop looks like this:

```python
from browser_use import Agent, Browser
from langchain_openai import ChatOpenAI
import asyncio


async def main():
    browser = Browser()
    agent = Agent(
        task="Find the number of stars of the browser-use repo",
        llm=ChatOpenAI(model="gpt-4.1"),
        browser=browser,
    )
    # run() drives the observe/plan/act/verify loop until the task ends
    result = await agent.run()
    print(result)


if __name__ == "__main__":
    asyncio.run(main())
```
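The Distill step described above can be approximated with the standard library alone. The sketch below is not Browser Use's actual implementation: the `INTERACTIVE_TAGS` set, the `distill` helper, and the choice of label attributes are all assumptions made for illustration. It only shows the general idea of reducing a page to a short, indexed list of interactive elements before handing it to an LLM.

```python
from html.parser import HTMLParser

# Tags treated as "interactive" in this sketch (an assumption, not Browser Use's real list).
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


class InteractiveElementCollector(HTMLParser):
    """Collect interactive elements and assign each one a numeric index."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS:
            return
        attr_map = dict(attrs)
        # Pick the first attribute that gives the element a human-readable label.
        label = (
            attr_map.get("aria-label")
            or attr_map.get("name")
            or attr_map.get("href")
            or ""
        )
        self.elements.append({"index": len(self.elements), "tag": tag, "label": label})


def distill(html: str) -> list:
    """Return only the interactive elements of an HTML document (hypothetical helper)."""
    collector = InteractiveElementCollector()
    collector.feed(html)
    return collector.elements


page = """
<html><body>
  <div class="hero"><p>Welcome to the shop</p></div>
  <a href="/pricing">Pricing</a>
  <input name="email" type="text">
  <button aria-label="Subscribe">Go</button>
</body></html>
"""

for element in distill(page):
    print(element)
```

The decorative `div` and `p` are dropped, and the model would only see three indexed elements it can refer to by number, which is where the token savings come from.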
---

## Installation & Setup

### Step 1: Install Browser Use

```bash
# Using uv (recommended)
uv init
uv add browser-use
uv sync

# Using pip
pip install browser-use

# Install Chromium if not already present
playwright install chromium
```

### Step 2: Configure Environment Variables

```bash
# .env file
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key
GOOGLE_API_KEY=your-google-api-key

# Optional: Browser Use Cloud for stealth browsers
BROWSER_USE_API_KEY=your-cloud-key
```

### Step 3: Run Your First Agent

```python
import asyncio
from browser_use import Agent, Browser, ChatBrowserUse


async def main():
    browser = Browser()
    agent = Agent(
        task="List the top 20 posts on Hacker News today with their points",
        llm=ChatBrowserUse(),
        browser=browser,
    )
    result = await agent.run()
    print(result.output)


if __name__ == "__main__":
    asyncio.run(main())
```

### Docker Setup (Production)

```dockerfile
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
RUN pip install browser-use playwright
RUN playwright install chromium
RUN playwright install-deps
COPY . .
CMD ["python", "agent.py"]
```

```yaml
# docker-compose.yml
version: '3.8'
services:
  browser-use:
    build: .
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - ./scripts:/app
    command: python agent.py
```

---

## Integration with Popular Tools

### OpenAI GPT-4o / GPT-5.1

```python
from browser_use import Agent, Browser
from langchain_openai import ChatOpenAI
import asyncio


async def search_flights():
    agent = Agent(
        task="Find the cheapest flight from NYC to London next week",
        llm=ChatOpenAI(model="gpt-4o", temperature=0),
        browser=Browser(),
    )
    return await agent.run()


asyncio.run(search_flights())
```

### Anthropic Claude

```python
from browser_use import Agent, Browser
from langchain_anthropic import ChatAnthropic
import asyncio


async def extract_data():
    agent = Agent(
        task="Extract all pricing plans from example.com/pricing",
        llm=ChatAnthropic(model="claude-sonnet-4-6"),
        browser=Browser(),
    )
    result = await agent.run()
    print(result.output)


asyncio.run(extract_data())
```

### Google Gemini 3 Flash

```python
from browser_use import Agent, Browser
from langchain_google_genai import ChatGoogleGenerativeAI
import asyncio


async def research_topic():
    agent = Agent(
        task="Research the latest AI news and summarize top 5 stories",
        llm=ChatGoogleGenerativeAI(model="gemini-3-flash-preview"),
        browser=Browser(),
    )
    return await agent.run()


asyncio.run(research_topic())
```

### Ollama (Local Models)

```python
from browser_use import Agent, Browser
from langchain_ollama import ChatOllama
import asyncio


async def local_automation():
    agent = Agent(
        task="Fill out the contact form on example.com/contact",
        llm=ChatOllama(model="qwen2.5:72b"),
        browser=Browser(),
    )
    return await agent.run()


asyncio.run(local_automation())
```

### Playwright (Hybrid Automation)

```python
from playwright.async_api import async_playwright
from browser_use import Agent
from langchain_openai import ChatOpenAI


async def hybrid_automation():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()

        # Deterministic Playwright step
        await page.goto("https://example.com")

        # Hand off to Browser Use agent for complex task
        agent = Agent(
            task="Navigate to pricing and extract all plan details",
            llm=ChatOpenAI(model="gpt-4o"),
            browser=browser,
        )
        result = await agent.run()
        await browser.close()
        return result
```
---

## Benchmarks / Real-World Use Cases

### WebVoyager Benchmark Results

The WebVoyager benchmark evaluates browser agents on 586 diverse real-world web tasks. Browser Use ranks #7 on the global leaderboard with an 89.1% success rate, the highest among fully open-source frameworks.

| Rank | Agent | Success Rate | Organization |
|------|-------|--------------|--------------|
| … | … | … | … |
| … | …bo | 88.5% | Z.ai |
| 9 | Agent Kura | 87.0% | Kura |
| 9 | OpenAI Operator | 87% | OpenAI |
| 11 | Skyvern 2.0 | 85.85% | Skyvern |
| 12 | Project Mariner | 83.5% | Google |

### Performance Metrics (vs Traditional Tools)

| Metric | Browser Use (AI) | Playwright | Puppeteer | Selenium |
|--------|-----------------|-----------|-----------|----------|
| Cold start to first navigation | ~0.5–0.8s | ~0.4–0.7s | ~0.3–0.… | … |

Approximate LLM API cost per 10-step task, by model:

| Model | Cost per Task (10 steps) | Best For |
|-------------|---------------------------|----------|
| GPT-4o | ~$0.15–$0.30 | Complex reasoning tasks |
| Claude Sonnet 4 | ~$0.10–$0.20 | Production reliability |
| Gemini 3 Flash | ~$0.02–$0.05 | Cost-sensitive batch jobs |
| Local (Qwen2.5-72B) | ~$0.005 (GPU cost) | Privacy-first deployments |

### Use Case: Automated Price Monitoring

```python
import asyncio
from browser_use import Agent, Browser
from langchain_openai import ChatOpenAI


async def monitor_prices():
    urls = [
        "https://amazon.com/dp/B0DHTYW7P5",
        "https://bestbuy.com/site/xyz",
        "https://newegg.com/product/abc",
    ]
    results = []
    # One agent per URL, run sequentially
    for url in urls:
        agent = Agent(
            task=f"Go to {url} and extract the current price, availability, and seller name",
            llm=ChatOpenAI(model="gpt-4o-mini"),
            browser=Browser(),
        )
        result = await agent.run()
        results.append({"url": url, "data": result.output})
    return results


# Run daily via cron or scheduled task
prices = asyncio.run(monitor_prices())
```
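To turn the per-task figures from the cost table into a rough monthly budget, simple arithmetic is enough. The sketch below assumes every run takes about 10 steps and that a job checks three URLs once a day, as in the monitoring example; the `monthly_cost` helper and the 30-day month are assumptions for illustration, and real costs depend on page size and step count.

```python
# Approximate (low, high) USD cost per 10-step task, copied from the cost table.
COST_PER_TASK = {
    "GPT-4o": (0.15, 0.30),
    "Claude Sonnet 4": (0.10, 0.20),
    "Gemini 3 Flash": (0.02, 0.05),
}


def monthly_cost(model: str, tasks_per_day: int, days: int = 30) -> tuple:
    """Return the (low, high) monthly cost estimate in USD (hypothetical helper)."""
    low, high = COST_PER_TASK[model]
    total_tasks = tasks_per_day * days
    return (round(low * total_tasks, 2), round(high * total_tasks, 2))


for model in COST_PER_TASK:
    low, high = monthly_cost(model, tasks_per_day=3)
    print(f"{model}: ${low:.2f} to ${high:.2f} per month")
```

At this volume the spread between models is a few dollars per month; it only becomes a deciding factor for batch jobs that run thousands of tasks.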
---

## Advanced Usage / Production Hardening

### Parallel Agent Execution

```python
import asyncio
from browser_use import Agent, Browser
from langchain_openai import ChatOpenAI


async def run_parallel_agents(tasks):
    browser = Browser()
    # All agents share one browser instance
    agents = [
        Agent(task=task, llm=ChatOpenAI(model="gpt-4o-mini"), browser=browser)
        for task in tasks
    ]
    # gather() runs the agents concurrently and keeps results in task order
    results = await asyncio.gather(*[agent.run() for agent in agents])
    return results


tasks = [
    "Find iPhone 16 price on Amazon",
    "Find iPhone 16 price on Best Buy",
    "Find iPhone 16 price on Apple Store",
]
results = asyncio.run(run_parallel_agents(tasks))
```
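Launching every agent at once can run into LLM rate limits, so it may help to cap how many run concurrently. The following is a minimal sketch of that idea using `asyncio.Semaphore`. To keep it runnable offline, `fake_agent_run` is a hypothetical stand-in for `Agent(...).run()`, and `run_bounded` and `max_concurrent` are names invented for this example.

```python
import asyncio


async def fake_agent_run(task: str) -> str:
    """Hypothetical stand-in for Agent(...).run() so the sketch runs without a browser."""
    await asyncio.sleep(0.01)
    return f"done: {task}"


async def run_bounded(tasks, max_concurrent=2):
    """Run all tasks, but never more than max_concurrent at the same time."""
    semaphore = asyncio.Semaphore(max_concurrent)
    active = 0
    peak = 0

    async def worker(task):
        nonlocal active, peak
        async with semaphore:  # waits here while max_concurrent workers are busy
            active += 1
            peak = max(peak, active)
            try:
                return await fake_agent_run(task)
            finally:
                active -= 1

    # gather() preserves input order in its result list
    results = await asyncio.gather(*(worker(task) for task in tasks))
    return results, peak


tasks = [f"task {i}" for i in range(5)]
results, peak = asyncio.run(run_bounded(tasks))
print(results)
print("peak concurrency:", peak)
```

In a real deployment the worker body would create and run an agent instead of calling the stand-in; the semaphore pattern stays the same.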
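### Proxy Configuration

```python
from browser_use import Agent, Browser, BrowserConfig
from langchain_openai import ChatOpenAI

# Route all browser traffic through an authenticated proxy
config = BrowserConfig(
    proxy={
        "server": "http://proxy.example.com:8080",
        "username": "user",
        "password": "pass",
    },
    headless=True,
)

browser = Browser(config=config)
agent = Agent(
    task="Extract data from a geo-restricted site",
    llm=ChatOpenAI(model="gpt-4o"),
    browser=browser,
)
```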
### Session Persistence and Authentication

```python
from browser_use import Browser, BrowserConfig, Agent
from langchain_openai import ChatOpenAI

# Use persistent browser profile to maintain login state
config = BrowserConfig(
    user_data_dir="./browser_profile",
    headless=False,  # Use headed mode for initial login
)


async def authenticated_task():
    browser = Browser(config=config)
    agent = Agent(
        task="Download my monthly invoice from the billing page",
        llm=ChatOpenAI(model="gpt-4o"),
        browser=browser,
    )
    return await agent.run()
```
### Error Handling and Retries

```python
import asyncio
from browser_use import Agent, Browser
from langchain_openai import ChatOpenAI


async def robust_agent(task, max_retries=3):
    for attempt in range(max_retries):
        try:
            agent = Agent(
                task=task,
                llm=ChatOpenAI(model="gpt-4o"),
                browser=Browser(),
                max_steps=25,  # Limit steps to prevent runaway loops
            )
            result = await agent.run()
            if result.success:
                return result
            print(f"Attempt {attempt + 1} finished without success")
        except Exception as e:
            print(f"Attempt {attempt + 1} failed: {e}")
        # Back off after any failed attempt, but not after the last one
        if attempt < max_retries - 1:
            await asyncio.sleep(2 ** attempt)  # Exponential backoff
    raise RuntimeError(f"Task failed after {max_retries} attempts")
```
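The `max_steps` guard in the retry example bounds the agent's observe → plan → act loop. The sketch below shows, under simplifying assumptions, what such a bounded loop looks like: the planner is a plain Python callable standing in for the LLM, and "executing" an action just records it. `run_bounded_loop`, `scripted_planner`, and `stuck_planner` are hypothetical names, not part of Browser Use.

```python
def run_bounded_loop(plan_next_action, max_steps=25):
    """Call the planner until it returns "done" or the step budget runs out."""
    history = []
    for _ in range(max_steps):
        action = plan_next_action(history)  # stands in for the LLM's reasoning step
        if action == "done":
            return {"success": True, "steps": len(history), "history": history}
        history.append(action)  # stands in for executing the action in the browser
    # Budget exhausted without the planner reporting completion
    return {"success": False, "steps": len(history), "history": history}


def scripted_planner(script):
    """Hypothetical planner that replays a fixed list of actions, then reports "done"."""
    def plan(history):
        return script[len(history)] if len(history) < len(script) else "done"
    return plan


def stuck_planner(history):
    """Hypothetical planner that never finishes, to exercise the step limit."""
    return "scroll"


finished = run_bounded_loop(scripted_planner(["click:search", "type:query"]))
runaway = run_bounded_loop(stuck_planner, max_steps=5)
print(finished["success"], finished["steps"])
print(runaway["success"], runaway["steps"])
```

The second run illustrates why the limit matters: a planner that keeps choosing the same action would otherwise loop, and keep spending tokens, indefinitely.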
### Monitoring with Prometheus

```python
from prometheus_client import Counter, Histogram, start_http_server
from browser_use import Agent, Browser
from langchain_openai import ChatOpenAI

agent_runs = Counter("browseruse_agent_runs_total", "Total agent runs")
agent_failures = Counter("browseruse_agent_failures_total", "Total agent failures")
agent_duration = Histogram("browseruse_agent_duration_seconds", "Agent run duration")

llm = ChatOpenAI(model="gpt-4o")  # Shared LLM client for all monitored runs

start_http_server(8000)  # Expose metrics on port 8000


async def monitored_agent(task):
    agent_runs.inc()
    # Record wall-clock duration of the whole run, including failures
    with agent_duration.time():
        try:
            agent = Agent(task=task, llm=llm, browser=Browser())
            result = await agent.run()
            return result
        except Exception:
            agent_failures.inc()
            raise
```
---

## Comparison with Alternatives

| Feature | Browser Use | Scrapy | Puppeteer | Selenium |
|---------|-------------|--------|-----------|----------|
| **Language** | Python | Python | JavaScript/TypeScript | Python, Java, C#, JS |
| … | … | … | … | … |
| **Best For** | …x tasks | Large-scale static scraping | Chrome automation, testing | Cross-browser testing, legacy |
| **GitHub Stars** | 94,731 | 54,200 | 90,800 | 25,400 |

### When to Choose What

- **Browser Use**: Complex multi-step tasks on dynamic sites where writing selectors is impractical, and AI agents that need to adapt to changing UIs.
- **Scrapy**: High-volume extraction of static HTML pages. Best for structured crawling at scale.
- **Puppeteer**: Chrome-only automation where speed matters and you control the target site. Ideal for PDF generation and screenshots.
- **Selenium**: Cross-browser testing for enterprise applications with strict browser coverage requirements.

---

## Limitations / Honest Assessment

Browser Use is not a universal replacement for traditional browser automation. …

3. … the current agent loop architecture.
4. **Simple, stable sites**: If the target site never changes and has clean selectors, traditional automation is faster, cheaper, and more reliable.
5. **LLM dependency**: You are bound to the availability and pricing of third-party LLM APIs. Rate limits can bottleneck production workloads.

---

## Frequently Asked Questions

### What is Browser Use used for?
Browser Use is a Python framework that lets LLMs control web browsers via Playwright. It is used for AI-driven web automation: tasks like form filling, data extraction, price monitoring, and multi-step …

… or complex AI-driven tasks.

### What LLMs work with Browser Use?

Any LangChain-compatible LLM: OpenAI GPT-4o/5.1, Anthropic Claude Sonnet 4, Google Gemini 3 Flash, and local models via Ollama (Qwen2.5, Llama 3, Mistral). The framework is model-agnostic through LiteLLM.

### How much does Browser Use cost?

The framework is free (MIT license). You pay for LLM API usage: approximately $0.02–$0.30 per 10-step task depending on the model. Browser Use Cloud offers managed stealth browsers starting at …

### How do I handle authentication in Browser Use?

Use persistent browser profiles (`user_data_dir` in `BrowserConfig`) to maintain cookies and login state across sessions. For OAuth or 2FA flows, run the initial login in headed mode, then switch to headless for subsequent tasks.

### What is the difference between Browser Use and Stagehand?

Browser Use is a fully autonomous agent framework: the LLM controls all navigation decisions. Stagehand (by Browserbase) adds AI primitives (`act()`, `extract()`, `observe()`) on top of Playwright for hybrid workflows where deterministic and AI-driven …

---

## Conclusion

… intervention, Browser Use is the most mature open-source option available.

**Action items**:
1. Clone the [browser-use/browser-use](https://github.com/browser-use/browser-use) repository
2. Run `pip install browser-use` and set up your first agent with the code examples above
3. Evaluate the WebVoyager benchmark against your use case
4. Join the [Browser Use Discord](https://link.browser-use.com/discord) for community support and production tips

> **Want more AI automation tutorials?** Join our [Telegram group](https://t.me/dibi8opensource) for weekly deep-dives on open-source AI tools, production deployment tips, and benchmark data.

---
## Recommended Hosting & Infrastructure

Before you deploy any of the tools above into production, you'll need solid infrastructure …

## References

- [Browser Use Cloud Platform](https://cloud.browser-use.com)
- [WebVoyager Benchmark Leaderboard](https://leaderboard.steel.dev/)
- [Playwright vs Puppeteer vs Selenium 2026 Benchmarks](https://use-apify.com/blog/playwright-vs-puppeteer-vs-selenium-2026)
- [AI Browser Automation Tools Comparison 2026](https://awesomeagents.ai/tools/best-ai-browser-automation-tools-2026/)
- [Browser Use Proxy Setup Guide](https://www.coronium.io/blog/browser-use-proxy-setup)
- [Stagehand vs Browser Use vs Playwright Comparison](https://www.nxcode.io/resources/news/stagehand-vs-browser-use-vs-playwright-ai-browser-automation-2026)
---

*This article was written for developers who need production-grade browser automation. All benchmark data is sourced from publicly available leaderboards and independent testing as of May 2026.*