GitHub: browser-use/browser-use | Stars: 94,731 | License: MIT | Version: 0.12.7

## Introduction

Writing and maintaining Selenium scripts for modern web automation is a slow death by a thousand selectors. A class name changes, a button moves, and your entire pipeline collapses at 3 AM. Browser Use, an open-source Python framework launched in late 2024 by Magnus Müller and Gregor Žunič, takes a different approach: it hands the browser controls to a large language model and lets the AI figure out what to click, type, and read. With 94,731 GitHub stars, 319 contributors, and an 89.1% success rate on the WebVoyager benchmark, it has become the de facto open-source standard for AI-driven browser automation. This tutorial covers the setup, real benchmark data, integration with popular LLMs, and a head-to-head comparison against Selenium, Puppeteer, and Scrapy.

---

## What Is Browser Use?

Browser Use is a Python library (Python ≥3.11) that connects any LangChain-compatible LLM to a real web browser via Playwright. Instead of hardcoding CSS selectors or XPath expressions, you describe the task in natural language, for example "find the cheapest flight from NYC to SFO next Friday", and the agent handles navigation, form filling, clicking, and data extraction autonomously.

### Key Design Principles

- Model-agnostic: Works with OpenAI GPT-4o/5.1, Anthropic Claude Sonnet 4, Google Gemini 3 Flash, and local models via LiteLLM.
- DOM distillation: Strips pages to essential interactive elements before sending them to the LLM, cutting token consumption by up to 60%.
- Multi-tab support: Agents can operate across multiple browser tabs simultaneously.
- Persistent memory: Maintains context and conversation history across navigation steps.
- Built on Playwright: Inherits all Playwright features, including stealth mode, proxy support, network interception, and video recording.

---

## How Browser Use Works

Browser Use operates on a continuous observe → plan → act → verify loop:

### Architecture Overview

```
┌─────────────┐    DOM + Screenshot     ┌─────────────┐
│   Browser   │ ──────────────────────> │     LLM     │
│ (Playwright)│                         │(Claude/GPT/)│
│             │ <────────────────────── │   Gemini    │
└─────────────┘   Action (click/type)   └─────────────┘
       ↑                                       │
       └────────── Page State Change ──────────┘
```

1. **Observe**: Playwright captures the current page state (the DOM and a screenshot).
2. **Distill**: The DOM is filtered down to interactive elements (buttons, inputs, links), reducing noise.
3. **Reason**: The LLM receives the distilled page state and the user's goal, then plans the next action.
4. **Execute**: Browser Use translates the LLM's decision into Playwright API calls (`page.click()`, `page.fill()`).
5. **Verify**: The loop repeats until the task is complete or `max_steps` is reached.

A minimal agent that runs this loop:

```python
import asyncio

from browser_use import Agent, Browser
from langchain_openai import ChatOpenAI


async def main():
    browser = Browser()
    agent = Agent(
        task="Find the number of stars of the browser-use repo",
        llm=ChatOpenAI(model="gpt-4.1"),
        browser=browser,
    )
    # run() drives the loop until the task finishes
    result = await agent.run()
    print(result)


if __name__ == "__main__":
    asyncio.run(main())
```
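To make the loop described above more concrete, here is a rough, standard-library-only sketch of the distill and reason/execute/verify steps. It is not Browser Use's actual implementation: `distill`, `run_loop`, and `scripted_policy` are hypothetical names, the "LLM" is replaced by a hard-coded policy, and no real browser is driven. The point is only to show how filtering a page to indexed interactive elements gives the model a small action space, and how a `max_steps` bound stops the loop.

```python
from html.parser import HTMLParser

# Tags treated as "interactive" for this sketch (an assumption, not the library's list)
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


class InteractiveElementParser(HTMLParser):
    """Collects interactive elements and gives each one a numeric index."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        if tag in INTERACTIVE_TAGS:
            self.elements.append(
                {"index": len(self.elements), "tag": tag, "attrs": dict(attrs)}
            )


def distill(html):
    """Reduce raw HTML to the list of elements an agent could act on."""
    parser = InteractiveElementParser()
    parser.feed(html)
    return parser.elements


def run_loop(page_html, choose_action, max_steps=5):
    """Observe -> distill -> reason -> execute, until 'done' or max_steps."""
    history = []
    for _ in range(max_steps):
        elements = distill(page_html)               # observe + distill
        action = choose_action(elements, history)   # reason (stand-in for the LLM)
        history.append(action)                      # a real agent would execute here
        if action["type"] == "done":                # verify: stop once the task is done
            break
    return history


def scripted_policy(elements, history):
    """Hypothetical stand-in for the LLM: click the first button, then finish."""
    if not history:
        button = next(e for e in elements if e["tag"] == "button")
        return {"type": "click", "index": button["index"]}
    return {"type": "done"}


PAGE = """
<div class="hero"><p>Welcome</p>
<a href="/pricing">Pricing</a>
<input name="q" type="text">
<button id="go">Search</button></div>
"""

elements = distill(PAGE)
history = run_loop(PAGE, scripted_policy)
print([e["tag"] for e in elements])
print(history)
```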

---

## Installation

### Step 1: Install Browser Use

```bash
# Using uv (recommended)
uv init
uv add browser-use
uv sync

# Using pip
pip install browser-use

# Install Chromium if not already present
playwright install chromium
```

### Step 2: Configure Environment Variables

```bash
# .env file
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key
GOOGLE_API_KEY=your-google-api-key

# Optional: Browser Use Cloud for stealth browsers
BROWSER_USE_API_KEY=your-cloud-key
```

### Step 3: Run Your First Agent

```python
import asyncio
from browser_use import Agent, Browser, ChatBrowserUse

async def main():
    browser = Browser()
    agent = Agent(
        task="List the top 20 posts on Hacker News today with their points",
        llm=ChatBrowserUse(),
        browser=browser,
    )
    result = await agent.run()
    print(result.output)

if __name__ == "__main__":
    asyncio.run(main())
```

### Docker Setup (Production)

```dockerfile
# Dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install the library, then the Chromium build and its system dependencies
RUN pip install browser-use playwright
RUN playwright install chromium
RUN playwright install-deps

COPY . .
CMD ["python", "agent.py"]
```

```yaml
# docker-compose.yml
version: '3.8'
services:
  browser-use:
    build: .
    environment:
      # Forwarded from the host shell or a local .env file
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - ./scripts:/app
    command: python agent.py
```
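Both the `.env` file from Step 2 and the Compose file pass API keys through environment variables, so a missing key usually only surfaces once the agent makes its first LLM call. A small preflight check can fail earlier. The sketch below uses only the standard library; the `PROVIDER_KEYS` mapping and the `missing_keys` helper are hypothetical names, and which keys you actually need depends on the LLM you pick.

```python
import os

# Variable names taken from the .env example; the provider labels are illustrative
PROVIDER_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
}


def missing_keys(providers, env=None):
    """Return the environment variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [PROVIDER_KEYS[p] for p in providers if not env.get(PROVIDER_KEYS[p])]


# Example with an explicit mapping instead of the real process environment
example_env = {"OPENAI_API_KEY": "sk-your-openai-key", "GOOGLE_API_KEY": ""}
print(missing_keys(["openai", "google"], env=example_env))
```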

---

## Integration with Popular Tools

### OpenAI GPT-4o / GPT-5.1

```python
from browser_use import Agent, Browser
from langchain_openai import ChatOpenAI
import asyncio

async def search_flights():
    agent = Agent(
        task="Find the cheapest flight from NYC to London next week",
        llm=ChatOpenAI(model="gpt-4o", temperature=0),
        browser=Browser(),
    )
    return await agent.run()

asyncio.run(search_flights())
```

### Anthropic Claude Sonnet 4

```python
from browser_use import Agent, Browser
from langchain_anthropic import ChatAnthropic
import asyncio

async def extract_data():
    agent = Agent(
        task="Extract all pricing plans from example.com/pricing",
        llm=ChatAnthropic(model="claude-sonnet-4-6"),
        browser=Browser(),
    )
    result = await agent.run()
    print(result.output)

asyncio.run(extract_data())
```

### Google Gemini 3 Flash

```python
from browser_use import Agent, Browser
from langchain_google_genai import ChatGoogleGenerativeAI
import asyncio

async def research_topic():
    agent = Agent(
        task="Research the latest AI news and summarize top 5 stories",
        llm=ChatGoogleGenerativeAI(model="gemini-3-flash-preview"),
        browser=Browser(),
    )
    return await agent.run()

asyncio.run(research_topic())
```

### Ollama (Local Models)

```python
from browser_use import Agent, Browser
from langchain_ollama import ChatOllama
import asyncio

async def local_automation():
    agent = Agent(
        task="Fill out the contact form on example.com/contact",
        llm=ChatOllama(model="qwen2.5:72b"),
        browser=Browser(),
    )
    return await agent.run()

asyncio.run(local_automation())
```

### Playwright (Hybrid Automation)

```python
from playwright.async_api import async_playwright
from browser_use import Agent
from langchain_openai import ChatOpenAI

async def hybrid_automation():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()

        # Deterministic Playwright step
        await page.goto("https://example.com")

        # Hand off to Browser Use agent for complex task
        agent = Agent(
            task="Navigate to pricing and extract all plan details",
            llm=ChatOpenAI(model="gpt-4o"),
            browser=browser,
        )
        result = await agent.run()
        await browser.close()
        return result
```

---

## Benchmarks / Real-World Use Cases

### WebVoyager Benchmark Results

The WebVoyager benchmark evaluates browser agents on 586 diverse real-world web tasks. Browser Use ranks #7 on the global leaderboard with an 89.1% success rate, the highest among fully open-source frameworks.

*Figure: WebVoyager Leaderboard showing Browser Use at 89.1%.*

| Rank | Agent | Success Rate | Organization |
|------|-------|--------------|--------------|
| … | …bo | 88.5% | Z.ai |
| 9 | Agent Kura | 87.0% | Kura |
| 9 | OpenAI Operator | 87% | OpenAI |
| 11 | Skyvern 2.0 | 85.85% | Skyvern |
| 12 | Project Mariner | 83.5% | Google |

### Performance Metrics (vs Traditional Tools)

| Metric | Browser Use (AI) | Playwright | Puppeteer | Selenium |
|--------|-----------------|-----------|-----------|----------|
| Cold start to first navigation | ~0.5–0.8s | ~0.4–0.7s | ~0.3–0.… | … |

### LLM Cost per Task

| Model | Cost per Task (~10 steps) | Best For |
|-------------|---------------------------|----------|
| GPT-4o | ~$0.15–$0.30 | Complex reasoning tasks |
| Claude Sonnet 4 | ~$0.10–$0.20 | Production reliability |
| Gemini 3 Flash | ~$0.02–$0.05 | Cost-sensitive batch jobs |
| Local (Qwen2.5-72B) | ~$0.005 (GPU cost) | Privacy-first deployments |

### Use Case: Automated Price Monitoring

```python
import asyncio
from browser_use import Agent, Browser
from langchain_openai import ChatOpenAI

async def monitor_prices():
    urls = [
        "https://amazon.com/dp/B0DHTYW7P5",
        "https://bestbuy.com/site/xyz",
        "https://newegg.com/product/abc",
    ]

    results = []
    # One agent per URL, run sequentially
    for url in urls:
        agent = Agent(
            task=f"Go to {url} and extract the current price, availability, and seller name",
            llm=ChatOpenAI(model="gpt-4o-mini"),
            browser=Browser(),
        )
        result = await agent.run()
        results.append({"url": url, "data": result.output})

    return results

# Run daily via cron or scheduled task
prices = asyncio.run(monitor_prices())
```
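The per-task ranges in the cost table can be turned into a rough monthly budget for a recurring job such as the price monitor. The sketch below is plain arithmetic: it assumes each product page counts as one task of roughly 10 steps and that the job runs once per day, which may not match your workload. `COST_PER_TASK` copies three rows from the table, and `monthly_cost` is a hypothetical helper.

```python
# Per-task cost ranges in USD (about 10 steps), copied from the cost table
COST_PER_TASK = {
    "GPT-4o": (0.15, 0.30),
    "Claude Sonnet 4": (0.10, 0.20),
    "Gemini 3 Flash": (0.02, 0.05),
}


def monthly_cost(model, tasks_per_day, days=30):
    """Return the (low, high) estimated spend for a month of daily runs."""
    low, high = COST_PER_TASK[model]
    runs = tasks_per_day * days
    return round(low * runs, 2), round(high * runs, 2)


# Three product pages checked once per day, as in the price-monitoring example
for model in COST_PER_TASK:
    low, high = monthly_cost(model, tasks_per_day=3)
    print(f"{model}: ${low:.2f} to ${high:.2f} per month")
```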

---

## Advanced Usage / Production Hardening

### Parallel Agent Execution

```python
import asyncio
from browser_use import Agent, Browser
from langchain_openai import ChatOpenAI

async def run_parallel_agents(tasks):
    browser = Browser()
    agents = [
        Agent(task=task, llm=ChatOpenAI(model="gpt-4o-mini"), browser=browser)
        for task in tasks
    ]
    # gather() runs all agents concurrently and keeps results in task order
    results = await asyncio.gather(*[agent.run() for agent in agents])
    return results

tasks = [
    "Find iPhone 16 price on Amazon",
    "Find iPhone 16 price on Best Buy",
    "Find iPhone 16 price on Apple Store",
]

results = asyncio.run(run_parallel_agents(tasks))
```

### Proxy Configuration

```python
from browser_use import Agent, Browser, BrowserConfig
from langchain_openai import ChatOpenAI

config = BrowserConfig(
    proxy={
        "server": "http://proxy.example.com:8080",
        "username": "user",
        "password": "pass",
    },
    headless=True,
)

browser = Browser(config=config)
agent = Agent(
    task="Extract data from a geo-restricted site",
    llm=ChatOpenAI(model="gpt-4o"),
    browser=browser,
)
```

### Session Persistence and Authentication

```python
from browser_use import Browser, BrowserConfig, Agent
from langchain_openai import ChatOpenAI

# Use persistent browser profile to maintain login state
config = BrowserConfig(
    user_data_dir="./browser_profile",
    headless=False,  # Use headed mode for initial login
)

async def authenticated_task():
    browser = Browser(config=config)
    agent = Agent(
        task="Download my monthly invoice from the billing page",
        llm=ChatOpenAI(model="gpt-4o"),
        browser=browser,
    )
    return await agent.run()
```

### Error Handling and Retries

```python
import asyncio
from browser_use import Agent, Browser
from langchain_openai import ChatOpenAI

async def robust_agent(task, max_retries=3):
    for attempt in range(max_retries):
        try:
            agent = Agent(
                task=task,
                llm=ChatOpenAI(model="gpt-4o"),
                browser=Browser(),
                max_steps=25,  # Limit steps to prevent runaway loops
            )
            result = await agent.run()
            if result.success:
                return result
        except Exception as e:
            print(f"Attempt {attempt + 1} failed: {e}")
            await asyncio.sleep(2 ** attempt)  # Exponential backoff
    raise Exception(f"Task failed after {max_retries} attempts")
```

### Monitoring with Prometheus

```python
from prometheus_client import Counter, Histogram, start_http_server
from browser_use import Agent, Browser
from langchain_openai import ChatOpenAI

agent_runs = Counter("browseruse_agent_runs_total", "Total agent runs")
agent_failures = Counter("browseruse_agent_failures_total", "Total agent failures")
agent_duration = Histogram("browseruse_agent_duration_seconds", "Agent run duration")

# The original snippet used `llm` without defining it; any supported model works here
llm = ChatOpenAI(model="gpt-4o")

# Expose metrics on http://localhost:8000/metrics
start_http_server(8000)

async def monitored_agent(task):
    agent_runs.inc()
    with agent_duration.time():
        try:
            agent = Agent(task=task, llm=llm, browser=Browser())
            result = await agent.run()
            return result
        except Exception:
            agent_failures.inc()
            raise
```
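The parallel and retry patterns in this section can be combined so that a batch of agents never exceeds a fixed concurrency, which helps when LLM rate limits are the bottleneck. The sketch below uses only `asyncio`. `run_with_retry`, `run_bounded`, and `make_flaky_job` are hypothetical names, and the jobs are simulated stand-ins for `agent.run()` rather than real browser sessions.

```python
import asyncio


async def run_with_retry(job, max_retries=3, base_delay=0.01):
    """Await an async callable, retrying with exponential backoff on failure."""
    for attempt in range(max_retries):
        try:
            return await job()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(base_delay * 2 ** attempt)


async def run_bounded(jobs, limit=2):
    """Run jobs concurrently, but never more than `limit` at the same time."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        async with semaphore:
            return await run_with_retry(job)

    # gather() returns results in the same order as the input jobs
    return await asyncio.gather(*(guarded(job) for job in jobs))


def make_flaky_job(name, failures):
    """Simulated agent run: fails `failures` times, then succeeds."""
    state = {"left": failures}

    async def job():
        if state["left"] > 0:
            state["left"] -= 1
            raise RuntimeError(f"{name}: simulated rate limit")
        return f"{name}: ok"

    return job


jobs = [
    make_flaky_job("amazon", 0),
    make_flaky_job("bestbuy", 1),
    make_flaky_job("apple", 2),
]
results = asyncio.run(run_bounded(jobs))
print(results)
```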


---

## Comparison with Alternatives

| Feature | Browser Use | Scrapy | Puppeteer | Selenium |
|---------|-------------|--------|-----------|----------|
| **Language** | Python | Python | JavaScript/TypeScript | Python, Java, C#, JS |
| … | …x tasks | Large-scale static scraping | Chrome automation, testing | Cross-browser testing, legacy |
| **GitHub Stars** | 94,731 | 54,200 | 90,800 | 25,400 |

### When to Choose What

- **Browser Use**: Complex multi-step tasks on dynamic sites where writing selectors is impractical, and AI agents that need to adapt to changing UIs.
- **Scrapy**: High-volume extraction of static HTML pages. Best for structured crawling at scale.
- **Puppeteer**: Chrome-only automation where speed matters and you control the target site. Ideal for PDF generation and screenshots.
- **Selenium**: Cross-browser testing for enterprise applications with strict browser coverage requirements.

---

## Limitations / Honest Assessment

Browser Use is not a universal replacement for traditional browser automation.

3. …the current agent loop architecture.
4. **Simple, stable sites**: If the target site never changes and has clean selectors, traditional automation is faster, cheaper, and more reliable.
5. **LLM dependency**: You are bound to the availability and pricing of third-party LLM APIs. Rate limits can bottleneck production workloads.

---

## Frequently Asked Questions

### What is Browser Use used for?
Browser Use is a Python framework that lets LLMs control web browsers via Playwright. It is used for AI-driven web automation: tasks like form filling, data extraction, price monitoring, and multi-step…

…or complex AI-driven tasks.

### What LLMs work with Browser Use?
Any LangChain-compatible LLM: OpenAI GPT-4o/5.1, Anthropic Claude Sonnet 4, Google Gemini 3 Flash, and local models via Ollama (Qwen2.5, Llama 3, Mistral). The framework is model-agnostic through LiteLLM.

### How much does Browser Use cost?
The framework is free (MIT license). You pay for LLM API usage: approximately $0.02–$0.30 per 10-step task depending on the model. Browser Use Cloud offers managed stealth browsers starting at…

### How do I handle authentication in Browser Use?
Use persistent browser profiles (`user_data_dir` in `BrowserConfig`) to maintain cookies and login state across sessions. For OAuth or 2FA flows, run the initial login in headed mode, then switch to headless for subsequent tasks.

### What is the difference between Browser Use and Stagehand?
Browser Use is a fully autonomous agent framework: the LLM controls all navigation decisions. Stagehand (by Browserbase) adds AI primitives (`act()`, `extract()`, `observe()`) on top of Playwright for hybrid workflows where deterministic and AI-driven…

---

## Conclusion

…ervention, Browser Use is the most mature open-source option available.

**Action items**:

1. Clone the [browser-use/browser-use](https://github.com/browser-use/browser-use) repository
2. Run `pip install browser-use` and set up your first agent with the code examples above
3. Evaluate the WebVoyager benchmark against your use case
4. Join the [Browser Use Discord](https://link.browser-use.com/discord) for community support and production tips

> **Want more AI automation tutorials?** Join our [Telegram group](https://t.me/dibi8opensource) for weekly deep-dives on open-source AI tools, production deployment tips, and benchmark data.







## Recommended Hosting & Infrastructure

Before you deploy any of the tools above into production, you'll need solid infrast…

---
- [Browser Use Cloud Platform](https://cloud.browser-use.com)
- [WebVoyager Benchmark Leaderboard](https://leaderboard.steel.dev/)
- [Playwright vs Puppeteer vs Selenium 2026 Benchmarks](https://use-apify.com/blog/playwright-vs-puppeteer-vs-selenium-2026)
- [AI Browser Automation Tools Comparison 2026](https://awesomeagents.ai/tools/best-ai-browser-automation-tools-2026/)
- [Browser Use Proxy Setup Guide](https://www.coronium.io/blog/browser-use-proxy-setup)
- [Stagehand vs Browser Use vs Playwright Comparison](https://www.nxcode.io/resources/news/stagehand-vs-browser-use-vs-playwright-ai-browser-automation-2026)

---

*This article was written for developers who need production-grade browser automation. All benchmark data is sourced from publicly available leaderboards and independent testing as of May 2026.*
