# Explain It Simply

> What this repo does in plain language, the simplest useful analogy, and the few ideas the reader should remember.

- Repository: browser-use/terminal
- GitHub: https://github.com/browser-use/terminal
- Human wiki: https://grok-wiki.com/public/wiki/browser-use-terminal-686510dbe50c
- Complete Markdown: https://grok-wiki.com/public/wiki/browser-use-terminal-686510dbe50c/llms-full.txt

## Source Files

- `README.md`
- `Cargo.toml`
- `pyproject.toml`
- `crates/browser-use-cli/src/main.rs`
- `crates/browser-use-tui/src/main.rs`
- `crates/browser-use-core/src/lib.rs`

---

<details>
<summary>Relevant source files</summary>

The following files were used as context for generating this wiki page:

- [README.md](README.md)
- [Cargo.toml](Cargo.toml)
- [pyproject.toml](pyproject.toml)
- [crates/browser-use-cli/src/main.rs](crates/browser-use-cli/src/main.rs)
- [crates/browser-use-tui/src/main.rs](crates/browser-use-tui/src/main.rs)
- [crates/browser-use-core/src/lib.rs](crates/browser-use-core/src/lib.rs)
- [crates/browser-use-core/src/tools/mod.rs](crates/browser-use-core/src/tools/mod.rs)
- [crates/browser-use-providers/src/lib.rs](crates/browser-use-providers/src/lib.rs)
- [crates/browser-use-store/src/lib.rs](crates/browser-use-store/src/lib.rs)
- [crates/browser-use-protocol/src/lib.rs](crates/browser-use-protocol/src/lib.rs)
- [crates/browser-use-browser/src/lib.rs](crates/browser-use-browser/src/lib.rs)
- [crates/browser-use-python-worker/src/lib.rs](crates/browser-use-python-worker/src/lib.rs)
- [crates/browser-use-tui/src/settings.rs](crates/browser-use-tui/src/settings.rs)
- [crates/browser-use-tui/src/runtime.rs](crates/browser-use-tui/src/runtime.rs)
- [docs/terminal-ui-testing.md](docs/terminal-ui-testing.md)
</details>

Browser Use Terminal lets you give a browser task to an AI agent from your terminal. The repo is not just a wrapper around a website: it has a Rust terminal interface, a Rust agent loop, a persistent local event store, model-provider adapters, and browser-control code that can attach to Chrome, start managed Chromium, or use Browser Use cloud.

The simplest way to think about it: this repo is a control room for browser work. You type the job, the model decides what to do next, the runtime gives it tools, the browser layer drives Chrome, and the store keeps a log so the UI can show history, results, failures, and follow-ups.

Sources: [README.md:7-10](), [README.md:20-28](), [Cargo.toml:1-11]()

## The One-Sentence Version

Browser Use Terminal is a Rust-first command-line and terminal UI application for running browser agents: it accepts a task, records it as a session, asks a model what to do, gives that model browser and local tools, and saves the task’s events and artifacts locally.

Sources: [README.md:29-49](), [crates/browser-use-core/src/lib.rs:252-285](), [crates/browser-use-store/src/lib.rs:195-240]()

## The Simplest Useful Analogy

Imagine a careful assistant sitting at a computer:

| Analogy part | Real repo part | What it does |
|---|---|---|
| The task notebook | Store and protocol | Records sessions, statuses, events, artifacts, and transcript data. |
| The assistant’s brain | Model provider | Produces text, tool calls, usage, and done events. |
| The hands on the browser | Browser runtime | Connects to local Chrome, managed Chromium, remote CDP, or Browser Use cloud. |
| The workbench screen | TUI | Shows setup, task state, history, model/browser/account choices, and results. |
| The toolbox | Core tool registry | Gives the model browser, browser script, file, shell, patch, planning, and helper-agent tools. |

This analogy maps directly to the code: sessions have statuses like `created`, `running`, `done`, `failed`, and `cancelled`; model events can be text, tool calls, usage, or done markers; and the tool registry explicitly registers browser, browser script, filesystem, command, plan, and helper-agent tools.

Sources: [crates/browser-use-protocol/src/lib.rs:15-23](), [crates/browser-use-protocol/src/lib.rs:130-148](), [crates/browser-use-core/src/tools/mod.rs:40-62]()
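
The statuses and model event kinds named above can be pictured as two small enums. This is a minimal sketch, not the protocol crate's actual definitions: the type names, variant payloads, and field names (`arguments`, `input_tokens`, `output_tokens`) are assumptions made for illustration.

```rust
/// Hypothetical mirror of the session statuses named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionStatus {
    Created,
    Running,
    Done,
    Failed,
    Cancelled,
}

impl SessionStatus {
    /// Lowercase label matching the names used in the text.
    pub fn as_str(self) -> &'static str {
        match self {
            SessionStatus::Created => "created",
            SessionStatus::Running => "running",
            SessionStatus::Done => "done",
            SessionStatus::Failed => "failed",
            SessionStatus::Cancelled => "cancelled",
        }
    }

    /// True for statuses in which the current run has stopped.
    pub fn has_stopped(self) -> bool {
        matches!(
            self,
            SessionStatus::Done | SessionStatus::Failed | SessionStatus::Cancelled
        )
    }
}

/// Hypothetical shape for the four kinds of model events (payloads are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub enum ModelEvent {
    Text(String),
    ToolCall { name: String, arguments: String },
    Usage { input_tokens: u64, output_tokens: u64 },
    Done,
}

/// Collect the names of the tools the model asked for in one turn.
pub fn requested_tools(events: &[ModelEvent]) -> Vec<&str> {
    events
        .iter()
        .filter_map(|event| match event {
            ModelEvent::ToolCall { name, .. } => Some(name.as_str()),
            _ => None,
        })
        .collect()
}
```

Modeling both as closed enums means a UI or store that matches on them gets a compile error when a new status or event kind is added and not handled.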

## What Happens When You Type a Task

In the TUI, submitting text creates a new session, writes a `session.input` event, selects that session, and starts an agent thread. If an existing session is already selected, the same input is recorded as a follow-up event on that session instead.

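```rust
// crates/browser-use-tui/src/main.rs (excerpt)
// Create a session tied to the current working directory.
let session = self.store.create_session(None, std::env::current_dir()?)?;
// Record the submitted text as the session's first event.
self.store.append_event(
    &session.id,
    "session.input",
    serde_json::json!({ "text": text }),
)?;
// Start the agent for the new session.
self.start_agent_for_session(session.id)?;
```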

The TUI worker then opens the store, checks browser-specific requirements, builds the provider run configuration, and calls the core agent runner.

Sources: [crates/browser-use-tui/src/main.rs:913-963](), [crates/browser-use-tui/src/main.rs:980-1078](), [crates/browser-use-tui/src/runtime.rs:12-60]()
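
The choice between a new session and a follow-up can be sketched as a small routing function. Everything here is hypothetical: `Submission` and `route_input` are not names from the repo, and ignoring blank input is an assumption rather than confirmed TUI behavior.

```rust
/// What the input box should do with submitted text (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Submission {
    /// No session is selected: create one and record the text as its input.
    NewSession { text: String },
    /// A session is selected: record the text as a follow-up on it.
    FollowUp { session_id: String, text: String },
}

/// Decide how to route submitted text based on the current selection.
/// Returning `None` for blank input is an assumption of this sketch.
pub fn route_input(selected_session: Option<&str>, text: &str) -> Option<Submission> {
    let text = text.trim();
    if text.is_empty() {
        return None;
    }
    Some(match selected_session {
        Some(id) => Submission::FollowUp {
            session_id: id.to_string(),
            text: text.to_string(),
        },
        None => Submission::NewSession {
            text: text.to_string(),
        },
    })
}
```

Keeping the decision in a pure function like this makes it testable without a terminal or a store.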

## The Main Pieces

```text
User
  |
  v
TUI or CLI
  |
  v
Local session store  <---->  Protocol projections for history/result/activity
  |
  v
Core agent loop
  |        \
  |         \-- Model provider: Codex, OpenAI, Anthropic, OpenRouter, fake
  |
  \-- Tools: browser, browser_script, files, shell, patch, helper agents
            |
            v
     Browser runtime / Python worker
            |
            v
 Local Chrome, managed Chromium, remote CDP, or Browser Use cloud
```

The workspace layout supports this split. The root Cargo workspace includes separate crates for browser control, CLI, core, providers, protocol, store, Python worker, and TUI. A Python package sits alongside the workspace and declares script entry points and dependencies for the worker side.

Sources: [Cargo.toml:1-11](), [pyproject.toml:5-27](), [crates/browser-use-core/src/tools/mod.rs:40-62]()

### The TUI Is the Steering Wheel

The TUI has product states such as setup needed, ready, running, result, failed, and cancelled. It also has surfaces for account, model, browser, history, developer, and setup views. These states and surfaces are what make the app an interactive, controllable terminal application rather than a fire-and-forget command.

Sources: [crates/browser-use-tui/src/main.rs:117-163](), [crates/browser-use-tui/src/main.rs:190-198](), [crates/browser-use-tui/src/main.rs:2288-2315]()

### The CLI Is the Scriptable Door

The CLI has commands for starting tasks, running with specific providers, following up, cancelling, showing history, exporting/importing, diagnostics, tracing, datasets, and agent coordination. The scriptable run commands build a `ProviderRunConfig` for OpenAI, Codex, Anthropic, or OpenRouter and then run the core agent loop.

Sources: [crates/browser-use-cli/src/main.rs:46-186](), [crates/browser-use-cli/src/main.rs:954-1065](), [crates/browser-use-cli/src/main.rs:1128-1168]()

### The Store Is the Memory

The store creates `~/.browser-use-terminal` by default, creates an `artifacts` directory, opens `state.db`, runs migrations, and writes events. When events like `session.input`, `session.done`, or `session.failed` arrive, the store updates the session status automatically.

Sources: [crates/browser-use-store/src/lib.rs:33-50](), [crates/browser-use-store/src/lib.rs:104-145](), [crates/browser-use-store/src/lib.rs:321-378](), [crates/browser-use-store/src/lib.rs:938-951]()
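
The event-driven status update can be sketched as a fold over event types. This is an illustration under stated assumptions, not the store's code: the `Status` enum, the mapping of `session.input` to a running status, and the placement of `state.db` and `artifacts` directly under the store root are all guesses based on the description above.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical subset of session statuses used by this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Created,
    Running,
    Done,
    Failed,
}

/// Derive the next status from an event type; unknown events leave it unchanged.
pub fn status_after_event(current: Status, event_type: &str) -> Status {
    match event_type {
        "session.input" => Status::Running, // assumption: new input means work is in progress
        "session.done" => Status::Done,
        "session.failed" => Status::Failed,
        _ => current,
    }
}

/// Replay a session's event types in order, starting from `Created`.
pub fn replay(event_types: &[&str]) -> Status {
    event_types
        .iter()
        .fold(Status::Created, |status, event| status_after_event(status, *event))
}

/// Default on-disk layout, assuming everything lives under one root directory.
pub struct StorePaths {
    pub root: PathBuf,
    pub artifacts: PathBuf,
    pub db: PathBuf,
}

pub fn default_paths(home: &Path) -> StorePaths {
    let root = home.join(".browser-use-terminal");
    StorePaths {
        artifacts: root.join("artifacts"),
        db: root.join("state.db"),
        root,
    }
}
```

Because status is derived from events, replaying the same log always reproduces the same state, which is what lets the UI rebuild history after a restart.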

### The Core Agent Loop Is the Coordinator

The core loop creates or loads a session, inserts browser-mode instructions, starts a Python worker, records model config, loops over provider turns, captures model deltas and usage, dispatches tool calls, and marks the session done or failed. It also handles retries for transient provider errors and records retry events.

Sources: [crates/browser-use-core/src/lib.rs:694-811](), [crates/browser-use-core/src/lib.rs:833-1039](), [crates/browser-use-core/src/lib.rs:1116-1274]()
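
The retry behavior for transient provider errors can be illustrated with a small generic helper. This is a sketch, not the core crate's implementation: `ProviderError`, `with_retries`, and the attempt limit are hypothetical, and a real loop would likely also wait between attempts.

```rust
/// Hypothetical error classification for a provider call.
#[derive(Debug, PartialEq, Eq)]
pub enum ProviderError {
    /// Worth retrying (for example a timeout or rate limit).
    Transient(String),
    /// Not worth retrying (for example a rejected API key).
    Fatal(String),
}

/// Call `call` until it succeeds, fails fatally, or `max_attempts` is reached.
/// `on_retry` receives the failed attempt number and the reason, which is
/// where a retry event could be recorded. `call` always runs at least once.
pub fn with_retries<T, F, R>(
    max_attempts: u32,
    mut call: F,
    mut on_retry: R,
) -> Result<T, ProviderError>
where
    F: FnMut() -> Result<T, ProviderError>,
    R: FnMut(u32, &str),
{
    let mut attempt = 1;
    loop {
        match call() {
            Ok(value) => return Ok(value),
            // Retry only transient failures, and only while attempts remain.
            Err(ProviderError::Transient(reason)) if attempt < max_attempts => {
                on_retry(attempt, &reason);
                attempt += 1;
            }
            // Fatal errors and exhausted retries are returned to the caller.
            Err(err) => return Err(err),
        }
    }
}
```

Passing the retry hook as a closure keeps the helper independent of the store, so the caller decides how retries are logged.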

## Provider Neutrality and Bring-Your-Own-Key

The model side is intentionally swappable. The `ModelProvider` trait only requires a provider name, a model name, and a turn interface. Core config can run Codex, OpenAI, Anthropic, OpenRouter, fake, or no provider. API keys and base URLs can come from stored settings or environment variables, which keeps the architecture bring-your-own-key (BYOK) friendly and avoids hard-coding a single model vendor as the only path.

The TUI exposes account choices for Codex login, OpenAI API key, Anthropic API key, and OpenRouter API key. The provider code also supports configurable base URLs for OpenAI-compatible and other backends, which is important for BYOC/BYOK and vendor-agnostic deployments.

Sources: [crates/browser-use-providers/src/lib.rs:18-45](), [crates/browser-use-core/src/lib.rs:55-71](), [crates/browser-use-core/src/lib.rs:310-337](), [crates/browser-use-core/src/lib.rs:426-607](), [crates/browser-use-tui/src/settings.rs:60-83]()
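
A trait with that shape, plus a settings-or-environment lookup, might look like the sketch below. The method names and signatures are assumptions (the real `ModelProvider` trait is not reproduced here), `FakeProvider` and `resolve_setting` are hypothetical, and preferring the stored value over the environment is a guess at precedence.

```rust
/// Sketch of a provider trait: a provider name, a model name, and one turn.
/// Method names and signatures are assumptions, not the repo's real trait.
pub trait ModelProvider {
    fn provider_name(&self) -> &str;
    fn model_name(&self) -> &str;
    /// Run one turn and return the model's reply, if any.
    fn turn(&mut self, input: &str) -> Option<String>;
}

/// A canned provider, useful for tests that must not call a real API.
pub struct FakeProvider {
    pub replies: Vec<String>,
}

impl ModelProvider for FakeProvider {
    fn provider_name(&self) -> &str {
        "fake"
    }
    fn model_name(&self) -> &str {
        "fake-model"
    }
    fn turn(&mut self, _input: &str) -> Option<String> {
        if self.replies.is_empty() {
            None
        } else {
            Some(self.replies.remove(0))
        }
    }
}

/// Code that only needs the trait works with any backend.
pub fn describe(provider: &dyn ModelProvider) -> String {
    format!("{}/{}", provider.provider_name(), provider.model_name())
}

/// Pick a key or base URL: stored setting first, then the environment value.
/// Blank values count as missing. The precedence order is an assumption.
pub fn resolve_setting(stored: Option<&str>, from_env: Option<&str>) -> Option<String> {
    fn non_empty(value: Option<&str>) -> Option<String> {
        value
            .map(str::trim)
            .filter(|v| !v.is_empty())
            .map(String::from)
    }
    non_empty(stored).or_else(|| non_empty(from_env))
}
```

A caller would pass the environment value in explicitly, for example `std::env::var(name).ok().as_deref()`, which keeps the lookup itself free of global state.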

## Browser Control Is Separate From Model Choice

The browser runtime is its own control plane. The source comments say the split is intentional: `browser` controls connection, lifecycle, and debug state, while `browser_script` runs fresh Python for page interaction through the Rust-held CDP connection.

The browser command layer supports local browser connection, managed Chromium, and remote CDP. Local Chrome can be blocked until the user enables remote debugging. Managed Chromium is Rust-owned. Browser Use cloud starts a remote browser through the Browser Use API and records a live URL when available.

Sources: [crates/browser-use-browser/src/lib.rs:1-7](), [crates/browser-use-browser/src/lib.rs:438-469](), [crates/browser-use-browser/src/lib.rs:710-802](), [crates/browser-use-browser/src/lib.rs:836-890](), [crates/browser-use-browser/src/lib.rs:893-958]()
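
The browser modes and the "blocked until enabled" idea can be sketched as an enum plus a readiness check. The names `BrowserMode`, `Readiness`, and `readiness` are hypothetical, and the conditions for remote CDP and cloud (a non-empty endpoint, available credentials) are assumptions; only the local Chrome remote-debugging requirement comes from the description above.

```rust
/// Hypothetical enumeration of the browser modes described in the text.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BrowserMode {
    LocalChrome,
    ManagedChromium,
    RemoteCdp { endpoint: String },
    Cloud,
}

/// Whether a mode can be used right now, with a reason when it cannot.
#[derive(Debug, PartialEq, Eq)]
pub enum Readiness {
    Ready,
    Blocked(&'static str),
}

/// Check the preconditions for a mode before starting a task.
pub fn readiness(
    mode: &BrowserMode,
    remote_debugging_enabled: bool,
    has_cloud_credentials: bool,
) -> Readiness {
    match mode {
        // Local Chrome stays blocked until the user enables remote debugging.
        BrowserMode::LocalChrome if !remote_debugging_enabled => {
            Readiness::Blocked("enable remote debugging in Chrome")
        }
        // Assumption: a remote CDP connection needs a non-empty endpoint.
        BrowserMode::RemoteCdp { endpoint } if endpoint.trim().is_empty() => {
            Readiness::Blocked("remote CDP endpoint is missing")
        }
        // Assumption: the cloud browser needs credentials for the Browser Use API.
        BrowserMode::Cloud if !has_cloud_credentials => {
            Readiness::Blocked("Browser Use cloud credentials are missing")
        }
        _ => Readiness::Ready,
    }
}
```

Note that the check takes no model or provider argument, which mirrors the point of this section: browser selection is independent of model selection.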

## The Python Worker Is a Tool Bridge

The Rust code starts a Python worker process and sends JSON requests containing session id, working directory, artifact directory, code, cancellation state, timeout, and optional control command. Responses can include text, structured data, outputs, artifacts, images, browser events, and browser harness availability.

This is why the repo can be Rust-first while still using Python for page-level scripts and browser helper packages.

Sources: [crates/browser-use-python-worker/src/lib.rs:32-65](), [crates/browser-use-python-worker/src/lib.rs:75-126](), [crates/browser-use-python-worker/src/lib.rs:128-163]()
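
A request of that shape could be encoded as one JSON object per line. The sketch below uses only the standard library so it stands alone; a real crate would more likely use `serde`. The struct name, the JSON key names, the timeout unit, and the newline-delimited framing are all assumptions, not the worker's actual wire format.

```rust
/// Hypothetical request sent from Rust to the Python worker.
pub struct WorkerRequest {
    pub session_id: String,
    pub cwd: String,
    pub artifact_dir: String,
    pub code: String,
    pub cancelled: bool,
    pub timeout_secs: u64,
    /// Optional control command; encoded as `null` when absent.
    pub control: Option<String>,
}

/// Encode a string as a JSON string literal, escaping quotes, backslashes,
/// and control characters.
fn json_string(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('"');
    for ch in value.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

impl WorkerRequest {
    /// Serialize to a single line of JSON terminated by a newline.
    pub fn to_json_line(&self) -> String {
        let control = match &self.control {
            Some(command) => json_string(command),
            None => "null".to_string(),
        };
        format!(
            "{{\"session_id\":{},\"cwd\":{},\"artifact_dir\":{},\"code\":{},\"cancelled\":{},\"timeout_secs\":{},\"control\":{}}}\n",
            json_string(&self.session_id),
            json_string(&self.cwd),
            json_string(&self.artifact_dir),
            json_string(&self.code),
            self.cancelled,
            self.timeout_secs,
            control,
        )
    }
}
```

Escaping newlines inside `code` matters under this framing: multi-line Python must still travel as a single line so the reader on the other side can split requests on line boundaries.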

## The Few Ideas To Remember

1. A task is a session.
   The system records the prompt, model output, tool calls, browser events, result, and failures as session events.

2. The terminal is not just output.
   The TUI can start tasks, send follow-ups, retry, pick history, choose model/account/browser settings, and display different product states.

3. The model is replaceable.
   Providers share a trait, and the core can run several backends. Keys and base URLs can come from settings or environment variables.

4. The browser is replaceable too.
   The agent can work through local Chrome, managed Chromium, remote CDP, or Browser Use cloud, depending on user choice and available credentials.

5. Testing terminal behavior needs a real terminal.
   The project’s own testing notes say compilation and Ratatui dumps are not enough for TUI changes; the verification loop includes a real terminal, key sequences, captured panes, and checks for broken terminal artifacts.

Sources: [crates/browser-use-protocol/src/lib.rs:56-84](), [crates/browser-use-tui/src/settings.rs:74-83](), [crates/browser-use-tui/src/runtime.rs:75-102](), [docs/terminal-ui-testing.md:1-33]()

## Closing Summary

Browser Use Terminal is best understood as a local, Rust-first browser-agent workbench. It keeps model choice, browser choice, session storage, terminal UI, and Python/browser scripting as separate pieces. As a result, users can bring their own accounts, keys, browser mode, and runtime expectations, and the repo does not depend on a single hosted model or a single browser path.

Sources: [Cargo.toml:1-11](), [crates/browser-use-core/src/lib.rs:301-337](), [crates/browser-use-browser/src/lib.rs:56-75]()