# Browser Use Terminal Explain Like I'm 5 Wiki

> A plain-language map of a Rust terminal app that lets a person steer browser agents from the command line. The structure stays provider-neutral: model providers, browser backends, local files, and catalog-style prompt material are treated as swappable parts.

## Context Links

- [Agent index](https://grok-wiki.com/public/wiki/browser-use-terminal-686510dbe50c/llms.txt)
- [Human interactive wiki](https://grok-wiki.com/public/wiki/browser-use-terminal-686510dbe50c)
- [GitHub repository](https://github.com/browser-use/terminal)

## Repository Metadata

- Repository: browser-use/terminal

- Generated: 2026-05-22T22:46:16.272Z
- Updated: 2026-05-22T22:46:30.480Z
- Runtime: Codex CLI
- Format: Explain Like I'm 5
- Pages: 6

## Page Index

- 01. [Explain It Simply](https://grok-wiki.com/public/wiki/browser-use-terminal-686510dbe50c/pages/01-explain-it-simply.md) - What this repo does in plain language, the simplest useful analogy, and the few ideas the reader should remember.
- 02. [Install, Run, and Pick a Browser](https://grok-wiki.com/public/wiki/browser-use-terminal-686510dbe50c/pages/02-install-run-and-pick-a-browser.md) - How the repo turns shell commands into Rust binaries, where state lives, and why the browser choice can be local Chrome, headless Chromium, or Browser Use cloud.
- 03. [The Terminal Workbench](https://grok-wiki.com/public/wiki/browser-use-terminal-686510dbe50c/pages/03-the-terminal-workbench.md) - The TUI is the steering wheel: it renders the current work, accepts keys and slash-style choices, starts agent runs, and keeps the screen usable while work is live.
- 04. [The Agent Loop and Its Tools](https://grok-wiki.com/public/wiki/browser-use-terminal-686510dbe50c/pages/04-the-agent-loop-and-its-tools.md) - The agent loop is the careful helper: it asks a model provider for the next move, exposes browser and filesystem tools, records events, and stops only when a done, failure, or cancellation path is reached.
- 05. [The Browser Driver](https://grok-wiki.com/public/wiki/browser-use-terminal-686510dbe50c/pages/05-the-browser-driver.md) - The browser layer is the remote-control box: Rust owns CDP connections and browser lifecycle, while Python helper code runs page scripts, collects artifacts, and reports browser events without forcing one model vendor.
- 06. [Remember This Map](https://grok-wiki.com/public/wiki/browser-use-terminal-686510dbe50c/pages/06-remember-this-map.md) - A closing recap: the TUI is the steering wheel, the agent loop is the helper, the browser driver is the remote-control box, and the store plus tests are how the repo remembers and proves what happened.

## Source File Index

- `Cargo.toml`
- `crates/browser-use-browser/src/browser_script_helpers.py`
- `crates/browser-use-browser/src/lib.rs`
- `crates/browser-use-cli/src/main.rs`
- `crates/browser-use-core/src/lib.rs`
- `crates/browser-use-core/src/tools/command.rs`
- `crates/browser-use-core/src/tools/files.rs`
- `crates/browser-use-core/src/tools/mod.rs`
- `crates/browser-use-protocol/src/lib.rs`
- `crates/browser-use-providers/src/lib.rs`
- `crates/browser-use-python-worker/src/lib.rs`
- `crates/browser-use-store/migrations/0001_initial.sql`
- `crates/browser-use-store/src/lib.rs`
- `crates/browser-use-tui/src/composer.rs`
- `crates/browser-use-tui/src/main.rs`
- `crates/browser-use-tui/src/palette.rs`
- `crates/browser-use-tui/src/render.rs`
- `crates/browser-use-tui/src/runtime.rs`
- `crates/browser-use-tui/src/settings.rs`
- `crates/browser-use-tui/src/transcript.rs`
- `docs/terminal-ui-product-ux.md`
- `docs/terminal-ui-testing.md`
- `prompts/browser-agent-system.md`
- `prompts/browser-script-tool-description.md`
- `prompts/browser-tool-description.md`
- `prompts/interaction-skills/connection.md`
- `prompts/interaction-skills/profile-sync.md`
- `prompts/python-tool-description.md`
- `pyproject.toml`
- `python/llm_browser_worker/rust_cli.py`
- `python/llm_browser_worker/worker.py`
- `README.md`
- `scripts/install/install.sh`
- `scripts/tui-terminal-smoke.py`
- `scripts/verify-terminal-ui.sh`
- `tests/golden-events/running-browser-session/events.jsonl`
- `tests/golden-events/running-browser-session/session.json`

---

## 01. Explain It Simply

> What this repo does in plain language, the simplest useful analogy, and the few ideas the reader should remember.

- Page Markdown: https://grok-wiki.com/public/wiki/browser-use-terminal-686510dbe50c/pages/01-explain-it-simply.md
- Generated: 2026-05-22T22:45:30.555Z

### Source Files

- `README.md`
- `Cargo.toml`
- `pyproject.toml`
- `crates/browser-use-cli/src/main.rs`
- `crates/browser-use-tui/src/main.rs`
- `crates/browser-use-core/src/lib.rs`

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:
- [README.md](README.md)
- [Cargo.toml](Cargo.toml)
- [pyproject.toml](pyproject.toml)
- [crates/browser-use-cli/src/main.rs](crates/browser-use-cli/src/main.rs)
- [crates/browser-use-tui/src/main.rs](crates/browser-use-tui/src/main.rs)
- [crates/browser-use-core/src/lib.rs](crates/browser-use-core/src/lib.rs)
- [crates/browser-use-core/src/tools/mod.rs](crates/browser-use-core/src/tools/mod.rs)
- [crates/browser-use-providers/src/lib.rs](crates/browser-use-providers/src/lib.rs)
- [crates/browser-use-store/src/lib.rs](crates/browser-use-store/src/lib.rs)
- [crates/browser-use-protocol/src/lib.rs](crates/browser-use-protocol/src/lib.rs)
- [crates/browser-use-browser/src/lib.rs](crates/browser-use-browser/src/lib.rs)
- [crates/browser-use-python-worker/src/lib.rs](crates/browser-use-python-worker/src/lib.rs)
- [crates/browser-use-tui/src/settings.rs](crates/browser-use-tui/src/settings.rs)
- [crates/browser-use-tui/src/runtime.rs](crates/browser-use-tui/src/runtime.rs)
- [docs/terminal-ui-testing.md](docs/terminal-ui-testing.md)
</details>

# Explain It Simply

Browser Use Terminal lets you give a browser task to an AI agent from your terminal. The repo is not just a wrapper around a website: it has a Rust terminal interface, a Rust agent loop, a persistent local event store, model-provider adapters, and browser-control code that can attach to Chrome, start managed Chromium, or use Browser Use cloud.

The simplest way to think about it: this repo is a control room for browser work. You type the job, the model decides what to do next, the runtime gives it tools, the browser layer drives Chrome, and the store keeps a log so the UI can show history, results, failures, and follow-ups.

Sources: [README.md:7-10](), [README.md:20-28](), [Cargo.toml:1-11]()

## The One-Sentence Version

Browser Use Terminal is a Rust-first command-line and terminal UI application for running browser agents: it accepts a task, records it as a session, asks a model what to do, gives that model browser and local tools, and saves the task’s events and artifacts locally.

Sources: [README.md:29-49](), [crates/browser-use-core/src/lib.rs:252-285](), [crates/browser-use-store/src/lib.rs:195-240]()

## The Simplest Useful Analogy

Imagine a careful assistant sitting at a computer:

| Analogy part | Real repo part | What it does |
|---|---|---|
| The task notebook | Store and protocol | Records sessions, statuses, events, artifacts, and transcript data. |
| The assistant’s brain | Model provider | Produces text, tool calls, usage, and done events. |
| The hands on the browser | Browser runtime | Connects to local Chrome, managed Chromium, remote CDP, or Browser Use cloud. |
| The workbench screen | TUI | Shows setup, task state, history, model/browser/account choices, and results. |
| The toolbox | Core tool registry | Gives the model browser, browser script, file, shell, patch, planning, and helper-agent tools. |

This analogy maps directly to the code: sessions have statuses like `created`, `running`, `done`, `failed`, and `cancelled`; model events can be text, tool calls, usage, or done markers; and the tool registry explicitly registers browser, browser script, filesystem, command, plan, and helper-agent tools.

Sources: [crates/browser-use-protocol/src/lib.rs:15-23](), [crates/browser-use-protocol/src/lib.rs:130-148](), [crates/browser-use-core/src/tools/mod.rs:40-62]()

## What Happens When You Type a Task

In the TUI, submitting text creates a new session, writes a `session.input` event, selects that session, and starts an agent thread. If you already selected an old session, the same input becomes a follow-up event instead.

```rust
// crates/browser-use-tui/src/main.rs
let session = self.store.create_session(None, std::env::current_dir()?)?;
self.store.append_event(
    &session.id,
    "session.input",
    serde_json::json!({ "text": text }),
)?;
self.start_agent_for_session(session.id)?;
```

Then the TUI worker opens the store, checks browser-specific requirements, builds provider run config, and calls the core agent runner.

Sources: [crates/browser-use-tui/src/main.rs:913-963](), [crates/browser-use-tui/src/main.rs:980-1078](), [crates/browser-use-tui/src/runtime.rs:12-60]()

## The Main Pieces

```text
User
  |
  v
TUI or CLI
  |
  v
Local session store  <---->  Protocol projections for history/result/activity
  |
  v
Core agent loop
  |        \
  |         \-- Model provider: Codex, OpenAI, Anthropic, OpenRouter, fake
  |
  \-- Tools: browser, browser_script, files, shell, patch, helper agents
            |
            v
     Browser runtime / Python worker
            |
            v
 Local Chrome, managed Chromium, remote CDP, or Browser Use cloud
```

The workspace layout supports this split. The root Cargo workspace includes separate crates for browser control, CLI, core, providers, protocol, store, Python worker, and TUI. The Python package exists too, exposing script entry points and dependencies for the worker side.

Sources: [Cargo.toml:1-11](), [pyproject.toml:5-27](), [crates/browser-use-core/src/tools/mod.rs:40-62]()

### The TUI Is the Steering Wheel

The TUI has product states such as setup needed, ready, running, result, failed, and cancelled. It also has surfaces for account, model, browser, history, developer, and setup views. That is why the app can feel like a controllable terminal application rather than a fire-and-forget command.

Sources: [crates/browser-use-tui/src/main.rs:117-163](), [crates/browser-use-tui/src/main.rs:190-198](), [crates/browser-use-tui/src/main.rs:2288-2315]()

### The CLI Is the Scriptable Door

The CLI has commands for starting tasks, running with specific providers, following up, cancelling, showing history, exporting/importing, diagnostics, tracing, datasets, and agent coordination. The scriptable run commands build a `ProviderRunConfig` for OpenAI, Codex, Anthropic, or OpenRouter and then run the core agent loop.

Sources: [crates/browser-use-cli/src/main.rs:46-186](), [crates/browser-use-cli/src/main.rs:954-1065](), [crates/browser-use-cli/src/main.rs:1128-1168]()

### The Store Is the Memory

The store creates `~/.browser-use-terminal` by default, creates an `artifacts` directory, opens `state.db`, runs migrations, and writes events. When events like `session.input`, `session.done`, or `session.failed` arrive, the store updates the session status automatically.

Sources: [crates/browser-use-store/src/lib.rs:33-50](), [crates/browser-use-store/src/lib.rs:104-145](), [crates/browser-use-store/src/lib.rs:321-378](), [crates/browser-use-store/src/lib.rs:938-951]()

### The Core Agent Loop Is the Coordinator

The core loop creates or loads a session, inserts browser-mode instructions, starts a Python worker, records model config, loops over provider turns, captures model deltas and usage, dispatches tool calls, and marks the session done or failed. It also handles retries for transient provider errors and records retry events.

Sources: [crates/browser-use-core/src/lib.rs:694-811](), [crates/browser-use-core/src/lib.rs:833-1039](), [crates/browser-use-core/src/lib.rs:1116-1274]()

## Provider Neutrality and Bring-Your-Own-Key

The model side is intentionally swappable. The `ModelProvider` trait only requires provider name, model name, and a turn interface. Core config can run Codex, OpenAI, Anthropic, OpenRouter, fake, or no provider. API keys and base URLs can come from stored settings or environment variables, which keeps the architecture BYOK-friendly and avoids hard-coding one model vendor as the only path.

The TUI exposes account choices for Codex login, OpenAI API key, Anthropic API key, and OpenRouter API key. The provider code also supports configurable base URLs for OpenAI-compatible and other backends, which is important for BYOC/BYOK and vendor-agnostic deployments.

Sources: [crates/browser-use-providers/src/lib.rs:18-45](), [crates/browser-use-core/src/lib.rs:55-71](), [crates/browser-use-core/src/lib.rs:310-337](), [crates/browser-use-core/src/lib.rs:426-607](), [crates/browser-use-tui/src/settings.rs:60-83]()

## Browser Control Is Separate From Model Choice

The browser runtime is its own control plane. The source comments say the split is intentional: `browser` controls connection, lifecycle, and debug state, while `browser_script` runs fresh Python for page interaction through the Rust-held CDP connection.

The browser command layer supports local browser connection, managed Chromium, and remote CDP. Local Chrome can be blocked until the user enables remote debugging. Managed Chromium is Rust-owned. Browser Use cloud starts a remote browser through the Browser Use API and records a live URL when available.

Sources: [crates/browser-use-browser/src/lib.rs:1-7](), [crates/browser-use-browser/src/lib.rs:438-469](), [crates/browser-use-browser/src/lib.rs:710-802](), [crates/browser-use-browser/src/lib.rs:836-890](), [crates/browser-use-browser/src/lib.rs:893-958]()

## The Python Worker Is a Tool Bridge

The Rust code starts a Python worker process and sends JSON requests containing session id, working directory, artifact directory, code, cancellation state, timeout, and optional control command. Responses can include text, structured data, outputs, artifacts, images, browser events, and browser harness availability.

This is why the repo can be Rust-first while still using Python for page-level scripts and browser helper packages.

Sources: [crates/browser-use-python-worker/src/lib.rs:32-65](), [crates/browser-use-python-worker/src/lib.rs:75-126](), [crates/browser-use-python-worker/src/lib.rs:128-163]()

## The Few Ideas To Remember

1. A task is a session.
   The system records the prompt, model output, tool calls, browser events, result, and failures as session events.

2. The terminal is not just output.
   The TUI can start tasks, send follow-ups, retry, pick history, choose model/account/browser settings, and display different product states.

3. The model is replaceable.
   Providers share a trait, and the core can run several backends. Keys and base URLs can come from settings or environment variables.

4. The browser is replaceable too.
   The agent can work through local Chrome, managed Chromium, remote CDP, or Browser Use cloud, depending on user choice and available credentials.

5. Testing terminal behavior needs a real terminal.
   The project’s own testing notes say compilation and Ratatui dumps are not enough for TUI changes; the verification loop includes a real terminal, key sequences, captured panes, and checks for broken terminal artifacts.

Sources: [crates/browser-use-protocol/src/lib.rs:56-84](), [crates/browser-use-tui/src/settings.rs:74-83](), [crates/browser-use-tui/src/runtime.rs:75-102](), [docs/terminal-ui-testing.md:1-33]()

## Closing Summary

Browser Use Terminal is best understood as a local, Rust-first browser-agent workbench. It keeps model choice, browser choice, session storage, terminal UI, and Python/browser scripting as separate pieces, so users can bring their own accounts, keys, browser mode, and runtime expectations without the whole repo depending on a single hosted model or one browser path.

Sources: [Cargo.toml:1-11](), [crates/browser-use-core/src/lib.rs:301-337](), [crates/browser-use-browser/src/lib.rs:56-75]()

---

## 02. Install, Run, and Pick a Browser

> How the repo turns shell commands into Rust binaries, where state lives, and why the browser choice can be local Chrome, headless Chromium, or Browser Use cloud.

- Page Markdown: https://grok-wiki.com/public/wiki/browser-use-terminal-686510dbe50c/pages/02-install-run-and-pick-a-browser.md
- Generated: 2026-05-22T22:45:36.760Z

### Source Files

- `scripts/install/install.sh`
- `python/llm_browser_worker/rust_cli.py`
- `crates/browser-use-cli/src/main.rs`
- `crates/browser-use-tui/src/settings.rs`
- `crates/browser-use-store/src/lib.rs`
- `pyproject.toml`

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:
- [scripts/install/install.sh](scripts/install/install.sh)
- [python/llm_browser_worker/rust_cli.py](python/llm_browser_worker/rust_cli.py)
- [crates/browser-use-cli/src/main.rs](crates/browser-use-cli/src/main.rs)
- [crates/browser-use-tui/src/settings.rs](crates/browser-use-tui/src/settings.rs)
- [crates/browser-use-tui/src/runtime.rs](crates/browser-use-tui/src/runtime.rs)
- [crates/browser-use-tui/src/main.rs](crates/browser-use-tui/src/main.rs)
- [crates/browser-use-store/src/lib.rs](crates/browser-use-store/src/lib.rs)
- [crates/browser-use-store/migrations/0001_initial.sql](crates/browser-use-store/migrations/0001_initial.sql)
- [crates/browser-use-store/migrations/0004_app_settings.sql](crates/browser-use-store/migrations/0004_app_settings.sql)
- [crates/browser-use-core/src/lib.rs](crates/browser-use-core/src/lib.rs)
- [crates/browser-use-python-worker/src/lib.rs](crates/browser-use-python-worker/src/lib.rs)
- [python/llm_browser_worker/worker.py](python/llm_browser_worker/worker.py)
- [crates/browser-use-browser/src/lib.rs](crates/browser-use-browser/src/lib.rs)
- [pyproject.toml](pyproject.toml)
- [Cargo.toml](Cargo.toml)
- [README.md](README.md)
</details>

# Install, Run, and Pick a Browser

Browser Use Terminal looks simple from the outside: install it, type `browser`, choose a model, and choose a browser. Under the hood, the repo keeps the user-facing commands thin and pushes real work into Rust binaries, a SQLite-backed state directory, and a browser runtime that can attach to local Chrome, launch managed Chromium, or start Browser Use cloud.

Think of it like a small train station. The shell command is the ticket booth, the Rust binary is the train, the state directory is the station logbook, and the browser choice decides which track the train uses.

Generation note: this page applies the provided bundled Compound Engineering guidance as page-shaping guidance only. Implementation claims below are grounded in repository files, not in a provider-specific workflow.

## The Short Version

The repo exposes three main command names:

| Command | What it does |
|---|---|
| `browser` | Opens the TUI when run with no arguments; forwards arguments to the CLI when arguments are present. |
| `browser-use` | Same hybrid behavior as `browser`. |
| `browser-use-terminal` | Same hybrid behavior when installed, and also the Rust CLI binary name. |
| `but` | Runs the TUI directly. |

The installer downloads a platform-specific release archive, verifies its SHA-256 digest, installs `bin/but` and `bin/browser-use-terminal` into a versioned package directory, and writes small shell wrappers into the visible install directory. The Python package entry points do a similar thing for development: they call `cargo run` when the repo checkout is present.

Sources: [scripts/install/install.sh:724-808](), [scripts/install/install.sh:670-685](), [pyproject.toml:23-30](), [python/llm_browser_worker/rust_cli.py:9-25]()

## Install: Shell Wrappers Around Rust Binaries

The installer starts with a few important defaults:

```sh
REPO="${BUT_RELEASE_REPO:-browser-use/terminal}"
BIN_DIR="${BUT_INSTALL_DIR:-$HOME/.local/bin}"
BUT_HOME_DIR="${BUT_HOME:-$HOME/.browser-use-terminal}"
STANDALONE_ROOT="$BUT_HOME_DIR/packages/standalone"
```

That means visible commands normally go in `~/.local/bin`, while downloaded packages and update metadata live under `~/.browser-use-terminal/packages/standalone`.

Sources: [scripts/install/install.sh:5-14]()

The install script accepts `--release VERSION`, resolves `latest` through GitHub release metadata, chooses a target triple from the current OS and CPU, downloads `browser-use-terminal-$target.tar.gz`, checks the matching `.sha256`, extracts it, and installs it into a versioned release directory.

Sources: [scripts/install/install.sh:46-79](), [scripts/install/install.sh:117-148](), [scripts/install/install.sh:172-183](), [scripts/install/install.sh:692-746](), [scripts/install/install.sh:771-790]()

The release payload is expected to contain executable Rust binaries and a Python worker package:

```text
release dir
├─ bin/
│  ├─ but
│  └─ browser-use-terminal
└─ python/
   └─ llm_browser_worker/worker.py
```

The script treats a release as complete only when both binaries exist and the Python worker file is present.

Sources: [scripts/install/install.sh:481-512]()

## Run: `browser` Decides TUI Or CLI

The installed wrappers are intentionally tiny. For `but`, the wrapper always executes the installed `CURRENT/bin/but`. For `browser`, `browser-use`, and `browser-use-terminal`, the wrapper is hybrid: no arguments opens the TUI, while any arguments are passed to the CLI binary.

```sh
if [ "$#" -eq 0 ]; then
  exec "$CURRENT/bin/but"
fi
exec "$CURRENT/bin/browser-use-terminal" "$@"
```

Sources: [scripts/install/install.sh:522-590](), [scripts/install/install.sh:595-664](), [scripts/install/install.sh:670-685]()

The wrappers also set `BUT_HOME`, `BUT_INSTALL_DIR`, `BUT_RELEASE_REPO`, and `PYTHONPATH`, then run an automatic update check unless `BUT_AUTO_UPDATE` disables it. By default the update interval is `72000` seconds, and `BUT_REQUIRE_LATEST=1` makes a failed update check block startup.

Sources: [scripts/install/install.sh:528-589](), [scripts/install/install.sh:600-664]()

In a source checkout, Python entry points are just development conveniences. `pyproject.toml` maps `browser-use-terminal` to `llm_browser_worker.rust_cli:main` and maps `but` to `llm_browser_worker.rust_cli:tui_main`. Those functions detect the repo root and run the matching Cargo package.

```python
def main() -> None:
    _exec_rust_binary("browser-use-cli", "browser-use-terminal", sys.argv[1:])

def tui_main() -> None:
    _exec_rust_binary("browser-use-tui", "but", sys.argv[1:])
```

Sources: [pyproject.toml:23-30](), [python/llm_browser_worker/rust_cli.py:9-25](), [Cargo.toml:1-11]()

## Where State Lives

There are two related locations:

| Location | Default | Purpose |
|---|---:|---|
| Install/package root | `~/.browser-use-terminal/packages/standalone` | Versioned release packages, `current` symlink, installer lock, update stamp/log. |
| Runtime state dir | `~/.browser-use-terminal` | SQLite state database, artifacts, settings, sessions, events. |

The CLI and TUI both default `--state-dir` to `~/.browser-use-terminal`. The store resolves that path, creates the directory, creates an `artifacts` subdirectory, opens `state.db`, enables WAL mode, and applies migrations.

Sources: [crates/browser-use-cli/src/main.rs:35-43](), [crates/browser-use-tui/src/main.rs:88-99](), [crates/browser-use-store/src/lib.rs:33-50](), [crates/browser-use-store/src/lib.rs:104-145]()

The first migration creates the durable core tables: `sessions`, `events`, `artifacts`, `runs`, and `agent_edges`. A later migration adds `app_settings`, which is where choices such as model, account, browser, and stored API keys are saved.

Sources: [crates/browser-use-store/migrations/0001_initial.sql:1-49](), [crates/browser-use-store/migrations/0004_app_settings.sql:1-5](), [crates/browser-use-store/src/lib.rs:16-25]()

Settings are plain key/value rows. `set_setting` upserts by key, `get_setting` reads one value, and `list_settings` returns all settings ordered by key.

Sources: [crates/browser-use-store/src/lib.rs:653-694]()

## Browser Choice Is Separate From Model Choice

The TUI keeps model/provider and browser as separate settings. That matters for BYOC and BYOK: the browser can be local, managed, or cloud while the agent backend can be Codex, OpenAI, Anthropic, OpenRouter, fake, or none.

Sources: [crates/browser-use-tui/src/settings.rs:4-50](), [crates/browser-use-tui/src/settings.rs:60-79]()

Default settings are provider-neutral at the storage layer: account, model, provider model, browser, agent backend, and setup completion are all separate keys. The default browser is `Local Chrome`.

Sources: [crates/browser-use-cli/src/main.rs:1288-1297]()

## The Three Browser Options

| TUI label | Internal mode | What it means | Main requirement |
|---|---|---|---|
| `Local Chrome` | `local` | Attach to the user's already-open browser after remote debugging is enabled. | Local Chrome setup and user approval when needed. |
| `Headless Chromium` | `managed-headless` | Start a Rust-owned managed Chromium with an isolated automation profile. | A launchable Chromium candidate. |
| `Browser Use cloud` | `cloud` | Start and connect to a Browser Use cloud browser, with live view support. | `BROWSER_USE_API_KEY` or a stored cloud key. |

Sources: [crates/browser-use-tui/src/settings.rs:74-83](), [crates/browser-use-tui/src/runtime.rs:75-101](), [crates/browser-use-core/src/lib.rs:177-205]()

### Local Chrome

Local Chrome is the default because it can use the user's real browser state. The core instruction tells the agent to run `browser connect local` before page work. The browser runtime has a matching `connect local` command path.

Sources: [crates/browser-use-core/src/lib.rs:180-184](), [crates/browser-use-browser/src/lib.rs:438-443]()

### Headless Chromium

Headless Chromium maps to `managed-headless`. The core instruction tells the agent to use `browser connect managed --headless`, and the browser runtime routes managed connections through `connect_managed`. When the managed launch is headless, the runtime adds `--headless=new` and uses a managed profile.

Sources: [crates/browser-use-core/src/lib.rs:186-190](), [crates/browser-use-browser/src/lib.rs:444-456](), [crates/browser-use-browser/src/lib.rs:838-858](), [crates/browser-use-browser/src/lib.rs:1688-1702]()

The Python worker also has legacy managed-browser helpers. It reads `LLM_BROWSER_BROWSER_MODE`, normalizes it, and can launch a managed Chrome process for `headless` or `headless-chromium`, but the Rust browser command path is the source-backed path for the TUI's `managed-headless` mode.

Sources: [python/llm_browser_worker/worker.py:229-231](), [python/llm_browser_worker/worker.py:339-345](), [python/llm_browser_worker/worker.py:359-373]()

### Browser Use Cloud

Browser Use cloud maps to `cloud`. The TUI checks for a stored key first, then for the `BROWSER_USE_API_KEY` environment variable. If cloud is selected without a key, the TUI records a failure for a run or prompts the user during setup instead of silently running the wrong browser.

Sources: [crates/browser-use-tui/src/runtime.rs:20-42](), [crates/browser-use-tui/src/runtime.rs:62-73](), [crates/browser-use-tui/src/main.rs:1774-1797](), [crates/browser-use-tui/src/main.rs:2383-2403]()

When a cloud key is saved through the TUI, it is stored at `auth.browser_use_cloud.api_key`. The CLI auth path uses the same setting and switches the saved browser label to `Browser Use cloud`.

Sources: [crates/browser-use-tui/src/settings.rs:74-76](), [crates/browser-use-tui/src/main.rs:1800-1817](), [crates/browser-use-cli/src/main.rs:1307-1317](), [crates/browser-use-cli/src/main.rs:1398-1406]()

The browser runtime itself requires `BROWSER_USE_API_KEY` before calling the Browser Use API. When cloud mode starts, it records the remote browser id and live URL and names the browser `Browser Use cloud`.

Sources: [crates/browser-use-browser/src/lib.rs:944-958](), [crates/browser-use-browser/src/lib.rs:1138-1144](), [crates/browser-use-browser/src/lib.rs:2600-2608]()

## How The Choice Reaches The Worker

The browser choice flows through a few layers:

```text
TUI/CLI setting
  -> AgentRunOptions.browser_mode
  -> core system instruction for the agent
  -> PythonWorker launch env: LLM_BROWSER_BROWSER_MODE
  -> browser helper behavior inside worker.py
```

The core run options contain `browser_mode`, and `with_browser_mode` sets it. Before an agent run, core inserts a system message explaining the selected browser mode and the browser command the agent should use. Core then starts the Python worker with the browser mode and any extra Python environment.

Sources: [crates/browser-use-core/src/lib.rs:121-154](), [crates/browser-use-core/src/lib.rs:177-205](), [crates/browser-use-core/src/lib.rs:729-777]()

The Rust Python worker launcher builds `PYTHONPATH`, tries `uv run` with pinned helper packages first, falls back to `python3`, and exports `LLM_BROWSER_BROWSER_MODE` when a browser mode is present.

Sources: [crates/browser-use-python-worker/src/lib.rs:84-126](), [crates/browser-use-python-worker/src/lib.rs:128-163](), [crates/browser-use-python-worker/src/lib.rs:426-443]()

## CLI Browser Preference Commands

The CLI also supports browser preference state. `browser preference use <local|cloud|managed-headless>` normalizes the mode, stores both an internal mode and display label, and reports the next step as `browser connect`.

Sources: [crates/browser-use-core/src/lib.rs:3017-3035](), [crates/browser-use-core/src/lib.rs:3233-3261]()

A plain `browser connect` can then resolve through the saved preference:

| Saved mode | Resolved command |
|---|---|
| `local` | `browser connect local` |
| `managed-headless` | `browser connect managed --headless` |
| `managed-headed` | `browser connect managed --headed` |
| `cloud` | `browser remote start` |

Sources: [crates/browser-use-core/src/lib.rs:3132-3150](), [crates/browser-use-core/src/lib.rs:3152-3188]()

## Architecture Map

```mermaid
flowchart TB
  subgraph Shell["Shell commands"]
    Browser["browser / browser-use"]
    But["but"]
  end

  subgraph Install["Installed package root"]
    Current["packages/standalone/current"]
    Bin["bin/but + bin/browser-use-terminal"]
    Py["python/llm_browser_worker"]
  end

  subgraph Rust["Rust binaries"]
    TUI["browser-use-tui"]
    CLI["browser-use-cli"]
    Core["browser-use-core"]
    Store["browser-use-store"]
    BrowserRuntime["browser-use-browser"]
    WorkerLauncher["browser-use-python-worker"]
  end

  subgraph State["Runtime state dir"]
    DB["state.db"]
    Artifacts["artifacts/"]
    Settings["app_settings"]
  end

  subgraph BrowserOptions["Browser options"]
    Local["Local Chrome"]
    Headless["Headless Chromium"]
    Cloud["Browser Use cloud"]
  end

  Browser -->|no args| TUI
  Browser -->|args| CLI
  But --> TUI
  TUI --> Core
  CLI --> Core
  Core --> Store
  Store --> DB
  Store --> Artifacts
  Store --> Settings
  Core --> WorkerLauncher
  Core --> BrowserRuntime
  BrowserRuntime --> Local
  BrowserRuntime --> Headless
  BrowserRuntime --> Cloud
  Current --> Bin
  Current --> Py
```

Sources: [scripts/install/install.sh:595-676](), [Cargo.toml:1-11](), [crates/browser-use-store/src/lib.rs:104-145](), [crates/browser-use-tui/src/runtime.rs:75-101](), [crates/browser-use-browser/src/lib.rs:438-456]()

## Practical Rules For Newcomers

Use `browser` when you want the guided terminal UI. Use `browser auth status`, `browser config show`, or other arguments when you want CLI behavior. The README exposes the same user-facing split: launch with `browser`, then use `/auth`, `/model`, `/browser`, and `/update` inside the TUI.

Sources: [README.md:65-88](), [scripts/install/install.sh:440-461]()

Choose `Local Chrome` when the task needs your real logged-in browser. Choose `Headless Chromium` when you want a cleaner Rust-owned automation browser. Choose `Browser Use cloud` when the browser should be remote, and make sure the cloud key is available through stored settings or `BROWSER_USE_API_KEY`.

Sources: [README.md:20-27](), [crates/browser-use-tui/src/settings.rs:74-83](), [crates/browser-use-tui/src/runtime.rs:20-42]()

Keep model and browser decisions separate. The code supports multiple model backends and multiple browser modes as independent settings, which keeps the architecture BYOC/BYOK friendly instead of tying the terminal to one model provider or one browser provider.

Sources: [crates/browser-use-tui/src/settings.rs:4-79](), [crates/browser-use-cli/src/main.rs:1288-1297]()

## Summary

Browser Use Terminal turns shell commands into Rust binaries through small wrappers and Python entry-point shims, keeps durable state in `~/.browser-use-terminal/state.db` plus an `artifacts` directory, and routes browser work through an explicit browser mode. Local Chrome, Headless Chromium, and Browser Use cloud are not separate apps; they are saved settings that become run options, agent instructions, worker environment, and browser-runtime commands.

Sources: [scripts/install/install.sh:670-808](), [crates/browser-use-store/src/lib.rs:104-145](), [crates/browser-use-core/src/lib.rs:177-205](), [crates/browser-use-tui/src/runtime.rs:75-101]()

---

## 03. The Terminal Workbench

> The TUI is the steering wheel: it renders the current work, accepts keys and slash-style choices, starts agent runs, and keeps the screen usable while work is live.

- Page Markdown: https://grok-wiki.com/public/wiki/browser-use-terminal-686510dbe50c/pages/03-the-terminal-workbench.md
- Generated: 2026-05-22T22:45:16.151Z

### Source Files

- `crates/browser-use-tui/src/main.rs`
- `crates/browser-use-tui/src/runtime.rs`
- `crates/browser-use-tui/src/render.rs`
- `crates/browser-use-tui/src/composer.rs`
- `crates/browser-use-tui/src/settings.rs`
- `crates/browser-use-tui/src/palette.rs`
- `crates/browser-use-tui/src/transcript.rs`
- `docs/terminal-ui-product-ux.md`

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:
- [crates/browser-use-tui/src/main.rs](crates/browser-use-tui/src/main.rs)
- [crates/browser-use-tui/src/runtime.rs](crates/browser-use-tui/src/runtime.rs)
- [crates/browser-use-tui/src/render.rs](crates/browser-use-tui/src/render.rs)
- [crates/browser-use-tui/src/composer.rs](crates/browser-use-tui/src/composer.rs)
- [crates/browser-use-tui/src/settings.rs](crates/browser-use-tui/src/settings.rs)
- [crates/browser-use-tui/src/palette.rs](crates/browser-use-tui/src/palette.rs)
- [crates/browser-use-tui/src/transcript.rs](crates/browser-use-tui/src/transcript.rs)
- [docs/terminal-ui-product-ux.md](docs/terminal-ui-product-ux.md)
- [docs/terminal-ui-testing.md](docs/terminal-ui-testing.md)
</details>

# The Terminal Workbench

The Terminal Workbench is the Rust TUI for Browser Use Terminal. It is the place where a user sets up an account, picks a model and browser backend, types a task, watches the live work, steers the task with follow-ups, opens history, and fixes setup when something is missing.

A simple way to read the code is: `main.rs` is the driver, `render.rs` is the dashboard, `composer.rs` is the text input, `palette.rs` is the slash menu, `transcript.rs` turns raw events into readable work, and `runtime.rs` starts the agent run with the selected model and browser settings.

Sources: [crates/browser-use-tui/src/main.rs:56-78](), [docs/terminal-ui-product-ux.md:53-80]()

## Product Model

The product document says there is one main screen: the workbench. Setup, browser, history, and action surfaces are temporary helpers around that main screen. The intended user flow is small: set up once, tell the browser what to do, watch enough to trust it, interrupt or steer when needed, then get a useful result.

The code follows that idea with a `Surface` enum for the main screen, setup screens, account/API key flows, model and browser screens, history, and developer tools. It also has a smaller `ProductState` enum for the user-facing lifecycle: setup needed, ready, running, result, failed, or cancelled.

```rust
// crates/browser-use-tui/src/main.rs
enum ProductState {
    SetupNeeded,
    Ready,
    Running,
    Result,
    Failed,
    Cancelled,
}
```

Sources: [docs/terminal-ui-product-ux.md:53-80](), [crates/browser-use-tui/src/main.rs:116-163](), [crates/browser-use-tui/src/main.rs:190-198]()

## Main Ownership Boundaries

```text
User keys / paste
      |
      v
main.rs
  - App state
  - key routing
  - command dispatch
  - setup/auth/model/browser choices
      |
      +--> composer.rs       edits prompt text
      +--> palette.rs        filters slash commands
      +--> render.rs         draws main view, popups, composer, footer
      +--> transcript.rs     converts event records into readable lines
      +--> runtime.rs        starts the selected agent/browser run
      +--> Store             sessions, settings, events
```

The workbench is not just a screen renderer. It owns the loop that reads terminal events, drains store and auth notifications, starts worker threads, and redraws when the state changes. Rendering is deliberately separate: `render(frame, app)` asks the app for a projected `WorkbenchState`, decides the product state, then draws either setup, the main view, or a modal overlay.

Sources: [crates/browser-use-tui/src/main.rs:679-777](), [crates/browser-use-tui/src/main.rs:3085-3170](), [crates/browser-use-tui/src/render.rs:93-129]()

## State Comes From the Store

The TUI keeps a local cache of sessions and event records. `AppStateCache::hydrate` loads sessions and events from `Store`; later notifications refresh sessions or only the events after the last seen sequence number. When the view needs data, `project_if_needed` calls `project_workbench` to create the current `WorkbenchState`.

This matters because the terminal does not poll the agent directly for every display detail. It watches durable session events and settings, then projects them into a screen model.

Sources: [crates/browser-use-tui/src/main.rs:324-360](), [crates/browser-use-tui/src/main.rs:362-377](), [crates/browser-use-tui/src/main.rs:445-463](), [crates/browser-use-tui/src/main.rs:484-557]()

## Starting and Steering Work

When the user presses Enter on a new task, `submit` checks readiness, takes the trimmed composer text, and dispatches `StartTask`. That command creates a session, appends a `session.input` event, selects the session, resets native history, and starts an agent thread. Follow-ups use the same composer but append `session.followup`; if the selected session is no longer active, the TUI can restart the agent for that session.

```rust
// crates/browser-use-tui/src/main.rs
AppCommand::StartTask(text) => {
    let session = self.store.create_session(None, std::env::current_dir()?)?;
    self.store.append_event(
        &session.id,
        "session.input",
        serde_json::json!({ "text": text }),
    )?;
    self.selected_session_id = Some(session.id.clone());
    self.native_history.reset_with_clear();
    self.start_agent_for_session(session.id)?;
}
```

The agent is started on a named background thread. The thread calls `run_agent_thread`; panics are caught so they can be recorded instead of silently breaking the terminal loop.

Sources: [crates/browser-use-tui/src/main.rs:913-963](), [crates/browser-use-tui/src/main.rs:980-1012](), [crates/browser-use-tui/src/main.rs:1052-1083]()

## Runtime and Provider Neutrality

The workbench is BYOC/BYOK-friendly because account, model, backend, browser, and API keys are choices stored as settings, not hardcoded assumptions. `settings.rs` maps `AgentBackend` values to core provider backends and defines account choices for Codex login, OpenAI API key, Anthropic API key, and OpenRouter API key. Browser choices are also explicit: Local Chrome, Browser Use cloud, and Headless Chromium.

`runtime.rs` turns the selected browser into `AgentRunOptions`: local mode for Local Chrome, managed headless mode for Headless Chromium, and cloud mode for Browser Use cloud. Cloud mode requires a `BROWSER_USE_API_KEY` from either the store setting or the environment. This keeps the workbench portable across local credentials, user-provided keys, and different model providers.

Sources: [crates/browser-use-tui/src/settings.rs:4-50](), [crates/browser-use-tui/src/settings.rs:60-89](), [crates/browser-use-tui/src/settings.rs:89-153](), [crates/browser-use-tui/src/runtime.rs:12-60](), [crates/browser-use-tui/src/runtime.rs:62-102]()

## Composer: The Steering Input

The composer owns prompt text, cursor position, wrapping, paste insertion, and editing keys. Empty composer lines render a placeholder; non-empty composer lines use a `> ` prompt prefix. The renderer chooses the placeholder by session state: active sessions say "Type to steer the agent...", completed sessions say "Ask a follow-up...", and no session says "Tell the browser what to do...".

The composer supports normal typing, Enter submission handled by `main.rs`, Shift/Alt/Meta Enter as newline paths, cursor movement, word deletion, line deletion, paste normalization, and wrapped cursor placement. This is why the workbench can feel like a small text editor without making the rest of the app handle text mechanics.

Sources: [crates/browser-use-tui/src/composer.rs:6-41](), [crates/browser-use-tui/src/composer.rs:70-123](), [crates/browser-use-tui/src/composer.rs:125-207](), [crates/browser-use-tui/src/render.rs:1242-1285]()

## Slash Palette and Temporary Surfaces

A leading `/` opens the slash command palette only when the main composer is empty. While the palette is open, typed characters update `palette_filter` instead of the composer; Backspace edits the filter; Enter executes the selected action; Esc closes the palette. The command list is intentionally small: task, history, browser, model, auth, and update.

The palette and other temporary surfaces render as centered modal overlays when possible. Text-input popups like API key and telemetry own their input cursor, so the composer underneath is cleared or hidden to avoid duplicated typing.

Sources: [crates/browser-use-tui/src/main.rs:1294-1336](), [crates/browser-use-tui/src/main.rs:2168-2226](), [crates/browser-use-tui/src/palette.rs:1-70](), [crates/browser-use-tui/src/render.rs:458-491](), [crates/browser-use-tui/src/render.rs:686-826]()

## Rendering the Workbench

`render_main` splits the terminal into a body, bottom area, and optional footer. The body changes by product state: setup lines, ready lines, or work lines. The bottom area is either a temporary bottom pane or the composer. Running, result, failed, and cancelled states pin the important work near the composer so new activity grows toward the input rather than drifting away.

The composer is a bordered input box with the selected browser punched into the bottom border. Beneath it, a status row shows the model, context bar, and session cost when usage events include cost data. The context bar is based on `model.usage` events and a 60k-token display budget.

Sources: [crates/browser-use-tui/src/render.rs:151-295](), [crates/browser-use-tui/src/render.rs:297-367](), [crates/browser-use-tui/src/render.rs:979-1114](), [crates/browser-use-tui/src/render.rs:1152-1227]()

## Transcript: Turning Events Into Human Work

The raw event log is not shown directly. `transcript.rs` maps event types into transcript nodes: prompts, assistant markdown, result files, grouped timeline rows, browser activity, errors, cancelled state, and live status. It hides many low-level model and tool events, merges repeated timeline groups, compacts repeated file reads, and keeps transient thinking out of terminal scrollback.

For a newcomer, this is the difference between "the engine made many internal noises" and "the dashboard says it read files, ran a command, opened a page, or produced a result."

Sources: [crates/browser-use-tui/src/transcript.rs:20-92](), [crates/browser-use-tui/src/transcript.rs:368-443](), [crates/browser-use-tui/src/transcript.rs:591-715](), [crates/browser-use-tui/src/transcript.rs:760-910](), [crates/browser-use-tui/src/transcript.rs:913-981]()

## Keyboard Behavior

| Input | Behavior |
| --- | --- |
| `Enter` | Run a new task, send a follow-up, or execute the selected surface row |
| `Tab` | Open history |
| `F2` | Open browser surface |
| `/` | Open slash palette when the main composer is empty |
| `Esc` | Close overlays; on main with an active task, press again quickly to stop |
| `Ctrl+C` | Clear input, stop current task, or require a second press to quit |
| `Ctrl+Q` | Quit immediately |

The product UX document asks for a tiny keyboard model, and the implementation mostly keeps that model centralized in `App::handle_key`.

Sources: [docs/terminal-ui-product-ux.md:485-498](), [crates/browser-use-tui/src/main.rs:1125-1200](), [crates/browser-use-tui/src/main.rs:1215-1291]()

## Setup, Auth, Model, and Browser Choices

Setup is an activation and repair flow, not a permanent dashboard. First-run setup appears when setup is incomplete, no session is selected, and the composer is empty. Model selection persists the display model, provider model, account, and backend. Browser selection persists the browser and, for Browser Use cloud, starts API key entry if the key is missing.

This is also where provider-neutral design shows up in the UI: users choose an account path and a model, while the provider/backend mapping stays in settings and runtime code.

Sources: [crates/browser-use-tui/src/main.rs:1369-1374](), [crates/browser-use-tui/src/main.rs:1474-1508](), [crates/browser-use-tui/src/main.rs:1733-1772](), [crates/browser-use-tui/src/main.rs:1774-1798](), [crates/browser-use-tui/src/main.rs:2135-2143]()

## Terminal Safety and Live Redraws

The TUI runs in raw mode, enables bracketed paste, requests enhanced keyboard reporting, and uses an inline Ratatui viewport. Its loop drains store/auth notifications, refreshes from the store as a fallback, debounces resize events, animates the welcome view, animates live status, polls for terminal events, and redraws only when needed.

Terminal UI correctness is treated as more than compilation. The repository testing guide requires real terminal or tmux checks for keyboard behavior, visible text, missing escape leaks, paste markers, resize behavior, and clean selectable output.

Sources: [crates/browser-use-tui/src/main.rs:3085-3177](), [crates/browser-use-tui/src/main.rs:3180-3261](), [crates/browser-use-tui/src/main.rs:3474-3559](), [docs/terminal-ui-testing.md:1-33]()

## Why This Page Matters

The workbench is the part of the repo where product shape and runtime architecture meet. It must stay simple for users, but it cannot be vendor-specific or brittle: the code keeps account/model/browser choices explicit, stores durable events, renders from projected state, and starts the selected backend through runtime configuration. That is the key pattern to preserve when adding new providers, browser modes, setup paths, or workbench actions.

Sources: [crates/browser-use-tui/src/settings.rs:52-153](), [crates/browser-use-tui/src/runtime.rs:43-50](), [crates/browser-use-tui/src/main.rs:980-1049]()

---

## 04. The Agent Loop and Its Tools

> The agent loop is the careful helper: it asks a model provider for the next move, exposes browser and filesystem tools, records events, and stops only when a done, failure, or cancellation path is reached.

- Page Markdown: https://grok-wiki.com/public/wiki/browser-use-terminal-686510dbe50c/pages/04-the-agent-loop-and-its-tools.md
- Generated: 2026-05-22T22:45:50.545Z

### Source Files

- `crates/browser-use-core/src/lib.rs`
- `crates/browser-use-core/src/tools/mod.rs`
- `crates/browser-use-core/src/tools/command.rs`
- `crates/browser-use-core/src/tools/files.rs`
- `crates/browser-use-providers/src/lib.rs`
- `crates/browser-use-protocol/src/lib.rs`
- `prompts/browser-agent-system.md`
- `prompts/python-tool-description.md`

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:
- [crates/browser-use-core/src/lib.rs](crates/browser-use-core/src/lib.rs)
- [crates/browser-use-core/src/tools/mod.rs](crates/browser-use-core/src/tools/mod.rs)
- [crates/browser-use-core/src/tools/command.rs](crates/browser-use-core/src/tools/command.rs)
- [crates/browser-use-core/src/tools/files.rs](crates/browser-use-core/src/tools/files.rs)
- [crates/browser-use-providers/src/lib.rs](crates/browser-use-providers/src/lib.rs)
- [crates/browser-use-protocol/src/lib.rs](crates/browser-use-protocol/src/lib.rs)
- [prompts/browser-agent-system.md](prompts/browser-agent-system.md)
- [prompts/browser-script-tool-description.md](prompts/browser-script-tool-description.md)
- [prompts/python-tool-description.md](prompts/python-tool-description.md)
</details>

# The Agent Loop and Its Tools

This page explains the agent loop: the part of `browser-use/terminal` that asks a model provider what to do next, gives that model a controlled set of browser and local tools, records what happened, and stops only through a terminal path like `done`, failure, or cancellation.

Context note: the requested Compound Engineering page-shape guidance was used as synthesis guidance only. This checkout did not expose local `STRATEGY.md` or `docs/solutions/**` files, so implementation claims below stay grounded in repository code and prompts.

## The Simple Mental Model

Think of the agent as a careful helper with a notebook. On each turn it rereads the notebook, asks the model for the next move, runs only the tools the repo registered, writes the result back into the notebook, and repeats. The notebook is the session event stream; the tool menu is `ToolRegistry`; the model is any implementation of `ModelProvider`.

```text
User task + prior events
        |
        v
ProviderTurn { messages, tools }
        |
        v
ModelEvent stream: text, usage, tool calls, done
        |
        v
Tool dispatch + event recording
        |
        v
done / next turn / failed / cancelled
```

Sources: [crates/browser-use-core/src/lib.rs:796-852](), [crates/browser-use-core/src/lib.rs:920-1039](), [crates/browser-use-protocol/src/lib.rs:56-148]()

## Sessions, Events, And Status

A session has an id, cwd, artifact root, status, and timestamps. Status is intentionally small: `created`, `running`, `done`, `failed`, or `cancelled`. Only `created` and `running` are active. Events are generic records with sequence number, session id, event type, and JSON payload, so the runtime can store model deltas, tool starts, browser state, artifacts, failures, and final answers without hard-coding a separate table for each event kind.

The protocol crate also defines the shared contract for tools and model output: `ToolSpec`, `ToolCall`, `ToolResult`, image attachments, usage accounting, and `ModelEvent`. This is the language the loop, providers, UI projections, and tools all share.

Sources: [crates/browser-use-protocol/src/lib.rs:4-65](), [crates/browser-use-protocol/src/lib.rs:79-148](), [crates/browser-use-protocol/src/lib.rs:177-200]()

## Provider Neutrality: BYOC And BYOK Friendly

The loop does not call a single vendor API directly. It depends on a `ModelProvider` trait that accepts a `ProviderTurn` containing messages and tool specs, then returns model events. The core config can route to Codex, OpenAI, Anthropic, OpenRouter/OpenAI-compatible chat, fake, or none. API keys and base URLs come from store settings or environment variables, which keeps bring-your-own-key and compatible endpoint setups possible.

```rust
// crates/browser-use-providers/src/lib.rs
pub trait ModelProvider {
    fn provider_name(&self) -> &'static str { "unknown" }
    fn model_name(&self) -> &str { "unknown" }
    fn start_turn(&self, turn: ProviderTurn) -> Result<Vec<ModelEvent>>;
}
```

Sources: [crates/browser-use-providers/src/lib.rs:18-45](), [crates/browser-use-core/src/lib.rs:55-70](), [crates/browser-use-core/src/lib.rs:301-337](), [crates/browser-use-core/src/lib.rs:426-455](), [crates/browser-use-core/src/lib.rs:555-570]()

### Provider Adapters

| Adapter area | What it proves |
| --- | --- |
| OpenAI Responses | Uses a configurable base URL and sends tool specs when present. |
| OpenAI-compatible chat | Defaults to OpenRouter but accepts compatible base URLs and tool calling. |
| Anthropic Messages | Supports API key and OAuth-token style credentials with a configurable base URL. |
| Fake and Scripted | Let tests or local flows produce deterministic model events without a hosted provider. |

Sources: [crates/browser-use-providers/src/lib.rs:119-202](), [crates/browser-use-providers/src/lib.rs:205-295](), [crates/browser-use-providers/src/lib.rs:298-480](), [crates/browser-use-providers/src/lib.rs:47-116]()

## One Turn Of The Loop

At the start of a run, the core appends `session.status = running` and records the selected provider/model. For every turn up to `max_turns`, it checks cancellation, pulls in any external messages, normalizes and compacts context, prepares `ProviderTurn { messages, tools }`, and calls the provider with retry support.

Provider events are folded back into local state. Text deltas become `model.delta`; usage becomes `model.usage`; tool calls become `model.tool_call`. If the model produced no tool calls but did produce final text, the loop records `session.done`. If the model produced tool calls, the dispatcher runs them and appends tool-result messages for the next provider turn.

Sources: [crates/browser-use-core/src/lib.rs:778-852](), [crates/browser-use-core/src/lib.rs:920-959](), [crates/browser-use-core/src/lib.rs:967-1039](), [crates/browser-use-core/src/lib.rs:1116-1205]()

## The Registered Tool Surface

`ToolRegistry::browser_agent()` is the menu the model sees. It includes local command and file tools, browser runtime control, browser page scripting, completion, planning, image viewing, patching, and helper-agent coordination. The registry exposes `browser` and `browser_script`, and tests explicitly assert that the legacy `python` name is not exposed in the browser-agent tool specs, even though the dispatcher still has a compatibility path for a `python` call.

| Tool group | Examples | Main job |
| --- | --- | --- |
| Completion | `done` | End the user-facing task with text or a result file. |
| Browser runtime | `browser` | Connect, start, inspect, recover, and manage browser runtime. |
| Browser interaction | `browser_script` | Run Python against the Rust-held CDP connection. |
| Local workspace | `exec_command`, `write_stdin`, `read_file`, `search_files`, `list_files`, `apply_patch`, `view_image` | Inspect or change local files and processes. |
| Coordination | `update_plan`, `spawn_agent`, `wait_agent`, `send_input`, `send_message`, `followup_task`, `list_agents`, `close_agent` | Track longer work and manage helper sessions. |

Sources: [crates/browser-use-core/src/tools/mod.rs:6-78](), [crates/browser-use-core/src/tools/mod.rs:80-208](), [crates/browser-use-core/src/tools/mod.rs:311-375](), [crates/browser-use-core/src/tools/mod.rs:416-593](), [crates/browser-use-core/src/tools/mod.rs:652-662]()

## Browser Tools: Runtime Versus Page Work

The prompts draw a hard line between browser lifecycle and page interaction. `browser` is the control plane: status, connect, setup, doctor, recovery, profiles, logs, and ownership. `browser_script` is the page/data plane: navigation, inspection, clicks, typing, screenshots, downloads, uploads, network inspection, extraction, and browser-backed verification.

The current `browser_script` contract says each call starts a fresh Python process, Python variables do not persist across calls, browser/CDP state persists in Rust, helpers are preimported, and raw CDP is the fallback when helpers are incomplete. That keeps browser state durable while avoiding a hidden long-lived Python object model.

Sources: [prompts/browser-agent-system.md:1-14](), [prompts/browser-agent-system.md:25-43](), [prompts/browser-script-tool-description.md:1-15](), [prompts/browser-script-tool-description.md:17-72](), [crates/browser-use-core/src/lib.rs:2928-2987](), [crates/browser-use-core/src/lib.rs:3263-3321]()

## Local Tools And Event Recording

Local command execution can either finish immediately or return a command session id for later `write_stdin`. Both paths record `tool.started`, command-specific events, and `tool.finished`. File tools follow the same pattern through `run_file_tool`: record start, run the operation, record finish, or record `tool.failed`.

The file tools are intentionally practical: `read_file` supports line and byte limits; `search_files` prefers `rg --json` and falls back when needed; `list_files` respects ignore files; `view_image` records an image artifact; `apply_patch` parses and applies Codex-style patches while recording changed files.

Sources: [crates/browser-use-core/src/tools/command.rs:62-210](), [crates/browser-use-core/src/tools/command.rs:213-326](), [crates/browser-use-core/src/tools/files.rs:23-101](), [crates/browser-use-core/src/tools/files.rs:103-170](), [crates/browser-use-core/src/tools/files.rs:173-253](), [crates/browser-use-core/src/tools/files.rs:256-361](), [crates/browser-use-core/src/tools/files.rs:364-404]()

## Parallelism Is Narrow And Deliberate

The agent can run a batch of safe read-oriented tool calls in parallel, but not every tool is parallel-safe. The dispatcher groups adjacent parallel-capable calls, records `tool.batch_started`, runs each in a thread with its own store handle, records individual batch results, then records `tool.batch_finished`. Parallel eligibility is limited to file reads/search/listing and known read-only commands. Browser work, mutation, stdin, patching, planning, and helper coordination stay ordered because they touch shared state.

Sources: [crates/browser-use-core/src/lib.rs:2429-2565](), [crates/browser-use-core/src/lib.rs:2699-2724](), [prompts/browser-agent-system.md:16-20](), [crates/browser-use-core/src/tools/mod.rs:281-309]()

## Completion, Failure, And Cancellation

There are three normal ways out of the loop. `done` validates that it has either a non-empty result or a readable `result_file`, optionally records the file as an artifact, appends `session.done`, and returns a finished dispatch outcome. Failures append `session.failed`, either when the run exceeds the provider turn budget or when an unrecovered error reaches the run wrapper. Cancellation is checked before and after provider/tool work; cancelled sessions finish the run as `cancelled`.

The protocol projection code reads those events back into user-visible state. `result_from_events` prefers the latest `session.done` or helper completion, while `failure_from_events` reads the latest `session.failed`.

Sources: [crates/browser-use-core/src/lib.rs:1035-1113](), [crates/browser-use-core/src/lib.rs:1447-1467](), [crates/browser-use-core/src/lib.rs:3417-3517](), [crates/browser-use-protocol/src/lib.rs:262-327](), [crates/browser-use-protocol/src/lib.rs:419-430]()

## Why This Shape Matters

The useful boundary is simple: providers decide the next move, tools execute bounded local or browser actions, and events make the whole run reconstructable. Because providers are adapters around `ModelProvider`, the architecture can stay portable across hosted APIs, compatible endpoints, stored credentials, environment-provided keys, and deterministic fake providers. The loop is not vendor-owned; it is event-owned.

Sources: [crates/browser-use-providers/src/lib.rs:18-45](), [crates/browser-use-core/src/lib.rs:301-337](), [crates/browser-use-protocol/src/lib.rs:56-148]()

---

## 05. The Browser Driver

> The browser layer is the remote-control box: Rust owns CDP connections and browser lifecycle, while Python helper code runs page scripts, collects artifacts, and reports browser events without forcing one model vendor.

- Page Markdown: https://grok-wiki.com/public/wiki/browser-use-terminal-686510dbe50c/pages/05-the-browser-driver.md
- Generated: 2026-05-22T22:45:57.694Z

### Source Files

- `crates/browser-use-browser/src/lib.rs`
- `crates/browser-use-browser/src/browser_script_helpers.py`
- `crates/browser-use-python-worker/src/lib.rs`
- `python/llm_browser_worker/worker.py`
- `prompts/browser-tool-description.md`
- `prompts/browser-script-tool-description.md`
- `prompts/interaction-skills/connection.md`
- `prompts/interaction-skills/profile-sync.md`

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:
- [crates/browser-use-browser/src/lib.rs](crates/browser-use-browser/src/lib.rs)
- [crates/browser-use-browser/src/browser_script_helpers.py](crates/browser-use-browser/src/browser_script_helpers.py)
- [crates/browser-use-python-worker/src/lib.rs](crates/browser-use-python-worker/src/lib.rs)
- [python/llm_browser_worker/worker.py](python/llm_browser_worker/worker.py)
- [prompts/browser-tool-description.md](prompts/browser-tool-description.md)
- [prompts/browser-script-tool-description.md](prompts/browser-script-tool-description.md)
- [prompts/interaction-skills/connection.md](prompts/interaction-skills/connection.md)
- [prompts/interaction-skills/profile-sync.md](prompts/interaction-skills/profile-sync.md)
- [crates/browser-use-core/src/lib.rs](crates/browser-use-core/src/lib.rs)
- [crates/browser-use-browser/Cargo.toml](crates/browser-use-browser/Cargo.toml)
- [crates/browser-use-python-worker/Cargo.toml](crates/browser-use-python-worker/Cargo.toml)
</details>

# The Browser Driver

The browser driver is the remote-control box for web work in `browser-use/terminal`. Rust keeps the durable browser state: which browser is connected, which CDP websocket is open, which tab target is current, and which recovery actions are safe. Python is used for the small scripts that inspect pages, click, type, collect screenshots, and return artifacts.

A simple way to picture it: Rust holds the remote; Python presses the buttons for one task. This matters because a model can change, a user can bring their own browser, and a deployment can bring its own CDP endpoint without rewriting the page-interaction API.

Generation note: requested strategy and solved-problem source classes were not present in this checkout. The Compound Engineering guidance was available as bundled page-shape metadata for this wiki run, not as an installed local skill execution. Repository code and prompts remain the cited source of truth.

## The Main Split

The repository names the split directly: `browser` controls connection, lifecycle, and debug state; `browser_script` runs Python page interaction through the Rust-held CDP connection. The Rust output types also show the boundary: browser commands return command content plus browser events, while browser scripts return text, artifacts, images, and browser events.

Sources: [crates/browser-use-browser/src/lib.rs:1-6](), [crates/browser-use-browser/src/lib.rs:33-54](), [prompts/browser-tool-description.md:23-30](), [prompts/browser-script-tool-description.md:1-15]()

```text
model/tool call
    |
    | browser cmd                browser_script code
    v                           v
Rust browser session       fresh Python process
CDP websocket              helper functions
target/session ids         page JS, clicks, screenshots
browser lifecycle          artifacts/images/output
```

### What Rust Owns

Rust stores the session-level control state in `BrowserSession`: mode, owner, endpoint, CDP connection, current target id, current session id, connection generation, managed process handle, remote browser id, live URL, last errors, and logs. That state is in an in-process session registry keyed by session id.

Sources: [crates/browser-use-browser/src/lib.rs:146-192](), [crates/browser-use-browser/src/lib.rs:194-218]()

### What Python Owns

In the Rust `browser_script` path, Python is fresh per call. Rust builds a Python prelude, injects helper code, runs the user script, auto-collects newly written files, and emits one JSON result marker back to Rust. The prompt repeats the same model-facing rule: Python variables do not persist across `browser_script` calls, while browser/CDP state persists in Rust.

Sources: [crates/browser-use-browser/src/lib.rs:220-344](), [crates/browser-use-browser/src/lib.rs:2782-2974](), [prompts/browser-script-tool-description.md:7-16]()

## Browser Modes And Ownership

The driver supports more than one browser source. This is the key BYOC shape: the user can attach to an existing local Chromium-family browser, let Rust launch a managed Chromium, connect to an external CDP endpoint, or start a Browser Use cloud browser. Ownership decides what Rust may safely stop or restart.

Sources: [crates/browser-use-browser/src/lib.rs:56-92](), [crates/browser-use-browser/src/lib.rs:438-468](), [crates/browser-use-browser/src/lib.rs:630-686]()

| Mode | How it connects | Owner | Safe actions |
|---|---|---:|---|
| `local` | Finds a running browser exposing `DevToolsActivePort` or a known local CDP port | External | Rust attaches, but does not kill the user browser |
| `managed` | Launches Chromium with a temp or explicit automation profile | Rust | Rust can stop or restart it |
| `remote-cdp` | Connects to an external DevTools HTTP URL or websocket | External | Rust reconnects to the endpoint, but does not own the browser |
| `remote-cloud` | Starts Browser Use cloud through API and connects to its CDP URL | Rust | Rust can stop the cloud browser it created |

`browser status --json` exposes this ownership back to the agent, including `safety.can_restart_browser`, `safety.can_close_browser`, and `safety.can_stop_remote`. `browser runtime ownership --json` gives a more direct safe-action view before stopping anything.

Sources: [crates/browser-use-browser/src/lib.rs:630-686](), [prompts/browser-tool-description.md:59-72](), [prompts/browser-tool-description.md:80-87]()

## CDP Is The Wire

CDP, the Chrome DevTools Protocol, is the wire between the terminal and the browser. Rust opens a websocket, sends messages with incrementing ids, optionally includes a CDP `sessionId`, and waits for the matching response. If a CDP call fails, Rust records the error, classifies it, clears the live connection, and remembers the last target/session ids for recovery.

Sources: [crates/browser-use-browser/src/lib.rs:1183-1201](), [crates/browser-use-browser/src/lib.rs:1280-1324](), [crates/browser-use-browser/src/lib.rs:1350-1395]()

```rust
// crates/browser-use-browser/src/lib.rs
fn call(&mut self, method: &str, session_id: Option<&str>, params: Value) -> Result<Value> {
    let id = self.next_id;
    self.next_id += 1;
    let mut message = json!({ "id": id, "method": method, "params": params });
    if let Some(session_id) = session_id {
        message["sessionId"] = Value::String(session_id.to_string());
    }
    self.socket.send(Message::Text(serde_json::to_string(&message)?))?;
    // waits until the response with the same id arrives
}
```

The browser-script bridge is deliberately narrow. Python sends JSON requests like `{"kind":"cdp","method":"Page.navigate"}` to a localhost bridge. Rust temporarily removes the session from the registry while handling the bridge request, runs the CDP call, then puts the session back. That prevents two page scripts from mutating the same CDP session at the same instant.

Sources: [crates/browser-use-browser/src/lib.rs:2637-2692](), [crates/browser-use-browser/src/lib.rs:2694-2721](), [crates/browser-use-browser/src/lib.rs:2723-2780]()

## Tabs, Targets, And The Invisible-Tab Problem

Chrome exposes tabs and some internal surfaces as CDP targets. The driver tries to attach to a real page target first. If no real page exists, Rust creates an `about:blank` tab and attaches to that. This avoids a common failure where automation accidentally attaches to `chrome://omnibox-popup.top-chrome/`, a tiny invisible page target.

Sources: [crates/browser-use-browser/src/lib.rs:1203-1225](), [crates/browser-use-browser/src/lib.rs:2978-2988](), [prompts/interaction-skills/connection.md:3-15]()

Python helpers make tab work explicit:

```python
# crates/browser-use-browser/src/browser_script_helpers.py
tabs = list_tabs(include_chrome=False)
tab = ensure_real_tab()
switch_tab(tab["target_id"])
goto_url("https://example.com")
```

`list_tabs()` reads `Target.getTargets`, `switch_tab()` activates and attaches to a target, and `new_tab()` creates a blank target first before navigating to avoid attach/load races.

Sources: [crates/browser-use-browser/src/browser_script_helpers.py:219-264](), [crates/browser-use-browser/src/browser_script_helpers.py:267-285](), [prompts/interaction-skills/connection.md:34-41]()

## Page Interaction Helpers

The helper layer turns CDP into simple page actions. `cdp()` is still available as the source of truth, but normal work uses helpers such as `js()`, `goto_url()`, `page_info()`, `wait_for_element()`, `screenshot()`, `click_at_xy()`, `fill_input()`, `press_key()`, `scroll()`, and `upload_file()`.

Sources: [crates/browser-use-browser/src/browser_script_helpers.py:21-45](), [crates/browser-use-browser/src/browser_script_helpers.py:167-202](), [crates/browser-use-browser/src/browser_script_helpers.py:299-356](), [crates/browser-use-browser/src/browser_script_helpers.py:369-541](), [prompts/browser-script-tool-description.md:17-59]()

| Helper group | Examples | What it really does |
|---|---|---|
| Raw protocol | `cdp()`, `cdp_batch()` | Sends CDP methods through Rust |
| DOM and JS | `js()`, `page_info()` | Runs `Runtime.evaluate` and returns JSON-like values |
| Navigation | `goto_url()`, `new_tab()`, `switch_tab()` | Uses `Page.navigate` and `Target.*` |
| Input | `click_at_xy()`, `type_text()`, `press_key()`, `fill_input()` | Dispatches CDP mouse and keyboard events |
| Evidence | `screenshot()`, `screenshot_clip()`, `copy_artifact()` | Writes files/images into the artifact result |

## Artifacts, Images, And Events

The Rust `browser_script` path scans the artifact directory and output directory before and after Python runs. New files are automatically reported as artifacts, and screenshots can be emitted as image records. This lets page scripts produce evidence without depending on a particular model provider.

Sources: [crates/browser-use-browser/src/lib.rs:2782-2825](), [crates/browser-use-browser/src/lib.rs:2858-2915](), [crates/browser-use-browser/src/lib.rs:2961-2973]()

There is also a longer-lived Python worker for the general Python tool surface. It sends JSON requests over stdin/stdout, streams host-helper events before the final response, copies artifacts into `files` or `images`, and records browser events such as `browser.state`, `browser.connected`, `browser.reconnected`, and `browser.target_changed`.

Sources: [crates/browser-use-python-worker/src/lib.rs:298-372](), [python/llm_browser_worker/worker.py:593-641](), [python/llm_browser_worker/worker.py:685-772](), [python/llm_browser_worker/worker.py:1495-1537](), [python/llm_browser_worker/worker.py:1570-1613]()

The core runtime records those events into the session store. Browser command events, browser-script response events, Python worker output, images, artifacts, and browser events all pass through explicit record functions rather than being hidden in transcript text.

Sources: [crates/browser-use-core/src/lib.rs:2928-2987](), [crates/browser-use-core/src/lib.rs:3263-3315](), [crates/browser-use-core/src/lib.rs:5109-5150]()

## Profiles And Cookies

Local profile handling is intentionally conservative. The browser tool can list local Chromium-family profiles with Rust filesystem discovery, inspect a selected profile through CDP, and return cookie domain/count/expiry summaries. It does not return raw cookie values by default.

Sources: [crates/browser-use-browser/src/lib.rs:1838-1876](), [crates/browser-use-browser/src/lib.rs:1878-1934](), [prompts/browser-tool-description.md:50-58]()

Cloud profile sync from local Chrome is not part of the current terminal release. The documented flow is: use local Chrome by attaching to an already-open browser, or use an existing Browser Use cloud profile by id/name. Do not assume local-to-cloud cookie copying works.

Sources: [prompts/interaction-skills/profile-sync.md:1-13](), [prompts/interaction-skills/profile-sync.md:14-26]()

## Recovery Model

Recovery is explicit because browser automation can go stale in several different ways. A websocket can drop, a target can close, a session id can change, or a user-owned browser can stop exposing CDP. The driver reports `next_step` values and safe actions instead of silently switching tabs, relaunching browsers, or killing external Chrome.

Sources: [crates/browser-use-browser/src/lib.rs:688-708](), [crates/browser-use-browser/src/lib.rs:1009-1074](), [crates/browser-use-browser/src/lib.rs:1096-1181](), [prompts/browser-tool-description.md:74-87]()

| Symptom | Likely state | Recovery path |
|---|---|---|
| No endpoint configured | `not-configured` | Connect local, managed, or remote |
| CDP port is stale | `stale-port` / `browser-closed` | Reopen Chrome/profile, then reconnect |
| Websocket dropped | `websocket-dropped` | `browser recover reconnect-websocket` |
| Current target disappeared | `target-gone` | List/switch tabs or open a new tab |
| Session id changed | `browser.reconnected` | Treat old JS object ids as stale |

Tests lock down this behavior: status includes recovery fields, browser events are transition-based rather than heartbeat spam, recovery without a configured endpoint fails without side effects, script timeouts become tool failures, and the bridge handles large JSON responses.

Sources: [crates/browser-use-browser/src/lib.rs:3136-3192](), [crates/browser-use-browser/src/lib.rs:3216-3277](), [crates/browser-use-browser/src/lib.rs:3279-3315]()

## Provider-Neutral Shape

The browser driver is not tied to a model vendor. The browser crate depends on general Rust libraries for JSON, HTTP, websockets, temp files, and opening local URLs; it does not depend on a model SDK. The Python worker crate likewise depends on process/JSON support, not a model provider.

Sources: [crates/browser-use-browser/Cargo.toml:7-15](), [crates/browser-use-python-worker/Cargo.toml:7-12]()

At the core runtime level, provider/model configuration is recorded separately from browser worker startup. The worker gets browser-mode and environment settings, while model provider information is appended as its own `model.config` event. That separation is what keeps BYOC/BYOK practical: bring your own browser or CDP endpoint, bring your own model/provider key elsewhere, and keep the browser interface the same.

Sources: [crates/browser-use-core/src/lib.rs:760-790](), [crates/browser-use-core/src/lib.rs:3132-3189]()

Remote Browser Use cloud is an optional browser backend, not a required model provider. It needs `BROWSER_USE_API_KEY` only for cloud browsers/profiles; local, managed, and external remote-CDP modes remain available without that cloud browser path.

Sources: [crates/browser-use-browser/src/lib.rs:893-958](), [crates/browser-use-browser/src/lib.rs:2600-2627](), [prompts/browser-tool-description.md:66-72]()

## Practical Mental Checklist

When adding or debugging browser work, ask these questions in order:

1. Is this lifecycle work or page work? Use `browser` for lifecycle, `browser_script` for page interaction.
2. Who owns the browser? Only Rust-owned managed/cloud browsers can be stopped or restarted by Rust.
3. Is the current target real and visible? If not, use `list_tabs()`, `ensure_real_tab()`, or `switch_tab()`.
4. Did the websocket, target, or session change? Treat old object ids as stale after reconnect/target-change events.
5. Is the evidence saved? Use screenshots, artifacts, and `audit_artifact()` when the result needs verification.

Sources: [prompts/browser-tool-description.md:1-4](), [prompts/browser-tool-description.md:132-132](), [prompts/browser-script-tool-description.md:61-72](), [python/llm_browser_worker/worker.py:988-1051](), [python/llm_browser_worker/worker.py:1053-1339]()

## Summary

The browser driver works because it keeps ownership clear. Rust holds browser lifecycle, CDP connection state, recovery state, and safe-action rules. Python helpers perform page-level work and package evidence as text, events, artifacts, and images. That boundary keeps the browser layer portable across local browsers, managed Chromium, external CDP endpoints, and optional cloud browsers without forcing one model vendor or one hosted execution path.

Sources: [crates/browser-use-browser/src/lib.rs:1-6](), [crates/browser-use-browser/src/lib.rs:988-1006](), [crates/browser-use-browser/src/lib.rs:2637-2780](), [prompts/browser-script-tool-description.md:7-16]()

---

## 06. Remember This Map

> A closing recap: the TUI is the steering wheel, the agent loop is the helper, the browser driver is the remote-control box, and the store plus tests are how the repo remembers and proves what happened.

- Page Markdown: https://grok-wiki.com/public/wiki/browser-use-terminal-686510dbe50c/pages/06-remember-this-map.md
- Generated: 2026-05-22T22:46:16.268Z

### Source Files

- `crates/browser-use-store/src/lib.rs`
- `crates/browser-use-store/migrations/0001_initial.sql`
- `crates/browser-use-protocol/src/lib.rs`
- `tests/golden-events/running-browser-session/events.jsonl`
- `tests/golden-events/running-browser-session/session.json`
- `scripts/verify-terminal-ui.sh`
- `scripts/tui-terminal-smoke.py`
- `docs/terminal-ui-testing.md`

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:
- [crates/browser-use-store/src/lib.rs](crates/browser-use-store/src/lib.rs)
- [crates/browser-use-store/migrations/0001_initial.sql](crates/browser-use-store/migrations/0001_initial.sql)
- [crates/browser-use-store/migrations/0002_agent_session_fields.sql](crates/browser-use-store/migrations/0002_agent_session_fields.sql)
- [crates/browser-use-store/migrations/0003_agent_messages.sql](crates/browser-use-store/migrations/0003_agent_messages.sql)
- [crates/browser-use-store/migrations/0004_app_settings.sql](crates/browser-use-store/migrations/0004_app_settings.sql)
- [crates/browser-use-protocol/src/lib.rs](crates/browser-use-protocol/src/lib.rs)
- [crates/browser-use-core/src/lib.rs](crates/browser-use-core/src/lib.rs)
- [crates/browser-use-browser/src/lib.rs](crates/browser-use-browser/src/lib.rs)
- [crates/browser-use-providers/src/lib.rs](crates/browser-use-providers/src/lib.rs)
- [crates/browser-use-tui/src/main.rs](crates/browser-use-tui/src/main.rs)
- [crates/browser-use-tui/src/runtime.rs](crates/browser-use-tui/src/runtime.rs)
- [crates/browser-use-tui/src/render.rs](crates/browser-use-tui/src/render.rs)
- [crates/browser-use-tui/src/settings.rs](crates/browser-use-tui/src/settings.rs)
- [tests/golden-events/running-browser-session/events.jsonl](tests/golden-events/running-browser-session/events.jsonl)
- [tests/golden-events/running-browser-session/session.json](tests/golden-events/running-browser-session/session.json)
- [scripts/verify-terminal-ui.sh](scripts/verify-terminal-ui.sh)
- [scripts/tui-terminal-smoke.py](scripts/tui-terminal-smoke.py)
- [docs/terminal-ui-testing.md](docs/terminal-ui-testing.md)
</details>

# Remember This Map

This page is the closing mental map for `browser-use/terminal`. Think of the app as four simple parts: the TUI is the steering wheel, the agent loop is the helper, the browser driver is the remote-control box, and the store plus tests are the notebook and proof system.

This is not a claim that the repo is tiny. It is a useful first map: when something changes on screen, an event was probably written; when an agent does work, the loop probably appended events; when a browser changes, the browser driver probably emitted browser events; when a maintainer asks whether a TUI change is done, the terminal verification loop is the repo-owned answer.

Generation note: no `STRATEGY.md` or `docs/solutions/**` source was present in this checkout, and no installed local Compound Engineering skill was available to execute. This page applies the requested wiki-shape guidance as portable synthesis while keeping repository code as the source of truth.

## The One-Page Map

```mermaid
flowchart LR
  subgraph UI["TUI: steering wheel"]
    Main["crates/browser-use-tui/src/main.rs<br/>surfaces, commands, state cache"]
    Render["crates/browser-use-tui/src/render.rs<br/>WorkbenchState -> terminal"]
  end

  subgraph Agent["Agent loop: helper"]
    Runtime["tui/runtime.rs<br/>starts an agent thread"]
    Core["browser-use-core<br/>provider turns, tools, events"]
    Providers["browser-use-providers<br/>provider-neutral ModelProvider"]
  end

  subgraph Browser["Browser driver: remote-control box"]
    BrowserCrate["browser-use-browser<br/>CDP session + browser commands"]
    PythonBridge["browser_script bridge<br/>fresh Python page work"]
  end

  subgraph Memory["Store + protocol: memory"]
    Store["browser-use-store<br/>SQLite state.db + artifacts"]
    Protocol["browser-use-protocol<br/>events -> WorkbenchState"]
  end

  subgraph Proof["Proof loop"]
    Golden["golden-events fixtures"]
    Verify["verify-terminal-ui.sh + tmux smoke"]
  end

  Main --> Runtime --> Core
  Core --> Providers
  Core --> BrowserCrate
  BrowserCrate --> PythonBridge
  Core --> Store
  BrowserCrate --> Store
  Store --> Protocol --> Render
  Store --> Golden
  Verify --> Main
```

The arrows show ownership of movement, not a strict call stack for every line. The TUI starts and displays work. The core loop performs model/tool turns. The browser crate owns browser control. The store records the facts. The protocol turns facts into display state. The tests check that the display and saved events still mean what maintainers think they mean.

Sources: [crates/browser-use-tui/src/main.rs:261-322](), [crates/browser-use-tui/src/runtime.rs:12-60](), [crates/browser-use-core/src/lib.rs:699-843](), [crates/browser-use-browser/src/lib.rs:1-7](), [crates/browser-use-store/src/lib.rs:321-378](), [crates/browser-use-protocol/src/lib.rs:1013-1076](), [scripts/verify-terminal-ui.sh:11-43]()

## The TUI Is The Steering Wheel

The terminal UI owns the human-facing surfaces: main, setup, account, API key, telemetry, model, browser, history, and developer. It also owns commands like starting a task, sending a follow-up, reconnecting the browser, changing model, changing browser, and saving settings. That is why “steering wheel” is a good name: it does not do every job itself, but it decides what the user is trying to do next.

The TUI keeps a local cache of sessions and events. It hydrates from the store, refreshes sessions and events after notifications, and projects that stored history into a `WorkbenchState`. Rendering then reads `WorkbenchState` and chooses what to draw: setup, ready, running, result, failed, or cancelled.

Sources: [crates/browser-use-tui/src/main.rs:116-198](), [crates/browser-use-tui/src/main.rs:261-339](), [crates/browser-use-tui/src/main.rs:341-420](), [crates/browser-use-tui/src/main.rs:445-557](), [crates/browser-use-tui/src/render.rs:93-129](), [crates/browser-use-tui/src/render.rs:151-213]()

### What The TUI Does Not Own

The TUI does not directly implement provider calls or browser automation. When it needs work done, it starts an agent thread. That thread opens the store, prepares browser mode and provider config, then calls the core runtime.

Sources: [crates/browser-use-tui/src/runtime.rs:12-50](), [crates/browser-use-core/src/lib.rs:301-338]()

## The Agent Loop Is The Helper

The agent loop is the helper that turns a task into steps. When a new task starts, core creates or loads a session, writes `session.input`, builds provider messages, and enters a turn loop. Inside the loop it records status, model config, provider usage, model deltas, tool calls, terminal results, and failures.

A short version:

1. Start with a stored session and task text.
2. Choose a provider backend and model.
3. Ask the provider for text or tool calls.
4. Dispatch tool calls.
5. Append events after each meaningful thing.
6. Finish with `session.done`, `session.failed`, or cancellation.

```rust
// crates/browser-use-core/src/lib.rs
store.append_event(
    &session.id,
    "session.input",
    serde_json::json!({ "text": task_text }),
)?;
```

The important design detail is that the helper reports its work by writing events. The UI can then recover the story from the store instead of trusting a fragile in-memory screen state.

Sources: [crates/browser-use-core/src/lib.rs:252-269](), [crates/browser-use-core/src/lib.rs:779-843](), [crates/browser-use-core/src/lib.rs:920-1003](), [crates/browser-use-core/src/lib.rs:1035-1088]()

### Provider Neutrality And BYOK

Provider choice is deliberately separated from the loop. `ProviderBackend` includes Codex, OpenAI, Anthropic, OpenRouter, Fake, and None. The provider crate defines a `ModelProvider` trait with `provider_name`, `model_name`, and turn methods. Concrete providers read keys or base URLs from settings or environment variables, which keeps the architecture friendly to BYOK and provider substitution.

This does not mean every provider has identical behavior. It means the core loop talks to a trait and records provider/model metadata, while provider-specific authentication and request formats stay behind provider implementations.

Sources: [crates/browser-use-core/src/lib.rs:56-90](), [crates/browser-use-core/src/lib.rs:301-348](), [crates/browser-use-core/src/lib.rs:426-484](), [crates/browser-use-providers/src/lib.rs:18-45](), [crates/browser-use-providers/src/lib.rs:119-154](), [crates/browser-use-providers/src/lib.rs:205-240](), [crates/browser-use-providers/src/lib.rs:348-408](), [crates/browser-use-tui/src/settings.rs:4-12](), [crates/browser-use-tui/src/settings.rs:60-79]()

## The Browser Driver Is The Remote-Control Box

The browser crate says its split plainly: `browser` controls connection, lifecycle, and debug state; `browser_script` runs fresh Python through the Rust-held CDP connection. That is the remote-control box: it knows how to connect to local Chrome, start managed Chromium, connect to remote CDP, start remote cloud sessions, recover, report status, and expose runtime ownership/log commands.

The browser session keeps mode, owner, endpoint, CDP connection, target/session IDs, managed-browser state, remote browser ID, live URL, profile, errors, and logs. Browser commands then emit browser events when the visible browser state changes.

Sources: [crates/browser-use-browser/src/lib.rs:1-7](), [crates/browser-use-browser/src/lib.rs:56-75](), [crates/browser-use-browser/src/lib.rs:146-188](), [crates/browser-use-browser/src/lib.rs:194-218](), [crates/browser-use-browser/src/lib.rs:412-560](), [crates/browser-use-browser/src/lib.rs:572-617]()

### Browser Scripts Are Isolated Work Bursts

When page interaction needs Python, `run_browser_script` creates an artifact directory, starts a local bridge, builds a prelude, spawns `python3`, waits with a timeout, parses a marked JSON result, and returns text, data, images, artifacts, and browser events. That keeps browser page work as a contained burst while Rust keeps the browser session registry.

Sources: [crates/browser-use-browser/src/lib.rs:220-344](), [crates/browser-use-browser/src/lib.rs:393-410]()

## The Store Is How The Repo Remembers

The store is the durable notebook. Opening a store resolves the state directory, creates the state and artifact directories, opens `state.db`, enables WAL mode and a busy timeout, then applies migrations. The first migration defines `sessions`, `events`, `artifacts`, `runs`, and `agent_edges`. Later migrations add agent metadata, agent messages, and app settings.

| Thing | What It Remembers |
|---|---|
| `sessions` | task identity, parent session, cwd, artifact root, status, timestamps |
| `events` | ordered facts for a session: input, browser state, tool output, result, failures |
| `artifacts` | files/images linked to sessions and event sequences |
| `runs` | process/run lifecycle for a session |
| `agent_edges` and agent fields | parent-child agent relationships |
| `app_settings` | local account, auth, browser, model, and setup choices |

Sources: [crates/browser-use-store/src/lib.rs:16-25](), [crates/browser-use-store/src/lib.rs:104-145](), [crates/browser-use-store/migrations/0001_initial.sql:1-49](), [crates/browser-use-store/migrations/0002_agent_session_fields.sql:1-5](), [crates/browser-use-store/migrations/0003_agent_messages.sql:1-11](), [crates/browser-use-store/migrations/0004_app_settings.sql:1-5]()

### Events Are The Main Memory Shape

`append_event` inserts an event row, updates the session timestamp, updates status when the event implies status, commits, and notifies listeners. The TUI cache listens for those notifications, pulls only new events after the last known sequence, and marks the projected state dirty.

That makes the app event-sourced in the practical sense: the screen is rebuilt from session metadata plus event history.

Sources: [crates/browser-use-store/src/lib.rs:321-378](), [crates/browser-use-store/src/lib.rs:401-435](), [crates/browser-use-store/src/lib.rs:490-517](), [crates/browser-use-store/src/lib.rs:1040-1125](), [crates/browser-use-tui/src/main.rs:362-377](), [crates/browser-use-tui/src/main.rs:445-463]()

## The Protocol Is The Translator

`browser-use-protocol` defines the shared vocabulary: session metadata, session statuses, event records, artifact metadata, tool calls, tool results, model events, browser summaries, telemetry summaries, history rows, transcript turns, and the final `WorkbenchState`.

It also translates raw events into useful UI facts. For example, `browser_summary_from_events` turns `browser.live_url`, `browser.page`, and `browser.state` events into status, URL, title, tab count, viewport, and live URL. `project_workbench` combines current session events, all session events, history, transcript, browser summary, telemetry, result, and failure into one state object for rendering.

Sources: [crates/browser-use-protocol/src/lib.rs:4-23](), [crates/browser-use-protocol/src/lib.rs:56-113](), [crates/browser-use-protocol/src/lib.rs:150-189](), [crates/browser-use-protocol/src/lib.rs:202-260](), [crates/browser-use-protocol/src/lib.rs:433-495](), [crates/browser-use-protocol/src/lib.rs:1013-1076]()

## Golden Fixtures Are Small Recorded Stories

The golden `running-browser-session` fixture is a tiny recorded story: a user asks to open and inspect a live browser, a live URL appears, the browser reports it is connected to `https://example.com/` with title, tab count, and viewport, and then a Python tool starts.

```json
{"type":"session.input","payload":{"text":"Open the live browser and inspect the page"}}
{"type":"browser.live_url","payload":{"live_url":"https://live.browser-use.com/?wss=golden"}}
{"type":"browser.state","payload":{"status":"connected","url":"https://example.com/","title":"Example Domain","tabs":2,"viewport":{"w":1280,"h":720}}}
```

The matching session metadata says the fixture session is `running`. Store tests import all checked-in golden event fixtures and assert expected IDs, so these files are not just examples; they are compatibility checks for the stored event shape.

Sources: [tests/golden-events/running-browser-session/events.jsonl:1-4](), [tests/golden-events/running-browser-session/session.json:1-8](), [crates/browser-use-store/src/lib.rs:1181-1235]()

## Terminal Tests Are The Proof System

The repository does not treat `cargo test` alone as enough for TUI work. The docs say terminal UI changes cross a real terminal boundary, so useful tests must start the app in a terminal, send real keys, capture visible output, and assert both presence and absence: expected UI text should appear, duplicate chrome and raw escape sequences should not.

The full verification script runs formatting, Rust tests, Python tests, deterministic TUI dumps for several states and overlays, and then the real terminal smoke test. The smoke test uses tmux, sends keys, captures panes, writes artifacts under `/tmp/but-design-loop`, and checks for problems like duplicated panels, broken bracketed paste, stale redraws, and ANSI escapes in plain output.

Sources: [docs/terminal-ui-testing.md:1-33](), [scripts/verify-terminal-ui.sh:11-43](), [scripts/tui-terminal-smoke.py:1-7](), [scripts/tui-terminal-smoke.py:80-109](), [scripts/tui-terminal-smoke.py:112-188](), [scripts/tui-terminal-smoke.py:255-257]()

## How To Use This Map When Debugging

| Symptom | First Place To Look | Why |
|---|---|---|
| The screen is wrong but events look right | `crates/browser-use-protocol` and `crates/browser-use-tui/src/render.rs` | Projection or rendering may be translating correctly stored events incorrectly. |
| A task does not finish or records the wrong result | `crates/browser-use-core/src/lib.rs` | The agent loop owns provider turns, tool dispatch, `session.done`, and failure events. |
| Browser status, live URL, or page metadata is wrong | `crates/browser-use-browser/src/lib.rs` | Browser command dispatch and browser event emission live there. |
| History, resume, or session status is wrong | `crates/browser-use-store/src/lib.rs` and migrations | Store inserts events, derives session status, and lists events/sessions. |
| A TUI fix looks fine in dumps but feels broken in a terminal | `scripts/verify-terminal-ui.sh` and `scripts/tui-terminal-smoke.py` | The repo requires real-terminal verification for keyboard, scrollback, paste, redraw, and plain output behavior. |

Sources: [crates/browser-use-protocol/src/lib.rs:1013-1076](), [crates/browser-use-tui/src/render.rs:93-129](), [crates/browser-use-core/src/lib.rs:920-1040](), [crates/browser-use-browser/src/lib.rs:412-617](), [crates/browser-use-store/src/lib.rs:321-435](), [docs/terminal-ui-testing.md:21-33]()

## Closing Summary

Remember the map: the TUI steers, the agent loop helps, the browser driver controls the browser, the store remembers, and the tests prove that the remembered story still renders and behaves correctly in a real terminal. The central design habit is to turn work into events, then rebuild user-facing state from those events through the protocol layer. Sources: [crates/browser-use-store/src/lib.rs:321-378](), [crates/browser-use-protocol/src/lib.rs:1013-1076](), [scripts/verify-terminal-ui.sh:11-43]()

---
