# Browser Automation vs Terminal Control

> Contrasts tmux's terminal-only control and capture model with cmux's browser pane automation, WebKit-backed panels, screenshots, profiles, and socket-addressable browser actions for local dev workflows.

- Repository: tmux/tmux-with-manaflow-ai-cmux
- GitHub: https://github.com/tmux/tmux
- Human wiki: https://grok-wiki.com/public/wiki/tmux-tmux-with-manaflow-ai-cmux-62db34dfaddc
- Complete Markdown: https://grok-wiki.com/public/wiki/tmux-tmux-with-manaflow-ai-cmux-62db34dfaddc/llms-full.txt

## Source Files

- `tmux-tmux:control.c`
- `tmux-tmux:cmd-capture-pane.c`
- `tmux-tmux:input-keys.c`
- `manaflow-ai-cmux:Sources/Panels/BrowserPanel.swift`
- `manaflow-ai-cmux:Sources/Panels/BrowserAutomation.swift`
- `manaflow-ai-cmux:Sources/Panels/CmuxWebView.swift`
- `manaflow-ai-cmux:Sources/Panels/BrowserScreenshotPipeline.swift`

---

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:
- [tmux-tmux:control.c](tmux-tmux/control.c)
- [tmux-tmux:cmd-capture-pane.c](tmux-tmux/cmd-capture-pane.c)
- [tmux-tmux:input-keys.c](tmux-tmux/input-keys.c)
- [tmux-tmux:cmd-send-keys.c](tmux-tmux/cmd-send-keys.c)
- [manaflow-ai-cmux:Sources/Panels/BrowserPanel.swift](manaflow-ai-cmux/Sources/Panels/BrowserPanel.swift)
- [manaflow-ai-cmux:Sources/Panels/BrowserAutomation.swift](manaflow-ai-cmux/Sources/Panels/BrowserAutomation.swift)
- [manaflow-ai-cmux:Sources/Panels/CmuxWebView.swift](manaflow-ai-cmux/Sources/Panels/CmuxWebView.swift)
- [manaflow-ai-cmux:Sources/Panels/BrowserScreenshotPipeline.swift](manaflow-ai-cmux/Sources/Panels/BrowserScreenshotPipeline.swift)
- [manaflow-ai-cmux:Sources/Panels/BrowserScreenshotSnapshotter.swift](manaflow-ai-cmux/Sources/Panels/BrowserScreenshotSnapshotter.swift)
- [manaflow-ai-cmux:Sources/TerminalController.swift](manaflow-ai-cmux/Sources/TerminalController.swift)
- [manaflow-ai-cmux:Sources/KeyboardShortcutSettings.swift](manaflow-ai-cmux/Sources/KeyboardShortcutSettings.swift)
- [manaflow-ai-cmux:cmuxTests/CMUXCLIErrorOutputRegressionTests.swift](manaflow-ai-cmux/cmuxTests/CMUXCLIErrorOutputRegressionTests.swift)
</details>


This page contrasts two local-dev control models. tmux exposes terminal panes as command-addressable PTYs: clients can send keys, capture scrollback or pending output, and consume ordered control-mode output. cmux keeps that terminal idea, but adds browser panes backed by WebKit and a socket API for navigation, DOM snapshots, clicks, form input, screenshots, downloads, profiles, and browser state.

This page uses repository code as the source of truth. The comparison is provider-neutral: neither model requires a specific LLM vendor, hosted agent, or proprietary model API.

## Core Difference

| Capability | tmux | cmux |
|---|---|---|
| Primary surface | Terminal pane / PTY | Terminal plus browser panel |
| Output model | `%output` and `%extended-output` lines, pane history capture | DOM/accessibility-like snapshots, JS eval results, screenshot PNGs, browser state |
| Input model | Encoded terminal keys and mouse escape sequences | Browser navigation, DOM selector actions, synthetic DOM events, WebKit methods |
| Automation address | tmux command queue and control-mode client | JSON socket methods plus legacy browser commands |
| Browser profile state | Not in scope | Named browser profiles, website data stores, history files, import/clear/delete automation |

Sources: [tmux-tmux:control.c:30-42](), [tmux-tmux:control.c:615-729](), [tmux-tmux:input-keys.c:398-419](), [manaflow-ai-cmux:Sources/TerminalController.swift:3521-3578](), [manaflow-ai-cmux:Sources/Panels/BrowserPanel.swift:334-390]()

## tmux: Terminal-Only Control

tmux control mode is a line protocol around terminal state. Incoming control lines are parsed into tmux commands, while outgoing data is buffered and ordered so pane output does not overtake later notifications. The code comments describe one queue per client plus pane queues for `%output` blocks, and `control_read_callback` turns newline-delimited client input into command queue work.

Sources: [tmux-tmux:control.c:30-42](), [tmux-tmux:control.c:553-580](), [tmux-tmux:control.c:771-809]()

```text
control client
  -> newline command
  -> cmd_parse_and_append(...)
  -> pane/session/window command effects

window pane output
  -> window_pane_get_new_data(...)
  -> %output / %extended-output
  -> control client
```
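
The outgoing half of that flow can be consumed with a few lines of client code. The sketch below parses a single `%output` line and undoes the three-digit octal escaping that tmux applies to control characters and backslashes in control mode. The function names are hypothetical, and `%extended-output` is deliberately left unhandled.

```python
import re

# tmux control mode writes control characters and backslash as \ooo (octal).
_OCTAL_ESCAPE = re.compile(rb"\\([0-3][0-7]{2})")


def unescape_control_value(value: bytes) -> bytes:
    """Turn \\ooo octal escapes back into the raw bytes the pane produced."""
    return _OCTAL_ESCAPE.sub(lambda m: bytes([int(m.group(1), 8)]), value)


def parse_output_line(line: bytes):
    """Return (pane_id, data) for a %output line, or None for any other line."""
    prefix = b"%output "
    if not line.startswith(prefix):
        return None
    rest = line.rstrip(b"\r\n")[len(prefix):]
    pane_id, _, value = rest.partition(b" ")
    return pane_id.decode(), unescape_control_value(value)
```

A client would typically feed every line read from `tmux -C` through `parse_output_line` and route the `None` cases (such as `%begin` and `%end`) to separate handlers.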

`capture-pane` works from tmux's terminal screen model. It chooses the base screen, alternate screen, or mode screen, extracts grid lines, and either writes to a paste buffer or prints to the client; in control mode, printable capture output is sent with `control_write`. That is powerful for shell transcripts, compiler output, REPL logs, and terminal UIs, but it is still text/grid capture rather than DOM or pixel capture.

Sources: [tmux-tmux:cmd-capture-pane.c:39-49](), [tmux-tmux:cmd-capture-pane.c:107-151](), [tmux-tmux:cmd-capture-pane.c:213-247]()
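
From the outside, this capture path is usually reached through the `capture-pane` command line. The sketch below builds such an invocation with the documented `-p` (print to stdout), `-t` (target), `-S` (start line) and `-J` (join wrapped lines) flags. The helper names and the default values are assumptions made for illustration.

```python
import subprocess


def capture_pane_argv(target: str, history_lines: int = 0,
                      join_wrapped: bool = False) -> list[str]:
    """Build a tmux capture-pane command that prints pane text to stdout."""
    argv = ["tmux", "capture-pane", "-p", "-t", target]
    if history_lines > 0:
        # A negative start line reaches back into the pane's history.
        argv += ["-S", f"-{history_lines}"]
    if join_wrapped:
        argv.append("-J")
    return argv


def capture_pane(target: str, **options) -> str:
    """Run the command; needs a running tmux server and a valid target."""
    result = subprocess.run(capture_pane_argv(target, **options),
                            check=True, capture_output=True, text=True)
    return result.stdout
```

Whatever options are chosen, the result is text from the terminal grid, which is the boundary the paragraph above describes.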

Input follows the same terminal boundary. `send-keys` resolves key names or literal strings, injects keys into a pane, and delegates to `window_pane_key`; `input_key` then encodes characters, VT10x keys, extended keys, and mouse events into bytes written to the pane bufferevent.

Sources: [tmux-tmux:cmd-send-keys.c:74-128](), [tmux-tmux:cmd-send-keys.c:193-240](), [tmux-tmux:input-keys.c:574-713](), [tmux-tmux:input-keys.c:797-815]()
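
The two ways `send-keys` interprets its arguments, literal text versus key names, can be illustrated with a pair of small argv builders. This is a sketch only: the helper names are hypothetical, and the `--` separator is added on the assumption that the literal text might begin with a dash.

```python
def send_text_argv(target: str, text: str) -> list[str]:
    """Send text verbatim: -l disables key-name lookup for the argument."""
    return ["tmux", "send-keys", "-t", target, "-l", "--", text]


def send_key_argv(target: str, key_name: str) -> list[str]:
    """Send a named key such as Enter, Escape, or C-c."""
    return ["tmux", "send-keys", "-t", target, key_name]


def run_command_argvs(target: str, command: str) -> list[list[str]]:
    """Type a shell command literally, then press Enter as a named key."""
    return [send_text_argv(target, command), send_key_argv(target, "Enter")]
```

Splitting the literal text from the trailing `Enter` avoids having a command string that happens to match a key name be interpreted as a key.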

## cmux: Browser Pane Automation

cmux's browser surface is a real `WKWebView`. `BrowserPanel` configures WebKit with persistent website data, JavaScript enabled, user scripts, developer extras, navigation delegates, download state, history state, and a Safari user agent. `CmuxWebView` extends `WKWebView` behavior for focus routing, JavaScript checks, mouse back/forward buttons, and drag routing inside the pane system.

Sources: [manaflow-ai-cmux:Sources/Panels/BrowserPanel.swift:3368-3415](), [manaflow-ai-cmux:Sources/Panels/BrowserPanel.swift:2758-2788](), [manaflow-ai-cmux:Sources/Panels/BrowserPanel.swift:3638-3655](), [manaflow-ai-cmux:Sources/Panels/CmuxWebView.swift:1-16](), [manaflow-ai-cmux:Sources/Panels/CmuxWebView.swift:720-771]()

The socket API makes browser panels addressable by workspace, pane, or surface. `TerminalController` dispatches JSON methods such as `browser.navigate`, `browser.snapshot`, `browser.eval`, `browser.click`, `browser.type`, `browser.screenshot`, and many `browser.find.*` / `browser.get.*` calls. `v2BrowserWithPanel` resolves the target browser panel before each operation, so actions are scoped to the selected or explicitly requested surface.

Sources: [manaflow-ai-cmux:Sources/TerminalController.swift:10923-10980](), [manaflow-ai-cmux:Sources/TerminalController.swift:3521-3578](), [manaflow-ai-cmux:Sources/TerminalController.swift:3580-3628](), [manaflow-ai-cmux:Sources/TerminalController.swift:3888-3948]()
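
To make the addressing idea concrete, here is a minimal sketch of how a client might serialize one of these method calls. Only the method names come from the source. The envelope fields (`id`, `method`, `params`), the parameter names, and the newline framing are assumptions made for illustration and may not match the actual wire format.

```python
import itertools
import json

_request_ids = itertools.count(1)


def browser_request(method: str, **params) -> str:
    """Serialize a hypothetical JSON request for a browser.* socket method."""
    if not method.startswith("browser."):
        raise ValueError(f"not a browser method: {method!r}")
    payload = {
        "id": next(_request_ids),  # assumed correlation id
        "method": method,          # e.g. browser.navigate, browser.snapshot
        "params": params,          # assumed parameter container
    }
    # One JSON object per line (assumed framing).
    return json.dumps(payload, separators=(",", ":")) + "\n"
```

A call such as `browser_request("browser.navigate", url="http://localhost:3000")` yields one line of JSON that an agent could write to the socket. The parameter that selects a workspace, pane, or surface is omitted because its name is not given here.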

### DOM-Level Actions

For browser automation, cmux does not just send keystrokes to a terminal process. It executes JavaScript inside the page. `browser.snapshot` builds a structured page payload with title, URL, ready state, text, HTML, and element entries with selectors, roles, names, and refs. Selector actions retry lookup, run a generated script, and can append a post-action snapshot.

Sources: [manaflow-ai-cmux:Sources/TerminalController.swift:11912-11950](), [manaflow-ai-cmux:Sources/TerminalController.swift:12015-12215](), [manaflow-ai-cmux:Sources/TerminalController.swift:12221-12250]()

Clicks and text entry are selector-based. `browser.click` scrolls an element into view and calls `el.click()` or dispatches a mouse event. `browser.type` focuses the element, appends text, updates React-compatible values, and dispatches `input` and `change`. This is a browser automation model, not a terminal escape-sequence model.

Sources: [manaflow-ai-cmux:Sources/TerminalController.swift:12391-12450](), [manaflow-ai-cmux:Sources/TerminalController.swift:12473-12515](), [manaflow-ai-cmux:Sources/TerminalController.swift:12525-12556]()
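
The "generated script" idea behind `browser.type` can be sketched as a function that emits page-side JavaScript for a selector and a text value. This is not the script cmux generates. It is an illustration of the same steps (focus, append, framework-friendly value update, `input` and `change` events), and the function name and the result shape are assumptions.

```python
import json


def build_type_script(selector: str, text: str) -> str:
    """Return JavaScript that appends text to the element matching selector."""
    sel = json.dumps(selector)  # JSON string literals are valid JS literals
    txt = json.dumps(text)
    return f"""(() => {{
  const el = document.querySelector({sel});
  if (!el) return {{ ok: false, error: "not_found" }};
  el.focus();
  const next = (el.value ?? "") + {txt};
  // Use the prototype's value setter so frameworks that wrap the instance
  // property (React, for example) still observe the change.
  const desc = Object.getOwnPropertyDescriptor(Object.getPrototypeOf(el), "value");
  if (desc && desc.set) desc.set.call(el, next); else el.value = next;
  el.dispatchEvent(new Event("input", {{ bubbles: true }}));
  el.dispatchEvent(new Event("change", {{ bubbles: true }}));
  return {{ ok: true }};
}})()"""
```

Encoding the selector and text with `json.dumps` keeps quotes and backslashes in user input from breaking out of the generated string literals.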

## Screenshots and Visual Feedback

tmux capture is terminal-grid text. cmux has both visible browser snapshots and full-page screenshot plumbing. The socket screenshot path asks a `BrowserPanel` for a snapshot and returns base64 PNG data, with a best-effort temporary file path. The screenshot pipeline supports full-page and selected-section modes, crops selected regions, and writes PNG/TIFF data to the pasteboard.

Sources: [manaflow-ai-cmux:Sources/TerminalController.swift:12735-12756](), [manaflow-ai-cmux:Sources/TerminalController.swift:12758-12775](), [manaflow-ai-cmux:Sources/Panels/BrowserScreenshotPipeline.swift:45-65](), [manaflow-ai-cmux:Sources/Panels/BrowserScreenshotPipeline.swift:71-113](), [manaflow-ai-cmux:Sources/Panels/BrowserScreenshotPipeline.swift:145-192]()
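
On the client side, a base64 PNG payload can be checked before it is saved or passed to another tool. The sketch below decodes the data, verifies the PNG signature, and reads the width and height from the IHDR chunk. How the base64 string is extracted from the socket response is left out because the response field names are not specified here.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_png_payload(b64_data: str):
    """Decode base64 PNG data and return (raw_bytes, width, height)."""
    data = base64.b64decode(b64_data, validate=True)
    # Signature (8 bytes) + chunk length (4) + "IHDR" (4) + width/height (8).
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("payload is not a PNG image")
    width, height = struct.unpack(">II", data[16:24])
    return data, width, height
```

Reading the dimensions this way needs no image library, which keeps an agent's sanity check cheap before it writes the bytes to disk.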

Full-page capture is bounded and WebKit-aware. `BrowserScreenshotWebViewSnapshotter` first tries a single full-content `WKSnapshotConfiguration`, validates the result against content dimensions, and falls back to scrolling and stitching viewport tiles. It also caps full-page size at `100_000_000` pixels.

Sources: [manaflow-ai-cmux:Sources/Panels/BrowserScreenshotSnapshotter.swift:43-78](), [manaflow-ai-cmux:Sources/Panels/BrowserScreenshotSnapshotter.swift:80-109](), [manaflow-ai-cmux:Sources/Panels/BrowserScreenshotSnapshotter.swift:120-156]()
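
The scroll-and-stitch fallback and the size cap can be sketched as a tile planner. This is a simplified illustration rather than the snapshotter's algorithm: it assumes integer content and viewport sizes, ignores display scale, and applies the cap to width times height. Only the `100_000_000` figure comes from the source.

```python
MAX_FULL_PAGE_PIXELS = 100_000_000  # cap named in the text above


def plan_tiles(content_w: int, content_h: int,
               viewport_w: int, viewport_h: int):
    """Return (x, y, width, height) viewport tiles that cover the content."""
    if min(content_w, content_h, viewport_w, viewport_h) <= 0:
        raise ValueError("sizes must be positive")
    if content_w * content_h > MAX_FULL_PAGE_PIXELS:
        raise ValueError("page exceeds the full-page pixel cap")
    tiles = []
    for y in range(0, content_h, viewport_h):
        for x in range(0, content_w, viewport_w):
            # Edge tiles are clipped to the remaining content.
            tiles.append((x, y,
                          min(viewport_w, content_w - x),
                          min(viewport_h, content_h - y)))
    return tiles
```

For a 1000 by 2500 page and a 1000 by 1000 viewport this yields three tiles, the last one 500 units tall.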

## Profiles, Cookies, and Local Dev Sessions

tmux has no browser profile concept because it does not host a browser engine. cmux models browser profiles directly: each profile has an id, display name, built-in default flag, slug, website data store, and history store. Automation can list, create, rename, clear, and delete profiles, and profile deletion is blocked while live browser panels are using that profile.

Sources: [manaflow-ai-cmux:Sources/Panels/BrowserPanel.swift:334-390](), [manaflow-ai-cmux:Sources/Panels/BrowserPanel.swift:530-570](), [manaflow-ai-cmux:Sources/Panels/BrowserAutomation.swift:183-223](), [manaflow-ai-cmux:Sources/Panels/BrowserAutomation.swift:233-288](), [manaflow-ai-cmux:Sources/Panels/BrowserAutomation.swift:306-356]()
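
The deletion guard is the part of this model most worth illustrating. The sketch below is a toy registry, not cmux's Swift types: the class names, the slug rule, and the use of a per-profile panel counter are assumptions, and the data and history stores are omitted.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Profile:
    id: str
    display_name: str
    slug: str
    is_builtin_default: bool = False


class ProfileRegistry:
    """Toy registry that refuses to delete a profile with live panels."""

    def __init__(self) -> None:
        self._profiles: dict[str, Profile] = {}
        self._live_panels: dict[str, int] = {}

    def create(self, display_name: str) -> Profile:
        slug = "-".join(display_name.lower().split())  # assumed slug rule
        profile = Profile(id=str(uuid.uuid4()), display_name=display_name, slug=slug)
        self._profiles[profile.id] = profile
        return profile

    def attach_panel(self, profile_id: str) -> None:
        self._live_panels[profile_id] = self._live_panels.get(profile_id, 0) + 1

    def detach_panel(self, profile_id: str) -> None:
        self._live_panels[profile_id] = max(0, self._live_panels.get(profile_id, 0) - 1)

    def delete(self, profile_id: str) -> None:
        if self._live_panels.get(profile_id, 0) > 0:
            raise RuntimeError("profile is in use by a live browser panel")
        del self._profiles[profile_id]

    def list(self) -> list[Profile]:
        return list(self._profiles.values())
```

Blocking deletion while panels are attached prevents a live web view from losing its cookie and history storage mid-session.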

Browser import automation can select installed browser profiles, default to a detected default profile, merge or map source profiles into cmux profiles, and create a destination profile when explicitly allowed. This is useful for local dev workflows where a browser pane should already carry cookies or history from an existing browser, without tying the workflow to any model provider.

Sources: [manaflow-ai-cmux:Sources/Panels/BrowserAutomation.swift:378-421](), [manaflow-ai-cmux:Sources/Panels/BrowserAutomation.swift:442-498]()

## Socket Addressability

cmux has two browser command layers in `TerminalController`: JSON v2 socket methods and legacy string commands. The v2 dispatcher covers browser automation broadly; the legacy commands include `openBrowser`, `navigateBrowser`, `browserBack`, `browserForward`, and `browserReload`. Tests also verify browser-related CLI socket behavior through `CMUX_SOCKET_PATH`.

Sources: [manaflow-ai-cmux:Sources/TerminalController.swift:2248-2372](), [manaflow-ai-cmux:Sources/TerminalController.swift:18288-18405](), [manaflow-ai-cmux:cmuxTests/CMUXCLIErrorOutputRegressionTests.swift:783-842]()
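
A client that honors `CMUX_SOCKET_PATH` might look like the sketch below. It assumes a Unix domain stream socket and a newline-terminated request and reply, neither of which is confirmed here. No default path is guessed: the helper fails if the variable is unset.

```python
import os
import socket


def resolve_socket_path(env=None) -> str:
    """Read the socket path from CMUX_SOCKET_PATH, failing loudly if unset."""
    env = os.environ if env is None else env
    path = env.get("CMUX_SOCKET_PATH", "")
    if not path:
        raise RuntimeError("CMUX_SOCKET_PATH is not set")
    return path


def send_line(path: str, line: str, timeout: float = 5.0) -> str:
    """Send one line and read until a newline or EOF (assumed framing)."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(path)
        sock.sendall(line.encode())
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
            if chunk.endswith(b"\n"):
                break
    return b"".join(chunks).decode()
```

Passing an explicit `env` mapping to `resolve_socket_path` makes the lookup easy to exercise in tests without touching the real environment.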

The socket model is portable across agents because the integration boundary is local JSON or CLI command execution. A BYOC/BYOK architecture can let any local or remote coding agent call these primitives, as long as it can reach the local socket or CLI; the browser surface does not require a specific hosted model or vendor connector.

Sources: [manaflow-ai-cmux:Sources/TerminalController.swift:2280-2297](), [manaflow-ai-cmux:Sources/TerminalController.swift:2300-2308](), [manaflow-ai-cmux:Sources/TerminalController.swift:3518-3578]()

## Important Limits

cmux's browser automation is WebKit-backed and is not built on the Chrome DevTools Protocol (CDP). The code explicitly reports CDP-style screencast and raw mouse/keyboard/touch input as unsupported, and points callers toward higher-level browser actions such as `browser.click`, `browser.hover`, `browser.scroll`, `browser.press`, `browser.keydown`, and `browser.keyup`.

Sources: [manaflow-ai-cmux:Sources/TerminalController.swift:14890-14903]()

tmux's corresponding limit is that it controls terminal applications through bytes and terminal state. That makes it stable and broadly compatible for shell workflows, but it cannot inspect page roles, fill DOM inputs, preserve browser cookies, or capture browser pixels unless an external browser/tool is layered on top.

Sources: [tmux-tmux:input-keys.c:414-419](), [tmux-tmux:input-keys.c:574-713](), [tmux-tmux:cmd-capture-pane.c:107-151]()

## Portable Design Takeaway

tmux is the better primitive when the task is terminal-native: run commands, drive TUIs, capture scrollback, and stream ordered pane output. cmux is the broader local-dev surface when the task spans terminal plus browser: open a dev server beside the shell, preserve authenticated browser state, inspect a page, click through UI, fill forms, take screenshots, and wait for downloads.

The portable pattern is not "choose a model provider"; it is "keep the control plane local and explicit." tmux demonstrates a durable terminal protocol, while cmux extends that idea with browser panels and socket-addressable browser actions. A Grok-Wiki or agent workflow can stay BYOC/BYOK-friendly by treating skill packs, repository files, and catalog entries as portable sources, then issuing local CLI/socket actions without assuming any specific LLM vendor or hosted automation backend.

Sources: [tmux-tmux:control.c:553-580](), [tmux-tmux:control.c:615-729](), [manaflow-ai-cmux:Sources/TerminalController.swift:10923-10980](), [manaflow-ai-cmux:Sources/TerminalController.swift:12735-12775]()
