# CLI & Socket Automation API

> The scriptable control plane for workspaces, panes, surfaces, browser actions, notifications, windows, and VM commands, including tagged socket resolution and v2 method dispatch.

- Repository: manaflow-ai/cmux
- GitHub: https://github.com/manaflow-ai/cmux
- Human wiki: https://grok-wiki.com/public/wiki/manaflow-ai-cmux-5a511656cb1a
- Complete Markdown: https://grok-wiki.com/public/wiki/manaflow-ai-cmux-5a511656cb1a/llms-full.txt

## Source Files

- `CLI/cmux.swift`
- `CLI/CLISocketPathResolver.swift`
- `CLI/SocketOperationTelemetry.swift`
- `Sources/TerminalController.swift`
- `Sources/TerminalControllerV2ParamParsingSupport.swift`
- `cmuxTests/TerminalControllerSocketSecurityTests.swift`
- `cmuxUITests/AutomationSocketUITests.swift`
- `tests_v2/test_tab_dragging.py`

---

<details>
<summary>Relevant source files</summary>

The following files were used as context for generating this wiki page:

- [CLI/cmux.swift](CLI/cmux.swift)
- [CLI/CLISocketPathResolver.swift](CLI/CLISocketPathResolver.swift)
- [CLI/SocketOperationTelemetry.swift](CLI/SocketOperationTelemetry.swift)
- [Sources/TerminalController.swift](Sources/TerminalController.swift)
- [Sources/TerminalControllerV2ParamParsingSupport.swift](Sources/TerminalControllerV2ParamParsingSupport.swift)
- [Sources/SocketControlSettings.swift](Sources/SocketControlSettings.swift)
- [cmuxTests/TerminalControllerSocketSecurityTests.swift](cmuxTests/TerminalControllerSocketSecurityTests.swift)
- [cmuxUITests/AutomationSocketUITests.swift](cmuxUITests/AutomationSocketUITests.swift)
- [tests_v2/test_tab_dragging.py](tests_v2/test_tab_dragging.py)
</details>

cmux exposes a local, scriptable control plane through the `cmux` CLI and the app’s Unix-domain control socket. This API is the bridge between shell automation and app state: it can create and select workspaces, split panes, move surfaces, drive browser panels, send terminal input, post notifications, inspect windows, and route Cloud VM commands.

This page treats the API as an integration surface, not as a UI feature. The important design choice is that automation is local and method-based: callers use handles, JSON params, and socket paths rather than any model-provider-specific contract. That keeps the architecture BYOC/BYOK friendly: agents, scripts, skill packs, and repository tools can call the same local API regardless of which model or hosted service produced the command.

Sources: [CLI/cmux.swift:29452-29644](), [Sources/TerminalController.swift:2903-2928]()

## Mental Model

The CLI is a typed convenience layer over a newline-delimited socket protocol. Most modern commands call v2 JSON methods through `SocketClient.sendV2`; older and compatibility commands still dispatch through v1 text commands. The app-side owner is `TerminalController`, which accepts local socket clients, applies access policy, parses each line, and dispatches either v1 command names or v2 method names.

```text
Shell / agent / test
        |
        |  cmux command or raw rpc
        v
CLI/cmux.swift
        |
        |  AF_UNIX line protocol
        v
Sources/TerminalController.swift
        |
        +--> v1 text commands: ping, send, new_split, notify, browser_back, ...
        |
        +--> v2 JSON methods: workspace.*, pane.*, surface.*, browser.*, notification.*, vm.*
```

The API is useful to productize because it gives external tools a stable “remote control” for a live desktop workspace without tying those tools to SwiftUI internals or to a particular AI runtime.

Sources: [CLI/cmux.swift:2168-2205](), [Sources/TerminalController.swift:2681-2754](), [Sources/TerminalController.swift:3278-3324]()

## Socket Resolution And Tagged Debug Apps

Socket selection is intentionally layered:

| Priority | Source | Notes |
| --- | --- | --- |
| 1 | `--socket <path>` | Explicit override from CLI args. |
| 2 | `CMUX_SOCKET_PATH` / legacy `CMUX_SOCKET` | The CLI refuses to choose if both are set to different paths. |
| 3 | Bundle-aware default | Stable app defaults to Application Support; debug/nightly/staging variants use marker logic. |
| 4 | Auto-discovery | Implicit default resolution can prefer live tagged debug sockets. |
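
The first two priority levels in the table can be sketched as a small pure function. This is a minimal illustration, not the resolver in `CLI/cmux.swift`: the function name `pick_socket_source` and its return labels are hypothetical, and the bundle-aware default and auto-discovery steps are left to later code.

```python
from typing import Mapping, Optional, Tuple


def pick_socket_source(
    cli_socket: Optional[str], env: Mapping[str, str]
) -> Tuple[str, Optional[str]]:
    """Return (source_label, path); path is None when the implicit default applies."""
    # Priority 1: an explicit --socket argument always wins.
    if cli_socket:
        return ("flag", cli_socket)

    # Priority 2: CMUX_SOCKET_PATH or the legacy CMUX_SOCKET variable.
    current = env.get("CMUX_SOCKET_PATH") or None
    legacy = env.get("CMUX_SOCKET") or None
    if current and legacy and current != legacy:
        # Mirrors the documented refusal to choose between conflicting values.
        raise ValueError("CMUX_SOCKET_PATH and CMUX_SOCKET point at different paths")
    if current or legacy:
        return ("env", current or legacy)

    # Priorities 3 and 4 (bundle-aware default, auto-discovery) are resolved later.
    return ("implicit-default", None)
```

Returning a label alongside the path lets the caller tell whether auto-discovery is allowed, since discovery only applies to the implicit default.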

`CLISocketPathResolver.resolve` auto-discovers only when the socket source is the implicit default. It prefers candidates that accept connections, then existing socket files, then the first candidate. For debug variants, it can scan for recent `cmux-debug-*.sock` files and uses `CMUX_TAG` to decide when tagged discovery is appropriate.

This is why tagged builds can be automated safely: a debug app can have a separate socket and password scope, while the CLI still falls back to stable paths for ordinary users.
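
The candidate preference described above (live sockets first, then existing files, then the first candidate) can be sketched as follows. The helper names `accepts_connections` and `choose_candidate` are hypothetical, and the probe timeout is an arbitrary assumption; the real logic lives in `CLISocketPathResolver`.

```python
import os
import socket
from typing import Callable, Optional, Sequence


def accepts_connections(path: str, timeout: float = 0.2) -> bool:
    """True if a Unix-domain stream socket at `path` accepts a connection."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.settimeout(timeout)
    try:
        client.connect(path)
        return True
    except OSError:
        # Covers missing files, refused connections, and timeouts.
        return False
    finally:
        client.close()


def choose_candidate(
    candidates: Sequence[str],
    is_live: Callable[[str], bool] = accepts_connections,
    exists: Callable[[str], bool] = os.path.exists,
) -> Optional[str]:
    """Pick a socket path using the three-tier preference order."""
    # Tier 1: a candidate that currently accepts connections.
    for path in candidates:
        if is_live(path):
            return path
    # Tier 2: a candidate whose socket file exists on disk.
    for path in candidates:
        if exists(path):
            return path
    # Tier 3: fall back to the first candidate, if there is one.
    return candidates[0] if candidates else None
```

Passing the probes as parameters keeps the ordering testable without real sockets.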

Sources: [CLI/cmux.swift:28-40](), [CLI/cmux.swift:2669-2826](), [CLI/CLISocketPathResolver.swift:133-190](), [CLI/CLISocketPathResolver.swift:216-229](), [CLI/CLISocketPathResolver.swift:249-329]()

## Access Modes And Authentication

Socket control is guarded by `SocketControlMode`:

| Mode | File mode | Behavior |
| --- | ---: | --- |
| `off` | `0600` | Socket control disabled by setting. |
| `cmuxOnly` | `0600` | Only processes descended from cmux terminals may connect. |
| `automation` | `0600` | External local automation from the same macOS user. |
| `password` | `0600` | Requires socket authentication. |
| `allowAll` | `0666` | Any local process/user can connect; marked unsafe in code comments. |

On the server, `TerminalController` applies `accessMode.socketFilePermissions` with `chmod`. In `cmuxOnly`, it verifies peer process ancestry when possible, with a same-UID fallback for peer PID races. In password mode, v1 clients authenticate with `auth <password>` and v2 clients authenticate with `auth.login`.
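
As a rough Python analogue of the mode table and the `chmod` step, the sketch below maps each mode to its file mode and applies it to a path. The enum mirrors the names in the table; the helper names are hypothetical and the Swift `socketFilePermissions` property may be structured differently.

```python
import os
import stat
from enum import Enum


class SocketControlMode(Enum):
    OFF = "off"
    CMUX_ONLY = "cmuxOnly"
    AUTOMATION = "automation"
    PASSWORD = "password"
    ALLOW_ALL = "allowAll"


def socket_file_permissions(mode: SocketControlMode) -> int:
    """Only allowAll widens the socket beyond the owning user."""
    return 0o666 if mode is SocketControlMode.ALLOW_ALL else 0o600


def apply_permissions(path: str, mode: SocketControlMode) -> None:
    """Apply the mode's file permissions to an existing socket path."""
    os.chmod(path, socket_file_permissions(mode))
```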

On the CLI side, password resolution is ordered as explicit `--password`, `CMUX_SOCKET_PASSWORD`, a local password file, then keychain lookup. The keychain service can be scoped by `CMUX_TAG` or by the socket filename, which matters for tagged debug apps.
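
The resolution order can be illustrated with a short function that takes the file and keychain lookups as callables. This is a sketch under assumptions: `resolve_password` and its parameters are hypothetical names, and the real CLI may treat empty values differently.

```python
from typing import Callable, Mapping, Optional


def resolve_password(
    cli_password: Optional[str],
    env: Mapping[str, str],
    read_password_file: Callable[[], Optional[str]],
    read_keychain: Callable[[], Optional[str]],
) -> Optional[str]:
    """Return the first password found, in the documented order."""
    # 1. Explicit --password flag.
    if cli_password:
        return cli_password
    # 2. CMUX_SOCKET_PASSWORD environment variable.
    env_password = env.get("CMUX_SOCKET_PASSWORD")
    if env_password:
        return env_password
    # 3. Local password file, then 4. keychain. The keychain callable is
    # only invoked when the file lookup returns nothing.
    return read_password_file() or read_keychain()
```

Keeping the keychain lookup lazy means cheaper sources are consulted first and the keychain is touched only as a last resort.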

Sources: [Sources/SocketControlSettings.swift:8-61](), [Sources/TerminalController.swift:2090-2121](), [Sources/TerminalController.swift:2164-2235](), [Sources/TerminalController.swift:2681-2715](), [CLI/cmux.swift:1328-1390](), [CLI/cmux.swift:4825-4848](), [cmuxTests/TerminalControllerSocketSecurityTests.swift:28-75]()

## v2 JSON Method Dispatch

The v2 protocol is newline-delimited JSON. A request is a JSON object with `method`, optional `params`, and optional `id`; the response is an object with either `ok: true` plus `result`, or `ok: false` plus `error`.

Example raw call:

```bash
cmux rpc surface.report_tty '{"workspace_id":"...","surface_id":"...","tty_name":"ttys001"}'
```

Example request shape:

```json
{"id":"request-id","method":"surface.health","params":{"workspace_id":"workspace:1"}}
```

`SocketClient.sendV2` encodes the object, sends it as one line, parses the JSON response, returns `result` on success, and formats v2 errors into CLI errors. `cmux rpc <method> [json-params]` is the escape hatch for methods not wrapped by a first-class CLI command.
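
For callers that want to speak the line protocol directly, the request and response shapes above can be sketched in Python. This is not a port of `SocketClient.sendV2`: the names `encode_request`, `decode_response`, `send_v2`, and `V2Error` are hypothetical, the error payload is treated as opaque because its fields are not described here, and authentication is omitted.

```python
import json
import socket
import uuid
from typing import Any, Dict, Optional


class V2Error(Exception):
    """Raised when a response envelope does not carry ok: true."""


def encode_request(
    method: str,
    params: Optional[Dict[str, Any]] = None,
    request_id: Optional[str] = None,
) -> bytes:
    """Encode one v2 request as a single newline-terminated JSON line."""
    request: Dict[str, Any] = {"id": request_id or str(uuid.uuid4()), "method": method}
    if params is not None:
        request["params"] = params
    # json.dumps escapes newlines inside strings, so the line stays intact.
    return json.dumps(request).encode("utf-8") + b"\n"


def decode_response(line: bytes) -> Any:
    """Return `result` on success; raise V2Error with the error payload otherwise."""
    envelope = json.loads(line.decode("utf-8"))
    if envelope.get("ok") is True:
        return envelope.get("result")
    raise V2Error(envelope.get("error"))


def send_v2(
    socket_path: str,
    method: str,
    params: Optional[Dict[str, Any]] = None,
    timeout: float = 5.0,
) -> Any:
    """Send one request over a Unix-domain socket and read one response line."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.settimeout(timeout)
        client.connect(socket_path)
        client.sendall(encode_request(method, params))
        buffer = b""
        while b"\n" not in buffer:
            chunk = client.recv(4096)
            if not chunk:
                raise ConnectionError("socket closed before a full response line arrived")
            buffer += chunk
    line, _, _ = buffer.partition(b"\n")
    return decode_response(line)
```

A call such as `send_v2(path, "system.ping")` would exercise the round trip against a running app, assuming the socket's access mode admits the caller.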

The app keeps some v2 methods on a socket worker instead of the main actor. That worker group includes auth, feed, browser profile/import work, system top/memory, remote PTY operations, and all `vm.*` methods. Other methods dispatch on the main actor because they mutate UI/app model state.
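
The worker-versus-main-actor split can be pictured as a routing decision keyed on the method name. The sketch below only encodes what this page states by name (`auth.login` and every `vm.*` method); the other worker-group members are deliberately left out because their exact method names are not listed here, and the labels `socket-worker` and `main-actor` are illustrative.

```python
# Methods named on this page as handled off the main actor. The real worker
# group is larger (feed, browser profile/import, system top/memory, remote PTY).
WORKER_PREFIXES = ("vm.",)
WORKER_METHODS = frozenset({"auth.login"})


def dispatch_target(method: str) -> str:
    """Return which executor a v2 method would run on in this sketch."""
    if method in WORKER_METHODS or method.startswith(WORKER_PREFIXES):
        return "socket-worker"
    # Everything else mutates UI or app model state, so it runs on the main actor.
    return "main-actor"
```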

Sources: [CLI/cmux.swift:2168-2224](), [CLI/cmux.swift:3387-3396](), [CLI/cmux.swift:11635-11641](), [Sources/TerminalController.swift:2237-2278](), [Sources/TerminalController.swift:2280-2395](), [Sources/TerminalController.swift:3280-3319]()

## Main Method Families

The v2 dispatcher is organized by product domain:

| Family | Representative methods | What it controls |
| --- | --- | --- |
| `system.*` | `system.ping`, `system.capabilities`, `system.identify`, `system.tree` | Health, feature discovery, caller/window/workspace identity. |
| `window.*` | `window.list`, `window.current`, `window.focus`, `window.create`, `window.close` | Top-level cmux windows. |
| `workspace.*` | `workspace.list`, `workspace.create`, `workspace.select`, `workspace.reorder`, `workspace.remote.*` | Workspace lifecycle, ordering, remote workspace state. |
| `surface.*` / `tab.*` | `surface.create`, `surface.split`, `surface.move`, `surface.send_text`, `tab.action` | Terminal/browser surfaces and tab-like actions. |
| `pane.*` | `pane.list`, `pane.focus`, `pane.create`, `pane.resize`, `pane.join` | Split-pane layout and focus. |
| `notification.*` | `notification.create`, `notification.list`, `notification.dismiss`, `notification.open` | In-app notifications targeted to workspaces or surfaces. |
| `browser.*` | `browser.navigate`, `browser.snapshot`, `browser.click`, `browser.fill`, `browser.screenshot` | Browser panel automation. |
| `vm.*` | `vm.list`, plus create/destroy/attach helpers through the CLI | Cloud VM lifecycle through socket-worker dispatch. |

The CLI help exposes the same domains with friendlier command names: `list-workspaces`, `new-split`, `new-pane`, `new-surface`, `send`, `notify`, `browser ...`, `vm ...`, and `rpc ...`.

Sources: [Sources/TerminalController.swift:3318-3519](), [Sources/TerminalController.swift:3521-3605](), [CLI/cmux.swift:29495-29634]()

## Handle Inputs And Output IDs

The CLI accepts UUIDs, short refs such as `window:1`, `workspace:2`, `pane:3`, and `surface:4`, or indexes where a command supports them. Output defaults to refs; `--id-format` can switch it to UUIDs or to both.

This is a developer-experience detail worth copying: it lets humans use compact references in scripts while still allowing machines to store stable UUIDs.
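
The three accepted handle forms can be told apart with a small classifier. This is an illustrative sketch, not the parser in `TerminalControllerV2ParamParsingSupport.swift`: `classify_handle` is a hypothetical name, and the set of ref kinds is limited to the four shown above.

```python
import re
import uuid
from typing import Tuple

# Short refs take the form "<kind>:<number>", for the kinds named on this page.
REF_PATTERN = re.compile(r"^(window|workspace|pane|surface):(\d+)$")


def classify_handle(text: str) -> Tuple[str, object]:
    """Classify a handle as ("ref", (kind, n)), ("index", n), or ("uuid", s)."""
    text = text.strip()
    match = REF_PATTERN.match(text)
    if match:
        return ("ref", (match.group(1), int(match.group(2))))
    if text.isdigit():
        return ("index", int(text))
    try:
        # uuid.UUID normalizes case and validates the hex layout.
        return ("uuid", str(uuid.UUID(text)))
    except ValueError:
        raise ValueError(f"unrecognized handle: {text!r}")
```

A script might keep the normalized UUID for storage while showing the short ref to a human.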

Sources: [CLI/cmux.swift:2672-2717](), [CLI/cmux.swift:29460-29467](), [CLI/cmux.swift:4882-4895](), [Sources/TerminalControllerV2ParamParsingSupport.swift:87-103]()

## Browser Automation Surface

Browser commands are not a separate server. They are socket methods in the same dispatch tree and CLI wrapper. The CLI parses `browser` subcommands, resolves a target surface where required, and calls methods such as `browser.navigate`, `browser.snapshot`, `browser.eval`, `browser.wait`, `browser.click`, `browser.fill`, `browser.screenshot`, `browser.tab.*`, and browser profile/import commands.

This makes browser automation composable with workspaces and panes: scripts can create a browser split, navigate it, inspect page state, and keep all state tied to cmux workspace handles.

Sources: [CLI/cmux.swift:9705-9768](), [CLI/cmux.swift:29600-29634](), [Sources/TerminalController.swift:2351-2377](), [Sources/TerminalController.swift:3521-3605]()

## Notifications, Telemetry, And Agent Status

The socket API is also a lightweight status bus. CLI commands can create notifications, list/mark/dismiss them, set status/progress/log entries, report TTY and shell state, and trigger port metadata updates. Tests verify that targeted v2 notifications attach to the intended workspace/surface instead of the currently focused panel.

Socket operation telemetry extracts an operation name from either a v2 JSON `method` or the first token of a v1 command, and tracks phase, timeout, duration, bytes read, and whether a newline response was seen. That makes hangs diagnosable without hard-coding command-specific telemetry.
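
The operation-name rule can be sketched in a few lines. This is an approximation of the idea, not `SocketOperationTelemetry.swift`: the function and dataclass names are hypothetical, the `"unknown"` fallback is an assumption, and the phase value is a placeholder because the real phase names are not listed here.

```python
import json
from dataclasses import dataclass


def operation_name(line: str) -> str:
    """Derive a telemetry label from one request line (v2 JSON or v1 text)."""
    stripped = line.strip()
    if stripped.startswith("{"):
        # v2: use the JSON "method" field when the line parses cleanly.
        try:
            payload = json.loads(stripped)
        except json.JSONDecodeError:
            return "unknown"
        method = payload.get("method") if isinstance(payload, dict) else None
        return method if isinstance(method, str) else "unknown"
    # v1: the first whitespace-delimited token is the command name.
    tokens = stripped.split()
    return tokens[0] if tokens else "unknown"


@dataclass
class SocketOperationRecord:
    """Fields the page says are tracked per operation."""
    operation: str
    phase: str = "unspecified"  # placeholder; real phase names are not shown here
    timeout_seconds: float = 0.0
    duration_seconds: float = 0.0
    bytes_read: int = 0
    saw_newline_response: bool = False
```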

Sources: [CLI/cmux.swift:29555-29570](), [Sources/TerminalController.swift:2989-3089](), [Sources/TerminalController.swift:3493-3513](), [CLI/SocketOperationTelemetry.swift:3-50](), [cmuxTests/TerminalControllerSocketSecurityTests.swift:800-830](), [cmuxTests/TerminalControllerSocketSecurityTests.swift:900-932]()

## Listener Reliability

The app’s listener is designed to survive common local automation failure modes. `TerminalController` binds an `AF_UNIX` socket, applies permissions, listens, records the last socket path, and publishes listener-start state. A path monitor detects when the socket file disappears and restarts the listener for the same path and access mode.
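
The bind, permission, listen, and restart sequence can be approximated in Python. Note the substitution: the app uses a path monitor, whereas this sketch uses an explicit `ensure_listening` check that a caller would invoke periodically. The class name and the backlog value are hypothetical.

```python
import os
import socket
from typing import Optional


class RestartingListener:
    """Unix-socket listener that can rebuild itself if its file disappears."""

    def __init__(self, path: str, mode: int = 0o600) -> None:
        self.path = path
        self.mode = mode
        self.server: Optional[socket.socket] = None

    def start(self) -> None:
        # Remove a stale socket file so bind() does not fail with EADDRINUSE.
        if os.path.exists(self.path):
            os.unlink(self.path)
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(self.path)
        os.chmod(self.path, self.mode)  # apply permissions before listening
        server.listen(16)
        self.server = server

    def ensure_listening(self) -> bool:
        """Restart on the same path and mode if needed; True when a restart ran."""
        if self.server is not None and os.path.exists(self.path):
            return False
        if self.server is not None:
            self.server.close()
        self.start()
        return True

    def close(self) -> None:
        if self.server is not None:
            self.server.close()
            self.server = None
        if os.path.exists(self.path):
            os.unlink(self.path)
```

Restarting with the stored path and mode matches the behavior described above, where the listener comes back for the same path and access mode.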

The UI tests cover three important operational cases: the socket appears when enabled, a deleted socket path is recreated and answers `ping`, and the socket stays absent when the setting is `off`.

Sources: [Sources/TerminalController.swift:1608-1778](), [Sources/TerminalController.swift:1898-1943](), [cmuxUITests/AutomationSocketUITests.swift:20-70](), [cmuxUITests/AutomationSocketUITests.swift:184-240]()

## Test Coverage And Dogfood Patterns

The test suite treats the socket as a real automation boundary. Unit tests open Unix sockets directly, write v2 JSON lines, and decode response envelopes. UI tests launch a tagged debug app with `CMUX_SOCKET_PATH` and `CMUX_TAG`. Python E2E tests use the socket client to create workspaces, split panes, focus surfaces, send terminal input, and verify terminal responsiveness with marker files.

That coverage explains the API’s shape: it is not just for public CLI commands but is the same surface used to dogfood layout, focus, browser, notification, and remote-terminal workflows.

Sources: [cmuxTests/TerminalControllerSocketSecurityTests.swift:1089-1175](), [cmuxUITests/AutomationSocketUITests.swift:73-119](), [tests_v2/test_tab_dragging.py:49-93](), [tests_v2/test_tab_dragging.py:508-590]()

## Provider-Neutral Integration Guidance

For Grok-Wiki or any external automation UI, the portable integration point is the local CLI/socket contract:

| Integration need | Preferred API shape | Provider-neutral reason |
| --- | --- | --- |
| Discover current context | `cmux identify` or `system.identify` | Uses local handles, not model-session internals. |
| Run scripted workspace actions | First-class CLI commands where available | Stable UX labels over raw implementation names. |
| Access newer or niche methods | `cmux rpc <method> <json>` | Keeps method dispatch open without coupling to one client. |
| Target tagged debug apps | `CMUX_TAG`, `CMUX_SOCKET_PATH`, or `--socket` | Portable across local app builds and CI. |
| Feed agent/UI events | v2 feed/notification/status methods | Skill packs can be files, repositories, or catalogs; the socket does not depend on their provider. |

A good UI flow should show product actions first: “Create workspace,” “Split pane,” “Send text,” “Open browser,” “Create notification,” and “List VMs.” Raw `rpc` should be available as an advanced escape hatch, not the primary user-facing affordance.
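
An external tool that keeps `rpc` as its advanced escape hatch might wrap the CLI rather than the socket. The sketch below builds the argument list for `cmux rpc <method> [json-params]`. It rests on assumptions: that `--socket` is accepted before the subcommand, that the result arrives on standard output, and that `cmux` is on `PATH`; the function names are hypothetical.

```python
import json
import subprocess
from typing import Any, Dict, List, Optional


def rpc_argv(
    method: str,
    params: Optional[Dict[str, Any]] = None,
    socket_path: Optional[str] = None,
    executable: str = "cmux",
) -> List[str]:
    """Build the argv for `cmux rpc <method> [json-params]`."""
    argv = [executable]
    if socket_path:
        # Assumption: the explicit socket override precedes the subcommand.
        argv += ["--socket", socket_path]
    argv += ["rpc", method]
    if params is not None:
        argv.append(json.dumps(params))
    return argv


def run_rpc(method: str, params: Optional[Dict[str, Any]] = None,
            socket_path: Optional[str] = None) -> str:
    """Run the CLI and return its standard output; raises on a non-zero exit."""
    completed = subprocess.run(
        rpc_argv(method, params, socket_path),
        capture_output=True, text=True, check=True,
    )
    return completed.stdout
```

Building the argv as a list avoids shell quoting problems with JSON params.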

Sources: [CLI/cmux.swift:29475-29644](), [CLI/cmux.swift:3378-3396](), [Sources/TerminalController.swift:3318-3519]()

## Summary

The CLI and socket API form cmux’s scriptable control plane: local Unix socket transport, layered socket discovery for stable and tagged apps, access modes for local automation, v1 compatibility, v2 JSON method dispatch, and broad product coverage across windows, workspaces, panes, surfaces, browser panels, notifications, and VM commands. The result is a practical automation boundary that can be driven by humans, tests, agents, or external tools without binding the architecture to a specific AI provider or hosted connector.

Sources: [CLI/cmux.swift:29452-29644](), [Sources/TerminalController.swift:3278-3519]()
