# The Mental Model — One Picture of the Whole System

> The simplest useful model of omp: what it is, how its layers relate, what the invariants are, and which facts let you predict behavior without reopening the code.

- Repository: can1357/oh-my-pi
- GitHub: https://github.com/can1357/oh-my-pi
- Human wiki: https://grok-wiki.com/public/wiki/can1357-oh-my-pi-64b0ce1ccc45
- Complete Markdown: https://grok-wiki.com/public/wiki/can1357-oh-my-pi-64b0ce1ccc45/llms-full.txt

## Source Files

- `README.md`
- `Cargo.toml`
- `packages/coding-agent/src/main.ts`
- `packages/coding-agent/src/sdk.ts`
- `packages/agent/src/agent.ts`
- `packages/agent/src/types.ts`

---

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:

- [README.md](README.md)
- [Cargo.toml](Cargo.toml)
- [AGENTS.md](AGENTS.md)
- [packages/coding-agent/src/main.ts](packages/coding-agent/src/main.ts)
- [packages/coding-agent/src/sdk.ts](packages/coding-agent/src/sdk.ts)
- [packages/agent/src/agent.ts](packages/agent/src/agent.ts)
- [packages/agent/src/agent-loop.ts](packages/agent/src/agent-loop.ts)
- [packages/agent/src/types.ts](packages/agent/src/types.ts)
- [packages/coding-agent/src/session/agent-session.ts](packages/coding-agent/src/session/agent-session.ts)
- [packages/coding-agent/src/modes/index.ts](packages/coding-agent/src/modes/index.ts)
</details>


omp (oh-my-pi) is a full-stack AI coding agent — a CLI tool with the IDE wired in. It wraps LLM calls in a stateful, multi-turn agent loop, exposes a tool harness for filesystem, shell, LSP, MCP, and browser operations, and renders output through a terminal UI. Understanding how its layers relate — and what invariants hold across them — lets you predict behavior and debug integration problems without re-reading the code each time.

This page presents the smallest useful model of omp: what the layers are, which direction dependencies flow, what each layer owns, and which facts let you predict how the whole system behaves.

---

## The Three-Layer Stack

omp splits into three layers that sit on two foundation packages, `pi-ai` and `pi-natives` (the bottom box in the diagram). Dependencies point only downward: a lower layer never imports an upper one.

```text
┌──────────────────────────────────────────────────────┐
│              CLI / Run Modes                         │
│   main.ts · InteractiveMode · RpcMode · AcpMode      │
│   (I/O, arg parsing, session lifecycle, TUI)         │
├──────────────────────────────────────────────────────┤
│              Session / SDK                           │
│   createAgentSession() · AgentSession                │
│   (tools, skills, extensions, MCP, LSP, memory)     │
├──────────────────────────────────────────────────────┤
│              Agent Core                              │
│   Agent class · agentLoop() · streamSimple()         │
│   (LLM call, streaming, tool dispatch, events)      │
├──────────────────────────────────────────────────────┤
│   pi-ai (multi-provider LLM client)                  │
│   pi-natives (Rust: grep, AST, PTY, shell)           │
└──────────────────────────────────────────────────────┘
```

Sources: [AGENTS.md]() (package table), [packages/coding-agent/src/sdk.ts:156-284](), [packages/agent/src/agent.ts:242-310]()

---

## Layer 1 — Agent Core (`packages/agent`)

The agent core is a provider-neutral, runtime-agnostic conversation engine. It knows nothing about coding tools, sessions, or file systems.

### `Agent` class

`Agent` is the observable state machine around the agent loop. It owns:

- `AgentState` — `model`, `systemPrompt`, `tools`, `messages`, `isStreaming`, `pendingToolCalls`
- steering queue and follow-up queue
- an `AbortController` that cancels in-flight LLM calls

Key methods:

| Method | What it does |
|--------|-------------|
| `prompt(input)` | Start a new turn; throws `AgentBusyError` if already streaming |
| `continue()` | Resume from existing context (steering queues, retries) |
| `steer(msg)` | Enqueue a mid-run interrupt; delivered after the current tool completes |
| `followUp(msg)` | Enqueue a message delivered only when the agent would otherwise stop |
| `abort()` | Fire the abort controller; in-flight tool calls see a cancelled signal |
| `subscribe(fn)` | Receive `AgentEvent` stream (streaming, tool calls, completions) |

Sources: [packages/agent/src/agent.ts:242-260](), [packages/agent/src/agent.ts:650-665]()
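
The busy guard and the two queues can be pictured with a minimal sketch. `MiniAgent` and its `drainSteering`, `drainFollowUps`, and `finishTurn` helpers are hypothetical stand-ins, not the real `Agent` class; only the names `prompt`, `steer`, `followUp`, and `AgentBusyError` come from the table above, and the real methods are asynchronous and event-driven.

```typescript
// Hypothetical miniature of the Agent surface, reduced to synchronous state changes.
class AgentBusyError extends Error {
  constructor() {
    super("agent is already streaming");
    this.name = "AgentBusyError";
  }
}

class MiniAgent {
  isStreaming = false;
  readonly messages: string[] = [];
  private readonly steeringQueue: string[] = [];
  private readonly followUpQueue: string[] = [];

  // Start a new turn; refuse if a turn is already in flight.
  prompt(input: string): void {
    if (this.isStreaming) throw new AgentBusyError();
    this.isStreaming = true;
    this.messages.push(input);
  }

  // Queue a mid-run interrupt; this never throws, even while streaming.
  steer(msg: string): void {
    this.steeringQueue.push(msg);
  }

  // Queue a message for the moment the agent would otherwise stop.
  followUp(msg: string): void {
    this.followUpQueue.push(msg);
  }

  // Hypothetical helper: hand over and clear the queued steering messages.
  drainSteering(): string[] {
    return this.steeringQueue.splice(0);
  }

  // Hypothetical helper: hand over and clear the queued follow-ups.
  drainFollowUps(): string[] {
    return this.followUpQueue.splice(0);
  }

  // Hypothetical helper: mark the current turn as finished.
  finishTurn(): void {
    this.isStreaming = false;
  }
}
```

The point of the split is that `prompt()` is the only entry that can fail on a busy agent; `steer()` and `followUp()` always succeed because they only enqueue.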

### `agentLoop` — the turn engine

`agentLoop` is an async generator. One call covers one complete reasoning turn:

1. Optionally drain steering messages (if `interruptMode: "immediate"`, checked after every tool call).
2. Call `convertToLlm()` to filter `AgentMessage[]` down to wire-format `Message[]`.
3. Optionally call `transformContext()` for context pruning / injection.
4. Call `syncContextBeforeModelCall()` so live tool / prompt changes are captured.
5. Stream from the LLM via `streamFn` (defaults to `streamSimple` from `pi-ai`).
6. Execute tool calls, concurrently unless a tool declares `concurrency: "exclusive"`.
7. Run the `beforeToolCall` and `afterToolCall` hooks around each tool execution.
8. Call `getSteeringMessages()` after each tool; if any arrive, skip remaining tools and restart.
9. Call `getFollowUpMessages()` when stopping; if any arrive, continue.

Sources: [packages/agent/src/agent-loop.ts:1-64](), [packages/agent/src/types.ts:30-216]()
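
The control flow of steps 5, 6, 8, and 9 can be sketched as a plain loop. This is a simplified, synchronous model under stated assumptions: `runLoop`, `LoopDeps`, `callModel`, and `runTool` are hypothetical names, tools run one at a time, and context is a list of strings. The real `agentLoop` is an async generator that streams events.

```typescript
type ToolCall = { name: string };
type ModelReply = { text: string; toolCalls: ToolCall[] };

// Hypothetical dependencies; callModel stands in for streamFn, runTool for tool.execute.
interface LoopDeps {
  callModel(context: string[]): ModelReply;
  runTool(call: ToolCall): string;
  getSteeringMessages(): string[];
  getFollowUpMessages(): string[];
}

// Returns a log of what happened, in order, so the control flow is easy to inspect.
function runLoop(context: string[], deps: LoopDeps): string[] {
  const log: string[] = [];
  for (;;) {
    const reply = deps.callModel(context);
    context.push(`assistant:${reply.text}`);
    log.push(`llm:${reply.text}`);

    let steered = false;
    for (const call of reply.toolCalls) {
      context.push(`toolResult:${deps.runTool(call)}`);
      log.push(`tool:${call.name}`);

      // Step 8: check for steering after each tool; if present, skip the rest.
      const steering = deps.getSteeringMessages();
      if (steering.length > 0) {
        context.push(...steering.map((m) => `user:${m}`));
        log.push("steered");
        steered = true;
        break;
      }
    }

    // Tool results or steering messages mean the model must be called again.
    if (steered || reply.toolCalls.length > 0) continue;

    // Step 9: the agent would stop here; follow-ups keep it going.
    const followUps = deps.getFollowUpMessages();
    if (followUps.length === 0) return log;
    context.push(...followUps.map((m) => `user:${m}`));
    log.push("follow-up");
  }
}
```

In this model a steering message preempts the remaining tool calls of the current reply, while a follow-up is only consulted once the model has produced a reply with no tool calls.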

### `AgentMessage` vs `Message`

`AgentMessage = Message | CustomAgentMessages[keyof CustomAgentMessages]`

The distinction is structural. LLM providers only see `Message` (user/assistant/toolResult). `AgentMessage` can include UI-only custom types (notifications, artifacts). The `convertToLlm` function at the LLM boundary filters them out. This is what allows session storage and UI to share the same message array without leaking app-level concerns to the API payload.

Sources: [packages/agent/src/types.ts:305-315]()
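
A minimal sketch of that boundary filter, assuming simplified message shapes: the field names and the `NotificationMessage` type are illustrative, not the definitions in `types.ts`, and the real `convertToLlm` may do more than filter.

```typescript
// Wire-format messages: the only roles an LLM provider sees.
type Message =
  | { role: "user"; content: string }
  | { role: "assistant"; content: string }
  | { role: "toolResult"; toolCallId: string; content: string };

// Hypothetical UI-only custom message type.
type NotificationMessage = { role: "notification"; text: string };

type AgentMessage = Message | NotificationMessage;

// Type guard: true only for messages that belong on the wire.
function isLlmMessage(m: AgentMessage): m is Message {
  return m.role === "user" || m.role === "assistant" || m.role === "toolResult";
}

// Drop custom message types; the input array is left untouched.
function convertToLlm(messages: AgentMessage[]): Message[] {
  return messages.filter(isLlmMessage);
}
```

Because the filter returns a new array, the in-memory history that session storage and the UI read from keeps its custom entries.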

### Event model

`Agent` emits a typed `AgentEvent` stream:

```text
agent_start
  └─ turn_start
       ├─ message_start  (assistant streaming begins)
       │    └─ message_update  (token deltas)
       ├─ message_end    (full assistant message committed)
       ├─ tool_execution_start
       │    └─ tool_execution_update  (streaming partial results)
       ├─ tool_execution_end
       └─ turn_end
agent_end  (carries telemetry summary if OTEL enabled)
```

Subscribers get fine-grained lifecycle hooks for every message and tool call. The TUI layer, session storage, and RPC mode all drive from these events.

Sources: [packages/agent/src/types.ts:429-451]()
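
A subscriber typically switches on the event type. The sketch below assumes payload fields (`delta`, `toolName`) and an unsubscribe function returned from `subscribe`; both are illustrative guesses, and `MiniEventBus` is a hypothetical stand-in for the emitting side of `Agent`. Only the event names come from the tree above.

```typescript
// Event names from the tree above; payload fields are assumptions.
type AgentEvent =
  | { type: "agent_start" }
  | { type: "turn_start" }
  | { type: "message_start" }
  | { type: "message_update"; delta: string }
  | { type: "message_end" }
  | { type: "tool_execution_start"; toolName: string }
  | { type: "tool_execution_end"; toolName: string }
  | { type: "turn_end" }
  | { type: "agent_end" };

type AgentListener = (event: AgentEvent) => void;

// Hypothetical emitter: listeners are called synchronously, in subscription order.
class MiniEventBus {
  private listeners: AgentListener[] = [];

  subscribe(fn: AgentListener): () => void {
    this.listeners.push(fn);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== fn);
    };
  }

  emit(event: AgentEvent): void {
    for (const fn of this.listeners) fn(event);
  }
}

// Rebuild assistant text from token deltas and note each tool start.
function renderTranscript(bus: MiniEventBus): { lines: string[]; stop: () => void } {
  const lines: string[] = [];
  let current = "";
  const stop = bus.subscribe((event) => {
    switch (event.type) {
      case "message_start":
        current = "";
        break;
      case "message_update":
        current += event.delta;
        break;
      case "message_end":
        lines.push(current);
        break;
      case "tool_execution_start":
        lines.push(`[tool ${event.toolName}]`);
        break;
      default:
        break; // other lifecycle events are ignored by this renderer
    }
  });
  return { lines, stop };
}
```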

---

## Layer 2 — Session / SDK (`packages/coding-agent/src/sdk.ts` + `session/`)

The SDK layer wires everything that makes omp a *coding* agent: tools, extensions, skills, MCP, LSP, memory, and session persistence. The entry point is `createAgentSession(options)`.

### `createAgentSession` startup sequence

```text
createAgentSession(options)
  │
  ├─ [parallel] discoverContextFiles()   → AGENTS.md walking
  ├─ [parallel] discoverPromptTemplates()
  ├─ [parallel] discoverSlashCommands()
  ├─ [parallel] discoverSkills()
  ├─ [parallel] buildWorkspaceTree()     → native Rust scan
  ├─ [parallel] modelRegistry.refreshInBackground()
  │
  ├─ Settings.init()                     → layered config
  ├─ loadSecrets() + SecretObfuscator    → env + file secrets
  ├─ SessionManager                      → JSONL session file
  ├─ createTools()                       → built-in tool set
  ├─ discoverAndLoadExtensions()         → .omp/extensions/
  ├─ discoverAndLoadMCPTools()           → .mcp.json
  ├─ discoverStartupLspServers()         → LSP warmup
  │
  └─ new AgentSession(config)
       └─ new Agent(opts)                → agent core
```

Parallel discovery fans out early because the workspace tree scan, context file walk, and skill loading are I/O-bound and independent. They converge at their respective consumer sites inside `AgentSession`.

Sources: [packages/coding-agent/src/sdk.ts:714-834]()
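
The fan-out-then-converge pattern can be sketched as follows. The `Discoverers` interface, `gatherSessionInputs`, and the return shapes are hypothetical; the sketch only shows the general technique of starting independent promises early and awaiting them where the results are consumed, not the actual body of `createAgentSession`.

```typescript
// Hypothetical discovery functions; real signatures in sdk.ts may differ.
interface Discoverers {
  discoverContextFiles(): Promise<string[]>;
  discoverSkills(): Promise<string[]>;
  buildWorkspaceTree(): Promise<string>;
  initSettings(): Promise<Record<string, unknown>>;
}

interface SessionInputs {
  settings: Record<string, unknown>;
  contextFiles: string[];
  skills: string[];
  workspaceTree: string;
}

async function gatherSessionInputs(d: Discoverers): Promise<SessionInputs> {
  // Fan out: start the independent, I/O-bound work without awaiting it yet.
  const contextFilesP = d.discoverContextFiles();
  const skillsP = d.discoverSkills();
  const treeP = d.buildWorkspaceTree();

  // Sequential setup proceeds while discovery is still in flight.
  const settings = await d.initSettings();

  // Converge: await the results only where they are consumed.
  const [contextFiles, skills, workspaceTree] = await Promise.all([
    contextFilesP,
    skillsP,
    treeP,
  ]);

  return { settings, contextFiles, skills, workspaceTree };
}
```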

### `AgentSession`

`AgentSession` is the durable, multi-turn session facade used by all run modes. It:

- Wraps the `Agent` core and surfaces `prompt()`, `promptCustomMessage()`, `dispose()`
- Manages `SessionManager` (JSONL append-log for message persistence)
- Owns the tool set, extension runner, skills, memory backend, and async job manager
- Handles model cycling (`Ctrl+P`), thinking level changes, plan mode, and compaction
- Routes IRC/subagent dispatch via `AgentRegistry`

The same `AgentSession` class serves every run mode: interactive, print, RPC, and ACP all call the same `session.prompt()`. Mode code adds I/O on top; it does not reimplement session logic.

Sources: [packages/coding-agent/src/session/agent-session.ts:4-270]()

### Tool architecture

Tools live in `packages/coding-agent/src/tools/`. They implement `AgentTool<TParameters, TDetails>` from the agent core. Key properties:

| Property | Meaning |
|---------|---------|
| `loadMode: "essential"` | Loaded at startup unconditionally |
| `loadMode: "discoverable"` | Activated on demand by tool-search index |
| `concurrency: "exclusive"` | Runs alone — no other tools execute in parallel |
| `hidden: true` | Not shown unless explicitly listed via `--tools` |
| `deferrable: true` | Can stage a pending action requiring `resolve` tool |
| `nonAbortable: true` | Ignores abort signals; runs to completion |

Built-in tools include `bash`, `read`, `write`, `edit`, `find`, `search`, `fetch`, `eval` (Python kernel), and more. Extensions, MCP servers, and custom tools register additional tools into the same registry.

Sources: [packages/coding-agent/src/sdk.ts:121-146](), [packages/agent/src/types.ts:373-415]()
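
One way to picture what `concurrency: "exclusive"` implies for scheduling is to group a reply's tool calls into batches. The batching policy below is an assumption made for illustration (consecutive non-exclusive tools share a batch, an exclusive tool gets a batch of its own); `ToolSpec` and `planBatches` are hypothetical names, and the real dispatcher may schedule differently.

```typescript
// Only the fields needed for scheduling; the real AgentTool has many more.
interface ToolSpec {
  name: string;
  concurrency?: "exclusive";
}

// Each inner array is a batch whose members could run in parallel.
function planBatches(calls: ToolSpec[]): string[][] {
  const batches: string[][] = [];
  let current: string[] = [];

  for (const call of calls) {
    if (call.concurrency === "exclusive") {
      // Flush whatever was accumulating, then run the exclusive tool alone.
      if (current.length > 0) {
        batches.push(current);
        current = [];
      }
      batches.push([call.name]);
    } else {
      current.push(call.name);
    }
  }

  if (current.length > 0) batches.push(current);
  return batches;
}
```

For example, the calls `read`, `find`, an exclusive `bash`, then `read` produce three batches under this policy: the first two together, `bash` alone, and the final `read` alone.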

### Extension, skill, and MCP layers

All three are optional, composable extensibility layers:

- **Extensions** — TypeScript modules loaded from `.omp/extensions/` (or explicit paths). They receive `ExtensionContext` and can register tools, custom commands, and UI actions.
- **Skills** — Markdown files (`SKILL.md`) injected into the system prompt for each skill listed in `skills.includeSkills`. They carry no runtime code.
- **MCP (Model Context Protocol)** — External servers declared in `.mcp.json`. Their tools are discovered and registered as `AgentTool` adapters. In ACP mode, the host supplies MCP servers; the session disables `.mcp.json` discovery (`enableMCP: false`) to avoid shadowing.

Sources: [packages/coding-agent/src/sdk.ts:59-90](), [packages/coding-agent/src/main.ts:219-245]()

---

## Layer 3 — CLI / Run Modes (`packages/coding-agent/src/main.ts` + `modes/`)

The top layer parses arguments, initializes settings, creates the session, and dispatches to one of four run modes:

| Mode | Entry | When used |
|------|-------|-----------|
| Interactive | `InteractiveMode` (TUI) | Default terminal session |
| Print | `runPrintMode` | `--print`, piped input, non-TTY |
| RPC | `runRpcMode` | `--mode rpc` / `--mode rpc-ui` — IDE extensions |
| ACP | `runAcpMode` | `--mode acp` — Agent Client Protocol |

All modes call `session.prompt()` or `session.promptCustomMessage()`. They differ only in how they acquire input and render output.

The main entry point `runRootCommand` performs a fixed startup sequence:

1. `initTheme` (early, for symbols)
2. `maybeAutoChdir` (redirect away from `$HOME`)
3. `discoverAuthStorage` + `ModelRegistry`
4. `Settings.init` + model role overrides
5. Plugin/marketplace preload (background)
6. `createSessionManager` (continue / resume / fork / new)
7. `buildSessionOptions` (model, system prompt, tools, skills, extensions)
8. `createAgentSession` → `AgentSession`
9. Dispatch to mode

Sources: [packages/coding-agent/src/main.ts:692-1028]()

### ACP mode and session isolation

In ACP mode, each `session/new` RPC call creates a fresh `AgentSession` via the factory returned by `createAcpSessionFactory`. The factory forces `enableMCP: false` on every session: MCP servers come from the ACP client's `session/new.mcpServers`, not from `.mcp.json` on disk. Without this, host-disk tools shadow client-supplied servers (issue #1234).

Sources: [packages/coding-agent/src/main.ts:202-244]()
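
The isolation rule can be sketched as a factory that overrides one option on every session it builds. The option and request shapes here are simplified assumptions, and `createAcpSessionFactorySketch` is a hypothetical name; the real `createAcpSessionFactory` returns a factory for full `AgentSession` objects.

```typescript
// Simplified option and request shapes, assumed for illustration.
interface SessionOptionsSketch {
  cwd: string;
  enableMCP: boolean;
  mcpServers: string[];
}

interface NewSessionRequest {
  cwd: string;
  mcpServers: string[];
}

// Returns a per-request builder; every session gets its own options object.
function createAcpSessionFactorySketch(base: SessionOptionsSketch) {
  return (request: NewSessionRequest): SessionOptionsSketch => ({
    ...base,
    cwd: request.cwd,
    // Servers come from the client request, never from disk discovery.
    mcpServers: [...request.mcpServers],
    // Forced off regardless of what the base options say.
    enableMCP: false,
  });
}
```

Placing `enableMCP: false` after the spread is what makes the override unconditional: a base configuration cannot turn disk discovery back on.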

---

## Dependency Direction

```text
  coding-agent (CLI)
      │  imports
      ▼
  packages/agent  (Agent, agentLoop, types)
      │  imports
      ▼
  packages/ai  (streamSimple, providers, LLM types)
      │  calls
      ▼
  LLM Provider APIs  (Anthropic, OpenAI, Gemini, …)

  packages/coding-agent
      │  also imports
      ▼
  crates/pi-natives  (Rust NAPI bindings via packages/natives)
  packages/tui       (terminal rendering)
  packages/utils     (logger, Snowflake, env, path utils)
```

`packages/agent` has no knowledge of coding-agent session types. `packages/ai` has no knowledge of tools or sessions. This means the agent core and LLM client are reusable outside the coding-agent context.

Sources: [AGENTS.md]() (package table), [packages/agent/src/agent.ts:1-36]()

---

## System Prompt Architecture

The system prompt is not a single string. It is an array of blocks (`string[]`) built at session creation from multiple sources:

1. **Harness prompt** — stable built-in instructions (from `.md` files in `prompts/system/`)
2. **Workspace tree** — rendered directory listing from the native scan
3. **Context files** — content of every `AGENTS.md` found walking up from `cwd`
4. **Skills** — `SKILL.md` content for each skill in `includeSkills`
5. **Rules** — content of `RULES.md` and `.omp/rules/` files
6. **Custom overrides** — `SYSTEM.md` discovered from `.omp/SYSTEM.md` or `~/.omp/`; `APPEND_SYSTEM.md` appended at the end

The CLI `--system-prompt` flag replaces block 0; `--append-system-prompt` appends after all defaults. The `systemPrompt` option in `CreateAgentSessionOptions` accepts either a `string[]` (full replacement) or a function `(default) => final` (surgical modification).

Prompts are never built inline in TypeScript. They live in static `.md` files imported with `{ type: "text" }`, with Handlebars for dynamic content.

Sources: [packages/coding-agent/src/sdk.ts:535-645](), [packages/coding-agent/src/main.ts:495-521](), [AGENTS.md]() (prompts rule)
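
The two override paths can be sketched with plain array operations. `resolveSystemPrompt` and `applyCliFlags` are hypothetical helpers written for this page; they model only the behavior described above (array replaces, function transforms, `--system-prompt` replaces block 0, `--append-system-prompt` appends) and are not the SDK's implementation.

```typescript
// Mirrors the two accepted forms of the `systemPrompt` option.
type SystemPromptOption = string[] | ((defaults: string[]) => string[]);

// An array replaces the defaults; a function receives them and returns the final blocks.
function resolveSystemPrompt(defaults: string[], option?: SystemPromptOption): string[] {
  if (option === undefined) return [...defaults];
  return typeof option === "function" ? option([...defaults]) : [...option];
}

// Hypothetical model of the two CLI flags.
function applyCliFlags(
  blocks: string[],
  flags: { systemPrompt?: string; appendSystemPrompt?: string },
): string[] {
  const out = [...blocks];
  if (flags.systemPrompt !== undefined) out[0] = flags.systemPrompt; // replace block 0
  if (flags.appendSystemPrompt !== undefined) out.push(flags.appendSystemPrompt); // append last
  return out;
}
```

A function override is the safer choice when only one block needs to change, for example `(defaults) => [...defaults, "Prefer small diffs."]`, because the remaining default blocks are carried through unchanged.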

---

## Session Persistence

Sessions are stored as JSONL append-logs managed by `SessionManager`. Each entry is a serialized `AgentMessage`. On resume (`--continue`, `--resume`), the log is replayed into the `Agent`'s message array before the first prompt.

`SessionManager` supports four modes: `create` (new), `open` (exact path or ID), `continueRecent` (latest for `cwd`), and `forkFrom` (copy-on-write branch of an existing session). The in-memory variant (`SessionManager.inMemory()`) is used with `--no-session`.

The session ID doubles as the `providerSessionId` forwarded to LLM providers that support session-based caching (e.g., OpenAI Codex), keeping provider-side cache isolation aligned with the on-disk session file.

Sources: [packages/coding-agent/src/main.ts:377-443](), [packages/coding-agent/src/sdk.ts:805-810]()
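
The append-and-replay idea behind a JSONL log can be sketched in a few lines. This uses an in-memory string in place of a file and a simplified `StoredMessage` shape; `appendEntry` and `replay` are hypothetical helpers, and the real `SessionManager` entry format may carry additional fields.

```typescript
// Simplified entry shape, assumed for illustration.
type StoredMessage = { role: string; content: string };

// One JSON document per line; JSON.stringify escapes embedded newlines,
// so each entry always occupies exactly one line.
function appendEntry(log: string, entry: StoredMessage): string {
  return log + JSON.stringify(entry) + "\n";
}

// Rebuild the message array by parsing every non-empty line in order.
function replay(log: string): StoredMessage[] {
  return log
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as StoredMessage);
}
```

Appending never rewrites earlier lines, which is why resuming a session amounts to reading the file from the top and pushing each entry back into the message array.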

---

## Key Invariants

These facts predict behavior without reopening the code:

1. **`Agent` is synchronous in state, async in execution.** `#state` is mutated directly; listeners are notified synchronously after each mutation. There is no reducer pattern.

2. **One active prompt at a time.** `prompt()` throws `AgentBusyError` if `isStreaming`. Steering messages are queued, not injected mid-call.

3. **Messages transform at the LLM boundary only.** `convertToLlm` is the single choke point. Custom `AgentMessage` types survive in memory and session storage, but never reach the API.

4. **All discovery is parallel at startup; nothing re-scans during a session.** Context files, workspace tree, skills, and prompt templates are computed once in `createAgentSession` and passed to the session constructor. A session restart is needed to pick up new `AGENTS.md` files.

5. **Model identity is always provider/id, never a bare name.** `Model.provider` + `Model.id` is the stable key. Role aliases (`default`, `smol`, `slow`, `plan`) are resolved at startup via `resolveModelRoleValue`; changes after session creation require `session.setModel()`.

6. **Extensions, MCP, and custom tools share the same `AgentTool` registry.** The agent loop cannot distinguish between a built-in tool, an extension tool, and an MCP adapter. This means all tool features (`concurrency`, `beforeToolCall`, etc.) apply uniformly.

7. **`AgentSession` is run-mode-agnostic.** All four run modes (interactive, print, RPC, ACP) call the same `session.prompt()`. Mode code is a thin I/O wrapper, not a reimplementation of session logic.

Sources: [packages/agent/src/agent.ts:758-760](), [packages/agent/src/types.ts:305-315](), [packages/coding-agent/src/sdk.ts:763-787](), [packages/coding-agent/src/session/agent-session.ts:4-8]()

---

## Sequence: A Single Interactive Turn

```mermaid
sequenceDiagram
    participant User as User (TUI)
    participant IM as InteractiveMode
    participant AS as AgentSession
    participant AG as Agent
    participant AL as agentLoop
    participant LLM as LLM Provider (pi-ai)
    participant Tool as Tool (e.g. bash)

    User->>IM: keypress submit
    IM->>AS: session.prompt(text)
    AS->>AG: agent.prompt(userMsg)
    AG->>AL: agentLoop(msgs, context, config)
    AL->>LLM: streamSimple(messages, tools)
    LLM-->>AL: streaming tokens
    AL-->>AG: message_start / message_update
    AG-->>IM: AgentEvent (live token render)
    LLM-->>AL: tool_call block
    AL->>Tool: tool.execute(args, signal)
    Tool-->>AL: AgentToolResult
    AL-->>AG: tool_execution_start/end
    AL->>LLM: streamSimple([...messages, toolResult])
    LLM-->>AL: final assistant message
    AL-->>AG: message_end / agent_end
    AG-->>AS: AgentEvent (agent_end)
    AS->>SessionManager: append messages to JSONL
```

Sources: [packages/agent/src/agent.ts:834-1050](), [packages/agent/src/agent-loop.ts:1-80](), [packages/coding-agent/src/main.ts:317-321]()

---

## Rust / Native Layer

Performance-critical operations bypass the JS runtime. The `crates/pi-natives` crate exposes NAPI bindings consumed by `packages/natives`. Key capabilities:

- **Grep / file search** — `grep-searcher` + `ignore` (respects `.gitignore`)
- **AST-aware editing** — `pi-ast` using `tree-sitter` with grammars for 40+ languages
- **Shell execution** — `pi-shell` wraps `brush-core` (a Rust bash interpreter) for non-PTY shell operations
- **PTY support** — `portable-pty` for interactive shell sessions
- **Syntax highlighting** — `syntect`
- **Tokenization** — `tiktoken-rs` for token counting

The Rust workspace (`Cargo.toml`) uses `lto = "fat"` and `strip = true` in release builds and patches `brush-core` / `brush-builtins` to vendored local forks, pinning shell behavior for the agent's tool harness.

Sources: [Cargo.toml:1-15](), [Cargo.toml]() (workspace.dependencies section), [AGENTS.md]() (package table: `crates/pi-natives`)

---

## Summary

omp is three layers in strict dependency order: the provider-neutral `Agent` core (event stream, tool dispatch, steering/follow-up queues), the `AgentSession` / SDK layer (tool registry, extensions, MCP, LSP, memory, session persistence), and the CLI/mode layer (TUI, RPC, ACP, argument parsing). The LLM provider boundary is the single transform point where `AgentMessage[]` becomes `Message[]` — everything above that boundary can carry app-specific state without polluting the API payload. Because discovery runs parallel at startup and results are frozen into the session, the running session is always working from a consistent snapshot; restarts pick up filesystem changes.

Sources: [packages/agent/src/types.ts:305-315](), [packages/coding-agent/src/sdk.ts:763-787]()
