# TurnState & the Command/Event Protocol

> TurnState is the only thing that must survive across process boundaries. This page traces the lifecycle of a TurnState from creation through prompt/answer/wake commands and terminal events (complete, ask, sleep, interrupted), and explains what each field owns: agent messages, the state-machine session, usage accounting, and todos.

- Repository: dzhng/duet-agent
- GitHub: https://github.com/dzhng/duet-agent
- Human wiki: https://grok-wiki.com/public/wiki/dzhng-duet-agent-82dbe2572d3a
- Complete Markdown: https://grok-wiki.com/public/wiki/dzhng-duet-agent-82dbe2572d3a/llms-full.txt

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:

- [src/types/protocol.ts](src/types/protocol.ts)
- [src/turn-runner/turn-state.ts](src/turn-runner/turn-state.ts)
- [src/turn-runner/turn-runner.ts](src/turn-runner/turn-runner.ts)
- [src/turn-runner/usage-accounting.ts](src/turn-runner/usage-accounting.ts)
- [src/session/session-manager.ts](src/session/session-manager.ts)
- [src/session/session.ts](src/session/session.ts)
</details>

`TurnState` is the only data structure that must survive when the process exits and restarts. It is the serialized checkpoint that lets a session resume an agent conversation, a state-machine workflow, an in-progress todo list, and queued user prompts — all from a single JSON file. The command/event protocol built around `TurnState` is the control surface that every transport layer (CLI, HTTP server, daemon) speaks: commands drive the runner forward, and events report work or end the turn.

Understanding this protocol is the key to predicting what the system will do after a crash, a sleep, an interrupt, or a user follow-up that arrives while the agent is mid-task.

---

## The TurnState Snapshot

`TurnState` is defined as an interface in `src/types/protocol.ts`. Every field is intentionally carry-forward: when the runner writes a terminal event, the receiving layer persists this snapshot and hands it back to the runner on the next `start` command.

```ts
// src/types/protocol.ts
export interface TurnState {
  status: TurnStateStatus;       // lifecycle of this snapshot
  mode: TurnMode;                // "agent" | "auto" | StateMachineDefinition
  options?: TurnOptions;         // model, memoryModel, thinkingLevel
  agent: AgentSession;           // full conversation transcript
  stateMachine?: StateMachineSession;  // non-null in state-machine mode
  todos?: TurnTodo[];            // current work plan
  followUpQueue?: TurnFollowUpQueueEntry[];  // buffered pending user prompts
  queuedCommands?: TurnCommand[];            // commands not yet executed
}
```

Sources: [src/types/protocol.ts:169-208](src/types/protocol.ts)
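
Because the snapshot is persisted as JSON, it should survive a `JSON.stringify` / `JSON.parse` round trip unchanged. The following is a minimal sketch of that property using simplified stand-in types; `SketchTurnState`, `saveSnapshot`, and `loadSnapshot` are hypothetical names, and the real `AgentSession` and `TurnTodo` shapes are richer than shown.

```ts
// Simplified stand-ins for the real protocol types (assumptions for illustration).
type SketchStatus = "running" | "sleeping" | "waiting_for_human" | "interrupted";

interface SketchTurnState {
  status: SketchStatus;
  mode: "agent" | "auto";
  agent: { status: string; messages: { role: string; content: string }[] };
  todos?: { title: string; done: boolean }[];
}

// Serialize a snapshot the way a persistence layer might.
function saveSnapshot(state: SketchTurnState): string {
  return JSON.stringify(state);
}

// Parse it back; a real loader would also validate the shape.
function loadSnapshot(text: string): SketchTurnState {
  return JSON.parse(text) as SketchTurnState;
}

const original: SketchTurnState = {
  status: "sleeping",
  mode: "auto",
  agent: { status: "running", messages: [{ role: "user", content: "check the build" }] },
  todos: [{ title: "poll CI", done: false }],
};

// The restored object is a deep copy with the same content.
const restored = loadSnapshot(saveSnapshot(original));
```

Optional fields that are `undefined` are omitted by `JSON.stringify`, which fits `todos`, `followUpQueue`, and `queuedCommands` being declared optional.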

### Field ownership

| Field | Owner | Why it survives |
|---|---|---|
| `status` | Runner | Tells the receiver whether to arm a wake timer, show a question, or accept prompts. |
| `mode` | Runner (set at start) | Prevents `auto` sessions from switching to a constrained definition after resume. |
| `options` | Runner (set at start) | Keeps model and thinking level stable across all pi-agent turns inside one session. |
| `agent` | pi-agent transcript | Is the entire conversation history; state-machine sessions share it as the parent transcript. |
| `stateMachine` | `StateMachineController` | Progress, definition, and current state for long-running workflows. |
| `todos` | todo tool | Preserved so a resumed runner does not lose the work plan. |
| `followUpQueue` | Runner | Multimodal payloads that were queued but not yet delivered. |
| `queuedCommands` | Runner | Commands that arrived while non-agent work was driving the turn; replayed on resume. |

### TurnStateStatus lifecycle

```text
┌──────────────────────────────────────────────────────────────┐
│                           running                            │
│      (new turn starts, wake received, prompt absorbed)       │
└─────┬───────────────┬─────────────────┬───────────────────┬──┘
      │               │                 │                   │
  complete        sleeping      waiting_for_human      interrupted
 (completed,    (wakeAt set)       (ask event)
  failed,
  cancelled)
```

Sources: [src/types/protocol.ts:161-167](src/types/protocol.ts)

---

## Command Types

Commands are the inputs that drive one turn. All three live-turn commands share the `TurnCommand` union; `start` is separate because it is a setup step, not a turn.

### start — session setup

`TurnStartCommand` bootstraps the runner: loads memory and skills, hydrates `state` (fresh or persisted), and emits `turn_started`. No LLM work happens here.

```ts
export interface TurnStartCommand {
  type: "start";
  mode?: TurnMode;
  state?: TurnState;    // provide to resume a previous session
  options?: TurnOptions;
  mcpServers?: Record<string, McpHttpServerConfig>;
}
```

Sources: [src/types/protocol.ts:317-340](src/types/protocol.ts)

`TurnRunner.start()` calls `createInitialTurnState` only for fresh sessions, that is, when `command.state` is absent. When `command.state` is present, the persisted snapshot (including agent messages and state-machine history) is reused as-is, with only `options` re-resolved against the start options:

```ts
// src/turn-runner/turn-runner.ts:326-333
const state = command.state
  ? {
      ...command.state,
      options: this.resolveTurnOptions(startOptions, command.state.options),
    }
  : createInitialTurnState(mode, this.resolveTurnOptions(startOptions));
this.stateMachineController.hydrate(state.stateMachine);
```

Sources: [src/turn-runner/turn-runner.ts:326-334](src/turn-runner/turn-runner.ts)

### prompt — user message

`TurnPromptCommand` delivers a user message (with optional images) against the current state.

```ts
export interface TurnPromptCommand {
  type: "prompt";
  message: string;
  behavior: TurnPromptBehavior;   // "steer" | "follow_up"
  images?: TurnPromptImage[];
}
```

`behavior` controls delivery when a pi-agent session is already active:
- `"steer"` — calls `agent.steer()`, injecting the message into the running turn as immediate context.
- `"follow_up"` — calls `agent.followUp()`, queued until the current pi-agent turn finishes.

Sources: [src/types/protocol.ts:377-392](src/types/protocol.ts)
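
The routing described in the list above can be sketched as a small dispatch on `behavior`. This is an illustration only: `FakeAgent` and `deliverToActiveAgent` are hypothetical, and just the `steer` / `followUp` method names and the two behavior values come from the text.

```ts
type PromptBehavior = "steer" | "follow_up";

interface PromptCommandSketch {
  type: "prompt";
  message: string;
  behavior: PromptBehavior;
}

// Hypothetical stand-in for the pi-agent handle; it only records what it receives.
class FakeAgent {
  steered: string[] = [];
  queued: string[] = [];
  steer(message: string): void {
    this.steered.push(message);
  }
  followUp(message: string): void {
    this.queued.push(message);
  }
}

// Route a prompt that arrives while the agent is already running.
function deliverToActiveAgent(agent: FakeAgent, command: PromptCommandSketch): void {
  switch (command.behavior) {
    case "steer":
      agent.steer(command.message); // becomes context for the running turn
      break;
    case "follow_up":
      agent.followUp(command.message); // waits for the current turn to finish
      break;
  }
}

const activeAgent = new FakeAgent();
deliverToActiveAgent(activeAgent, { type: "prompt", message: "use pnpm instead", behavior: "steer" });
deliverToActiveAgent(activeAgent, { type: "prompt", message: "then run the tests", behavior: "follow_up" });
```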

### answer — structured question response

`TurnAnswerCommand` serializes a picker answer into XML and delivers it as a prompt. It uses the same `behavior` field and eventually routes through `TurnRunner.prompt()`:

```ts
// src/turn-runner/turn-runner.ts:675-683
protected async answer(command: TurnAnswerCommand): Promise<TurnTerminalEvent> {
  const message = this.commandToUserMessage(command);
  return this.prompt({ type: "prompt", message, behavior: command.behavior, images: command.images });
}
```

Sources: [src/turn-runner/turn-runner.ts:675-683](src/turn-runner/turn-runner.ts)

### wake — resume a sleeping session

`TurnWakeCommand` is the simplest command. If the runner's state is `"sleeping"`, it calls `stateMachineController.wake()` and drives the resulting state-machine step. If the runner is not sleeping, or the controller returns nothing to drive, it immediately returns `complete` with `"Nothing to wake."` and the original state, which makes the command safe to replay:

```ts
// src/turn-runner/turn-runner.ts:685-703
protected async wake(): Promise<TurnTerminalEvent> {
  const originalState = this.requireRunnerState();
  const state: TurnState = { ...originalState, status: "running" };
  if (originalState.status === "sleeping") {
    const result = await this.stateMachineController.wake();
    if (result) return this.driveStateMachineResult(result, state);
  }
  return { type: "complete", status: "completed", state: originalState, result: "Nothing to wake." };
}
```

Sources: [src/turn-runner/turn-runner.ts:685-703](src/turn-runner/turn-runner.ts)
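
The replay-safety property can be shown with a stripped-down model of the same control flow. This is a sketch under assumptions: `wakeSketch`, `WakeState`, and the `resume` callback are hypothetical stand-ins for the runner, its state, and `stateMachineController.wake()`.

```ts
type WakeStatus = "running" | "sleeping" | "waiting_for_human" | "interrupted";

interface WakeState {
  status: WakeStatus;
}

interface WakeOutcome {
  resumed: boolean;
  result: string;
  state: WakeState;
}

// Mirrors the shape of wake(): only a sleeping state with pending work is resumed.
function wakeSketch(state: WakeState, resume: () => string | undefined): WakeOutcome {
  if (state.status === "sleeping") {
    const result = resume();
    if (result !== undefined) {
      return { resumed: true, result, state: { ...state, status: "running" } };
    }
  }
  // Not sleeping, or nothing to drive: return the original state untouched.
  return { resumed: false, result: "Nothing to wake.", state };
}

const firstWake = wakeSketch({ status: "sleeping" }, () => "poll finished");
// Replaying the same command against the now-running state is a no-op.
const replayedWake = wakeSketch(firstWake.state, () => "poll finished");
```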

---

## Terminal Events

Every turn ends with exactly one terminal event. All four carry the updated `TurnState` so the receiver can persist and resume.

```ts
// src/types/protocol.ts:687-691
export type TurnTerminalEvent =
  | TurnAskEvent
  | TurnCompletedEvent
  | TurnInterruptedEvent
  | TurnSleepEvent;
```

Sources: [src/types/protocol.ts:687-691](src/types/protocol.ts)
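
A receiver typically branches on the terminal event's `type` to decide what to do next. The sketch below is one way to write that branch with an exhaustiveness check; `TerminalSketch`, `ReceiverAction`, and `nextReceiverAction` are hypothetical names, and the payloads are reduced to the fields mentioned on this page.

```ts
type TerminalSketch =
  | { type: "complete"; status: "completed" | "failed" | "cancelled" }
  | { type: "ask" }
  | { type: "sleep"; wakeAt: number }
  | { type: "interrupted" };

type ReceiverAction = "accept_prompts" | "show_questions" | "arm_wake_timer";

// Map each terminal event to the receiver's next step.
function nextReceiverAction(event: TerminalSketch): ReceiverAction {
  switch (event.type) {
    case "complete":
      return "accept_prompts";
    case "ask":
      return "show_questions";
    case "sleep":
      return "arm_wake_timer";
    case "interrupted":
      return "accept_prompts";
    default: {
      // Compile-time guard: adding a new terminal type makes this assignment fail.
      const unreachable: never = event;
      return unreachable;
    }
  }
}

const afterSleep = nextReceiverAction({ type: "sleep", wakeAt: 1_700_000_000_000 });
const afterAsk = nextReceiverAction({ type: "ask" });
```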

### complete

`TurnCompletedEvent` is emitted when the parent agent finishes its turn. `status` is one of `"completed" | "failed" | "cancelled"`.

### ask

`TurnAskEvent` pauses the turn and surfaces structured questions to the caller. It sets `state.status = "waiting_for_human"`. The caller sends a `TurnAnswerCommand` to continue.

### sleep

`TurnSleepEvent` is emitted when a state-machine poll or timer state needs to wait. It carries a `wakeAt` Unix timestamp in milliseconds. The session layer (`Session`) persists the state and schedules a wall-clock wakeup:

```ts
// src/session/session.ts:549-567
private scheduleWake(terminal: Extract<TurnTerminalEvent, { type: "sleep" }>): void {
  this.cancelWake();
  const fire = (): void => {
    if (Date.now() < terminal.wakeAt) return;
    this.cancelWake();
    const state = this.runner.getState();
    if (!state || state.status !== "sleeping") return;
    this.dispatchTurn({ type: "wake" });
  };
  this.wakeTimer = setInterval(fire, WAKE_POLL_INTERVAL_MS);
  const remaining = terminal.wakeAt - Date.now();
  if (remaining < WAKE_POLL_INTERVAL_MS) {
    this.wakeFastPath = setTimeout(fire, Math.max(0, remaining));
  }
}
```

Sources: [src/session/session.ts:549-567](src/session/session.ts)

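Instead of relying on one long `setTimeout`, the session re-checks wall-clock time every 30 seconds (`WAKE_POLL_INTERVAL_MS = 30_000`). This lets a wake fire shortly after the machine resumes from OS sleep, during which monotonic timers pause. A one-shot timer covers deadlines that fall inside the first poll interval.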
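
The timing decisions inside `scheduleWake` can be isolated as two pure functions, which makes them easy to reason about without real timers. This is a sketch, not the repository's code: `isWakeDue` and `fastPathDelayMs` are hypothetical helper names, and only the 30-second interval is taken from the text above.

```ts
const POLL_INTERVAL_MS = 30_000; // same value as WAKE_POLL_INTERVAL_MS quoted above

// True once wall-clock time has reached the deadline.
function isWakeDue(nowMs: number, wakeAtMs: number): boolean {
  return nowMs >= wakeAtMs;
}

// Delay for the one-shot fast path, or undefined when the regular poll is soon enough.
function fastPathDelayMs(nowMs: number, wakeAtMs: number): number | undefined {
  const remaining = wakeAtMs - nowMs;
  return remaining < POLL_INTERVAL_MS ? Math.max(0, remaining) : undefined;
}

const nearDeadline = fastPathDelayMs(1_000, 6_000); // deadline 5 s away
const farDeadline = fastPathDelayMs(0, 60_000); // deadline 60 s away
const pastDeadline = fastPathDelayMs(10_000, 5_000); // deadline already passed
```

A deadline already in the past clamps to a delay of `0`, so the wake is dispatched on the next tick rather than being skipped.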

### interrupted

`TurnInterruptedEvent` is emitted when `runner.interrupt()` is called mid-turn. The runner marks `state.status = "interrupted"`, aborts the parent pi-agent, and clears all queues. If state-machine work was active, the controller records an interrupt marker on the session.

---

## During-Turn Events

These events stream while the runner is still working. They update the UI but do not end the turn.

| Event type | Payload | Purpose |
|---|---|---|
| `step` | `TurnStep` (text, reasoning, tool call, system) | Streaming agent progress |
| `todos` | `TurnTodo[]` | Updated work plan |
| `follow_up_queue` | `TurnFollowUpQueueEntry[]` | Updated buffer of pending user prompts |
| `state_machine` | `StateMachineSession` | Full session snapshot for Kanban rendering |
| `memory` | `ObservationalMemoryActivityEvent` | Memory observation/reflection activity |
| `usage` | `TurnUsageFields` | Running token cost after each LLM boundary |
| `system` | `level`, `message` | Diagnostic info, warnings, error notices |

Sources: [src/types/protocol.ts:677-684](src/types/protocol.ts)
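
A consumer of the event stream needs to tell during-turn events from terminal ones. A minimal sketch of that split, assuming events are discriminated by a `type` string as in the tables on this page (`SketchEvent`, `isTerminalEvent`, and `consumeTurn` are hypothetical names):

```ts
const TERMINAL_TYPES = ["complete", "ask", "sleep", "interrupted"] as const;
type TerminalType = (typeof TERMINAL_TYPES)[number];
type DuringType =
  | "step"
  | "todos"
  | "follow_up_queue"
  | "state_machine"
  | "memory"
  | "usage"
  | "system";

interface SketchEvent {
  type: TerminalType | DuringType;
}

// Type guard: narrows an event to one of the four terminal types.
function isTerminalEvent(event: SketchEvent): event is SketchEvent & { type: TerminalType } {
  return (TERMINAL_TYPES as readonly string[]).indexOf(event.type) !== -1;
}

// Count streamed events until the first terminal event ends the turn.
function consumeTurn(events: SketchEvent[]): { streamed: number; terminal?: TerminalType } {
  let streamed = 0;
  for (const event of events) {
    if (isTerminalEvent(event)) return { streamed, terminal: event.type };
    streamed += 1;
  }
  return { streamed }; // stream ended without a terminal event
}

const turnSummary = consumeTurn([
  { type: "step" },
  { type: "usage" },
  { type: "todos" },
  { type: "sleep" },
]);
```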

---

## Turn Lifecycle Sequence

```mermaid
sequenceDiagram
    participant Caller as CLI / TUI / Session
    participant Runner as TurnRunner
    participant Agent as pi-agent (parent)
    participant SM as StateMachineController

    Caller->>Runner: start({ type:"start", state? })
    Runner->>Runner: ensureMemoryLoaded(), ensureSkillsLoaded()
    Runner-->>Caller: emit turn_started (TurnState)

    Caller->>Runner: turn({ type:"prompt", message, behavior })
    Runner->>Agent: agent.prompt(text, images)
    Agent-->>Runner: step events (text_delta, tool_call, ...)
    Runner-->>Caller: emit step, todos, usage (during events)
    Agent-->>Runner: message_end (usage)
    Runner->>SM: runDecision / runState (if state-machine mode)
    SM-->>Runner: StateMachineExecutionResult
    Runner-->>Caller: emit state_machine, usage
    Runner-->>Caller: emit complete | ask | sleep | interrupted (terminal)
    Caller->>Caller: persist state.json
```

Sources: [src/turn-runner/turn-runner.ts:342-390](src/turn-runner/turn-runner.ts), [src/session/session.ts:436-455](src/session/session.ts)

---

## State Persistence and Resume

The `Session` class is the persistence owner. After every terminal event it writes `state.json` inside the session's directory (`~/.duet/sessions/<id>/state.json`):

```ts
// src/session/session.ts:652-663
private async writeStoredEnvelope(state: TurnState): Promise<void> {
  const payload: StoredSessionFile = {
    sessionId: this.id,
    updatedAt: Date.now(),
    state,
    sessionCostUsd: this.sessionCostUsd,
  };
  if (this.lastUsage !== undefined) payload.lastUsage = this.lastUsage;
  await writeFile(this.sessionFilePath(), `${JSON.stringify(payload, null, 2)}\n`, "utf-8");
}
```

Sources: [src/session/session.ts:652-663](src/session/session.ts)

On resume, `Session.start()` reads `state.json`, passes the stored `TurnState` through `TurnStartCommand.state`, and re-arms the wake timer when `state.status === "sleeping"`.
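
The resume path can be summarized as "parse the envelope, build a `start` command, decide whether to re-arm the timer". The sketch below models only that decision; `planResume`, `ResumePlan`, and `StoredEnvelopeSketch` are hypothetical, and the envelope is reduced to the fields shown in `writeStoredEnvelope`.

```ts
interface StoredEnvelopeSketch {
  sessionId: string;
  updatedAt: number;
  state: { status: string };
  sessionCostUsd: number;
}

interface ResumePlan {
  startCommand: { type: "start"; state: { status: string } };
  rearmWake: boolean;
}

// Turn the text of a stored state.json into the inputs a resume needs.
function planResume(fileText: string): ResumePlan {
  const envelope = JSON.parse(fileText) as StoredEnvelopeSketch;
  return {
    startCommand: { type: "start", state: envelope.state },
    // Only a sleeping session needs its wake timer re-armed.
    rearmWake: envelope.state.status === "sleeping",
  };
}

const storedText = JSON.stringify({
  sessionId: "abc123",
  updatedAt: 0,
  state: { status: "sleeping" },
  sessionCostUsd: 0.12,
});
const resumePlan = planResume(storedText);
```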

The `SessionManager` creates or resumes sessions by session id. New sessions get a fresh `nanoid`-based id; resumed sessions load their stored state:

```ts
// src/session/session-manager.ts:86-92
resume(sessionId: string): Session {
  const existing = this.sessions.get(sessionId);
  if (existing) return existing;
  const session = this.createSession(sessionId, true);
  this.sessions.set(sessionId, session);
  return session;
}
```

Sources: [src/session/session-manager.ts:86-92](src/session/session-manager.ts)

---

## Usage Accounting

Token usage is tracked across every LLM boundary (parent worker plus each state-machine agent). The `addUsage` function is a pure accumulator: it returns a new object instead of mutating either argument. Earlier in-place mutation semantics were deliberately replaced to eliminate bugs where usage was silently discarded:

```ts
// src/turn-runner/usage-accounting.ts:12-33
export function addUsage(
  a: TurnTokenUsage | undefined,
  b: TurnTokenUsage | undefined,
): TurnTokenUsage | undefined {
  if (!a && !b) return undefined;
  if (!a) return cloneUsage(b!);
  if (!b) return cloneUsage(a);
  return {
    input: a.input + b.input,
    output: a.output + b.output,
    cacheRead: a.cacheRead + b.cacheRead,
    cacheWrite: a.cacheWrite + b.cacheWrite,
    totalTokens: a.totalTokens + b.totalTokens,
    cost: { input: ..., output: ..., cacheRead: ..., cacheWrite: ..., total: ... },
  };
}
```

Sources: [src/turn-runner/usage-accounting.ts:12-33](src/turn-runner/usage-accounting.ts)
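
The excerpt above elides the `cost` sums. A self-contained sketch of the same accumulator pattern follows, assuming each cost field is summed the same way as the token counts; `UsageSketch`, `cloneUsageSketch`, and `addUsageSketch` are hypothetical names, and the repository's `cloneUsage` may differ.

```ts
interface UsageCostSketch {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
  total: number;
}

interface UsageSketch {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
  totalTokens: number;
  cost: UsageCostSketch;
}

// Copy one level deeper than a spread so the nested cost object is not shared.
function cloneUsageSketch(u: UsageSketch): UsageSketch {
  return { ...u, cost: { ...u.cost } };
}

// Pure accumulator: never mutates its arguments.
function addUsageSketch(a?: UsageSketch, b?: UsageSketch): UsageSketch | undefined {
  if (!a && !b) return undefined;
  if (!a) return cloneUsageSketch(b!);
  if (!b) return cloneUsageSketch(a);
  return {
    input: a.input + b.input,
    output: a.output + b.output,
    cacheRead: a.cacheRead + b.cacheRead,
    cacheWrite: a.cacheWrite + b.cacheWrite,
    totalTokens: a.totalTokens + b.totalTokens,
    cost: {
      // Assumption: cost fields are summed field by field.
      input: a.cost.input + b.cost.input,
      output: a.cost.output + b.cost.output,
      cacheRead: a.cost.cacheRead + b.cost.cacheRead,
      cacheWrite: a.cost.cacheWrite + b.cost.cacheWrite,
      total: a.cost.total + b.cost.total,
    },
  };
}

const parentUsage: UsageSketch = {
  input: 10, output: 5, cacheRead: 2, cacheWrite: 1, totalTokens: 18,
  cost: { input: 1, output: 2, cacheRead: 0, cacheWrite: 0, total: 3 },
};
const stateAgentUsage: UsageSketch = {
  input: 7, output: 3, cacheRead: 1, cacheWrite: 1, totalTokens: 12,
  cost: { input: 2, output: 1, cacheRead: 0, cacheWrite: 0, total: 3 },
};
const combinedUsage = addUsageSketch(parentUsage, stateAgentUsage);
```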

The protocol exposes two distinct token totals in `TurnUsageFields`:

- **`turnUsage`**: cumulative sum across every LLM call in the turn (parent + all state agents). Use this for cost accounting.
- **`lastMessageUsage`**: exact provider-reported usage of the most recent parent call. Use this for context-window pressure display.

`contextWindowUsage` provides a heuristic segment breakdown (`systemPrompt`, `messages`, `localMemory`, `globalMemory`) that `scaleContextWindowUsageToTotalTokens` rescales to sum exactly to `lastMessageUsage.totalTokens` before emission.

Sources: [src/types/protocol.ts:581-620](src/types/protocol.ts)
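
To illustrate what "rescales to sum exactly" involves, the sketch below scales heuristic segment estimates proportionally and distributes rounding leftovers by largest fractional part. This is one plausible approach, not necessarily how `scaleContextWindowUsageToTotalTokens` works; `scaleSegmentsToTotal` and `SegmentsSketch` are hypothetical names.

```ts
type SegmentKey = "systemPrompt" | "messages" | "localMemory" | "globalMemory";
type SegmentsSketch = Record<SegmentKey, number>;

// Scale estimates so the integer segments sum exactly to totalTokens.
// Returns all zeros when there is no positive estimate or total to scale.
function scaleSegmentsToTotal(estimate: SegmentsSketch, totalTokens: number): SegmentsSketch {
  const keys: SegmentKey[] = ["systemPrompt", "messages", "localMemory", "globalMemory"];
  const sum = keys.reduce((acc, key) => acc + estimate[key], 0);
  const scaled: SegmentsSketch = { systemPrompt: 0, messages: 0, localMemory: 0, globalMemory: 0 };
  if (sum <= 0 || totalTokens <= 0) return scaled;

  // Floor each proportional share and track how many tokens are left over.
  const shares = keys.map((key) => ({ key, exact: (estimate[key] * totalTokens) / sum }));
  let remaining = totalTokens;
  for (const share of shares) {
    scaled[share.key] = Math.floor(share.exact);
    remaining -= scaled[share.key];
  }

  // Hand leftover tokens to the segments with the largest fractional parts.
  shares.sort((x, y) => (y.exact - Math.floor(y.exact)) - (x.exact - Math.floor(x.exact)));
  for (const share of shares) {
    if (remaining <= 0) break;
    scaled[share.key] += 1;
    remaining -= 1;
  }
  return scaled;
}

const scaledSegments = scaleSegmentsToTotal(
  { systemPrompt: 10, messages: 20, localMemory: 5, globalMemory: 5 },
  100,
);
```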

---

## Initial State Creation and snapshotState

`createInitialTurnState` produces the zeroed snapshot for a fresh session:

```ts
// src/turn-runner/turn-state.ts:9-18
export function createInitialTurnState(mode: TurnMode, options?: TurnOptions): TurnState {
  return {
    status: "running",
    mode,
    options,
    agent: { status: "running", messages: [] },
  };
}
```

Sources: [src/turn-runner/turn-state.ts:9-18](src/turn-runner/turn-state.ts)

The runner's `snapshotState` method is the single choke point that reconciles in-flight pi-agent messages, the `StateMachineController`'s live session, and the current todo/followUpQueue/queuedCommands arrays into one consistent snapshot. Every state leaving the runner — via emit, return, or `getState()` — passes through this method:

```ts
// src/turn-runner/turn-runner.ts:1004-1022
private snapshotState(state: TurnState): TurnState {
  const parentAgent = this.parentAgent
    ? { ...state.agent, status: state.agent.status, messages: this.parentAgent.state.messages }
    : state.agent;
  const snapshot: TurnState = {
    ...state,
    agent: parentAgent,
    stateMachine: this.stateMachineController.getSession(),
    todos: copyOptionalArray(state.todos ?? this.state?.todos),
    followUpQueue: copyOptionalArray(state.followUpQueue ?? this.state?.followUpQueue),
    queuedCommands: copyOptionalArray(state.queuedCommands ?? this.state?.queuedCommands),
  };
  return this.applyAutoStateCompaction(snapshot);
}
```

Sources: [src/turn-runner/turn-runner.ts:1004-1022](src/turn-runner/turn-runner.ts)

Auto-compaction is enabled by default and evicts the oldest messages when the state snapshot exceeds the 100 MB ceiling, preventing unbounded `state.json` growth from wedging persistence.
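
The eviction idea can be sketched as "drop the oldest message until the serialized snapshot fits". The code below is an assumption-laden illustration: `compactOldest` and `serializedSize` are hypothetical, messages are plain strings, and size is approximated by the JSON text length rather than by whatever measure `applyAutoStateCompaction` uses.

```ts
interface CompactableState {
  agent: { messages: string[] };
}

// Size proxy: length of the JSON text (the real check may measure bytes instead).
function serializedSize(state: CompactableState): number {
  return JSON.stringify(state).length;
}

// Return a copy of the state with the oldest messages evicted until it fits the ceiling.
function compactOldest(state: CompactableState, ceiling: number): CompactableState {
  const messages = [...state.agent.messages]; // copy so the input is never mutated
  const candidate: CompactableState = { ...state, agent: { ...state.agent, messages } };
  while (messages.length > 0 && serializedSize(candidate) > ceiling) {
    messages.shift(); // evict the oldest message first
  }
  return candidate;
}

const oversized: CompactableState = {
  agent: { messages: ["aaaaaaaaaa", "bbbbbbbbbb", "cc"] },
};
// The full snapshot serializes to 55 characters, so a ceiling of 45 forces one eviction.
const compacted = compactOldest(oversized, 45);
```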

---

## Invariants and Failure Modes

| Invariant | Where enforced |
|---|---|
| Exactly one terminal event per turn chain | `runTurnChain` emits after `drainQueuedTurnCommands` completes |
| `start` must precede `turn` | `requireStarted()` throws otherwise |
| Only one parent agent worker active at a time | `parentAgentRunning` guard in `runAgentWorker` |
| `wake` on a non-sleeping session is a no-op | Early return with `"Nothing to wake."` |
| Interrupted turn cost is not persisted | `sessionCostUsd` only increments on the terminal event |
| Sleeping `state.status` restores after a user prompt when the state machine is still waiting | `restoreSleepAfterTurn` flag in `Session` |

Because every turn must end with a terminal event, callers can safely `await runner.turn(command)` and then persist the returned `state`. The snapshot is consistent and complete at that boundary, whether the turn produced agent work, a state-machine transition, or an immediate `ask`.

Sources: [src/turn-runner/turn-runner.ts:364-390](src/turn-runner/turn-runner.ts), [src/session/session.ts:481-500](src/session/session.ts)
