# How Centaur Works in Your Head

> The simplest mental model of Centaur: one shared agent, one sandbox per Slack thread, durable state so nothing is lost on restart, and iron-proxy so agents never touch raw credentials. Understand this page and every other page falls into place.

- Repository: paradigmxyz/centaur
- GitHub: https://github.com/paradigmxyz/centaur
- Human wiki: https://grok-wiki.com/public/wiki/paradigmxyz-centaur-57fc6b2755e2
- Complete Markdown: https://grok-wiki.com/public/wiki/paradigmxyz-centaur-57fc6b2755e2/llms-full.txt

---

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:

- [README.md](README.md)
- [AGENTS.md](AGENTS.md)
- [docs/pages/architecture.mdx](docs/pages/architecture.mdx)
- [docs/pages/what-is-centaur.mdx](docs/pages/what-is-centaur.mdx)
- [services/api/api/agent.py](services/api/api/agent.py)
- [services/api/api/sandbox/base.py](services/api/api/sandbox/base.py)
- [services/api/api/sandbox/harness_protocol.py](services/api/api/sandbox/harness_protocol.py)
- [services/sandbox/entrypoint.sh](services/sandbox/entrypoint.sh)
- [services/iron-proxy/iron-proxy.yaml](services/iron-proxy/iron-proxy.yaml)
</details>

Centaur has four invariants. Understand them and every subsystem — sandboxes, durable state, tools, credential injection — falls into a coherent picture. This page builds that picture deliberately: start with the invariants, trace a single Slack message through the full system, then inspect how each invariant holds under failure.

One shared agent for the whole team. One Kubernetes sandbox per Slack thread. Postgres as the single source of truth so nothing is lost on restart. And iron-proxy so agents never see raw credentials.

---

## The Four Invariants

| # | Invariant | What it guarantees |
|---|-----------|------------------|
| 1 | **One shared agent** | Every team member talks to the same Centaur Slack bot rather than spinning up one-off local setups |
| 2 | **One sandbox per thread** | Conversations cannot bleed into each other; each thread gets its own Kubernetes pod with its own filesystem, shell, and process tree |
| 3 | **Durable state in Postgres** | A client disconnect, API restart, or pod replacement does not erase a running turn; every step is stored before any response is sent |
| 4 | **iron-proxy for credentials** | Sandbox pods receive only placeholder strings; real API keys are injected on the wire by a per-sandbox proxy, bound to specific upstream hosts |

Sources: [docs/pages/what-is-centaur.mdx:1-44](), [README.md:186-194]()

---

## The Anatomy of One Request

This is what happens when a user types `@centaur why are the billing tests failing?` in Slack.

```text
┌─────────────────────────────────────────────────────────────┐
│  Slack                                                      │
│  @centaur why are the billing tests failing?                │
└────────────────────┬────────────────────────────────────────┘
                     │ HMAC-verified webhook
                     ▼
┌─────────────────────────────────────────────────────────────┐
│  Slackbot  (Next.js + Slack Bolt)                           │
│  Verifies X-Slack-Signature, then calls Centaur API         │
└────────────────────┬────────────────────────────────────────┘
                     │ SLACKBOT_API_KEY
                     ▼
┌─────────────────────────────────────────────────────────────┐
│  Centaur API  (FastAPI + Postgres)                          │
│  POST /agent/spawn   → pin or create a sandbox              │
│  POST /agent/message → write user turn to Postgres          │
│  POST /agent/execute → create execution row, enqueue        │
│  GET  /agent/threads/{thread}/events → SSE stream           │
└────────────────────┬────────────────────────────────────────┘
                     │ kubectl exec attach (stdin/stdout NDJSON)
                     ▼
┌─────────────────────────────────────────────────────────────┐
│  Kubernetes Sandbox Pod  (centaur-agent:latest)             │
│  Harness CLI: amp | claude-code | codex | pi-mono           │
│  Reads workspace/AGENTS.md as system prompt                 │
│  Tools: curl $CENTAUR_API_URL/tools/<tool>/<method>         │
│  LLM calls: HTTPS → iron-proxy → real API key injected      │
└────────────────────┬────────────────────────────────────────┘
                     │ SSE events back to Slackbot
                     ▼
              Slack thread reply
```

Sources: [AGENTS.md:70-80](), [services/api/api/agent.py:1-13]()

---

## Invariant 1: One Shared Agent

The Slack bot is a thin adapter that owns exactly two responsibilities: verifying incoming Slack requests with `SLACK_SIGNING_SECRET` (HMAC-SHA256), and translating Slack events into the three-step API protocol. It does not hold agent state, model context, or tool registrations.

Every team member's message reaches the same Centaur API. The API owns runtime assignment, execution serialization, cancellation, and delivery recovery. Clients are intentionally kept thin so there is no agent state to synchronize across team members.
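
The HMAC check described above follows Slack's published request-signing scheme. The sketch below shows that scheme in Python for illustration only: the real Slackbot is a Next.js + Slack Bolt service, and the function name and parameters here are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_slack_signature(
    signing_secret: str,
    timestamp: str,
    raw_body: bytes,
    signature: str,
    *,
    now: Optional[float] = None,
    max_skew_s: int = 300,
) -> bool:
    """Check an X-Slack-Signature header against the raw request body."""
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    current = time.time() if now is None else now
    # Reject stale requests so a captured webhook cannot be replayed later.
    if abs(current - sent_at) > max_skew_s:
        return False
    # Slack signs the string "v0:<timestamp>:<raw body>" with the signing secret.
    base = b"v0:" + timestamp.encode() + b":" + raw_body
    digest = hmac.new(signing_secret.encode(), base, hashlib.sha256).hexdigest()
    expected = "v0=" + digest
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

The `timestamp` argument corresponds to the `X-Slack-Request-Timestamp` header, and the body must be the exact bytes received, before any JSON parsing.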

Sources: [docs/pages/architecture.mdx:47-60](), [AGENTS.md:85-88]()

---

## Invariant 2: One Sandbox Per Thread

The `thread_key` is the fundamental unit of identity in Centaur. For Slack, it looks like `slack:C0AJ07U8Z1N:1773364194.179929` — channel plus message timestamp. The API maps exactly one active Kubernetes pod to each `thread_key` at a time.
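
As a small illustration of that key shape, the sketch below builds and parses a Slack `thread_key`. The helper names are hypothetical; only the `slack:<channel>:<timestamp>` layout comes from the example above.

```python
from typing import NamedTuple


class SlackThread(NamedTuple):
    channel: str
    thread_ts: str


def make_thread_key(channel: str, thread_ts: str) -> str:
    """Join the channel and timestamp into the key format shown above."""
    return f"slack:{channel}:{thread_ts}"


def parse_thread_key(key: str) -> SlackThread:
    """Split a Slack thread key back into its parts, or raise ValueError."""
    # maxsplit=2 keeps the timestamp intact even though it contains a dot.
    source, channel, thread_ts = key.split(":", 2)
    if source != "slack" or not channel or not thread_ts:
        raise ValueError(f"not a Slack thread key: {key!r}")
    return SlackThread(channel, thread_ts)
```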

The core logic is in `get_or_spawn()`:

```python
# services/api/api/agent.py:613-750 (condensed)
async def get_or_spawn(thread_key, harness, *, engine, persona) -> SandboxSession:
    """Tries (in order): DB session → warm pool → cold spawn."""
    session = await _db_get_session(thread_key)
    if session and session.db_state in _REUSABLE_DB_STATES:
        if await backend.status(session) == "running":
            return session          # existing pod → return immediately
    # pod is gone: save resume state, then try warm pool or cold spawn
    ...
```

A warm pool (`warm_pool.py`) pre-creates pods so common spawns take milliseconds instead of waiting for Kubernetes scheduling. If the warm pool is empty, the API cold-spawns a new pod. Either way, the `sandbox_sessions` table records the binding.
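
The fallback order can be pictured with an in-memory stand-in. This is a sketch under stated assumptions: the `SandboxPool` class, its fields, and the pod names are hypothetical, and the real implementation talks to Kubernetes and the `sandbox_sessions` table rather than to a dict.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, Tuple


@dataclass
class SandboxPool:
    warm: Deque[str] = field(default_factory=deque)       # pre-created pod names
    bindings: Dict[str, str] = field(default_factory=dict)  # thread_key -> pod
    cold_spawns: int = 0

    def get_or_spawn(self, thread_key: str) -> Tuple[str, str]:
        """Return (pod, how), trying existing binding, warm pool, then cold spawn."""
        if thread_key in self.bindings:
            return self.bindings[thread_key], "existing"
        if self.warm:
            pod, how = self.warm.popleft(), "warm"
        else:
            # Nothing pre-created: simulate a slow cold spawn.
            self.cold_spawns += 1
            pod, how = f"cold-{self.cold_spawns}", "cold"
        # Either way, record the binding so the next call reuses this pod.
        self.bindings[thread_key] = pod
        return pod, how
```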

### What a sandbox pod contains

The entrypoint (`services/sandbox/entrypoint.sh`) runs at pod startup and:
1. Writes harness-specific config files (Amp settings, Codex config, Claude settings)
2. Clones the target repo into `~/workspace/` as a fresh branch
3. Assembles `workspace/AGENTS.md` from the base system prompt plus any org overlay and persona overlay
4. Copies skills into `workspace/.agents/skills/`
5. Touches `~/.ready` to signal readiness

The harness CLI (Amp, Claude Code, Codex, or pi-mono) then reads `workspace/AGENTS.md` as its system instructions and begins waiting for `turn.start` input on stdin.

Sources: [services/api/api/agent.py:613-750](), [services/sandbox/entrypoint.sh:1-208](), [AGENTS.md:436-470]()

---

## Invariant 3: Durable State in Postgres

This is the invariant that lets Centaur survive every common failure: client disconnect, API restart, pod death, workflow worker restart.

### The seven-table contract

Every step of a turn writes to Postgres **before** any response is sent to the caller. The tables are:

| Table | What it stores |
|-------|----------------|
| `sandbox_sessions` | Thread-to-pod binding; inflight turn payload; last result cursor |
| `agent_runtime_assignments` | Thread-to-runtime pin and `assignment_generation` |
| `agent_message_requests` | Durable inbound transcript events |
| `agent_execution_requests` | Queued/running/terminal execution row |
| `agent_execution_events` | Replayable raw + projected execution events |
| `agent_final_delivery_outbox` | Final-result delivery obligation for retry paths |
| `chat_messages` | Persisted user/assistant messages for durable transcript surfaces |

Sources: [AGENTS.md:166-176](), [services/api/api/agent.py:268-298]()

### The three-step client protocol

```text
POST /agent/spawn   → writes agent_runtime_assignments
POST /agent/message → writes agent_message_requests + chat_messages
POST /agent/execute → writes agent_execution_requests
GET  /agent/threads/{thread}/events → tails agent_execution_events (SSE)
```

Clients reconnect with `after_event_id` instead of restarting work. If the execution already finished and no more rows remain, the API emits the terminal `execution_state` snapshot so late joiners still get the answer.
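
The reconnect rule can be sketched as a pure function over a stored event log. The dict shape (`id`, `type`, `state`) and the function name are assumptions made for illustration; the page only states that clients pass `after_event_id` and that late joiners receive a terminal `execution_state` snapshot.

```python
from typing import Dict, List, Optional


def events_after(
    log: List[Dict],
    after_event_id: int,
    terminal_state: Optional[str] = None,
) -> List[Dict]:
    """Return the events a reconnecting client still needs to see."""
    # Replay only rows the client has not acknowledged yet.
    remaining = [event for event in log if event["id"] > after_event_id]
    if remaining:
        return remaining
    # Nothing left to replay: if the execution already finished, hand back a
    # snapshot so a late joiner still learns the outcome.
    if terminal_state is not None:
        return [{"type": "execution_state", "state": terminal_state}]
    return []
```

Because the cursor lives with the client and the rows live in Postgres, a dropped SSE connection costs only a re-read, not a re-run.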

### Inflight turn durability

Before writing to sandbox stdin, the API persists the full turn payload:

```python
# services/api/api/agent.py:1159-1165
durable_turn_id = f"turn-{uuid.uuid4().hex[:16]}"
await _db_set_inflight_turn(
    session.thread_key,
    durable_turn_id,
    turn_input,
    attempts=1,
)
```

If the pod dies mid-turn, `replay_inflight_turn()` re-sends the saved payload into the replacement pod. The attempt counter increments so operators can detect stuck retries.
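
A minimal in-memory sketch of that persist-then-replay pattern follows. The `InflightStore` class and its methods are hypothetical stand-ins for the Postgres-backed helpers; only the `turn-<16 hex chars>` ID shape and the incrementing attempt counter come from the text above.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class InflightTurn:
    turn_id: str
    turn_input: dict
    attempts: int = 1


class InflightStore:
    """Stands in for the durable inflight-turn columns in Postgres."""

    def __init__(self) -> None:
        self._rows: Dict[str, InflightTurn] = {}

    def set(self, thread_key: str, turn_input: dict) -> InflightTurn:
        # Persist the payload before anything is written to sandbox stdin.
        turn = InflightTurn(f"turn-{uuid.uuid4().hex[:16]}", turn_input)
        self._rows[thread_key] = turn
        return turn

    def replay(self, thread_key: str, send: Callable[[dict], None]) -> Optional[InflightTurn]:
        turn = self._rows.get(thread_key)
        if turn is None:
            return None
        # Count the retry first so a crash during send is still visible.
        turn.attempts += 1
        send(turn.turn_input)
        return turn

    def clear(self, thread_key: str) -> None:
        self._rows.pop(thread_key, None)
```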

### Reconciliation tick

Every 60 seconds, `reconcile_tick()` walks active `sandbox_sessions` rows and:
- Marks sessions `suspended` when the backing pod is gone (Step A)
- Enforces the idle TTL (`IDLE_TTL_S`, default 24 hours) by stopping pods that have been quiet (Step B)
- Reaps `running` rows with no live wire, turn, or execution (Step C)
- Reaps stuck inflight turns with no driving execution (Step D)
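
The four steps can be read as a classification over one session row. The sketch below is an interpretation, not the real `reconcile_tick()`: the `SessionRow` fields and action labels are hypothetical, and how the actual function orders or combines the checks is not shown on this page.

```python
from dataclasses import dataclass
from typing import List

IDLE_TTL_S = 24 * 60 * 60  # default idle TTL stated above


@dataclass
class SessionRow:
    db_state: str            # e.g. "running"
    pod_alive: bool
    idle_s: float            # seconds since last activity
    has_wire: bool = False
    has_turn: bool = False
    has_execution: bool = False
    has_inflight_turn: bool = False


def reconcile_actions(row: SessionRow, idle_ttl_s: float = IDLE_TTL_S) -> List[str]:
    """Return the step labels that would apply to this row."""
    if not row.pod_alive:
        return ["A:suspend"]            # pod is gone, nothing else to check
    actions = []
    if row.idle_s > idle_ttl_s:
        actions.append("B:stop_idle")
    if row.db_state == "running" and not (row.has_wire or row.has_turn or row.has_execution):
        actions.append("C:reap_running")
    if row.has_inflight_turn and not row.has_execution:
        actions.append("D:reap_inflight")
    return actions
```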

Sources: [services/api/api/agent.py:1516-1736]()

---

## Invariant 4: iron-proxy — No Raw Credentials in Sandboxes

This is the invariant that gives Centaur its security model. Sandbox pods are started with **stub** values for all third-party API keys. The real values live on a per-sandbox [iron-proxy](https://docs.iron.sh) pod.

### How it works

```text
Sandbox pod
  harness CLI
     │
     │ HTTPS (HTTPS_PROXY=http://firewall:8080)
     ▼
iron-proxy pod  (TLS MITM)
     │ matches outbound host and header
     │ replaces stub value with real credential from secrets service
     ▼
Upstream API  (api.anthropic.com, api.openai.com, api.github.com, …)
```

The proxy config (`iron-proxy.yaml`) sets `tls.mode: mitm` and issues a MITM CA cert to the sandbox at startup. The proxy's `transforms` block defines an allowlist of headers it will pass through — anything not on the list is stripped before the request leaves the cluster.

The credential injection map is managed by `firewall-manager` in the API. Each binding says: "for requests to `api.anthropic.com`, replace the `x-api-key` header with the real Anthropic key."

Agents and tool plugins refer to credentials by name (`secret("ANTHROPIC_API_KEY")`). The tool SDK returns the placeholder string. The proxy does the actual substitution at the network layer, so the plaintext key is never materialized in the pod's memory or filesystem.
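
The host-bound substitution can be illustrated with a few lines of Python. This is a conceptual sketch, not iron-proxy's implementation or config format: the `Binding` type, the `inject` function, and the placeholder values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Binding:
    host: str          # upstream host the credential is bound to
    header: str        # header that carries the credential
    placeholder: str   # stub value the sandbox sees
    real_value: str    # secret known only to the proxy


def inject(host: str, headers: Dict[str, str], bindings: Iterable[Binding]) -> Dict[str, str]:
    """Return a copy of headers with stubs swapped for real values on bound hosts."""
    out = dict(headers)
    for binding in bindings:
        # A binding applies only to its own upstream host, so a stub sent
        # anywhere else leaves the proxy unchanged and useless.
        if binding.host != host:
            continue
        for name, value in headers.items():
            if name.lower() == binding.header.lower():
                out[name] = value.replace(binding.placeholder, binding.real_value)
    return out
```

The host check is the important part of the design: an agent that tries to send its placeholder to an unrelated server never causes the real key to be attached.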

Sources: [AGENTS.md:476-520](), [services/iron-proxy/iron-proxy.yaml:1-82](), [services/sandbox/entrypoint.sh:6-8](), [README.md:186-194]()

---

## How a Harness Talks to the API

Inside the pod, the agent harness calls tools via a bash helper (`/usr/local/bin/call`):

```bash
call slack get_channel_history '{"channel":"general"}'
# → POST $CENTAUR_API_URL/tools/slack/get_channel_history
```

The sandbox token (`sbx1.*` prefix, 2h TTL, HMAC-signed) is injected as `CENTAUR_API_KEY` at pod creation time and refreshed on each new turn. Tools never receive raw upstream secrets — they call `secret("NAME")` and the proxy handles the rest.
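
To make "HMAC-signed with a 2h TTL" concrete, here is one way such a token could be minted and checked. Only the `sbx1` prefix, the two-hour lifetime, and the use of HMAC come from this page; the `prefix.payload.signature` layout, the claim names, and both function names are assumptions and may differ from Centaur's real token format.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

PREFIX = "sbx1"
TTL_S = 2 * 60 * 60


def mint_token(secret: bytes, thread_key: str, now: float) -> str:
    """Build a signed token that expires TTL_S seconds after `now`."""
    claims = {"thread_key": thread_key, "exp": int(now) + TTL_S}
    payload = json.dumps(claims, separators=(",", ":")).encode()
    body = base64.urlsafe_b64encode(payload).decode().rstrip("=")
    sig = hmac.new(secret, f"{PREFIX}.{body}".encode(), hashlib.sha256).hexdigest()
    return f"{PREFIX}.{body}.{sig}"


def verify_token(secret: bytes, token: str, now: float) -> Optional[dict]:
    """Return the claims if the signature is valid and unexpired, else None."""
    parts = token.split(".")
    if len(parts) != 3 or parts[0] != PREFIX:
        return None
    _, body, sig = parts
    expected = hmac.new(secret, f"{PREFIX}.{body}".encode(), hashlib.sha256).hexdigest()
    # Verify the signature before trusting anything inside the payload.
    if not hmac.compare_digest(expected.encode(), sig.encode()):
        return None
    padded = body + "=" * (-len(body) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    if claims["exp"] <= now:
        return None
    return claims
```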

The wire between the API and the sandbox is **stdin/stdout NDJSON in Anthropic message format**:

```text
→ stdin:  {"type":"turn.start","turn_id":1,"text":"why are billing tests failing?"}
← stdout: {"type":"assistant","message":{"role":"assistant","content":[...]}}
← stdout: {"type":"result","subtype":"success","result":"The tests fail because..."}
← stdout: {"type":"turn.done","turn_id":1,"result":"..."}
```

Harness-specific translation (materializing images to files for Amp, extracting text for Codex) happens inside the sandbox adapter (`harness_session.py`). The API and all clients always speak the canonical Anthropic format.
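
Reading that wire is a matter of parsing one JSON object per line until the matching `turn.done` arrives. The sketch below uses the message shapes from the example above; the helper names are hypothetical and the real API side also handles timeouts, cancellation, and persistence.

```python
import json
from typing import Iterable, Iterator


def turn_start(turn_id: int, text: str) -> str:
    """Serialize a turn.start message as a single NDJSON line."""
    message = {"type": "turn.start", "turn_id": turn_id, "text": text}
    return json.dumps(message, separators=(",", ":")) + "\n"


def read_turn(lines: Iterable[str], turn_id: int) -> Iterator[dict]:
    """Yield parsed events until the turn.done for `turn_id` has been seen."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        event = json.loads(line)
        yield event
        if event.get("type") == "turn.done" and event.get("turn_id") == turn_id:
            return
```

In use, `lines` would be the sandbox's stdout stream and each yielded event would be stored before being forwarded to clients.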

Sources: [AGENTS.md:178-207](), [services/api/api/sandbox/harness_protocol.py:1-60]()

---

## Failure Modes and Recovery

| Failure | What breaks | How it recovers |
|---------|-------------|-----------------|
| Client disconnects mid-stream | SSE connection drops | Reconnect with `after_event_id`; API replays from `agent_execution_events` |
| API restarts | In-memory `_runtime` dict is cleared | Rebuilt lazily from `sandbox_sessions` on next request |
| Sandbox pod dies mid-turn | Active turn is lost in-memory | `inflight_turn_input` in Postgres; `replay_inflight_turn()` re-sends to replacement pod |
| Workflow worker restarts | Running handler state is lost | Handler re-runs top-to-bottom; `ctx.step()` returns cached results for completed steps |
| iron-proxy restarts | Credential injection pauses | Key-injection map rebuilt from secrets-service cache |
| Pod capacity limit reached | New spawn fails | `_evict_idle_sessions_for_capacity()` stops oldest idle pods before cold-spawning |
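
The workflow-worker row relies on step memoization: a handler can re-run from the top because completed steps return their recorded results. A minimal sketch of that idea follows, assuming a `step(name, fn)` signature and a dict-backed journal; the real `ctx.step()` API and its storage are not shown on this page.

```python
from typing import Any, Callable, Dict, List


class Ctx:
    """Toy workflow context whose journal stands in for durable storage."""

    def __init__(self, journal: Dict[str, Any]) -> None:
        self.journal = journal

    def step(self, name: str, fn: Callable[[], Any]) -> Any:
        # A completed step is never executed twice; its result is replayed.
        if name in self.journal:
            return self.journal[name]
        result = fn()
        self.journal[name] = result
        return result


def run_handler(ctx: Ctx, calls: List[str]) -> int:
    """Example handler; `calls` records which steps actually executed."""

    def fetch() -> int:
        calls.append("fetch")
        return 2

    def double() -> int:
        calls.append("double")
        return value * 2

    value = ctx.step("fetch", fetch)
    return ctx.step("double", double)
```

Running the handler a second time against the same journal returns the same answer without executing either step again.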

Sources: [docs/pages/architecture.mdx:109-118](), [services/api/api/agent.py:334-395]()

---

## The Overlay Model

Centaur is designed to be extended without forking. The base repo (`paradigmxyz/centaur`) provides the control plane, workflow engine, and sandbox runtime. An org-level overlay repo sits alongside it:

```text
your-deployment/
├── centaur/              ← this repo (kernel)
└── centaur-overlay/      ← org tools, workflows, skills, personas
```

The Helm chart mounts the overlay at `/app/overlay/org`. The sandbox entrypoint merges overlays into `workspace/AGENTS.md` in order: base prompt, then org overlay, then any persona overlay. Later entries win on name collision.

This lets teams add tools, personas, workflows, and prompt customizations without touching the core repository.
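
The "later entries win" rule amounts to an ordered dictionary merge. The sketch below assumes each layer is a mapping from an entry name to its definition; what counts as an entry (tool, skill, persona) and the function name are illustrative assumptions.

```python
from typing import Dict


def merge_layers(*layers: Dict[str, str]) -> Dict[str, str]:
    """Merge layers left to right; on a name collision the later layer wins."""
    merged: Dict[str, str] = {}
    for layer in layers:
        merged.update(layer)
    return merged
```

For example, merging a base layer that defines `search` and `deploy` with an org layer that redefines `deploy` keeps the base `search` and the org `deploy`.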

Sources: [AGENTS.md:338-348](), [services/sandbox/entrypoint.sh:175-197]()

---

## Summary

Centaur is a durable control plane built on four hard invariants: one shared bot, one isolated sandbox pod per Slack thread, Postgres as the single source of truth, and iron-proxy to keep real credentials away from agent code. Once you see that every step of a turn is written to Postgres before any response is sent, and that the sandbox pod only ever sees placeholder strings for credentials, the rest of the system (tools, workflows, overlays, persona injection, harness adapters) reads as a straightforward extension of those four commitments. The architecture doc captures this precisely: "the event stream is the client contract; Slack and other clients should reconnect with `after_event_id` instead of trying to reconstruct state locally."

Sources: [docs/pages/architecture.mdx:38-42]()
