# The Mental Model — How agentmemory Thinks

> The simplest useful model of the system: one long-lived Node process (the iii worker) owns all state, every agent interaction flows through REST or MCP, and memories are first-class objects with confidence scores, TTLs, and graph relationships — not raw text logs.

- Repository: rohitg00/agentmemory
- GitHub: https://github.com/rohitg00/agentmemory
- Human wiki: https://grok-wiki.com/public/wiki/rohitg00-agentmemory-94f173bce1dc
- Complete Markdown: https://grok-wiki.com/public/wiki/rohitg00-agentmemory-94f173bce1dc/llms-full.txt

## Source Files

- `src/index.ts`
- `src/types.ts`
- `src/config.ts`
- `src/state/schema.ts`
- `README.md`
- `iii-config.yaml`

---

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:

- [src/index.ts](src/index.ts)
- [src/types.ts](src/types.ts)
- [src/config.ts](src/config.ts)
- [src/state/schema.ts](src/state/schema.ts)
- [src/state/hybrid-search.ts](src/state/hybrid-search.ts)
- [src/functions/observe.ts](src/functions/observe.ts)
- [src/functions/remember.ts](src/functions/remember.ts)
- [src/triggers/api.ts](src/triggers/api.ts)
- [src/mcp/server.ts](src/mcp/server.ts)
- [iii-config.yaml](iii-config.yaml)
</details>

agentmemory is a long-lived Node.js memory service that captures every event in an agent's working session, promotes the most durable observations into typed Memory objects, and makes all of it retrievable through a triple-stream hybrid search (BM25 + vector + graph). After reading this page, you should be able to predict which data path any request takes, why memory objects behave differently from raw observations, and where the system degrades gracefully versus where it refuses to start.

The key architectural invariant: **one process owns all mutable state**. There is no distributed write path. Every agent hook, every REST call, and every MCP tool call ultimately routes through a single registered iii worker, which serializes writes through a keyed mutex and commits to a file-backed KV store. This design makes state consistent at the cost of being single-host only, and it is what makes the decay timers, dedup maps, and in-memory search indexes safe to reason about without distributed locking.
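
To make the "keyed mutex" idea concrete, here is a minimal sketch of one way to serialize async writes per key inside a single process. The `KeyedMutex` class and its `run` method are hypothetical names chosen for illustration; the project's actual mutex may be structured differently.

```ts
// Hypothetical sketch of a per-key async mutex; not the project's implementation.
class KeyedMutex {
  // Tail of the promise chain for each key; an absent entry means the key is idle.
  private tails = new Map<string, Promise<void>>();

  // Runs `task` only after every task queued earlier for the same key has settled.
  run<T>(key: string, task: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(key) ?? Promise.resolve();
    const result = previous.then(() => task());
    // The stored tail never rejects, so one failed write cannot poison the queue.
    const tail: Promise<void> = result.then(
      () => {},
      () => {},
    );
    this.tails.set(key, tail);
    // Drop the entry once the queue for this key has drained.
    tail.then(() => {
      if (this.tails.get(key) === tail) this.tails.delete(key);
    });
    return result;
  }
}

// Two writers target the same key; the second waits even though it is faster.
const mutex = new KeyedMutex();
const log: string[] = [];
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

const done = Promise.all([
  mutex.run("mem:memories", async () => {
    await sleep(20);
    log.push("first");
  }),
  mutex.run("mem:memories", async () => {
    log.push("second");
  }),
]);
```

Writes to different keys do not wait on each other in this sketch, which is why a keyed lock is cheaper than one global lock while still giving each key a strict write order.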

---

## 1. The iii Worker as the Spine

agentmemory is structured as an **iii (triple-i) worker**, registered at startup against the local iii engine via WebSocket:

```ts
// src/index.ts:166-186
const sdk = registerWorker(config.engineUrl, {
  workerName: "agentmemory",
  invocationTimeoutMs: 180000,
  // ... (remaining options elided)
});
```

The iii engine (configured in `iii-config.yaml`) is itself a local multi-service runner that provides:

| iii service | Purpose |
|-------------|---------|
| `iii-http` | REST API on port 3111 |
| `iii-state` | File-backed KV store (`./data/state_store.db`) |
| `iii-queue` | Async function dispatch queue |
| `iii-pubsub` | Local pub/sub |
| `iii-stream` | WebSocket streaming on port 3112 |
| `iii-cron` | Scheduled trigger runner |
| `iii-observability` | Metrics and trace collection |
| `iii-exec` | Hot-reload watcher (`src/**/*.ts` → `dist/index.mjs`) |

Sources: [iii-config.yaml:1-53]()

The worker registers every capability as a named function (e.g., `mem::observe`, `mem::remember`, `mem::smart-search`) and two trigger surfaces: REST endpoints and MCP endpoints. Both surfaces call the same underlying functions — the transport layer is a thin wrapper.

Sources: [src/index.ts:204-342](), [src/triggers/api.ts:1-28](), [src/mcp/server.ts:42-58]()

---

## 2. The Two Tiers of Memory Objects

agentmemory maintains a strict two-tier hierarchy. Confusing the tiers is the most common source of wrong mental models about what the system stores.

```text
┌────────────────────────────────────────────────────────┐
│  Tier 1: Observations  (ephemeral captures)            │
│  RawObservation → CompressedObservation                │
│  Keyed by session: mem:obs:<sessionId>                 │
│  Lifecycle: captured → (optionally) LLM-compressed     │
│             → BM25-indexed → vector-embedded           │
└──────────────────────────┬─────────────────────────────┘
                           │  promote via mem::remember
                           │  or consolidation pipeline
┌──────────────────────────▼─────────────────────────────┐
│  Tier 2: Memories  (durable, versioned facts)          │
│  Memory  (typed, strength-scored, expirable)           │
│  Keyed globally: mem:memories                          │
│  Lifecycle: created → reinforced/superseded → evicted  │
└────────────────────────────────────────────────────────┘
```

### 2.1 Observations

Every hook event (tool use, prompt submit, session start/stop) arrives as a `RawObservation`. Unless `AGENTMEMORY_AUTO_COMPRESS=true`, it is immediately converted via **synthetic compression** (zero-LLM, rule-based) into a `CompressedObservation` with:

- `type`: one of `file_read`, `file_write`, `command_run`, `search`, `decision`, `discovery`, `error`, `conversation`, etc.
- `facts`: string array of extracted key claims
- `narrative`: short prose summary
- `concepts`: normalized entity list
- `importance`: numeric weight (0–10)
- `confidence`: optional float

Sources: [src/types.ts:29-79]()

Observations are keyed per session (`mem:obs:<sessionId>`) and are subject to a per-session cap (`MAX_OBS_PER_SESSION`, default 500). Sources: [src/state/schema.ts:6](), [src/config.ts:145]()

### 2.2 Memories

A `Memory` is a promoted, versioned fact with these key fields:

| Field | Type | Meaning |
|-------|------|---------|
| `type` | `"pattern" \| "preference" \| "architecture" \| "bug" \| "workflow" \| "fact"` | Semantic category |
| `strength` | `number` (default 7) | Recall priority weight; higher = retained longer |
| `version` | `number` | Monotonically incremented on supersession |
| `isLatest` | `boolean` | Only `true` entries participate in search |
| `forgetAfter` | `ISO string \| undefined` | TTL; absent means permanent |
| `parentId` / `supersedes` | optional IDs | Immutable chain of versions |
| `relatedIds` | optional IDs | Soft cross-memory links |
| `sourceObservationIds` | optional IDs | Provenance back to raw events |

Sources: [src/types.ts:81-101]()

When a new memory is saved and its Jaccard similarity to an existing memory exceeds 0.7, the old memory is marked `isLatest: false` and the new one inherits `version + 1` and a `parentId` pointing to its predecessor. This creates an **immutable version chain** rather than overwriting.

Sources: [src/functions/remember.ts:52-98]()
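
The supersession rule can be sketched as follows. This is an illustration under assumptions: the `MemoryVersion` shape is trimmed to the fields discussed above, the word-level tokenizer is a guess, and `remember` here is a stand-in that operates on a plain array rather than the KV store.

```ts
// Illustrative subset of the Memory fields involved in versioning.
interface MemoryVersion {
  id: string;
  content: string;
  version: number;
  isLatest: boolean;
  parentId?: string;
}

// Lowercased word tokens; the real tokenizer may differ (assumption).
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/\W+/).filter((t) => t.length > 0));
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let shared = 0;
  a.forEach((token) => {
    if (b.has(token)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

const SUPERSEDE_THRESHOLD = 0.7;

// Appends a new memory; a sufficiently similar latest memory is demoted, not deleted.
function remember(store: MemoryVersion[], id: string, content: string): MemoryVersion {
  const tokens = tokenize(content);
  const previous = store.find(
    (m) => m.isLatest && jaccard(tokens, tokenize(m.content)) > SUPERSEDE_THRESHOLD,
  );
  if (previous) previous.isLatest = false; // stays in the chain, leaves search
  const next: MemoryVersion = {
    id,
    content,
    version: previous ? previous.version + 1 : 1,
    isLatest: true,
    parentId: previous ? previous.id : undefined,
  };
  store.push(next);
  return next;
}

const store: MemoryVersion[] = [];
const v1 = remember(store, "m1", "the api server listens on port 3111 by default");
const v2 = remember(store, "m2", "the api server listens on port 3111 by default now");
const other = remember(store, "m3", "prefers tabs over spaces");
```

In this example the first two texts share 9 of 10 distinct tokens (similarity 0.9), so `m2` becomes version 2 of `m1`, while `m3` starts its own chain at version 1.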

---

## 3. Interaction Surfaces

All external interaction flows through one of two surfaces. Internally they call the same registered functions.

```mermaid
flowchart LR
    subgraph "Agent / Claude Code"
        H[Claude Code Hooks<br/>PreToolUse · PostToolUse<br/>SessionStart · Stop]
        M[MCP Client<br/>npx @agentmemory/mcp]
        R[REST Client<br/>curl / SDK / viewer]
    end

    subgraph "agentmemory Node process"
        API[REST Triggers<br/>src/triggers/api.ts<br/>:3111]
        MCP[MCP Endpoints<br/>src/mcp/server.ts]
        FN[Registered Functions<br/>mem::observe<br/>mem::remember<br/>mem::smart-search<br/>…60+ more]
        KV[StateKV<br/>src/state/kv.ts]
        IDX[In-memory Indexes<br/>BM25 · VectorIndex]
    end

    subgraph "iii Engine"
        STATE[iii-state<br/>KV file store]
        QUEUE[iii-queue]
        STREAM[iii-stream :3112]
    end

    H -->|HTTP POST| API
    R -->|HTTP| API
    M -->|HTTP| MCP
    API --> FN
    MCP --> FN
    FN --> KV
    FN --> IDX
    KV --> STATE
    IDX -->|persisted on shutdown| STATE
```

Sources: [src/index.ts:340-342](), [src/triggers/api.ts:1-5](), [src/mcp/server.ts:42-58]()

At startup, the REST surface advertises **124 endpoints**, and the MCP surface exposes tools alongside **6 resources and 3 prompts**. Both are populated from the same registered function list. Sources: [src/index.ts:484-488]()

---

## 4. The Observation Lifecycle

Understanding observation flow prevents surprises about when data appears in search.

```mermaid
stateDiagram-v2
    [*] --> RawCapture: hook fires (HookPayload)
    RawCapture --> Dedup: DedupMap hash check
    Dedup --> Dropped: duplicate within session
    Dedup --> Strip: pass dedup
    Strip --> SyntheticCompress: stripPrivateData + buildSyntheticCompression
    Strip --> LLMCompress: if AGENTMEMORY_AUTO_COMPRESS=true
    SyntheticCompress --> KVStore: kv.set(mem:obs:sessionId, obsId, obs)
    LLMCompress --> KVStore
    KVStore --> BM25Index: getSearchIndex().add(obs)
    KVStore --> VectorIndex: embed + store (if provider configured)
    BM25Index --> [*]: searchable immediately
    VectorIndex --> [*]: searchable after embed()
```

Sources: [src/functions/observe.ts:36-130](), [src/config.ts:266-287]()

Key invariant: **BM25 indexing is synchronous with the KV write**; vector embedding is fire-and-forget. If the embedding provider is unavailable, the observation is still fully searchable via BM25.
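
A minimal sketch of that ordering, assuming in-memory stand-ins for the KV scope and both indexes. `ObservationStore` and `Embedder` are hypothetical names, and the keyword index is reduced to a set of ids to keep the example short.

```ts
type Embedder = (text: string) => Promise<number[]>;

class ObservationStore {
  readonly kv = new Map<string, string>(); // stands in for the KV scope
  readonly keywordIndex = new Set<string>(); // stands in for the BM25 index
  readonly vectorIndex = new Map<string, number[]>();
  private readonly embed?: Embedder;

  constructor(embed?: Embedder) {
    this.embed = embed;
  }

  observe(id: string, text: string): void {
    this.kv.set(id, text);
    this.keywordIndex.add(id); // same call as the write: searchable immediately

    if (!this.embed) return; // no provider configured: keyword path only
    // Fire-and-forget: the promise is not awaited and failures are swallowed.
    this.embed(text)
      .then((vector) => {
        this.vectorIndex.set(id, vector);
      })
      .catch(() => {
        // Provider unavailable; the observation stays reachable by keyword.
      });
  }
}

// A provider that always fails does not affect keyword availability.
const degraded = new ObservationStore(() => Promise.reject(new Error("provider down")));
degraded.observe("obs-1", "edited src/index.ts");

// A working provider fills the vector index later, not during observe().
const healthy = new ObservationStore(async (text) => [text.length]);
healthy.observe("obs-2", "ran the test suite");
```

The design choice this illustrates is that callers never pay embedding latency on the write path; the cost is a short window in which an observation is reachable by keyword but not yet by vector similarity.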

---

## 5. The Triple-Stream Search Engine

Search is the most mechanically interesting part of agentmemory. Every `mem::smart-search` call runs three retrieval streams in parallel and fuses their ranks using Reciprocal Rank Fusion (RRF, k=60):

| Stream | Implementation | Default weight |
|--------|---------------|----------------|
| BM25 | `SearchIndex` (in-process) | 0.4 (`BM25_WEIGHT`) |
| Vector | `VectorIndex` cosine similarity | 0.6 (`VECTOR_WEIGHT`) |
| Graph | `GraphRetrieval` entity walk | 0.3 (`AGENTMEMORY_GRAPH_WEIGHT`) |

```ts
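// src/state/hybrid-search.ts:82-115 (excerpt; declarations omitted)
// Stream 1: keyword search, fetching twice the requested limit.
const bm25Results = this.bm25.search(query, limit * 2);
// Stream 2: embed the query, then search the vector index.
queryEmbedding = await this.embeddingProvider.embed(query);
vectorResults = this.vector.search(queryEmbedding, limit * 2);
// Stream 3: graph retrieval anchored on entities found in the query.
graphResults = await this.graphRetrieval.searchByEntities(entities, 2, limit);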
```

After fusion, an optional cross-encoder rerank pass can be enabled via `RERANK_ENABLED=true`. The result type carries all three individual scores plus the combined score, which lets callers inspect retrieval provenance.

Sources: [src/state/hybrid-search.ts:22-127](), [src/types.ts:250-258]()
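
The fusion step can be sketched as weighted Reciprocal Rank Fusion, where each stream contributes `weight / (k + rank)` for every result it returns. How the project applies the stream weights inside its RRF formula is an assumption here, and `fuse` is a hypothetical helper name.

```ts
interface RankedStream {
  ranked: string[]; // result ids, best first
  weight: number;
}

// Weighted RRF: score(id) = sum over streams of weight / (k + rank), rank starting at 1.
function fuse(streams: RankedStream[], k = 60): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  streams.forEach(({ ranked, weight }) => {
    ranked.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank));
    });
  });
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}

// Default weights from the table above; the result ids are made up.
const fused = fuse([
  { ranked: ["a", "b", "c"], weight: 0.4 }, // BM25
  { ranked: ["b", "a"], weight: 0.6 }, // vector
  { ranked: ["c", "b"], weight: 0.3 }, // graph
]);
const fusedOrder = fused.map((r) => r.id); // ["b", "a", "c"]
```

Because RRF works on ranks rather than raw scores, the three streams do not need comparable score scales, and a missing stream simply contributes nothing to the sum.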

When no embedding provider is configured (no API key present), the vector stream is skipped: the system runs on BM25 plus graph retrieval and logs `BM25+Graph search active` at boot rather than `Triple-stream`. This is not an error state; keyword search remains fully functional without embeddings. Sources: [src/index.ts:480-482]()

---

## 6. The Knowledge Graph Layer

When `GRAPH_EXTRACTION_ENABLED=true`, agentmemory maintains a property graph of entities extracted from observations. Nodes and edges are first-class objects:

| Type | Key fields |
|------|-----------|
| `GraphNode` | `type` (file/function/concept/error/decision/pattern/person/…), `name`, `properties`, `aliases`, `stale` |
| `GraphEdge` | `type` (uses/imports/modifies/causes/fixes/depends_on/related_to/…), `weight`, `tcommit`, `tvalid`, `tvalidEnd`, `isLatest`, `supersededBy` |
| `EdgeContext` | `reasoning`, `sentiment`, `alternatives`, `situationalFactors`, `confidence` |

The `tvalid`/`tvalidEnd` fields on edges implement **bi-temporal graph modeling**: each edge records when it became valid and when it stopped being valid, enabling `mem::temporal-query` to answer "what was the relationship between X and Y as of commit sha abc?" without destroying history.

Sources: [src/types.ts:362-431](), [src/types.ts:833-851]()
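
A point-in-time lookup over validity intervals might look like the sketch below. It assumes `tvalid` and `tvalidEnd` are ISO timestamps and treats each interval as half-open; `TemporalEdge` is a trimmed, hypothetical shape and `edgesValidAt` is not a function from the codebase.

```ts
interface TemporalEdge {
  sourceId: string;
  targetId: string;
  type: string;
  tvalid: string; // when the relationship became true (assumed ISO timestamp)
  tvalidEnd?: string; // when it stopped being true; absent means still valid
}

// Returns the edges whose validity interval [tvalid, tvalidEnd) contains `at`.
function edgesValidAt(edges: TemporalEdge[], at: string): TemporalEdge[] {
  const t = Date.parse(at);
  return edges.filter(
    (e) =>
      Date.parse(e.tvalid) <= t &&
      (e.tvalidEnd === undefined || t < Date.parse(e.tvalidEnd)),
  );
}

// A module switched its import from B to C on 1 June; both edges are kept.
const edges: TemporalEdge[] = [
  {
    sourceId: "A",
    targetId: "B",
    type: "imports",
    tvalid: "2024-01-01T00:00:00Z",
    tvalidEnd: "2024-06-01T00:00:00Z",
  },
  { sourceId: "A", targetId: "C", type: "imports", tvalid: "2024-06-01T00:00:00Z" },
];

const inMarch = edgesValidAt(edges, "2024-03-01T00:00:00Z").map((e) => e.targetId);
const inJuly = edgesValidAt(edges, "2024-07-01T00:00:00Z").map((e) => e.targetId);
```

Closing an interval instead of deleting the edge is what keeps historical questions answerable: the March query still sees the import of `B`, while the July query sees only `C`.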

Graph search runs as the third stream in hybrid search: it extracts entity names from the query text, walks the graph from those anchors, and promotes observations linked to relevant graph nodes. Sources: [src/state/hybrid-search.ts:100-126]()

---

## 7. The Consolidation Memory Tiers

Beyond the basic observation/memory split, agentmemory implements a four-tier consolidation pipeline modeled loosely on cognitive memory systems:

```text
working    →   episodic    →   semantic    →   procedural
(slots,        (Memory        (SemanticMemory,  (ProceduralMemory,
 recent obs)    objects)       confirmed facts)  step sequences)
```

| Tier | Type | Distinctive field |
|------|------|------------------|
| Working | `MemorySlot` | `pinned`, `readOnly`, `scope` (project/global) |
| Episodic | `Memory` | `strength`, `forgetAfter`, `version` chain |
| Semantic | `SemanticMemory` | `confidence`, `accessCount`, `lastAccessedAt` |
| Procedural | `ProceduralMemory` | `steps[]`, `triggerCondition`, `frequency` |

The consolidation pipeline (`mem::consolidate-pipeline`) runs every 2 hours by default (`CONSOLIDATION_INTERVAL_MS`) when `CONSOLIDATION_ENABLED=true`. It promotes frequently-reinforced episodic memories into semantic facts and extracts procedural patterns from repeated action sequences.

Sources: [src/types.ts:439-472](), [src/index.ts:531-539]()

---

## 8. Decay, Retention, and Forgetting

Memories are not immortal. Several decay mechanisms run on timers:

| Timer | Default interval | What it does |
|-------|-----------------|--------------|
| `mem::auto-forget` | 1 hour | Evicts memories where `forgetAfter < now` |
| `mem::lesson-decay-sweep` | 24 hours | Reduces `Lesson.confidence` by `decayRate` |
| `mem::insight-decay-sweep` | 24 hours | Same for `Insight` objects |
| Consolidation pipeline | 2 hours | Promotes + prunes stale episodic items |

Retention scores (`RetentionScore`) combine salience, temporal decay, and reinforcement boost into a single float. The `source` field (`"episodic" | "semantic"`) on `RetentionScore` tells the eviction loop which KV scope to delete from. Entries written before 0.8.10 lack this field, so for backwards compatibility the loop must probe both scopes for them.

Sources: [src/types.ts:853-876](), [src/index.ts:499-539]()
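
Two of these rules are simple enough to sketch directly: the `forgetAfter < now` eviction test and the scope fallback for legacy retention entries. The helper names are hypothetical, and the mapping of `"episodic"` to `mem:memories` and `"semantic"` to `mem:semantic` is an assumption based on the KV keys listed later on this page.

```ts
interface Expirable {
  id: string;
  forgetAfter?: string; // ISO timestamp; absent means permanent
}

// Splits items into those that survive the sweep and those whose TTL has passed.
function sweepExpired<T extends Expirable>(
  items: T[],
  now: Date,
): { kept: T[]; evicted: T[] } {
  const kept: T[] = [];
  const evicted: T[] = [];
  items.forEach((item) => {
    const expired =
      item.forgetAfter !== undefined && Date.parse(item.forgetAfter) < now.getTime();
    if (expired) evicted.push(item);
    else kept.push(item);
  });
  return { kept, evicted };
}

type RetentionSource = "episodic" | "semantic";

// Picks the KV scopes to delete from; legacy entries without a source probe both.
function scopesToProbe(source?: RetentionSource): string[] {
  if (source === "episodic") return ["mem:memories"]; // assumed mapping
  if (source === "semantic") return ["mem:semantic"]; // assumed mapping
  return ["mem:memories", "mem:semantic"];
}

const sweep = sweepExpired(
  [
    { id: "a", forgetAfter: "2024-01-01T00:00:00Z" },
    { id: "b" },
    { id: "c", forgetAfter: "2030-01-01T00:00:00Z" },
  ],
  new Date("2025-01-01T00:00:00Z"),
);
```

Here only `a` is evicted: `b` has no TTL and `c` expires in the future relative to the supplied clock.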

---

## 9. The Orchestration Layer

Beyond memory, agentmemory exposes a coordination layer for multi-agent workflows:

| Concept | TypeScript type | Role |
|---------|----------------|------|
| `Action` | `Action` | Named task with a status (`pending/active/done/blocked/cancelled`) |
| `ActionEdge` | `ActionEdge` | Dependency relationships (`requires`, `unlocks`, `gated_by`, `conflicts_with`) |
| `Lease` | `Lease` | Mutex-style lock on an action by one agent |
| `Routine` | `Routine` | Named multi-step plan promoted from `ProceduralMemory` |
| `Signal` | `Signal` | Typed agent-to-agent message (`info/request/response/alert/handoff`) |
| `Checkpoint` | `Checkpoint` | Blocking gate (`ci/approval/deploy/external/timer`) |
| `Sentinel` | `Sentinel` | Watcher that triggers on `webhook/timer/threshold/pattern/approval` |
| `Sketch` | `Sketch` | Ephemeral draft that can be promoted or discarded |
| `Crystal` | `Crystal` | Distilled narrative from completed action sets |

Agents acquire leases to claim exclusive ownership of actions, send signals to hand off work, and resolve checkpoints when conditions are met. This lets a swarm of agents coordinate through shared memory without a central orchestrator.

Sources: [src/types.ts:585-737]()
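
As an illustration of the lease idea, the sketch below models exclusive ownership of an action with an expiry. The `LeaseTable` class, the `expiresAt` field, and the TTL behaviour are all assumptions made for the example; the source only states that a `Lease` is a mutex-style lock held by one agent.

```ts
interface LeaseRecord {
  actionId: string;
  agentId: string;
  expiresAt: number; // epoch milliseconds (assumed field)
}

class LeaseTable {
  private leases = new Map<string, LeaseRecord>();

  // Grants the lease if the action is free, expired, or already held by this agent.
  acquire(actionId: string, agentId: string, ttlMs: number, now: number): boolean {
    const current = this.leases.get(actionId);
    const heldByOther =
      current !== undefined && current.expiresAt > now && current.agentId !== agentId;
    if (heldByOther) return false;
    this.leases.set(actionId, { actionId, agentId, expiresAt: now + ttlMs });
    return true;
  }

  // Only the current holder may release.
  release(actionId: string, agentId: string): boolean {
    const current = this.leases.get(actionId);
    if (current === undefined || current.agentId !== agentId) return false;
    this.leases.delete(actionId);
    return true;
  }
}

const table = new LeaseTable();
const firstClaim = table.acquire("action-1", "agent-a", 1000, 0); // true
const rivalClaim = table.acquire("action-1", "agent-b", 1000, 500); // false, still held
const lateClaim = table.acquire("action-1", "agent-b", 1000, 1500); // true, lease expired
```

An expiry of some kind is the usual way to keep a crashed agent from holding an action forever, which is why it is included here even though the source does not describe it.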

---

## 10. State Ownership and KV Schema

All persistent state is stored under `mem:*` keys in the iii-state file store. The `KV` constant in `src/state/schema.ts` is the authoritative list:

```ts
// src/state/schema.ts:3-50 (selected)
export const KV = {
  sessions:      "mem:sessions",
  observations:  (sessionId: string) => `mem:obs:${sessionId}`,
  memories:      "mem:memories",
  graphNodes:    "mem:graph:nodes",
  graphEdges:    "mem:graph:edges",
  semantic:      "mem:semantic",
  procedural:    "mem:procedural",
  actions:       "mem:actions",
  leases:        "mem:leases",
  signals:       "mem:signals",
  checkpoints:   "mem:checkpoints",
  retentionScores: "mem:retention",
  slots:         "mem:slots",
  state:         "mem:state",    // system counters (disk size, flags)
  // ... (remaining scopes elided)
}
```

The `StateScope` interface (`src/types.ts:884-886`) types the `mem:state` scope, currently holding `"system:currentDiskSize": number`. Every other scope is typed by its corresponding interface. This pattern means any new key added to `KV` without a matching TypeScript type is a maintainability gap, not a runtime error.

Sources: [src/state/schema.ts:3-50](), [src/types.ts:882-888]()

---

## 11. Startup Invariants and Failure Modes

At startup, agentmemory enforces several hard invariants before declaring itself ready:

**Vector dimension guard.** If the on-disk vector index was written with a different embedding model than the currently configured provider, the process refuses to start with a descriptive error rather than silently corrupting cosine similarity scores (cross-dimension dot products return 0, causing all affected observations to disappear from search). Setting `AGENTMEMORY_DROP_STALE_INDEX=true` bypasses this by discarding the persisted index and rebuilding from live observations over time.

Sources: [src/index.ts:362-410]()
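
The guard's decision logic can be sketched as a small pure function. The `PersistedIndexMeta` shape and the `checkVectorIndex` name are hypothetical; only the `AGENTMEMORY_DROP_STALE_INDEX` flag and the refuse-to-start behaviour come from the description above.

```ts
interface PersistedIndexMeta {
  dimensions: number; // vector length the on-disk index was written with
}

type IndexDecision = "fresh" | "reuse" | "rebuild";

// Decides what to do with a persisted vector index at boot.
function checkVectorIndex(
  persisted: PersistedIndexMeta | undefined,
  providerDimensions: number,
  dropStale: boolean, // mirrors AGENTMEMORY_DROP_STALE_INDEX=true
): IndexDecision {
  if (persisted === undefined) return "fresh"; // nothing on disk yet
  if (persisted.dimensions === providerDimensions) return "reuse";
  if (dropStale) return "rebuild"; // discard and refill over time
  throw new Error(
    `Vector index has ${persisted.dimensions} dimensions but the configured provider ` +
      `produces ${providerDimensions}. Set AGENTMEMORY_DROP_STALE_INDEX=true to discard it.`,
  );
}

const sameModel = checkVectorIndex({ dimensions: 384 }, 384, false); // "reuse"
const optedIn = checkVectorIndex({ dimensions: 384 }, 1536, true); // "rebuild"
```

Failing loudly at boot is the safer default in this sketch because a silent mismatch would not produce errors later, only missing search results.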

**BM25 rebuild is fire-and-forget.** If the BM25 index is empty at boot (first run or cache deleted), index rebuilding is intentionally **not** awaited. On a large corpus with a rate-limited embedding endpoint, rebuilding can take hours. Blocking on it would leave the viewer server unbound for the full duration. Instead, search degrades gracefully under partial coverage while the index fills in asynchronously.

Sources: [src/index.ts:412-432]()

**Provider detection is opt-in safe.** If no LLM API key is present, the provider resolves to `"noop"` and LLM-backed compression and summarization are disabled entirely. The agent-sdk fallback (which spawns Claude Code child sessions) is guarded behind `AGENTMEMORY_ALLOW_AGENT_SDK=true` because those child sessions inherit Claude Code's Stop hook, which calls agentmemory, producing infinite recursion.

Sources: [src/config.ts:100-132]()
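
The resolution order described above can be sketched like this. The `"llm"` label, the `resolveProvider` name, and the `EXAMPLE_API_KEY` variable are placeholders for illustration; only `"noop"` and `AGENTMEMORY_ALLOW_AGENT_SDK` are taken from the text.

```ts
type ResolvedProvider = "llm" | "agent-sdk" | "noop";

// Resolves the provider from environment variables.
// `apiKeyVars` lists the variable names that count as an LLM API key (caller-supplied).
function resolveProvider(
  env: Record<string, string | undefined>,
  apiKeyVars: string[],
): ResolvedProvider {
  const hasKey = apiKeyVars.some((name) => (env[name] ?? "").trim() !== "");
  if (hasKey) return "llm";
  // The agent-sdk fallback is opt-in because its child sessions can re-enter the hooks.
  if (env["AGENTMEMORY_ALLOW_AGENT_SDK"] === "true") return "agent-sdk";
  return "noop";
}

const keyVars = ["EXAMPLE_API_KEY"]; // hypothetical variable name
const noKey = resolveProvider({}, keyVars); // "noop"
const optIn = resolveProvider({ AGENTMEMORY_ALLOW_AGENT_SDK: "true" }, keyVars); // "agent-sdk"
const withKey = resolveProvider({ EXAMPLE_API_KEY: "sk-test" }, keyVars); // "llm"
```

The point of the ordering is that the recursion-prone path is never reached by accident: without an explicit flag, a missing key degrades to `"noop"` rather than spawning child sessions.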

**Unhandled rejections are suppressed, not fatal.** The top-level `unhandledRejection` handler (rate-limited to one log per minute) prevents a single slow `state::set` timeout from crashing the long-lived service. This is the correct tradeoff for a memory daemon: one failed write should not destroy a session's ongoing memory stream.

Sources: [src/index.ts:112-129]()
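
A rate-limited handler of this kind can be sketched as a closure that remembers when it last logged. `createRateLimitedLogger` is a hypothetical helper, and the suppressed-count suffix is an addition for the example rather than documented behaviour.

```ts
// Returns a handler that logs at most once per `intervalMs` and counts what it skipped.
function createRateLimitedLogger(
  intervalMs: number,
  log: (message: string) => void,
): (reason: unknown, now: number) => void {
  let lastLoggedAt = -Infinity;
  let suppressed = 0;
  return (reason, now) => {
    if (now - lastLoggedAt < intervalMs) {
      suppressed++;
      return;
    }
    const suffix = suppressed > 0 ? ` (${suppressed} suppressed)` : "";
    log(`unhandled rejection: ${String(reason)}${suffix}`);
    lastLoggedAt = now;
    suppressed = 0;
  };
}

// In a Node process the wiring would look roughly like:
//   const onRejection = createRateLimitedLogger(60_000, console.error);
//   process.on("unhandledRejection", (reason) => onRejection(reason, Date.now()));

// Simulated clock: three rejections inside one minute, one after it.
const lines: string[] = [];
const handler = createRateLimitedLogger(60_000, (message) => lines.push(message));
handler("state::set timeout", 0);
handler("state::set timeout", 1_000);
handler("state::set timeout", 2_000);
handler("state::set timeout", 61_000);
```

With the simulated timestamps above, only the first and last rejection produce log lines, and the last one reports that two were suppressed in between.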

---

## Summary

agentmemory is a single long-lived Node process whose entire surface area (REST, MCP, and background timers) routes through one set of registered iii worker functions. Memories are structured objects with typed categories, strength scores, TTLs, version chains, and graph relationships, not raw text appended to a log. Search is a three-stream fusion (BM25 + vector + graph), each independently degradable. The startup sequence enforces dimension consistency on vector indexes and refuses to start rather than silently corrupt search with mismatched embeddings. Any mental model that treats agentmemory as "a thing that appends hook events to a file" will mispredict retention behavior, search scoring, and why some observations disappear from recall after an embedding provider change.

Sources: [src/index.ts:131-560]()
