# State Layer & Hybrid Search

> All persistent data lives in the iii-engine KV store (StateKV wrapping state::get/set/list triggers); reads go through HybridSearch which fuses BM25 (weight 0.4), vector cosine (weight 0.6), and graph traversal (weight 0.3) via Reciprocal Rank Fusion, with an optional reranker pass on top.

- Repository: rohitg00/agentmemory
- GitHub: https://github.com/rohitg00/agentmemory
- Human wiki: https://grok-wiki.com/public/wiki/rohitg00-agentmemory-94f173bce1dc
- Complete Markdown: https://grok-wiki.com/public/wiki/rohitg00-agentmemory-94f173bce1dc/llms-full.txt

## Source Files

- `src/state/kv.ts`
- `src/state/hybrid-search.ts`
- `src/state/search-index.ts`
- `src/state/vector-index.ts`
- `src/state/reranker.ts`
- `src/state/index-persistence.ts`
- `src/functions/graph.ts`
- `src/functions/graph-retrieval.ts`

---

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:

- [src/state/kv.ts](src/state/kv.ts)
- [src/state/hybrid-search.ts](src/state/hybrid-search.ts)
- [src/state/search-index.ts](src/state/search-index.ts)
- [src/state/vector-index.ts](src/state/vector-index.ts)
- [src/state/reranker.ts](src/state/reranker.ts)
- [src/state/index-persistence.ts](src/state/index-persistence.ts)
- [src/state/schema.ts](src/state/schema.ts)
- [src/functions/graph.ts](src/functions/graph.ts)
- [src/functions/graph-retrieval.ts](src/functions/graph-retrieval.ts)
</details>

# State Layer & Hybrid Search

The state layer is the persistence and retrieval backbone of agentmemory. All agent observations, memories, graph nodes and edges, and index snapshots flow through a single abstraction—`StateKV`—which delegates to the iii-engine KV store via typed trigger calls. On top of this foundation sits `HybridSearch`, a retrieval engine that fuses three independent ranking signals (BM25 keyword search, dense vector cosine similarity, and knowledge-graph traversal) using Reciprocal Rank Fusion (RRF), with an optional cross-encoder reranking pass as a final stage.

Understanding this layer is essential for predicting how memory recall behaves under partial failures (e.g., no embedding provider, no graph entities), how indexes survive process restarts, and what invariants must hold for search quality to remain stable across model or provider swaps.

---

## StateKV: The Persistence Interface

`StateKV` is a thin, typed wrapper around the iii-engine SDK. Every read and write goes through `sdk.trigger()`, routing to named iii-engine functions: `state::get`, `state::set`, `state::update`, `state::delete`, and `state::list`. There is no local in-process cache—the engine is the authoritative store.

```typescript
// src/state/kv.ts
export class StateKV {
  async get<T = unknown>(scope: string, key: string): Promise<T | null> {
    return this.sdk.trigger<{ scope: string; key: string }, T | null>({
      function_id: 'state::get',
      payload: { scope, key },
    })
  }

  async set<T = unknown>(scope: string, key: string, value: T): Promise<T> {
    return this.sdk.trigger<{ scope: string; key: string; value: T }, T>({
      function_id: 'state::set',
      payload: { scope, key, value },
    })
  }
  // ...list, delete, update
}
```

The `scope` parameter maps to a namespace key defined in `KV` constants (e.g., `"mem:obs:<sessionId>"`, `"mem:graph:nodes"`, `"mem:index:bm25"`), making every stored record addressable by its logical category.

Sources: [src/state/kv.ts:1-47]()

### KV Namespace Map

All scopes are declared as constants in `src/state/schema.ts`. Key namespaces:

| KV Key | Content |
|---|---|
| `mem:obs:<sessionId>` | `CompressedObservation` records per session |
| `mem:memories` | Saved long-term `Memory` records |
| `mem:index:bm25` | Serialized BM25 index (`"data"`) + vector index (`"vectors"`) |
| `mem:graph:nodes` | `GraphNode` entities |
| `mem:graph:edges` | `GraphEdge` relationships |
| `mem:sessions` | Session metadata |
| `mem:summaries` | Session summaries |
| `mem:relations` | Explicit entity relations |

Sources: [src/state/schema.ts:1-50]()

---

## Index Persistence

The BM25 and vector indexes are in-memory structures that are serialized and stored back to KV on a debounced schedule. `IndexPersistence` manages this lifecycle.

```typescript
// src/state/index-persistence.ts
const DEBOUNCE_MS = 5000;

export class IndexPersistence {
  scheduleSave(): void {
    // Debounces saves; resets timer on each call
    this.timer = setTimeout(() => {
      this.save().catch((err) => this.logFailure(err));
    }, DEBOUNCE_MS);
  }

  async save(): Promise<void> {
    await this.kv.set(KV.bm25Index, "data", this.bm25.serialize());
    if (this.vector && this.vector.size > 0) {
      await this.kv.set(KV.bm25Index, "vectors", this.vector.serialize());
    }
  }
}
```

Both `SearchIndex` and `VectorIndex` implement `serialize()` / `static deserialize()` methods. `SearchIndex` serializes to JSON (v2 schema: entries, inverted index, per-doc term counts, total doc length). `VectorIndex` encodes each `Float32Array` embedding as a base64 string.

A key invariant: on restore, `VectorIndex.validateDimensions()` checks that all stored embeddings match the current provider's expected dimension. If any mismatch is found—possible when a provider swap occurs mid-session—the index is rejected and rebuilt from scratch rather than producing corrupted results. Failures during `state::set` are throttled to one log line per 60 seconds to avoid noise under iii-engine queue pressure.

Sources: [src/state/index-persistence.ts:1-95](), [src/state/vector-index.ts:77-90]()

---

## BM25 Search Index

`SearchIndex` implements BM25+ scoring entirely in-process. It maintains:

- `entries`: map from `obsId` → `{ obsId, sessionId, termCount }`
- `invertedIndex`: term → set of `obsId`s
- `docTermCounts`: per-doc term frequency map
- `totalDocLength`: needed for average document length

The BM25 parameters are fixed at `k1 = 1.2` and `b = 0.75`—standard Okapi BM25 defaults.

### Tokenization Pipeline

Before indexing or querying, text goes through:

1. Unicode-aware cleaning: strip non-letter/non-digit characters
2. CJK detection: if the token contains CJK characters, it is split via `segmentCjk()` (bigram + unigram fallback); otherwise it is passed through the Porter-family `stem()` function
3. Synonym expansion at query time: synonyms from a bundled table receive a weight factor of 0.7 (vs. 1.0 for exact terms)
4. Prefix matching: a sorted term array supports binary-search prefix expansion during search, scored at half-IDF weight

Text is extracted from `CompressedObservation` fields: `title`, `subtitle`, `narrative`, `facts[]`, `concepts[]`, `files[]`, and `type`.

Sources: [src/state/search-index.ts:19-137](), [src/state/search-index.ts:212-258]()

---

## Vector Index

`VectorIndex` holds dense embeddings in a `Map<obsId, { embedding: Float32Array; sessionId }>`. Search runs a brute-force scan using cosine similarity:

```
cosineSimilarity(a, b) = dot(a, b) / (||a|| · ||b||)
```

A partial min-heap maintains the top-`limit` results without sorting the full list. Embeddings are only computed when an `EmbeddingProvider` is configured; if the provider is absent or throws, the vector stream is silently skipped and `HybridSearch` falls back to BM25 only.

Sources: [src/state/vector-index.ts:9-63]()

---

## HybridSearch: Three-Stream RRF Fusion

`HybridSearch` coordinates all three retrieval signals in `tripleStreamSearch()`. The nominal weights at construction time are:

| Stream | Default weight |
|---|---|
| BM25 (`bm25Weight`) | 0.4 |
| Vector cosine (`vectorWeight`) | 0.6 |
| Graph traversal (`graphWeight`) | 0.3 |

These are **re-normalized** at query time if any stream returns no results, so the effective weights always sum to 1.0:

```typescript
// src/state/hybrid-search.ts:197-206
const totalW = effectiveBm25W + effectiveVectorW + effectiveGraphW;
if (totalW > 0) {
  effectiveBm25W /= totalW;
  effectiveVectorW /= totalW;
  effectiveGraphW /= totalW;
}
```

This means: in a text-only deployment with no embedding provider and no matched graph entities, BM25 receives the full weight of 1.0.

### RRF Score Formula

Each candidate observation is scored using Reciprocal Rank Fusion with `RRF_K = 60`:

```
combinedScore =
  w_bm25  * 1/(60 + bm25Rank)
+ w_vector * 1/(60 + vectorRank)
+ w_graph  * 1/(60 + graphRank)
```

Observations absent from a stream receive `rank = Infinity`, which collapses `1/(60 + ∞)` to zero—a clean zero-contribution without special casing.

Sources: [src/state/hybrid-search.ts:20-20](), [src/state/hybrid-search.ts:194-219]()

### Search Flow

```text
query
  │
  ├─► BM25 SearchIndex.search()           (always)
  ├─► EmbeddingProvider.embed()
  │     └─► VectorIndex.search()          (if provider + index non-empty)
  └─► extractEntitiesFromQuery()
        └─► GraphRetrieval.searchByEntities()  (if entities found)
              └─► expandFromChunks()      (from top-5 vector hits)
  │
  ├─► Merge into per-obsId rank map
  ├─► Re-normalize weights
  ├─► Compute RRF combinedScore
  ├─► diversifyBySession()               (max 3 results per session)
  ├─► enrichResults()                    (fetch full Observation from KV)
  └─► rerank() (optional, env-gated)
        └─► Xenova/ms-marco-MiniLM-L-6-v2
```

Sources: [src/state/hybrid-search.ts:77-240]()

### Query Expansion

`searchWithExpansion()` accepts a `QueryExpansion` object containing `reformulations`, `temporalConcretizations`, and `entityExtractions`. It runs `tripleStreamSearch` in parallel for every expanded query string via `Promise.all`, then keeps only the highest-scoring result per `obsId` across all result sets before re-ranking.

Sources: [src/state/hybrid-search.ts:42-75]()

---

## Graph Retrieval

The graph component gives `HybridSearch` a third signal grounded in extracted entity relationships. `GraphRetrieval` operates over two KV scopes: `mem:graph:nodes` and `mem:graph:edges`.

### Entity Extraction and Matching

Entity names are extracted from the query string by `extractEntitiesFromQuery()` (from `src/functions/query-expansion.ts`). Node matching is case-insensitive substring overlap in both directions (query entity contains node name, or node name contains query entity).

### Graph Traversal: Dijkstra over Edge Weights

Traversal replaced a prior BFS with a Dijkstra implementation using `cost = 1 / max(edge.weight, 0.01)`. This prioritizes higher-weight edges (stronger relationships) and runs in O((V + E) log V) using an inline binary min-heap. Traversal is bounded by `maxDepth` (default 2 hops).

Score for a path of length `L` with edge weights `w₁…wₙ`:

```
score = avgWeight(w₁, …, wₙ) * (1 / pathLength)
```

Direct matches on the start node score 1.0. Stale nodes and edges (`.stale === true`) are pre-filtered before traversal.

```typescript
// src/functions/graph-retrieval.ts:82-88
const avgWeight = edgeWeights.length > 0
  ? edgeWeights.reduce((a, b) => a + b, 0) / edgeWeights.length
  : 0.5;
const score = avgWeight * (1 / pathLength);
```

### Graph Expansion from Vector Hits

After standard entity search, the top-5 vector results are used as seeds: `expandFromChunks()` looks up which graph nodes cite those observations, then traverses 1 hop from each. This bridges the vector and graph signals—a semantically close result can pull in structurally related observations even if the query contained no named entities.

Sources: [src/functions/graph-retrieval.ts:44-156]()

---

## Session Diversification

Before enrichment, `diversifyBySession()` caps results to 3 observations per session ID. If fewer than the requested limit are selected in the first pass, remaining observations are admitted without the cap. This prevents a single prolific session from dominating every recall result.

Sources: [src/state/hybrid-search.ts:242-276]()

---

## Reranker (Optional)

When `RERANK_ENABLED=true`, `rerank()` applies a cross-encoder pass over the top 20 RRF candidates using `Xenova/ms-marco-MiniLM-L-6-v2` (quantized, loaded lazily via `@xenova/transformers`). Input pairs are formed as `"<query> [SEP] <title> <narrative>"` truncated to 512 characters. The reranker reassigns `combinedScore` to the cross-encoder logit; the tail beyond the reranker window is appended unmodified.

If the model fails to load (missing optional dependency, incompatible runtime), `pipelineUnavailable` is set to `true` and all subsequent calls return results unchanged. This makes the reranker fully optional with no impact on the core retrieval path.

Sources: [src/state/reranker.ts:1-74]()

---

## Failure Modes and Invariants

| Condition | Behavior |
|---|---|
| No embedding provider | Vector stream skipped; BM25 weight re-normalized to 1.0 |
| No graph entities extracted | Graph stream skipped; weights re-normalized across BM25 + vector |
| Vector embed throws | `try/catch` swallows error; falls through to BM25-only |
| Graph search throws | `try/catch` swallows; graph weight set to 0 |
| Reranker model missing | `pipelineUnavailable = true`; RRF order preserved |
| KV `state::set` timeout | Logged (throttled to 1/min); index remains in memory, retried on next debounce |
| Wrong vector dimension on restore | `validateDimensions()` rejects index; rebuilt from scratch |
| Observation in index but deleted from KV | `enrichResults()` `.catch(() => null)` drops it silently |

The design ensures every path degrades gracefully: a fully offline or embedding-free deployment still retrieves results via BM25 alone, and graph or reranker failures never propagate exceptions to the caller.

Sources: [src/state/hybrid-search.ts:91-98](), [src/state/hybrid-search.ts:105-115](), [src/state/index-persistence.ts:77-94](), [src/state/vector-index.ts:77-90]()
