# Understand Anything — Mental Model Wiki

> A Claude Code plugin that turns any codebase into an interactive knowledge graph dashboard. It combines a multi-agent LLM pipeline, tree-sitter static analysis, and a React Flow UI so engineers can predict relationships, navigate architecture, and incrementally refresh the graph as code changes.

## Context Links

- [Agent index](https://grok-wiki.com/public/wiki/lum1104-understand-anything-3b923df96896/llms.txt)
- [Human interactive wiki](https://grok-wiki.com/public/wiki/lum1104-understand-anything-3b923df96896)
- [GitHub repository](https://github.com/Lum1104/Understand-Anything)

## Repository Metadata

- Repository: Lum1104/Understand-Anything

- Generated: 2026-05-22T00:58:23.057Z
- Updated: 2026-05-22T00:58:34.378Z
- Runtime: Claude Code
- Format: Mental Model
- Pages: 8

## Page Index

- 01. [The Mental Model: Graph-First Codebase Understanding](https://grok-wiki.com/public/wiki/lum1104-understand-anything-3b923df96896/pages/01-the-mental-model-graph-first-codebase-understanding.md) - The single simplest model of the whole system: a CLI plugin triggers a sequential multi-agent pipeline that emits one JSON knowledge graph, which is then rendered as an interactive React Flow dashboard. Everything else — tree-sitter extractors, parsers, staleness checks, Zustand store — supports this central invariant. Understanding this flow lets you predict where any new feature, bug, or change lands.
- 02. [Agent Pipeline: Five Phases from Scan to Graph](https://grok-wiki.com/public/wiki/lum1104-understand-anything-3b923df96896/pages/02-agent-pipeline-five-phases-from-scan-to-graph.md) - The /understand skill orchestrates five sequential agent phases: project-scanner (file discovery), file-analyzer (per-file nodes), architecture-analyzer (cross-cutting edges), tour-builder (guided walkthroughs), and assemble-reviewer (graph assembly and validation). Agents write intermediate JSON to .understand-anything/intermediate/ to avoid polluting context; results are merged and cleaned up. Auto-update mode replays stale phases after a git commit via the PostToolUse hook.
- 03. [Schema & Type Contracts: Nodes, Edges, and Aliases](https://grok-wiki.com/public/wiki/lum1104-understand-anything-3b923df96896/pages/03-schema-type-contracts-nodes-edges-and-aliases.md) - The knowledge graph is defined by two Zod schemas: NodeTypeSchema (21 canonical node types such as file, function, class, domain, article) and EdgeTypeSchema (35 edge types across 8 categories: structural, behavioral, data-flow, dependencies, semantic, infrastructure, domain, knowledge). Alias maps (NODE_TYPE_ALIASES, EDGE_TYPE_ALIASES) normalize LLM-generated variants to canonical forms at assembly time. This schema is the contract between the agent pipeline and the dashboard — both sides import from @understand-anything/core/types and @understand-anything/core/schema.
- 04. [Static Analysis: Tree-Sitter Extractors & Parsers](https://grok-wiki.com/public/wiki/lum1104-understand-anything-3b923df96896/pages/04-static-analysis-tree-sitter-extractors-parsers.md) - Two plugin families produce deterministic graph nodes without LLM calls. Language extractors (TypeScript, Python, Go, Java, Rust, C++, Ruby, C#, PHP) use web-tree-sitter (WASM) via tree-sitter-plugin.ts to parse ASTs and emit function/class/module nodes. Config parsers (JSON, YAML, TOML, SQL, GraphQL, Dockerfile, Protobuf, Makefile, shell, Markdown, Terraform, .env) extract config/schema/document nodes. The plugin registry in registry.ts and discovery.ts wires both families together. The WASM constraint — no native bindings — is a hard invariant: never swap in the native tree-sitter package.
- 05. [Staleness Detection & Incremental Updates](https://grok-wiki.com/public/wiki/lum1104-understand-anything-3b923df96896/pages/05-staleness-detection-incremental-updates.md) - The knowledge graph is stored as .understand-anything/knowledge-graph.json alongside config.json (which records the last analyzed commit hash and user preferences such as language and autoUpdate). On each /understand invocation, staleness.ts compares git diff lastCommitHash..HEAD; if files changed, only affected nodes are removed and re-analyzed (incremental mode). --full forces a complete rebuild. The auto-update hook re-triggers analysis after every git commit when autoUpdate is true. Worktree redirect is a critical invariant: graphs generated inside a Claude Code worktree are redirected to the main repo root to prevent ephemeral loss.
- 06. [Dashboard State Machine: Zustand Store & View Modes](https://grok-wiki.com/public/wiki/lum1104-understand-anything-3b923df96896/pages/06-dashboard-state-machine-zustand-store-view-modes.md) - The dashboard's single Zustand store (store.ts) owns all runtime state: the loaded KnowledgeGraph, active Persona (non-technical / junior / experienced), ViewMode (structural / domain / knowledge), FilterState (node types, complexities, layers, edge categories), selected node, search results via SearchEngine (Fuse.js fuzzy search on name/tags/summary/languageNotes), and the React Flow instance. The store is the single source of truth — components never hold local graph state. Key boundary: dashboard imports only from @understand-anything/core/search, /types, and /schema (browser-safe subpath exports); never the core main entry point, which pulls in Node.js modules.
- 07. [Skill Surface: /understand, /understand-chat, /understand-diff & Hooks](https://grok-wiki.com/public/wiki/lum1104-understand-anything-3b923df96896/pages/07-skill-surface-understand-understand-chat-understand-diff-hooks.md) - Eight skills are exposed: /understand (full graph build), /understand-dashboard (opens dashboard), /understand-chat (Q&A against the graph using context-builder.ts), /understand-diff (change analysis via diff-analyzer.ts), /understand-explain (node explanation via explain-builder.ts), /understand-onboard (onboarding guide via onboard-builder.ts), /understand-domain, and /understand-knowledge. The @understand-anything/skill package exports typed builders consumed by the chat/diff/explain/onboard skills. Hooks (hooks.json) fire PostToolUse on git commit to trigger auto-update and a PreToolUse hook to auto-update before /understand-chat responses. Agent models are all set to inherit for cross-platform compatibility.
- 08. [Invariants, Failure Modes & Safe-Change Rules](https://grok-wiki.com/public/wiki/lum1104-understand-anything-3b923df96896/pages/08-invariants-failure-modes-safe-change-rules.md) - A synthesis of every load-bearing constraint in the system. Hard invariants: (1) use web-tree-sitter (WASM) only — native bindings break on darwin/arm64 + Node 24; (2) dashboard imports only browser-safe core subpath exports; (3) graphs inside git worktrees are redirected to the main repo root; (4) all five version fields must be bumped in sync when releasing. Key failure modes: stale graph after code changes (fix: run /understand or enable autoUpdate), broken incremental update when lastCommitHash is missing from config.json (fix: --full rebuild), dashboard blank on schema mismatch (fix: check WarningBanner, validate graph JSON against schema.ts). Safe-change rules: adding a new language extractor only requires a new file under extractors/ plus registry entry; adding a new edge type requires updating schema.ts alias maps and the EDGE_CATEGORY_MAP in store.ts; dashboard layout changes are isolated to components/ and never touch core.

## Source File Index

- `README.md`
- `understand-anything-plugin/agents/architecture-analyzer.md`
- `understand-anything-plugin/agents/assemble-reviewer.md`
- `understand-anything-plugin/agents/file-analyzer.md`
- `understand-anything-plugin/agents/project-scanner.md`
- `understand-anything-plugin/agents/tour-builder.md`
- `understand-anything-plugin/CLAUDE.md`
- `understand-anything-plugin/hooks/auto-update-prompt.md`
- `understand-anything-plugin/hooks/hooks.json`
- `understand-anything-plugin/package.json`
- `understand-anything-plugin/packages/core/package.json`
- `understand-anything-plugin/packages/core/src/plugins/discovery.ts`
- `understand-anything-plugin/packages/core/src/plugins/extractors/base-extractor.ts`
- `understand-anything-plugin/packages/core/src/plugins/extractors/typescript-extractor.ts`
- `understand-anything-plugin/packages/core/src/plugins/parsers/index.ts`
- `understand-anything-plugin/packages/core/src/plugins/registry.ts`
- `understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.test.ts`
- `understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.ts`
- `understand-anything-plugin/packages/core/src/schema.ts`
- `understand-anything-plugin/packages/core/src/search.ts`
- `understand-anything-plugin/packages/core/src/staleness.ts`
- `understand-anything-plugin/packages/core/src/types.test.ts`
- `understand-anything-plugin/packages/dashboard/src/App.tsx`
- `understand-anything-plugin/packages/dashboard/src/components/CodeViewer.tsx`
- `understand-anything-plugin/packages/dashboard/src/components/GraphView.tsx`
- `understand-anything-plugin/packages/dashboard/src/components/NodeInfo.tsx`
- `understand-anything-plugin/packages/dashboard/src/components/WarningBanner.tsx`
- `understand-anything-plugin/packages/dashboard/src/store.ts`
- `understand-anything-plugin/skills/understand-chat/SKILL.md`
- `understand-anything-plugin/skills/understand/SKILL.md`
- `understand-anything-plugin/src/context-builder.ts`
- `understand-anything-plugin/src/diff-analyzer.ts`
- `understand-anything-plugin/src/explain-builder.ts`
- `understand-anything-plugin/src/index.ts`
- `understand-anything-plugin/src/onboard-builder.ts`

---

## 01. The Mental Model: Graph-First Codebase Understanding

> The single simplest model of the whole system: a CLI plugin triggers a sequential multi-agent pipeline that emits one JSON knowledge graph, which is then rendered as an interactive React Flow dashboard. Everything else — tree-sitter extractors, parsers, staleness checks, Zustand store — supports this central invariant. Understanding this flow lets you predict where any new feature, bug, or change lands.

- Page Markdown: https://grok-wiki.com/public/wiki/lum1104-understand-anything-3b923df96896/pages/01-the-mental-model-graph-first-codebase-understanding.md
- Generated: 2026-05-22T00:56:12.572Z

### Source Files

- `README.md`
- `understand-anything-plugin/skills/understand/SKILL.md`
- `understand-anything-plugin/package.json`
- `understand-anything-plugin/packages/core/src/types.test.ts`

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:

- [README.md](README.md)
- [understand-anything-plugin/skills/understand/SKILL.md](understand-anything-plugin/skills/understand/SKILL.md)
- [understand-anything-plugin/skills/understand-dashboard/SKILL.md](understand-anything-plugin/skills/understand-dashboard/SKILL.md)
- [understand-anything-plugin/package.json](understand-anything-plugin/package.json)
- [understand-anything-plugin/packages/core/src/types.ts](understand-anything-plugin/packages/core/src/types.ts)
- [understand-anything-plugin/packages/core/src/types.test.ts](understand-anything-plugin/packages/core/src/types.test.ts)
- [understand-anything-plugin/packages/core/src/schema.ts](understand-anything-plugin/packages/core/src/schema.ts)
- [understand-anything-plugin/packages/core/src/search.ts](understand-anything-plugin/packages/core/src/search.ts)
- [understand-anything-plugin/packages/core/src/staleness.ts](understand-anything-plugin/packages/core/src/staleness.ts)
- [understand-anything-plugin/packages/dashboard/src/App.tsx](understand-anything-plugin/packages/dashboard/src/App.tsx)
- [understand-anything-plugin/packages/dashboard/src/store.ts](understand-anything-plugin/packages/dashboard/src/store.ts)
- [understand-anything-plugin/packages/dashboard/src/components/GraphView.tsx](understand-anything-plugin/packages/dashboard/src/components/GraphView.tsx)
- [understand-anything-plugin/agents/project-scanner.md](understand-anything-plugin/agents/project-scanner.md)
</details>

# The Mental Model: Graph-First Codebase Understanding

Understand Anything is organized around one central invariant: a CLI skill (`/understand`) triggers a sequential multi-agent pipeline that produces a single JSON file — the **knowledge graph** — which a React Flow dashboard then renders as an interactive, explorable visualization. Every subsystem in the project either helps produce that JSON, validate it, keep it fresh, or display it. Understanding this pipeline-to-graph-to-dashboard flow lets you confidently predict where any feature, bug, or change belongs.

This page explains the end-to-end flow, the data contract that holds it together, and the architectural rules that keep each boundary clean. It is the starting point before reading any other part of the codebase.

---

## The Central Invariant

```text
CLI skill invoked
      │
      ▼
┌─────────────────────────────────────────┐
│         Multi-Agent Pipeline            │
│  (project-scanner → file-analyzer ×N   │
│   → architecture-analyzer → tour-builder│
│   → [reviewer])                         │
└────────────────┬────────────────────────┘
                 │ writes
                 ▼
    .understand-anything/
        knowledge-graph.json     ◄── THE OUTPUT
                 │
                 │ loaded by
                 ▼
┌─────────────────────────────────────────┐
│         React Flow Dashboard            │
│  (GraphView · sidebar · search ·        │
│   layers · tour · diff overlay)         │
└─────────────────────────────────────────┘
```

One JSON artifact is the only handoff between the pipeline and the UI. Agents never talk to the dashboard directly; the dashboard never re-runs agents. This strict separation means agents can be swapped, retried, or run incrementally without touching any UI code, and the dashboard can be deployed anywhere that can serve a static JSON file.

Sources: [understand-anything-plugin/skills/understand/SKILL.md:683](understand-anything-plugin/skills/understand/SKILL.md), [understand-anything-plugin/packages/dashboard/src/App.tsx:133-163](understand-anything-plugin/packages/dashboard/src/App.tsx)

---

## The Knowledge Graph Schema: The Contract

The `KnowledgeGraph` type is the data contract every agent writes to and every dashboard component reads from. It is defined once in `@understand-anything/core`:

```typescript
// understand-anything-plugin/packages/core/src/types.ts:91-99
export interface KnowledgeGraph {
  version: string;
  kind?: "codebase" | "knowledge";
  project: ProjectMeta;
  nodes: GraphNode[];
  edges: GraphEdge[];
  layers: Layer[];
  tour: TourStep[];
}
```

Apart from `version` and the optional `kind`, the top-level structure has five concerns:

| Field | Purpose | Populated by |
|---|---|---|
| `project` | Name, languages, frameworks, timestamp, git hash | Phase 0/1 (project-scanner) |
| `nodes` | Every file, function, class, config, service, etc. | Phase 2 (file-analyzer) |
| `edges` | All relationships between nodes | Phase 2 (file-analyzer) |
| `layers` | Architectural groupings (API, Service, Data, UI…) | Phase 4 (architecture-analyzer) |
| `tour` | Ordered learning steps referencing node IDs | Phase 5 (tour-builder) |

Sources: [understand-anything-plugin/packages/core/src/types.ts:38-99](understand-anything-plugin/packages/core/src/types.ts), [understand-anything-plugin/packages/core/src/types.test.ts:4-28](understand-anything-plugin/packages/core/src/types.test.ts)
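
As a rough illustration of how the five concerns fit together, the sketch below builds a tiny graph by hand. The `Sketch*` interfaces are simplified stand-ins, not the definitions in `types.ts`. Only the top-level field names come from the interface above; the nested field names, the `layer:api` ID, and the version string are assumptions made for the example.

```typescript
// Simplified stand-ins for the real types in packages/core/src/types.ts.
// Only the top-level KnowledgeGraph fields are taken from this page;
// the nested field names are assumptions made for illustration.
interface SketchNode { id: string; type: string; name: string; summary: string }
interface SketchEdge {
  source: string;
  target: string;
  type: string;
  direction: "forward" | "backward" | "bidirectional";
  weight: number; // relationship strength in the range 0..1
}
interface SketchLayer { id: string; name: string; nodeIds: string[] }
interface SketchTourStep { order: number; title: string; description: string; nodeIds: string[] }

interface SketchGraph {
  version: string;
  kind?: "codebase" | "knowledge";
  project: { name: string; languages: string[]; frameworks: string[] };
  nodes: SketchNode[];
  edges: SketchEdge[];
  layers: SketchLayer[];
  tour: SketchTourStep[];
}

const exampleGraph: SketchGraph = {
  version: "1.0.0", // placeholder value
  kind: "codebase",
  project: { name: "demo", languages: ["typescript"], frameworks: [] },
  nodes: [
    { id: "file:src/auth/login.ts", type: "file", name: "login.ts", summary: "Login entry point" },
    {
      id: "function:src/auth/login.ts:handleLogin",
      type: "function",
      name: "handleLogin",
      summary: "Validates credentials",
    },
  ],
  edges: [
    {
      source: "file:src/auth/login.ts",
      target: "function:src/auth/login.ts:handleLogin",
      type: "contains",
      direction: "forward",
      weight: 1,
    },
  ],
  layers: [{ id: "layer:api", name: "API", nodeIds: ["file:src/auth/login.ts"] }],
  tour: [
    { order: 1, title: "Start at login", description: "Where requests enter.", nodeIds: ["file:src/auth/login.ts"] },
  ],
};

// Layers and tour steps refer to nodes by ID rather than embedding them.
const knownExampleIds = new Set(exampleGraph.nodes.map((node) => node.id));
const layerMembersResolve = exampleGraph.layers.every((layer) =>
  layer.nodeIds.every((id) => knownExampleIds.has(id)),
);
```

Because layers and tour steps hold only IDs, the node list stays the one place where node data lives.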

### Node Types

The schema supports 21 node types across four domains:

| Domain | Types |
|---|---|
| Code | `file`, `function`, `class`, `module`, `concept` |
| Non-code | `config`, `document`, `service`, `table`, `endpoint`, `pipeline`, `schema`, `resource` |
| Domain | `domain`, `flow`, `step` |
| Knowledge | `article`, `entity`, `topic`, `claim`, `source` |

Node IDs follow a `<type>:<relative-path>` convention, with the symbol name appended after a further colon for nodes that live inside a file (e.g., `file:src/auth/login.ts`, `function:src/auth/login.ts:handleLogin`). This keeps IDs human-readable and stable across incremental updates.

Sources: [understand-anything-plugin/packages/core/src/types.ts:1-7](understand-anything-plugin/packages/core/src/types.ts)
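
The ID convention can be sketched as a pair of helpers. `makeNodeId` and `parseNodeId` are hypothetical names, not functions from the repository, and the parser assumes that relative paths never contain a colon.

```typescript
interface ParsedNodeId {
  type: string;
  path: string;
  symbol?: string;
}

// Builds "<type>:<path>" or "<type>:<path>:<symbol>".
function makeNodeId(type: string, path: string, symbol?: string): string {
  return symbol ? `${type}:${path}:${symbol}` : `${type}:${path}`;
}

// Splits an ID back into its parts.
// Assumption: the relative path itself contains no ':' characters,
// so a trailing ":<name>" segment is always a symbol name.
function parseNodeId(id: string): ParsedNodeId {
  const first = id.indexOf(":");
  if (first === -1) {
    throw new Error(`Node ID has no type prefix: ${id}`);
  }
  const type = id.slice(0, first);
  const rest = id.slice(first + 1);
  const last = rest.lastIndexOf(":");
  if (last === -1) {
    return { type, path: rest };
  }
  return { type, path: rest.slice(0, last), symbol: rest.slice(last + 1) };
}
```

Since the ID is derived from the type and path alone, re-analyzing an unchanged file yields the same ID, which is what lets incremental updates replace nodes in place.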

### Edge Types

Thirty-five edge types span eight semantic categories; the listing below names 31 of them:

```text
Structural:     imports, exports, contains, inherits, implements
Behavioral:     calls, subscribes, publishes, middleware
Data flow:      reads_from, writes_to, transforms, validates
Dependencies:   depends_on, tested_by, configures
Semantic:       related, similar_to
Infrastructure: deploys, serves, provisions, triggers
Domain:         contains_flow, flow_step, cross_domain
Knowledge:      cites, contradicts, builds_on, exemplifies, categorized_under, authored_by
```

Every edge has a `weight` (0–1) encoding relationship strength, and a `direction` (`forward`, `backward`, `bidirectional`). The schema layer (`schema.ts`) normalizes LLM-generated aliases at load time — for example `"func"` → `"function"`, `"container"` → `"service"` — so downstream code only ever sees canonical types.

Sources: [understand-anything-plugin/packages/core/src/types.ts:9-19](understand-anything-plugin/packages/core/src/types.ts), [understand-anything-plugin/packages/core/src/schema.ts:1-60](understand-anything-plugin/packages/core/src/schema.ts)
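
Alias normalization of this kind might look roughly like the following. This is a minimal sketch, not the code in `schema.ts`: the two alias entries are the ones quoted above, the canonical list is copied from the node type table, and the trimming and lower-casing step is an assumption.

```typescript
// The 21 canonical node types from the table above.
const CANONICAL_NODE_TYPES = new Set<string>([
  "file", "function", "class", "module", "concept",
  "config", "document", "service", "table", "endpoint", "pipeline", "schema", "resource",
  "domain", "flow", "step",
  "article", "entity", "topic", "claim", "source",
]);

// Only the two aliases mentioned on this page; the real map is larger.
const SKETCH_NODE_ALIASES = new Map<string, string>([
  ["func", "function"],
  ["container", "service"],
]);

// Returns the canonical type, or undefined when the value is not recognized.
// Assumption: input is trimmed and lower-cased before lookup.
function normalizeNodeType(raw: string): string | undefined {
  const key = raw.trim().toLowerCase();
  const candidate = SKETCH_NODE_ALIASES.get(key) ?? key;
  return CANONICAL_NODE_TYPES.has(candidate) ? candidate : undefined;
}
```

Returning `undefined` for unknown values leaves the caller to decide whether to drop the node or report it.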

---

## The Multi-Agent Pipeline

The `/understand` skill orchestrates a **sequential, phase-gated pipeline** with parallelism inside Phase 2. Phases hand off via files written to `.understand-anything/intermediate/` — no agent returns data directly to the orchestrating skill context.

```mermaid
sequenceDiagram
    participant User
    participant Skill as /understand (SKILL.md)
    participant Scanner as project-scanner
    participant Analyzers as file-analyzer ×N (≤5 concurrent)
    participant Assembler as merge-batch-graphs.py
    participant ArchAgent as architecture-analyzer
    participant TourAgent as tour-builder
    participant Reviewer as inline validator / graph-reviewer
    participant FS as .understand-anything/

    User->>Skill: /understand [options]
    Skill->>Skill: Phase 0 — pre-flight, staleness check
    Skill->>Scanner: Phase 1 — scan project
    Scanner->>FS: scan-result.json
    Skill->>Analyzers: Phase 2 — analyze batches (parallel)
    Analyzers->>FS: batch-0.json … batch-N.json
    Skill->>Assembler: merge-batch-graphs.py
    Assembler->>FS: assembled-graph.json
    Skill->>ArchAgent: Phase 4 — layer assignment
    ArchAgent->>FS: layers.json
    Skill->>TourAgent: Phase 5 — tour generation
    TourAgent->>FS: tour.json
    Skill->>Reviewer: Phase 6 — validate assembled graph
    Reviewer->>FS: review.json
    Skill->>FS: Phase 7 — write knowledge-graph.json
    Skill->>User: summary + auto-launch /understand-dashboard
```

### Phase 0: Pre-flight and Staleness

Before any agent is dispatched, the skill:
1. Resolves `PROJECT_ROOT`, redirecting from a git worktree to the main repo root so the graph is not written to an ephemeral path.
2. Reads `.understand-anything/meta.json` to get the last `gitCommitHash`.
3. Runs `git diff <lastHash>..HEAD --name-only` to detect changed files.
4. Chooses between **full rebuild**, **incremental update** (only changed files), or **no-op** (graph current).

The staleness logic is also available as a library function in core:

```typescript
// understand-anything-plugin/packages/core/src/staleness.ts:34-43
export function isStale(projectDir: string, lastCommitHash: string): StalenessResult {
  const changedFiles = getChangedFiles(projectDir, lastCommitHash);
  return { stale: changedFiles.length > 0, changedFiles };
}
```

Sources: [understand-anything-plugin/skills/understand/SKILL.md:25-160](understand-anything-plugin/skills/understand/SKILL.md), [understand-anything-plugin/packages/core/src/staleness.ts:1-43](understand-anything-plugin/packages/core/src/staleness.ts)
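
The three-way choice in step 4 can be sketched as a pure function. `chooseUpdateMode` and its input shape are hypothetical. Treating a missing hash as a reason for a full rebuild is an assumption, based on the failure-mode note in the page index that a missing `lastCommitHash` is fixed with `--full`.

```typescript
type UpdateMode = "full" | "incremental" | "noop";

interface PreflightInput {
  lastCommitHash?: string; // undefined when no previous analysis is recorded
  changedFiles: string[];  // e.g. the output of git diff --name-only
  forceFull: boolean;      // true when the user passed --full
}

function chooseUpdateMode(input: PreflightInput): UpdateMode {
  // Assumption: without a baseline hash there is nothing to diff against.
  if (input.forceFull || !input.lastCommitHash) {
    return "full";
  }
  return input.changedFiles.length > 0 ? "incremental" : "noop";
}
```

Keeping the decision separate from the `git` call makes it straightforward to test without a repository.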

### Phase 1: Scan

The `project-scanner` agent writes `scan-result.json` containing the full file list with `fileCategory` per file (`code`, `config`, `docs`, `infra`, `data`, `script`, `markup`), detected languages and frameworks, and a pre-resolved `importMap`. The skill stores `importMap` in memory for injection into Phase 2 batches, avoiding redundant import resolution work by agents.

Sources: [understand-anything-plugin/agents/project-scanner.md:1-8](understand-anything-plugin/agents/project-scanner.md), [understand-anything-plugin/skills/understand/SKILL.md:214-248](understand-anything-plugin/skills/understand/SKILL.md)

### Phase 2: Parallel File Analysis

Files are batched in groups of 20–30. Up to **5 `file-analyzer` agents run concurrently**, each producing a `batch-N.json`. After all batches complete, `merge-batch-graphs.py` runs a single-pass merge that:

- Combines nodes and edges across all batches
- Normalizes node IDs (strips double prefixes, adds missing type prefixes)
- Normalizes complexity values (`low` → `simple`, `high` → `complex`)
- Deduplicates by ID and by `(source, target, type)` triple
- Drops dangling edges referencing missing nodes
- Runs a `tested_by` linker that canonicalizes test-coverage edges and flips LLM-inverted ones

Output is `intermediate/assembled-graph.json`.

Sources: [understand-anything-plugin/skills/understand/SKILL.md:259-328](understand-anything-plugin/skills/understand/SKILL.md)
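
The core of the merge can be sketched in a few lines. The real step is the Python script `merge-batch-graphs.py`; this TypeScript version is only an illustration with hypothetical names. It covers complexity normalization, deduplication, and dangling-edge removal, and leaves out ID-prefix repair and the `tested_by` linker. Keeping the first occurrence of a duplicate node is an assumption.

```typescript
interface BatchNode { id: string; type: string; complexity?: string }
interface BatchEdge { source: string; target: string; type: string }
interface Batch { nodes: BatchNode[]; edges: BatchEdge[] }

// The two complexity rewrites named in the list above.
const COMPLEXITY_ALIASES = new Map<string, string>([
  ["low", "simple"],
  ["high", "complex"],
]);

function mergeBatches(batches: Batch[]): Batch {
  const nodesById = new Map<string, BatchNode>();
  for (const batch of batches) {
    for (const node of batch.nodes) {
      if (nodesById.has(node.id)) continue; // assumption: first occurrence wins
      const normalized: BatchNode = { ...node };
      if (node.complexity !== undefined) {
        normalized.complexity = COMPLEXITY_ALIASES.get(node.complexity) ?? node.complexity;
      }
      nodesById.set(node.id, normalized);
    }
  }

  const seenEdges = new Set<string>();
  const edges: BatchEdge[] = [];
  for (const batch of batches) {
    for (const edge of batch.edges) {
      // Drop edges whose endpoints did not survive the node merge.
      if (!nodesById.has(edge.source) || !nodesById.has(edge.target)) continue;
      // Deduplicate on the (source, target, type) triple.
      const key = JSON.stringify([edge.source, edge.target, edge.type]);
      if (seenEdges.has(key)) continue;
      seenEdges.add(key);
      edges.push(edge);
    }
  }

  return { nodes: Array.from(nodesById.values()), edges };
}
```

Merging all nodes before looking at any edge means an edge in one batch can safely point at a node defined in another.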

### Phase 4: Architecture Layer Assignment

The `architecture-analyzer` agent takes all file-level nodes and all edges, applies language/framework context files (from `languages/` and `frameworks/` subdirectories), and assigns each node to an architectural layer. The skill normalizes the output: unwraps envelope JSON, renames `nodes` → `nodeIds`, synthesizes missing IDs, converts raw file paths to typed node IDs, and drops dangling references.

Layers are the only structural metadata that groups nodes for visualization. Without valid layers, no node appears in the layer-cluster view.

Sources: [understand-anything-plugin/skills/understand/SKILL.md:369-437](understand-anything-plugin/skills/understand/SKILL.md)
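
A minimal sketch of that normalization follows, under stated assumptions. The envelope shape (`{ layers: [...] }`), the `layer:<slug>` format for synthesized IDs, and the `file:` prefix used for raw paths are illustrative guesses, and `normalizeLayers` is a hypothetical name rather than a function in the skill.

```typescript
interface RawLayer { id?: string; name: string; nodes?: string[]; nodeIds?: string[] }
interface NormalizedLayer { id: string; name: string; nodeIds: string[] }

function normalizeLayers(
  payload: RawLayer[] | { layers: RawLayer[] },
  knownNodeIds: Set<string>,
): NormalizedLayer[] {
  // Unwrap an envelope object if the agent returned one.
  const rawLayers = Array.isArray(payload) ? payload : payload.layers;

  return rawLayers.map((layer, index) => {
    // Accept either field name; `nodes` is renamed to `nodeIds`.
    const members = layer.nodeIds ?? layer.nodes ?? [];
    const nodeIds = members
      // Assumption: a member that is not a known ID is a raw file path.
      .map((member) => (knownNodeIds.has(member) ? member : `file:${member}`))
      // Drop references that still do not resolve.
      .filter((id) => knownNodeIds.has(id));

    // Synthesize an ID from the name when the agent omitted one.
    const slug = layer.name.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-|-$/g, "");
    return { id: layer.id ?? `layer:${slug || index}`, name: layer.name, nodeIds };
  });
}
```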

### Phase 5: Tour Generation

The `tour-builder` agent produces an ordered array of `TourStep` objects. Each step has a title, description, and a list of node IDs to highlight. The skill normalizes field names (`nodesToInspect` → `nodeIds`), drops dangling node references, and sorts by `order`. Tours power the Learn persona in the dashboard sidebar.

Sources: [understand-anything-plugin/skills/understand/SKILL.md:450-518](understand-anything-plugin/skills/understand/SKILL.md), [understand-anything-plugin/packages/core/src/types.ts:71-78](understand-anything-plugin/packages/core/src/types.ts)
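
The three normalization steps for tours can be sketched as follows. `normalizeTour` is a hypothetical helper, and the step shape is reduced to the fields named on this page.

```typescript
interface RawTourStep {
  order: number;
  title: string;
  description: string;
  nodeIds?: string[];
  nodesToInspect?: string[]; // variant field name sometimes emitted by the agent
}

interface NormalizedTourStep {
  order: number;
  title: string;
  description: string;
  nodeIds: string[];
}

function normalizeTour(steps: RawTourStep[], knownNodeIds: Set<string>): NormalizedTourStep[] {
  return steps
    .map((step) => ({
      order: step.order,
      title: step.title,
      description: step.description,
      // Rename nodesToInspect to nodeIds and drop references to missing nodes.
      nodeIds: (step.nodeIds ?? step.nodesToInspect ?? []).filter((id) => knownNodeIds.has(id)),
    }))
    .sort((a, b) => a.order - b.order);
}
```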

### Phase 6: Validation

By default the skill runs an inline Node.js validator (written to `.understand-anything/tmp/ua-inline-validate.cjs`) that checks structural integrity: required node fields, duplicate IDs, dangling edge references, nodes missing from layers, and tour steps referencing absent nodes. With `--review`, a full LLM `graph-reviewer` agent runs instead. Issues trigger automated fixes (remove dangling edges, fill missing fields) before the final save.

Sources: [understand-anything-plugin/skills/understand/SKILL.md:554-677](understand-anything-plugin/skills/understand/SKILL.md)
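
The structural checks listed above can be approximated with a short function. This is a sketch, not the contents of `ua-inline-validate.cjs`: `findStructuralIssues` is a hypothetical name, it treats `id` and `type` as the required node fields, and it expects every node to belong to a layer, both of which are assumptions.

```typescript
interface CheckNode { id: string; type: string }
interface CheckEdge { source: string; target: string }
interface CheckGraph {
  nodes: CheckNode[];
  edges: CheckEdge[];
  layers: { nodeIds: string[] }[];
  tour: { title: string; nodeIds: string[] }[];
}

function findStructuralIssues(graph: CheckGraph): string[] {
  const issues: string[] = [];

  // Required fields and duplicate IDs.
  const nodeIds = new Set<string>();
  for (const node of graph.nodes) {
    if (!node.id || !node.type) issues.push(`node missing id or type: ${JSON.stringify(node)}`);
    if (nodeIds.has(node.id)) issues.push(`duplicate node id: ${node.id}`);
    nodeIds.add(node.id);
  }

  // Edges must point at existing nodes on both ends.
  for (const edge of graph.edges) {
    if (!nodeIds.has(edge.source) || !nodeIds.has(edge.target)) {
      issues.push(`dangling edge: ${edge.source} -> ${edge.target}`);
    }
  }

  // Nodes that no layer claims.
  const layered = new Set<string>();
  for (const layer of graph.layers) {
    for (const id of layer.nodeIds) layered.add(id);
  }
  for (const id of Array.from(nodeIds)) {
    if (!layered.has(id)) issues.push(`node not in any layer: ${id}`);
  }

  // Tour steps must reference existing nodes.
  for (const step of graph.tour) {
    for (const id of step.nodeIds) {
      if (!nodeIds.has(id)) issues.push(`tour step "${step.title}" references missing node: ${id}`);
    }
  }
  return issues;
}
```

Returning a list of messages instead of throwing on the first problem lets a caller apply every automated fix in one pass.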

### Phase 7: Save and Cleanup

The skill writes `knowledge-graph.json`, generates a baseline of structural fingerprints (required for correct future incremental updates; see the issue #152 comment in `SKILL.md`), writes `meta.json` with the current git hash, then removes all `intermediate/` and `tmp/` files. After a successful save, `/understand-dashboard` is auto-launched.

Sources: [understand-anything-plugin/skills/understand/SKILL.md:683-736](understand-anything-plugin/skills/understand/SKILL.md)

---

## The Dashboard: Rendering the Graph

The dashboard (`packages/dashboard`) is a **React + Vite SPA**. It has no knowledge of agents or Node.js pipeline internals. Its only coupling to the pipeline is the JSON file it fetches at startup.

### Load Path

```typescript
// understand-anything-plugin/packages/dashboard/src/App.tsx:133-163
useEffect(() => {
  fetch(dataUrl("knowledge-graph.json", accessToken))
    .then((res) => res.json())
    .then((data: unknown) => {
      const result = validateGraph(data);   // schema validation
      if (result.success && result.data) {
        setGraph(result.data);              // Zustand store
        ...
      }
    });
}, [setGraph]);
```

The dashboard fetches `knowledge-graph.json` (and optionally `domain-graph.json`, `diff-overlay.json`, `config.json`, `meta.json`) from the Vite dev server. A token gate (`TokenGate`) protects the endpoints — the Vite server generates a one-time token printed to the terminal. In demo mode (`VITE_DEMO_MODE=true`), the token gate is bypassed and URLs come from environment variables.

Sources: [understand-anything-plugin/packages/dashboard/src/App.tsx:49-163](understand-anything-plugin/packages/dashboard/src/App.tsx)

### Schema Validation on Load

Before the graph reaches any React component, it passes through `validateGraph()` from `@understand-anything/core/schema`. Auto-correctable issues (alias normalization, missing optional fields) are silently fixed and logged; fatal structural errors surface as a `WarningBanner` in the UI.

Sources: [understand-anything-plugin/packages/dashboard/src/App.tsx:136-152](understand-anything-plugin/packages/dashboard/src/App.tsx), [understand-anything-plugin/packages/core/src/schema.ts:17-60](understand-anything-plugin/packages/core/src/schema.ts)

### Zustand Store: Single State Owner

The dashboard's entire runtime state lives in a Zustand store (`store.ts`). It owns the loaded `KnowledgeGraph`, the selected node, filters, personas, view mode, and navigation level. All components derive their display from this store via selectors.

Two layer indexes are maintained in the store simultaneously and intentionally kept separate:

- `nodeIdToLayerId` — first-wins mapping, used for navigation (drillIntoLayer, sidebar history)
- `nodeIdToLayerIds` — all-layers set, used for filter queries (a node in multiple layers should survive if any selected layer matches)

Sources: [understand-anything-plugin/packages/dashboard/src/store.ts:54-95](understand-anything-plugin/packages/dashboard/src/store.ts)

### React Flow and the Graph Layout

`GraphView.tsx` uses `@xyflow/react` (React Flow) for the interactive canvas. It translates `GraphNode[]` and `GraphEdge[]` from the store into React Flow `Node` and `Edge` objects, runs an ELK layout pass, and renders four custom node types:

| React Flow node type | Represents |
|---|---|
| `custom` | Individual graph nodes (file, function, class, etc.) |
| `layer-cluster` | A collapsed architectural layer |
| `portal` | Cross-layer entry/exit point |
| `container` | An expanded layer showing all member nodes |

The `ViewMode` in the store switches between `"structural"` (the main dependency graph), `"domain"` (business domains via `DomainGraphView`), and `"knowledge"` (wiki/knowledge base via `KnowledgeGraphView`). The same `KnowledgeGraph` JSON is reinterpreted for each view mode.

Sources: [understand-anything-plugin/packages/dashboard/src/components/GraphView.tsx:1-60](understand-anything-plugin/packages/dashboard/src/components/GraphView.tsx), [understand-anything-plugin/packages/dashboard/src/store.ts:17](understand-anything-plugin/packages/dashboard/src/store.ts)

### Search Engine

The `SearchEngine` class in `@understand-anything/core/search` wraps Fuse.js with a multi-field weighted index over `GraphNode[]`. It uses an OR-token strategy: a query like `"auth controller"` becomes `"auth | controller"` so either token matches.

```typescript
// understand-anything-plugin/packages/core/src/search.ts:14-25
const FUSE_OPTIONS = {
  keys: [
    { name: "name", weight: 0.4 },
    { name: "tags", weight: 0.3 },
    { name: "summary", weight: 0.2 },
    { name: "languageNotes", weight: 0.1 },
  ],
  threshold: 0.4,
  ...
};
```

The store instantiates a `SearchEngine` whenever a new graph is loaded. Search results reference node IDs, which the store uses to highlight matching nodes in the graph view.
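
The OR-token rewrite described above can be sketched in a few lines. The helper name `toOrQuery` is hypothetical, and how the real `SearchEngine` tokenizes a query is not shown in this page; the sketch only illustrates the string transformation. In Fuse.js, the `|` operator takes effect when extended search is enabled.

```typescript
// Turn a whitespace-separated query into an OR expression: "a b" -> "a | b".
// Hypothetical helper; the real SearchEngine may tokenize differently.
function toOrQuery(query: string): string {
  return query
    .trim()
    .split(/\s+/)
    .filter((token) => token.length > 0)
    .join(" | ");
}
```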

Sources: [understand-anything-plugin/packages/core/src/search.ts:1-50](understand-anything-plugin/packages/core/src/search.ts), [understand-anything-plugin/packages/dashboard/src/store.ts:2](understand-anything-plugin/packages/dashboard/src/store.ts)

---

## Dependency Direction and Module Boundaries

The dependency graph flows strictly one way:

```text
understand-anything-plugin/
├── packages/core          ← shared types, schema, search, staleness (no UI, no Node.js in browser exports)
│     └── subpath exports: ./types  ./schema  ./search
├── packages/dashboard     ← imports core via subpath exports ONLY
│     (never imports core's main entry — avoids pulling Node.js modules into the browser)
└── src/                   ← skill TypeScript source (Node.js, imports core freely)

agents/                    ← LLM agent definitions (markdown), no TypeScript
skills/                    ← skill definitions (markdown), no TypeScript
```

The critical rule: the dashboard must only import from core's browser-safe subpath exports (`./types`, `./schema`, `./search`). The main entry point pulls in Node.js-only code (for example `execFileSync` from `child_process`, used in `staleness.ts`) that will break Vite's browser build.

Sources: [CLAUDE.md](CLAUDE.md), [understand-anything-plugin/packages/dashboard/src/store.ts:1-9](understand-anything-plugin/packages/dashboard/src/store.ts), [understand-anything-plugin/packages/dashboard/src/components/GraphView.tsx:28-32](understand-anything-plugin/packages/dashboard/src/components/GraphView.tsx)

---

## Incremental Update Invariant

The graph is git-aware by design. `meta.json` stores the `gitCommitHash` at analysis time. On every subsequent `/understand` invocation, the skill compares that hash to HEAD. If files changed, only those files are re-analyzed; the merge script then surgically removes and replaces their nodes and edges in the existing graph.

A fingerprints baseline (written in Phase 7 before `meta.json`) feeds the `change-classifier` during auto-update mode. **If `meta.json` is written before the fingerprints baseline**, the auto-updater sees a fresh commit hash with no baseline to compare against and escalates every future commit to a full rebuild (issue #152). The SKILL.md enforces this ordering explicitly.

Sources: [understand-anything-plugin/skills/understand/SKILL.md:685-706](understand-anything-plugin/skills/understand/SKILL.md), [understand-anything-plugin/packages/core/src/staleness.ts:44-55](understand-anything-plugin/packages/core/src/staleness.ts)

---

## What Breaks If the Design Changes

| Invariant | What breaks if violated |
|---|---|
| All agents write to `intermediate/`, never return data directly | The orchestrating skill's context window grows unbounded for large projects; retry logic breaks because there is no file to reuse |
| `knowledge-graph.json` is the single handoff | Dashboard has no way to load partial results; real-time agent streaming would require a protocol redesign |
| Dashboard imports only core subpath exports | Node.js-only code (such as `execFileSync` from `child_process`) is pulled into the Vite build and crashes in the browser |
| `meta.json` written after fingerprints baseline | Auto-update sees every file as structurally changed and runs a full rebuild on every commit |
| Node IDs follow `<type>:<path>` convention | Tour step and layer `nodeIds` arrays reference IDs that do not exist after incremental updates; the validator's dangling-ref checks catch this but it degrades the graph |
| Schema validation happens at dashboard load | Silent data corruption from buggy agents reaches the React Flow renderer and produces invisible nodes or crashes |

---

## Summary

The entire system is an expression of one idea: a sequential, file-mediated multi-agent pipeline produces one well-typed JSON artifact, and a stateless React dashboard renders it. The `KnowledgeGraph` schema, defined in `understand-anything-plugin/packages/core/src/types.ts` with 21 node types, 35 edge types, layers, and tour steps, is the contract that makes every phase independently testable and every component independently replaceable. Any new feature either enriches that JSON (add a field, a node type, or an edge category) or reads from it (add a dashboard panel, a filter, or a search mode).

Sources: [understand-anything-plugin/packages/core/src/types.ts:1-99](understand-anything-plugin/packages/core/src/types.ts)

---

## 02. Agent Pipeline: Five Phases from Scan to Graph

> The /understand skill orchestrates five sequential agent phases: project-scanner (file discovery), file-analyzer (per-file nodes), architecture-analyzer (cross-cutting edges), tour-builder (guided walkthroughs), and assemble-reviewer (graph assembly and validation). Agents write intermediate JSON to .understand-anything/intermediate/ to avoid polluting context; results are merged and cleaned up. Auto-update mode replays stale phases after a git commit via the PostToolUse hook.

- Page Markdown: https://grok-wiki.com/public/wiki/lum1104-understand-anything-3b923df96896/pages/02-agent-pipeline-five-phases-from-scan-to-graph.md
- Generated: 2026-05-22T00:55:02.600Z

### Source Files

- `understand-anything-plugin/agents/project-scanner.md`
- `understand-anything-plugin/agents/file-analyzer.md`
- `understand-anything-plugin/agents/architecture-analyzer.md`
- `understand-anything-plugin/agents/tour-builder.md`
- `understand-anything-plugin/agents/assemble-reviewer.md`
- `understand-anything-plugin/hooks/hooks.json`
- `understand-anything-plugin/skills/understand/SKILL.md`

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:

- [understand-anything-plugin/agents/project-scanner.md](understand-anything-plugin/agents/project-scanner.md)
- [understand-anything-plugin/agents/file-analyzer.md](understand-anything-plugin/agents/file-analyzer.md)
- [understand-anything-plugin/agents/architecture-analyzer.md](understand-anything-plugin/agents/architecture-analyzer.md)
- [understand-anything-plugin/agents/tour-builder.md](understand-anything-plugin/agents/tour-builder.md)
- [understand-anything-plugin/agents/assemble-reviewer.md](understand-anything-plugin/agents/assemble-reviewer.md)
- [understand-anything-plugin/hooks/hooks.json](understand-anything-plugin/hooks/hooks.json)
- [understand-anything-plugin/hooks/auto-update-prompt.md](understand-anything-plugin/hooks/auto-update-prompt.md)
- [understand-anything-plugin/skills/understand/SKILL.md](understand-anything-plugin/skills/understand/SKILL.md)
- [understand-anything-plugin/skills/understand/merge-batch-graphs.py](understand-anything-plugin/skills/understand/merge-batch-graphs.py)
</details>

# Agent Pipeline: Five Phases from Scan to Graph

The `/understand` skill drives a sequential, multi-agent pipeline that transforms a raw codebase into an interactive knowledge graph. Each phase is handled by a dedicated agent that writes its results as intermediate JSON to `.understand-anything/intermediate/`, keeping large data artifacts out of the orchestrating context window. A final merge script and two validation passes stitch all batch outputs into a single `knowledge-graph.json`.

Understanding this pipeline matters when diagnosing partial failures, reasoning about what gets regenerated on an incremental run, or extending the system with new analysis types. Every design choice in the pipeline — writing to disk, batching files, running agents in parallel — is motivated by a single invariant: the orchestrator never accumulates large result payloads itself.

---

## Overview: Phases and Output Files

```text
Phase 0    Pre-flight + config     (orchestrator, no agent)
Phase 0.5  Ignore config           (orchestrator, no agent)
Phase 1    SCAN                    → .understand-anything/intermediate/scan-result.json
Phase 2    ANALYZE (batched)       → .understand-anything/intermediate/batch-N.json
           merge-batch-graphs.py   → .understand-anything/intermediate/assembled-graph.json
Phase 3    ASSEMBLE REVIEW         → .understand-anything/intermediate/assemble-review.json
Phase 4    ARCHITECTURE            → .understand-anything/intermediate/layers.json
Phase 5    TOUR                    → .understand-anything/intermediate/tour.json
Phase 6    REVIEW + assemble       → .understand-anything/intermediate/assembled-graph.json
Phase 7    SAVE + clean up         → .understand-anything/knowledge-graph.json
                                   → .understand-anything/fingerprints.json
                                   → .understand-anything/meta.json
```

Sources: [understand-anything-plugin/skills/understand/SKILL.md:214-718](understand-anything-plugin/skills/understand/SKILL.md)

---

## Phase 1: project-scanner — File Discovery and Import Resolution

The `project-scanner` agent performs a two-phase scan. In **Phase 1** it writes and executes a deterministic Node.js (or Python) script that:

1. Discovers all tracked files via `git ls-files` (falling back to a recursive listing), then applies a multi-tier exclusion filter: hardcoded defaults (node_modules, lock files, binaries) overlaid with optional user patterns from `.understand-anything/.understandignore` via the `createIgnoreFilter` function from `@understand-anything/core`.
2. Classifies each file into one of seven `fileCategory` values: `code`, `config`, `docs`, `infra`, `data`, `script`, `markup`.
3. Detects languages (20+ extension mappings), frameworks (via manifest inspection of `package.json`, `Cargo.toml`, `go.mod`, `pyproject.toml`, etc.), and estimates project complexity (`small`/`moderate`/`large`/`very-large`).
4. Builds a full `importMap`: for every **code-category** file it resolves project-internal imports using language-appropriate patterns — including TypeScript path aliases from `tsconfig.json`, Python absolute imports, Go module paths, Rust `use crate::` — and writes each file's resolved import list as an array of project-relative paths. External packages are silently dropped.

In **Phase 2** the agent reads the script's JSON output and synthesizes a human-readable `description` field from `rawDescription` or the first 10 lines of `README.md`.

The final output is written to `intermediate/scan-result.json` and includes the complete `importMap`. The orchestrator stores this as `$IMPORT_MAP` and `$FILE_LIST` for use in Phase 2 batch construction.

Sources: [understand-anything-plugin/agents/project-scanner.md:1-365](understand-anything-plugin/agents/project-scanner.md)

---

## Phase 2: file-analyzer — Per-File Nodes and Edges

The orchestrator batches the file list into groups of **20–30 files** and dispatches up to **5 concurrent** `file-analyzer` subagents. Each agent receives:

- Its batch file list (with `path`, `language`, `sizeLines`, `fileCategory`).
- A `batchImportData` slice of `$IMPORT_MAP` covering only the files in that batch.
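
Batch construction can be pictured with the sketch below. It assumes a fixed batch size of 25 (inside the 20–30 range) and a hypothetical `buildBatches` helper; the `ScannedFile` and `Batch` shapes are illustrative, and the 5-way concurrency limit is not modeled.

```typescript
// Illustrative shapes only; field names follow the list above.
interface ScannedFile {
  path: string;
  language: string;
  sizeLines: number;
  fileCategory: string;
}

type ImportMap = Record<string, string[]>;

interface Batch {
  files: ScannedFile[];
  batchImportData: ImportMap;
}

// Split the file list into fixed-size batches and give each batch only
// the slice of the import map that covers its own files.
function buildBatches(
  files: ScannedFile[],
  importMap: ImportMap,
  batchSize = 25, // assumed value within the documented 20-30 range
): Batch[] {
  const batches: Batch[] = [];
  for (let i = 0; i < files.length; i += batchSize) {
    const slice = files.slice(i, i + batchSize);
    const batchImportData: ImportMap = {};
    for (const file of slice) {
      batchImportData[file.path] = importMap[file.path] ?? [];
    }
    batches.push({ files: slice, batchImportData });
  }
  return batches;
}
```

Slicing the import map per batch keeps each subagent's prompt proportional to its own files rather than to the whole project.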

Each `file-analyzer` also runs a two-phase workflow:

**Structural extraction (Phase 1):** Invokes the bundled `extract-structure.mjs` script, which uses `web-tree-sitter` (WASM) to extract functions, classes, exports, and call-graph entries from 10 languages. For non-code files it extracts sections, definitions, services (Docker), endpoints (OpenAPI), CI steps, and Terraform resources. Languages without tree-sitter support (Swift, Kotlin, shell scripts) fall back to regex matching for function signatures.

**Semantic analysis (Phase 2):** Uses the structured extraction as the basis for producing `GraphNode` and `GraphEdge` objects. Node types depend on `fileCategory`:

| fileCategory | Node type(s) |
|---|---|
| `code` | `file`, `function`, `class` |
| `config` | `config` |
| `docs` | `document` |
| `infra` | `service` / `pipeline` / `resource` |
| `data` | `table` / `schema` / `endpoint` |

For import edges, the agent follows a strict 1:1 rule: for every path in `batchImportData[file]`, it emits exactly one `imports` edge. It must not drop or aggregate imports; the orchestrator's merge script can recover missed ones but it is the agent's responsibility to emit all of them.
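
The 1:1 rule amounts to a direct mapping from import paths to edges. The sketch below is an assumption-laden illustration: `importEdgesFor` is a hypothetical helper, the `file:` ID prefix follows the `<type>:<path>` convention, and the default weight is a placeholder because the agent's actual weight for these edges is not stated here.

```typescript
interface ImportEdge {
  source: string;
  target: string;
  type: "imports";
  direction: "forward";
  weight: number;
}

// Emit exactly one `imports` edge per resolved path, with no dropping
// and no aggregation. The default weight is an assumed placeholder.
function importEdgesFor(
  filePath: string,
  batchImportData: Record<string, string[]>,
  weight = 0.7,
): ImportEdge[] {
  const imported = batchImportData[filePath] ?? [];
  return imported.map((target) => ({
    source: `file:${filePath}`,
    target: `file:${target}`,
    type: "imports" as const,
    direction: "forward" as const,
    weight,
  }));
}
```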

Each batch writes its output to `intermediate/batch-<N>.json`.

After all batches complete, the orchestrator runs:

```bash
python <SKILL_DIR>/merge-batch-graphs.py $PROJECT_ROOT
```

This script reads all `batch-*.json` files and then:

- normalizes node IDs (strips double prefixes, adds missing type prefixes, canonicalizes `func:` → `function:`);
- normalizes complexity values;
- rewrites dangling edge references;
- deduplicates nodes and edges;
- runs a `tested_by` linker that canonicalizes test-coverage edges via both LLM emission and path-convention pairing (e.g., `X.ts` ↔ `X.test.ts`, `src/main/java` ↔ `src/test/java`).

The result is `intermediate/assembled-graph.json`.
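
The ID normalization step can be illustrated with a small sketch. The real logic lives in the Python merge script and is not reproduced here; this TypeScript version uses a hypothetical `normalizeNodeId` helper and includes only the one prefix alias named above.

```typescript
// Only the alias mentioned in the text; the real script may know more.
const PREFIX_ALIASES = new Map<string, string>([["func", "function"]]);

function normalizeNodeId(rawId: string, nodeType: string): string {
  let id = rawId;
  // Canonicalize an aliased leading prefix, e.g. "func:" -> "function:".
  const firstColon = id.indexOf(":");
  if (firstColon > 0) {
    const canonical = PREFIX_ALIASES.get(id.slice(0, firstColon));
    if (canonical !== undefined) id = canonical + id.slice(firstColon);
  }
  // Strip a doubled prefix, e.g. "file:file:src/a.ts" -> "file:src/a.ts".
  const doubled = `${nodeType}:${nodeType}:`;
  while (id.startsWith(doubled)) id = id.slice(nodeType.length + 1);
  // Add the type prefix when it is missing entirely.
  if (!id.startsWith(`${nodeType}:`)) id = `${nodeType}:${id}`;
  return id;
}
```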

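Sources: [understand-anything-plugin/agents/file-analyzer.md:1-476](understand-anything-plugin/agents/file-analyzer.md), [understand-anything-plugin/skills/understand/SKILL.md:259-328](understand-anything-plugin/skills/understand/SKILL.md), [understand-anything-plugin/skills/understand/merge-batch-graphs.py:1-60](understand-anything-plugin/skills/understand/merge-batch-graphs.py)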

---

## Phase 3: assemble-reviewer — Semantic Review and Gap Recovery

The `assemble-reviewer` is the quality-control agent that handles what `merge-batch-graphs.py` could not fix mechanically. It receives the merge script's stderr report (which lists fixed items and unfixable items) and the full `$IMPORT_MAP`.

Its task has four steps:

1. **Sanity-check the "Fixed" section.** If >30% of nodes required ID correction, it flags this as a systemic upstream issue.
2. **Investigate "Could not fix" items.** For nodes missing an `id` field, it reconstructs the ID from `type`, `filePath`, and `name`. It remaps unknown node types (e.g., `"svc"` → `"service"`) and maps unknown complexity values to the nearest valid value.
3. **Check for cross-batch edge gaps.** For every import relationship in `$IMPORT_MAP`, it verifies a corresponding `imports` edge exists in the assembled graph. Missing edges backed by the import map are added with `weight: 0.7`; speculative edges are never added.
4. **Apply fixes in-place** to `assembled-graph.json` and write an `assemble-review.json` summary.
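
Step 3 is essentially a set difference between the import map and the existing `imports` edges. The following sketch shows that check under stated assumptions: `recoverMissingImportEdges` is a hypothetical name, node IDs are assumed to be `file:<path>`, and recovered edges are assumed to point forward.

```typescript
interface GraphEdge {
  source: string;
  target: string;
  type: string;
  direction: string;
  weight: number;
}

// Return the `imports` edges that the import map backs but the graph lacks.
function recoverMissingImportEdges(
  importMap: Record<string, string[]>,
  edges: GraphEdge[],
): GraphEdge[] {
  const existing = new Set(
    edges
      .filter((edge) => edge.type === "imports")
      .map((edge) => `${edge.source}->${edge.target}`),
  );
  const added: GraphEdge[] = [];
  for (const importer of Object.keys(importMap)) {
    for (const imported of importMap[importer] ?? []) {
      const source = `file:${importer}`;
      const target = `file:${imported}`;
      const key = `${source}->${target}`;
      if (existing.has(key)) continue; // already present, nothing to recover
      existing.add(key);               // guard against duplicate map entries
      added.push({ source, target, type: "imports", direction: "forward", weight: 0.7 });
    }
  }
  return added;
}
```

Because every recovered edge is derived from the import map, nothing speculative is introduced, which matches the rule stated in step 3.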

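Sources: [understand-anything-plugin/agents/assemble-reviewer.md:1-97](understand-anything-plugin/agents/assemble-reviewer.md), [understand-anything-plugin/skills/understand/SKILL.md:344-366](understand-anything-plugin/skills/understand/SKILL.md)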

---

## Phase 4: architecture-analyzer — Cross-Cutting Layer Assignment

The `architecture-analyzer` assigns every file-level node to exactly one of 3–10 architectural layers. Like the other agents, it runs a **two-phase** workflow.

**Structural analysis script (Phase 1):** Computes directory groupings, node-type distribution, import adjacency matrices (fan-in, fan-out per file), inter-group and intra-group import density, directory-name pattern matching (`routes` → `api`, `services` → `service`, etc.), deployment topology detection (Dockerfile/K8s/Terraform chain), data pipeline detection, and documentation coverage ratios.

**Semantic assignment (Phase 2):** Uses the script's output to select layers and assign every node. Non-code nodes follow type-based rules: `config` → Configuration layer, `document` → Documentation layer, `service`/`resource` → Infrastructure layer, `pipeline` → CI/CD layer, `table`/`schema`/`endpoint` → Data layer. For incremental runs, previous layer definitions are injected to enforce naming consistency.

The orchestrator normalizes the output (renames `nodes` → `nodeIds`, synthesizes missing `id` fields, drops dangling refs) before writing the final layers array.
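
A rough sketch of that normalization follows. The function names and the `layer:<slug>` format for synthesized IDs are assumptions made for illustration; the orchestrator's real ID scheme is not specified in this page.

```typescript
interface RawLayer {
  id?: string;
  name: string;
  nodes?: string[];   // legacy key an agent may emit
  nodeIds?: string[]; // canonical key
}

interface NormalizedLayer {
  id: string;
  name: string;
  nodeIds: string[];
}

// Hypothetical slug helper: "API Layer" -> "api-layer".
function slugify(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

function normalizeLayers(raw: RawLayer[], knownNodeIds: Set<string>): NormalizedLayer[] {
  return raw.map((layer) => {
    const ids = layer.nodeIds ?? layer.nodes ?? []; // rename nodes -> nodeIds
    return {
      id: layer.id ?? `layer:${slugify(layer.name)}`, // synthesize a missing id (assumed format)
      name: layer.name,
      nodeIds: ids.filter((nodeId) => knownNodeIds.has(nodeId)), // drop dangling refs
    };
  });
}
```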

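Sources: [understand-anything-plugin/agents/architecture-analyzer.md:1-482](understand-anything-plugin/agents/architecture-analyzer.md), [understand-anything-plugin/skills/understand/SKILL.md:369-437](understand-anything-plugin/skills/understand/SKILL.md)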

---

## Phase 5: tour-builder — Guided Learning Walkthroughs

The `tour-builder` designs 5–15 pedagogical tour steps through the knowledge graph.

**Graph topology script (Phase 1):** Scores entry-point candidates (filename patterns, fan-out, fan-in), runs BFS from the top code entry point following `imports`/`calls` edges to produce a depth-ordered traversal, identifies tightly coupled clusters (mutual bidirectional edges), and groups non-code files (documentation, infrastructure, data, config) for scheduled tour inclusion.

**Tour design (Phase 2):** Uses BFS depth to structure tour order — depth 0 is the entry point, depth 1 covers direct dependencies, depth 2+ covers feature modules, and non-code stops (Dockerfile, CI config, SQL schema) anchor the end of the tour. Steps are connected narratively: each description explicitly references earlier steps to build a coherent mental model. Optional `languageLesson` fields attach language-specific educational notes (Docker multi-stage builds, TypeScript barrel patterns, SQL migration ordering, etc.).

The orchestrator normalizes the tour output (renames `nodesToInspect` → `nodeIds`, converts bare file paths to `file:` prefixed IDs, sorts by `order`).
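
The depth-ordered traversal that drives tour order is a standard breadth-first search restricted to `imports` and `calls` edges. The sketch below assumes a hypothetical `bfsDepths` helper and a minimal edge shape; entry-point scoring and cluster detection are left out.

```typescript
interface TraversalEdge {
  source: string;
  target: string;
  type: string;
}

// Breadth-first search from the entry point, following only
// `imports` and `calls` edges. Returns the depth of each reached node.
function bfsDepths(entry: string, edges: TraversalEdge[]): Map<string, number> {
  const adjacency = new Map<string, string[]>();
  for (const edge of edges) {
    if (edge.type !== "imports" && edge.type !== "calls") continue;
    const targets = adjacency.get(edge.source) ?? [];
    targets.push(edge.target);
    adjacency.set(edge.source, targets);
  }

  const depths = new Map<string, number>([[entry, 0]]);
  const queue: string[] = [entry];
  for (let head = 0; head < queue.length; head++) {
    const current = queue[head];
    if (current === undefined) break;
    const depth = depths.get(current) ?? 0;
    for (const next of adjacency.get(current) ?? []) {
      if (depths.has(next)) continue; // already visited, which also handles cycles
      depths.set(next, depth + 1);
      queue.push(next);
    }
  }
  return depths;
}
```

Depth 0 is the entry point, depth 1 its direct dependencies, and deeper levels correspond to the feature modules described above.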

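Sources: [understand-anything-plugin/agents/tour-builder.md:1-379](understand-anything-plugin/agents/tour-builder.md), [understand-anything-plugin/skills/understand/SKILL.md:452-517](understand-anything-plugin/skills/understand/SKILL.md)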

---

## Phase 6–7: Review, Save, and Cleanup

Phase 6 assembles the full `KnowledgeGraph` object (`version`, `project`, `nodes`, `edges`, `layers`, `tour`) and validates it. The default path runs an inline deterministic Node.js validator that checks: all nodes have required fields, edge sources/targets exist, every file-level node appears in exactly one layer, and no dangling tour step references exist. The `--review` flag substitutes the full LLM `graph-reviewer` agent.
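
The four deterministic checks can be approximated as below. This is a sketch rather than the inline validator itself: the `Check*` shapes and the `checkGraph` name are hypothetical, and "file-level node" is interpreted here as a node whose `type` is `file`.

```typescript
// Hypothetical minimal shapes; the real KnowledgeGraph has more fields.
interface CheckNode { id: string; type: string; name: string; summary: string; }
interface CheckEdge { source: string; target: string; }
interface CheckGraph {
  nodes: CheckNode[];
  edges: CheckEdge[];
  layers: { id: string; nodeIds: string[] }[];
  tour: { order: number; nodeIds: string[] }[];
}

function checkGraph(graph: CheckGraph): string[] {
  const problems: string[] = [];
  const nodeIds = new Set(graph.nodes.map((node) => node.id));

  // 1. Required fields are present and non-empty.
  for (const node of graph.nodes) {
    if (!node.id || !node.type || !node.name || !node.summary) {
      problems.push(`node missing required field: ${node.id || "(no id)"}`);
    }
  }
  // 2. Every edge endpoint resolves to a node.
  for (const edge of graph.edges) {
    if (!nodeIds.has(edge.source)) problems.push(`dangling edge source: ${edge.source}`);
    if (!nodeIds.has(edge.target)) problems.push(`dangling edge target: ${edge.target}`);
  }
  // 3. Every file node sits in exactly one layer (assumed meaning of "file-level").
  const layerCount = new Map<string, number>();
  for (const layer of graph.layers) {
    for (const id of layer.nodeIds) layerCount.set(id, (layerCount.get(id) ?? 0) + 1);
  }
  for (const node of graph.nodes) {
    const count = layerCount.get(node.id) ?? 0;
    if (node.type === "file" && count !== 1) {
      problems.push(`file node in ${count} layers: ${node.id}`);
    }
  }
  // 4. Tour steps reference existing nodes only.
  for (const step of graph.tour) {
    for (const id of step.nodeIds) {
      if (!nodeIds.has(id)) problems.push(`dangling tour ref: ${id}`);
    }
  }
  return problems;
}
```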

Phase 7 writes three artifacts and then cleans up:

1. `knowledge-graph.json` — the final graph.
2. `fingerprints.json` — structural fingerprints (SHA-256 content hash + extracted functions/classes/imports/exports) for every source file, generated via the bundled `build-fingerprints.mjs` script using tree-sitter for precision. **This must succeed before `meta.json` is written** — a failure here causes auto-update to classify every subsequent commit as a full structural change (issue #152).
3. `meta.json` — `gitCommitHash`, `lastAnalyzedAt`, `analyzedFiles`.
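
The ordering constraint between `fingerprints.json` and `meta.json` can be expressed as a simple sequential write where a failure aborts before the later file. This is a minimal sketch: `saveArtifacts` and the injected `WriteFn` are hypothetical, and the real skill writes these files through its own scripts.

```typescript
// Injected writer so the ordering can be exercised without touching disk.
type WriteFn = (fileName: string, contents: string) => void;

function saveArtifacts(
  write: WriteFn,
  graphJson: string,
  fingerprintsJson: string,
  metaJson: string,
): void {
  write("knowledge-graph.json", graphJson);
  // Fingerprints must land before meta.json. If this call throws,
  // execution stops here and meta.json is never written.
  write("fingerprints.json", fingerprintsJson);
  write("meta.json", metaJson);
}
```

Writing `meta.json` last means a stored commit hash always implies that a fingerprint baseline exists for it.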

Intermediate and tmp directories are deleted:

```bash
rm -rf $PROJECT_ROOT/.understand-anything/intermediate
rm -rf $PROJECT_ROOT/.understand-anything/tmp
```

Sources: [understand-anything-plugin/skills/understand/SKILL.md:522-735](understand-anything-plugin/skills/understand/SKILL.md)

---

## Auto-Update: Replaying Stale Phases After a Commit

When `autoUpdate: true` is stored in `.understand-anything/config.json`, two hooks fire to detect and update a stale graph.

**PostToolUse hook** fires after any `Bash` command matching a git commit/merge/cherry-pick/rebase pattern and injects an instruction into the conversation to run the auto-update prompt:

```json
{
  "matcher": "Bash",
  "hooks": [{
    "type": "command",
    "command": "printf '%s' \"$TOOL_INPUT\" | grep -qE 'git\\s+(commit|merge|cherry-pick|rebase)' && [ -f .understand-anything/config.json ] && grep -q '\"autoUpdate\".*true' .understand-anything/config.json && [ -f .understand-anything/knowledge-graph.json ] && echo \"[understand-anything] Commit detected with auto-update enabled...\""
  }]
}
```

**SessionStart hook** detects commit hash drift by comparing `meta.json`'s stored `gitCommitHash` against `git rev-parse HEAD` and fires the same auto-update instruction if they diverge.

Sources: [understand-anything-plugin/hooks/hooks.json:1-25](understand-anything-plugin/hooks/hooks.json)

The auto-update workflow (`hooks/auto-update-prompt.md`) is designed to minimize token cost:

| Phase | Cost |
|---|---|
| Phase 0 — Pre-flight, `.understandignore` filtering | Zero tokens |
| Phase 1 — Structural fingerprint check (script) | Zero tokens |
| Phase 2 — Targeted re-analysis (LLM agents) | Tokens only for structurally changed files |
| Phase 3 — Conditional architecture/tour + save | Tokens only if `ARCHITECTURE_UPDATE` |

The fingerprint check classifies each changed source file as `NONE`, `COSMETIC`, or `STRUCTURAL` by comparing SHA-256 content hashes and regex-extracted function/class/import signatures against `fingerprints.json`. The decision gate produces four actions:

| Action | Trigger | Behavior |
|---|---|---|
| `SKIP` | All changes cosmetic | Update `meta.json`, zero tokens |
| `PARTIAL_UPDATE` | ≤10 structural files, same dirs | Re-analyze changed files only |
| `ARCHITECTURE_UPDATE` | New/deleted dirs or >10 structural files | Re-analyze + re-run architecture phase |
| `FULL_UPDATE` | >30 structural files or >50% of graph | Recommend `/understand --full`, stop |
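
One way to read the decision gate is as an ordered series of threshold checks. The sketch below is an interpretation, not the prompt's actual logic: it assumes the most severe action wins when triggers overlap, treats ">50% of graph" as a share of files in the graph, and uses hypothetical names (`decideAction`, `ChangeSummary`).

```typescript
type UpdateAction = "SKIP" | "PARTIAL_UPDATE" | "ARCHITECTURE_UPDATE" | "FULL_UPDATE";

interface ChangeSummary {
  structuralFiles: number;     // files classified STRUCTURAL
  totalGraphFiles: number;     // files currently represented in the graph
  dirsAddedOrRemoved: boolean; // a directory appeared or disappeared
}

function decideAction(summary: ChangeSummary): UpdateAction {
  // Nothing structural changed: only NONE or COSMETIC classifications.
  if (summary.structuralFiles === 0) return "SKIP";
  const share =
    summary.totalGraphFiles > 0 ? summary.structuralFiles / summary.totalGraphFiles : 1;
  // Assumed precedence: check the most severe action first.
  if (summary.structuralFiles > 30 || share > 0.5) return "FULL_UPDATE";
  if (summary.dirsAddedOrRemoved || summary.structuralFiles > 10) return "ARCHITECTURE_UPDATE";
  return "PARTIAL_UPDATE";
}
```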

After re-analysis, fingerprints are updated using a load-patch-save pattern: all existing entries are loaded first, then only the re-analyzed entries are patched in-place, and the full dict is saved back. Overwriting only the batch subset would discard all other files' fingerprints and cause permanent `FULL_UPDATE` escalation on every subsequent commit.
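
The load-patch-save pattern reduces to merging the re-analyzed entries over the full existing set before saving. The sketch below assumes a hypothetical `Fingerprint` shape and `patchFingerprints` helper, and it returns a merged copy instead of mutating the loaded object.

```typescript
// Hypothetical fingerprint shape; the real file stores more detail.
interface Fingerprint {
  contentHash: string;
  functions: string[];
  classes: string[];
}

type FingerprintStore = Record<string, Fingerprint>;

// Start from every existing entry, then overlay only the re-analyzed ones.
// Saving `reanalyzed` alone would discard all other files' fingerprints.
function patchFingerprints(
  existing: FingerprintStore,
  reanalyzed: FingerprintStore,
): FingerprintStore {
  return { ...existing, ...reanalyzed };
}
```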

Sources: [understand-anything-plugin/hooks/auto-update-prompt.md:1-321](understand-anything-plugin/hooks/auto-update-prompt.md)

---

## Summary

The five-agent pipeline (project-scanner → file-analyzer × N → assemble-reviewer → architecture-analyzer → tour-builder) is held together by a single invariant: agents write to `.understand-anything/intermediate/` rather than returning large payloads to the orchestrator. The merge script and inline validator provide two deterministic passes that recover mechanical errors before the final knowledge graph is persisted. Auto-update extends this pipeline into an incremental mode, spending LLM tokens only on structurally changed files while using zero-token fingerprint comparison to gate the decision. That gate depends on the tree-sitter-generated `fingerprints.json` baseline written at the end of every full analysis run.

Sources: [understand-anything-plugin/skills/understand/SKILL.md:683-735](understand-anything-plugin/skills/understand/SKILL.md)

---

## 03. Schema & Type Contracts: Nodes, Edges, and Aliases

> The knowledge graph is defined by two Zod schemas: NodeTypeSchema (21 canonical node types such as file, function, class, domain, article) and EdgeTypeSchema (35 edge types across 8 categories: structural, behavioral, data-flow, dependencies, semantic, infrastructure, domain, knowledge). Alias maps (NODE_TYPE_ALIASES, EDGE_TYPE_ALIASES) normalize LLM-generated variants to canonical forms at assembly time. This schema is the contract between the agent pipeline and the dashboard — both sides import from @understand-anything/core/types and @understand-anything/core/schema.

- Page Markdown: https://grok-wiki.com/public/wiki/lum1104-understand-anything-3b923df96896/pages/03-schema-type-contracts-nodes-edges-and-aliases.md
- Generated: 2026-05-22T00:55:14.293Z

### Source Files

- `understand-anything-plugin/packages/core/src/schema.ts`
- `understand-anything-plugin/packages/core/src/types.test.ts`
- `understand-anything-plugin/packages/dashboard/src/store.ts`
- `understand-anything-plugin/packages/core/package.json`

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:

- [understand-anything-plugin/packages/core/src/schema.ts](understand-anything-plugin/packages/core/src/schema.ts)
- [understand-anything-plugin/packages/core/src/types.ts](understand-anything-plugin/packages/core/src/types.ts)
- [understand-anything-plugin/packages/core/src/types.test.ts](understand-anything-plugin/packages/core/src/types.test.ts)
- [understand-anything-plugin/packages/core/package.json](understand-anything-plugin/packages/core/package.json)
- [understand-anything-plugin/packages/dashboard/src/store.ts](understand-anything-plugin/packages/dashboard/src/store.ts)
</details>

# Schema & Type Contracts: Nodes, Edges, and Aliases

The knowledge graph produced by Understand Anything is not a free-form JSON blob — it is a strictly typed structure governed by two complementary layers: TypeScript interfaces in `types.ts` (for compile-time safety) and Zod schemas in `schema.ts` (for runtime validation at assembly time). Every node and every edge must conform to a finite set of canonical string values. Because the graph is assembled from LLM output, a normalization layer — the alias maps — sits between raw agent output and the validated result, transparently folding common LLM variants back into canonical forms before validation runs.

Both the agent pipeline and the dashboard share these contracts by importing from the `@understand-anything/core/types` and `@understand-anything/core/schema` subpath exports. Neither side owns a private copy of the type definitions; changes to the schema are immediately visible to both.

---

## Node Types

### Canonical Set (21 types)

Nodes represent any meaningful concept in the analyzed project. The 21 canonical node types span four semantic groups:

| Group | Types | Count |
|---|---|---|
| **Code** | `file`, `function`, `class`, `module`, `concept` | 5 |
| **Non-code** | `config`, `document`, `service`, `table`, `endpoint`, `pipeline`, `schema`, `resource` | 8 |
| **Domain** | `domain`, `flow`, `step` | 3 |
| **Knowledge** | `article`, `entity`, `topic`, `claim`, `source` | 5 |

The TypeScript union is defined in `types.ts`:

```typescript
// understand-anything-plugin/packages/core/src/types.ts:2-7
export type NodeType =
  | "file" | "function" | "class" | "module" | "concept"
  | "config" | "document" | "service" | "table" | "endpoint"
  | "pipeline" | "schema" | "resource"
  | "domain" | "flow" | "step"
  | "article" | "entity" | "topic" | "claim" | "source";
```

The Zod schema mirrors it exactly as a `z.enum` in `schema.ts`, used inside `GraphNodeSchema`:

```typescript
// understand-anything-plugin/packages/core/src/schema.ts:370-376
type: z.enum([
  "file", "function", "class", "module", "concept",
  "config", "document", "service", "table", "endpoint",
  "pipeline", "schema", "resource",
  "domain", "flow", "step",
  "article", "entity", "topic", "claim", "source",
]),
```

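Sources: [understand-anything-plugin/packages/core/src/types.ts:1-7](understand-anything-plugin/packages/core/src/types.ts), [understand-anything-plugin/packages/core/src/schema.ts:370-376](understand-anything-plugin/packages/core/src/schema.ts)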

### GraphNode Fields

Every node carries a fixed set of fields:

| Field | Type | Required | Notes |
|---|---|---|---|
| `id` | `string` | ✓ | Unique identifier; edges reference by id |
| `type` | `NodeType` | ✓ | One of 21 canonical values |
| `name` | `string` | ✓ | Human-readable label |
| `summary` | `string` | ✓ | LLM-generated description |
| `tags` | `string[]` | ✓ | Defaulted to `[]` if missing |
| `complexity` | `"simple" \| "moderate" \| "complex"` | ✓ | Defaulted to `"moderate"` if missing |
| `filePath` | `string` | optional | Path in the analyzed repo |
| `lineRange` | `[number, number]` | optional | Start/end line numbers |
| `languageNotes` | `string` | optional | Language-specific remarks |
| `domainMeta` | `DomainMeta` | optional | For `domain`/`flow`/`step` nodes |
| `knowledgeMeta` | `KnowledgeMeta` | optional | For `article`/`entity`/`topic`/`claim`/`source` nodes |

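Sources: [understand-anything-plugin/packages/core/src/types.ts:39-51](understand-anything-plugin/packages/core/src/types.ts), [understand-anything-plugin/packages/core/src/schema.ts:368-386](understand-anything-plugin/packages/core/src/schema.ts)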

### Extended Metadata

Two optional metadata bags extend the core node structure for specialized node groups:

**`DomainMeta`** (for `domain`, `flow`, `step` nodes):
```typescript
// understand-anything-plugin/packages/core/src/types.ts:30-36
export interface DomainMeta {
  entities?: string[];
  businessRules?: string[];
  crossDomainInteractions?: string[];
  entryPoint?: string;
  entryType?: "http" | "cli" | "event" | "cron" | "manual";
}
```

**`KnowledgeMeta`** (for `article`, `entity`, `topic`, `claim`, `source` nodes):
```typescript
// understand-anything-plugin/packages/core/src/types.ts:22-27
export interface KnowledgeMeta {
  wikilinks?: string[];
  backlinks?: string[];
  category?: string;
  content?: string;
}
```

Both schemas use `.passthrough()` in Zod, allowing agents to include extra fields without causing validation failures.

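Sources: [understand-anything-plugin/packages/core/src/types.ts:22-36](understand-anything-plugin/packages/core/src/types.ts), [understand-anything-plugin/packages/core/src/schema.ts:353-386](understand-anything-plugin/packages/core/src/schema.ts)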

---

## Edge Types

### Canonical Set (35 types across 8 categories)

The `EdgeTypeSchema` is a `z.enum` of 35 values, grouped by semantic category. The same grouping is reflected in the dashboard's `EDGE_CATEGORY_MAP`:

| Category | Edge Types |
|---|---|
| **Structural** | `imports`, `exports`, `contains`, `inherits`, `implements` |
| **Behavioral** | `calls`, `subscribes`, `publishes`, `middleware` |
| **Data flow** | `reads_from`, `writes_to`, `transforms`, `validates` |
| **Dependencies** | `depends_on`, `tested_by`, `configures` |
| **Semantic** | `related`, `similar_to` |
| **Infrastructure** | `deploys`, `serves`, `provisions`, `triggers`, `migrates`, `documents`, `routes`, `defines_schema` |
| **Domain** | `contains_flow`, `flow_step`, `cross_domain` |
| **Knowledge** | `cites`, `contradicts`, `builds_on`, `exemplifies`, `categorized_under`, `authored_by` |

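Sources: [understand-anything-plugin/packages/core/src/schema.ts:4-14](understand-anything-plugin/packages/core/src/schema.ts), [understand-anything-plugin/packages/dashboard/src/store.ts:31-40](understand-anything-plugin/packages/dashboard/src/store.ts)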

### GraphEdge Fields

| Field | Type | Required | Notes |
|---|---|---|---|
| `source` | `string` | ✓ | Node id; validated against the node set |
| `target` | `string` | ✓ | Node id; validated against the node set |
| `type` | `EdgeType` | ✓ | One of 35 canonical values; defaulted to `"depends_on"` |
| `direction` | `"forward" \| "backward" \| "bidirectional"` | ✓ | Defaulted to `"forward"` |
| `weight` | `number` | ✓ | Float in `[0, 1]`; defaulted to `0.5`, clamped if out of range |
| `description` | `string` | optional | Human-readable label for the edge |
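
The defaults and clamping listed in the table can be sketched without Zod as a plain function. The names `RawEdge`, `FilledEdge`, and `applyEdgeDefaults` are hypothetical, and the sketch only handles missing values; how the real schema treats unrecognized types is not covered.

```typescript
type Direction = "forward" | "backward" | "bidirectional";

interface RawEdge {
  source: string;
  target: string;
  type?: string;
  direction?: Direction;
  weight?: number;
  description?: string;
}

interface FilledEdge extends RawEdge {
  type: string;
  direction: Direction;
  weight: number;
}

function applyEdgeDefaults(raw: RawEdge): FilledEdge {
  // Fall back to 0.5 when weight is absent or not a finite number (assumption).
  const weight =
    typeof raw.weight === "number" && isFinite(raw.weight) ? raw.weight : 0.5;
  return {
    ...raw,
    type: raw.type ?? "depends_on",
    direction: raw.direction ?? "forward",
    weight: Math.min(1, Math.max(0, weight)), // clamp into [0, 1]
  };
}
```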

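Sources: [understand-anything-plugin/packages/core/src/types.ts:54-61](understand-anything-plugin/packages/core/src/types.ts), [understand-anything-plugin/packages/core/src/schema.ts:388-395](understand-anything-plugin/packages/core/src/schema.ts)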

---

## Alias Maps

LLMs frequently produce variant spellings for both node and edge types (`func` instead of `function`, `extends` instead of `inherits`). The alias maps translate these variants to canonical forms **before** Zod validation runs, preventing unnecessary validation failures without weakening the schema.

### NODE_TYPE_ALIASES

```typescript
// understand-anything-plugin/packages/core/src/schema.ts:17-75
export const NODE_TYPE_ALIASES: Record<string, string> = {
  func: "function",
  fn: "function",
  method: "function",
  interface: "class",
  struct: "class",
  mod: "module",
  pkg: "module",
  package: "module",
  container: "service",
  deployment: "service",
  pod: "service",
  // ... (46 entries total)
};
```

Notable design decisions captured in comments:
- `"process"` is **intentionally excluded** from domain aliases — it is ambiguous with the OS/Node.js `process` global.

### EDGE_TYPE_ALIASES

```typescript
// understand-anything-plugin/packages/core/src/schema.ts:78-125
export const EDGE_TYPE_ALIASES: Record<string, string> = {
  extends: "inherits",
  invokes: "calls",
  uses: "depends_on",
  requires: "depends_on",
  // ... (37 entries total)
};
```

A critical invariant is documented with a comment:
```typescript
// Note: "implemented_by" is intentionally NOT aliased to "implements" —
// it inverts edge direction (see commit fd0df15). The LLM should use
// "implements" with correct source/target instead.
```

This means aliasing is not purely cosmetic — it preserves directed semantics. Reversing source/target via an alias would silently corrupt graph meaning.

Sources: [understand-anything-plugin/packages/core/src/schema.ts:17-75](), [understand-anything-plugin/packages/core/src/schema.ts:78-125]()
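
To make the translation step concrete, here is a minimal sketch of alias lookup. It uses only the four edge aliases shown above; `SKETCH_EDGE_ALIASES` and `canonicalEdgeType` are hypothetical names, and the lowercasing is folded into the helper purely to keep the example self-contained.

```typescript
// A subset of the aliases listed above, held in a Map so that keys such as
// "constructor" cannot accidentally hit Object.prototype.
const SKETCH_EDGE_ALIASES = new Map<string, string>([
  ["extends", "inherits"],
  ["invokes", "calls"],
  ["uses", "depends_on"],
  ["requires", "depends_on"],
]);

function canonicalEdgeType(raw: string): string {
  const key = raw.trim().toLowerCase();
  // Unknown strings pass through unchanged; later validation decides their fate.
  return SKETCH_EDGE_ALIASES.get(key) ?? key;
}
```

Note that `implemented_by` is deliberately absent, so it passes through untouched instead of being rewritten to `implements` with the wrong direction.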

### Additional Alias Maps

Two further alias maps cover other enum-like fields:

- **`COMPLEXITY_ALIASES`**: maps `low→simple`, `medium→moderate`, `high→complex`, etc.
- **`DIRECTION_ALIASES`**: maps `to→forward`, `from→backward`, `both→bidirectional`, etc.

Sources: [understand-anything-plugin/packages/core/src/schema.ts:128-146]()

---

## Validation Pipeline

The `validateGraph` function in `schema.ts` applies four tiers of processing in sequence, with an alias-normalization pass (`normalizeGraph()`) between Tier 1 and Tier 2, before returning a `ValidationResult`:

```text
Raw LLM JSON
     │
     ▼
Tier 1: sanitizeGraph()
  – null → undefined for optional fields
  – lowercase all enum strings
     │
     ▼
normalizeGraph()
  – NODE_TYPE_ALIASES applied to node.type
  – EDGE_TYPE_ALIASES applied to edge.type
     │
     ▼
Tier 2: autoFixGraph()
  – Missing type → default ("file" for nodes, "depends_on" for edges)
  – Missing complexity → "moderate"
  – Missing tags → []
  – Missing direction → "forward"
  – Missing weight → 0.5; string weight coerced; out-of-range clamped to [0,1]
     │
     ▼
Tier 3: Per-element Zod parse
  – Invalid nodes → dropped (with GraphIssue at level "dropped")
  – Invalid edges → dropped
  – Edge referential integrity: source and target must be valid node ids
  – Layers and tour steps: drop invalid, filter dangling nodeIds
     │
     ▼
Tier 4: Fatal checks
  – Not an object → fatal
  – Collections not arrays → fatal
  – No valid project metadata → fatal
  – Zero valid nodes → fatal
     │
     ▼
ValidationResult { success, data, issues, fatal? }
```

The `GraphIssue` type carries a `level` (`"auto-corrected"` | `"dropped"` | `"fatal"`), a `category`, and a human-readable `message`. The dashboard's `WarningBanner` component surfaces these issues to the user alongside layout-time issues from ELK.

Sources: [understand-anything-plugin/packages/core/src/schema.ts:148-663](), [understand-anything-plugin/packages/dashboard/src/store.ts:236-239]()
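
The referential-integrity part of Tier 3 can be illustrated without Zod. The sketch below is an assumption-laden simplification: `dropDanglingEdges`, the three small interfaces, and the `"referential-integrity"` category string are hypothetical, chosen only to show how dropped elements and their issues could be collected side by side.

```typescript
interface SketchIssue {
  level: "auto-corrected" | "dropped" | "fatal";
  category: string;
  message: string;
}

interface IdNode {
  id: string;
}

interface LinkEdge {
  source: string;
  target: string;
}

function dropDanglingEdges(
  nodes: IdNode[],
  edges: LinkEdge[],
): { edges: LinkEdge[]; issues: SketchIssue[] } {
  const validIds = new Set(nodes.map((n) => n.id));
  const issues: SketchIssue[] = [];

  // Keep an edge only when both endpoints refer to surviving nodes.
  const kept = edges.filter((edge) => {
    if (validIds.has(edge.source) && validIds.has(edge.target)) return true;
    issues.push({
      level: "dropped",
      category: "referential-integrity", // hypothetical category name
      message: `Edge ${edge.source} -> ${edge.target} references a missing node`,
    });
    return false;
  });

  return { edges: kept, issues };
}
```

Returning the issues alongside the cleaned data, instead of throwing, is what lets a UI report every dropped element in one pass.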

---

## The Contract Between Pipeline and Dashboard

Both sides of the system share the same types through the core package's subpath exports:

```json
// understand-anything-plugin/packages/core/package.json:6-23
"exports": {
  "./types": {
    "types": "./dist/types.d.ts",
    "default": "./dist/types.js"
  },
  "./schema": {
    "types": "./dist/schema.d.ts",
    "default": "./dist/schema.js"
  }
}
```

The dashboard imports only the browser-safe subpaths — never the main entry point, which would pull in Node.js modules:

```typescript
// understand-anything-plugin/packages/dashboard/src/store.ts:4-9
import type { GraphIssue } from "@understand-anything/core/schema";
import type {
  GraphNode,
  KnowledgeGraph,
  TourStep,
} from "@understand-anything/core/types";
```

The dashboard also re-declares the `NodeType` union and `EDGE_CATEGORY_MAP` directly in `store.ts` to keep the store self-contained for filtering logic, but these are structurally identical to what lives in `types.ts`.

Sources: [understand-anything-plugin/packages/core/package.json:6-23](), [understand-anything-plugin/packages/dashboard/src/store.ts:1-39]()

---

## What Would Break If the Schema Changed

| Change | Impact |
|---|---|
| Add a new `NodeType` value | Must be added to both `NodeType` union and `GraphNodeSchema` enum; dashboard `ALL_NODE_TYPES` and `NodeType` type must also update |
| Remove a `NodeType` value | Existing graphs with the removed type will fail Zod parse; nodes get dropped unless an alias covers the old value |
| Add a new `EdgeType` | Must be added to `EdgeTypeSchema` enum, `types.ts` union, and `EDGE_CATEGORY_MAP` in `store.ts` |
| Add an alias | Safe — aliases are applied before validation; no schema change required |
| Alias `implemented_by` → `implements` | Direction inversion bug: edges pointing the wrong way in the graph |
| Change `weight` range | `autoFixGraph` clamps values; consumers assume `[0, 1]` |
| Make `filePath` required | Breaks non-code nodes (domain, knowledge, config) that have no file path |

The closing invariant that holds the system together: **the canonical enum values in `EdgeTypeSchema` / `GraphNodeSchema` are the contract**. The alias maps are a translation layer, not an expansion of the contract. Any string not in the canonical set and not in an alias map will be rejected at Tier 3 and the element dropped, not silently ignored.

Sources: [understand-anything-plugin/packages/core/src/schema.ts:462-497](), [understand-anything-plugin/packages/core/src/schema.ts:543-609]()

---

## 04. Static Analysis: Tree-Sitter Extractors & Parsers

> Two plugin families produce deterministic graph nodes without LLM calls. Language extractors (TypeScript, Python, Go, Java, Rust, C++, Ruby, C#, PHP) use web-tree-sitter (WASM) via tree-sitter-plugin.ts to parse ASTs and emit function/class/module nodes. Config parsers (JSON, YAML, TOML, SQL, GraphQL, Dockerfile, Protobuf, Makefile, shell, Markdown, Terraform, .env) extract config/schema/document nodes. The plugin registry in registry.ts and discovery.ts wires both families together. The WASM constraint — no native bindings — is a hard invariant: never swap in the native tree-sitter package.

- Page Markdown: https://grok-wiki.com/public/wiki/lum1104-understand-anything-3b923df96896/pages/04-static-analysis-tree-sitter-extractors-parsers.md
- Generated: 2026-05-22T00:55:50.918Z

### Source Files

- `understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.ts`
- `understand-anything-plugin/packages/core/src/plugins/registry.ts`
- `understand-anything-plugin/packages/core/src/plugins/discovery.ts`
- `understand-anything-plugin/packages/core/src/plugins/extractors/base-extractor.ts`
- `understand-anything-plugin/packages/core/src/plugins/extractors/typescript-extractor.ts`
- `understand-anything-plugin/packages/core/src/plugins/parsers/index.ts`
- `understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.test.ts`

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:

- [understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.ts](understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.ts)
- [understand-anything-plugin/packages/core/src/plugins/registry.ts](understand-anything-plugin/packages/core/src/plugins/registry.ts)
- [understand-anything-plugin/packages/core/src/plugins/discovery.ts](understand-anything-plugin/packages/core/src/plugins/discovery.ts)
- [understand-anything-plugin/packages/core/src/plugins/extractors/base-extractor.ts](understand-anything-plugin/packages/core/src/plugins/extractors/base-extractor.ts)
- [understand-anything-plugin/packages/core/src/plugins/extractors/typescript-extractor.ts](understand-anything-plugin/packages/core/src/plugins/extractors/typescript-extractor.ts)
- [understand-anything-plugin/packages/core/src/plugins/extractors/types.ts](understand-anything-plugin/packages/core/src/plugins/extractors/types.ts)
- [understand-anything-plugin/packages/core/src/plugins/extractors/index.ts](understand-anything-plugin/packages/core/src/plugins/extractors/index.ts)
- [understand-anything-plugin/packages/core/src/plugins/parsers/index.ts](understand-anything-plugin/packages/core/src/plugins/parsers/index.ts)
- [understand-anything-plugin/packages/core/src/plugins/parsers/yaml-parser.ts](understand-anything-plugin/packages/core/src/plugins/parsers/yaml-parser.ts)
- [understand-anything-plugin/packages/core/src/languages/configs/index.ts](understand-anything-plugin/packages/core/src/languages/configs/index.ts)
- [understand-anything-plugin/packages/core/src/languages/configs/typescript.ts](understand-anything-plugin/packages/core/src/languages/configs/typescript.ts)
- [understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.test.ts](understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.test.ts)
</details>

# Static Analysis: Tree-Sitter Extractors & Parsers

The static analysis layer is the deterministic, LLM-free core of Understand Anything's knowledge graph construction. It is responsible for turning raw source files into structured graph nodes — functions, classes, imports, exports, and config sections — without ever making an AI call. Two plugin families do this work: **language extractors** that parse programming-language ASTs via `web-tree-sitter` (WASM), and **config parsers** that parse infrastructure and documentation formats using ordinary text or library-level parsers.

Understanding this layer is important for two reasons. First, it defines the accuracy ceiling for structural analysis: every node that appears in the graph without an LLM annotation was produced here. Second, the WASM-only constraint is a hard architectural invariant: `web-tree-sitter` must be used instead of native bindings because native `tree-sitter` bindings fail on darwin/arm64 + Node 24. Swapping in the native package would break in CI and on Apple Silicon, so this page makes the boundary explicit.

---

## Architecture Overview

```text
LanguageConfig (configs/index.ts)
  └─ treeSitter: { wasmPackage, wasmFile }   ← present only for code langs

Plugin initialization path:
  TreeSitterPlugin(configs[]) ──init()──► load WASM grammars via LanguageCls.load()
                                                    │
                              builtinExtractors[]    │
                              (TypeScript, Python,   │
                               Go, Java, Rust, …)   │
                                         ▼
                              analyzeFile(path, src)
                                   │
                              parser.parse(src) → AST root
                                   │
                              LanguageExtractor.extractStructure(root)
                                   │
                              → StructuralAnalysis { functions, classes, imports, exports }

Config parsers (non-code family):
  registerAllParsers(registry)  ←  MarkdownParser, YAMLConfigParser, SQLParser, …
  Each implements AnalyzerPlugin but uses library/regex parsing, not tree-sitter
  Returns: StructuralAnalysis { sections: SectionInfo[] }

PluginRegistry (registry.ts)
  register(plugin) ──► languageMap: lang → plugin
  analyzeFile(path, src) ──► getPluginForFile(path) ──► plugin.analyzeFile()
```

Sources: [understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.ts:32-98](), [understand-anything-plugin/packages/core/src/plugins/parsers/index.ts:31-44](), [understand-anything-plugin/packages/core/src/plugins/registry.ts:11-81]()

---

## The WASM Constraint

The single most important invariant in this subsystem: **`web-tree-sitter` (WASM) must be used instead of native `tree-sitter` bindings.**

The comment at the top of `tree-sitter-plugin.ts` explains the mechanics:

```ts
// web-tree-sitter uses CJS internally; we need createRequire for .wasm resolution
const require = createRequire(import.meta.url);
```

All `.wasm` grammar files are resolved with `require.resolve()`, not `import()`, because `web-tree-sitter` ships them as CommonJS-resolvable assets. The `Parser` and `Language` classes come from `web-tree-sitter` exclusively; there is no code path that conditionally falls back to native bindings.

During `init()`, grammars are loaded with `LanguageCls.load(wasmPath)` in parallel:

```ts
const wasmPath = require.resolve(
  `${config.treeSitter!.wasmPackage}/${config.treeSitter!.wasmFile}`,
);
const lang = await LanguageCls.load(wasmPath);
this._languages.set(config.id, lang);
```

Sources: [understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.ts:1-14, 125-198]()

---

## Language Configuration

Each programming language that has tree-sitter support carries a `LanguageConfig` with a `treeSitter` field. Configs without that field are non-code languages and are handled by the config-parser family instead.

Example — the TypeScript config:

```ts
// understand-anything-plugin/packages/core/src/languages/configs/typescript.ts
export const typescriptConfig = {
  id: "typescript",
  displayName: "TypeScript",
  extensions: [".ts", ".tsx"],
  treeSitter: {
    wasmPackage: "tree-sitter-typescript",
    wasmFile: "tree-sitter-typescript.wasm",
  },
  // ...
} satisfies LanguageConfig;
```

The `TreeSitterPlugin` constructor filters `configs` to only those with a `treeSitter` field, then builds the extension-to-language map from them. Languages without a `treeSitter` field are simply excluded from structural parsing and fall through to the LLM agent.

Sources: [understand-anything-plugin/packages/core/src/languages/configs/typescript.ts:1-30](), [understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.ts:57-97]()
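
The filtering step described above might look roughly like the following. This is a sketch, not the constructor's actual code: `SketchLanguageConfig` is a trimmed, hypothetical version of `LanguageConfig`, and `buildExtensionMap` is an illustrative name.

```typescript
// Trimmed, hypothetical version of LanguageConfig with only the fields used here.
interface SketchLanguageConfig {
  id: string;
  extensions: string[];
  treeSitter?: { wasmPackage: string; wasmFile: string };
}

function buildExtensionMap(
  configs: SketchLanguageConfig[],
): Map<string, string> {
  const byExtension = new Map<string, string>();
  for (const config of configs) {
    // Configs without a treeSitter field are not structurally parsed.
    if (!config.treeSitter) continue;
    for (const ext of config.extensions) {
      byExtension.set(ext, config.id);
    }
  }
  return byExtension;
}
```

Given one config with a `treeSitter` field and one without, only the former's extensions appear in the resulting map.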

### TSX Special Case

TypeScript has a two-grammar special case: when the TypeScript grammar is loaded, the plugin also attempts to load `tree-sitter-tsx.wasm` from the same WASM package. `.tsx` files receive the TSX grammar (its own grammar key `"tsx"`), but extraction logic is routed to the `TypeScriptExtractor` because the syntactic forms are identical:

```ts
private getExtractor(langKey: string): LanguageExtractor | null {
  // tsx is a synthetic grammar key — extraction logic is identical to typescript
  const key = langKey === "tsx" ? "typescript" : langKey;
  return this.extractors.get(key) ?? null;
}
```

Sources: [understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.ts:106-110, 150-162]()

---

## Language Extractor Family

### Interface Contract

Every language extractor implements the `LanguageExtractor` interface:

```ts
// understand-anything-plugin/packages/core/src/plugins/extractors/types.ts
export interface LanguageExtractor {
  languageIds: string[];
  extractStructure(rootNode: TreeSitterNode): StructuralAnalysis;
  extractCallGraph(rootNode: TreeSitterNode): CallGraphEntry[];
}
```

`extractStructure` returns `{ functions, classes, imports, exports }`. `extractCallGraph` returns `{ caller, callee, lineNumber }[]`. The node type passed in is `web-tree-sitter`'s `Node` — the root of an already-parsed AST.

Sources: [understand-anything-plugin/packages/core/src/plugins/extractors/types.ts:1-19]()

### Built-in Extractors

Nine extractors ship as builtins:

| Extractor | `languageIds` |
|---|---|
| `TypeScriptExtractor` | `typescript`, `javascript` |
| `PythonExtractor` | `python` |
| `GoExtractor` | `go` |
| `RustExtractor` | `rust` |
| `JavaExtractor` | `java` |
| `RubyExtractor` | `ruby` |
| `PhpExtractor` | `php` |
| `CppExtractor` | `cpp` (and `c`) |
| `CSharpExtractor` | `csharp` |

All nine are instantiated at module load and collected in `builtinExtractors[]`. The `TreeSitterPlugin` constructor registers them all by default:

```ts
for (const extractor of builtinExtractors) {
  this.registerExtractor(extractor);
}
```

Sources: [understand-anything-plugin/packages/core/src/plugins/extractors/index.ts:24-34](), [understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.ts:88-97]()

### Base Extractor Utilities

Rather than duplicating traversal logic in every extractor, `base-extractor.ts` provides shared helpers:

| Helper | Purpose |
|---|---|
| `traverse(node, visitor)` | Depth-first recursive walk |
| `getStringValue(node)` | Strips quotes from string-like nodes |
| `findChild(node, type)` | First child with matching `node.type` |
| `findChildren(node, type)` | All children with matching `node.type` |
| `hasChildOfType(node, type)` | Boolean check for export/visibility |

Sources: [understand-anything-plugin/packages/core/src/plugins/extractors/base-extractor.ts:1-53]()
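
For readers who want a feel for these helpers, the sketch below gives plausible implementations over a minimal node shape. `MiniNode` is a hypothetical stand-in for the subset of a tree-sitter node the helpers need; the function bodies are assumptions written to match the purposes in the table, not copies of `base-extractor.ts`.

```typescript
// Hypothetical stand-in for the parts of a tree-sitter node used here.
interface MiniNode {
  type: string;
  text: string;
  children: MiniNode[];
}

// Depth-first, pre-order walk: visit the node, then each child in order.
function traverse(node: MiniNode, visitor: (n: MiniNode) => void): void {
  visitor(node);
  for (const child of node.children) traverse(child, visitor);
}

// First direct child whose type matches, or null when there is none.
function findChild(node: MiniNode, type: string): MiniNode | null {
  for (const child of node.children) {
    if (child.type === type) return child;
  }
  return null;
}

// All direct children whose type matches.
function findChildren(node: MiniNode, type: string): MiniNode[] {
  return node.children.filter((child) => child.type === type);
}

function hasChildOfType(node: MiniNode, type: string): boolean {
  return findChild(node, type) !== null;
}
```

Only `traverse` recurses; the other three look at direct children, which is usually what an extractor wants when checking for a modifier such as `export`.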

### TypeScript Extractor in Detail

The `TypeScriptExtractor` is the most fully-featured extractor and illustrates the pattern used by all others. It processes top-level AST nodes in a single pass:

```ts
// understand-anything-plugin/packages/core/src/plugins/extractors/typescript-extractor.ts
switch (node.type) {
  case "function_declaration":    this.extractFunction(node, functions); break;
  case "class_declaration":       this.extractClass(node, classes); break;
  case "lexical_declaration":
  case "variable_declaration":    this.extractVariableDeclarations(node, functions); break;
  case "import_statement":        this.extractImport(node, imports); break;
  case "export_statement":        this.processExportStatement(...); break;
}
```

**Function extraction** handles three forms: `function` declarations, arrow functions assigned to `const`/`let`, and function expressions. Parameters are extracted with full support for `required_parameter`, `optional_parameter`, rest parameters (`...args`), and plain JavaScript identifiers. Return type annotations are stripped of their leading `:`.

**Call graph extraction** uses a function stack. A depth-first walk pushes function names as scope is entered and pops on exit. Every `call_expression` node encountered emits a `{ caller, callee, lineNumber }` entry using the top of the stack as caller:

```ts
if (node.type === "call_expression") {
  const callee = node.childForFieldName("function");
  if (callee && functionStack.length > 0) {
    entries.push({
      caller: functionStack[functionStack.length - 1],
      callee: callee.text,
      lineNumber: node.startPosition.row + 1,
    });
  }
}
```

Sources: [understand-anything-plugin/packages/core/src/plugins/extractors/typescript-extractor.ts:106-194]()
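
The function-stack technique can be tried in isolation on a hand-built tree. In this sketch, `CallNode` is a hypothetical simplification of an AST node: its `label` field stands in for both a function's name and a call's callee text, and only `function_declaration` opens a scope. The real extractor handles more node kinds.

```typescript
// Hypothetical simplified AST node; "label" holds a function name or callee text.
interface CallNode {
  type: string;
  label?: string;
  row: number; // zero-based, like tree-sitter's startPosition.row
  children: CallNode[];
}

interface CallEdge {
  caller: string;
  callee: string;
  lineNumber: number;
}

function collectCalls(root: CallNode): CallEdge[] {
  const entries: CallEdge[] = [];
  const functionStack: string[] = [];

  const visit = (node: CallNode): void => {
    const scopeName =
      node.type === "function_declaration" ? node.label : undefined;
    if (scopeName !== undefined) functionStack.push(scopeName);

    if (node.type === "call_expression" && node.label !== undefined) {
      // The innermost enclosing function is the caller; top-level calls are skipped.
      const caller = functionStack[functionStack.length - 1];
      if (caller !== undefined) {
        entries.push({ caller, callee: node.label, lineNumber: node.row + 1 });
      }
    }

    for (const child of node.children) visit(child);
    if (scopeName !== undefined) functionStack.pop();
  };

  visit(root);
  return entries;
}
```

Because the stack is popped after a function's children are visited, nested functions attribute their calls to the innermost enclosing function, and calls made outside any function produce no entry.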

### Lifecycle: init() → analyzeFile()

The plugin separates async initialization from synchronous analysis. Grammar loading is async (WASM); parsing and extraction are synchronous once grammars are resident in memory.

```text
await plugin.init()      // loads all .wasm grammars in parallel
↓
plugin.analyzeFile(path, src)
  getParser(path)         // synchronous: looks up pre-loaded Language, creates Parser
  parser.parse(src)       // synchronous: returns Tree
  extractor.extractStructure(tree.rootNode)
  tree.delete(); parser.delete()  // explicit WASM memory cleanup
  return StructuralAnalysis
```

If a grammar failed to load during `init()` (the npm package is missing or the WASM file is absent), `getParser()` returns `null` and `analyzeFile()` returns an empty `StructuralAnalysis`. This graceful degradation means an unsupported language never throws — it simply contributes no structural nodes, and the LLM agent fills the gap.

Sources: [understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.ts:125-250]()
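
The degradation rule amounts to "look up the loaded grammar, and return an empty result when it is absent". The sketch below models that with a plain map of parse functions; `EmptyableAnalysis`, `GrammarParse`, and `analyzeWithFallback` are hypothetical names, and the real plugin works with tree-sitter `Language` objects, not closures.

```typescript
// Hypothetical simplified result shape; the real StructuralAnalysis is richer.
interface EmptyableAnalysis {
  functions: string[];
  classes: string[];
  imports: string[];
  exports: string[];
}

type GrammarParse = (src: string) => EmptyableAnalysis;

function analyzeWithFallback(
  loaded: Map<string, GrammarParse>,
  langId: string,
  src: string,
): EmptyableAnalysis {
  const parse = loaded.get(langId);
  // A grammar that never loaded yields an empty analysis, never an exception.
  if (!parse) {
    return { functions: [], classes: [], imports: [], exports: [] };
  }
  return parse(src);
}
```

Callers therefore never need a `try`/`catch` around analysis; they only need to tolerate an empty result.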

---

## Config Parser Family

Config parsers handle non-code file formats. They implement the same `AnalyzerPlugin` interface but do not use tree-sitter at all — they use format-specific libraries (e.g., the `yaml` npm package) or regular expressions.

### Built-in Config Parsers

```ts
// understand-anything-plugin/packages/core/src/plugins/parsers/index.ts
export function registerAllParsers(registry: PluginRegistry): void {
  registry.register(new MarkdownParser());
  registry.register(new YAMLConfigParser());
  registry.register(new JSONConfigParser());
  registry.register(new TOMLParser());
  registry.register(new EnvParser());
  registry.register(new DockerfileParser());
  registry.register(new SQLParser());
  registry.register(new GraphQLParser());
  registry.register(new ProtobufParser());
  registry.register(new TerraformParser());
  registry.register(new MakefileParser());
  registry.register(new ShellParser());
}
```

Sources: [understand-anything-plugin/packages/core/src/plugins/parsers/index.ts:31-44]()

### Output Shape

Config parsers emit `StructuralAnalysis` with `sections: SectionInfo[]` populated and `functions/classes/imports/exports` left empty. A `SectionInfo` carries `{ name, level, lineRange }` — enough for the knowledge graph to represent top-level configuration structure (e.g., YAML top-level keys, SQL statement blocks, Makefile targets).

### YAML Parser: Library + Regex Fallback

The `YAMLConfigParser` illustrates the robustness pattern used across config parsers. It first attempts a proper library parse with the `yaml` npm package; if that throws, it falls back to a line-level regex:

```ts
try {
  const doc = parseYAML(content);
  // extract top-level keys from the parsed object
} catch (err) {
  console.warn(`[yaml-parser] YAML parse failed, falling back to regex...`);
  const lines = content.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const match = lines[i].match(/^(\w[\w-]*)\s*:/);
    if (match) sections.push({ name: match[1], level: 1, lineRange: [i+1, i+1] });
  }
}
```

The `YAMLConfigParser.languages` array also includes YAML-flavored variants (`docker-compose`, `kubernetes`, `github-actions`, `openapi`) so the language registry can route those file types without falling through to the "no plugin matched" branch.

Sources: [understand-anything-plugin/packages/core/src/plugins/parsers/yaml-parser.ts:14-106]()
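
The regex fallback is easy to exercise on its own. The function below lifts the same pattern into a standalone helper; `SketchSection` and `topLevelKeysByRegex` are hypothetical names, and the helper covers only the fallback branch, not the library parse.

```typescript
interface SketchSection {
  name: string;
  level: number;
  lineRange: [number, number];
}

function topLevelKeysByRegex(content: string): SketchSection[] {
  const sections: SketchSection[] = [];
  content.split("\n").forEach((line, index) => {
    // Only unindented "key:" lines match, so nested keys are ignored.
    const match = line.match(/^(\w[\w-]*)\s*:/);
    const key = match ? match[1] : undefined;
    if (key !== undefined) {
      sections.push({ name: key, level: 1, lineRange: [index + 1, index + 1] });
    }
  });
  return sections;
}
```

For a compose-style file with `services:` on line 1, indented children beneath it, and `version:` on line 4, this yields two sections, `services` and `version`, each spanning a single line.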

---

## Plugin Registry Wiring

`PluginRegistry` is the unified dispatch layer. It maintains a `languageMap: Map<string, AnalyzerPlugin>` populated by calls to `register()`. Both `TreeSitterPlugin` and config parsers register through the same interface.

File-to-plugin routing goes through `LanguageRegistry.getForFile(filePath)` — an extension-based lookup that returns a `LanguageConfig`. The config's `id` is then used to look up the plugin:

```ts
getPluginForFile(filePath: string): AnalyzerPlugin | null {
  const langConfig = this.languageRegistry.getForFile(filePath);
  if (!langConfig) return null;
  return this.getPluginForLanguage(langConfig.id);
}
```

This means the extension-to-language mapping lives in `LanguageRegistry` (driven by `builtinLanguageConfigs`), not duplicated in each plugin.

Sources: [understand-anything-plugin/packages/core/src/plugins/registry.ts:39-48]()
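
A compact model of this two-step dispatch follows. It is a sketch under simplifying assumptions: `SketchPlugin` and `SketchRegistry` are hypothetical, the language lookup is reduced to a plain extension map instead of a `LanguageRegistry`, and the analysis result is a list of strings.

```typescript
interface SketchPlugin {
  languages: string[];
  analyzeFile(filePath: string, src: string): string[];
}

class SketchRegistry {
  private readonly languageMap = new Map<string, SketchPlugin>();
  private readonly extensionToLanguage: Map<string, string>;

  constructor(extensionToLanguage: Map<string, string>) {
    this.extensionToLanguage = extensionToLanguage;
  }

  // Every language id a plugin declares is routed to that plugin.
  register(plugin: SketchPlugin): void {
    for (const lang of plugin.languages) this.languageMap.set(lang, plugin);
  }

  // Step 1: extension -> language id. Step 2: language id -> plugin.
  getPluginForFile(filePath: string): SketchPlugin | null {
    const dot = filePath.lastIndexOf(".");
    if (dot < 0) return null;
    const langId = this.extensionToLanguage.get(filePath.slice(dot));
    if (langId === undefined) return null;
    return this.languageMap.get(langId) ?? null;
  }
}
```

Keeping the extension table outside the plugins means a plugin only declares language ids, so adding a new file extension for an existing language requires no plugin change.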

### Default Plugin Configuration

`discovery.ts` holds `DEFAULT_PLUGIN_CONFIG`, which is derived at module load from `builtinLanguageConfigs` — specifically those configs that carry a `treeSitter` field. This is what gets written to disk when no user-supplied config exists:

```ts
export const DEFAULT_PLUGIN_CONFIG: PluginConfig = {
  plugins: [
    {
      name: "tree-sitter",
      enabled: true,
      languages: builtinLanguageConfigs
        .filter((c) => c.treeSitter)
        .map((c) => c.id),
    },
  ],
};
```

Sources: [understand-anything-plugin/packages/core/src/plugins/discovery.ts:14-24]()

---

## Failure Modes and Boundaries

| Failure | Behavior |
|---|---|
| WASM grammar `.wasm` file missing at `require.resolve()` time | `init()` logs a debug message, that language is skipped; `analyzeFile()` returns empty `StructuralAnalysis` |
| File extension not in any language config | `getPluginForFile()` returns `null`; no analysis is produced |
| `parser.parse()` returns `null` | Extractor skips the file; `tree.delete()` / `parser.delete()` are called in finally-equivalent branches |
| Config parser library throws (e.g., malformed YAML) | Regex fallback extracts sections where possible |
| Native `tree-sitter` package swapped in | Build failure on darwin/arm64 + Node 24 — this is a hard invariant, not a handled failure |

The design choice to return empty results rather than throw means analysis can proceed even when some languages are unavailable. LLM agents are the designated fallback for files that static analysis cannot parse.

---

## Summary

The static analysis subsystem is built around two plugin families that share a single `PluginRegistry`. Language extractors use `web-tree-sitter` (WASM) — a non-negotiable constraint driven by platform compatibility — to parse ASTs and extract functions, classes, imports, and call graphs for nine programming languages. Config parsers handle a dozen infrastructure and documentation formats using library or regex parsing, emitting section-level structure. Both families are registered identically into `PluginRegistry`, which dispatches by language id derived from `LanguageRegistry`. The WASM approach means `init()` is async but `analyzeFile()` is synchronous, and missing grammars degrade gracefully to empty output rather than hard failures.

Sources: [understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.ts:222-251]()

---

## 05. Staleness Detection & Incremental Updates

> The knowledge graph is stored as .understand-anything/knowledge-graph.json alongside meta.json (which records the last analyzed commit hash) and config.json (user preferences such as language and autoUpdate). On each /understand invocation, staleness.ts compares git diff lastCommitHash..HEAD; if files changed, only affected nodes are removed and re-analyzed (incremental mode). --full forces a complete rebuild. The auto-update hook re-triggers analysis after every git commit when autoUpdate is true. Worktree redirect is a critical invariant: graphs generated inside a Claude Code worktree are redirected to the main repo root to prevent ephemeral loss.

- Page Markdown: https://grok-wiki.com/public/wiki/lum1104-understand-anything-3b923df96896/pages/05-staleness-detection-incremental-updates.md
- Generated: 2026-05-22T00:55:31.598Z

### Source Files

- `understand-anything-plugin/packages/core/src/staleness.ts`
- `understand-anything-plugin/skills/understand/SKILL.md`
- `understand-anything-plugin/hooks/hooks.json`
- `understand-anything-plugin/hooks/auto-update-prompt.md`

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:

- [understand-anything-plugin/packages/core/src/staleness.ts](understand-anything-plugin/packages/core/src/staleness.ts)
- [understand-anything-plugin/packages/core/src/fingerprint.ts](understand-anything-plugin/packages/core/src/fingerprint.ts)
- [understand-anything-plugin/packages/core/src/change-classifier.ts](understand-anything-plugin/packages/core/src/change-classifier.ts)
- [understand-anything-plugin/packages/core/src/types.ts](understand-anything-plugin/packages/core/src/types.ts)
- [understand-anything-plugin/hooks/hooks.json](understand-anything-plugin/hooks/hooks.json)
- [understand-anything-plugin/hooks/auto-update-prompt.md](understand-anything-plugin/hooks/auto-update-prompt.md)
- [understand-anything-plugin/skills/understand/SKILL.md](understand-anything-plugin/skills/understand/SKILL.md)
</details>

# Staleness Detection & Incremental Updates

Understand Anything maintains its knowledge graph as `.understand-anything/knowledge-graph.json` alongside two small sidecar files — `meta.json` (recording the last analyzed git commit hash and file count) and `config.json` (storing user preferences such as `autoUpdate` and `outputLanguage`). Every time `/understand` or the auto-update hook runs, the system compares the stored commit hash against `HEAD` to decide what work is actually necessary. The goal is to spend zero LLM tokens on cosmetic edits and the minimum possible tokens on genuine structural changes, while a full rebuild is reserved for sweeping rearchitecture.

This page explains the full staleness pipeline: how git-diff-based detection drives an incremental vs. full decision, how the structural fingerprint system distinguishes cosmetic from structural changes within a changed file set, how the graph is surgically patched without touching untouched nodes, and how two automatic hooks (post-commit and session-start) keep the graph current with no manual invocation.

---

## Storage Layout

```text
<project-root>/
└── .understand-anything/
    ├── knowledge-graph.json   ← the graph (nodes, edges, layers, tour)
    ├── meta.json              ← { gitCommitHash, lastAnalyzedAt, version, analyzedFiles }
    ├── config.json            ← { autoUpdate, outputLanguage }
    ├── fingerprints.json      ← per-file structural fingerprints baseline
    └── intermediate/          ← scratch space (cleaned up after each run)
```

`meta.json` is the staleness anchor. Its `gitCommitHash` is written only after `fingerprints.json` has been successfully built, so the two are always in sync. Writing `meta.json` before the fingerprint baseline would cause every subsequent auto-update to escalate to `FULL_UPDATE` (documented as issue #152 in `auto-update-prompt.md`).

`config.json` carries opt-in flags. The relevant TypeScript type is:

```ts
// understand-anything-plugin/packages/core/src/types.ts:117-119
export interface ProjectConfig {
  autoUpdate: boolean;
  outputLanguage?: string;
}
```

Sources: [understand-anything-plugin/packages/core/src/types.ts:80-119]()

---

## Staleness Detection

Staleness is determined purely by git: the system runs `git diff <lastCommitHash>..HEAD --name-only` and considers the graph stale if any files are listed in the output.

```ts
// understand-anything-plugin/packages/core/src/staleness.ts:13-29
export function getChangedFiles(
  projectDir: string,
  lastCommitHash: string,
): string[] {
  try {
    const output = execFileSync('git', ['diff', `${lastCommitHash}..HEAD`, '--name-only'], {
      cwd: projectDir,
      encoding: "utf-8",
    });
    return output
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0);
  } catch {
    return [];
  }
}
```

`isStale()` wraps this into a `StalenessResult` with a boolean `stale` flag and the `changedFiles` list. On any git error (e.g., the repo is not initialized), `getChangedFiles()` returns an empty array rather than throwing, so the graph is reported as not stale.

Sources: [understand-anything-plugin/packages/core/src/staleness.ts:1-43]()
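
The two pure steps in this flow, parsing the `--name-only` output and deriving the result, can be separated from the git call and tested directly. The helper names below (`parseNameOnlyDiff`, `toStalenessResult`) and the `SketchStalenessResult` shape are hypothetical; only the `stale` and `changedFiles` fields come from the description above.

```typescript
interface SketchStalenessResult {
  stale: boolean;
  changedFiles: string[];
}

// Turn raw `git diff --name-only` output into a clean list of paths.
function parseNameOnlyDiff(output: string): string[] {
  return output
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// The graph is stale exactly when at least one file changed.
function toStalenessResult(changedFiles: string[]): SketchStalenessResult {
  return { stale: changedFiles.length > 0, changedFiles };
}
```

An empty list, whether from an unchanged repository or from a swallowed git error, produces `stale: false`, which is why the two cases are indistinguishable to callers.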

---

## Decision Logic at `/understand` Invocation

Phase 0 of the `/understand` skill reads both `meta.json` and the existing graph, then routes to one of four paths:

| Condition | Action |
|---|---|
| `--full` flag present | Full analysis (all phases) |
| No existing graph or `meta.json` | Full analysis (all phases) |
| `--review` + existing graph + same commit hash | Skip directly to graph reviewer |
| Existing graph + same commit hash | Ask user: rebuild, review, or do nothing |
| Existing graph + changed files | Incremental update (re-analyze changed files only) |

For incremental updates, the skill uses the same `git diff <lastCommitHash>..HEAD --name-only` call and passes the changed file list to a targeted re-analysis pipeline.

Sources: [understand-anything-plugin/skills/understand/SKILL.md:141-159]()
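
Read top to bottom, the table behaves like a chain of guards. The sketch below encodes it as a function, with several assumptions made explicit: the action names and `InvocationState` fields are hypothetical, "same commit hash" is modeled as an empty changed-file list, and `--review` combined with changed files (a case the table does not cover) falls through to the incremental path.

```typescript
type UnderstandAction =
  | "full-analysis"
  | "review-only"
  | "ask-user"
  | "incremental-update";

// Hypothetical summary of what Phase 0 knows before choosing a path.
interface InvocationState {
  fullFlag: boolean;
  reviewFlag: boolean;
  hasGraph: boolean;
  hasMeta: boolean;
  changedFiles: string[];
}

function chooseAction(state: InvocationState): UnderstandAction {
  if (state.fullFlag) return "full-analysis";
  if (!state.hasGraph || !state.hasMeta) return "full-analysis";
  // Assumption: any changed file takes precedence over --review.
  if (state.changedFiles.length > 0) return "incremental-update";
  return state.reviewFlag ? "review-only" : "ask-user";
}
```

The five table rows collapse into four distinct actions because both `--full` and a missing graph or `meta.json` lead to a full analysis.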

---

## Structural Fingerprint System

Not every changed file represents a structural change. A developer reformatting a function body or fixing a comment should not trigger node re-analysis. The fingerprint system provides a zero-LLM-token pre-filter that classifies each changed file into one of three levels.

### FileFingerprint Shape

```ts
// understand-anything-plugin/packages/core/src/fingerprint.ts:9-39
export interface FileFingerprint {
  filePath: string;
  contentHash: string;
  functions: FunctionFingerprint[];  // name, params, returnType, exported, lineCount
  classes: ClassFingerprint[];       // name, methods, properties, exported, lineCount
  imports: ImportFingerprint[];      // source, specifiers
  exports: string[];
  totalLines: number;
  hasStructuralAnalysis: boolean;
}
```

The baseline `FingerprintStore` (written to `fingerprints.json` in Phase 7 of `/understand`) uses tree-sitter for precise extraction. During auto-update (Phase 1), a temporary Node.js script uses regex-based extraction instead, which is faster and less precise but sufficient for signature-level detection.

### Change Classification

`compareFingerprints()` applies a three-level decision:

| Level | Condition |
|---|---|
| `NONE` | SHA-256 content hash is identical — file unchanged |
| `COSMETIC` | Content changed but all function/class/import/export signatures match |
| `STRUCTURAL` | Any signature-level difference: new/removed function or class, changed params, changed return type, changed export status, changed imports/exports, or no structural analysis available (conservative) |

```ts
// understand-anything-plugin/packages/core/src/fingerprint.ts:131-246
export function compareFingerprints(
  oldFp: FileFingerprint,
  newFp: FileFingerprint,
): FileChangeResult {
  // Fast path: identical content
  if (oldFp.contentHash === newFp.contentHash) {
    return { filePath: newFp.filePath, changeLevel: "NONE", details: [] };
  }
  // Conservative: no structural analysis → STRUCTURAL
  if (!oldFp.hasStructuralAnalysis || !newFp.hasStructuralAnalysis) {
    return { ..., changeLevel: "STRUCTURAL", ... };
  }
  // ... compare function, class, import, export signatures ...
```

Sources: [understand-anything-plugin/packages/core/src/fingerprint.ts:131-246]()

---

## Update Classification

After fingerprinting all changed files, `classifyUpdate()` in `change-classifier.ts` maps the aggregate analysis to an action:

```ts
// understand-anything-plugin/packages/core/src/change-classifier.ts:21-87
export function classifyUpdate(
  analysis: ChangeAnalysis,
  totalFilesInGraph: number,
  allKnownFiles: string[] = [],
): UpdateDecision {
```

| Action | Trigger condition |
|---|---|
| `SKIP` | All changed files are `NONE` or `COSMETIC` |
| `PARTIAL_UPDATE` | ≤10 structural files, no new/removed top-level directories |
| `ARCHITECTURE_UPDATE` | >10 structural files, or new/removed top-level directories, in either case with the structural count still ≤30 |
| `FULL_UPDATE` | >30 structural files, or structural changes exceed 50% of all files in the graph |

`FULL_UPDATE` does not perform re-analysis automatically; it reports the situation and tells the user to run `/understand --full`. This prevents an auto-update from silently spending large amounts of LLM tokens.
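
The thresholds in the table overlap (31 structural files satisfy both ">10" and ">30"), so the order of the checks matters. The following is a minimal sketch, not the code in `change-classifier.ts`: `ClassifyInput` is a simplified, hypothetical stand-in for `ChangeAnalysis`, and the evaluation order (skip, then full, then architecture, then partial) is an assumption chosen so that the four rows are mutually exclusive.

```typescript
// Hypothetical sketch of the action thresholds; not the real classifyUpdate().
type ChangeLevel = "NONE" | "COSMETIC" | "STRUCTURAL";
type UpdateAction = "SKIP" | "PARTIAL_UPDATE" | "ARCHITECTURE_UPDATE" | "FULL_UPDATE";

interface ClassifyInput {
  levels: ChangeLevel[];         // one entry per changed file
  totalFilesInGraph: number;
  topLevelDirsChanged: boolean;  // a top-level directory was added or removed
}

function classifySketch(input: ClassifyInput): UpdateAction {
  const structural = input.levels.filter((level) => level === "STRUCTURAL").length;

  // Nothing but NONE/COSMETIC changes: no re-analysis needed.
  if (structural === 0) return "SKIP";

  // Too large for an automatic update: >30 files or >50% of the graph.
  const ratio = input.totalFilesInGraph > 0 ? structural / input.totalFilesInGraph : 1;
  if (structural > 30 || ratio > 0.5) return "FULL_UPDATE";

  // Mid-sized change, or the directory layout itself moved.
  if (structural > 10 || input.topLevelDirsChanged) return "ARCHITECTURE_UPDATE";

  return "PARTIAL_UPDATE";
}
```

One consequence of the 50% rule in this sketch: in a very small graph, a handful of structural changes is enough to reach `FULL_UPDATE`.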

Sources: [understand-anything-plugin/packages/core/src/change-classifier.ts:1-143]()

---

## Incremental Graph Merge

When an incremental update re-analyzes a subset of files, the results must be merged back into the existing graph without disturbing untouched nodes.

`mergeGraphUpdate()` in `staleness.ts` performs a surgical replacement:

1. Identify nodes whose `filePath` is in the changed-file set.
2. Collect their IDs into `removedNodeIds`.
3. Retain all nodes whose ID is **not** in `removedNodeIds`.
4. Retain all edges whose `source` **and** `target` are **not** in `removedNodeIds`.
5. Append the freshly analyzed nodes and edges.
6. Update `project.gitCommitHash` and `project.analyzedAt`.

```ts
// understand-anything-plugin/packages/core/src/staleness.ts:54-90
export function mergeGraphUpdate(
  existingGraph: KnowledgeGraph,
  changedFilePaths: string[],
  newNodes: GraphNode[],
  newEdges: GraphEdge[],
  newCommitHash: string,
): KnowledgeGraph {
  const changedSet = new Set(changedFilePaths);

  const removedNodeIds = new Set(
    existingGraph.nodes
      .filter((node) => node.filePath !== undefined && changedSet.has(node.filePath))
      .map((node) => node.id),
  );

  const retainedNodes = existingGraph.nodes.filter(
    (node) => !removedNodeIds.has(node.id),
  );

  const retainedEdges = existingGraph.edges.filter(
    (edge) => !removedNodeIds.has(edge.source) && !removedNodeIds.has(edge.target),
  );

  return {
    ...existingGraph,
    project: { ...existingGraph.project, gitCommitHash: newCommitHash, analyzedAt: new Date().toISOString() },
    nodes: [...retainedNodes, ...newNodes],
    edges: [...retainedEdges, ...newEdges],
  };
}
```

This invariant — removing both the affected nodes and any edge touching them — prevents dangling references in the merged graph.
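
To make the edge-filtering behavior concrete, here is a small worked example. It restates the merge logic on simplified types: `SketchNode`, `SketchEdge`, `SketchGraph`, `mergeSketch`, and the `file:<path>` ID format are illustrative assumptions, not the real definitions from `@understand-anything/core`.

```typescript
// Simplified stand-ins for the real graph types (assumption).
interface SketchNode { id: string; filePath?: string }
interface SketchEdge { source: string; target: string; type: string }
interface SketchGraph {
  project: { gitCommitHash: string; analyzedAt: string };
  nodes: SketchNode[];
  edges: SketchEdge[];
}

// Compact restatement of the merge steps listed above.
function mergeSketch(
  existing: SketchGraph,
  changedFilePaths: string[],
  newNodes: SketchNode[],
  newEdges: SketchEdge[],
  newCommitHash: string,
): SketchGraph {
  const changedSet = new Set(changedFilePaths);
  const removedIds = new Set(
    existing.nodes
      .filter((n) => n.filePath !== undefined && changedSet.has(n.filePath))
      .map((n) => n.id),
  );
  return {
    project: {
      ...existing.project,
      gitCommitHash: newCommitHash,
      analyzedAt: new Date().toISOString(),
    },
    nodes: [...existing.nodes.filter((n) => !removedIds.has(n.id)), ...newNodes],
    edges: [
      ...existing.edges.filter(
        (e) => !removedIds.has(e.source) && !removedIds.has(e.target),
      ),
      ...newEdges,
    ],
  };
}

// a.ts imports b.ts; only b.ts changed and was re-analyzed.
const existingGraph: SketchGraph = {
  project: { gitCommitHash: "aaa111", analyzedAt: "2026-01-01T00:00:00.000Z" },
  nodes: [
    { id: "file:a.ts", filePath: "a.ts" },
    { id: "file:b.ts", filePath: "b.ts" },
  ],
  edges: [{ source: "file:a.ts", target: "file:b.ts", type: "imports" }],
};

const mergedGraph = mergeSketch(
  existingGraph,
  ["b.ts"],
  [{ id: "file:b.ts", filePath: "b.ts" }],
  [], // the re-analysis emitted no edges in this example
  "bbb222",
);
```

In this example the `imports` edge from `a.ts` to `b.ts` is dropped even though `b.ts` is re-added under the same ID, because the filter removes every edge that touches a replaced node. Under this logic, any edge into a re-analyzed file survives only if it is supplied again through `newEdges`.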

Sources: [understand-anything-plugin/packages/core/src/staleness.ts:47-90]()

---

## Auto-Update Hooks

Two hooks in `hooks.json` allow the system to automatically detect and respond to graph staleness without user invocation.

### PostToolUse: Post-Commit Hook

```json
// understand-anything-plugin/hooks/hooks.json:4-13
"PostToolUse": [
  {
    "matcher": "Bash",
    "hooks": [
      {
        "type": "command",
        "command": "printf '%s' \"$TOOL_INPUT\" | grep -qE 'git\\s+(commit|merge|cherry-pick|rebase)' && [ -f .understand-anything/config.json ] && grep -q '\"autoUpdate\".*true' .understand-anything/config.json && [ -f .understand-anything/knowledge-graph.json ] && echo \"[understand-anything] Commit detected with auto-update enabled. You MUST read the file at ${CLAUDE_PLUGIN_ROOT}/hooks/auto-update-prompt.md and execute its instructions...\" || true"
      }
    ]
  }
]
```

This hook fires after every Bash tool use. It checks whether the Bash input looks like a git commit/merge/cherry-pick/rebase, verifies that `autoUpdate` is true in `config.json`, and that a graph already exists. If all conditions are met, it injects an instruction into the Claude session to run the incremental update prompt.

### SessionStart: Stale-on-Session-Open Hook

```json
// understand-anything-plugin/hooks/hooks.json:14-23
"SessionStart": [
  {
    "hooks": [
      {
        "type": "command",
        "command": "[ -f .understand-anything/config.json ] && grep -q '\"autoUpdate\".*true' ... && [ \"$(node -p ...)\" != \"$(git rev-parse HEAD 2>/dev/null)\" ] && echo \"[understand-anything] Knowledge graph is stale. You MUST read...\" || true"
      }
    ]
  }
]
```

At session start, if `autoUpdate` is enabled and the stored `gitCommitHash` in `meta.json` differs from the current `HEAD`, Claude is prompted to run the incremental update immediately. This handles the case where commits were made outside of an active Claude session.

Sources: [understand-anything-plugin/hooks/hooks.json:1-25]()

---

## Auto-Update Execution (Four-Phase Protocol)

When a hook fires, Claude reads `auto-update-prompt.md` and follows a four-phase protocol (Phases 0 through 3) designed to minimize LLM token usage.

```mermaid
stateDiagram-v2
    [*] --> Phase0: Hook triggered
    Phase0 --> STOP_NoGraph: No knowledge-graph.json
    Phase0 --> STOP_UpToDate: Hashes match
    Phase0 --> STOP_NonSource: Only non-source files changed
    Phase0 --> Phase1: Source files changed

    Phase1 --> SKIP: All NONE/COSMETIC
    Phase1 --> PARTIAL_UPDATE: ≤10 structural files, same dirs
    Phase1 --> ARCHITECTURE_UPDATE: >10 structural or new dirs
    Phase1 --> FULL_UPDATE_STOP: >30 structural or >50% of graph

    SKIP --> SaveMeta: Update meta.json only
    PARTIAL_UPDATE --> Phase2: Re-analyze changed files
    ARCHITECTURE_UPDATE --> Phase2: Re-analyze + rerun architecture
    Phase2 --> Phase3: Merge results
    Phase3 --> SaveGraph: Write knowledge-graph.json + meta.json + fingerprints
```

**Phase 0 (Zero token cost):** Checks hashes, enumerates changed files, filters to source extensions (`.ts`, `.tsx`, `.js`, `.py`, `.go`, `.rs`, etc.), and applies `.understandignore` exclusions. If no relevant source files remain after filtering, `meta.json` is updated and execution stops.

**Phase 1 (Zero LLM tokens):** Runs a Node.js fingerprint-check script against stored `fingerprints.json` to classify each file as `NONE`, `COSMETIC`, or `STRUCTURAL`. The outcome drives the action decision (`SKIP`, `PARTIAL_UPDATE`, `ARCHITECTURE_UPDATE`, or `FULL_UPDATE`).

**Phase 2 (Minimal LLM tokens):** Re-dispatches the `file-analyzer` agent only for structurally changed files. Results are merged using the same node-removal logic described above for `mergeGraphUpdate()`.

**Phase 3:** Saves the final graph, updates `meta.json`, and performs a load-patch-save update of `fingerprints.json` (never overwriting the full dict — only patching re-analyzed entries to avoid issue #152).

Sources: [understand-anything-plugin/hooks/auto-update-prompt.md:1-321]()

---

## Worktree Redirect Invariant

A critical invariant governs where the graph is written when `/understand` runs inside a Claude Code git worktree.

Claude Code creates temporary worktrees for tasks. Any `.understand-anything/` directory written inside a worktree is destroyed when the session ends — taking the knowledge graph with it (documented as issue #133). To prevent this, Phase 0 of `/understand` detects worktrees by comparing `git rev-parse --git-dir` against `git rev-parse --git-common-dir`:

```bash
# understand-anything-plugin/skills/understand/SKILL.md:36-51
COMMON_DIR=$(git -C "$PROJECT_ROOT" rev-parse --git-common-dir 2>/dev/null)
GIT_DIR=$(git -C "$PROJECT_ROOT" rev-parse --git-dir 2>/dev/null)
if [ -n "$COMMON_DIR" ] && [ -n "$GIT_DIR" ]; then
  COMMON_ABS=$(cd "$PROJECT_ROOT" && cd "$COMMON_DIR" 2>/dev/null && pwd -P)
  GIT_ABS=$(cd "$PROJECT_ROOT" && cd "$GIT_DIR" 2>/dev/null && pwd -P)
  if [ -n "$COMMON_ABS" ] && [ "$COMMON_ABS" != "$GIT_ABS" ]; then
    MAIN_ROOT=$(dirname "$COMMON_ABS")
    PROJECT_ROOT="$MAIN_ROOT"   # redirect output to main repo root
  fi
fi
```

In a normal checkout or submodule, `--git-dir` and `--git-common-dir` resolve to the same path and no redirect occurs. In a worktree they differ; the parent of `--git-common-dir` is the main repo root, and `PROJECT_ROOT` is updated accordingly. The redirect can be suppressed with `UNDERSTAND_NO_WORKTREE_REDIRECT=1` for the rare case of wanting a per-worktree graph.

Sources: [understand-anything-plugin/skills/understand/SKILL.md:33-53]()

---

## Fingerprint Integrity: The LOAD-PATCH-SAVE Rule

The most critical invariant during fingerprint updates is that `fingerprints.json` must be loaded fully before patching, and saved in full afterward. A naive implementation that writes only the freshly computed entries discards every other file's fingerprint; on the next auto-update, those files have no stored baseline, get classified as `STRUCTURAL`, and the system escalates to `FULL_UPDATE` permanently.

The correct pattern (from `auto-update-prompt.md` Phase 3d) is:

```javascript
// 1. LOAD ALL existing entries
const all = existsSync(fpPath) ? JSON.parse(readFileSync(fpPath, 'utf-8')) : {};
const before = Object.keys(all).length;

// 2. PATCH only the re-analyzed paths
for (const filePath of filesToReanalyze) { /* update or delete all[filePath] */ }

// 3. GUARD: if file existed and non-empty but loaded as {}, abort
if (existedAndNonEmpty && before === 0) {
  throw new Error('fingerprints.json existed and was non-empty but loaded as {} — refusing to overwrite');
}

// 4. SAVE ALL entries back
writeFileSync(fpPath, JSON.stringify(all, null, 2));
```

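The guard in step 3 prevents a silent read failure from clobbering the store: `before` is captured immediately after loading, so an unexpectedly empty load is detected before anything is written. This is the same failure mode as issue #152, caught at the write boundary rather than the read boundary.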
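
The same rule can be expressed as a pure function that is easy to unit-test without touching the filesystem. This is a sketch under stated assumptions: `patchFingerprintStore` and its parameters are hypothetical, the caller is assumed to have already read the file and determined `existedAndNonEmpty`, and a `null` patch value is used here to mean "delete this entry". Unlike the snippet above, the guard runs before patching, which is equivalent because it only inspects the loaded state.

```typescript
// Hypothetical pure-function form of LOAD-PATCH-SAVE; names are illustrative.
type FingerprintStore = Record<string, unknown>;

function patchFingerprintStore(
  loaded: FingerprintStore,          // result of loading fingerprints.json
  existedAndNonEmpty: boolean,       // the file was on disk and had content
  patches: Record<string, unknown>,  // re-analyzed paths; null means "delete"
): FingerprintStore {
  // GUARD: a non-empty file that loaded as {} indicates a failed read.
  if (existedAndNonEmpty && Object.keys(loaded).length === 0) {
    throw new Error("fingerprints store loaded as {} but file was non-empty; refusing to overwrite");
  }

  // PATCH: copy every existing entry, then touch only the re-analyzed paths.
  const all: FingerprintStore = { ...loaded };
  for (const [filePath, fingerprint] of Object.entries(patches)) {
    if (fingerprint === null) {
      delete all[filePath];
    } else {
      all[filePath] = fingerprint;
    }
  }

  // The caller serializes and saves the full returned object.
  return all;
}

const patchedStore = patchFingerprintStore(
  { "a.ts": { contentHash: "h1" }, "b.ts": { contentHash: "h2" }, "c.ts": { contentHash: "h3" } },
  true,
  { "b.ts": { contentHash: "h2-new" }, "c.ts": null },
);
```

Here `a.ts` keeps its original fingerprint, `b.ts` is replaced, and `c.ts` is removed; the entries that were not re-analyzed are carried through untouched.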

Sources: [understand-anything-plugin/hooks/auto-update-prompt.md:243-290]()

---

## Summary

Staleness detection in Understand Anything is a multi-layered pipeline in which each layer stops early, at zero LLM-token cost, when nothing significant changed. The git-diff check (`staleness.ts`) determines whether any files have changed since the last analysis; the fingerprint system (`fingerprint.ts`, `change-classifier.ts`) further classifies whether those file-level changes affect signatures the knowledge graph cares about; and `mergeGraphUpdate()` (`staleness.ts:54-90`) applies a minimal surgical replacement (removing stale nodes and their incident edges, then appending fresh ones) so the vast majority of the graph survives unchanged. Two hooks make this automatic: a post-commit hook for changes made inside a Claude session, and a session-start hook for changes made between sessions. The worktree redirect ensures the graph is never written to an ephemeral location, and the LOAD-PATCH-SAVE fingerprint rule ensures the incremental system never accidentally enters a degenerate state that forces unnecessary full rebuilds.

---

## 06. Dashboard State Machine: Zustand Store & View Modes

> The dashboard's single Zustand store (store.ts) owns all runtime state: the loaded KnowledgeGraph, active Persona (non-technical / junior / experienced), ViewMode (structural / domain / knowledge), FilterState (node types, complexities, layers, edge categories), selected node, search results via SearchEngine (Fuse.js fuzzy search on name/tags/summary/languageNotes), and the React Flow instance. The store is the single source of truth — components never hold local graph state. Key boundary: dashboard imports only from @understand-anything/core/search, /types, and /schema (browser-safe subpath exports); never the core main entry point, which pulls in Node.js modules.

- Page Markdown: https://grok-wiki.com/public/wiki/lum1104-understand-anything-3b923df96896/pages/06-dashboard-state-machine-zustand-store-view-modes.md
- Generated: 2026-05-22T00:58:23.040Z

### Source Files

- `understand-anything-plugin/packages/dashboard/src/store.ts`
- `understand-anything-plugin/packages/core/src/search.ts`
- `understand-anything-plugin/packages/dashboard/src/App.tsx`
- `understand-anything-plugin/packages/dashboard/src/components/GraphView.tsx`
- `understand-anything-plugin/packages/dashboard/src/components/NodeInfo.tsx`
- `understand-anything-plugin/packages/dashboard/src/components/CodeViewer.tsx`

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:

- [understand-anything-plugin/packages/dashboard/src/store.ts](understand-anything-plugin/packages/dashboard/src/store.ts)
- [understand-anything-plugin/packages/core/src/search.ts](understand-anything-plugin/packages/core/src/search.ts)
- [understand-anything-plugin/packages/dashboard/src/App.tsx](understand-anything-plugin/packages/dashboard/src/App.tsx)
- [understand-anything-plugin/packages/dashboard/src/components/GraphView.tsx](understand-anything-plugin/packages/dashboard/src/components/GraphView.tsx)
- [understand-anything-plugin/packages/dashboard/src/components/NodeInfo.tsx](understand-anything-plugin/packages/dashboard/src/components/NodeInfo.tsx)
- [understand-anything-plugin/packages/dashboard/src/components/CodeViewer.tsx](understand-anything-plugin/packages/dashboard/src/components/CodeViewer.tsx)
</details>

# Dashboard State Machine: Zustand Store & View Modes

The Understand-Anything dashboard is driven by a single Zustand store (`useDashboardStore`) that owns every piece of runtime state: the loaded knowledge graph, active persona, view mode, filter configuration, selected node, search results, navigation history, code viewer visibility, tour progress, diff overlay, container expand/collapse state, and the React Flow instance. No component holds its own graph-level state; all reads and mutations flow through this store.

This architecture matters because the graph is large and highly interconnected — multiple independent UI panels (graph canvas, sidebar, search bar, code viewer, tour overlay) must stay synchronized without prop drilling. The store is also the enforcement point for two structural invariants: the `nodeIdToLayerId` / `nodeIdToLayerIds` dual-index design that drives layer navigation, and the cache-invalidation discipline that keeps ELK layout results consistent when the visible node set changes.

---

## Store Shape

The `DashboardStore` interface (roughly 140 lines of fields and methods) is defined in full near the top of `store.ts`. The logical groups are:

| Group | Key fields |
|---|---|
| **Graph data** | `graph`, `nodesById`, `nodeIdToLayerId`, `nodeIdToLayerIds` |
| **View mode** | `viewMode`, `isKnowledgeGraph`, `domainGraph`, `activeDomainId` |
| **Navigation** | `navigationLevel`, `activeLayerId`, `selectedNodeId`, `nodeHistory`, `focusNodeId` |
| **Search** | `searchQuery`, `searchResults`, `searchEngine`, `searchMode` |
| **Persona** | `persona` |
| **Filters** | `filters` (FilterState), `nodeTypeFilters` (NodeCategory booleans), `detailLevel` |
| **Code viewer** | `codeViewerOpen`, `codeViewerNodeId`, `codeViewerExpanded` |
| **Tour** | `tourActive`, `currentTourStep`, `tourHighlightedNodeIds`, `tourFitPending` |
| **Diff overlay** | `diffMode`, `changedNodeIds`, `affectedNodeIds` |
| **Layout cache** | `containerLayoutCache`, `containerSizeMemory`, `expandedContainers`, `stage1Tick` |
| **UI panels** | `filterPanelOpen`, `exportMenuOpen`, `pathFinderOpen` |
| **React Flow** | `reactFlowInstance` |
| **Issues** | `layoutIssues` |

Sources: [understand-anything-plugin/packages/dashboard/src/store.ts:100-240]()

---

## Graph Indexes: The Dual-Layer Map

When a `KnowledgeGraph` is loaded via `setGraph`, the helper `buildGraphIndexes` constructs three lookup structures:

```ts
// understand-anything-plugin/packages/dashboard/src/store.ts
function buildGraphIndexes(graph: KnowledgeGraph): {
  nodesById: Map<string, GraphNode>;
  nodeIdToLayerId: Map<string, string>;    // first-matching-layer wins
  nodeIdToLayerIds: Map<string, Set<string>>; // every layer the node belongs to
}
```

The design comment in the source explains why two separate maps are needed:

- **`nodeIdToLayerId`** uses "first matching layer wins" semantics — if a node appears in multiple layers, only the first occurrence in `graph.layers` order is recorded. This is correct for navigation: drilling into a layer or computing a tour's target layer needs exactly one canonical answer.
- **`nodeIdToLayerIds`** records every layer membership. This is correct for filtering: a node in layer L1 and L2 must remain visible when only L2 is selected. Collapsing to first-wins for filtering would silently hide nodes.
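
A minimal sketch of how the two maps can diverge for a node that sits in more than one layer. The body below is not the source of `buildGraphIndexes`; it assumes each layer has the shape `{ id, nodeIds }`, and `IndexNode`, `IndexLayer`, `IndexGraph`, and `buildIndexesSketch` are hypothetical names.

```typescript
// Assumed, simplified shapes; the real KnowledgeGraph type has more fields.
interface IndexNode { id: string }
interface IndexLayer { id: string; nodeIds: string[] }
interface IndexGraph { nodes: IndexNode[]; layers: IndexLayer[] }

function buildIndexesSketch(graph: IndexGraph) {
  const nodesById = new Map<string, IndexNode>();
  for (const node of graph.nodes) nodesById.set(node.id, node);

  const nodeIdToLayerId = new Map<string, string>();
  const nodeIdToLayerIds = new Map<string, Set<string>>();

  for (const layer of graph.layers) {
    for (const nodeId of layer.nodeIds) {
      // Navigation index: keep only the first layer seen for this node.
      if (!nodeIdToLayerId.has(nodeId)) nodeIdToLayerId.set(nodeId, layer.id);

      // Filtering index: record every layer the node belongs to.
      let memberships = nodeIdToLayerIds.get(nodeId);
      if (memberships === undefined) {
        memberships = new Set<string>();
        nodeIdToLayerIds.set(nodeId, memberships);
      }
      memberships.add(layer.id);
    }
  }
  return { nodesById, nodeIdToLayerId, nodeIdToLayerIds };
}

// "shared.ts" belongs to both layers; "ui" comes first in graph.layers.
const sampleIndexes = buildIndexesSketch({
  nodes: [{ id: "shared.ts" }, { id: "button.tsx" }],
  layers: [
    { id: "ui", nodeIds: ["button.tsx", "shared.ts"] },
    { id: "core", nodeIds: ["shared.ts"] },
  ],
});
```

With this input, navigation resolves `shared.ts` to the `ui` layer only, while a filter restricted to `core` can still find it through the membership set.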

Sources: [understand-anything-plugin/packages/dashboard/src/store.ts:54-95]()

---

## `setGraph`: The Initialization Transition

`setGraph` is the entry point for all graph-loading logic. It is called from `App.tsx` after the fetched JSON passes schema validation:

```ts
// understand-anything-plugin/packages/dashboard/src/App.tsx
fetch(dataUrl("knowledge-graph.json", accessToken))
  .then(res => res.json())
  .then((data: unknown) => {
    const result = validateGraph(data);
    if (result.success && result.data) {
      setGraph(result.data);
      ...
      if ((data as Record<string, unknown>).kind === "knowledge") {
        useDashboardStore.getState().setViewMode("knowledge");
        useDashboardStore.getState().setIsKnowledgeGraph(true);
      }
    }
  });
```

Inside `setGraph`, the store:
1. Runs `buildGraphIndexes` to build the three lookup maps.
2. Instantiates a new `SearchEngine` over the graph nodes.
3. Re-runs any pending `searchQuery` against the new engine.
4. Preserves the current `"domain"` view mode if a domain graph is already loaded (`keepDomainView`).
5. Resets navigation, selection, focus, and all layout caches to a clean initial state.

Sources: [understand-anything-plugin/packages/dashboard/src/store.ts:365-394](), [understand-anything-plugin/packages/dashboard/src/App.tsx:132-163]()

---

## View Modes

Three view modes control which graph canvas component is mounted:

| `ViewMode` value | Graph component | When active |
|---|---|---|
| `"structural"` | `GraphView` | Default; shows the full structural graph with ELK layout |
| `"domain"` | `DomainGraphView` | When a `domain-graph.json` is available and the user switches to Domain view |
| `"knowledge"` | `KnowledgeGraphView` | When the loaded graph has `kind === "knowledge"` |

The App routes to the correct canvas at render time:

```tsx
// understand-anything-plugin/packages/dashboard/src/App.tsx
{viewMode === "knowledge" ? (
  <KnowledgeGraphView />
) : viewMode === "domain" && domainGraph ? (
  <DomainGraphView />
) : (
  <GraphView />
)}
```

`setViewMode` resets `selectedNodeId`, `focusNodeId`, and the code viewer when called — the new view starts clean. The domain graph is fetched separately from `domain-graph.json`; if absent, the domain mode button is hidden entirely (guarded by `domainGraph !== null` in App.tsx line 452).

Sources: [understand-anything-plugin/packages/dashboard/src/store.ts:685-694](), [understand-anything-plugin/packages/dashboard/src/App.tsx:638-644](), [understand-anything-plugin/packages/dashboard/src/App.tsx:452-483]()

---

## Navigation State Machine

Navigation within the structural graph has two levels, controlled by `navigationLevel: "overview" | "layer-detail"`.

```
┌──────────────────────────────────────────────────────────┐
│  navigationLevel = "overview"                            │
│  All layer-cluster nodes visible; no drill-down active   │
└────────────────┬─────────────────────────────────────────┘
                 │  drillIntoLayer(layerId)
                 │  navigateToNodeInLayer(nodeId) [if node has a layer]
                 ▼
┌──────────────────────────────────────────────────────────┐
│  navigationLevel = "layer-detail"                        │
│  activeLayerId = <id>; only this layer's nodes shown     │
└────────────────┬─────────────────────────────────────────┘
                 │  navigateToOverview()
                 │  Escape key (if nothing else to dismiss)
                 ▼
             (back to overview)
```

`navigateToNodeInLayer` resolves the node's canonical layer via `nodeIdToLayerId`, then sets both `navigationLevel = "layer-detail"` and `activeLayerId`. If the node has no layer, it only sets `selectedNodeId`.

Node selection history is maintained as a capped stack (`MAX_HISTORY = 50`). `selectNode`, `navigateToNode`, `goBackNode`, and `navigateToHistoryIndex` all manage this stack, pushing the outgoing `selectedNodeId` before navigating forward and popping it when going back.
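
The capped-stack behavior can be sketched as two pure transitions. Only `MAX_HISTORY = 50`, `selectedNodeId`, and `nodeHistory` come from the description above; `NavHistoryState`, `navigateForward`, and `goBack` are hypothetical, and skipping the push when the same node is re-selected is an assumption.

```typescript
const MAX_HISTORY = 50;

interface NavHistoryState {
  selectedNodeId: string | null;
  nodeHistory: string[]; // oldest first, most recent last
}

// Push the outgoing selection, then move to the new node.
function navigateForward(state: NavHistoryState, nodeId: string): NavHistoryState {
  const outgoing = state.selectedNodeId;
  if (outgoing === null || outgoing === nodeId) {
    return { ...state, selectedNodeId: nodeId };
  }
  // Keep only the most recent MAX_HISTORY entries.
  const nodeHistory = [...state.nodeHistory, outgoing].slice(-MAX_HISTORY);
  return { selectedNodeId: nodeId, nodeHistory };
}

// Pop the most recent entry and select it; no-op on an empty stack.
function goBack(state: NavHistoryState): NavHistoryState {
  if (state.nodeHistory.length === 0) return state;
  const previous = state.nodeHistory[state.nodeHistory.length - 1] ?? null;
  return { selectedNodeId: previous, nodeHistory: state.nodeHistory.slice(0, -1) };
}
```

Because the cap is applied on every push, the oldest entries fall off the front of the array once more than 50 nodes have been visited, and `goBack` can rewind at most 50 steps.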

Sources: [understand-anything-plugin/packages/dashboard/src/store.ts:97-98](), [understand-anything-plugin/packages/dashboard/src/store.ts:409-490]()

---

## Persona

The `persona` field takes one of three string literals: `"non-technical"`, `"junior"`, or `"experienced"`. It is declared at the top of `store.ts`:

```ts
export type Persona = "non-technical" | "junior" | "experienced";
```

The default is `"junior"`. Changing persona via `setPersona` also clears the container layout cache — persona drives which node types are visible (and thus which containers have which children), so cached positions become invalid.

In `App.tsx`, persona feeds a derived `isLearnMode` flag: `const isLearnMode = tourActive || persona === "junior"`. When `isLearnMode` is true, the sidebar renders `LearnPanel` below `NodeInfo` instead of `ProjectOverview`.

Sources: [understand-anything-plugin/packages/dashboard/src/store.ts:12](), [understand-anything-plugin/packages/dashboard/src/store.ts:534-542](), [understand-anything-plugin/packages/dashboard/src/App.tsx:391-402]()

---

## FilterState

Filtering is expressed through two complementary mechanisms:

### 1. `FilterState` (advanced filter panel)
```ts
export interface FilterState {
  nodeTypes: Set<NodeType>;       // which fine-grained types are visible
  complexities: Set<Complexity>;  // "simple" | "moderate" | "complex"
  layerIds: Set<string>;          // layer membership filter
  edgeCategories: Set<EdgeCategory>; // which edge groups to show
}
```

All sets default to "all enabled" (`ALL_NODE_TYPES`, `ALL_COMPLEXITIES`, etc.). `hasActiveFilters()` checks whether any set has been narrowed from its default to drive UI indicator badges.
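
As a rough illustration of the "narrowed from its default" check, the sketch below covers two of the four sets. It is not the store's `hasActiveFilters()`: `FilterSubset`, `isNarrowed`, `hasActiveFiltersSketch`, and `ALL_EDGE_CATEGORIES` are hypothetical names, and the node-type and layer sets are omitted because their defaults are not shown here.

```typescript
type Complexity = "simple" | "moderate" | "complex";
type EdgeCategory =
  | "structural" | "behavioral" | "data-flow" | "dependencies"
  | "semantic" | "infrastructure" | "domain" | "knowledge";

const ALL_COMPLEXITIES: readonly Complexity[] = ["simple", "moderate", "complex"];
const ALL_EDGE_CATEGORIES: readonly EdgeCategory[] = [
  "structural", "behavioral", "data-flow", "dependencies",
  "semantic", "infrastructure", "domain", "knowledge",
];

// Reduced FilterState with only the two sets used in this sketch.
interface FilterSubset {
  complexities: Set<Complexity>;
  edgeCategories: Set<EdgeCategory>;
}

// A set is "narrowed" when at least one default value is missing from it.
function isNarrowed<T>(selected: ReadonlySet<T>, all: readonly T[]): boolean {
  return all.some((value) => !selected.has(value));
}

function hasActiveFiltersSketch(filters: FilterSubset): boolean {
  return (
    isNarrowed(filters.complexities, ALL_COMPLEXITIES) ||
    isNarrowed(filters.edgeCategories, ALL_EDGE_CATEGORIES)
  );
}

const defaultFilters: FilterSubset = {
  complexities: new Set(ALL_COMPLEXITIES),
  edgeCategories: new Set(ALL_EDGE_CATEGORIES),
};
const narrowedFilters: FilterSubset = {
  ...defaultFilters,
  complexities: new Set<Complexity>(["complex"]),
};
```

Comparing against the full default list, rather than checking set size, keeps the check correct even if a set ever contains values outside the default list.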

### 2. `nodeTypeFilters` (category toggles in the header)
A flat `Record<NodeCategory, boolean>` with categories: `"code" | "config" | "docs" | "infra" | "data" | "domain" | "knowledge"`. These map to groups of `NodeType` values as defined by `NODE_TYPE_TO_CATEGORY` in `GraphView.tsx`.

Both mechanisms are in the store; `GraphView` applies them when deriving the visible node set for React Flow.

### Edge category grouping
The `EDGE_CATEGORY_MAP` constant provides the canonical grouping of edge types into categories:

```ts
export const EDGE_CATEGORY_MAP: Record<EdgeCategory, string[]> = {
  structural: ["imports", "exports", "contains", "inherits", "implements"],
  behavioral: ["calls", "subscribes", "publishes", "middleware"],
  "data-flow": ["reads_from", "writes_to", "transforms", "validates"],
  dependencies: ["depends_on", "tested_by", "configures"],
  semantic: ["related", "similar_to"],
  infrastructure: ["deploys", "serves", "provisions", "triggers", ...],
  domain: ["contains_flow", "flow_step", "cross_domain"],
  knowledge: ["cites", "contradicts", "builds_on", "exemplifies", ...],
};
```

Sources: [understand-anything-plugin/packages/dashboard/src/store.ts:20-49](), [understand-anything-plugin/packages/dashboard/src/store.ts:596-602]()

---

## SearchEngine Integration

`SearchEngine` is instantiated in `setGraph` and stored as `searchEngine` in the store. It wraps Fuse.js with a fixed field-weight configuration:

| Field | Weight |
|---|---|
| `name` | 0.4 |
| `tags` | 0.3 |
| `summary` | 0.2 |
| `languageNotes` | 0.1 |

```ts
// understand-anything-plugin/packages/core/src/search.ts
const FUSE_OPTIONS: IFuseOptions<GraphNode> = {
  keys: [
    { name: "name", weight: 0.4 },
    { name: "tags", weight: 0.3 },
    { name: "summary", weight: 0.2 },
    { name: "languageNotes", weight: 0.1 },
  ],
  threshold: 0.4,
  includeScore: true,
  ignoreLocation: true,
  useExtendedSearch: true,
};
```

When `setSearchQuery` is called, the store calls `engine.search(query)` if `searchEngine` is non-null and `query.trim()` is non-empty; otherwise `searchResults` is set to `[]`. The `searchMode` field (`"fuzzy" | "semantic"`) is stored, but both modes currently route to the same Fuse.js engine; a comment in `setSearchQuery` marks this as the placeholder for a future embeddings-based path.
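
The guard described above is small enough to show in isolation. This sketch uses a stub in place of the real `SearchEngine` so that it has no dependency on Fuse.js; `SearchEngineLike`, `SearchHit`, and `computeSearchResults` are hypothetical names, and the result shape is an assumption.

```typescript
// Stub interfaces standing in for SearchEngine and SearchResult (assumption).
interface SearchHit { nodeId: string; score: number }
interface SearchEngineLike { search(query: string): SearchHit[] }

function computeSearchResults(
  engine: SearchEngineLike | null,
  query: string,
): SearchHit[] {
  // No engine yet (graph not loaded) or a blank query: empty results.
  if (engine === null || query.trim() === "") return [];
  return engine.search(query);
}

// A trivial stub engine that records how often it was called.
let stubCalls = 0;
const stubEngine: SearchEngineLike = {
  search(query: string): SearchHit[] {
    stubCalls += 1;
    return [{ nodeId: `match:${query}`, score: 0 }];
  },
};
```

With this guard, a whitespace-only query never reaches the engine, so clearing the search box resets the results without running a search.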

Sources: [understand-anything-plugin/packages/core/src/search.ts:14-25](), [understand-anything-plugin/packages/dashboard/src/store.ts:520-532]()

---

## Code Viewer State

The code viewer has three states that form a mini state machine:

```
closed ──openCodeViewer(nodeId)──► collapsed-overlay
                                       │
                               expandCodeViewer()
                                       │
                                       ▼
                                 expanded-modal
                                       │
                               collapseCodeViewer()
                                       │
                                       ▼
                               collapsed-overlay
                                       │
                               closeCodeViewer()
                                       │
                                       ▼
                                    closed
```

`App.tsx` renders the collapsed overlay as an absolutely-positioned 40vh panel at the bottom, and the expanded state as a fixed full-screen modal with a backdrop. Both mount the `CodeViewer` component (lazy-loaded), which fetches source via `/file-content.json?token=…&path=…`.
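
The three viewer states map onto the three store fields listed under "Code viewer" in the store shape table. The following is a sketch of those transitions as pure functions; the function bodies and `describeViewer` are assumptions, including the choice that opening always starts in the collapsed overlay.

```typescript
interface CodeViewerSlice {
  codeViewerOpen: boolean;
  codeViewerNodeId: string | null;
  codeViewerExpanded: boolean;
}

const initialViewer: CodeViewerSlice = {
  codeViewerOpen: false,
  codeViewerNodeId: null,
  codeViewerExpanded: false,
};

// closed -> collapsed-overlay
function openCodeViewer(nodeId: string): CodeViewerSlice {
  return { codeViewerOpen: true, codeViewerNodeId: nodeId, codeViewerExpanded: false };
}

// collapsed-overlay -> expanded-modal (ignored while closed)
function expandCodeViewer(state: CodeViewerSlice): CodeViewerSlice {
  return state.codeViewerOpen ? { ...state, codeViewerExpanded: true } : state;
}

// expanded-modal -> collapsed-overlay
function collapseCodeViewer(state: CodeViewerSlice): CodeViewerSlice {
  return { ...state, codeViewerExpanded: false };
}

// any -> closed
function closeCodeViewer(): CodeViewerSlice {
  return initialViewer;
}

// Derive the diagram's state name from the three fields.
function describeViewer(
  state: CodeViewerSlice,
): "closed" | "collapsed-overlay" | "expanded-modal" {
  if (!state.codeViewerOpen) return "closed";
  return state.codeViewerExpanded ? "expanded-modal" : "collapsed-overlay";
}
```

Deriving the state name from the fields, instead of storing it separately, means the three booleans and the diagram cannot drift apart.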

Sources: [understand-anything-plugin/packages/dashboard/src/store.ts:544-549](), [understand-anything-plugin/packages/dashboard/src/App.tsx:656-684]()

---

## Tour State

The tour is driven by `graph.tour` — an array of `TourStep` objects sorted by `step.order` at runtime. `startTour`, `nextTourStep`, `prevTourStep`, and `setTourStep` all call `navigateTourToLayer` to automatically drill into the correct layer for the first highlighted node in each step. When a step crosses layers, `layerResetIfChanged` clears the container layout cache so the new layer's ELK layout runs fresh.

`tourFitPending` is a boolean flag set to `true` while the graph is waiting for highlighted nodes to materialize after a layer change; it drives a "Computing layout…" overlay in the UI.

Sources: [understand-anything-plugin/packages/dashboard/src/store.ts:247-287](), [understand-anything-plugin/packages/dashboard/src/store.ts:604-670]()

---

## Layout Cache Invalidation Discipline

Every state transition that changes the visible node set **must** reset the container layout cache. The store enforces this consistently by including these four resets in every relevant action:

```ts
containerLayoutCache: new Map(),
containerSizeMemory: new Map(),
expandedContainers: new Set(),
pendingFocusContainer: null,
```

This pattern appears in: `drillIntoLayer`, `navigateToOverview`, `setFocusNode`, `setPersona`, `toggleNodeTypeFilter`, `setDetailLevel`, `toggleShowFunctionsInClassView`, `startTour` / `setTourStep` (when layer changes), and `setGraph`. Failure to reset in any of these would cause ELK to position new nodes using stale positions from the previous visible set.
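
One way to keep the four resets from being forgotten in a new action is to produce them from a single helper and spread the result into each state update. This is a sketch of that idea, not how `store.ts` is written: `freshLayoutCache`, `LayoutCacheSlice`, `NavSlice`, and `drillIntoLayerSketch` are hypothetical, and the map value types are placeholders.

```typescript
interface LayoutCacheSlice {
  containerLayoutCache: Map<string, unknown>;
  containerSizeMemory: Map<string, unknown>;
  expandedContainers: Set<string>;
  pendingFocusContainer: string | null;
}

interface NavSlice extends LayoutCacheSlice {
  navigationLevel: "overview" | "layer-detail";
  activeLayerId: string | null;
}

// Returns brand-new empty containers on every call, never shared instances.
function freshLayoutCache(): LayoutCacheSlice {
  return {
    containerLayoutCache: new Map<string, unknown>(),
    containerSizeMemory: new Map<string, unknown>(),
    expandedContainers: new Set<string>(),
    pendingFocusContainer: null,
  };
}

// Example action: changing the visible node set also resets the layout cache.
function drillIntoLayerSketch(state: NavSlice, layerId: string): NavSlice {
  return {
    ...state,
    navigationLevel: "layer-detail",
    activeLayerId: layerId,
    ...freshLayoutCache(),
  };
}
```

Spreading `freshLayoutCache()` last guarantees that its four fields override whatever the previous state held.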

Sources: [understand-anything-plugin/packages/dashboard/src/store.ts:474-505](), [understand-anything-plugin/packages/dashboard/src/store.ts:327-363]()

---

## Module Boundary: Browser-Safe Core Imports

The dashboard enforces a strict import boundary: it may only import from core's browser-safe subpath exports, never from the main entry point. The store itself demonstrates this at its top:

```ts
import { SearchEngine } from "@understand-anything/core/search";
import type { SearchResult } from "@understand-anything/core/search";
import type { GraphIssue } from "@understand-anything/core/schema";
import type { GraphNode, KnowledgeGraph, TourStep } from "@understand-anything/core/types";
```

The main core entry point pulls in Node.js modules (tree-sitter, filesystem utilities) that would fail in the browser. Using subpath exports (`/search`, `/types`, `/schema`) keeps the browser bundle free of those dependencies. This boundary is enforced by TypeScript module resolution and documented in CLAUDE.md.

Sources: [understand-anything-plugin/packages/dashboard/src/store.ts:1-10]()

---

## Summary

The `useDashboardStore` in `store.ts` is the single source of truth for all dashboard runtime state. Its design encodes several invariants: the dual-index layer maps ensure correctness for both navigation (first-layer-wins) and filtering (all-layers membership); layout cache resets are co-located with every state mutation that changes node visibility; and view mode transitions start with a clean selection state. `SearchEngine` (backed by Fuse.js with weighted fields) is embedded in the store and rebuilt on each graph load. The import boundary — all dashboard code importing only from core's browser-safe subpaths — is enforced structurally rather than by convention.

Sources: [understand-anything-plugin/packages/dashboard/src/store.ts:289-393]()

---

## 07. Skill Surface: /understand, /understand-chat, /understand-diff & Hooks

> Eight skills are exposed: /understand (full graph build), /understand-dashboard (opens dashboard), /understand-chat (Q&A against the graph using context-builder.ts), /understand-diff (change analysis via diff-analyzer.ts), /understand-explain (node explanation via explain-builder.ts), /understand-onboard (onboarding guide via onboard-builder.ts), /understand-domain, and /understand-knowledge. The @understand-anything/skill package exports typed builders consumed by the chat/diff/explain/onboard skills. Hooks (hooks.json) fire PostToolUse on git commit to trigger auto-update and a PreToolUse hook to auto-update before /understand-chat responses. Agent models are all set to inherit for cross-platform compatibility.

- Page Markdown: https://grok-wiki.com/public/wiki/lum1104-understand-anything-3b923df96896/pages/07-skill-surface-understand-understand-chat-understand-diff-hooks.md
- Generated: 2026-05-22T00:56:12.835Z

### Source Files

- `understand-anything-plugin/src/index.ts`
- `understand-anything-plugin/src/context-builder.ts`
- `understand-anything-plugin/src/diff-analyzer.ts`
- `understand-anything-plugin/src/explain-builder.ts`
- `understand-anything-plugin/src/onboard-builder.ts`
- `understand-anything-plugin/hooks/hooks.json`
- `understand-anything-plugin/skills/understand-chat/SKILL.md`

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:

- [understand-anything-plugin/src/index.ts](understand-anything-plugin/src/index.ts)
- [understand-anything-plugin/src/context-builder.ts](understand-anything-plugin/src/context-builder.ts)
- [understand-anything-plugin/src/diff-analyzer.ts](understand-anything-plugin/src/diff-analyzer.ts)
- [understand-anything-plugin/src/explain-builder.ts](understand-anything-plugin/src/explain-builder.ts)
- [understand-anything-plugin/src/onboard-builder.ts](understand-anything-plugin/src/onboard-builder.ts)
- [understand-anything-plugin/hooks/hooks.json](understand-anything-plugin/hooks/hooks.json)
- [understand-anything-plugin/hooks/auto-update-prompt.md](understand-anything-plugin/hooks/auto-update-prompt.md)
- [understand-anything-plugin/skills/understand/SKILL.md](understand-anything-plugin/skills/understand/SKILL.md)
- [understand-anything-plugin/skills/understand-chat/SKILL.md](understand-anything-plugin/skills/understand-chat/SKILL.md)
- [understand-anything-plugin/skills/understand-diff/SKILL.md](understand-anything-plugin/skills/understand-diff/SKILL.md)
- [understand-anything-plugin/skills/understand-explain/SKILL.md](understand-anything-plugin/skills/understand-explain/SKILL.md)
- [understand-anything-plugin/skills/understand-onboard/SKILL.md](understand-anything-plugin/skills/understand-onboard/SKILL.md)
- [understand-anything-plugin/skills/understand-dashboard/SKILL.md](understand-anything-plugin/skills/understand-dashboard/SKILL.md)
- [understand-anything-plugin/skills/understand-domain/SKILL.md](understand-anything-plugin/skills/understand-domain/SKILL.md)
- [understand-anything-plugin/skills/understand-knowledge/SKILL.md](understand-anything-plugin/skills/understand-knowledge/SKILL.md)
</details>

# Skill Surface: /understand, /understand-chat, /understand-diff & Hooks

The `@understand-anything/skill` package exposes eight slash commands (skills) that cover the full lifecycle of working with a codebase knowledge graph: building the graph, querying it conversationally, analyzing diffs, explaining individual components, generating onboarding guides, visualizing a business domain model, and indexing an LLM wiki. Each skill is a SKILL.md prompt file consumed directly by the AI runtime; the four chat/diff/explain/onboard skills additionally delegate heavy graph-query logic to typed TypeScript builder functions exported from `src/index.ts`, keeping the prompt logic thin and the computation testable.

A `hooks.json` file wires two lifecycle hooks into the AI runtime so that the knowledge graph stays current without user intervention: a `PostToolUse` hook fires whenever a `git commit/merge/cherry-pick/rebase` is detected and triggers an incremental auto-update, and a `SessionStart` hook compares the stored git commit hash against `HEAD` and triggers the same update if they diverge. All agent models referenced across every skill are set to `inherit`, making the entire skill surface portable across Claude Code, Cursor, opencode, and any other compatible runtime.

---

## Skill Inventory

| Skill | Trigger | What it does | Backed by |
|---|---|---|---|
| `/understand` | User-invoked | Full or incremental graph build | 7-phase pipeline with subagents |
| `/understand-dashboard` | Auto after `/understand` | Starts Vite dev server with tokenized URL | `packages/dashboard/` |
| `/understand-chat` | User-invoked | Q&A against the graph | `context-builder.ts` |
| `/understand-diff` | User-invoked | Change/PR impact analysis | `diff-analyzer.ts` |
| `/understand-explain` | User-invoked | Deep-dive on a file or function | `explain-builder.ts` |
| `/understand-onboard` | User-invoked | Markdown onboarding guide | `onboard-builder.ts` |
| `/understand-domain` | User-invoked | Business domain flow graph | `domain-analyzer` agent |
| `/understand-knowledge` | User-invoked | Karpathy LLM wiki graph | `article-analyzer` agent |

---

## `/understand` — Graph Build Pipeline

`/understand` is the root skill. It produces `.understand-anything/knowledge-graph.json` and then automatically triggers `/understand-dashboard`.

### Options

| Flag | Effect |
|---|---|
| `--full` | Force complete rebuild, ignoring existing graph |
| `--auto-update` | Write `autoUpdate: true` to `config.json` (enables hook-driven incremental updates) |
| `--no-auto-update` | Write `autoUpdate: false` to `config.json` |
| `--review` | Use the LLM `graph-reviewer` agent instead of inline deterministic validation |
| `--language <lang>` | Output summaries/tags in a specific language (ISO 639-1 or friendly name); persisted to `config.json` |
| `<path>` | Analyze a different directory; git worktrees are redirected to the main repo root |

### Seven-Phase Pipeline

```text
Phase 0    Pre-flight: resolve PROJECT_ROOT, detect worktree, build core if needed,
           check auto-update config, merge subdomain graphs
Phase 0.5  .understandignore setup (user-confirmed)
Phase 1    SCAN — project-scanner agent: file tree, language/framework detection,
           importMap, fileCategory per file
Phase 2    ANALYZE — up to 5 concurrent file-analyzer subagents (batches of 25 files);
           merge-batch-graphs.py normalizes IDs and links tested_by edges
Phase 3    ASSEMBLE REVIEW — assemble-reviewer agent validates cross-batch consistency
Phase 4    ARCHITECTURE — architecture-analyzer agent; language/framework context injected
Phase 5    TOUR — tour-builder agent; README and entry point injected as context
Phase 6    REVIEW — inline deterministic Node.js validator (or LLM graph-reviewer with --review)
Phase 7    SAVE — write knowledge-graph.json, build fingerprints baseline, write meta.json,
           clean intermediate files, auto-launch dashboard
```

The decision logic in Phase 0 determines whether to run a full analysis, incremental update, or review-only:

- **No existing graph** → full analysis (all phases)
- **Existing graph + same commit hash** → ask user (rebuild / review / nothing)
- **Existing graph + changed files** → incremental: only re-analyze changed files, then re-run architecture/tour
- **`--full`** → always full analysis

Sources: [understand-anything-plugin/skills/understand/SKILL.md:143-153]()
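
The four decision rules above can be sketched as a pure function. This is a minimal illustration, not the skill's actual logic (which lives in prompt text): the names `BuildMode`, `Phase0Input`, and `decideBuildMode` are hypothetical, and the final fallback branch covers a case the rules do not spell out.

```typescript
// Hypothetical types and names, for illustration only.
type BuildMode = "full" | "incremental" | "ask-user";

interface Phase0Input {
  fullFlag: boolean;           // --full was passed
  graphExists: boolean;        // knowledge-graph.json is present
  storedCommit: string | null; // project.gitCommitHash from the existing graph
  headCommit: string;          // output of `git rev-parse HEAD`
  changedFiles: string[];      // files changed since storedCommit
}

function decideBuildMode(input: Phase0Input): BuildMode {
  // --full always wins; a missing graph also forces a full analysis.
  if (input.fullFlag || !input.graphExists) return "full";
  // Same commit: nothing new to analyze, so let the user choose.
  if (input.storedCommit === input.headCommit) return "ask-user";
  // Different commit with changed files: re-analyze only those files.
  if (input.changedFiles.length > 0) return "incremental";
  // Assumption: a new commit with no relevant changes is treated like "same commit".
  return "ask-user";
}
```

Keeping the decision free of I/O makes each rule checkable in isolation; the caller would supply the git and filesystem facts.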

### Knowledge Graph Schema

The output JSON has a fixed schema validated in Phase 6. Key shapes:

| Section | Contents |
|---|---|
| `project` | name, description, languages, frameworks, analyzedAt, gitCommitHash |
| `nodes[]` | id, type (13 types), name, filePath, summary, tags, complexity, languageNotes? |
| `edges[]` | source, target, type (26 types), direction, weight |
| `layers[]` | id, name, description, nodeIds[] |
| `tour[]` | order, title, description, nodeIds[], languageLesson? |

Node ID conventions: `file:<path>`, `function:<path>:<name>`, `class:<path>:<name>`, `config:<path>`, `document:<path>`, `service:<path>`, etc.

Sources: [understand-anything-plugin/skills/understand/SKILL.md:751-789]()
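
As a rough sketch of the ID conventions listed above, the helpers below build and parse IDs of the form `<kind>:<path>` and `<kind>:<path>:<name>`. The function names `makeNodeId` and `parseNodeId` are hypothetical, only the six kinds named above are modeled, and the sketch assumes symbol names contain no colons.

```typescript
// Only the kinds listed in the conventions above; the real schema has more.
type NodeKind = "file" | "function" | "class" | "config" | "document" | "service";

// Kinds whose IDs carry a trailing symbol name.
function hasSymbolName(kind: string): boolean {
  return kind === "function" || kind === "class";
}

function makeNodeId(kind: NodeKind, filePath: string, name?: string): string {
  if (!hasSymbolName(kind)) return `${kind}:${filePath}`;
  if (!name) throw new Error(`${kind} IDs need a symbol name`);
  return `${kind}:${filePath}:${name}`;
}

function parseNodeId(id: string): { kind: string; filePath: string; name?: string } {
  const first = id.indexOf(":");
  if (first < 0) throw new Error(`not a node ID: ${id}`);
  const kind = id.slice(0, first);
  const rest = id.slice(first + 1);
  if (!hasSymbolName(kind)) return { kind, filePath: rest };
  // Split on the LAST colon so that paths containing colons stay intact.
  const last = rest.lastIndexOf(":");
  if (last < 0) throw new Error(`missing symbol name in: ${id}`);
  return { kind, filePath: rest.slice(0, last), name: rest.slice(last + 1) };
}
```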

---

## The `@understand-anything/skill` Package

The TypeScript source under `understand-anything-plugin/src/` exports typed builder and formatter functions for four skills: chat, diff, explain, and onboard. The skill prompts consume these functions, which separates testable computation from prompt text.

```typescript
// understand-anything-plugin/src/index.ts
export { buildChatContext, formatContextForPrompt, type ChatContext } from "./context-builder.js";
export { buildChatPrompt } from "./understand-chat.js";
export { buildDiffContext, formatDiffAnalysis, type DiffContext } from "./diff-analyzer.js";
export { buildExplainContext, formatExplainPrompt, type ExplainContext } from "./explain-builder.js";
export { buildOnboardingGuide } from "./onboard-builder.js";
```

Sources: [understand-anything-plugin/src/index.ts:1-17]()

---

## `/understand-chat` — Graph Q&A

`/understand-chat [query]` answers freeform questions about the codebase by searching the knowledge graph rather than reading source files directly.

### How the Skill Prompt Works

1. Verify `.understand-anything/knowledge-graph.json` exists (fail fast if not)
2. Grep the graph for the user's query keywords in `"name"`, `"summary"`, and `"tags"` fields — avoiding loading the full JSON into context
3. For each matched node ID, grep edges to build the 1-hop subgraph
4. Grep `"layers"` to find which architectural layers the matched nodes belong to
5. Answer using only the relevant subgraph, referencing specific files, functions, and relationships

Sources: [understand-anything-plugin/skills/understand-chat/SKILL.md:25-54]()

### `context-builder.ts` — The Backing Library

The `buildChatContext` function is the programmatic equivalent of the skill's grep-based approach. It uses `SearchEngine` from `@understand-anything/core` to search nodes, then expands one hop via edges, and collects all layers that contain any expanded node:

```typescript
// understand-anything-plugin/src/context-builder.ts:25-79
export function buildChatContext(graph, query, maxNodes): ChatContext {
  const engine = new SearchEngine(graph.nodes);
  const searchResults = engine.search(query, { limit: maxNodes ?? 15 });

  // matchedIds (derivation elided in this excerpt): the set of node IDs taken from searchResults
  // 1-hop expansion via edges
  const expandedIds = new Set(matchedIds);
  for (const edge of graph.edges) {
    if (matchedIds.has(edge.source)) expandedIds.add(edge.target);
    if (matchedIds.has(edge.target)) expandedIds.add(edge.source);
  }

  // Collect edges where both endpoints are in the expanded set
  // Collect layers containing any expanded node
  return { projectName, projectDescription, relevantNodes, relevantEdges, relevantLayers, query };
}
```

`formatContextForPrompt` then serializes the `ChatContext` into markdown sections (Project header, Relevant Layers, Code Components, Relationships) suitable for LLM consumption.

Sources: [understand-anything-plugin/src/context-builder.ts:25-147]()
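
To make the match-then-expand idea concrete, here is a self-contained sketch. It deliberately swaps `SearchEngine` for plain case-insensitive substring matching, so ranking will differ from the real library. The names `GNode`, `GEdge`, and `expandOneHop` are hypothetical.

```typescript
// Minimal hypothetical shapes; the real graph types carry more fields.
interface GNode { id: string; name: string; summary: string; tags: string[] }
interface GEdge { source: string; target: string; type: string }

function expandOneHop(nodes: GNode[], edges: GEdge[], query: string, maxNodes = 15) {
  const q = query.toLowerCase();
  // Assumption: substring matching stands in for the fuzzy SearchEngine.
  const matchedIds = new Set(
    nodes
      .filter(n => [n.name, n.summary, ...n.tags].some(f => f.toLowerCase().includes(q)))
      .slice(0, maxNodes)
      .map(n => n.id),
  );
  // 1-hop expansion: add the other endpoint of every edge touching a match.
  const expandedIds = new Set(matchedIds);
  for (const e of edges) {
    if (matchedIds.has(e.source)) expandedIds.add(e.target);
    if (matchedIds.has(e.target)) expandedIds.add(e.source);
  }
  // Keep only edges whose endpoints are both inside the expanded set.
  const relevantEdges = edges.filter(e => expandedIds.has(e.source) && expandedIds.has(e.target));
  return { matchedIds, expandedIds, relevantEdges };
}
```

Expansion starts from `matchedIds` rather than the growing `expandedIds` set, which is what keeps the result at exactly one hop.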

---

## `/understand-diff` — Change Impact Analysis

`/understand-diff` maps git diffs onto the knowledge graph to identify changed components, downstream blast radius, affected architectural layers, and risk level.

### Skill Prompt Workflow

1. Get changed files: `git diff --name-only` (uncommitted) or `git diff main...HEAD --name-only` (branch)
2. Grep the graph for nodes whose `filePath` matches each changed file
3. Grep edges for 1-hop neighbors (upstream callers and downstream dependencies)
4. Identify affected layers
5. Produce a structured report: Changed Components, Affected Components, Affected Layers, Risk Assessment
6. Write `.understand-anything/diff-overlay.json` so the dashboard can highlight changed/affected nodes visually

Sources: [understand-anything-plugin/skills/understand-diff/SKILL.md:29-70]()

### `diff-analyzer.ts` — The Backing Library

`buildDiffContext` maps file paths to graph nodes (also pulling in `contains`-edge children of matched file nodes), finds 1-hop affected neighbors, and computes a `DiffContext`:

```typescript
// understand-anything-plugin/src/diff-analyzer.ts:22-88
export function buildDiffContext(graph, changedFiles): DiffContext {
  // Map files → node IDs; track unmappedFiles for newly created/deleted files
  // Expand via "contains" edges (e.g., a file node → its function nodes)
  // Find 1-hop affected nodes and the impacted edges
  // Identify affected layers from union of changed + affected node IDs
  return { changedFiles, changedNodes, affectedNodes, impactedEdges, affectedLayers, unmappedFiles };
}
```

`formatDiffAnalysis` produces a risk-scored markdown report. The risk assessment heuristics are:

| Signal | Risk indicator |
|---|---|
| Any `changedNodes` with `complexity === "complex"` | High complexity flag |
| `affectedLayers.length > 1` | Cross-layer impact flag |
| `affectedNodes.length > 5` | Wide blast radius flag |
| `unmappedFiles.length > 0` | New/untracked files flag |
| None of the above | Low risk |

Sources: [understand-anything-plugin/src/diff-analyzer.ts:160-194]()
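
The heuristics table translates almost directly into code. The sketch below is an illustration of those thresholds only: the `DiffSummary` shape, the `riskFlags` name, and the flag labels are hypothetical, not the strings `formatDiffAnalysis` emits.

```typescript
// Hypothetical, trimmed-down view of a DiffContext.
interface DiffSummary {
  changedNodes: { complexity: string }[];
  affectedNodes: unknown[];
  affectedLayers: unknown[];
  unmappedFiles: string[];
}

function riskFlags(d: DiffSummary): string[] {
  const flags: string[] = [];
  if (d.changedNodes.some(n => n.complexity === "complex")) flags.push("high-complexity");
  if (d.affectedLayers.length > 1) flags.push("cross-layer");
  if (d.affectedNodes.length > 5) flags.push("wide-blast-radius");
  if (d.unmappedFiles.length > 0) flags.push("unmapped-files");
  // No signal fired: the table's final row.
  return flags.length > 0 ? flags : ["low-risk"];
}
```

The signals are independent, so a single diff can raise several flags at once; "low risk" is only reported when none fire.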

---

## `/understand-explain` — Deep Component Explanation

`/understand-explain [file-path]` produces a thorough explanation of a single file or function by combining graph context with source file reading.

### Path Format Support

Both `src/auth.ts` (file) and `src/auth.ts:login` (file:function) are accepted. The skill prompt reads the actual source file in step 6, while steps 2-5 use only graph grep operations.

### `explain-builder.ts` — The Backing Library

`buildExplainContext` resolves `path:function` notation, falls back to file path matching, then collects child nodes (via `contains` edges), connected neighbors (1-hop, excluding children), all relevant edges, and the containing layer:

```typescript
// understand-anything-plugin/src/explain-builder.ts:22-103
export function buildExplainContext(graph, path): ExplainContext {
  // Parse "src/auth.ts:login" → filePath + funcName
  // targetNode → childNodes (contains edges) → connectedNodes (1-hop)
  // relevantEdges: all edges touching target or children
  // layer: first layer containing the target node ID
}
```

`formatExplainPrompt` serializes the result as a structured "Deep Dive" prompt with sections for Architectural Layer, Internal Components, Connected Components, Relationships, and Language Notes. When the target node is not found, it returns a diagnostic message explaining possible causes (not analyzed, renamed, or deleted).

Sources: [understand-anything-plugin/src/explain-builder.ts:22-196]()
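
The child/neighbor split described above can be sketched as follows. This is one plausible reading, not the library's code: it assumes `contains` edges point from container to child, and that "connected" means any node one hop from the target or its children. The names `Edge` and `collectExplainNeighbors` are hypothetical.

```typescript
interface Edge { source: string; target: string; type: string }

function collectExplainNeighbors(targetId: string, edges: Edge[]) {
  // Children: targets of "contains" edges that start at the target node.
  const childIds = new Set(
    edges.filter(e => e.type === "contains" && e.source === targetId).map(e => e.target),
  );
  // Scope = the target plus its children.
  const scope = new Set([targetId].concat(Array.from(childIds)));
  // Relevant edges: anything touching the target or one of its children.
  const relevantEdges = edges.filter(e => scope.has(e.source) || scope.has(e.target));
  // Connected nodes: the endpoints of those edges that lie outside the scope.
  const connectedIds = new Set<string>();
  for (const e of relevantEdges) {
    for (const id of [e.source, e.target]) {
      if (!scope.has(id)) connectedIds.add(id);
    }
  }
  return { childIds, connectedIds, relevantEdges };
}
```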

---

## `/understand-onboard` — Onboarding Guide Generation

`/understand-onboard` synthesizes the knowledge graph into a standalone markdown document suitable for new team members.

### `onboard-builder.ts` — The Backing Library

`buildOnboardingGuide` reads the full graph and renders seven sections in order:

| Section | Source data |
|---|---|
| Project overview table | `graph.project` (languages, frameworks, node/edge counts, analyzedAt) |
| Architecture layers | `graph.layers` + node names from `graph.nodes` |
| Key Concepts | `graph.nodes` filtered to `type === "concept"` |
| Getting Started (Guided Tour) | `graph.tour` steps with ordered walkthrough and optional `languageLesson` |
| File Map | All `type === "file"` nodes with filePath, summary, complexity |
| Complexity Hotspots | `graph.nodes` filtered to `complexity === "complex"` |
| Footer | Version attribution |

```typescript
// understand-anything-plugin/src/onboard-builder.ts:7-124
export function buildOnboardingGuide(graph: KnowledgeGraph): string {
  // Project header table
  // Architecture section per layer (with member names resolved from nodes)
  // Key Concepts (concept-type nodes only)
  // Getting Started (tour steps, with languageLesson support)
  // File Map table
  // Complexity Hotspots list
  // Attribution footer referencing graph.version
}
```

The skill prompt (steps 8-10) instructs the AI to output clean markdown and offer to save it as `docs/ONBOARDING.md`.

Sources: [understand-anything-plugin/src/onboard-builder.ts:7-124](), [understand-anything-plugin/skills/understand-onboard/SKILL.md:43-54]()
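
Two of the sections in the table are simple filters over `graph.nodes`. The sketch below shows how a File Map table and a hotspot list might be rendered; the column headings, the `FileNode` shape, and both function names are assumptions rather than the builder's actual output format.

```typescript
// Hypothetical, trimmed-down node shape.
interface FileNode { type: string; name: string; filePath?: string; summary: string; complexity: string }

// File Map: one markdown table row per file-type node.
function renderFileMap(nodes: FileNode[]): string {
  const rows = nodes
    .filter(n => n.type === "file")
    .map(n => `| \`${n.filePath ?? n.name}\` | ${n.summary} | ${n.complexity} |`);
  return ["| File | Summary | Complexity |", "|---|---|---|", ...rows].join("\n");
}

// Complexity Hotspots: names of every node marked "complex", regardless of type.
function listHotspots(nodes: FileNode[]): string[] {
  return nodes.filter(n => n.complexity === "complex").map(n => n.name);
}
```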

---

## `/understand-dashboard` — Dashboard Launcher

`/understand-dashboard [project-path]` starts the Vite dev server for the React dashboard. It resolves `PLUGIN_ROOT` through a priority-ordered candidate list (`CLAUDE_PLUGIN_ROOT` → `~/.understand-anything-plugin` → self-relative symlink resolution → common clone paths), builds the core package if `dist/` is missing, then starts:

```bash
cd <dashboard-dir> && GRAPH_DIR=<project-dir> npx vite --host 127.0.0.1
```

The server prints a tokenized URL (`?token=<TOKEN>`) that must be passed to the user verbatim — the token gates access to the `/file-content.json` endpoint and is required to load graph data.

`/understand` automatically calls `/understand-dashboard` at the end of Phase 7 if graph validation passed.

Sources: [understand-anything-plugin/skills/understand-dashboard/SKILL.md:80-99]()

---

## `/understand-domain` — Business Domain Graph

`/understand-domain [--full]` extracts business domains, flows, and process steps — separate from the structural code graph.

- **Without `--full`**: if `knowledge-graph.json` exists, derives domain knowledge directly from it (no file scanning, cheap)
- **With `--full`** or no existing graph: runs a lightweight scan via `extract-domain-context.py`, which emits file tree, entry points (HTTP routes, CLI commands, event handlers, cron jobs), and file signatures — then dispatches a `domain-analyzer` agent

Output goes to `.understand-anything/domain-graph.json`. The dashboard detects this file and switches to the horizontal domain flow layout. The graph uses `kind: "knowledge"` to signal force-directed layout.

Sources: [understand-anything-plugin/skills/understand-domain/SKILL.md:89-141]()

---

## `/understand-knowledge` — Karpathy LLM Wiki Graph

`/understand-knowledge [wiki-directory]` targets a Karpathy-pattern LLM wiki (raw sources + wiki markdown with `[[wikilink]]` syntax + schema file + `index.md` + `log.md`).

The pipeline:
1. **DETECT**: `parse-knowledge-base.py` runs deterministic extraction: article nodes, source nodes, topic nodes, `related` edges from wikilinks, `categorized_under` edges from `index.md` section headings → `scan-manifest.json`
2. **ANALYZE**: `article-analyzer` subagents (batches of 10-15, up to 3 concurrent) extract implicit relationships not captured by wikilinks
3. **MERGE**: `merge-knowledge-graph.py` deduplicates entities, normalizes node/edge types, builds layers from `index.md` categories, builds tour from section ordering
4. **SAVE**: validates, writes `knowledge-graph.json`, auto-triggers `/understand-dashboard`

The `--full` flag is not applicable; a fresh run always re-parses.

Sources: [understand-anything-plugin/skills/understand-knowledge/SKILL.md:29-127]()
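
The deterministic part of the DETECT step, turning `[[wikilink]]` references into `related` edges, can be illustrated in a few lines. This is a TypeScript sketch of the idea only: the real extraction is done by `parse-knowledge-base.py`, and the `RelatedEdge` shape, the `wikilinkEdges` name, the use of raw article names as endpoints, and the handling of `[[target|label]]` aliases are all assumptions.

```typescript
// Hypothetical edge shape using raw article names as endpoints.
interface RelatedEdge { source: string; target: string; type: "related" }

function wikilinkEdges(articleName: string, markdown: string): RelatedEdge[] {
  // Captures the link target; an optional "|label" suffix is ignored (assumed alias syntax).
  const re = /\[\[([^\]|]+)(?:\|[^\]]*)?\]\]/g;
  const seen = new Set<string>();
  const edges: RelatedEdge[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(markdown)) !== null) {
    const target = m[1].trim();
    // Skip empty targets, self-links, and repeated links to the same article.
    if (target === "" || target === articleName || seen.has(target)) continue;
    seen.add(target);
    edges.push({ source: articleName, target, type: "related" });
  }
  return edges;
}
```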

---

## Hooks: Automatic Graph Maintenance

Two hooks in `hooks.json` keep the knowledge graph current without user intervention.

### Hook Configuration

```json
// understand-anything-plugin/hooks/hooks.json
{
  "hooks": {
    "PostToolUse": [{
      "matcher": "Bash",
      "hooks": [{
        "type": "command",
        "command": "printf '%s' \"$TOOL_INPUT\" | grep -qE 'git\\s+(commit|merge|cherry-pick|rebase)' && [ -f .understand-anything/config.json ] && grep -q '\"autoUpdate\".*true' .understand-anything/config.json && [ -f .understand-anything/knowledge-graph.json ] && echo \"...\""
      }]
    }],
    "SessionStart": [{
      "hooks": [{
        "type": "command",
        "command": "... compare stored gitCommitHash with git rev-parse HEAD ..."
      }]
    }]
  }
}
```

Sources: [understand-anything-plugin/hooks/hooks.json:1-25]()

### Hook Behavior Table

| Hook | Trigger | Condition | Action |
|---|---|---|---|
| `PostToolUse` (Bash matcher) | Any Bash tool call containing `git commit/merge/cherry-pick/rebase` | `autoUpdate: true` in `config.json` AND `knowledge-graph.json` exists | Instructs AI to read `auto-update-prompt.md` and execute incremental update |
| `SessionStart` | Session begin | `autoUpdate: true` AND stored `gitCommitHash` ≠ `git rev-parse HEAD` | Same — triggers incremental update if graph is stale |

Both hooks are **guard-checked**: if `autoUpdate` is not `true` or no graph file exists, the hook outputs nothing (via `|| true`) and is a no-op.
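
For readers who find the shell one-liner hard to scan, the `PostToolUse` guard chain can be restated as a pure function. This is a sketch mirroring the two `grep` patterns shown in `hooks.json`; the function name `shouldTriggerAutoUpdate` and its parameters are hypothetical, and file reads are left to the caller.

```typescript
// Same patterns as the grep calls in the hook command.
const GIT_WRITE_RE = /git\s+(commit|merge|cherry-pick|rebase)/;
const AUTO_UPDATE_RE = /"autoUpdate".*true/;

function shouldTriggerAutoUpdate(
  toolInput: string,         // the Bash tool call text
  configText: string | null, // contents of .understand-anything/config.json, or null if absent
  graphExists: boolean,      // whether knowledge-graph.json is present
): boolean {
  if (!GIT_WRITE_RE.test(toolInput)) return false;   // not a history-changing git command
  if (configText === null) return false;             // no config file
  if (!AUTO_UPDATE_RE.test(configText)) return false; // autoUpdate not enabled
  return graphExists;                                 // nothing to update without a graph
}
```

As with the shell version, every failed guard short-circuits to a no-op, so the hook stays silent in projects that never opted in.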

### Auto-Update Pipeline (hooks/auto-update-prompt.md)

The hook-triggered update follows a four-phase, token-minimizing pipeline:

```text
Phase 0  Pre-flight — verify graph + meta exist; get git commit diff
         Apply .understandignore exclusions to the changed-file list
Phase 1  Structural Fingerprint Check (zero LLM tokens)
         Classify each changed file as NONE / COSMETIC / STRUCTURAL
         Decision: SKIP | PARTIAL_UPDATE | ARCHITECTURE_UPDATE | FULL_UPDATE
Phase 2  Targeted Re-Analysis (LLM tokens only for STRUCTURAL files)
         Dispatch file-analyzer subagents only for structurally changed files
         Merge new nodes/edges into existing graph (remove stale, add fresh)
Phase 3  Conditional architecture/tour update + save
         LOAD-PATCH-SAVE fingerprints (never overwrite — issue #152 guard)
         Clean intermediate files; report summary
```

The fingerprint check in Phase 1 classifies changes without LLM calls using regex extraction (functions, classes, imports, exports). Only Phase 2 spends LLM tokens, and only when structural signatures actually changed.

A critical correctness invariant: the fingerprint update in Phase 3 must **load all existing entries, patch only the re-analyzed files, and save the full dict back**. Overwriting with only the fresh batch causes every other file to appear as a new structural change on the next commit, escalating to `FULL_UPDATE` permanently (tracked as issue #152).

Sources: [understand-anything-plugin/hooks/auto-update-prompt.md:1-10](), [understand-anything-plugin/hooks/auto-update-prompt.md:94-149](), [understand-anything-plugin/hooks/auto-update-prompt.md:243-290]()
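
The load-patch-save invariant is easiest to see side by side with the buggy alternative. The sketch below assumes fingerprints are stored as a flat map from file path to a signature string; that shape, the `patchFingerprints` name, and the `deletedFiles` parameter are illustrative assumptions.

```typescript
// Assumed storage shape: file path -> structural signature.
type Fingerprints = Record<string, string>;

// Correct: start from every existing entry, overlay only the re-analyzed files.
function patchFingerprints(
  existing: Fingerprints,
  fresh: Fingerprints,
  deletedFiles: string[] = [],
): Fingerprints {
  const merged: Fingerprints = { ...existing, ...fresh };
  for (const f of deletedFiles) delete merged[f];
  return merged;
}

// Buggy (the failure described above): saving `fresh` alone drops every untouched
// file, so each of them looks structurally new on the next commit.
```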

---

## Cross-Cutting Design Decisions

### Agent Model Inheritance

All agent definitions across every skill use `model: inherit`. This means the model used is whatever the host AI runtime (Claude Code, Cursor, opencode, etc.) is configured to use, with no hardcoded provider or model name anywhere in the skill surface. This makes the entire plugin BYOC/BYOK-friendly.

### Plugin Root Resolution

Every skill that needs to invoke scripts or agent definitions resolves `PLUGIN_ROOT` through the same priority list:

```text
1. $CLAUDE_PLUGIN_ROOT   (runtime-injected, highest priority)
2. ~/.understand-anything-plugin  (universal symlink)
3. Two levels up from ~/.agents/skills/<skill-name>  (self-relative)
4. Two levels up from ~/.copilot/skills/<skill-name>  (Copilot personal skills)
5. ~/.codex/.../understand-anything-plugin  (common clone paths)
6. ~/.opencode/.../  (opencode clone path)
7. ~/.pi/.../  (pi clone path)
8. ~/understand-anything/...  (home directory clone)
```

If none of the candidates resolves, the skill fails with an explicit error that lists every checked path; there is no silent fallback.
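
A first-match-wins resolver with a loud failure might look like the sketch below. The skills do this in shell; the TypeScript version is illustrative only. `resolvePluginRoot` and the injected `exists` predicate are hypothetical, and the caller is assumed to pass candidates already ordered as in the list above.

```typescript
function resolvePluginRoot(candidates: string[], exists: (path: string) => boolean): string {
  for (const candidate of candidates) {
    // Empty strings model unset environment variables such as $CLAUDE_PLUGIN_ROOT.
    if (candidate !== "" && exists(candidate)) return candidate;
  }
  const checked = candidates.map(c => `  - ${c === "" ? "(unset)" : c}`).join("\n");
  throw new Error(`PLUGIN_ROOT could not be resolved. Checked:\n${checked}`);
}
```

Injecting `exists` keeps the function testable without touching the filesystem; in Node.js it could be backed by `fs.existsSync`.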

### Graph-First, File-Second Query Pattern

The chat, diff, and explain skills follow a consistent strategy: **grep the graph first, read source files last (or not at all)**. This keeps most queries within a small subgraph context window rather than loading large source files. The `SearchEngine` (from `@understand-anything/core`) and grep-based approaches both implement the same 1-hop expansion logic: match by name/summary/tags, then expand via edges to neighbors.

### Worktree Redirect

`/understand` and `/understand-domain` both detect git worktrees by comparing `git rev-parse --git-dir` against `git rev-parse --git-common-dir`. When inside a worktree, output is redirected to the main repo root — worktree paths are ephemeral and would lose the graph when the session ends (tracked as issue #133). Set `UNDERSTAND_NO_WORKTREE_REDIRECT=1` to opt out.

Sources: [understand-anything-plugin/skills/understand/SKILL.md:33-53](), [understand-anything-plugin/skills/understand-domain/SKILL.md:22-43]()

---

## Summary

The eight skills form a layered system: `/understand` builds the graph that all other skills consume, `/understand-dashboard` visualizes it, and the query skills (`/understand-chat`, `/understand-diff`, `/understand-explain`, `/understand-onboard`) extract actionable knowledge from the graph without re-reading source files. Two domain-specific skills (`/understand-domain`, `/understand-knowledge`) extend the graph model to business flows and LLM wikis respectively. When `autoUpdate` is enabled, the hooks close the loop by resynchronizing the graph after each commit, spending zero LLM tokens on cosmetic-only changes and re-analyzing only the files whose structure actually changed.

Sources: [understand-anything-plugin/hooks/auto-update-prompt.md:4-6]()

---

## 08. Invariants, Failure Modes & Safe-Change Rules

> A synthesis of every load-bearing constraint in the system. Hard invariants: (1) use web-tree-sitter (WASM) only — native bindings break on darwin/arm64 + Node 24; (2) dashboard imports only browser-safe core subpath exports; (3) graphs inside git worktrees are redirected to the main repo root; (4) all five version fields must be bumped in sync when releasing. Key failure modes: stale graph after code changes (fix: run /understand or enable autoUpdate), broken incremental update when lastCommitHash is missing from config.json (fix: --full rebuild), dashboard blank on schema mismatch (fix: check WarningBanner, validate graph JSON against schema.ts). Safe-change rules: adding a new language extractor only requires a new file under extractors/ plus registry entry; adding a new edge type requires updating schema.ts alias maps and the EDGE_CATEGORY_MAP in store.ts; dashboard layout changes are isolated to components/ and never touch core.

- Page Markdown: https://grok-wiki.com/public/wiki/lum1104-understand-anything-3b923df96896/pages/08-invariants-failure-modes-safe-change-rules.md
- Generated: 2026-05-22T00:57:46.226Z

### Source Files

- `understand-anything-plugin/packages/core/src/schema.ts`
- `understand-anything-plugin/packages/core/src/staleness.ts`
- `understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.ts`
- `understand-anything-plugin/packages/dashboard/src/components/WarningBanner.tsx`
- `understand-anything-plugin/CLAUDE.md`
- `understand-anything-plugin/skills/understand/SKILL.md`

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:

- [understand-anything-plugin/packages/core/src/schema.ts](understand-anything-plugin/packages/core/src/schema.ts)
- [understand-anything-plugin/packages/core/src/staleness.ts](understand-anything-plugin/packages/core/src/staleness.ts)
- [understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.ts](understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.ts)
- [understand-anything-plugin/packages/dashboard/src/components/WarningBanner.tsx](understand-anything-plugin/packages/dashboard/src/components/WarningBanner.tsx)
- [understand-anything-plugin/packages/core/src/plugins/extractors/index.ts](understand-anything-plugin/packages/core/src/plugins/extractors/index.ts)
- [understand-anything-plugin/packages/core/src/persistence/index.ts](understand-anything-plugin/packages/core/src/persistence/index.ts)
- [understand-anything-plugin/packages/core/src/types.ts](understand-anything-plugin/packages/core/src/types.ts)
- [understand-anything-plugin/packages/dashboard/src/store.ts](understand-anything-plugin/packages/dashboard/src/store.ts)
- [understand-anything-plugin/skills/understand/SKILL.md](understand-anything-plugin/skills/understand/SKILL.md)
- [understand-anything-plugin/src/__tests__/worktree-redirect.test.mjs](understand-anything-plugin/src/__tests__/worktree-redirect.test.mjs)
</details>

# Invariants, Failure Modes & Safe-Change Rules

This page collects every load-bearing constraint in the Understand-Anything system: hard invariants that must never be violated, failure modes a developer will encounter when those invariants slip, and a set of safe-change rules that describe the minimal, bounded edits needed to extend the system. Read this before touching tree-sitter configuration, graph schema, dashboard imports, or release scripts.

Understand-Anything is a multi-layer system: a CLI skill invokes LLM agents to produce a `knowledge-graph.json`, which a React dashboard consumes. Many of the invariants live at the seams between those layers — the graph schema, the persistence layer, and the boundary between browser-safe and Node.js-only code. Violating any one of them silently corrupts a downstream consumer; the failure often appears far from the cause.

---

## Hard Invariants

### 1. Use `web-tree-sitter` (WASM) — Never Native Bindings

`TreeSitterPlugin` imports `web-tree-sitter` exclusively and loads language grammars as `.wasm` files resolved via `require.resolve`. Native `tree-sitter` Node.js bindings are **not used** anywhere in the project.

```typescript
// understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.ts:1
import { createRequire } from "node:module";
// ...
const mod = await import("web-tree-sitter");
const ParserCls = mod.Parser;
```

The CLAUDE.md project file states the reason explicitly: native bindings fail on darwin/arm64 + Node 24. The `packages/core/package.json` records the dependency as `"web-tree-sitter": "^0.26.6"` — there is no `tree-sitter` (native) entry.

**What breaks if violated:** On Apple Silicon with Node 24, `import('tree-sitter')` throws a native-binding error at startup, preventing any structural analysis from running. Because `TreeSitterPlugin.init()` is guarded by an `_initialized` flag, a partially initialized plugin will return empty `StructuralAnalysis` objects (`{ functions: [], classes: [], imports: [], exports: [] }`) for every file rather than crashing loudly.

Sources: [understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.ts:125-197](), [understand-anything-plugin/packages/core/package.json]()

---

### 2. Dashboard Imports Only Browser-Safe Core Subpath Exports

The dashboard (`packages/dashboard`) must import from `@understand-anything/core/search`, `@understand-anything/core/types`, and `@understand-anything/core/schema` — never from the bare `@understand-anything/core` entry point.

```typescript
// understand-anything-plugin/packages/dashboard/src/store.ts:2-8
import { SearchEngine } from "@understand-anything/core/search";
import type { SearchResult } from "@understand-anything/core/search";
import type { GraphIssue } from "@understand-anything/core/schema";
import type { GraphNode, KnowledgeGraph, TourStep } from "@understand-anything/core/types";
```

The main entry point pulls in Node.js modules (`node:fs`, `node:path`, `child_process`, etc.) that Vite's browser bundler cannot resolve. The subpath exports expose only modules free of Node.js globals.

**What breaks if violated:** Vite fails to bundle with errors like `"fs" is not defined` or `Cannot resolve "node:child_process"`. The dashboard either does not build or, if built with a loose resolver, throws at runtime on the first import.

Sources: [understand-anything-plugin/packages/dashboard/src/store.ts:1-10]()
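
One way to catch violations early is a small source scan that flags any core import outside the three browser-safe subpaths. The sketch below is a hypothetical check, not something the repository is known to ship; `findForbiddenCoreImports` is an invented name, and the regex only covers static `from "..."` imports.

```typescript
// The three browser-safe subpaths named above.
const ALLOWED_CORE_IMPORTS = [
  "@understand-anything/core/search",
  "@understand-anything/core/types",
  "@understand-anything/core/schema",
];

function findForbiddenCoreImports(source: string): string[] {
  // Matches `from "@understand-anything/core"` and any subpath beneath it.
  const re = /from\s+["'](@understand-anything\/core(?:\/[^"']*)?)["']/g;
  const forbidden: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(source)) !== null) {
    if (ALLOWED_CORE_IMPORTS.indexOf(m[1]) < 0) forbidden.push(m[1]);
  }
  return forbidden;
}
```

Run over each file in `packages/dashboard/src`, a non-empty result would point at the exact specifier that drags Node.js modules into the browser bundle.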

---

### 3. Graphs Inside Git Worktrees Are Redirected to the Main Repo Root

When `/understand` runs inside a git worktree (not the main checkout), `PROJECT_ROOT` is automatically redirected to the main repo root before writing any output. The detection compares `git rev-parse --git-dir` to `git rev-parse --git-common-dir`; they differ in a worktree.

```bash
# From understand-anything-plugin/skills/understand/SKILL.md Phase 0 step 1
COMMON_DIR=$(git -C "$PROJECT_ROOT" rev-parse --git-common-dir 2>/dev/null)
GIT_DIR=$(git -C "$PROJECT_ROOT" rev-parse --git-dir 2>/dev/null)
if [ -n "$COMMON_DIR" ] && [ -n "$GIT_DIR" ]; then
  COMMON_ABS=$(cd "$PROJECT_ROOT" && cd "$COMMON_DIR" 2>/dev/null && pwd -P)
  GIT_ABS=$(cd "$PROJECT_ROOT" && cd "$GIT_DIR" 2>/dev/null && pwd -P)
  if [ -n "$COMMON_ABS" ] && [ "$COMMON_ABS" != "$GIT_ABS" ]; then
    MAIN_ROOT=$(dirname "$COMMON_ABS")
    if [ -d "$MAIN_ROOT" ] && [ "${UNDERSTAND_NO_WORKTREE_REDIRECT:-0}" != "1" ]; then
      PROJECT_ROOT="$MAIN_ROOT"
    fi
  fi
fi
```

This behavior is tested in `src/__tests__/worktree-redirect.test.mjs`, which creates a real git worktree and verifies both redirect and opt-out behavior. The escape hatch `UNDERSTAND_NO_WORKTREE_REDIRECT=1` is available for intentional per-worktree graphs.

**What breaks if violated:** Claude Code worktrees are ephemeral; `.understand-anything/` written there is destroyed when the session ends, silently discarding the knowledge graph. The graph appears to be generated successfully but is then gone.

Sources: [understand-anything-plugin/skills/understand/SKILL.md:33-53](), [understand-anything-plugin/src/__tests__/worktree-redirect.test.mjs:65-89]()

---

### 4. All Five Version Fields Must Be Bumped in Sync on Release

The project carries five separate version fields that must stay identical. The CLAUDE.md lists them:

| File | Field |
|---|---|
| `understand-anything-plugin/package.json` | `"version"` |
| `understand-anything-plugin/.claude-plugin/plugin.json` | `"version"` |
| `.claude-plugin/plugin.json` | `"version"` |
| `.cursor-plugin/plugin.json` | `"version"` |
| `.copilot-plugin/plugin.json` | `"version"` |

Note: `.claude-plugin/marketplace.json` intentionally does **not** carry a version field — adding one causes marketplace schema validation failures.

**What breaks if violated:** Claude Code's plugin cache key is the version string from the marketplace entry. A version mismatch between the cache directory and the installed plugin causes the plugin to serve stale code until the user manually uninstalls and reinstalls. On Cursor or Copilot, plugins with mismatched versions may fail to activate.
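A release script could enforce the sync mechanically. The following is a minimal sketch, assuming the caller has already read the five files into memory. `findVersionMismatches` is a hypothetical helper, not something the repository ships, and only the file list is taken from the table above.

```typescript
// Hypothetical release check (not repository code). Paths come from the table above.
const VERSIONED_FILES = [
  "understand-anything-plugin/package.json",
  "understand-anything-plugin/.claude-plugin/plugin.json",
  ".claude-plugin/plugin.json",
  ".cursor-plugin/plugin.json",
  ".copilot-plugin/plugin.json",
];

// `contents` maps each path to its raw JSON text, as read by the caller.
// Invalid JSON makes JSON.parse throw, which is acceptable for a release gate.
function findVersionMismatches(contents: Record<string, string | undefined>): string[] {
  const problems: string[] = [];
  let expected: string | undefined;
  for (const file of VERSIONED_FILES) {
    const raw = contents[file];
    if (raw === undefined) {
      problems.push(`${file}: missing`);
      continue;
    }
    const version = (JSON.parse(raw) as { version?: unknown }).version;
    if (typeof version !== "string") {
      problems.push(`${file}: no "version" field`);
    } else if (expected === undefined) {
      expected = version; // the first readable file sets the baseline
    } else if (version !== expected) {
      problems.push(`${file}: ${version} (expected ${expected})`);
    }
  }
  return problems;
}
```

An empty array means all five fields agree. Any entry names the file to fix before tagging the release.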

Sources: [understand-anything-plugin/CLAUDE.md]() (Testing Local Plugin Changes and Versioning sections)

---

## Key Failure Modes

### Stale Graph After Code Changes

**Symptom:** The dashboard shows architecture that no longer matches the code — functions, classes, or modules that were renamed or deleted still appear as nodes.

**Root cause:** The graph is a snapshot keyed to a git commit hash (`project.gitCommitHash` in `knowledge-graph.json`). The staleness check in `staleness.ts` runs `git diff <lastCommitHash>..HEAD --name-only` to detect changed files. If `/understand` is never re-run, the graph never updates.

```typescript
// understand-anything-plugin/packages/core/src/staleness.ts:34-43
export function isStale(projectDir: string, lastCommitHash: string): StalenessResult {
  const changedFiles = getChangedFiles(projectDir, lastCommitHash);
  return {
    stale: changedFiles.length > 0,
    changedFiles,
  };
}
```
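To make the data flow concrete, here is a small sketch of how `git diff --name-only` output could be turned into the result shape that `isStale` returns. It is an assumption about the general approach, not a copy of `getChangedFiles`, and `stalenessFromDiffOutput` is a hypothetical name.

```typescript
// Shape taken from the `isStale` return value shown above.
interface StalenessResult {
  stale: boolean;
  changedFiles: string[];
}

// Hypothetical helper: converts raw `git diff --name-only` stdout into a result.
function stalenessFromDiffOutput(stdout: string): StalenessResult {
  const changedFiles = stdout
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0); // ignore the trailing newline and blanks
  return { stale: changedFiles.length > 0, changedFiles };
}
```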

**Fix options:**
- Run `/understand` manually after committing changes.
- Run `/understand --auto-update` once to enable the commit hook (`autoUpdate: true` in `.understand-anything/config.json`). From that point, the graph updates automatically on each commit.

Sources: [understand-anything-plugin/packages/core/src/staleness.ts:13-43](), [understand-anything-plugin/packages/core/src/types.ts:116-120]()

---

### Broken Incremental Update: `lastCommitHash` Missing

**Symptom:** The incremental update path (`git diff <lastCommitHash>..HEAD`) fails or exits immediately with "Graph is up to date" even though files changed.

**Root cause:** Incremental updates read `gitCommitHash` from `.understand-anything/meta.json` and use it as the `lastCommitHash` to diff against. If `meta.json` is absent, or if the graph was written without a meta file, the Phase 0 decision logic in the SKILL has no hash and falls through to "No existing graph or meta → full analysis". A corrupt `meta.json` (present but unparseable) is the risky case: it may silently skip the full rebuild.

```markdown
# understand-anything-plugin/skills/understand/SKILL.md Phase 0, step 7
| Condition | Action |
|---|---|
| No existing graph or meta | Full analysis (all phases) |
| Existing graph + changed files | Incremental update (re-analyze changed files only) |
```

**Fix:** Run `/understand --full` to force a complete rebuild. This always produces a fresh `meta.json` with the current `gitCommitHash`.
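One way to close the corrupt-file gap is to treat an unparseable `meta.json` exactly like a missing one. The sketch below illustrates that idea only. `readCommitHash`, `decidePhase0`, and the `Phase0Action` labels are hypothetical names, and the skill's actual Phase 0 logic lives in `SKILL.md`, not in TypeScript.

```typescript
type Phase0Action = "full" | "incremental" | "up-to-date";

// Hypothetical defensive reader: returns the stored hash, or null when
// meta.json is absent, unparseable, or lacks a usable gitCommitHash.
function readCommitHash(metaRaw: string | null): string | null {
  if (metaRaw === null) return null;
  try {
    const meta: unknown = JSON.parse(metaRaw);
    if (typeof meta === "object" && meta !== null) {
      const hash = (meta as { gitCommitHash?: unknown }).gitCommitHash;
      if (typeof hash === "string" && hash.length > 0) return hash;
    }
    return null;
  } catch {
    return null; // corrupt JSON is treated the same as a missing file
  }
}

// `changedFiles` is assumed to be computed by the caller from the stored hash.
function decidePhase0(metaRaw: string | null, changedFiles: string[]): Phase0Action {
  if (readCommitHash(metaRaw) === null) return "full";
  return changedFiles.length > 0 ? "incremental" : "up-to-date";
}
```

With this shape, every path that lacks a trustworthy hash lands on the full rebuild instead of reporting "Graph is up to date".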

Sources: [understand-anything-plugin/skills/understand/SKILL.md:143-159](), [understand-anything-plugin/packages/core/src/persistence/index.ts:107-116]()

---

### Dashboard Blank or Broken: Schema Mismatch

**Symptom:** The dashboard shows a red banner ("Dashboard hit N fatal errors") or renders a blank graph view. The `WarningBanner` component is the visible signal.

**Root cause:** `loadGraph()` in `persistence/index.ts` calls `validateGraph()` on every load. `validateGraph` runs a four-tier pipeline:

```text
Tier 1: sanitizeGraph  → null-to-empty-array coercions, lowercase enums
Tier 2: autoFixGraph   → missing fields defaulted, alias mapping
Tier 3: Drop           → invalid individual nodes or edges are silently removed
Tier 4: Fatal          → no valid nodes, missing project metadata, malformed collections
```

The `WarningBanner` component distinguishes fatal from non-fatal issues by color and copy text:

```typescript
// understand-anything-plugin/packages/dashboard/src/components/WarningBanner.tsx:9-16
const hasFatal = issues.some((i) => i.level === "fatal");
const lines = hasFatal
  ? [
      "Some of these issues look like dashboard rendering bugs.",
      "Please file an issue at github.com/Lum1104/Understand-Anything/issues...",
    ]
  : [
      "These are LLM generation errors — not a system bug.",
      "You can ask your agent to fix these specific issues in knowledge-graph.json",
    ];
```

Fatal issues (red banner) indicate a bug in the dashboard or ELK layout. Non-fatal issues with dropped nodes (amber banner) indicate the LLM generated schema-invalid data.

**Fix:**
1. Open the dashboard and expand the `WarningBanner`.
2. Use the "Copy Issues" button to get the structured issue list.
3. For fatal issues: file a GitHub bug report with the copied text.
4. For dropped/auto-corrected issues: paste the issue list to your agent and ask it to repair `knowledge-graph.json` directly.
5. If the graph is too corrupt: re-run `/understand --full` to regenerate from scratch.

Sources: [understand-anything-plugin/packages/dashboard/src/components/WarningBanner.tsx:8-43](), [understand-anything-plugin/packages/core/src/schema.ts:499-663]()

---

### Graph Validation Pipeline (Tier Summary)

```text
Input JSON
    │
    ▼ Tier 1: sanitizeGraph
    │  • null → [] for tour, layers
    │  • null → undefined for optional node fields
    │  • lowercase type, complexity, direction strings
    │
    ▼ Tier 2: autoFixGraph + normalizeGraph (alias maps)
    │  • Missing type    → "file"      (node)  / "depends_on" (edge)
    │  • Missing direction → "forward"
    │  • Missing weight  → 0.5; weight clamped [0,1]
    │  • "fn" → "function", "extends" → "inherits", etc.
    │
    ▼ Tier 3: Per-item validation (Zod parse)
    │  • Invalid nodes DROPPED (level: "dropped")
    │  • Edges with dangling source/target DROPPED
    │  • Layer/tour nodeIds filtered to surviving node IDs
    │
    ▼ Tier 4: Fatal checks
       • Not an object → fatal
       • Collections not arrays → fatal
       • Missing project metadata → fatal
       • No valid nodes remaining → fatal
```
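The edge-handling part of this pipeline can be sketched in a few lines. What follows is a simplified illustration, not the code in `schema.ts`: the `RawEdge` and `FixedEdge` shapes, the one-entry `EDGE_ALIASES` map, and both function names are assumptions, and Zod parsing is replaced by plain checks so the example stays dependency-free.

```typescript
// Simplified, hypothetical shapes; the real schemas live in schema.ts.
interface RawEdge {
  source?: string;
  target?: string;
  type?: string;
  direction?: string;
  weight?: number;
}
interface FixedEdge {
  source: string;
  target: string;
  type: string;
  direction: string;
  weight: number;
}

// Stand-in for EDGE_TYPE_ALIASES; only this one mapping appears in the diagram.
const EDGE_ALIASES: Record<string, string> = { extends: "inherits" };

// Tiers 1 and 2 for one edge: lowercase, apply defaults, clamp, resolve aliases.
function autoFixEdge(edge: RawEdge) {
  const type = (edge.type ?? "depends_on").toLowerCase();
  const canonical = Object.prototype.hasOwnProperty.call(EDGE_ALIASES, type)
    ? EDGE_ALIASES[type]
    : type;
  return {
    source: edge.source,
    target: edge.target,
    type: canonical,
    direction: (edge.direction ?? "forward").toLowerCase(),
    weight: Math.min(1, Math.max(0, edge.weight ?? 0.5)),
  };
}

// Tier 3: drop edges whose endpoints are missing or not among surviving nodes.
function dropDanglingEdges(edges: RawEdge[], survivingNodeIds: string[]): FixedEdge[] {
  const kept: FixedEdge[] = [];
  for (const edge of edges) {
    const fixed = autoFixEdge(edge);
    const { source, target } = fixed;
    if (
      source !== undefined &&
      target !== undefined &&
      survivingNodeIds.indexOf(source) !== -1 &&
      survivingNodeIds.indexOf(target) !== -1
    ) {
      kept.push({ ...fixed, source, target });
    }
  }
  return kept;
}
```

The ordering matters: edges are filtered against the node IDs that survived validation, so a dropped node also removes every edge attached to it.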

Sources: [understand-anything-plugin/packages/core/src/schema.ts:499-663]()

---

## Safe-Change Rules

### Adding a New Language Extractor

**Scope:** `packages/core/src/plugins/extractors/`, plus a grammar entry in the language config (step 3). No changes are needed to `tree-sitter-plugin.ts`, the schema, or the dashboard.

**Steps:**
1. Create `<language>-extractor.ts` implementing the `LanguageExtractor` interface from `extractors/types.ts`.
2. Export the class from `extractors/index.ts` and add an instance to `builtinExtractors`:

```typescript
// understand-anything-plugin/packages/core/src/plugins/extractors/index.ts:24-34
export const builtinExtractors: LanguageExtractor[] = [
  new TypeScriptExtractor(),
  new PythonExtractor(),
  // ... add your new extractor here
];
```

3. Add a WASM grammar package for the language to the language config (in `packages/core/src/languages/configs/`). `TreeSitterPlugin` will pick it up automatically via the `builtinExtractors` registry.

**What you do NOT need to touch:** `tree-sitter-plugin.ts` (it registers extractors from `builtinExtractors` by default), `schema.ts`, `store.ts`, or any dashboard file.
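As a rough picture of what an extractor does, the sketch below walks a syntax tree and collects function names. Everything in it is hypothetical: `SketchNode` and `SketchExtractor` are simplified stand-ins, the node type names `function_declaration` and `identifier` are assumed grammar details, and the real `LanguageExtractor` interface in `extractors/types.ts` may look quite different.

```typescript
// HYPOTHETICAL shapes for illustration only; consult extractors/types.ts
// for the real LanguageExtractor interface.
interface SketchNode {
  type: string;
  text: string;
  children: SketchNode[];
}
interface SketchExtractor {
  languageIds: string[];
  extractFunctionNames(root: SketchNode): string[];
}

class ExampleExtractorSketch implements SketchExtractor {
  languageIds = ["example"];

  extractFunctionNames(root: SketchNode): string[] {
    const names: string[] = [];
    const visit = (node: SketchNode): void => {
      if (node.type === "function_declaration") {
        // Assumed grammar detail: the first "identifier" child is the name.
        const id = node.children.filter((c) => c.type === "identifier")[0];
        if (id) names.push(id.text);
      }
      node.children.forEach(visit);
    };
    visit(root);
    return names;
  }
}
```

A real extractor would emit graph nodes, not bare names, but the recursive walk over grammar-specific node types is the part that changes from language to language.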

Sources: [understand-anything-plugin/packages/core/src/plugins/extractors/index.ts:24-34](), [understand-anything-plugin/packages/core/src/plugins/tree-sitter-plugin.ts:88-98]()

---

### Adding a New Edge Type

Adding an edge type touches three files: the schema (source of truth for LLM output and validation), the `EdgeType` union in `types.ts`, and the dashboard store (source of truth for UI filtering). All three must be updated together.

**Step 1 — `packages/core/src/schema.ts`:** Add the new edge type string to `EdgeTypeSchema`:

```typescript
// understand-anything-plugin/packages/core/src/schema.ts:4-14
export const EdgeTypeSchema = z.enum([
  "imports", "exports", "contains", ...
  // add "your_new_edge" here
]);
```

Optionally add LLM alias entries to `EDGE_TYPE_ALIASES` if agents are likely to produce variant spellings.

**Step 2 — `packages/core/src/types.ts`:** Mirror the new type in the `EdgeType` union.

**Step 3 — `packages/dashboard/src/store.ts`:** Add the new edge type to the appropriate category in `EDGE_CATEGORY_MAP`:

```typescript
// understand-anything-plugin/packages/dashboard/src/store.ts:31-40
export const EDGE_CATEGORY_MAP: Record<EdgeCategory, string[]> = {
  structural: ["imports", "exports", "contains", "inherits", "implements"],
  behavioral: ["calls", "subscribes", "publishes", "middleware"],
  // add "your_new_edge" to the correct category, or add a new EdgeCategory key
};
```

If the new edge belongs to a new category, `EdgeCategory`, `ALL_EDGE_CATEGORIES`, and `FilterState` must also be updated in `store.ts`.

**What you do NOT need to touch:** Dashboard layout components, `WarningBanner`, any skill or agent prompt (schema change is picked up automatically at validation time).
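Because the schema and the store are edited separately, a small consistency check can catch an edge type that was added to one but not the other. This is a sketch of such a check, not an existing test in the repository, and `findUncategorizedEdgeTypes` is a hypothetical name.

```typescript
// Hypothetical consistency check, e.g. for a unit test (not repository code).
// Returns every edge type that no category in the map lists.
function findUncategorizedEdgeTypes(
  edgeTypes: string[],
  categoryMap: Record<string, string[]>,
): string[] {
  const categorized: Record<string, true> = {};
  for (const category of Object.keys(categoryMap)) {
    for (const edgeType of categoryMap[category]) {
      categorized[edgeType] = true;
    }
  }
  return edgeTypes.filter((t) => categorized[t] !== true);
}
```

In practice the first argument could come from `EdgeTypeSchema.options` (Zod enums expose their values that way) and the second from `EDGE_CATEGORY_MAP`. A non-empty result would mean the new type is invisible to the dashboard's category filters.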

Sources: [understand-anything-plugin/packages/core/src/schema.ts:4-14, 78-125](), [understand-anything-plugin/packages/dashboard/src/store.ts:31-40]()

---

### Dashboard Layout Changes Are Isolated to `components/`

The dashboard renders a graph-first layout (75% graph + 360px sidebar). Layout changes — rearranging panels, resizing areas, swapping sidebar tabs — are contained within `packages/dashboard/src/components/`. They do not touch:

- `packages/core/` — any change there modifies schema, staleness, or persistence behavior for all consumers.
- `packages/dashboard/src/store.ts` — the Zustand store owns graph state and filter state; layout components read from it via selectors but do not own it.
- Skills or agents — they write `knowledge-graph.json`; they have no knowledge of dashboard layout.

The safe boundary for layout work: read state via store selectors, render it in components. State shape changes (new filter dimensions, new sidebar tabs that require new state fields) require a coordinated `store.ts` edit, but those are data-model changes, not layout changes.

Sources: [understand-anything-plugin/packages/dashboard/src/store.ts:100-120]()

---

## Summary

The four non-negotiable invariants (WASM-only tree-sitter, browser-safe dashboard imports, worktree-to-main-root redirect, and five-file version sync) each protect a different layer boundary. Most violations are silent: the wrong grammar loader returns empty analysis, a worktree graph evaporates, a version mismatch serves cached code. Only the unsafe import fails loudly, at build time. The three common failure modes (stale graph, missing `lastCommitHash`, schema mismatch) each have a clear fix path anchored to a `--full` rebuild or the `WarningBanner` copy-to-agent workflow. Safe changes stay within their layer: new extractors live in `extractors/` (plus a grammar entry in the language config), new edge types update `schema.ts`, `types.ts`, and `store.ts`'s `EDGE_CATEGORY_MAP` in lockstep, and dashboard layout work never leaves `components/`.

Sources: [understand-anything-plugin/packages/core/src/schema.ts:499-510]()

---