Agent-readable wiki

CodeGraph First 30 Minutes Wiki

A fast orientation map for CodeGraph, a local TypeScript CLI and MCP server that indexes code into a SQLite-backed semantic graph for agent-friendly search, context, and impact analysis. The structure emphasizes source-backed entry points, provider-neutral agent integration, and the implementation boundaries a new contributor should understand first.

Pages

  1. Start Here: What this repo is, the fastest read order, the first files to open, and the vocabulary behind nodes, edges, projects, tools, and local-first indexing.
  2. CLI, Installer & Agent Targets: The command surface a new reader should try first: interactive install, project init, indexing, sync, status, query, context, serve, affected tests, and portable agent config writers for Claude Code, Cursor, Codex CLI, and opencode.
  3. MCP Tools & Context Workflow: How CodeGraph exposes provider-neutral code intelligence through stdio JSON-RPC tools, why codegraph_context is the primary workflow, and how search, callers, callees, impact, explore, node, status, and files fit together.
  4. Indexing & Language Extraction Pipeline: The pipeline from project files to graph records: git-aware scanning, include and exclude filters, language detection, tree-sitter WASM loading, worker-based parsing, extraction results, and tests that prove supported language behavior.
  5. Reference Resolution & Framework Routes: How unresolved references become graph edges through import resolution, name matching, path aliases, framework detectors, and route resolvers for web frameworks across JavaScript, Python, Ruby, Java, Go, Rust, C#, Swift, PHP, Svelte, and Vue.
  6. Freshness, Storage & What to Try Next: The closing map for the first 30 minutes: understand the SQLite schema, query and traversal layers, file watcher, git hook sync path, safety checks, and the concrete next experiments that prove the local graph is fresh and useful.

Complete Markdown

# CodeGraph First 30 Minutes Wiki

> A fast orientation map for CodeGraph, a local TypeScript CLI and MCP server that indexes code into a SQLite-backed semantic graph for agent-friendly search, context, and impact analysis. The structure emphasizes source-backed entry points, provider-neutral agent integration, and the implementation boundaries a new contributor should understand first.

## Context Links

- [Agent index](https://grok-wiki.com/public/wiki/colbymchenry-codegraph-89e8b2c4d43a/llms.txt)
- [Human interactive wiki](https://grok-wiki.com/public/wiki/colbymchenry-codegraph-89e8b2c4d43a)
- [GitHub repository](https://github.com/colbymchenry/codegraph)

## Repository Metadata

- Repository: colbymchenry/codegraph

- Generated: 2026-05-22T16:27:32.272Z
- Updated: 2026-05-22T16:36:41.406Z
- Runtime: Codex CLI
- Format: First 30 Minutes
- Pages: 6

## Page Index

- 01. [Start Here](https://grok-wiki.com/public/wiki/colbymchenry-codegraph-89e8b2c4d43a/pages/01-start-here.md) - What this repo is, the fastest read order, the first files to open, and the vocabulary behind nodes, edges, projects, tools, and local-first indexing.
- 02. [CLI, Installer & Agent Targets](https://grok-wiki.com/public/wiki/colbymchenry-codegraph-89e8b2c4d43a/pages/02-cli-installer-agent-targets.md) - The command surface a new reader should try first: interactive install, project init, indexing, sync, status, query, context, serve, affected tests, and portable agent config writers for Claude Code, Cursor, Codex CLI, and opencode.
- 03. [MCP Tools & Context Workflow](https://grok-wiki.com/public/wiki/colbymchenry-codegraph-89e8b2c4d43a/pages/03-mcp-tools-context-workflow.md) - How CodeGraph exposes provider-neutral code intelligence through stdio JSON-RPC tools, why codegraph_context is the primary workflow, and how search, callers, callees, impact, explore, node, status, and files fit together.
- 04. [Indexing & Language Extraction Pipeline](https://grok-wiki.com/public/wiki/colbymchenry-codegraph-89e8b2c4d43a/pages/04-indexing-language-extraction-pipeline.md) - The pipeline from project files to graph records: git-aware scanning, include and exclude filters, language detection, tree-sitter WASM loading, worker-based parsing, extraction results, and tests that prove supported language behavior.
- 05. [Reference Resolution & Framework Routes](https://grok-wiki.com/public/wiki/colbymchenry-codegraph-89e8b2c4d43a/pages/05-reference-resolution-framework-routes.md) - How unresolved references become graph edges through import resolution, name matching, path aliases, framework detectors, and route resolvers for web frameworks across JavaScript, Python, Ruby, Java, Go, Rust, C#, Swift, PHP, Svelte, and Vue.
- 06. [Freshness, Storage & What to Try Next](https://grok-wiki.com/public/wiki/colbymchenry-codegraph-89e8b2c4d43a/pages/06-freshness-storage-what-to-try-next.md) - The closing map for the first 30 minutes: understand the SQLite schema, query and traversal layers, file watcher, git hook sync path, safety checks, and the concrete next experiments that prove the local graph is fresh and useful.

## Source File Index

- `__tests__/extraction.test.ts`
- `__tests__/frameworks.test.ts`
- `__tests__/installer-targets.test.ts`
- `__tests__/mcp-initialize.test.ts`
- `__tests__/mcp-roots.test.ts`
- `package.json`
- `README.md`
- `src/bin/codegraph.ts`
- `src/bin/node-version-check.ts`
- `src/config.ts`
- `src/context/formatter.ts`
- `src/context/index.ts`
- `src/db/index.ts`
- `src/db/queries.ts`
- `src/db/schema.sql`
- `src/db/sqlite-adapter.ts`
- `src/directory.ts`
- `src/extraction/grammars.ts`
- `src/extraction/index.ts`
- `src/extraction/languages/python.ts`
- `src/extraction/languages/typescript.ts`
- `src/extraction/parse-worker.ts`
- `src/extraction/tree-sitter-types.ts`
- `src/extraction/tree-sitter.ts`
- `src/graph/queries.ts`
- `src/graph/traversal.ts`
- `src/index.ts`
- `src/installer/index.ts`
- `src/installer/targets/claude.ts`
- `src/installer/targets/codex.ts`
- `src/installer/targets/registry.ts`
- `src/installer/targets/types.ts`
- `src/mcp/index.ts`
- `src/mcp/server-instructions.ts`
- `src/mcp/tools.ts`
- `src/mcp/transport.ts`
- `src/resolution/frameworks/express.ts`
- `src/resolution/frameworks/index.ts`
- `src/resolution/import-resolver.ts`
- `src/resolution/index.ts`
- `src/resolution/name-matcher.ts`
- `src/resolution/path-aliases.ts`
- `src/resolution/types.ts`
- `src/sync/git-hooks.ts`
- `src/sync/watcher.ts`
- `src/types.ts`

---

## 01. Start Here

> What this repo is, the fastest read order, the first files to open, and the vocabulary behind nodes, edges, projects, tools, and local-first indexing.

- Page Markdown: https://grok-wiki.com/public/wiki/colbymchenry-codegraph-89e8b2c4d43a/pages/01-start-here.md
- Generated: 2026-05-22T16:26:29.694Z

### Source Files

- `README.md`
- `package.json`
- `src/index.ts`
- `src/types.ts`
- `src/config.ts`
- `src/directory.ts`

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:
- [README.md](README.md)
- [package.json](package.json)
- [src/index.ts](src/index.ts)
- [src/types.ts](src/types.ts)
- [src/config.ts](src/config.ts)
- [src/directory.ts](src/directory.ts)
- [src/bin/codegraph.ts](src/bin/codegraph.ts)
- [src/mcp/tools.ts](src/mcp/tools.ts)
- [src/db/index.ts](src/db/index.ts)
- [src/db/schema.sql](src/db/schema.sql)
- [src/extraction/index.ts](src/extraction/index.ts)
- [src/sync/watcher.ts](src/sync/watcher.ts)
- [src/context/index.ts](src/context/index.ts)
</details>

# Start Here

CodeGraph is a TypeScript CLI and library for building a local semantic knowledge graph of a codebase. Its public package exposes a `codegraph` binary, compiles TypeScript into `dist`, copies SQLite schema and tree-sitter WASM assets at build time, and supports Node.js 20 through 24. Sources: [package.json:1-25](), [package.json:34-58]().

This page gives a first-30-minutes map: what to read first, what the core words mean, how indexing works, and where the CLI and agent-facing tools enter the system. Repository code is the source of truth here. No `STRATEGY.md` or `docs/solutions/**` sources were present in this checkout, so this page does not cite strategy or solved-problem notes. The selected Compound Engineering profile served only as bundled guidance for page shape and QA review; it was not run as an installed local skill.

## What This Repo Is

At the center is `CodeGraph`, a class that opens or initializes a project, creates a local `.codegraph` directory, stores graph data in SQLite, extracts symbols from source files, resolves references into edges, and serves query/context APIs. The class wires together the database, extraction orchestrator, resolver, graph manager, traverser, context builder, file lock, and optional file watcher. Sources: [src/index.ts:123-170](), [src/index.ts:185-216](), [src/index.ts:255-286]().

The repo has four main surfaces:

| Surface | Where to start | What it does |
|---|---|---|
| Library API | `src/index.ts` | Owns project lifecycle, indexing, sync, graph queries, context building, and exports public types. |
| CLI | `src/bin/codegraph.ts` | Provides commands such as `init`, `index`, `sync`, `status`, `query`, `files`, `context`, and `serve --mcp`. |
| MCP tools | `src/mcp/tools.ts` | Exposes agent-facing tools like `codegraph_context`, `codegraph_search`, `codegraph_node`, and `codegraph_files`. |
| Local storage | `src/db/index.ts`, `src/db/schema.sql` | Creates and opens `.codegraph/codegraph.db` with tables for files, nodes, edges, unresolved refs, and metadata. |

Sources: [src/bin/codegraph.ts:1-19](), [src/bin/codegraph.ts:537-705](), [src/bin/codegraph.ts:787-880](), [src/bin/codegraph.ts:1052-1135](), [src/mcp/tools.ts:240-248](), [src/db/index.ts:181-191](), [src/db/schema.sql:19-81]().

## Fastest Read Order

Read in this order if you are new:

1. `README.md` for the product promise, install flow, supported agents, and high-level feature list.
2. `package.json` for runtime constraints, binary name, build/test commands, and dependencies.
3. `src/index.ts` for the core `CodeGraph` lifecycle and how subsystems are composed.
4. `src/types.ts` for the domain vocabulary: node kinds, edge kinds, languages, records, config, context, and traversal options.
5. `src/directory.ts` and `src/config.ts` for what “a CodeGraph project” means on disk.
6. `src/extraction/index.ts` for indexing phases, file inclusion, hashing, git-aware scanning, and sync inputs.
7. `src/mcp/tools.ts` and `src/bin/codegraph.ts` for the user-facing and agent-facing operations.

Sources: [README.md:24-37](), [README.md:102-113](), [package.json:15-25](), [src/index.ts:53-82](), [src/types.ts:11-60](), [src/types.ts:429-490](), [src/directory.ts:10-34](), [src/config.ts:13-23](), [src/extraction/index.ts:52-87](), [src/mcp/tools.ts:248-442]().

## First Files to Open

### `src/index.ts`: The Spine

`CodeGraph.init()` creates the directory structure, writes config, initializes the database, and optionally indexes immediately. `CodeGraph.open()` checks that the project is initialized, validates the directory, loads config, opens the database, and optionally syncs. Indexing and sync are guarded by an in-process mutex and a cross-process file lock so CLI, MCP, and hooks do not write the graph at the same time. Sources: [src/index.ts:185-216](), [src/index.ts:255-286](), [src/index.ts:370-410](), [src/index.ts:417-490]().

```ts
// src/index.ts
static async init(projectRoot: string, options: InitOptions = {}): Promise<CodeGraph>
static async open(projectRoot: string, options: OpenOptions = {}): Promise<CodeGraph>
async indexAll(options: IndexOptions = {}): Promise<IndexResult>
async sync(options: IndexOptions = {}): Promise<SyncResult>
```

### `src/types.ts`: The Vocabulary

Nodes represent symbols and code elements. Edges represent relationships between nodes. Files record indexed source files and hashes for change detection. Context combines a focal node with ancestors, children, incoming and outgoing refs, related types, and imports. Sources: [src/types.ts:18-60](), [src/types.ts:97-186](), [src/types.ts:188-215](), [src/types.ts:376-400]().

| Term | Meaning in this repo |
|---|---|
| Node | A code symbol or structural element such as `file`, `class`, `function`, `route`, or `component`. |
| Edge | A relationship such as `contains`, `calls`, `imports`, `extends`, `implements`, `references`, or `decorates`. |
| FileRecord | The tracked source file, content hash, language, size, timestamps, node count, and extraction errors. |
| Subgraph | A focused subset of nodes and edges with root entry points. |
| Context | A task-ready bundle around a focal symbol and its relevant graph neighborhood. |
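
To make the vocabulary concrete, here is a minimal sketch of a node/edge model and a "who calls this?" lookup. The `SketchNode` and `SketchEdge` shapes and the `callersOf` helper are simplified, hypothetical stand-ins; the real records in `src/types.ts` carry more fields and may use different names.

```typescript
// Hypothetical, simplified shapes. Only the kind names come from the table above.
type SketchNodeKind = "file" | "class" | "function" | "route" | "component";
type SketchEdgeKind =
  | "contains" | "calls" | "imports" | "extends"
  | "implements" | "references" | "decorates";

interface SketchNode {
  id: string;
  kind: SketchNodeKind;
  name: string;
  filePath: string;
}

interface SketchEdge {
  source: string; // id of the node the relationship starts from
  target: string; // id of the node the relationship points to
  kind: SketchEdgeKind;
}

// Return every node that has a `calls` edge pointing at `nodeId`.
function callersOf(nodeId: string, nodes: SketchNode[], edges: SketchEdge[]): SketchNode[] {
  const byId = new Map(nodes.map((n) => [n.id, n] as const));
  return edges
    .filter((e) => e.kind === "calls" && e.target === nodeId)
    .map((e) => byId.get(e.source))
    .filter((n): n is SketchNode => n !== undefined);
}

const sketchNodes: SketchNode[] = [
  { id: "fn:login", kind: "function", name: "login", filePath: "src/auth.ts" },
  { id: "fn:handler", kind: "function", name: "handler", filePath: "src/routes.ts" },
];
const sketchEdges: SketchEdge[] = [
  { source: "fn:handler", target: "fn:login", kind: "calls" },
];
```

With this sample data, `callersOf("fn:login", sketchNodes, sketchEdges)` returns the `handler` node, which is the kind of question the callers and impact tools answer against the real graph.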

## Projects and Local Files

A project is considered initialized only when it has both a `.codegraph/` directory and `.codegraph/codegraph.db`. The directory module also walks upward from a starting path to find the nearest initialized project, similar to how Git discovers a repository root. Sources: [src/directory.ts:10-34](), [src/directory.ts:36-64]().
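
The upward walk can be sketched as follows. This is an illustrative version, not the code in `src/directory.ts`: the function name `findProjectRoot` and the injectable `exists` predicate are assumptions made so the logic can be exercised without touching the disk.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Walk from `startDir` toward the filesystem root and return the first
// directory that contains both `.codegraph/` and `.codegraph/codegraph.db`.
// `exists` defaults to a real filesystem check but can be swapped out.
function findProjectRoot(
  startDir: string,
  exists: (p: string) => boolean = fs.existsSync,
): string | null {
  let dir = path.resolve(startDir);
  for (;;) {
    const marker = path.join(dir, ".codegraph");
    if (exists(marker) && exists(path.join(marker, "codegraph.db"))) {
      return dir;
    }
    const parent = path.dirname(dir);
    if (parent === dir) {
      return null; // reached the filesystem root without finding a project
    }
    dir = parent;
  }
}
```

Requiring both the directory and the database file means a half-created `.codegraph/` folder is not mistaken for an initialized project.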

The `.codegraph` directory is intentionally local. `createDirectory()` writes a `.gitignore` that ignores database files, WAL/SHM files, cache, logs, and hook markers. This supports a local-first workflow where indexes are machine-local artifacts rather than committed project source. Sources: [src/directory.ts:66-106](), [src/db/index.ts:181-191]().

```text
project-root/
  .codegraph/
    codegraph.db        local SQLite graph database
    .gitignore          ignores DB, WAL/SHM, cache, logs, hook markers
  src/
  package.json
```

Sources: [src/directory.ts:83-105](), [src/db/schema.sql:19-81]().

## Indexing in One Mental Model

Indexing starts with file discovery, filters files through config include/exclude patterns, hashes contents, parses supported languages, stores files/nodes/edges, then resolves unresolved references into graph edges. Full indexing resolves all unresolved refs in batches; sync scopes resolution to changed files when possible and falls back to batched resolution when git change data is unavailable. Sources: [src/extraction/index.ts:28-50](), [src/extraction/index.ts:90-126](), [src/extraction/index.ts:178-216](), [src/index.ts:375-410](), [src/index.ts:437-490]().

```text
source files
  -> include/exclude filter
  -> language detection + tree-sitter extraction
  -> files / nodes / unresolved_refs in SQLite
  -> resolver creates edges
  -> context, search, callers, callees, impact, files
```
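
Hash-based change detection, which is what lets sync skip unchanged files, can be sketched like this. The SHA-256 choice, the `SketchChangeSet` shape, and the function names are assumptions for illustration; the source only states that file records store a content hash.

```typescript
import { createHash } from "node:crypto";

interface SketchChangeSet {
  added: string[];
  modified: string[];
  removed: string[];
}

// Assumption: SHA-256 over the UTF-8 content. The real hash may differ.
function hashContent(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Compare hashes stored at the last index (path -> hash) with the
// current file contents (path -> content) and classify each path.
function diffAgainstStored(
  storedHashes: Map<string, string>,
  currentContents: Map<string, string>,
): SketchChangeSet {
  const changes: SketchChangeSet = { added: [], modified: [], removed: [] };
  currentContents.forEach((content, filePath) => {
    const previous = storedHashes.get(filePath);
    if (previous === undefined) {
      changes.added.push(filePath);
    } else if (previous !== hashContent(content)) {
      changes.modified.push(filePath);
    }
  });
  storedHashes.forEach((_hash, filePath) => {
    if (!currentContents.has(filePath)) {
      changes.removed.push(filePath);
    }
  });
  return changes;
}
```

Only the added and modified paths need re-parsing, and removed paths need their nodes and edges deleted, which matches the added/modified/removed counts that `codegraph sync` reports.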

The watcher keeps the same local graph fresh by using native `fs.watch`, filtering out `.codegraph/` changes, applying include/exclude rules, and debouncing sync work. It returns `false` instead of crashing when recursive watching is unavailable or disabled for the environment. Sources: [src/sync/watcher.ts:1-9](), [src/sync/watcher.ts:40-49](), [src/sync/watcher.ts:82-138](), [src/sync/watcher.ts:168-206]().
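
The filter-then-debounce behavior can be sketched as below. This is not the watcher in `src/sync/watcher.ts`: `createDebouncedSync`, the `SketchScheduler` interface, and the batching of paths are hypothetical, and include/exclude matching is left out for brevity.

```typescript
// A tiny timer abstraction so the debounce logic can run with fake timers.
interface SketchScheduler {
  set(fn: () => void, ms: number): unknown;
  clear(handle: unknown): void;
}

const realScheduler: SketchScheduler = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

// Returns a change handler that ignores `.codegraph/` paths and calls
// `runSync` once per quiet period with the batch of changed paths.
function createDebouncedSync(
  runSync: (paths: string[]) => void,
  delayMs: number,
  scheduler: SketchScheduler = realScheduler,
): (relPath: string) => void {
  const pending = new Set<string>();
  let handle: unknown;
  let scheduled = false;

  return function onChange(relPath: string): void {
    const normalized = relPath.replace(/\\/g, "/");
    if (normalized === ".codegraph" || normalized.startsWith(".codegraph/")) {
      return; // the graph's own writes must not trigger another sync
    }
    pending.add(normalized);
    if (scheduled) {
      scheduler.clear(handle); // restart the quiet-period timer
    }
    scheduled = true;
    handle = scheduler.set(() => {
      scheduled = false;
      const batch = Array.from(pending).sort();
      pending.clear();
      runSync(batch);
    }, delayMs);
  };
}
```

Ignoring `.codegraph/` matters because a sync writes to the SQLite database inside that directory; without the filter, every sync would schedule the next one.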

## Config Defaults That Matter

Configuration lives at `.codegraph/config.json`, but `rootDir` is derived from the actual project path when config is loaded or saved. Validation requires version, root, include/exclude arrays, language and framework arrays, max file size, docstring extraction, and call-site tracking flags. Custom regex patterns are checked for compilability and basic ReDoS risk before acceptance. Sources: [src/config.ts:13-23](), [src/config.ts:25-48](), [src/config.ts:50-111](), [src/config.ts:134-191]().
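
A pattern check of this kind might look like the sketch below. The heuristic shown (reject a quantified group that itself contains a quantifier) and the 500-character limit are assumptions chosen for illustration; the actual rules in `src/config.ts` may be stricter or different.

```typescript
type PatternCheck =
  | { ok: true; regex: RegExp }
  | { ok: false; reason: string };

// Hypothetical validator: length cap, a crude nested-quantifier
// heuristic such as (a+)+ or (.*)*, then a compile attempt.
function checkCustomPattern(source: string): PatternCheck {
  if (source.length > 500) {
    return { ok: false, reason: "pattern too long" };
  }
  const nestedQuantifier = /\([^()]*[+*][^()]*\)[+*{]/;
  if (nestedQuantifier.test(source)) {
    return { ok: false, reason: "nested quantifier may backtrack catastrophically" };
  }
  try {
    return { ok: true, regex: new RegExp(source) };
  } catch (err) {
    return { ok: false, reason: `does not compile: ${(err as Error).message}` };
  }
}
```

A heuristic like this catches only the most common catastrophic-backtracking shapes, which is why the source describes its own check as covering basic ReDoS risk rather than proving a pattern safe.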

The default config includes many source extensions across TypeScript, JavaScript, Python, Go, Rust, Java, C/C++, C#, PHP, Ruby, Swift, Kotlin, Dart, Svelte, Vue, Liquid, Pascal/Delphi, and Scala. It excludes version control, dependencies, build outputs, framework caches, virtualenvs, and language-specific generated directories. Sources: [src/types.ts:492-548](), [src/types.ts:549-620]().

## Tools and Agent Workflows

The CLI is the human/operator surface. Use `codegraph init -i` to create a project and index it, `codegraph index` for full indexing, `codegraph sync` for incremental updates, `codegraph status` for counts and backend status, `codegraph query` for symbol search, `codegraph files` for indexed file structure, `codegraph context` for task-ready context, and `codegraph serve --mcp` for agent integration. Sources: [src/bin/codegraph.ts:391-460](), [src/bin/codegraph.ts:537-662](), [src/bin/codegraph.ts:664-705](), [src/bin/codegraph.ts:787-880](), [src/bin/codegraph.ts:1052-1135]().

The MCP tools are the agent surface. `codegraph_context` is explicitly described as the primary first call for architecture, feature, and bug-context questions. Other tools are narrower: search symbols, inspect one node, find callers/callees, estimate impact, explore related source grouped by file, get status, and list indexed files. All MCP tools support an optional `projectPath` for querying another initialized local project. Sources: [src/mcp/tools.ts:232-248](), [src/mcp/tools.ts:250-442]().
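
For orientation, this is roughly what a client sends over stdio to invoke a tool. The `tools/call` method and the `name`/`arguments` params come from the MCP specification, and `projectPath` comes from the tool definitions described above; the `task` argument name and the example path are assumptions, so check `src/mcp/tools.ts` for the real input schema.

```typescript
interface SketchToolCall {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Serialize one tool invocation as a single line of JSON, which is how
// MCP's stdio transport frames messages.
function buildToolCall(id: number, name: string, args: Record<string, unknown>): string {
  const request: SketchToolCall = {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
  return JSON.stringify(request);
}

const contextCall = buildToolCall(1, "codegraph_context", {
  task: "understand how indexing is triggered", // assumed argument name
  projectPath: "/path/to/another/project",      // optional, per the tool docs
});
```

An agent would normally issue narrower calls such as `codegraph_search` or `codegraph_node` only after this first context call, using the same envelope with a different `name`.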

## Provider-Neutral and BYOC/BYOK-Friendly Boundaries

The implementation keeps model-provider concerns outside the graph core. The package dependencies list local parsing, CLI, config parsing, SQLite, glob matching, and tree-sitter packages; it does not declare a model-provider SDK dependency. The integration boundary is CLI/MCP tooling, and MCP tools accept `projectPath` so the same installed tool can query different local repositories. Sources: [package.json:34-55](), [src/mcp/tools.ts:232-248](), [src/mcp/tools.ts:507-520]().

That makes the architecture portable across bring-your-own-compute and bring-your-own-key setups: CodeGraph builds and queries local repository indexes, while whichever assistant, key, or runtime invokes the MCP server remains outside the indexing/storage layer. The README also frames the system as local, with no API keys or external services and a SQLite database. Sources: [README.md:102-113](), [src/db/index.ts:181-191](), [src/db/schema.sql:19-81]().

## What to Remember

Start with `src/index.ts` for lifecycle, `src/types.ts` for vocabulary, `src/directory.ts` plus `src/config.ts` for project shape, and `src/mcp/tools.ts` plus `src/bin/codegraph.ts` for the two user-facing surfaces. The key mental model is simple: CodeGraph creates a local `.codegraph/codegraph.db`, extracts source symbols into nodes, resolves relationships into edges, and exposes that graph through CLI and MCP tools for faster codebase understanding. Sources: [src/index.ts:123-170](), [src/types.ts:97-186](), [src/directory.ts:23-34](), [src/mcp/tools.ts:240-442]().

---

## 02. CLI, Installer & Agent Targets

> The command surface a new reader should try first: interactive install, project init, indexing, sync, status, query, context, serve, affected tests, and portable agent config writers for Claude Code, Cursor, Codex CLI, and opencode.

- Page Markdown: https://grok-wiki.com/public/wiki/colbymchenry-codegraph-89e8b2c4d43a/pages/02-cli-installer-agent-targets.md
- Generated: 2026-05-22T16:27:02.751Z

### Source Files

- `src/bin/codegraph.ts`
- `src/bin/node-version-check.ts`
- `src/installer/index.ts`
- `src/installer/targets/registry.ts`
- `src/installer/targets/types.ts`
- `src/installer/targets/claude.ts`
- `src/installer/targets/codex.ts`
- `__tests__/installer-targets.test.ts`

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:
- [src/bin/codegraph.ts](src/bin/codegraph.ts)
- [src/bin/node-version-check.ts](src/bin/node-version-check.ts)
- [src/installer/index.ts](src/installer/index.ts)
- [src/installer/targets/registry.ts](src/installer/targets/registry.ts)
- [src/installer/targets/types.ts](src/installer/targets/types.ts)
- [src/installer/targets/claude.ts](src/installer/targets/claude.ts)
- [src/installer/targets/cursor.ts](src/installer/targets/cursor.ts)
- [src/installer/targets/codex.ts](src/installer/targets/codex.ts)
- [src/installer/targets/opencode.ts](src/installer/targets/opencode.ts)
- [src/installer/targets/shared.ts](src/installer/targets/shared.ts)
- [src/installer/targets/toml.ts](src/installer/targets/toml.ts)
- [src/installer/instructions-template.ts](src/installer/instructions-template.ts)
- [src/sync/watch-policy.ts](src/sync/watch-policy.ts)
- [src/sync/git-hooks.ts](src/sync/git-hooks.ts)
- [src/mcp/index.ts](src/mcp/index.ts)
- [__tests__/installer-targets.test.ts](__tests__/installer-targets.test.ts)
- [README.md](README.md)
- [package.json](package.json)
</details>

# CLI, Installer & Agent Targets

This page orients a new contributor around CodeGraph's command surface: the `codegraph` binary, its first-use installer path, project initialization, indexing and sync commands, query/context commands, MCP serving, affected-test lookup, and the portable agent config writers for Claude Code, Cursor, Codex CLI, and opencode.

The implementation is intentionally local and provider-neutral. The installer writes ordinary config and instruction files for several agents, while the runtime server is launched as `codegraph serve --mcp`; no page claim here depends on a hosted model provider or proprietary connector. The selected wiki profile was used as bundled page-shaping guidance only; no local `STRATEGY.md` or `docs/solutions/**` sources were present in this checkout during inspection.

## First Commands To Try

For a fresh reader, the shortest useful path is:

```bash
npx @colbymchenry/codegraph
cd your-project
codegraph init -i
codegraph status
codegraph query "SomeSymbol"
codegraph context "understand this feature"
codegraph serve --mcp
```

The published package exposes the `codegraph` binary from `dist/bin/codegraph.js`, and the source CLI explicitly runs the interactive installer when invoked with no arguments. The README mirrors that flow: run `npx @colbymchenry/codegraph`, then initialize a project with `codegraph init -i`.

Sources: [package.json:7-9](), [src/bin/codegraph.ts:67-78](), [README.md:24-37]()

## Command Surface Map

| Command | First-use purpose | Important options |
|---|---|---|
| `codegraph` | Runs interactive installer when no args are supplied. | None. |
| `codegraph install` | Configures one or more agent targets. | `--target`, `--location`, `--yes`, `--no-permissions`, `--print-config`. |
| `codegraph init [path]` | Creates `.codegraph/`; optionally indexes immediately. | `--index`, `--verbose`. |
| `codegraph index [path]` | Full indexing pass. | `--force`, `--quiet`, `--verbose`. |
| `codegraph sync [path]` | Incremental update since last index. | `--quiet`. |
| `codegraph status [path]` | Shows index stats, backend, and pending changes. | `--json`. |
| `codegraph query <search>` | Searches symbols. | `--path`, `--limit`, `--kind`, `--json`. |
| `codegraph files` | Shows indexed file structure. | `--filter`, `--pattern`, `--format`, `--max-depth`, `--no-metadata`, `--json`. |
| `codegraph context <task>` | Builds markdown or JSON task context. | `--path`, `--max-nodes`, `--max-code`, `--no-code`, `--format`. |
| `codegraph serve --mcp` | Starts the MCP stdio server for agents. | `--path`, `--mcp`, `--no-watch`. |
| `codegraph affected [files...]` | Finds tests affected by changed files. | `--stdin`, `--depth`, `--filter`, `--json`, `--quiet`. |

The README has the high-level command reference, while the Commander definitions in `src/bin/codegraph.ts` are the source of truth for options and behavior.

Sources: [README.md:314-329](), [src/bin/codegraph.ts:394-399](), [src/bin/codegraph.ts:536-542](), [src/bin/codegraph.ts:605-609](), [src/bin/codegraph.ts:667-671](), [src/bin/codegraph.ts:786-793](), [src/bin/codegraph.ts:849-858](), [src/bin/codegraph.ts:1055-1063](), [src/bin/codegraph.ts:1101-1107](), [src/bin/codegraph.ts:1195-1204](), [src/bin/codegraph.ts:1326-1334]()

## CLI Bootstrap And Node Guardrails

`src/bin/codegraph.ts` keeps startup light by lazy-loading heavy CodeGraph modules and the installer. Before any tree-sitter or WASM-heavy work, it blocks Node.js major versions 25 and newer unless `CODEGRAPH_ALLOW_UNSAFE_NODE=1` is set. The banner text lives in a side-effect-free helper so tests can import it without bootstrapping the CLI.

```ts
// src/bin/codegraph.ts
const nodeVersion = process.versions.node;
const nodeMajor = parseInt(nodeVersion.split('.')[0] ?? '0', 10);
if (nodeMajor >= 25) {
  process.stderr.write(buildNode25BlockBanner(nodeVersion) + '\n');
  if (!process.env.CODEGRAPH_ALLOW_UNSAFE_NODE) {
    process.exit(1);
  }
}
```

The package also declares supported Node engines as `>=20.0.0 <25.0.0`.

Sources: [src/bin/codegraph.ts:30-65](), [src/bin/node-version-check.ts:1-20](), [src/bin/node-version-check.ts:20-39](), [package.json:56-58]()

## Project Lifecycle Commands

### `init`, `index`, and `sync`

`codegraph init [path]` resolves the target path, creates a CodeGraph project with `CodeGraph.init(projectPath, { index: false })`, and then optionally calls `indexAll()` when `--index` is set. If the project is already initialized, rerunning `init` is still useful: it attempts to wire project-local agent surfaces for globally configured agents and offers the watch fallback.

`codegraph index [path]` opens an initialized project and performs a full `indexAll()` pass. `--force` clears the existing index first, `--quiet` suppresses UI, and `--verbose` uses timestamped progress instead of the shimmer progress renderer. `codegraph sync [path]` opens the same project and calls `cg.sync()`, reporting added, modified, removed, and node update counts.

Sources: [src/bin/codegraph.ts:394-464](), [src/bin/codegraph.ts:536-588](), [src/bin/codegraph.ts:605-651]()

### `status`, `query`, `files`, and `context`

`status` is the health check. In JSON mode it returns initialization state, project path, file/node/edge counts, database size, backend, languages, and pending change counts. In human mode it also prints node and language breakdowns and tells the user to run `codegraph sync` when pending changes exist.

`query` searches indexed symbols via `cg.searchNodes(search, { limit, kinds })`. `files` reads `cg.getFiles()` and can render a tree, flat list, grouped-by-language output, or JSON. `context` calls `cg.buildContext(task, ...)` and prints the returned markdown or JSON.

Sources: [src/bin/codegraph.ts:667-710](), [src/bin/codegraph.ts:715-773](), [src/bin/codegraph.ts:786-812](), [src/bin/codegraph.ts:849-958](), [src/bin/codegraph.ts:1055-1089]()

## MCP Serve And Freshness

`codegraph serve` without `--mcp` prints a sample MCP config and the available tools to stderr. `codegraph serve --mcp` starts `MCPServer`, optionally bound to `--path`; without a path, the MCP server can initialize from client roots. `--no-watch` sets `CODEGRAPH_NO_WATCH=1` before server startup.

Freshness has two paths. The MCP server normally starts file watching and reports auto-sync activity to stderr. If watching is disabled by policy, it explains why and tells the user to run `codegraph sync` or install git sync hooks. The watch policy disables watching when `CODEGRAPH_NO_WATCH=1` is set, lets `CODEGRAPH_FORCE_WATCH=1` override auto-detection, and disables watching for projects under WSL2 `/mnt/*` because recursive watching is too slow there.

Sources: [src/bin/codegraph.ts:1101-1148](), [src/mcp/index.ts:240-270](), [src/sync/watch-policy.ts:71-98]()
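
The watch policy reads naturally as a small pure function. The sketch below is an assumption-laden illustration: `decideWatch` and `SketchWatchDecision` are hypothetical names, and the precedence shown (explicit disable first, then force, then auto-detection) is a guess that should be verified against `src/sync/watch-policy.ts`.

```typescript
interface SketchWatchDecision {
  watch: boolean;
  reason: string;
}

// Decide whether to start the file watcher.
// Assumed precedence: NO_WATCH beats FORCE_WATCH, which beats auto-detection.
function decideWatch(
  env: Record<string, string | undefined>,
  projectPath: string,
  isWsl2: boolean,
): SketchWatchDecision {
  if (env.CODEGRAPH_NO_WATCH === "1") {
    return { watch: false, reason: "disabled by CODEGRAPH_NO_WATCH=1" };
  }
  if (env.CODEGRAPH_FORCE_WATCH === "1") {
    return { watch: true, reason: "forced by CODEGRAPH_FORCE_WATCH=1" };
  }
  if (isWsl2 && /^\/mnt\//.test(projectPath)) {
    return { watch: false, reason: "recursive watching is too slow on WSL2 /mnt paths" };
  }
  return { watch: true, reason: "default" };
}
```

Returning a reason alongside the decision mirrors the documented behavior of explaining why watching is off and pointing the user at `codegraph sync` or git hooks instead.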

## Affected Test Discovery

`codegraph affected [files...]` accepts file arguments or newline-delimited stdin. It includes changed files that already match test patterns, then performs a breadth-first traversal through `cg.getFileDependents()` up to `--depth` and keeps dependents that look like tests. The default test patterns include `.spec.`, `.test.`, `__tests__`, `tests`, `e2e`, and `spec`; `--filter` replaces that with a custom glob-derived regex.

This command is designed for local hooks or CI glue:

```bash
git diff --name-only | codegraph affected --stdin --quiet
codegraph affected src/auth.ts --filter "e2e/*" --json
```

Sources: [src/bin/codegraph.ts:1185-1204](), [src/bin/codegraph.ts:1213-1256](), [src/bin/codegraph.ts:1258-1303](), [README.md:331-357]()
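
The traversal described above can be sketched as a depth-limited breadth-first search. `findAffectedTests` and `looksLikeTest` are hypothetical names, the `getDependents` callback stands in for `cg.getFileDependents()`, and the way the default patterns are matched here (substring for `.spec.`/`.test.`, whole path segment for the directory names) is an assumption.

```typescript
const TEST_DIR_SEGMENTS = new Set(["__tests__", "tests", "e2e", "spec"]);

// Assumed matching rules for the default test patterns.
function looksLikeTest(filePath: string): boolean {
  const normalized = filePath.replace(/\\/g, "/");
  if (normalized.indexOf(".spec.") !== -1 || normalized.indexOf(".test.") !== -1) {
    return true;
  }
  return normalized.split("/").some((segment) => TEST_DIR_SEGMENTS.has(segment));
}

// Breadth-first walk over "who depends on this file?" up to `maxDepth`
// hops, keeping every visited file that looks like a test.
function findAffectedTests(
  changed: string[],
  getDependents: (file: string) => string[],
  maxDepth: number,
  isTest: (file: string) => boolean = looksLikeTest,
): string[] {
  const affected = new Set<string>();
  const visited = new Set<string>(changed);
  for (const file of changed) {
    if (isTest(file)) affected.add(file); // changed tests count directly
  }
  let frontier = changed.slice();
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const file of frontier) {
      for (const dependent of getDependents(file)) {
        if (visited.has(dependent)) continue;
        visited.add(dependent);
        if (isTest(dependent)) affected.add(dependent);
        next.push(dependent);
      }
    }
    frontier = next;
  }
  return Array.from(affected).sort();
}
```

The depth limit trades recall for speed: a shallow walk finds tests that import the changed file directly, while a deeper walk also reaches tests that exercise it through intermediate modules.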

## Installer Flow

The installer has two entry points: `runInstaller()` for the interactive path and `runInstallerWithOptions()` for CLI flags such as `--target`, `--location`, and `--yes`. Its flow is:

1. Resolve target agents first.
2. Optionally install the npm package globally so agents can run `codegraph`.
3. Choose global vs local config location.
4. Ask about Claude auto-allow permissions only when Claude is selected.
5. Run each target's `install(location, { autoAllow })`.
6. For local installs, initialize and index the current project.
7. For global installs, print `cd your-project` and `codegraph init -i` as the quick start.

Sources: [src/installer/index.ts:64-87](), [src/installer/index.ts:87-130](), [src/installer/index.ts:132-217]()

### Target Selection

The target registry is deliberately small and explicit. Adding a new agent means implementing `AgentTarget` in `targets/<id>.ts` and adding it to `ALL_TARGETS`. The current target order is Claude Code, Cursor, Codex CLI, then opencode; that order controls the multiselect prompt, `--target=all`, and print-config help.

`--target` accepts `auto`, `all`, `none`, or a comma-separated list. `auto` returns detected installed targets, falling back to Claude if none are detected.

Sources: [src/installer/targets/registry.ts:1-21](), [src/installer/targets/registry.ts:23-45](), [src/installer/targets/registry.ts:47-83](), [src/installer/index.ts:261-309]()
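
Resolution of the `--target` value can be sketched as follows. The id strings (`claude`, `cursor`, `codex`, `opencode`) are inferred from the target file names, and `resolveTargetFlag` is a hypothetical helper; returning comma-separated selections in registry order with duplicates removed is a design choice of this sketch, not confirmed behavior.

```typescript
// Registry order as described above: Claude Code, Cursor, Codex CLI, opencode.
const SKETCH_TARGET_ORDER = ["claude", "cursor", "codex", "opencode"] as const;
type SketchTargetId = (typeof SKETCH_TARGET_ORDER)[number];

function isSketchTargetId(value: string): value is SketchTargetId {
  return (SKETCH_TARGET_ORDER as readonly string[]).indexOf(value) !== -1;
}

// Map a `--target` flag value to a list of target ids.
function resolveTargetFlag(flag: string, detected: SketchTargetId[]): SketchTargetId[] {
  const value = flag.trim().toLowerCase();
  if (value === "none") return [];
  if (value === "all") return SKETCH_TARGET_ORDER.slice();
  if (value === "auto") {
    // Fall back to Claude when nothing is detected, per the text above.
    const found = SKETCH_TARGET_ORDER.filter((id) => detected.indexOf(id) !== -1);
    return found.length > 0 ? found : ["claude"];
  }
  const requested = value.split(",").map((s) => s.trim()).filter((s) => s.length > 0);
  const unknown = requested.filter((id) => !isSketchTargetId(id));
  if (unknown.length > 0) {
    throw new Error(`Unknown target(s): ${unknown.join(", ")}`);
  }
  return SKETCH_TARGET_ORDER.filter((id) => requested.indexOf(id) !== -1);
}
```

Keeping a single ordered list as the source of truth is what lets one registry drive the multiselect prompt, `--target=all`, and the print-config help consistently.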

## Agent Target Contract

Every target implements the same portable contract: support checks, detection, install, uninstall, manual config printing, path description, and optional project-surface wiring. That keeps the installer BYOC/BYOK friendly: CodeGraph does not assume one model vendor or agent; each agent adapter owns only its local file format and filesystem paths.

```ts
// src/installer/targets/types.ts
export interface AgentTarget {
  readonly id: TargetId;
  readonly displayName: string;
  supportsLocation(loc: Location): boolean;
  detect(loc: Location): DetectionResult;
  install(loc: Location, opts: InstallOptions): WriteResult;
  uninstall(loc: Location): WriteResult;
  printConfig(loc: Location): string;
  describePaths(loc: Location): string[];
  wireProjectSurfaces?(): WriteResult;
}
```

Sources: [src/installer/targets/types.ts:1-13](), [src/installer/targets/types.ts:15-23](), [src/installer/targets/types.ts:73-120]()

## Target Matrix

| Target | Global files | Local files | Config shape | Permissions |
|---|---|---|---|---|
| Claude Code | `~/.claude.json`, `~/.claude/settings.json`, `~/.claude/CLAUDE.md` | `./.mcp.json`, `./.claude/settings.json`, `./.claude/CLAUDE.md` | JSON `mcpServers.codegraph` | Claude auto-allow list when enabled |
| Cursor | `~/.cursor/mcp.json` | `./.cursor/mcp.json`, `./.cursor/rules/codegraph.mdc` | JSON `mcpServers.codegraph`, with extra `--path` | None |
| Codex CLI | `~/.codex/config.toml`, `~/.codex/AGENTS.md` | Not supported | TOML `[mcp_servers.codegraph]` | None |
| opencode | XDG or Windows config dir `opencode.jsonc`/`.json`, plus `AGENTS.md` | `./opencode.jsonc`/`.json`, `./AGENTS.md` | JSONC `mcp.codegraph` with `enabled: true` | None |

All JSON-shaped targets share the base server command shape where applicable: `type: 'stdio'`, `command: 'codegraph'`, and `args: ['serve', '--mcp']`. Codex serializes the same command into a narrow TOML block. opencode uses a different wrapper where `command` is an array and `enabled` is explicit.

Sources: [src/installer/targets/claude.ts:1-18](), [src/installer/targets/cursor.ts:1-31](), [src/installer/targets/codex.ts:1-15](), [src/installer/targets/opencode.ts:1-27](), [src/installer/targets/shared.ts:14-25](), [src/installer/targets/toml.ts:1-20]()
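
The two JSON config shapes in the matrix can be compared side by side in a short sketch. Only the fields named above are shown; `sharedJsonConfig` and `opencodeConfig` are hypothetical helper names, and real agent configs may require additional fields that this page does not document.

```typescript
// Base server entry shared by the JSON-shaped targets, per the text above.
const baseServerEntry = {
  type: "stdio",
  command: "codegraph",
  args: ["serve", "--mcp"],
};

// Shape used under `mcpServers.codegraph` (Claude Code, Cursor).
function sharedJsonConfig(extraArgs: string[] = []) {
  return {
    mcpServers: {
      codegraph: { ...baseServerEntry, args: [...baseServerEntry.args, ...extraArgs] },
    },
  };
}

// Shape used under `mcp.codegraph` for opencode: the command is one
// array and `enabled` is explicit. Other fields are omitted here.
function opencodeConfig() {
  return {
    mcp: {
      codegraph: {
        command: [baseServerEntry.command, ...baseServerEntry.args],
        enabled: true,
      },
    },
  };
}
```

Calling `sharedJsonConfig(["--path", "/abs/project"])` illustrates how a target such as Cursor can append its extra `--path` argument without changing the base entry.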

### Claude Code Details

Claude supports both global and local locations. Global MCP config is written to `~/.claude.json`; local MCP config is written to `./.mcp.json`, because the source notes that Claude Code does not read project-level `./.claude.json`. The target also migrates stale local `./.claude.json` CodeGraph entries during install and uninstall.

When auto-allow is enabled, Claude receives permissions for CodeGraph MCP tools in `settings.json`. Instructions are inserted into `CLAUDE.md` through marker-delimited replacement.

Sources: [src/installer/targets/claude.ts:46-74](), [src/installer/targets/claude.ts:85-120](), [src/installer/targets/claude.ts:197-241](), [src/installer/targets/claude.ts:244-308](), [src/installer/targets/shared.ts:27-42]()

### Cursor Details

Cursor supports both global and local MCP configs, but its instruction/rules surface is project-local only. A global Cursor install writes `~/.cursor/mcp.json`; a local install writes `./.cursor/mcp.json` and `./.cursor/rules/codegraph.mdc`.

Cursor also gets special `--path` handling. Local installs use the current absolute path; global installs use `${workspaceFolder}` so a single global config can still resolve each opened workspace correctly.

Sources: [src/installer/targets/cursor.ts:59-71](), [src/installer/targets/cursor.ts:87-123](), [src/installer/targets/cursor.ts:151-185](), [src/installer/targets/cursor.ts:187-238]()

### Codex CLI Details

Codex CLI is global-only in this implementation. It writes `[mcp_servers.codegraph]` into `~/.codex/config.toml` and writes shared CodeGraph instructions into `~/.codex/AGENTS.md`. The TOML helper is intentionally not a general TOML parser; it splices one dotted-key table while preserving the rest of the file.

Sources: [src/installer/targets/codex.ts:40-59](), [src/installer/targets/codex.ts:61-90](), [src/installer/targets/codex.ts:121-180](), [src/installer/targets/toml.ts:56-121]()
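
To make the "splice one table, preserve the rest" idea concrete, here is a minimal sketch. It is not the code in `toml.ts`: the function name `spliceTomlTable` is hypothetical, and the sketch assumes a table ends at the next line that starts with `[`, which would misfire on multi-line arrays whose continuation lines begin with a bracket.

```typescript
// Hypothetical helper: replace (or append) a single TOML table by header name,
// leaving every other line of the file untouched.
function spliceTomlTable(content: string, header: string, bodyLines: string[]): string {
  const block = [`[${header}]`, ...bodyLines];
  const lines = content.split('\n');
  const start = lines.findIndex((line) => line.trim() === `[${header}]`);

  if (start === -1) {
    // Table not present yet: append it after the existing content.
    const base = content.trimEnd();
    return (base ? base + '\n\n' : '') + block.join('\n') + '\n';
  }

  // Assumption: the table runs until the next line that opens another table.
  let end = start + 1;
  while (end < lines.length && !/^\s*\[/.test(lines[end] ?? '')) end++;

  const before = lines.slice(0, start);
  const after = lines.slice(end);
  const merged = [...before, ...block, ...(after.length > 0 ? ['', ...after] : [])].join('\n');
  return merged.endsWith('\n') ? merged : merged + '\n';
}

const existing = '[other]\nx = 1\n\n[mcp_servers.codegraph]\ncommand = "old"\n\n[tail]\ny = 2\n';
const body = ['command = "codegraph"', 'args = ["serve", "--mcp"]'];
console.log(spliceTomlTable(existing, 'mcp_servers.codegraph', body));
```

Running the splice twice with the same body yields the same file, which is the property an idempotent reinstall needs.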

### opencode Details

opencode supports global and local locations. On global installs, it uses `%APPDATA%/opencode` on Windows or `$XDG_CONFIG_HOME/opencode` / `~/.config/opencode` elsewhere. It prefers an existing `opencode.jsonc`, falls back to an existing `opencode.json`, and defaults new installs to `.jsonc`.

The target uses `jsonc-parser` edits so comments survive reinstall and uninstall. It writes the MCP entry under `mcp.codegraph` and uses `AGENTS.md` for shared instructions.

Sources: [src/installer/targets/opencode.ts:52-82](), [src/installer/targets/opencode.ts:99-132](), [src/installer/targets/opencode.ts:173-224](), [src/installer/targets/opencode.ts:226-242]()
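
The path rules above can be sketched as two small pure functions. The names `opencodeGlobalDir` and `pickConfigName` are hypothetical, and the Windows fallback used when `APPDATA` is unset is an assumption of this sketch, not something the page states.

```typescript
import * as path from 'node:path';

interface EnvLike {
  APPDATA?: string;
  XDG_CONFIG_HOME?: string;
}

// Hypothetical helper: choose the global opencode config directory.
function opencodeGlobalDir(platform: string, env: EnvLike, home: string): string {
  if (platform === 'win32') {
    // Assumption: fall back to the conventional Roaming directory if APPDATA is unset.
    const appData = env.APPDATA || path.win32.join(home, 'AppData', 'Roaming');
    return path.win32.join(appData, 'opencode');
  }
  const configHome = env.XDG_CONFIG_HOME || path.posix.join(home, '.config');
  return path.posix.join(configHome, 'opencode');
}

// Hypothetical helper: prefer an existing .jsonc, then an existing .json,
// and default new installs to .jsonc.
function pickConfigName(existing: Set<string>): string {
  if (existing.has('opencode.jsonc')) return 'opencode.jsonc';
  if (existing.has('opencode.json')) return 'opencode.json';
  return 'opencode.jsonc';
}

console.log(opencodeGlobalDir('linux', {}, '/home/dev'));
console.log(pickConfigName(new Set(['opencode.json'])));
```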

## Shared Instructions And Idempotency

All agent instruction files receive the same marker-delimited block from `instructions-template.ts`. The block tells agents when to prefer CodeGraph tools over native search and includes a tool-choice table for `codegraph_search`, `codegraph_context`, callers/callees, impact, node, explore, files, and status.

The shared writer utilities are conservative: JSON parse failures are backed up before overwrite, writes are atomic, JSON equality avoids unnecessary rewrites, and markdown sections are replaced only between CodeGraph markers. The installer-target tests enforce the expected contract: install writes files, reinstall is idempotent, sibling MCP servers are preserved, uninstall reverses install, and `printConfig` does not write files.

Sources: [src/installer/instructions-template.ts:1-23](), [src/installer/instructions-template.ts:23-56](), [src/installer/targets/shared.ts:44-95](), [src/installer/targets/shared.ts:97-167](), [__tests__/installer-targets.test.ts:1-15](), [__tests__/installer-targets.test.ts:65-145]()
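
Marker-delimited replacement is the mechanism that keeps reinstall idempotent and uninstall reversible. The sketch below shows one way it could work; the marker strings and the function names `upsertMarkedBlock` and `removeMarkedBlock` are hypothetical and are not taken from `shared.ts`.

```typescript
// Hypothetical markers; the real ones live in the installer source.
const START = '<!-- codegraph:start -->';
const END = '<!-- codegraph:end -->';

// Insert the block, or replace only the text between existing markers.
function upsertMarkedBlock(content: string, body: string): string {
  const block = `${START}\n${body.trim()}\n${END}`;
  const s = content.indexOf(START);
  const e = content.indexOf(END);
  if (s !== -1 && e > s) {
    return content.slice(0, s) + block + content.slice(e + END.length);
  }
  const base = content.trimEnd();
  return (base ? base + '\n\n' : '') + block + '\n';
}

// Remove the marked block and leave user-owned text in place.
function removeMarkedBlock(content: string): string {
  const s = content.indexOf(START);
  const e = content.indexOf(END);
  if (s === -1 || e <= s) return content;
  const before = content.slice(0, s).trimEnd();
  const after = content.slice(e + END.length).trim();
  const kept = [before, after].filter((part) => part.length > 0).join('\n\n');
  return kept.length > 0 ? kept + '\n' : '';
}

const userFile = '# Notes\n\nMine.\n';
const installed = upsertMarkedBlock(userFile, 'Prefer codegraph_context first.');
console.log(installed);
console.log(removeMarkedBlock(installed) === userFile);
```

The same strip-and-reappend approach applies to any file that mixes generated and user-owned content.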

## Project-Local Surfaces After Global Install

A global MCP config is not always enough. Cursor's rule file is project-scoped, so `codegraph init` calls `wireProjectSurfacesForGlobalAgents()` to let any globally configured target write project-local support files. Today, Cursor is the notable target with a `wireProjectSurfaces()` implementation.

This is the key workflow detail for new maintainers: `codegraph install --location=global` configures the agent globally, and `codegraph init -i` in each project builds the graph and repairs any target-specific local surfaces needed for that project.

Sources: [src/installer/index.ts:219-249](), [src/bin/codegraph.ts:405-421](), [src/bin/codegraph.ts:430-442](), [src/installer/targets/types.ts:106-120](), [src/installer/targets/cursor.ts:163-171]()

## Watch Fallback And Git Hooks

After initialization, CodeGraph may offer a fallback when the live watcher is disabled. If the project is not a git repo, it tells the user to run `codegraph sync` manually after changes. If the repo is git-backed and hooks are not already installed, it can install sync hooks for commit, pull, and checkout workflows. Hook installation strips and re-appends only CodeGraph's marked block, preserving user hook content.

Sources: [src/installer/index.ts:365-433](), [src/sync/git-hooks.ts:121-159](), [src/sync/git-hooks.ts:161-208]()

## Manual Config Printing

`codegraph install --print-config <id>` is the no-write escape hatch. It resolves the target id, chooses global unless `--location=local` is provided, then prints that target's config snippet and returns without invoking the installer. Tests assert that `printConfig` returns non-empty output without creating files.

This supports portable setup docs, locked-down environments, and BYOC/BYOK workflows where users want to review or paste config manually rather than letting the installer write files.

Sources: [src/bin/codegraph.ts:1326-1352](), [src/installer/targets/types.ts:98-105](), [__tests__/installer-targets.test.ts:135-141]()

## Maintainer Checklist

When changing this area, use the table below to find the files to inspect and the tests or checks to update for each kind of change:

| Change area | Files to inspect | Tests or checks to update |
|---|---|---|
| Add a CLI command or option | `src/bin/codegraph.ts`, README CLI reference | Command behavior and README examples |
| Add an agent target | `src/installer/targets/types.ts`, new `targets/<id>.ts`, `registry.ts` | Contract tests in `__tests__/installer-targets.test.ts` |
| Change shared MCP command shape | `src/installer/targets/shared.ts`, target-specific overrides | Target tests and manual snippets |
| Change Codex TOML handling | `src/installer/targets/codex.ts`, `src/installer/targets/toml.ts` | TOML serializer tests |
| Change install flow | `src/installer/index.ts` | Target resolution, local/global behavior, watch fallback |
| Change freshness behavior | `src/bin/codegraph.ts`, `src/mcp/index.ts`, `src/sync/**` | Watch policy and git hook behavior |

Sources: [src/installer/targets/registry.ts:1-8](), [src/installer/targets/types.ts:73-120](), [__tests__/installer-targets.test.ts:438-457](), [__tests__/installer-targets.test.ts:459-549]()

## Summary

The CLI is the user-facing shell around a local CodeGraph index and MCP server. The installer is the portability layer: one orchestrator, one target registry, and small target adapters that write each agent's native config format while preserving user-owned content. For a first 30 minutes in the repo, read `src/bin/codegraph.ts` for commands, `src/installer/index.ts` for install flow, `src/installer/targets/types.ts` plus `registry.ts` for the adapter contract, and the four target files for agent-specific config behavior.

Sources: [src/bin/codegraph.ts:126-129](), [src/installer/index.ts:1-12](), [src/installer/targets/types.ts:1-13](), [src/installer/targets/registry.ts:16-21]()

---

## 03. MCP Tools & Context Workflow

> How CodeGraph exposes provider-neutral code intelligence through stdio JSON-RPC tools, why codegraph_context is the primary workflow, and how search, callers, callees, impact, explore, node, status, and files fit together.

- Page Markdown: https://grok-wiki.com/public/wiki/colbymchenry-codegraph-89e8b2c4d43a/pages/03-mcp-tools-context-workflow.md
- Generated: 2026-05-22T16:26:39.449Z

### Source Files

- `src/mcp/index.ts`
- `src/mcp/tools.ts`
- `src/mcp/transport.ts`
- `src/mcp/server-instructions.ts`
- `src/context/index.ts`
- `src/context/formatter.ts`
- `__tests__/mcp-initialize.test.ts`
- `__tests__/mcp-roots.test.ts`

<details>
<summary>Relevant source files</summary>

The following files were used as context for generating this wiki page:

- [src/mcp/index.ts](src/mcp/index.ts)
- [src/mcp/tools.ts](src/mcp/tools.ts)
- [src/mcp/transport.ts](src/mcp/transport.ts)
- [src/mcp/server-instructions.ts](src/mcp/server-instructions.ts)
- [src/context/index.ts](src/context/index.ts)
- [src/context/formatter.ts](src/context/formatter.ts)
- [src/index.ts](src/index.ts)
- [src/bin/codegraph.ts](src/bin/codegraph.ts)
- [src/installer/targets/shared.ts](src/installer/targets/shared.ts)
- [src/types.ts](src/types.ts)
- [__tests__/mcp-initialize.test.ts](__tests__/mcp-initialize.test.ts)
- [__tests__/mcp-roots.test.ts](__tests__/mcp-roots.test.ts)

</details>

# MCP Tools & Context Workflow

CodeGraph exposes its indexed code knowledge graph through an MCP server that speaks JSON-RPC 2.0 over stdio. The result is intentionally provider-neutral: any client or agent runtime that can launch `codegraph serve --mcp` and exchange MCP messages over stdin/stdout can use the same local graph, without assuming a specific model vendor, hosted service, or proprietary connector.

This page explains the runtime handshake, why `codegraph_context` is the primary workflow, and how the supporting tools fit together when an agent needs search, call graph tracing, blast-radius analysis, source inspection, status, or indexed file structure.

Sources: [src/mcp/transport.ts:1-5](), [src/bin/codegraph.ts:1101-1121](), [src/installer/targets/shared.ts:14-24](), [src/mcp/server-instructions.ts:1-7]()

## Runtime Boundary

At runtime, `MCPServer` owns the stdio transport, a default `CodeGraph` instance when one is resolved, and a `ToolHandler` that executes individual `codegraph_*` tools. The server starts listening immediately, handles process shutdown signals, and defers project initialization until it has enough client context to choose the right workspace root.

```text
MCP client / agent
        |
        | JSON-RPC 2.0 over stdio
        v
src/mcp/transport.ts
        |
        v
src/mcp/index.ts  ->  ToolHandler in src/mcp/tools.ts
        |                         |
        v                         v
CodeGraph instance           context/search/call/status/file handlers
        |
        v
SQLite-backed local code graph
```

The server’s routing surface is small: `initialize`, `initialized`, `tools/list`, `tools/call`, and `ping` are the recognized JSON-RPC methods. Unknown request methods get a standard method-not-found error.

Sources: [src/mcp/index.ts:83-104](), [src/mcp/index.ts:112-125](), [src/mcp/index.ts:300-343](), [src/mcp/transport.ts:47-56]()
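
The routing surface can be pictured as a small dispatch table. This is a sketch, not the code in `src/mcp/index.ts`: the `route` function and the placeholder handler results are hypothetical, while `-32601` is the standard JSON-RPC 2.0 code for method-not-found.

```typescript
interface RpcRequest {
  jsonrpc: '2.0';
  id?: number | string;
  method: string;
  params?: unknown;
}

interface RpcResponse {
  jsonrpc: '2.0';
  id: number | string;
  result?: unknown;
  error?: { code: number; message: string };
}

const METHOD_NOT_FOUND = -32601; // defined by JSON-RPC 2.0

type Handler = (params: unknown) => unknown;

// Placeholder handlers for the five recognized methods; results are illustrative.
const handlers: Record<string, Handler> = {
  initialize: () => ({ capabilities: { tools: {} } }),
  initialized: () => undefined,
  'tools/list': () => ({ tools: [] }),
  'tools/call': () => ({ content: [] }),
  ping: () => ({}),
};

// Hypothetical router: notifications (no id) never get a response.
function route(req: RpcRequest, table: Record<string, Handler>): RpcResponse | null {
  const handler = Object.prototype.hasOwnProperty.call(table, req.method)
    ? table[req.method]
    : undefined;
  if (req.id === undefined) {
    handler?.(req.params);
    return null;
  }
  if (!handler) {
    return {
      jsonrpc: '2.0',
      id: req.id,
      error: { code: METHOD_NOT_FOUND, message: `Method not found: ${req.method}` },
    };
  }
  return { jsonrpc: '2.0', id: req.id, result: handler(req.params) };
}

console.log(route({ jsonrpc: '2.0', id: 1, method: 'ping' }, handlers));
console.log(route({ jsonrpc: '2.0', id: 2, method: 'resources/list' }, handlers));
```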

## Initialization And Root Resolution

The initialize response is deliberately sent before heavy CodeGraph initialization. The implementation comments call out slow filesystem and tree-sitter WASM startup cases where blocking the handshake caused MCP tools not to appear in clients. After responding, the server initializes in the background when it has an explicit path.

Project root selection is ordered by strength of signal:

| Signal | When Used | Behavior |
|---|---|---|
| `rootUri` | Client sends it during `initialize` | Converted from `file://` URI to a filesystem path. |
| `workspaceFolders[0].uri` | Client sends workspace folders | First folder is used. |
| `--path` constructor argument | Server was launched with a path | Used as the explicit project path. |
| `roots/list` | No explicit path, client advertised `roots` | Server asks the client for workspace roots on first needed tool call. |
| `process.cwd()` | Last resort | Used only after no stronger project signal is available. |

The `roots/list` path matters for BYOC/BYOK and editor portability: clients can launch the same stdio server from outside the repository while still reporting the real workspace root through MCP, instead of relying on a vendor-specific launch directory convention.

Sources: [src/mcp/index.ts:348-407](), [src/mcp/index.ts:171-216](), [src/mcp/index.ts:218-238](), [__tests__/mcp-roots.test.ts:1-17]()
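
The precedence in the table can be read as a single decision function. The sketch below is an interpretation of that table, not the server's code: `decideRoot`, `RootDecision`, and the `InitializeParamsLike` shape are hypothetical names, and the deferred `roots/list` round trip is modeled only as a decision to ask later.

```typescript
import { fileURLToPath } from 'node:url';

interface InitializeParamsLike {
  rootUri?: string | null;
  workspaceFolders?: { uri: string }[] | null;
  capabilities?: { roots?: unknown };
}

type RootDecision =
  | { kind: 'explicit'; path: string }
  | { kind: 'ask-roots' }
  | { kind: 'cwd'; path: string };

// Convert a file:// URI to a filesystem path; anything else is ignored.
function toPath(uri: string): string | undefined {
  if (!uri.startsWith('file://')) return undefined;
  try {
    return fileURLToPath(uri);
  } catch {
    return undefined;
  }
}

// Hypothetical helper mirroring the table's order of signals.
function decideRoot(params: InitializeParamsLike, cliPath: string | undefined, cwd: string): RootDecision {
  const firstFolder = params.workspaceFolders?.[0];
  const fromClient =
    (params.rootUri ? toPath(params.rootUri) : undefined) ??
    (firstFolder ? toPath(firstFolder.uri) : undefined);
  const explicit = fromClient ?? cliPath;
  if (explicit) return { kind: 'explicit', path: explicit };
  // The client advertised roots: defer and ask on the first tool call.
  if (params.capabilities?.roots !== undefined) return { kind: 'ask-roots' };
  return { kind: 'cwd', path: cwd };
}

console.log(decideRoot({ capabilities: { roots: {} } }, undefined, process.cwd()));
```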

```mermaid
sequenceDiagram
    participant Client as MCP client
    participant Server as MCPServer
    participant Transport as StdioTransport
    participant Graph as CodeGraph

    Client->>Server: initialize(rootUri? workspaceFolders? roots?)
    Server-->>Client: initialize result with tools capability and instructions
    alt explicit rootUri/workspaceFolders/--path
        Server->>Graph: tryInitializeDefault(explicitPath) in background
    else roots capability but no explicit path
        Client->>Server: tools/call
        Server->>Transport: roots/list
        Transport-->>Server: first file:// root
        Server->>Graph: tryInitializeDefault(root)
    else no usable root signal
        Server->>Graph: tryInitializeDefault(process.cwd())
    end
```

The tests enforce both contracts. First, `initialize` must respond quickly even when no `.codegraph` exists. Second, when a real `.codegraph` exists, the `initialize` JSON-RPC response must be observed before the watcher startup logs that mark the end of initialization.

Sources: [__tests__/mcp-initialize.test.ts:1-12](), [__tests__/mcp-initialize.test.ts:110-148](), [__tests__/mcp-roots.test.ts:96-126](), [__tests__/mcp-roots.test.ts:153-179]()

## Transport: Plain JSON-RPC Over Stdio

`StdioTransport` reads one JSON-RPC message per stdin line and writes JSON responses or notifications to stdout. It also supports server-initiated requests, which CodeGraph uses for `roots/list`; pending requests are keyed by generated IDs and rejected on timeout so the server can fall back instead of hanging.

Short excerpt from `src/mcp/transport.ts`:

```ts
process.stdout.write(JSON.stringify({ jsonrpc: '2.0', id, method, params }) + '\n');
```

This boundary is intentionally simple. It keeps CodeGraph portable across local agent clients and skill/catalog sources: the server is just a local process with stdio, and model/provider choice stays outside the repository.

Sources: [src/mcp/transport.ts:63-72](), [src/mcp/transport.ts:77-93](), [src/mcp/transport.ts:110-133](), [src/mcp/transport.ts:177-226]()
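
A minimal sketch of the two behaviors described above, line framing and server-initiated requests with a timeout, is shown here. It is an illustration under assumptions rather than the `StdioTransport` implementation: the class name `LineTransport`, its method names, and the numeric ID scheme are hypothetical.

```typescript
interface RpcMessage {
  jsonrpc: '2.0';
  id?: number | string;
  method?: string;
  params?: unknown;
  result?: unknown;
  error?: { code: number; message: string };
}

interface Pending {
  resolve: (value: unknown) => void;
  reject: (reason: Error) => void;
  timer: ReturnType<typeof setTimeout>;
}

class LineTransport {
  private buffer = '';
  private nextId = 1;
  private pending = new Map<number, Pending>();
  private write: (line: string) => void;
  private onIncoming: (msg: RpcMessage) => void;

  constructor(write: (line: string) => void, onIncoming: (msg: RpcMessage) => void) {
    this.write = write;
    this.onIncoming = onIncoming;
  }

  // Feed a raw input chunk; complete lines are parsed, the tail is kept.
  feed(chunk: string): void {
    this.buffer += chunk;
    const lines = this.buffer.split('\n');
    this.buffer = lines.pop() ?? '';
    for (const line of lines) {
      if (line.trim() === '') continue;
      try {
        this.dispatch(JSON.parse(line) as RpcMessage);
      } catch {
        // Sketch only: malformed lines are dropped.
      }
    }
  }

  // Server-initiated request; rejects on timeout so callers can fall back.
  request(method: string, params: unknown, timeoutMs: number): Promise<unknown> {
    const id = this.nextId++;
    return new Promise((resolve, reject) => {
      const timer = setTimeout(() => {
        this.pending.delete(id);
        reject(new Error(`${method} timed out`));
      }, timeoutMs);
      this.pending.set(id, { resolve, reject, timer });
      this.write(JSON.stringify({ jsonrpc: '2.0', id, method, params }) + '\n');
    });
  }

  private dispatch(msg: RpcMessage): void {
    const isResponse = msg.method === undefined && typeof msg.id === 'number';
    const entry = isResponse ? this.pending.get(msg.id as number) : undefined;
    if (!entry) {
      this.onIncoming(msg);
      return;
    }
    clearTimeout(entry.timer);
    this.pending.delete(msg.id as number);
    if (msg.error) entry.reject(new Error(msg.error.message));
    else entry.resolve(msg.result);
  }
}
```

In a real process, `write` would wrap `process.stdout.write` and `feed` would be driven by stdin `data` events.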

## Why `codegraph_context` Is The Primary Workflow

The tool definitions and server instructions both steer agents toward `codegraph_context` first for “how does this work?”, architecture, feature-area, and bug-context questions. The reason is visible in the context builder: it turns a natural-language task into a compact markdown context by finding entry points, expanding graph relationships, extracting key code blocks, collecting related files, and computing stats in one call.

That makes `codegraph_context` the default first move for understanding a code area. It is not a replacement for product clarification: the tool handler adds a reminder for feature-like tasks because code context does not define UX preferences, edge cases, or acceptance criteria.

Sources: [src/mcp/tools.ts:241-276](), [src/mcp/server-instructions.ts:25-35](), [src/context/index.ts:196-267](), [src/mcp/tools.ts:640-675]()

### What Context Contains

`TaskContext` stores the original query, relevant subgraph, entry points, code blocks, related files, summary, and basic stats. The markdown formatter emits a compact document with entry points, related symbols grouped by file, and selected code blocks.

Sources: [src/types.ts:746-818](), [src/context/formatter.ts:9-71]()

## Tool Map

| Tool | Use It For | Implementation Notes |
|---|---|---|
| `codegraph_context` | First-pass task, architecture, feature-area, or bug context | Builds markdown context through `cg.buildContext()`. |
| `codegraph_search` | Quick symbol lookup by name | Returns locations and signatures, not source bodies. |
| `codegraph_callers` | “What calls this?” | Aggregates callers across all exact matching symbols. |
| `codegraph_callees` | “What does this call?” | Aggregates callees across all exact matching symbols. |
| `codegraph_impact` | Blast radius for a symbol change | Merges impact subgraphs across matching symbols. |
| `codegraph_explore` | Source for several related symbols | Groups symbols by file, shows relationships, and caps output adaptively. |
| `codegraph_node` | One symbol’s details | Defaults to no source; container nodes return outlines when source is requested. |
| `codegraph_status` | Index health and size | Reports files, nodes, edges, DB size, backend, node kinds, languages. |
| `codegraph_files` | Indexed file structure | Returns tree, flat, or grouped views with optional metadata. |

Sources: [src/mcp/tools.ts:248-442](), [src/mcp/tools.ts:585-612](), [src/mcp/tools.ts:617-637](), [src/mcp/tools.ts:1270-1353](), [src/mcp/tools.ts:1355-1404]()

## Context Workflow For A Reader Or Agent

For the first 30 minutes in a repo, use tools by intent rather than by implementation curiosity:

1. Start with `codegraph_context` for the task or area.
2. If you need source for several surfaced symbols, use one `codegraph_explore` with specific symbol, file, or code terms.
3. If one symbol needs exact details, use `codegraph_node`.
4. For refactors, use `codegraph_search`, then `codegraph_callers`, then `codegraph_impact`.
5. For project shape, use `codegraph_files`; for index health, use `codegraph_status`.

This mirrors the server-level playbook emitted in the initialize response. It also avoids repeated search/read loops that duplicate the graph index’s job.

Sources: [src/mcp/server-instructions.ts:37-60](), [src/mcp/index.ts:373-391](), [src/mcp/tools.ts:380-398]()
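
One way to read the list above is as an intent-to-tool playbook that a client turns into `tools/call` requests. In this sketch the `Intent` labels, the `PLAYBOOK` table, and the `symbol` argument name are hypothetical; only the tool names and the MCP `tools/call` envelope (`params.name` plus `params.arguments`) come from the page and the protocol.

```typescript
type Intent =
  | 'understand'
  | 'read-related-source'
  | 'inspect-symbol'
  | 'refactor'
  | 'project-shape'
  | 'index-health';

// Ordered tool sequences per intent, following the numbered steps above.
const PLAYBOOK: Record<Intent, string[]> = {
  understand: ['codegraph_context'],
  'read-related-source': ['codegraph_explore'],
  'inspect-symbol': ['codegraph_node'],
  refactor: ['codegraph_search', 'codegraph_callers', 'codegraph_impact'],
  'project-shape': ['codegraph_files'],
  'index-health': ['codegraph_status'],
};

// Build an MCP tools/call request. Argument names depend on each tool's
// input schema, which this page does not list.
function toolCall(id: number, name: string, args: Record<string, unknown>) {
  return {
    jsonrpc: '2.0' as const,
    id,
    method: 'tools/call' as const,
    params: { name, arguments: args },
  };
}

// Hypothetical argument: `symbol` is illustrative only.
const refactorCalls = PLAYBOOK.refactor.map((name, i) => toolCall(i + 1, name, { symbol: 'MCPServer' }));
console.log(JSON.stringify(refactorCalls, null, 2));
```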

## How `codegraph_explore` Fits After Context

`codegraph_explore` is a breadth tool, but it is not the first tool for natural-language questions. Its description asks callers to query with specific symbol, file, or code terms, and its handler uses `findRelevantContext()` with larger search/traversal budgets before grouping results by file.

The output is intentionally capped and shaped. It scores files around entry points, can include a relationship map, reads contiguous source sections, clusters nearby symbol ranges, includes line numbers by default, and adds “additional relevant files” or budget notes depending on project size.

Sources: [src/mcp/tools.ts:380-398](), [src/mcp/tools.ts:819-858](), [src/mcp/tools.ts:863-932](), [src/mcp/tools.ts:994-1093](), [src/mcp/tools.ts:1207-1268]()

## Symbol Resolution And Ambiguity

The handler supports simple symbol names and qualified names using dot, slash, or `::` separators. For qualified lookups, it can suffix-match `qualifiedName` or match path segments, which helps languages where module/package structure is represented in file paths. `callers`, `callees`, and `impact` use the “all symbols” path so ambiguous names aggregate across exact matches and include a note about the matched locations.

Sources: [src/mcp/tools.ts:1547-1604](), [src/mcp/tools.ts:1606-1648](), [src/mcp/tools.ts:1650-1681](), [src/mcp/tools.ts:706-817]()
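
The suffix-matching idea can be sketched with two small functions. This is a simplified illustration, not the handler's logic: `splitQualified` and `suffixMatches` are hypothetical names, and the sketch covers only `qualifiedName` suffix matching, not the path-segment matching mentioned above.

```typescript
// Split a qualified name on '.', '/', or '::' and drop empty parts.
function splitQualified(query: string): string[] {
  return query.split(/::|[./]/).filter((part) => part.length > 0);
}

// True when the query's segments equal the trailing segments of qualifiedName.
function suffixMatches(qualifiedName: string, query: string): boolean {
  const have = splitQualified(qualifiedName);
  const want = splitQualified(query);
  if (want.length === 0 || want.length > have.length) return false;
  const tail = have.slice(have.length - want.length);
  return tail.every((part, i) => part === want[i]);
}

console.log(splitQualified('auth::Session.refresh'));
console.log(suffixMatches('app.auth.Session.refresh', 'Session::refresh'));
console.log(suffixMatches('app.auth.Session.refresh', 'auth.refresh'));
```

Because the separators are normalized before comparison, a query written in one language's convention can still match a symbol stored in another's.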

## Status, Files, And Freshness

`codegraph_status` reads graph stats and backend state from the active `CodeGraph` connection. `codegraph_files` reads the indexed file list, filters by path or glob-like pattern, and can render tree, flat, or language-grouped output. These tools use the index, not a live filesystem scan.

Freshness is handled by the MCP server starting the CodeGraph watcher when available. If watching is disabled or unavailable, the server writes a diagnostic message telling the user to run `codegraph sync` or use git sync hooks.

Sources: [src/index.ts:606-623](), [src/index.ts:686-690](), [src/mcp/tools.ts:1308-1353](), [src/mcp/tools.ts:1355-1404](), [src/mcp/index.ts:240-280]()

## Internal API Boundary

The MCP layer is a wrapper over `CodeGraph` methods rather than a separate intelligence engine. `CodeGraph` exposes node search, file listing, call graph traversal, callers, callees, impact radius, source extraction, `findRelevantContext()`, and `buildContext()`. The MCP tools choose argument defaults, output shape, truncation, cross-project lookup, and agent-facing descriptions.

Sources: [src/index.ts:650-655](), [src/index.ts:711-723](), [src/index.ts:725-798](), [src/index.ts:896-943](), [src/mcp/tools.ts:444-560]()

## Portable Integration Notes

The installed MCP config helper emits the same local stdio command shape for JSON-shaped agent configs: `type: "stdio"`, `command: "codegraph"`, and `args: ["serve", "--mcp"]`. Codex TOML has its own wrapper, but the underlying local command remains the same. This is the key portability point for Grok-Wiki or other BYOC/BYOK integrations: integration code should depend on local MCP process configuration and repository/cached skill files, not on a specific hosted model or connector.

This checkout does not contain `STRATEGY.md` or `docs/solutions/**`, so implementation claims on this page are grounded in repository code and regression tests.

Sources: [src/installer/targets/shared.ts:14-24](), [src/bin/codegraph.ts:1101-1121](), [src/mcp/server-instructions.ts:1-17]()

## Summary

The MCP workflow is: launch a local stdio server, resolve the project root through explicit client signals or MCP roots, expose a compact set of graph-backed tools, and steer agents toward `codegraph_context` before narrower follow-up calls. `search`, `callers`, `callees`, `impact`, `explore`, `node`, `status`, and `files` are supporting tools around that primary context-building loop, each backed by the same local CodeGraph index rather than provider-specific code intelligence.

Sources: [src/mcp/tools.ts:241-248](), [src/context/index.ts:270-283](), [src/index.ts:925-943]()

---

## 04. Indexing & Language Extraction Pipeline

> The pipeline from project files to graph records: git-aware scanning, include and exclude filters, language detection, tree-sitter WASM loading, worker-based parsing, extraction results, and tests that prove supported language behavior.

- Page Markdown: https://grok-wiki.com/public/wiki/colbymchenry-codegraph-89e8b2c4d43a/pages/04-indexing-language-extraction-pipeline.md
- Generated: 2026-05-22T16:27:17.940Z

### Source Files

- `src/extraction/index.ts`
- `src/extraction/tree-sitter.ts`
- `src/extraction/parse-worker.ts`
- `src/extraction/grammars.ts`
- `src/extraction/tree-sitter-types.ts`
- `src/extraction/languages/typescript.ts`
- `src/extraction/languages/python.ts`
- `__tests__/extraction.test.ts`

<details>
<summary>Relevant source files</summary>

The following files were used as context for generating this wiki page:

- [src/extraction/index.ts](src/extraction/index.ts)
- [src/extraction/tree-sitter.ts](src/extraction/tree-sitter.ts)
- [src/extraction/parse-worker.ts](src/extraction/parse-worker.ts)
- [src/extraction/grammars.ts](src/extraction/grammars.ts)
- [src/extraction/tree-sitter-types.ts](src/extraction/tree-sitter-types.ts)
- [src/extraction/languages/index.ts](src/extraction/languages/index.ts)
- [src/extraction/languages/typescript.ts](src/extraction/languages/typescript.ts)
- [src/extraction/languages/python.ts](src/extraction/languages/python.ts)
- [src/types.ts](src/types.ts)
- [src/db/queries.ts](src/db/queries.ts)
- [__tests__/extraction.test.ts](__tests__/extraction.test.ts)

</details>

# Indexing & Language Extraction Pipeline

This page explains how CodeGraph turns project files into graph records. The pipeline starts with git-aware file discovery, applies include and exclude filters, detects languages, loads only the needed tree-sitter WASM grammars, parses files in a worker when available, extracts graph nodes and references, and finally persists the result into the local graph database.

This checkout does not contain `STRATEGY.md` or `docs/solutions/**`, so implementation claims below are grounded in repository code and tests.

## Pipeline At A Glance

```text
project root
  -> scanDirectoryAsync()
     -> git-visible files when possible
     -> filesystem walk fallback
     -> include/exclude filtering
  -> detect frameworks once per full index
  -> detect needed languages
  -> load tree-sitter WASM grammars in parse worker
  -> read files in small I/O batches
  -> parse/extract per file
     -> file node
     -> symbol nodes
     -> contains edges
     -> unresolved imports/calls/types/decorators/etc.
  -> storeExtractionResult()
     -> nodes
     -> edges
     -> unresolved_refs
     -> files metadata
```

The important boundary is that scanning and database writes stay in the orchestrator, while parsing is delegated to `extractFromSource()` and usually a worker thread. SQLite writes remain on the main thread. Sources: [src/extraction/index.ts:512-600](), [src/extraction/index.ts:734-849](), [src/extraction/tree-sitter.ts:2487-2547]()

## File Discovery And Filtering

`shouldIncludeFile()` is the first explicit filter: exclude patterns win before include patterns are considered. Both checks use normalized forward-slash paths and `picomatch` with dotfile matching enabled. Sources: [src/extraction/index.ts:97-126]()
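
The exclude-before-include rule can be sketched as follows. To stay self-contained, this sketch uses a tiny hand-rolled glob converter (`globToRegExp`, hypothetical) as a stand-in for `picomatch`; it supports only `*`, `**`, and `?` and is not equivalent to the real matcher. The function is named `shouldIncludeFileSketch` to make clear it is not the repository's `shouldIncludeFile()`.

```typescript
// Stand-in for picomatch: converts a small glob subset to a RegExp.
function globToRegExp(glob: string): RegExp {
  let re = '';
  for (let i = 0; i < glob.length; i++) {
    const c = glob.charAt(i);
    if (c === '*') {
      if (glob.charAt(i + 1) === '*') {
        if (glob.charAt(i + 2) === '/') {
          re += '(?:.*/)?'; // '**/' matches zero or more directories
          i += 2;
        } else {
          re += '.*'; // bare '**' matches anything, including '/'
          i += 1;
        }
      } else {
        re += '[^/]*'; // '*' stays within one path segment
      }
    } else if (c === '?') {
      re += '[^/]';
    } else {
      re += c.replace(/[.+^${}()|[\]\\]/g, '\\$&');
    }
  }
  return new RegExp(`^${re}$`);
}

function shouldIncludeFileSketch(filePath: string, include: string[], exclude: string[]): boolean {
  const normalized = filePath.replace(/\\/g, '/'); // forward slashes only
  // Exclude patterns win before include patterns are considered.
  if (exclude.some((glob) => globToRegExp(glob).test(normalized))) return false;
  return include.some((glob) => globToRegExp(glob).test(normalized));
}

const include = ['**/*.ts'];
const exclude = ['**/node_modules/**'];
console.log(shouldIncludeFileSketch('src\\mcp\\index.ts', include, exclude));
console.log(shouldIncludeFileSketch('node_modules/pkg/index.ts', include, exclude));
```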

The default configuration includes source extensions for TypeScript, JavaScript, Python, Go, Rust, Java, C/C++, C#, PHP, Ruby, Swift, Kotlin, Dart, Svelte, Vue, Liquid, Pascal/Delphi, and Scala. It excludes common dependency, build, cache, coverage, IDE, and generated-output directories, and caps processed files at 1 MB by default. Sources: [src/types.ts:453-490](), [src/types.ts:495-696]()

### Git-Aware Scanning

In git repositories, the scanner uses `git ls-files` rather than walking the filesystem first. It collects tracked files with `--recurse-submodules`, separately collects untracked files with `--exclude-standard`, and recurses into embedded nested git repositories that git reports as opaque trailing-slash entries. If git discovery fails or the root is ignored by a parent repository, scanning falls back to the filesystem walker. Sources: [src/extraction/index.ts:128-215]()

The public scanner entry points are:

| Function | Role |
|---|---|
| `scanDirectory()` | Synchronous scan for source paths. |
| `scanDirectoryAsync()` | Async scan that periodically yields during large git file lists so progress rendering can continue. |
| `scanDirectoryWalk()` | Non-git fallback that follows readable directories, guards symlink cycles, respects `.codegraphignore`, and applies include/exclude rules. |

Sources: [src/extraction/index.ts:270-334](), [src/extraction/index.ts:336-435]()

## Language Detection And Grammar Loading

Language detection is extension-first. `EXTENSION_MAP` maps each known extension to the repository's `Language` union. `.h` files default to C but are upgraded to C++ if the first 8 KB contain C++-specific syntax such as namespaces, classes, templates, access labels, virtual methods, or `using namespace`. Sources: [src/extraction/grammars.ts:40-81](), [src/extraction/grammars.ts:178-200]()
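
The `.h` upgrade heuristic can be approximated with a single pattern over the head of the file. The regular expression, the `'c'` and `'cpp'` labels, and the function name `detectHeaderLanguage` below are assumptions of this sketch; the real check in `grammars.ts` may use different patterns. The sketch also slices by characters rather than bytes.

```typescript
const HEADER_SNIFF_CHARS = 8 * 1024; // approximates "the first 8 KB"

// Illustrative hints: namespaces (including `using namespace`), classes,
// templates, access labels, and virtual methods.
const CPP_HINTS =
  /\b(?:namespace|class|template)\b|\b(?:public|private|protected)\s*:|\bvirtual\b/;

function detectHeaderLanguage(source: string): 'c' | 'cpp' {
  const head = source.slice(0, HEADER_SNIFF_CHARS);
  return CPP_HINTS.test(head) ? 'cpp' : 'c';
}

console.log(detectHeaderLanguage('#pragma once\nint add(int a, int b);\n'));
console.log(detectHeaderLanguage('namespace geo { class Point {}; }\n'));
```

A keyword heuristic like this can misfire on comments or identifiers, which is the usual trade-off for avoiding a full parse just to pick a grammar.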

Grammar loading is deliberately lazy. `initGrammars()` initializes the `web-tree-sitter` runtime without loading grammar WASM files. `loadGrammarsForLanguages()` deduplicates requested languages, skips already-loaded or unavailable grammars, and loads WASM grammars sequentially. Pascal and Scala use bundled local WASM files under `src/extraction/wasm`; the other grammar files come from `tree-sitter-wasms`. Sources: [src/extraction/grammars.ts:1-7](), [src/extraction/grammars.ts:19-38](), [src/extraction/grammars.ts:92-140]()

Supported languages are those with a WASM grammar plus the custom extractor languages `svelte`, `vue`, and `liquid`; `unknown` is explicitly unsupported. Sources: [src/extraction/grammars.ts:202-227]()

## Worker-Based Parsing

`indexAll()` initializes the WASM runtime, scans files, detects frameworks, emits parsing progress, computes the needed languages, and loads grammars inside a parse worker when `parse-worker.js` exists. If the compiled worker is unavailable, it loads grammars locally and parses in-process; this is useful for test/runtime environments where compiled worker output may not exist. Sources: [src/extraction/index.ts:512-600]()

The worker protocol is small:

| Message | Direction | Effect |
|---|---|---|
| `load-grammars` | main -> worker | Load the requested language grammars. |
| `grammars-loaded` | worker -> main | Signal that parsing can begin. |
| `parse` | main -> worker | Detect language and run `extractFromSource()`. |
| `parse-result` | worker -> main | Return an `ExtractionResult`. |
| `shutdown` / `shutdown-ack` | main -> worker / worker -> main | Request shutdown; the worker acknowledges it. |

The worker also filters known noisy Emscripten abort lines from stderr, periodically resets per-language parsers after 5,000 parses, and exits on WASM memory corruption so the main thread can spawn a clean worker. Sources: [src/extraction/parse-worker.ts:13-56](), [src/extraction/parse-worker.ts:58-100]()

The orchestrator separately recycles the whole worker after 250 parses, applies per-file parse timeouts that scale with content size, rejects pending parses on worker crashes, and retries likely WASM-memory failures with fresh workers. Sources: [src/extraction/index.ts:36-50](), [src/extraction/index.ts:624-731](), [src/extraction/index.ts:881-984]()
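
The message table maps naturally onto a pair of discriminated unions. The sketch below is illustrative: the payload field names (`languages`, `id`, `filePath`, `content`, `result`), the `WorkerDeps` interface, and the synchronous handler are assumptions, and the real worker performs grammar loading asynchronously.

```typescript
type MainToWorker =
  | { type: 'load-grammars'; languages: string[] }
  | { type: 'parse'; id: number; filePath: string; content: string }
  | { type: 'shutdown' };

interface ExtractionLike {
  nodes: unknown[];
  errors: string[];
}

type WorkerToMain =
  | { type: 'grammars-loaded' }
  | { type: 'parse-result'; id: number; result: ExtractionLike }
  | { type: 'shutdown-ack' };

// Injected dependencies keep the sketch free of WASM and worker_threads.
interface WorkerDeps {
  loadGrammars(languages: string[]): void;
  extract(filePath: string, content: string): ExtractionLike;
}

// Hypothetical handler: one reply per incoming message type.
function handleWorkerMessage(msg: MainToWorker, deps: WorkerDeps): WorkerToMain {
  switch (msg.type) {
    case 'load-grammars':
      deps.loadGrammars(msg.languages);
      return { type: 'grammars-loaded' };
    case 'parse':
      return { type: 'parse-result', id: msg.id, result: deps.extract(msg.filePath, msg.content) };
    case 'shutdown':
      return { type: 'shutdown-ack' };
  }
}

const loaded: string[] = [];
const deps: WorkerDeps = {
  loadGrammars: (languages) => { loaded.push(...languages); },
  extract: (filePath) => ({ nodes: [{ kind: 'file', filePath }], errors: [] }),
};
console.log(handleWorkerMessage({ type: 'parse', id: 1, filePath: 'a.ts', content: '' }, deps));
```

Carrying an `id` on each parse request is what lets the main thread match results to pending parses and reject them if the worker crashes.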

## Extraction Results

The extraction result shape is shared across parsers: `nodes`, `edges`, `unresolvedReferences`, `errors`, and `durationMs`. Nodes represent files and symbols; edges represent relationships such as `contains`, `calls`, `imports`, `extends`, `implements`, `type_of`, `returns`, `instantiates`, and `decorates`. Unresolved references hold relationships that need later resolution against graph nodes. Sources: [src/types.ts:18-60](), [src/types.ts:97-186](), [src/types.ts:221-289]()

`TreeSitterExtractor.extract()` checks support, obtains a parser from the grammar cache, parses the source, creates a file node, pushes that file node as the root scope, walks the tree, then deletes the tree and releases the source string to reduce WASM and GC pressure. Sources: [src/extraction/tree-sitter.ts:140-239]()

During traversal, the extractor dispatches based on the active language extractor's node-type lists: functions, classes, methods, interfaces, structs, enums, aliases, properties, fields, variables, imports, calls, instantiations, and Rust impl items. `createNode()` assigns graph metadata and creates `contains` edges from the current scope. Sources: [src/extraction/tree-sitter.ts:263-385](), [src/extraction/tree-sitter.ts:390-433]()

Imports, calls, constructor invocations, decorators, inheritance, and type annotations are generally recorded as unresolved references first. For example, import extraction creates an `import` node and an unresolved `imports` reference; call extraction records a `calls` reference; instantiation extraction records an `instantiates` reference to the constructor class name. Sources: [src/extraction/tree-sitter.ts:1234-1270](), [src/extraction/tree-sitter.ts:1446-1454](), [src/extraction/tree-sitter.ts:1457-1506](), [src/extraction/tree-sitter.ts:1508-1569]()

## Language Extractor Contract

Per-language behavior is configured through the `LanguageExtractor` interface. It names AST node types for language concepts, field names for common syntax roles, and optional hooks for signatures, visibility, exports, async/static flags, imports, variables, custom visitors, body resolution, class classification, receiver types, and parser-misparse handling. Sources: [src/extraction/tree-sitter-types.ts:73-151](), [src/extraction/tree-sitter-types.ts:153-208]()

The extractor registry maps repository `Language` values to concrete language extractors. TypeScript and TSX share `typescriptExtractor`; JavaScript and JSX share `javascriptExtractor`; other languages have their own modules. Sources: [src/extraction/languages/index.ts:1-46]()

TypeScript extraction covers function declarations, arrow functions, function expressions, classes, abstract classes, methods, interfaces, enums, type aliases, imports, calls, and top-level variables. It also handles class-field arrow functions by resolving the nested arrow/function body, computes signatures from parameters and return types, walks parents to detect exports, and detects `const`. Sources: [src/extraction/languages/typescript.ts:4-58](), [src/extraction/languages/typescript.ts:59-118]()

Python extraction maps `function_definition` to both functions and methods depending on class scope, handles `class_definition`, import nodes, calls, assignments, signatures with return annotations, async functions, `@staticmethod`, and `from ... import ...` module extraction. Sources: [src/extraction/languages/python.ts:4-53]()

## Custom Extractors And Framework Add-Ons

`extractFromSource()` routes Svelte, Vue, Liquid, and Pascal DFM/FMX files to custom extractors instead of the generic tree-sitter extractor. All other supported languages go through `TreeSitterExtractor`. After the language pass, framework-specific extractors can run if `frameworkNames` are supplied, and their nodes and unresolved references are merged into the same result. Sources: [src/extraction/tree-sitter.ts:2487-2547]()

`indexAll()` detects frameworks once per full index from the scanned file list, caches the names on the orchestrator for the run, and passes those names into each parse call. Single-file indexing can also detect frameworks on demand if a full run has not populated the cache. Sources: [src/extraction/index.ts:437-507](), [src/extraction/index.ts:546-552](), [src/extraction/index.ts:1140-1145]()

## Persistence Into Graph Records

`storeExtractionResult()` hashes file content, skips unchanged files, deletes stale rows for changed files, filters invalid nodes, inserts valid nodes, filters edges so both endpoints exist, inserts unresolved references with denormalized file and language context, and upserts the file metadata record. Sources: [src/extraction/index.ts:1154-1225]()
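
The edge-filtering step can be pictured with a minimal sketch. The `SketchNode` and `SketchEdge` shapes and the `filterEdges` helper are hypothetical names used for illustration only; they are not the types or functions in `src/extraction/index.ts`.

```typescript
// Hypothetical, simplified record shapes; the real records carry more fields.
interface SketchNode { id: string; label: string }
interface SketchEdge { source: string; target: string; kind: string }

// Keep only edges whose source and target both refer to a node being stored.
function filterEdges(nodes: SketchNode[], edges: SketchEdge[]): SketchEdge[] {
  const ids = new Set(nodes.map((n) => n.id));
  return edges.filter((e) => ids.has(e.source) && ids.has(e.target));
}

const sketchNodes: SketchNode[] = [
  { id: "file:a.ts", label: "a.ts" },
  { id: "fn:a.ts:run", label: "run" },
];
const sketchEdges: SketchEdge[] = [
  { source: "file:a.ts", target: "fn:a.ts:run", kind: "contains" },
  // The target below was never extracted, so this edge is dropped.
  { source: "fn:a.ts:run", target: "fn:b.ts:missing", kind: "calls" },
];
const kept = filterEdges(sketchNodes, sketchEdges);
```

Dropping dangling edges before insertion is one way to keep a graph table consistent when foreign keys are enabled; cross-file relationships are recorded as unresolved references instead.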

Database writes are handled by `QueryBuilder`: nodes and edges are inserted in transactions, file records are upserted by path, deleting a file also deletes its nodes, and unresolved references are inserted in a batch transaction. Sources: [src/db/queries.ts:193-264](), [src/db/queries.ts:960-992](), [src/db/queries.ts:1074-1116](), [src/db/queries.ts:1160-1189]()

## Sync Path

Full indexing is not the only entry point. `sync()` initializes grammars, then tries `git status --porcelain --no-renames` to identify modified, added, and deleted files that still pass include/exclude filtering. Deleted files are removed from the database; modified and untracked files are hashed and re-indexed only when needed. If git change detection is unavailable, sync falls back to a full scan and compares current files against stored file records. Sources: [src/extraction/index.ts:218-268](), [src/extraction/index.ts:1227-1320]()
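
As a rough illustration of the git path, the sketch below classifies lines in the porcelain v1 format (two status characters, a space, then the path). The `ChangeSet` shape and `parsePorcelain` helper are hypothetical, and the sketch deliberately ignores quoted paths and combined states such as `AD`; the real logic in `sync()` may differ.

```typescript
// Hypothetical result shape for illustration.
interface ChangeSet { modified: string[]; added: string[]; deleted: string[] }

// Classify each line of `git status --porcelain --no-renames` output.
function parsePorcelain(output: string): ChangeSet {
  const changes: ChangeSet = { modified: [], added: [], deleted: [] };
  for (const line of output.split("\n")) {
    if (line.length < 4) continue; // skip blank or malformed lines
    const code = line.slice(0, 2); // index status + worktree status
    const filePath = line.slice(3);
    if (code === "??" || code.includes("A")) changes.added.push(filePath);
    else if (code.includes("D")) changes.deleted.push(filePath);
    else changes.modified.push(filePath);
  }
  return changes;
}

const porcelainSample = [
  " M src/index.ts",
  "?? src/new.ts",
  " D src/old.ts",
  "A  src/staged.ts",
].join("\n");
const changeSet = parsePorcelain(porcelainSample);
```

With `--no-renames`, a rename shows up as a delete plus an add, which is why the sketch does not need to handle `old -> new` path pairs.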

## Behavior Proven By Tests

The extraction test suite loads all grammars before tests, then verifies language detection, language support reporting, extraction behavior, scanner filtering, git submodules, embedded git repositories, and Scala support. Sources: [__tests__/extraction.test.ts:17-20](), [__tests__/extraction.test.ts:34-125]()

Key tested behaviors include:

| Area | Evidence |
|---|---|
| TypeScript extraction | Functions, classes, interfaces, calls, arrow/function-expression exports, type aliases, exported constants, file nodes, and containment edges. Sources: [__tests__/extraction.test.ts:128-214](), [__tests__/extraction.test.ts:216-314](), [__tests__/extraction.test.ts:316-530]() |
| Python, Go, Rust, Java, PHP, Swift, Kotlin, Dart | Representative symbol extraction and language-specific relationships such as Rust trait implementation references and Swift inheritance references. Sources: [__tests__/extraction.test.ts:532-725](), [__tests__/extraction.test.ts:727-965](), [__tests__/extraction.test.ts:967-1155]() |
| Scanner exclusions | `node_modules`, nested `node_modules`, `.git`, normalized paths, and `.codegraphignore`. Sources: [__tests__/extraction.test.ts:3006-3080]() |
| Git repository boundaries | Submodule files are included; embedded non-submodule repos are traversed; each embedded repo's `.gitignore` is respected. Sources: [__tests__/extraction.test.ts:3083-3205]() |
| Scala support | Detection, support reporting, classes, objects, traits, methods, signatures, and top-level functions. Sources: [__tests__/extraction.test.ts:3212-3299]() |

## Provider-Neutral Architecture Notes

This pipeline is BYOC/BYOK-friendly because the extraction path is local and provider-neutral: it reads files from the project root, optionally asks local git for visible or changed paths, parses with tree-sitter WASM grammars, uses worker threads for isolation, and writes graph records through the local database query layer. The extraction code shown here does not depend on a hosted model provider, proprietary API key, or connector-specific runtime. Sources: [src/extraction/index.ts:7-26](), [src/extraction/grammars.ts:9-11](), [src/extraction/parse-worker.ts:8-11](), [src/db/queries.ts:7-23]()

For Grok-Wiki or skill-pack integration, keep the same boundary: treat generated wiki context, solved-problem notes, strategy files, and skill catalogs as portable source inputs that can live in files, repositories, or catalogs. They should orient documentation, but code and tests should remain the source of truth for implementation claims.

In short, CodeGraph indexing is a local, staged pipeline: discover eligible files, detect languages, load only needed grammars, parse safely through a worker when possible, extract graph-shaped records, and persist only valid nodes, edges, unresolved references, and file metadata. Sources: [src/extraction/index.ts:512-600](), [src/extraction/tree-sitter.ts:2487-2547](), [src/extraction/index.ts:1154-1225]()

---

## 05. Reference Resolution & Framework Routes

> How unresolved references become graph edges through import resolution, name matching, path aliases, framework detectors, and route resolvers for web frameworks across JavaScript, Python, Ruby, Java, Go, Rust, C#, Swift, PHP, Svelte, and Vue.

- Page Markdown: https://grok-wiki.com/public/wiki/colbymchenry-codegraph-89e8b2c4d43a/pages/05-reference-resolution-framework-routes.md
- Generated: 2026-05-22T16:27:32.269Z

### Source Files

- `src/resolution/index.ts`
- `src/resolution/types.ts`
- `src/resolution/import-resolver.ts`
- `src/resolution/name-matcher.ts`
- `src/resolution/path-aliases.ts`
- `src/resolution/frameworks/index.ts`
- `src/resolution/frameworks/express.ts`
- `__tests__/frameworks.test.ts`

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:
- [src/resolution/index.ts](src/resolution/index.ts)
- [src/resolution/types.ts](src/resolution/types.ts)
- [src/resolution/import-resolver.ts](src/resolution/import-resolver.ts)
- [src/resolution/name-matcher.ts](src/resolution/name-matcher.ts)
- [src/resolution/path-aliases.ts](src/resolution/path-aliases.ts)
- [src/resolution/frameworks/index.ts](src/resolution/frameworks/index.ts)
- [src/resolution/frameworks/express.ts](src/resolution/frameworks/express.ts)
- [src/resolution/frameworks/nestjs.ts](src/resolution/frameworks/nestjs.ts)
- [src/resolution/frameworks/python.ts](src/resolution/frameworks/python.ts)
- [src/resolution/frameworks/laravel.ts](src/resolution/frameworks/laravel.ts)
- [src/resolution/frameworks/ruby.ts](src/resolution/frameworks/ruby.ts)
- [src/resolution/frameworks/java.ts](src/resolution/frameworks/java.ts)
- [src/resolution/frameworks/go.ts](src/resolution/frameworks/go.ts)
- [src/resolution/frameworks/rust.ts](src/resolution/frameworks/rust.ts)
- [src/resolution/frameworks/csharp.ts](src/resolution/frameworks/csharp.ts)
- [src/resolution/frameworks/swift.ts](src/resolution/frameworks/swift.ts)
- [src/resolution/frameworks/svelte.ts](src/resolution/frameworks/svelte.ts)
- [src/resolution/frameworks/vue.ts](src/resolution/frameworks/vue.ts)
- [src/extraction/tree-sitter.ts](src/extraction/tree-sitter.ts)
- [src/extraction/index.ts](src/extraction/index.ts)
- [__tests__/frameworks.test.ts](__tests__/frameworks.test.ts)
- [__tests__/frameworks-integration.test.ts](__tests__/frameworks-integration.test.ts)
- [__tests__/resolution.test.ts](__tests__/resolution.test.ts)
- [docs/plans/2026-04-24-framework-resolver-extract.md](docs/plans/2026-04-24-framework-resolver-extract.md)
</details>

# Reference Resolution & Framework Routes

This page explains how CodeGraph turns unresolved references produced during extraction into graph edges. The core path is intentionally local and provider-neutral: source files are parsed, unresolved references are persisted, and `ReferenceResolver` later resolves them using framework-specific rules, import resolution, name matching, path aliases, and re-export chasing.

The route side of the system uses the same pipeline. Framework extractors create `route` nodes plus unresolved references from routes to handlers; those references then flow through the normal resolver, so a Django `path("users/", UserListView.as_view())`, Express `app.get('/users', listUsers)`, or Axum `.route("/users", get(list_users))` can become a graph edge instead of remaining text in a route file.

## Mental Model

```text
source file
  |
  | tree-sitter or custom extractor
  v
nodes + unresolvedReferences
  |
  | framework extractors may add route nodes + route->handler unresolved refs
  v
ReferenceResolver.resolveAll()
  |
  | 1. framework resolve()
  | 2. import resolve()
  | 3. name matcher
  v
ResolvedRef[]
  |
  | createEdges()
  v
graph edges with confidence + resolvedBy metadata
```

`FrameworkResolver.extract()` is the bridge between web framework syntax and ordinary graph resolution. It returns `{ nodes, references }`, where route nodes enter the node set and handler references enter the unresolved-reference set. Later, `CodeGraph.resolveReferences()` loads unresolved references from the database and persists resolved edges.

Sources: [src/resolution/types.ts:112-143](), [src/extraction/tree-sitter.ts:2480-2547](), [src/index.ts:565-578]()

## Resolver Context and Strategy Order

`ReferenceResolver` owns the resolution context used by all strategies. The context wraps database queries and filesystem reads with small caches for file nodes, file contents, import mappings, re-exports, symbol names, lower-case names, qualified names, known files, and project aliases. It warms lightweight file/name caches before resolving and clears them when the resolver is reinitialized.

The central `resolveOne()` order is:

| Order | Strategy | What it handles | Early return |
|---:|---|---|---|
| 0 | built-in/external skip | JS, React hooks, Python, Go, Pascal built-ins and similar external names | returns `null` |
| 1 | framework resolver | framework conventions such as services, controllers, routes, compiler macros | returns immediately at confidence `>= 0.9` |
| 2 | import resolver | local imports, aliases, namespace imports, default/named imports, barrels | returns immediately at confidence `>= 0.9` |
| 3 | name matcher | file-path, qualified-name, method-call, exact-name, fuzzy matches | participates in best-candidate selection |

The resolver also uses a pre-filter: if a reference name is not in the known symbol-name set and does not match a local import, it skips the expensive strategies. The exception for local imports matters because a barrel re-export chain can expose a name that no file declares under that exact name.

Sources: [src/resolution/index.ts:119-151](), [src/resolution/index.ts:159-185](), [src/resolution/index.ts:190-320](), [src/resolution/index.ts:390-496]()
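
The strategy order in the table can be sketched as a small chain. All names here (`SketchRef`, `SketchResolved`, `resolveOneSketch`) are hypothetical, and one detail is an assumption: the sketch keeps sub-threshold framework and import hits in the candidate pool, which the table does not state.

```typescript
// Hypothetical shapes; not the real CodeGraph types.
interface SketchRef { name: string }
interface SketchResolved { targetId: string; confidence: number; resolvedBy: string }
type Strategy = (ref: SketchRef) => SketchResolved | null;

const HIGH_CONFIDENCE = 0.9;

function resolveOneSketch(
  ref: SketchRef,
  isBuiltin: (name: string) => boolean,
  framework: Strategy,
  imports: Strategy,
  nameMatcher: Strategy,
): SketchResolved | null {
  // Step 0: built-in and external names are skipped outright.
  if (isBuiltin(ref.name)) return null;
  const candidates: SketchResolved[] = [];
  // Steps 1-2: framework, then imports, returning early on a confident hit.
  for (const strategy of [framework, imports]) {
    const hit = strategy(ref);
    if (hit && hit.confidence >= HIGH_CONFIDENCE) return hit;
    if (hit) candidates.push(hit); // assumption: weaker hits still compete
  }
  // Step 3: the name matcher joins best-candidate selection.
  const byName = nameMatcher(ref);
  if (byName) candidates.push(byName);
  candidates.sort((a, b) => b.confidence - a.confidence);
  return candidates[0] ?? null;
}

const noHit: Strategy = () => null;
const importHit: Strategy = () => ({ targetId: "fn:users.ts:listUsers", confidence: 0.95, resolvedBy: "import" });
const nameHit: Strategy = () => ({ targetId: "fn:other.ts:listUsers", confidence: 0.6, resolvedBy: "exact-match" });
const isBuiltinName = (n: string): boolean => n === "console";

const viaImport = resolveOneSketch({ name: "listUsers" }, isBuiltinName, noHit, importHit, nameHit);
const viaName = resolveOneSketch({ name: "listUsers" }, isBuiltinName, noHit, noHit, nameHit);
const skipped = resolveOneSketch({ name: "console" }, isBuiltinName, noHit, importHit, nameHit);
```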

## From Resolved References to Graph Edges

A successful resolution produces a `ResolvedRef` with a target node id, confidence, and `resolvedBy` method. `createEdges()` then converts those into graph edges and keeps the original reference line and column. Two semantic promotions happen at this final step:

| Original edge kind | Target shape | Final edge kind | Why |
|---|---|---|---|
| `extends` | interface or protocol | `implements` | class/struct inheritance syntax often targets an interface-like symbol |
| `calls` | class or struct | `instantiates` | languages such as Python and Ruby use `Foo()` for object construction |

The output edge metadata keeps `confidence` and `resolvedBy`, so downstream callers can tell whether an edge came from an import, a framework rule, an exact name match, fuzzy matching, or another strategy.

Sources: [src/resolution/types.ts:31-60](), [src/resolution/index.ts:498-540](), [__tests__/resolution.test.ts:610-647](), [__tests__/resolution.test.ts:664-689]()
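
The two promotions in the table reduce to a small pure function. This is a sketch with hypothetical type names (`SketchEdgeKind`, `SketchTargetKind`); the real `createEdges()` works on full node and reference records.

```typescript
// Hypothetical, trimmed-down kind unions for illustration.
type SketchEdgeKind = "extends" | "implements" | "calls" | "instantiates" | "references";
type SketchTargetKind = "class" | "struct" | "interface" | "protocol" | "function";

// Promote an edge kind based on what the resolved target turned out to be.
function promoteEdgeKind(kind: SketchEdgeKind, target: SketchTargetKind): SketchEdgeKind {
  if (kind === "extends" && (target === "interface" || target === "protocol")) {
    return "implements";
  }
  if (kind === "calls" && (target === "class" || target === "struct")) {
    return "instantiates";
  }
  return kind; // everything else keeps its original kind
}

const promotedInheritance = promoteEdgeKind("extends", "protocol");
const promotedCall = promoteEdgeKind("calls", "class");
const plainCall = promoteEdgeKind("calls", "function");
```

Promotion can only happen after resolution, because the extractor sees `Foo()` or `class A: B` without knowing what kind of symbol the name refers to.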

## Import Resolution

`resolveImportPath()` first classifies external imports, then resolves relative imports or aliased/absolute imports. For JS/TS, external classification checks Node built-ins but consults project aliases before treating bare specifiers as npm packages, so prefixes like `@utils/*` can still resolve locally.

Supported extension resolution is language-specific. JS/TS tries source files and index files, Python tries `.py` and `__init__.py`, Rust tries `.rs` and `mod.rs`, and Java, C#, PHP, Ruby, and Go have their expected source extensions.

Import mapping extraction is regex-based and currently covers JS/TS `import` and `require`, Python `from/import`, Go import declarations, and PHP `use` statements. `resolveViaImport()` matches the local reference name against those mappings, resolves the mapped source path, and then looks for the exported symbol in the resolved file.

Sources: [src/resolution/import-resolver.ts:12-56](), [src/resolution/import-resolver.ts:66-112](), [src/resolution/import-resolver.ts:146-202](), [src/resolution/import-resolver.ts:204-225](), [src/resolution/import-resolver.ts:230-328](), [src/resolution/import-resolver.ts:331-452](), [src/resolution/import-resolver.ts:585-636]()
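
Extension resolution can be sketched as "generate candidates, take the first that exists". Only the Python and Rust suffixes below come from the text above; the `MODULE_SUFFIXES` table, `candidatePaths`, and `resolveFirstExisting` are hypothetical names, and the real resolver's lists and ordering may differ.

```typescript
// Illustrative suffix table; only these two languages are filled in.
const MODULE_SUFFIXES: Record<string, string[]> = {
  python: [".py", "/__init__.py"],
  rust: [".rs", "/mod.rs"],
};

// Expand an extensionless module path into concrete file candidates.
function candidatePaths(basePath: string, language: string): string[] {
  return (MODULE_SUFFIXES[language] ?? []).map((suffix) => basePath + suffix);
}

// Return the first candidate that is a known project file, if any.
function resolveFirstExisting(
  basePath: string,
  language: string,
  knownFiles: Set<string>,
): string | null {
  for (const candidate of candidatePaths(basePath, language)) {
    if (knownFiles.has(candidate)) return candidate;
  }
  return null;
}

const knownProjectFiles = new Set<string>(["pkg/models/__init__.py", "src/net.rs"]);
const pythonHit = resolveFirstExisting("pkg/models", "python", knownProjectFiles);
const rustHit = resolveFirstExisting("src/net", "rust", knownProjectFiles);
const missingHit = resolveFirstExisting("src/absent", "rust", knownProjectFiles);
```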

### Path Aliases

Path aliases are loaded from root `tsconfig.json` or `jsconfig.json`. The loader parses JSON-with-comments, reads `compilerOptions.baseUrl` and `compilerOptions.paths`, supports TypeScript-style `*` wildcards, sorts patterns by specificity, and deliberately does not follow `extends` chains or bundler configs yet.

When an alias matches, `applyAliases()` returns candidate project-relative paths in replacement priority order. The import resolver then applies the language extension list to each candidate. If no project alias applies, the resolver still falls back to conventional aliases such as `@/`, `~/`, `@src/`, `src/`, `@app/`, and `app/`.

Sources: [src/resolution/path-aliases.ts:1-24](), [src/resolution/path-aliases.ts:30-55](), [src/resolution/path-aliases.ts:57-122](), [src/resolution/path-aliases.ts:138-200](), [src/resolution/path-aliases.ts:202-242](), [src/resolution/import-resolver.ts:173-202](), [__tests__/resolution.test.ts:715-783]()
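
A minimal sketch of wildcard alias matching follows, assuming tsconfig-style patterns with at most one `*`. The names `applyAliasesSketch`, `aliasPrefix`, and `joinBase` are hypothetical, and the specificity rule used here (longest prefix before `*` first) is an assumption about what "sorted by specificity" means.

```typescript
type AliasMap = Record<string, string[]>;

// Text before the wildcard, or the whole pattern when there is none.
function aliasPrefix(pattern: string): string {
  const star = pattern.indexOf("*");
  return star === -1 ? pattern : pattern.slice(0, star);
}

function joinBase(baseUrl: string, rel: string): string {
  return baseUrl === "" || baseUrl === "." ? rel : `${baseUrl}/${rel}`;
}

// Return project-relative candidates for the first (most specific) match.
function applyAliasesSketch(specifier: string, baseUrl: string, paths: AliasMap): string[] {
  const patterns = Object.keys(paths).sort(
    (a, b) => aliasPrefix(b).length - aliasPrefix(a).length,
  );
  for (const pattern of patterns) {
    const targets = paths[pattern] ?? [];
    const star = pattern.indexOf("*");
    if (star === -1) {
      if (specifier === pattern) return targets.map((t) => joinBase(baseUrl, t));
      continue;
    }
    const prefix = pattern.slice(0, star);
    const suffix = pattern.slice(star + 1);
    const longEnough = specifier.length >= prefix.length + suffix.length;
    if (longEnough && specifier.startsWith(prefix) && specifier.endsWith(suffix)) {
      const captured = specifier.slice(prefix.length, specifier.length - suffix.length);
      // Substitute the captured text into each replacement, in listed order.
      return targets.map((t) => joinBase(baseUrl, t.replace("*", () => captured)));
    }
  }
  return [];
}

const aliasCandidates = applyAliasesSketch("@utils/strings", "src", {
  "@/*": ["*"],
  "@utils/*": ["utils/*", "shared/utils/*"],
});
```

The candidates are still extensionless at this point; the import resolver applies the language extension list to each one afterwards.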

### Re-export Chains

Barrel files are resolved by `findExportedSymbol()`. It first checks direct exported declarations in the resolved file. If none match, it asks the context for JS/TS re-exports and follows named or wildcard exports recursively, with a depth cap and a visited set to avoid cycles. Named re-export aliases are handled by chasing the original upstream name.

```typescript
// Example barrel file; the chasing logic lives in src/resolution/import-resolver.ts
export { signIn as login } from './auth';
// an import of "login" is chased upstream as "signIn"
```

Tests cover both wildcard barrel chains and renamed re-exports, proving that imports through `all.ts` or `index.ts` can still attach callers to the original declaration.

Sources: [src/resolution/import-resolver.ts:515-583](), [src/resolution/import-resolver.ts:638-730](), [__tests__/resolution.test.ts:786-847]()
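
The recursive chase can be sketched over an in-memory export table. Everything here is hypothetical (`ReExportSketch`, `ModuleSketch`, `findExportedSketch`, and the depth cap of 5); it only illustrates the combination of direct lookup, wildcard and named re-exports, a depth limit, and a visited set.

```typescript
// A re-export entry; omit `exported` to model `export * from '...'`.
interface ReExportSketch { from: string; exported?: string; original?: string }
interface ModuleSketch { declarations: string[]; reExports: ReExportSketch[] }

const MAX_REEXPORT_DEPTH = 5; // assumed cap; the real limit may differ

function findExportedSketch(
  modules: Map<string, ModuleSketch>,
  file: string,
  symbol: string,
  depth: number = 0,
  visited: Set<string> = new Set<string>(),
): string | null {
  const key = `${file}#${symbol}`;
  if (depth > MAX_REEXPORT_DEPTH || visited.has(key)) return null; // cap + cycle guard
  visited.add(key);
  const mod = modules.get(file);
  if (!mod) return null;
  // Direct declaration in this file wins.
  if (mod.declarations.indexOf(symbol) !== -1) return key;
  for (const re of mod.reExports) {
    if (re.exported === undefined) {
      // Wildcard: look for the same name upstream.
      const hit = findExportedSketch(modules, re.from, symbol, depth + 1, visited);
      if (hit) return hit;
    } else if (re.exported === symbol) {
      // Named (possibly renamed): chase the original upstream name.
      const hit = findExportedSketch(modules, re.from, re.original ?? symbol, depth + 1, visited);
      if (hit) return hit;
    }
  }
  return null;
}

const moduleTable = new Map<string, ModuleSketch>([
  ["index.ts", { declarations: [], reExports: [{ from: "all.ts" }] }],
  ["all.ts", { declarations: [], reExports: [{ from: "auth.ts", exported: "login", original: "signIn" }] }],
  ["auth.ts", { declarations: ["signIn"], reExports: [] }],
]);
const loginTarget = findExportedSketch(moduleTable, "index.ts", "login");
```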

## Name Matching

The name matcher is the general fallback after framework and import strategies. It tries:

| Strategy | Example | Confidence behavior |
|---|---|---|
| file path | `snippets/drawer-menu.liquid` | exact path, suffix path, or only-file fallback |
| qualified name | `User.save`, `User::save` | exact qualified name first, then suffix-style qualified match |
| method call | `obj.method`, `Class::method` | class-local method, capitalized receiver, then receiver/class word overlap |
| exact name | `navigate` | unique match, or best scored candidate among duplicates |
| fuzzy | case-insensitive callable lookup | last resort, lower confidence |

When duplicate names exist, `findBestMatch()` scores candidates by same file, directory proximity, same language, reference kind bias, export status, and line proximity. This is why a Python monorepo can prefer `apps/app_a/src/server.py::navigate` when the caller is also in `apps/app_a`.

Sources: [src/resolution/name-matcher.ts:10-63](), [src/resolution/name-matcher.ts:65-147](), [src/resolution/name-matcher.ts:149-278](), [src/resolution/name-matcher.ts:290-397](), [src/resolution/name-matcher.ts:399-463](), [__tests__/resolution.test.ts:38-144]()
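
Duplicate-name scoring can be illustrated with a reduced version that covers same file, directory proximity, same language, and export status. The weights are invented for this sketch, reference-kind bias and line proximity are left out, and the names (`CandidateSketch`, `scoreCandidate`, `pickBest`) are hypothetical rather than the real `findBestMatch()` internals.

```typescript
interface CandidateSketch { id: string; filePath: string; language: string; exported: boolean }
interface RefSiteSketch { filePath: string; language: string }

function dirOf(filePath: string): string {
  const slash = filePath.lastIndexOf("/");
  return slash === -1 ? "" : filePath.slice(0, slash);
}

// Count leading directory segments shared by two directories.
function sharedDirDepth(a: string, b: string): number {
  const left = a.split("/");
  const right = b.split("/");
  let n = 0;
  while (n < left.length && n < right.length && left[n] === right[n] && left[n] !== "") n++;
  return n;
}

// Invented weights: same file dominates, then proximity, language, export.
function scoreCandidate(c: CandidateSketch, site: RefSiteSketch): number {
  let score = 0;
  if (c.filePath === site.filePath) score += 100;
  score += 10 * sharedDirDepth(dirOf(c.filePath), dirOf(site.filePath));
  if (c.language === site.language) score += 5;
  if (c.exported) score += 1;
  return score;
}

function pickBest(cands: CandidateSketch[], site: RefSiteSketch): CandidateSketch | null {
  let best: CandidateSketch | null = null;
  let bestScore = -Infinity;
  for (const c of cands) {
    const s = scoreCandidate(c, site);
    if (s > bestScore) { best = c; bestScore = s; }
  }
  return best;
}

const navigateCandidates: CandidateSketch[] = [
  { id: "b", filePath: "apps/app_b/src/server.py", language: "python", exported: true },
  { id: "a", filePath: "apps/app_a/src/server.py", language: "python", exported: true },
];
const bestNavigate = pickBest(navigateCandidates, { filePath: "apps/app_a/src/client.py", language: "python" });
```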

## Framework Registry and Detection

All framework resolvers are registered in one array. The registry includes PHP Laravel; JS/TS Express, NestJS, React, Svelte, and Vue; Python Django, Flask, and FastAPI; Ruby Rails; Java Spring; Go; Rust; C# ASP.NET; and SwiftUI, UIKit, and Vapor. `detectFrameworks()` runs each resolver’s project-level `detect()` safely and ignores resolver exceptions. `getApplicableFrameworks()` filters detected resolvers by file language, treating resolvers without a `languages` list as universal.

Framework detection happens once per indexing run in `ExtractionOrchestrator`. The orchestrator builds a filesystem-backed detection context before graph nodes exist, stores detected framework names, and passes them into each parse call. Worker-thread parsing receives those names too, so framework extraction is not limited to the in-process fallback path.

Sources: [src/resolution/frameworks/index.ts:23-105](), [src/extraction/index.ts:437-507](), [src/extraction/index.ts:546-553](), [src/extraction/index.ts:692-731](), [src/extraction/parse-worker.ts:58-67]()
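
The detection and filtering behavior described above can be sketched as two small functions. `SketchResolver`, `detectFrameworksSketch`, and `applicableFor` are hypothetical stand-ins; the real `detect()` takes a detection context, which is omitted here.

```typescript
// Hypothetical resolver shape; `languages` is optional on purpose.
interface SketchResolver {
  name: string;
  languages?: string[];
  detect: () => boolean;
}

// Run every detector, treating a thrown error as "not detected".
function detectFrameworksSketch(resolvers: SketchResolver[]): SketchResolver[] {
  const detected: SketchResolver[] = [];
  for (const resolver of resolvers) {
    try {
      if (resolver.detect()) detected.push(resolver);
    } catch {
      // A failing detector must not abort indexing.
    }
  }
  return detected;
}

// Resolvers without a `languages` list apply to every file.
function applicableFor(detected: SketchResolver[], language: string): SketchResolver[] {
  return detected.filter(
    (r) => r.languages === undefined || r.languages.indexOf(language) !== -1,
  );
}

const detectedSketch = detectFrameworksSketch([
  { name: "express", languages: ["javascript", "typescript"], detect: () => true },
  { name: "django", languages: ["python"], detect: () => { throw new Error("unreadable manifest"); } },
  { name: "universal", detect: () => true },
]);
const forPython = applicableFor(detectedSketch, "python").map((r) => r.name);
const forTypeScript = applicableFor(detectedSketch, "typescript").map((r) => r.name);
```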

## Route Extraction by Framework

Framework route extractors generally follow the same pattern: strip comments, find route syntax, create a `route` node, then create an unresolved `references` or `imports` entry from the route node to the handler symbol. UI-file-routing frameworks may emit route nodes without handler references.

| Language | Resolver | Route shapes extracted | Handler reference behavior |
|---|---|---|---|
| JavaScript/TypeScript | Express | `app`/`router` HTTP methods and `use` with path | links the last handler argument, treating earlier handlers as middleware |
| JavaScript/TypeScript | NestJS | HTTP decorators, GraphQL ops, message/event patterns, WebSocket messages | links to the decorated method; joins controller prefix with method path |
| JavaScript/TypeScript | React/Next | pages/app default-export routes | emits route/component nodes; no route-handler references yet |
| Python | Django | `path`, `re_path`, `url`, `include` | links to view class/function or imports included module |
| Python | Flask/FastAPI | route decorators | links to decorated function |
| PHP | Laravel | `Route::method`, `resource`, `apiResource` | links to method or controller class depending on syntax |
| Ruby | Rails | `controller#action` route declarations | links to action name |
| Java | Spring | mapping annotations | links to the next Java method declaration |
| Go | Gin, Echo, Fiber, Chi, `net/http`-like | route calls | links to tail identifier of handler expression |
| Rust | Actix, Rocket, Axum | Actix/Rocket attributes and Axum `.route()` | links to following function or Axum handler |
| C# | ASP.NET | attributes and Minimal APIs | links to next method or handler identifier |
| Swift | Vapor | `app/router/routes.method(..., use: handler)` | links to last handler segment |
| Svelte | SvelteKit | route files under `src/routes` | emits route nodes; also resolves runes, stores, and `$lib` imports |
| Vue | Nuxt | `pages/` and `server/api/` files | emits route nodes; also resolves macros, auto-imports, virtual modules, and aliases |

Sources: [src/resolution/frameworks/express.ts:101-146](), [src/resolution/frameworks/nestjs.ts:107-187](), [src/resolution/frameworks/python.ts:41-90](), [src/resolution/frameworks/python.ts:139-192](), [src/resolution/frameworks/laravel.ts:95-176](), [src/resolution/frameworks/ruby.ts:90-132](), [src/resolution/frameworks/java.ts:121-169](), [src/resolution/frameworks/go.ts:83-132](), [src/resolution/frameworks/rust.ts:92-172](), [src/resolution/frameworks/csharp.ts:119-202](), [src/resolution/frameworks/swift.ts:337-384](), [src/resolution/frameworks/react.ts:77-173](), [src/resolution/frameworks/svelte.ts:148-179](), [src/resolution/frameworks/vue.ts:190-220]()

## Framework-specific Reference Resolution

Framework `resolve()` methods handle conventions that are not explicit imports. Examples include Express middleware names, controller methods, service/helper method references, Laravel `Model::method()` and `Controller@method`, Rails model/controller/helper/service conventions, Spring service/repository/controller/entity naming, Go handler/service/middleware/model naming, Rust handler/service/struct/module naming, ASP.NET controller/service/repository/model/view-model naming, and Swift view/controller/model/middleware naming.

Some framework resolvers intentionally resolve framework-provided names back to the source node rather than user code. Svelte runes and SvelteKit `$app`/`$env` modules are compiler/framework-provided. Vue compiler macros, Nuxt auto-imports, and Nuxt virtual modules are handled similarly. This avoids wasting resolver work on symbols that should not have local declarations.

Sources: [src/resolution/frameworks/express.ts:54-99](), [src/resolution/frameworks/laravel.ts:47-93](), [src/resolution/frameworks/ruby.ts:34-88](), [src/resolution/frameworks/java.ts:52-119](), [src/resolution/frameworks/go.ts:27-81](), [src/resolution/frameworks/rust.ts:31-90](), [src/resolution/frameworks/csharp.ts:50-117](), [src/resolution/frameworks/swift.ts:37-78](), [src/resolution/frameworks/swift.ts:158-212](), [src/resolution/frameworks/swift.ts:294-335](), [src/resolution/frameworks/svelte.ts:69-146](), [src/resolution/frameworks/vue.ts:103-188]()

## Rust Workspace Module Resolution

Rust has one extra resolver helper for workspace crates. `getCargoWorkspaceCrateMap()` parses the root `Cargo.toml`, expands workspace members, reads member package names, and maps both hyphenated and underscore crate names to member directories. The Rust resolver checks local `src/name.rs` and `src/name/mod.rs`, then workspace member `src/lib.rs` or `src/main.rs`. Workspace hits get higher confidence so cross-crate imports can beat accidental same-file name matches.

Sources: [src/resolution/frameworks/cargo-workspace.ts:1-12](), [src/resolution/frameworks/cargo-workspace.ts:142-163](), [src/resolution/frameworks/cargo-workspace.ts:169-224](), [src/resolution/frameworks/rust.ts:71-87](), [src/resolution/frameworks/rust.ts:215-239](), [__tests__/frameworks.test.ts:544-610]()
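
The double registration of crate names can be shown in a few lines. `WorkspaceMemberSketch` and `crateNameMap` are hypothetical names; parsing `Cargo.toml` and expanding workspace members is assumed to have happened already.

```typescript
// Hypothetical pre-parsed workspace member.
interface WorkspaceMemberSketch { packageName: string; dir: string }

// Register each crate under its package name and its underscore form,
// since Rust source refers to `my-core` as `my_core`.
function crateNameMap(members: WorkspaceMemberSketch[]): Map<string, string> {
  const byName = new Map<string, string>();
  for (const member of members) {
    byName.set(member.packageName, member.dir);
    byName.set(member.packageName.replace(/-/g, "_"), member.dir);
  }
  return byName;
}

const crateDirs = crateNameMap([
  { packageName: "my-core", dir: "crates/my-core" },
  { packageName: "cli", dir: "crates/cli" },
]);
const coreDir = crateDirs.get("my_core");
```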

## Tests Worth Reading First

For a first 30 minutes in this area, read the tests in this order:

1. `__tests__/frameworks.test.ts`: unit fixtures for `FrameworkResolver.extract()`, applicable-framework filtering, route extraction across frameworks, NestJS decorator edge cases, Laravel/Rails/Spring/Go/Rust examples, and Rust workspace crates.
2. `__tests__/frameworks-integration.test.ts`: end-to-end Django proof that a route node links to `UserListView`.
3. `__tests__/resolution.test.ts`: name matching, import path resolution, framework detection, edge kind promotion, path aliases, and re-export chains.

The planning doc is useful historical context because it states the intended architecture: replace a dead route-node-only hook with `extract()` returning both framework nodes and unresolved references, then let the existing resolver produce final edges. Treat the current code and tests as the source of truth where they differ.

Sources: [__tests__/frameworks.test.ts:5-39](), [__tests__/frameworks.test.ts:43-176](), [__tests__/frameworks.test.ts:178-457](), [__tests__/frameworks.test.ts:459-542](), [__tests__/frameworks-integration.test.ts:20-58](), [__tests__/resolution.test.ts:272-354](), [__tests__/resolution.test.ts:715-847](), [docs/plans/2026-04-24-framework-resolver-extract.md:5-9]()

## Extension Guidance

To add or change a resolver, keep the architecture portable: implement a local `FrameworkResolver` that depends on `ResolutionContext`, file contents, and indexed nodes rather than a hosted model, proprietary service, or vendor-specific connector. Register it in `src/resolution/frameworks/index.ts`, give it a `languages` list when possible, write small route extraction fixtures, and add at least one resolution fixture if the resolver has convention-based `resolve()` behavior.

For route support, prefer this contract:

```typescript
extract(filePath, content) {
  return {
    nodes: [routeNode],
    references: [{
      fromNodeId: routeNode.id,
      referenceName: handlerName,
      referenceKind: 'references',
      filePath,
      language,
      line,
      column: 0,
    }],
  };
}
```

That keeps framework knowledge at the edge of extraction while preserving the shared import/name/framework resolution pipeline for final graph edges.

Sources: [src/resolution/types.ts:123-143](), [src/resolution/frameworks/index.ts:96-105](), [src/extraction/tree-sitter.ts:2522-2545](), [docs/plans/2026-04-24-framework-resolver-extract.md:21-32]()
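
To make the contract concrete, here is a self-contained sketch of an Express-style extractor. It is not the code in `src/resolution/frameworks/express.ts`: the type names, the node id format, and the regular expression are assumptions, comment stripping is omitted, and only identifier handlers (not inline arrow functions) are recognized.

```typescript
interface RouteNodeSketch { id: string; kind: "route"; name: string; filePath: string; line: number }
interface RouteRefSketch {
  fromNodeId: string; referenceName: string; referenceKind: "references";
  filePath: string; language: string; line: number; column: number;
}
interface RouteExtractionSketch { nodes: RouteNodeSketch[]; references: RouteRefSketch[] }

function extractExpressRoutesSketch(
  filePath: string,
  content: string,
  language: string,
): RouteExtractionSketch {
  const nodes: RouteNodeSketch[] = [];
  const references: RouteRefSketch[] = [];
  // Matches e.g. app.get('/users', auth, listUsers) on a single line.
  const pattern = /\b(?:app|router)\.(get|post|put|patch|delete)\(\s*['"]([^'"]+)['"]\s*,([^)]*)\)/g;
  content.split("\n").forEach((text, index) => {
    pattern.lastIndex = 0;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(text)) !== null) {
      const method = (match[1] ?? "").toUpperCase();
      const routePath = match[2] ?? "";
      const line = index + 1;
      const routeNode: RouteNodeSketch = {
        id: `route:${filePath}:${line}:${method}:${routePath}`, // hypothetical id scheme
        kind: "route",
        name: `${method} ${routePath}`,
        filePath,
        line,
      };
      nodes.push(routeNode);
      // The last argument is the handler; earlier ones are middleware.
      const handlers = (match[3] ?? "").split(",").map((s) => s.trim()).filter((s) => s.length > 0);
      const handlerName = handlers[handlers.length - 1];
      if (handlerName !== undefined && /^[A-Za-z_$][\w$.]*$/.test(handlerName)) {
        references.push({
          fromNodeId: routeNode.id,
          referenceName: handlerName,
          referenceKind: "references",
          filePath,
          language,
          line,
          column: 0,
        });
      }
    }
  });
  return { nodes, references };
}

const routeSource = "app.get('/users', auth, listUsers);\nrouter.post(\"/users\", createUser);";
const extractedRoutes = extractExpressRoutesSketch("src/routes.js", routeSource, "javascript");
```

The extractor never looks up `listUsers` itself; it only records the name, and the shared resolver decides which node the reference points to.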

In short: CodeGraph resolves references by combining local project facts, framework conventions, import graphs, alias rules, and name matching. Framework routes are not a separate graph mechanism; they are source-derived `route` nodes plus ordinary unresolved references that become edges through the same resolver used for code symbols. Sources: [src/resolution/index.ts:450-496](), [src/extraction/tree-sitter.ts:2522-2547](), [__tests__/frameworks-integration.test.ts:40-55]()

---

## 06. Freshness, Storage & What to Try Next

> The closing map for the first 30 minutes: understand the SQLite schema, query and traversal layers, file watcher, git hook sync path, safety checks, and the concrete next experiments that prove the local graph is fresh and useful.

- Page Markdown: https://grok-wiki.com/public/wiki/colbymchenry-codegraph-89e8b2c4d43a/pages/06-freshness-storage-what-to-try-next.md
- Generated: 2026-05-22T16:27:10.789Z

### Source Files

- `src/db/schema.sql`
- `src/db/index.ts`
- `src/db/queries.ts`
- `src/db/sqlite-adapter.ts`
- `src/graph/traversal.ts`
- `src/graph/queries.ts`
- `src/sync/watcher.ts`
- `src/sync/git-hooks.ts`

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:
- [src/db/schema.sql](src/db/schema.sql)
- [src/db/index.ts](src/db/index.ts)
- [src/db/queries.ts](src/db/queries.ts)
- [src/db/sqlite-adapter.ts](src/db/sqlite-adapter.ts)
- [src/graph/traversal.ts](src/graph/traversal.ts)
- [src/graph/queries.ts](src/graph/queries.ts)
- [src/sync/watcher.ts](src/sync/watcher.ts)
- [src/sync/git-hooks.ts](src/sync/git-hooks.ts)
- [src/sync/watch-policy.ts](src/sync/watch-policy.ts)
- [src/sync/index.ts](src/sync/index.ts)
- [src/index.ts](src/index.ts)
- [src/extraction/index.ts](src/extraction/index.ts)
- [src/bin/codegraph.ts](src/bin/codegraph.ts)
- [src/utils.ts](src/utils.ts)
- [__tests__/sync.test.ts](__tests__/sync.test.ts)
- [__tests__/watcher.test.ts](__tests__/watcher.test.ts)
- [__tests__/git-hooks.test.ts](__tests__/git-hooks.test.ts)
</details>

# Freshness, Storage & What to Try Next

This page is the closing map for the first 30 minutes in CodeGraph: where the local graph lives, how it is queried, how it stays fresh, and which experiments prove that the graph is both current and useful. Freshness is local and file-backed: CodeGraph stores indexed symbols in `.codegraph/codegraph.db`, watches or syncs source changes, and exposes status/search/context commands without depending on a hosted model provider.

The provider-neutral posture matters for BYOC/BYOK workflows: the storage, sync, and query layers are ordinary repository files plus SQLite state. Any assistant or integration can consume the same graph through CLI or MCP surfaces without assuming a specific model vendor.

## Storage: The Local SQLite Graph

CodeGraph stores four main data shapes: `nodes`, `edges`, `files`, and `unresolved_refs`. Nodes are code symbols with source ranges and metadata; edges connect symbols; files record content hashes and indexing timestamps; unresolved refs hold references that need a later resolution pass. The schema also includes `schema_versions`, `project_metadata`, indexes for common lookups, and an FTS5 table kept current by triggers on `nodes`.

```mermaid
erDiagram
  files ||--o{ nodes : "file_path"
  nodes ||--o{ edges : "source"
  nodes ||--o{ edges : "target"
  nodes ||--o{ unresolved_refs : "from_node_id"
  schema_versions {
    integer version PK
    integer applied_at
    text description
  }
  files {
    text path PK
    text content_hash
    text language
    integer indexed_at
    integer node_count
  }
  nodes {
    text id PK
    text kind
    text name
    text qualified_name
    text file_path
    integer updated_at
  }
  edges {
    integer id PK
    text source
    text target
    text kind
    text provenance
  }
  unresolved_refs {
    integer id PK
    text from_node_id
    text reference_name
    text file_path
    text language
  }
```

Sources: [src/db/schema.sql:19-81](), [src/db/schema.sql:87-150]()

The default database path is `.codegraph/codegraph.db`. Initialization creates the parent directory, opens SQLite, enables foreign keys, sets WAL-oriented pragmas for the native backend, executes `schema.sql`, and records the current schema version when needed. Existing databases are opened through the same backend factory and then migrated if their schema version is behind.

Sources: [src/db/index.ts:32-68](), [src/db/index.ts:73-101](), [src/db/index.ts:181-190]()

### Backend Choice: Native First, WASM Fallback

The SQLite adapter presents one interface over `better-sqlite3` and `node-sqlite3-wasm`. It tries the native backend first and falls back to WASM if native loading fails. The WASM adapter rewrites named parameters, emulates transactions, adapts pragmas, and finalizes open statements on close so file locks are released.

In practice, `codegraph status` reports `Backend: native` or a WASM warning. WASM is supported deliberately for portability, but the status output flags it because indexing and sync are slower there.

Sources: [src/db/sqlite-adapter.ts:8-23](), [src/db/sqlite-adapter.ts:72-123](), [src/db/sqlite-adapter.ts:124-229](), [src/db/sqlite-adapter.ts:231-267](), [src/index.ts:615-623](), [src/bin/codegraph.ts:721-735]()
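
The native-first selection can be sketched with injected loaders, so the example runs without either SQLite package installed. `BackendSketch`, `BackendLoader`, and `openBackendSketch` are hypothetical names; the real adapter also normalizes parameters, transactions, and pragmas.

```typescript
interface BackendSketch { kind: "native" | "wasm" }
type BackendLoader = () => BackendSketch;

// Prefer the native backend; fall back to WASM only if loading it throws.
function openBackendSketch(loadNative: BackendLoader, loadWasm: BackendLoader): BackendSketch {
  try {
    return loadNative();
  } catch {
    return loadWasm();
  }
}

const brokenNative: BackendLoader = () => { throw new Error("native module failed to load"); };
const workingNative: BackendLoader = () => ({ kind: "native" });
const wasmLoader: BackendLoader = () => ({ kind: "wasm" });

const fallbackBackend = openBackendSketch(brokenNative, wasmLoader);
const preferredBackend = openBackendSketch(workingNative, wasmLoader);
```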

## Query Layers: Prepared Storage APIs, Then Graph Views

`QueryBuilder` is the low-level database API. It converts SQLite rows into typed objects, keeps a small node cache, and owns prepared statements for insert/update/delete, node lookup, edge lookup, file records, unresolved refs, stats, and metadata. File freshness depends on the `files.content_hash` values written during extraction and compared during sync.

A short storage excerpt:

```sql
-- src/db/queries.ts (SQL text of the file upsert)
INSERT INTO files (path, content_hash, language, size, modified_at, indexed_at, node_count, errors)
VALUES (@path, @contentHash, @language, @size, @modifiedAt, @indexedAt, @nodeCount, @errors)
ON CONFLICT(path) DO UPDATE SET
  content_hash = @contentHash,
  indexed_at = @indexedAt,
  node_count = @nodeCount
```

Sources: [src/db/queries.ts:84-151](), [src/db/queries.ts:193-253](), [src/db/queries.ts:1004-1048](), [src/db/queries.ts:1074-1149](), [src/db/queries.ts:1361-1407]()

`GraphTraverser` is the traversal layer. It uses `QueryBuilder` to perform BFS, DFS, callers, callees, call graphs, type hierarchy, usages, impact radius, paths, ancestors, and children. BFS has bounded defaults, supports direction and edge/node filters, and prioritizes `contains` and `calls` edges before broader references.

Sources: [src/graph/traversal.ts:13-20](), [src/graph/traversal.ts:48-125](), [src/graph/traversal.ts:207-227](), [src/graph/traversal.ts:229-345](), [src/graph/traversal.ts:456-581](), [src/graph/traversal.ts:590-640]()
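To make the bounded, priority-ordered BFS concrete, here is a small in-memory sketch. It is not `GraphTraverser` itself: the `Edge` shape, the `references` edge kind, and the default limits (`maxDepth = 2`, `maxNodes = 100`) are assumptions chosen for the example. Only the idea of following `contains` and `calls` edges before broader ones comes from the description above.

```typescript
type EdgeKind = "contains" | "calls" | "references";

interface Edge {
  from: string;
  to: string;
  kind: EdgeKind;
}

// Lower number = followed earlier within a node's outgoing edges.
const PRIORITY: Record<EdgeKind, number> = { contains: 0, calls: 1, references: 2 };

function bfs(edges: Edge[], start: string, maxDepth = 2, maxNodes = 100): string[] {
  // Group outgoing edges by source node.
  const outgoing = new Map<string, Edge[]>();
  for (const e of edges) {
    const list = outgoing.get(e.from) ?? [];
    list.push(e);
    outgoing.set(e.from, list);
  }

  const visited = new Set<string>([start]);
  const order: string[] = [start];
  let frontier = [start];

  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const id of frontier) {
      const sorted = [...(outgoing.get(id) ?? [])].sort(
        (a, b) => PRIORITY[a.kind] - PRIORITY[b.kind],
      );
      for (const e of sorted) {
        if (visited.has(e.to)) continue;
        if (order.length >= maxNodes) return order; // node budget exhausted
        visited.add(e.to);
        order.push(e.to);
        next.push(e.to);
      }
    }
    frontier = next;
  }
  return order;
}

const edges: Edge[] = [
  { from: "file", to: "helper", kind: "references" },
  { from: "file", to: "main", kind: "contains" },
  { from: "main", to: "run", kind: "calls" },
  { from: "run", to: "deep", kind: "calls" },
];
console.log(bfs(edges, "file").join(" > ")); // file > main > helper > run
```

In the example, `main` is visited before `helper` because `contains` outranks `references`, and `deep` is never reached because it sits three hops away from the start.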

`GraphQueryManager` builds higher-level questions from traversal and direct queries. It can assemble a node context, file dependencies and dependents, exported symbols, module structure, circular dependencies, node metrics, filtered subgraphs, and dead-code candidates. These are graph conveniences, not a separate store.

Sources: [src/graph/queries.ts:14-21](), [src/graph/queries.ts:23-108](), [src/graph/queries.ts:110-188](), [src/graph/queries.ts:230-330](), [src/graph/queries.ts:332-427]()
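Circular-dependency detection is a good example of a question that can be answered from edges alone. The sketch below uses a depth-first search with a "visiting" marker over a hypothetical file-to-dependencies map. It shows one standard technique, not necessarily the one `GraphQueryManager` uses, and it reports cycles found through back edges rather than enumerating every elementary cycle.

```typescript
// file -> files it depends on (hypothetical input shape)
type DepGraph = Record<string, string[]>;

function findCycles(graph: DepGraph): string[][] {
  const cycles: string[][] = [];
  const state = new Map<string, "visiting" | "done">();
  const stack: string[] = [];

  const visit = (node: string): void => {
    state.set(node, "visiting");
    stack.push(node);
    for (const dep of graph[node] ?? []) {
      const seen = state.get(dep);
      if (seen === "visiting") {
        // Back edge: the cycle is the stack slice from dep to the current node.
        cycles.push([...stack.slice(stack.indexOf(dep)), dep]);
      } else if (seen === undefined) {
        visit(dep);
      }
    }
    stack.pop();
    state.set(node, "done");
  };

  for (const node of Object.keys(graph)) {
    if (!state.has(node)) visit(node);
  }
  return cycles;
}

const deps: DepGraph = {
  "a.ts": ["b.ts"],
  "b.ts": ["c.ts"],
  "c.ts": ["a.ts"],
  "d.ts": ["a.ts"],
};
console.log(findCycles(deps).map((c) => c.join(" -> ")));
// [ 'a.ts -> b.ts -> c.ts -> a.ts' ]
```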

## Freshness: Hashes, Sync, Watcher, Git Hooks

Freshness starts in extraction. CodeGraph hashes file contents with SHA-256, skips storing unchanged files, deletes prior rows for changed files, inserts valid nodes/edges/unresolved refs, and writes a fresh `FileRecord` with `contentHash`, size, `modifiedAt`, `indexedAt`, node count, and any errors.

Sources: [src/extraction/index.ts:89-94](), [src/extraction/index.ts:1154-1225]()
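The hash check at the heart of this step can be sketched with Node's built-in `crypto` module. `hashContent` and `needsReindex` are hypothetical helper names, and the `Map` stands in for the `files.content_hash` column. Whether the real code hashes a string or a raw buffer is not shown here, so treat the UTF-8 string input as an assumption.

```typescript
import { createHash } from "node:crypto";

// SHA-256 of the file contents, hex-encoded.
function hashContent(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// storedHashes: path -> previously stored hash (stand-in for files.content_hash).
function needsReindex(storedHashes: Map<string, string>, path: string, content: string): boolean {
  return storedHashes.get(path) !== hashContent(content);
}

const stored = new Map<string, string>([["src/a.ts", hashContent("export const a = 1;")]]);

console.log(needsReindex(stored, "src/a.ts", "export const a = 1;")); // false: unchanged
console.log(needsReindex(stored, "src/a.ts", "export const a = 2;")); // true: content changed
console.log(needsReindex(stored, "src/b.ts", "export const b = 1;")); // true: never indexed
```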

`sync()` has two paths. In a git repo it uses `git status --porcelain --no-renames` to inspect only changed, added, and deleted files, then still hashes modified or untracked files against the DB so an untracked file is not repeatedly treated as new after it has been indexed. Outside git, or if git fails, it scans the current file set and compares it against tracked `files` records.

Sources: [src/extraction/index.ts:223-268](), [src/extraction/index.ts:1227-1371](), [src/extraction/index.ts:1374-1465]()
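A simplified sketch of the git path follows: parse porcelain output into changed and deleted paths, then drop paths whose current hash already matches the stored one. `parsePorcelain` and `filterStale` are hypothetical names, the parser assumes the v1 short format (`XY path`), and it ignores quoted paths with special characters.

```typescript
interface GitChanges {
  changed: string[];
  deleted: string[];
}

// Parses `git status --porcelain --no-renames` output.
// Do not trim the output first: a leading space is part of the two-character status.
function parsePorcelain(output: string): GitChanges {
  const result: GitChanges = { changed: [], deleted: [] };
  for (const line of output.split("\n")) {
    if (line.length < 4) continue; // "XY " plus at least one path character
    const status = line.slice(0, 2);
    const path = line.slice(3);
    if (status.includes("D")) result.deleted.push(path);
    else result.changed.push(path);
  }
  return result;
}

// Keep only paths that are unknown to the DB or whose hash differs.
function filterStale(
  paths: string[],
  currentHashes: Map<string, string>,
  storedHashes: Map<string, string>,
): string[] {
  return paths.filter((p) => {
    const storedHash = storedHashes.get(p);
    return storedHash === undefined || storedHash !== currentHashes.get(p);
  });
}

const porcelain = " M src/a.ts\n?? src/new.ts\n D src/old.ts\n";
const changes = parsePorcelain(porcelain);

// src/new.ts is still untracked in git but was already indexed with the same hash.
const current = new Map([["src/a.ts", "h2"], ["src/new.ts", "h9"]]);
const indexed = new Map([["src/a.ts", "h1"], ["src/new.ts", "h9"]]);

console.log(changes.deleted); // [ 'src/old.ts' ]
console.log(filterStale(changes.changed, current, indexed)); // [ 'src/a.ts' ]
```

The second step is what keeps an untracked but already indexed file from being re-indexed on every sync.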

At the `CodeGraph` class level, `indexAll()`, `indexFiles()`, and `sync()` are protected by an in-process mutex and a cross-process lock file. That is important because the CLI, MCP server, and git hooks can all try to write the same SQLite database. If a lock looks stale, the lock utility can remove it; otherwise it tells users to run `codegraph unlock`.

Sources: [src/index.ts:139-164](), [src/index.ts:370-410](), [src/index.ts:417-490](), [src/utils.ts:177-239](), [src/utils.ts:241-260]()
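One plausible way to decide that a PID lock "looks stale" is to check whether the recorded process still exists. The sketch below assumes that approach. `LockInfo`, `isProcessAlive`, and `isLockStale` are hypothetical names, and the real lock utility may use different or additional criteria, such as lock age.

```typescript
// Hypothetical contents of a lock file.
interface LockInfo {
  pid: number;
}

function isProcessAlive(pid: number): boolean {
  try {
    // Signal 0 performs the existence/permission check without sending a signal.
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but we are not allowed to signal it.
    return (err as { code?: string }).code === "EPERM";
  }
}

// A lock is treated as stale when its owning process is gone.
// The liveness probe is injectable so the rule can be exercised without real PIDs.
function isLockStale(lock: LockInfo, isAlive: (pid: number) => boolean = isProcessAlive): boolean {
  return !isAlive(lock.pid);
}

console.log(isLockStale({ pid: process.pid })); // false: this process is alive
console.log(isLockStale({ pid: 12345 }, () => false)); // true: owner reported dead
```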

### Watcher Path

`FileWatcher` uses recursive `fs.watch`, filters events through the same include/exclude rules as extraction, ignores `.codegraph/` writes, and debounces changes before calling `sync()`. If a sync is already running, later changes set `hasChanges` and schedule another pass after the current sync finishes.

Sources: [src/sync/watcher.ts:40-49](), [src/sync/watcher.ts:82-139](), [src/sync/watcher.ts:168-206](), [src/index.ts:505-528]()
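The debounce-and-rerun behaviour is easier to reason about as a small state machine. The sketch below models it as a pure function. The event and action names are invented for illustration, and the real `FileWatcher` drives timers and `sync()` directly rather than returning actions.

```typescript
interface WatchState {
  timerArmed: boolean;
  syncing: boolean;
  hasChanges: boolean;
}

type WatchEvent = "change" | "timerFired" | "syncFinished";
type Action = "armTimer" | "startSync" | "none";

function step(state: WatchState, event: WatchEvent): { state: WatchState; action: Action } {
  switch (event) {
    case "change":
      // While a sync runs, only remember that something changed.
      if (state.syncing) return { state: { ...state, hasChanges: true }, action: "none" };
      // Otherwise (re)start the debounce timer.
      return { state: { ...state, timerArmed: true }, action: "armTimer" };
    case "timerFired":
      return { state: { ...state, timerArmed: false, syncing: true }, action: "startSync" };
    case "syncFinished":
      // Changes that arrived mid-sync schedule another pass.
      if (state.hasChanges) {
        return { state: { timerArmed: true, syncing: false, hasChanges: false }, action: "armTimer" };
      }
      return { state: { ...state, syncing: false }, action: "none" };
  }
}

let state: WatchState = { timerArmed: false, syncing: false, hasChanges: false };
const actions: Action[] = [];
const events: WatchEvent[] = ["change", "change", "timerFired", "change", "syncFinished"];
for (const event of events) {
  const result = step(state, event);
  state = result.state;
  actions.push(result.action);
}
console.log(actions.join(", ")); // armTimer, armTimer, startSync, none, armTimer
```

Two quick changes collapse into one timer, and the change that arrives during the sync triggers a follow-up pass instead of a concurrent one.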

The watch policy can disable live watching in known-problem environments. `CODEGRAPH_NO_WATCH=1` is checked first and always wins, `CODEGRAPH_FORCE_WATCH=1` overrides auto-detection, and auto-detection disables recursive watching under WSL when the project sits on a Windows drive mount such as `/mnt/c`, because watcher setup there can be too slow for MCP startup. The sync module exports the watcher, watch policy, and git hook helpers from one place.

Sources: [src/sync/watch-policy.ts:71-98](), [src/sync/index.ts:1-25]()
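The precedence order can be written as a short pure function. This is a sketch under stated assumptions: `shouldWatch` and `WatchInputs` are hypothetical names, and the `/mnt/<letter>` pattern is a guess at how a Windows drive mount might be recognised. Only the two environment variable names and their ordering come from the description above.

```typescript
interface WatchInputs {
  env: Record<string, string | undefined>;
  projectRoot: string;
  isWsl: boolean;
}

function shouldWatch({ env, projectRoot, isWsl }: WatchInputs): boolean {
  // 1. Explicit opt-out always wins.
  if (env.CODEGRAPH_NO_WATCH === "1") return false;
  // 2. Explicit opt-in skips auto-detection.
  if (env.CODEGRAPH_FORCE_WATCH === "1") return true;
  // 3. Auto-detection: WSL project on a Windows drive mount such as /mnt/c.
  if (isWsl && /^\/mnt\/[a-z](\/|$)/.test(projectRoot)) return false;
  return true;
}

console.log(shouldWatch({ env: {}, projectRoot: "/mnt/c/work/app", isWsl: true })); // false
console.log(
  shouldWatch({ env: { CODEGRAPH_FORCE_WATCH: "1" }, projectRoot: "/mnt/c/work/app", isWsl: true }),
); // true
console.log(
  shouldWatch({
    env: { CODEGRAPH_NO_WATCH: "1", CODEGRAPH_FORCE_WATCH: "1" },
    projectRoot: "/home/me/app",
    isWsl: false,
  }),
); // false
```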

### Git Hook Sync Path

Git hooks are the fallback freshness path when the live watcher is not desirable. The default hooks are `post-commit`, `post-merge`, and `post-checkout`. The inserted shell block checks whether `codegraph` is on `PATH` and runs `codegraph sync` in the background, so git operations are not blocked. The installer uses marker comments so repeated installs replace CodeGraph’s block and preserve user-authored hook content.

```sh
# src/sync/git-hooks.ts
if command -v codegraph >/dev/null 2>&1; then
  ( codegraph sync >/dev/null 2>&1 & ) >/dev/null 2>&1
fi
```

Sources: [src/sync/git-hooks.ts:1-14](), [src/sync/git-hooks.ts:23-35](), [src/sync/git-hooks.ts:72-84](), [src/sync/git-hooks.ts:116-159](), [src/sync/git-hooks.ts:161-208]()
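The marker-comment technique that makes repeated installs idempotent can be sketched like this. The marker strings and the `upsertHookBlock` name are hypothetical, and the real installer's markers and edge-case handling may differ.

```typescript
// Hypothetical marker comments delimiting the managed block.
const BEGIN = "# >>> codegraph >>>";
const END = "# <<< codegraph <<<";

// Replace the managed block if present; otherwise append it after existing content.
function upsertHookBlock(existing: string, body: string): string {
  const block = `${BEGIN}\n${body}\n${END}`;
  const start = existing.indexOf(BEGIN);
  const end = existing.indexOf(END);
  if (start !== -1 && end > start) {
    return existing.slice(0, start) + block + existing.slice(end + END.length);
  }
  let base = existing;
  if (base === "") base = "#!/bin/sh\n"; // new hook file needs a shebang
  else if (!base.endsWith("\n")) base += "\n";
  return `${base}${block}\n`;
}

const userHook = "#!/bin/sh\necho 'user hook'\n";
const syncBody = "( codegraph sync >/dev/null 2>&1 & ) >/dev/null 2>&1";

const once = upsertHookBlock(userHook, syncBody);
const twice = upsertHookBlock(once, syncBody);
console.log(once === twice); // true: reinstalling does not duplicate the block
console.log(twice.startsWith(userHook)); // true: user-authored content is preserved
```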

## Safety Checks Worth Knowing

| Safety check | Where it lives | Why it matters |
| --- | --- | --- |
| Project paths and file paths are validated against the root | `validatePathWithinRoot()` and extraction calls | Blocks path traversal before file reads |
| Sensitive system and home directories are rejected as project roots | `validateProjectPath()` | Avoids accidental indexing of dangerous locations |
| Cross-process writes use a PID lock file | `FileLock` plus `CodeGraph.indexAll()`/`sync()` | Prevents CLI, MCP, and hooks from writing concurrently |
| Watcher ignores `.codegraph/` | `FileWatcher.start()` | Avoids feedback loops from DB writes |
| Sync filters include/exclude patterns | `shouldIncludeFile()` and git change detection | Keeps unsupported or excluded files out of refresh work |
| Parser work has size checks, worker timeouts, and worker recycling | extraction orchestrator | Keeps large or pathological files from freezing the whole index |

Sources: [src/utils.ts:49-106](), [src/extraction/index.ts:104-126](), [src/extraction/index.ts:751-823](), [src/extraction/index.ts:602-731](), [src/sync/watcher.ts:106-118](), [src/index.ts:370-490]()
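The first row of the table, keeping file paths inside the project root, is a common pattern worth seeing in code. The sketch below is a generic containment check built on Node's `path` module. `isWithinRoot` is a hypothetical name, not the body of `validatePathWithinRoot()`, and this version does not resolve symlinks.

```typescript
import * as path from "node:path";

// True when `candidate`, resolved against `root`, stays inside `root`.
function isWithinRoot(root: string, candidate: string): boolean {
  const resolvedRoot = path.resolve(root);
  const resolved = path.resolve(resolvedRoot, candidate);
  const rel = path.relative(resolvedRoot, resolved);
  if (rel === "") return true; // the root itself
  // Escapes look like ".." or "../something"; a file named "..foo" is still inside.
  const escapes = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  return !escapes;
}

console.log(isWithinRoot("/repo", "src/a.ts")); // true
console.log(isWithinRoot("/repo", "../etc/passwd")); // false
console.log(isWithinRoot("/repo", "/repo/src/../../x")); // false
```

Resolving both paths before comparing them is what defeats `..` segments hidden in the middle of an otherwise plausible path.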

## What to Try Next

Use these as concrete experiments, not just commands to memorize.

| Experiment | Command or action | What proves it worked |
| --- | --- | --- |
| Build the graph from scratch | `codegraph init -i` | `.codegraph/codegraph.db` exists and `codegraph status` reports files, nodes, edges, DB size, and backend |
| Check freshness before editing | `codegraph status --json` | `pendingChanges` is all zero when the DB hashes match the working tree |
| Modify a known source file | Edit a function name, then run `codegraph status` | Pending changes show a modified file |
| Apply the change | `codegraph sync` | The command reports changed files, then `codegraph status` says the index is up to date |
| Prove old symbols disappear | Search for the old name after sync | The old symbol is gone and the new symbol is searchable |
| Verify deletion handling | Delete an indexed file, then run `codegraph sync` | Removed count increments and symbols from that file disappear |
| Inspect backend health | `codegraph status` | `Backend: native` is ideal; `wasm` means the portable fallback is active |
| Test watcher behavior through MCP/server mode | `codegraph serve --mcp` or use the configured MCP server | Watcher diagnostics say whether auto-sync is active or unavailable |
| Test hook fallback | Install/init in a git repo, then inspect hooks | Hooks include one CodeGraph marker block and invoke `codegraph sync` guarded by `command -v codegraph` |

Sources: [src/bin/codegraph.ts:391-472](), [src/bin/codegraph.ts:602-662](), [src/bin/codegraph.ts:664-775](), [src/bin/codegraph.ts:1098-1153](), [__tests__/sync.test.ts:92-152](), [__tests__/sync.test.ts:200-304](), [__tests__/git-hooks.test.ts:42-80]()

## First-30-Minute Reading Order

1. Start with `src/db/schema.sql` to understand what can be known: symbols, relationships, files, unresolved references, and metadata.
2. Read `src/extraction/index.ts` around `storeExtractionResult()` and `sync()` to understand freshness by content hash.
3. Read `src/db/queries.ts` for the storage API that every graph operation uses.
4. Read `src/graph/traversal.ts` and `src/graph/queries.ts` to see how callers, callees, impact, file dependencies, and context are assembled.
5. Read `src/sync/watcher.ts`, `src/sync/watch-policy.ts`, and `src/sync/git-hooks.ts` to understand automatic refresh versus fallback refresh.
6. Finish with `src/bin/codegraph.ts` status/sync output so you know how implementation state becomes user-facing diagnostics.

Sources: [src/db/schema.sql:19-81](), [src/extraction/index.ts:1154-1371](), [src/db/queries.ts:143-183](), [src/graph/traversal.ts:31-39](), [src/graph/queries.ts:11-21](), [src/sync/watcher.ts:40-49](), [src/sync/watch-policy.ts:71-98](), [src/bin/codegraph.ts:602-775]()

## Closing Summary

The mental model is simple: CodeGraph is a local SQLite graph plus a freshness loop. Extraction writes file hashes and symbol relationships, `QueryBuilder` reads and updates them, traversal builds useful graph answers, and freshness is maintained by manual sync, a debounced `fs.watch` watcher, or opt-in git hooks. The fastest confidence check is to edit one source file, run `codegraph status`, run `codegraph sync`, and verify that both search results and pending-change counts reflect the new state.

Sources: [src/extraction/index.ts:1227-1371](), [src/bin/codegraph.ts:641-651](), [src/bin/codegraph.ts:757-773]()

---