# Core concepts

> Three-layer model (editor, knowledge engine, markdown files), filesystem-as-database persistence, link graph and backlinks, and git-backed attribution.

- Repository: sashimikun/open-knowledge
- GitHub: https://github.com/sashimikun/open-knowledge
- Human docs: https://grok-wiki.com/public/docs/sashimikun-open-knowledge-5c45105c876e
- Complete Markdown: https://grok-wiki.com/public/docs/sashimikun-open-knowledge-5c45105c876e/llms-full.txt

## Source Files

- `docs/content/reference/core-concepts.md`
- `packages/server/src/mcp/tools/index.ts`
- `packages/server/src/mcp/tools/links.ts`
- `packages/core/src/protocol-version.ts`
- `packages/core/src/markdown/entity-ref-guard.ts`
- `packages/server/src/mcp/agent-identity.ts`

---


Open Knowledge stores knowledge as markdown under a configured `content.dir` and serves it through a Hocuspocus-backed editor and an MCP server (`packages/server`). It also maintains a per-branch link index and a shadow git repo (`.git/ok` or `.git/ok-{slug}`) for attribution history. The editor, the MCP server, and direct file edits all operate on the same files; protocol compatibility is pinned at `PROTOCOL_VERSION = 1`.

## Three-layer model

The runtime splits into an editing surface, a consistency engine, and the content tree. Each layer reads and writes the same markdown files on disk.

```mermaid
flowchart TB
  subgraph surface["Editing surface"]
    Editor["Web editor / desktop app<br/>packages/app"]
    CLI["ok CLI — start, init, share"]
  end

  subgraph engine["Knowledge engine"]
    MCP["MCP server — 17 tools<br/>packages/server"]
    Hocus["Hocuspocus / Yjs server<br/>collaboration + HTTP APIs"]
    Index["BacklinkIndex<br/>per-branch link graph"]
    Shadow["Shadow git repo<br/>.git/ok"]
  end

  subgraph storage["Content layer"]
    FS["Markdown files<br/>content.dir"]
    Upstream["Project git<br/>.git"]
  end

  Editor --> Hocus
  CLI --> MCP
  MCP --> Hocus
  MCP --> FS
  Hocus --> FS
  Hocus --> Index
  Hocus --> Shadow
  FS --> Upstream
```

| Layer | Package / surface | Responsibility |
| --- | --- | --- |
| Editor | `packages/app`, desktop shell, `ok start --open` | WYSIWYG and source-mode editing, properties pane, timeline, rich MDX embeds |
| Knowledge engine | `packages/server` MCP + Hocuspocus | Frontmatter consistency, link-graph maintenance, agent writes, version history |
| Content | Files under `content.dir` | Durable, portable markdown; versioned by project git |

All three layers target the **same files**, so you can edit through the editor, through MCP write tools, or with any text editor on disk. The engine enforces consistency only for edits routed through it; raw file edits remain valid.

The MCP surface is **agent-agnostic**: any MCP-capable client (Claude Code, Cursor, Codex, or others) connects via the registered `open-knowledge` server entry. Agent identity is captured per connection (`connectionId`, `displayName`, `colorSeed`) and forwarded on writes through the `x-ok-connection-id` header.

<Info>
  Two read paths are server-free: `exec` and `preview_url` hit the filesystem directly. Graph reads (`links`), search, history, and all writes route through the running Hocuspocus server. See [Collaboration server](/collaboration-server).
</Info>

## Filesystem as database

Open Knowledge has **no separate database**. Persistence is the file system plus git.

| Property | Mechanism |
| --- | --- |
| Content root | `content.dir` in `.ok/config.yml` (default `"."` — project root) |
| Exclusions | `.okignore` patterns; system paths under `.ok/` |
| Engine scope | Only files admitted by the content filter inside `content.dir` |
| Portability | Plain `.md` / `.mdx`; readable with `cat`, `grep`, `git diff` |

<ParamField body="content.dir" type="string">
  Relative path from the project root to the knowledge-base directory. Resolved by `resolveContentDir(config, projectDir)`. Change it in `.ok/config.yml` to narrow scope (for example `docs`).
</ParamField>
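As a rough illustration of the resolution step, the sketch below resolves `content.dir` against the project directory. It is not the project's `resolveContentDir`: the name `resolveContentDirSketch`, the `OkConfigSketch` type, and the rule that rejects paths escaping the project root are assumptions made for this example. POSIX path semantics are used so the result is the same on every platform.

```typescript
import { posix as path } from "node:path";

// Hypothetical config shape; only `content.dir` is taken from the documentation.
interface OkConfigSketch {
  content?: { dir?: string };
}

// Illustrative stand-in for `resolveContentDir(config, projectDir)`.
// Assumptions: the default is "." and a path that leaves the project root is rejected.
function resolveContentDirSketch(config: OkConfigSketch, projectDir: string): string {
  const dir = config.content?.dir ?? ".";
  const resolved = path.resolve(projectDir, dir);
  const rel = path.relative(projectDir, resolved);
  if (rel === ".." || rel.startsWith("../") || path.isAbsolute(rel)) {
    throw new Error(`content.dir escapes the project root: ${dir}`);
  }
  return resolved;
}
```

With an empty config the sketch returns the project root itself, which matches the documented default of `"."`.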

The knowledge engine is a **management layer**, not a gatekeeper. It maintains frontmatter, link targets, and reference integrity when edits flow through MCP or the editor, but direct filesystem edits are always permitted. On the next server reconcile, the backlink index and shadow repo pick up offline changes.

## Link graph and backlinks

Internal cross-references use standard markdown links or wiki-link syntax. The `BacklinkIndex` parses both forms per git branch and maintains directed `backward` and `forward` maps.

**Markdown links**

```markdown
[Auth overview](./auth/overview.md)
[Root note](/getting-started.md)
```

**Wiki links**

```markdown
[[target-doc]]
[[target-doc#section|display label]]
```

`classifyMarkdownHref` resolves relative paths against the source document; `[[...]]` targets are classified separately. Entity references (`&amp;`, `&#123;`) are protected during the markdown pipeline so link parsing does not corrupt HTML entities.
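To make the two link forms concrete, here is a minimal regex-based extractor. It is a sketch only and does not reproduce `classifyMarkdownHref` or the project's markdown pipeline; the `ParsedLink` type and `extractLinks` function are hypothetical names, and the patterns ignore edge cases such as links inside code fences.

```typescript
// Hypothetical result type for this example.
type ParsedLink =
  | { form: "markdown"; label: string; href: string }
  | { form: "wiki"; target: string; anchor?: string; label?: string };

function extractLinks(source: string): ParsedLink[] {
  const links: ParsedLink[] = [];

  // [[target]], [[target#anchor]], [[target|label]], [[target#anchor|label]]
  const wiki = /\[\[([^\]|#]+)(?:#([^\]|]+))?(?:\|([^\]]+))?\]\]/g;
  let m: RegExpExecArray | null;
  while ((m = wiki.exec(source)) !== null) {
    links.push({ form: "wiki", target: m[1].trim(), anchor: m[2]?.trim(), label: m[3]?.trim() });
  }

  // [label](href); a leading "!" marks an image, which is skipped.
  const md = /(!?)\[([^\]\[]+)\]\(([^)\s]+)\)/g;
  while ((m = md.exec(source)) !== null) {
    if (m[1] === "!") continue;
    links.push({ form: "markdown", label: m[2], href: m[3] });
  }
  return links;
}
```

Wiki links are collected first and markdown links second, so the output order reflects link form rather than position in the document.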

### Backlinks

When document A links to document B, the index automatically records the inverse edge on B. Backlinks are never authored by hand; they are derived from forward links at index time.

<ResponseField name="backlinks" type="array">
  Entries with `source` (referring doc), optional `anchor`, and `snippet` (context around the link).
</ResponseField>
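The derivation can be sketched as a pair of maps that are kept in step whenever a document is reindexed. This is an illustrative model, not the actual `BacklinkIndex`: the class name `BacklinkIndexSketch` and its methods are hypothetical, and it tracks only document paths, without `anchor` or `snippet`.

```typescript
// Illustrative model of directed forward/backward maps.
class BacklinkIndexSketch {
  readonly forward = new Map<string, Set<string>>();
  readonly backward = new Map<string, Set<string>>();

  // Replace all outgoing links of `source` and update the inverse map to match.
  reindex(source: string, targets: string[]): void {
    const previous = this.forward.get(source);
    if (previous) {
      // Drop stale inverse edges left over from the previous version of `source`.
      previous.forEach((old) => this.backward.get(old)?.delete(source));
    }
    this.forward.set(source, new Set(targets));
    for (const target of targets) {
      const sources = this.backward.get(target) ?? new Set<string>();
      sources.add(source);
      this.backward.set(target, sources);
    }
  }

  // Documents that link to `doc`, sorted for stable output.
  backlinksOf(doc: string): string[] {
    return Array.from(this.backward.get(doc) ?? []).sort();
  }
}
```

Because `reindex` removes the old inverse edges before adding new ones, deleting a link from a document also removes the corresponding backlink.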

### Link-graph views

The MCP `links` tool exposes six `kind` values. Pass one kind or an array (for example `["dead", "orphans", "hubs"]`) for a single-call graph audit. In a multi-kind call, a kind that fails is reported in an `errors` map, while the kinds that succeed are still merged into the payload.

| `kind` | Scope | Requires `document` | Key parameters |
| --- | --- | --- | --- |
| `backlinks` | Pages linking **to** the target | Yes | — |
| `forward` | Links **from** the target (doc + external) | Yes | — |
| `dead` | Internal links whose target file does not exist | No | `sourceDocuments` (OR filter) |
| `orphans` | Disconnected pages | No | `mode`: `incoming` \| `outgoing` \| `both` (default `both`) |
| `hubs` | Most-linked-to pages | No | `limit` (default 20, max 100) |
| `suggest` | Prose mentions not yet wrapped in link syntax | Yes | Returns `mentions[{ source, excerpt, offset }]` |
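Two of these views, `dead` and `hubs`, reduce to simple passes over the edge list. The sketch below shows one plausible way to compute them; `Edge`, `findDeadLinks`, and `topHubs` are hypothetical names, and only the `sourceDocuments` filter and the `limit` default and maximum are taken from the table.

```typescript
// Hypothetical edge shape: one internal link from `source` to `target`.
interface Edge {
  source: string;
  target: string;
}

// Dead links: the target is not an existing file. `sourceDocuments` acts as an OR filter.
function findDeadLinks(edges: Edge[], existing: Set<string>, sourceDocuments?: string[]): Edge[] {
  const filter = sourceDocuments ? new Set(sourceDocuments) : undefined;
  return edges.filter((e) => !existing.has(e.target) && (!filter || filter.has(e.source)));
}

// Hubs: targets ranked by inbound link count. Default limit 20, capped at 100.
function topHubs(edges: Edge[], limit = 20): Array<{ document: string; inbound: number }> {
  const capped = Math.min(Math.max(1, limit), 100);
  const counts = new Map<string, number>();
  for (const e of edges) {
    counts.set(e.target, (counts.get(e.target) ?? 0) + 1);
  }
  return Array.from(counts, ([document, inbound]) => ({ document, inbound }))
    .sort((a, b) => b.inbound - a.inbound || a.document.localeCompare(b.document))
    .slice(0, capped);
}
```

Ties between hubs are broken alphabetically here purely to keep the output deterministic; the real tool may order them differently.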

<RequestExample>

```json
{
  "kind": ["dead", "orphans", "hubs"],
  "mode": "incoming",
  "limit": 10
}
```

</RequestExample>
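The partial-failure behavior of a multi-kind call can be modeled as a loop that records each kind's result or error independently. This is a sketch under assumptions: the `MultiKindResult` shape and the `runKinds` helper are hypothetical, and the real response envelope may differ.

```typescript
type LinkKind = "backlinks" | "forward" | "dead" | "orphans" | "hubs" | "suggest";

// Hypothetical envelope: successful kinds in `payload`, failed kinds in `errors`.
interface MultiKindResult {
  payload: Partial<Record<LinkKind, unknown>>;
  errors: Partial<Record<LinkKind, string>>;
}

// Runs each requested kind; one failure does not discard the other results.
function runKinds(kinds: LinkKind[], run: (kind: LinkKind) => unknown): MultiKindResult {
  const result: MultiKindResult = { payload: {}, errors: {} };
  for (const kind of kinds) {
    try {
      result.payload[kind] = run(kind);
    } catch (err) {
      result.errors[kind] = err instanceof Error ? err.message : String(err);
    }
  }
  return result;
}
```

A caller should therefore check `errors` even when the payload is non-empty.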

<Warning>
  `links` requires a running Hocuspocus server. Without it, the tool returns `HOCUSPOCUS_NOT_RUNNING_ERROR`. Use `exec` for filesystem reads when the server is down; graph queries need `ok start`.
</Warning>

Orphan detection follows three modes:

| `mode` | Orphan definition |
| --- | --- |
| `incoming` | No pages link to this document |
| `outgoing` | This document links to nothing |
| `both` | No inbound **and** no outbound edges |
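The three modes can be expressed as a filter over the document list. The following is a minimal sketch, assuming edges are given as `[source, target]` pairs; `findOrphans` and `OrphanMode` are names chosen for this example.

```typescript
type OrphanMode = "incoming" | "outgoing" | "both";

// Returns the documents that count as orphans under the given mode.
function findOrphans(
  docs: string[],
  edges: Array<[string, string]>,
  mode: OrphanMode = "both",
): string[] {
  const hasIncoming = new Set<string>();
  const hasOutgoing = new Set<string>();
  for (const [source, target] of edges) {
    hasOutgoing.add(source);
    hasIncoming.add(target);
  }
  return docs.filter((doc) => {
    const noIn = !hasIncoming.has(doc);
    const noOut = !hasOutgoing.has(doc);
    if (mode === "incoming") return noIn;
    if (mode === "outgoing") return noOut;
    return noIn && noOut; // "both": fully disconnected
  });
}
```

For a chain `a → b → c` plus an unlinked page `d`, `incoming` yields `a` and `d`, `outgoing` yields `c` and `d`, and `both` yields only `d`.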

The `exec` read path enriches every matched wiki file with backlink counts, forward-link counts, and recent shadow-repo activity. It is the primary alternative to `links` for quick inspection when the server is not running.

## Git-backed attribution

Edit history lives in a **shadow git repository** alongside the project's upstream git. The shadow repo stores per-writer WIP chains, checkpoints, and upstream-import markers without replacing the project's own commit history.

### Shadow repo layout

| Location | When |
| --- | --- |
| `.git/ok` | Project root is the git worktree root |
| `.git/ok-{slug}` | Project is a subdirectory of a larger worktree |
| `{projectRoot}/.git/ok` | No ancestor `.git` found |
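The three rows can be read as a small decision function. The sketch below assumes the worktree root has already been discovered (or is `null` when no ancestor `.git` exists) and takes the slug as an input, because how the slug is derived is not described here; `shadowRepoPath` is a hypothetical name.

```typescript
// Illustrative only: picks the shadow repo location from the table above.
// `slug` is supplied by the caller; its derivation is not covered by this sketch.
function shadowRepoPath(projectRoot: string, worktreeRoot: string | null, slug: string): string {
  if (worktreeRoot === null) {
    return `${projectRoot}/.git/ok`; // no ancestor .git found
  }
  if (worktreeRoot === projectRoot) {
    return `${worktreeRoot}/.git/ok`; // project root is the worktree root
  }
  return `${worktreeRoot}/.git/ok-${slug}`; // project is a subdirectory of the worktree
}
```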

Each writer gets an isolated ref: `refs/wip/{branch}/{writerId}`. Commits carry `GIT_AUTHOR_NAME` / `GIT_AUTHOR_EMAIL` from the resolved writer identity and `GIT_COMMITTER_NAME = openknowledge`.
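As a small illustration, the ref name and commit environment described above could be assembled as follows. `wipRef` and `commitEnv` are hypothetical helpers; only the ref pattern, the three environment variable names, and the `openknowledge` committer name come from the text.

```typescript
// Builds the per-writer WIP ref: refs/wip/{branch}/{writerId}.
function wipRef(branch: string, writerId: string): string {
  return `refs/wip/${branch}/${writerId}`;
}

// Environment for a shadow-repo commit attributed to the resolved writer.
function commitEnv(author: { name: string; email: string }): Record<string, string> {
  return {
    GIT_AUTHOR_NAME: author.name,
    GIT_AUTHOR_EMAIL: author.email,
    GIT_COMMITTER_NAME: "openknowledge",
  };
}
```

Keeping the author and committer separate means the writer is credited for the change while the commit itself is visibly produced by the service.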

### Writer classification

| Writer ID pattern | Classification | Typical source |
| --- | --- | --- |
| `agent-{id}` | `agent` | MCP-connected AI agent |
| `principal-{id}` | `principal` | Authenticated human principal |
| `file-system` | `classified-file-system` | Direct filesystem edits outside OK |
| `git-upstream` | `classified-git-upstream` | Upstream git import |
| `openknowledge-service` | `classified-openknowledge-service` | Internal service operations |
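The table maps directly onto a classifier over the writer ID. This is a sketch, not the project's implementation: `classifyWriter` is a hypothetical name, and the `"unknown"` fallback for unrecognized IDs is an assumption added for the example.

```typescript
type WriterClass =
  | "agent"
  | "principal"
  | "classified-file-system"
  | "classified-git-upstream"
  | "classified-openknowledge-service"
  | "unknown"; // assumption: fallback not described in the table

function classifyWriter(writerId: string): WriterClass {
  // Fixed IDs are checked first, then the prefixed patterns.
  if (writerId === "file-system") return "classified-file-system";
  if (writerId === "git-upstream") return "classified-git-upstream";
  if (writerId === "openknowledge-service") return "classified-openknowledge-service";
  if (writerId.startsWith("agent-") && writerId.length > "agent-".length) return "agent";
  if (writerId.startsWith("principal-") && writerId.length > "principal-".length) return "principal";
  return "unknown";
}
```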

Agent writes resolve identity from the MCP session: `displayName` and `colorSeed` come from the client name (sanitized, max 128 chars) or fall back to `connectionId`. Optional `summary` fields (≤80 chars) on `write`, `edit`, `move`, and `checkpoint` persist as commit subjects and appear on document timelines.
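The fallback rule for display names and the summary length limit can be sketched as below. The exact sanitization performed by the server is not specified here, so stripping control characters and trimming whitespace is an assumption, as are the helper names `resolveDisplayName` and `isValidSummary`.

```typescript
// Resolves a display name from the MCP client name, falling back to the connection ID.
// Assumption: "sanitized" means control characters removed and whitespace trimmed.
function resolveDisplayName(clientName: string | undefined, connectionId: string): string {
  const cleaned = (clientName ?? "")
    .replace(/[\u0000-\u001f\u007f]/g, "")
    .trim()
    .slice(0, 128); // documented maximum length
  return cleaned.length > 0 ? cleaned : connectionId;
}

// Checks the documented 80-character limit for commit-subject summaries.
function isValidSummary(summary: string): boolean {
  return summary.length <= 80;
}
```

Whether an over-long summary is rejected or truncated is not stated above, so the sketch only reports validity and leaves that decision to the caller.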

### History and recovery

| MCP tool | Purpose |
| --- | --- |
| `history` | List version timeline for a `document` or folder activity; entries carry `version` (40-char SHA), `author`, `kind` (`checkpoint` / `wip` / `upstream`), `contributors` |
| `checkpoint` | Project-wide snapshot; returns `{ version }` |
| `restore_version` | Restore one document to a historical `version` from `history` or `checkpoint` |

History reads require the Hocuspocus server and query the shadow repo sorted by timestamp descending. Filter by `kind`, `author`, or `excludeAuthor`; paginate with `limit` (default 50, max 200) and `offset`.
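The filtering and pagination rules can be modeled over an in-memory list. This sketch is illustrative: `queryHistory` and the `HistoryQuery` type are hypothetical, and the numeric `timestamp` field on each entry is an assumption used only to demonstrate the descending sort.

```typescript
type HistoryKind = "checkpoint" | "wip" | "upstream";

interface HistoryEntry {
  version: string; // 40-char SHA in the real tool
  author: string;
  kind: HistoryKind;
  timestamp: number; // assumption: epoch milliseconds
}

interface HistoryQuery {
  kind?: HistoryKind;
  author?: string;
  excludeAuthor?: string;
  limit?: number;
  offset?: number;
}

function queryHistory(entries: HistoryEntry[], q: HistoryQuery = {}): HistoryEntry[] {
  const limit = Math.min(Math.max(q.limit ?? 50, 1), 200); // default 50, max 200
  const offset = Math.max(q.offset ?? 0, 0);
  return entries
    .filter((e) => q.kind === undefined || e.kind === q.kind)
    .filter((e) => q.author === undefined || e.author === q.author)
    .filter((e) => q.excludeAuthor === undefined || e.author !== q.excludeAuthor)
    .sort((a, b) => b.timestamp - a.timestamp) // newest first; sorts the filtered copy
    .slice(offset, offset + limit);
}
```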

<Tip>
  Before deleting a heavily linked document, run `links({ kind: "backlinks", document: "…" })` to find referrers that would become dead links. Call `checkpoint()` first when you may need a project-wide rollback.
</Tip>

Upstream project git and the shadow attribution repo serve different roles. Upstream git tracks what you commit and push to a remote such as GitHub; the shadow repo tracks **who** changed **what** and **when** at edit granularity, including agent sessions.

## Protocol boundary

`PROTOCOL_VERSION` is currently `1`. Editor, server, and MCP clients negotiate compatibility against this constant. A version mismatch between the collaboration server and a connected client means one side needs to be upgraded.
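A compatibility check against the constant might look like the sketch below. The negotiation logic is not described in detail here, so requiring an exact match and asking the lower-versioned side to upgrade are assumptions; `checkProtocol` and the `Compat` type are hypothetical.

```typescript
// Mirrors the documented constant; redeclared so the example is self-contained.
const PROTOCOL_VERSION = 1;

type Compat = { ok: true } | { ok: false; upgrade: "client" | "server" };

// Assumption: versions must match exactly, and the older side is the one to upgrade.
function checkProtocol(clientVersion: number, serverVersion: number = PROTOCOL_VERSION): Compat {
  if (clientVersion === serverVersion) {
    return { ok: true };
  }
  return { ok: false, upgrade: clientVersion < serverVersion ? "client" : "server" };
}
```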

## Related pages

<CardGroup>
  <Card title="Overview" href="/overview">
    Monorepo packages, runtime surfaces, and the shortest path from install to first agent-driven edit.
  </Card>
  <Card title="Project scaffold" href="/project-scaffold">
    `.ok/` layout, config scopes, `.okignore`, and `content.dir` semantics.
  </Card>
  <Card title="Collaboration server" href="/collaboration-server">
    Hocuspocus lifecycle, server-free reads versus routed writes, and lock behavior.
  </Card>
  <Card title="MCP tools reference" href="/mcp-tools-reference">
    All 17 tools, input nesting, preview envelopes, and conflict guards.
  </Card>
  <Card title="Editor workflows" href="/editor-workflows">
    WYSIWYG editing, timeline panel, and rich MDX embeds.
  </Card>
</CardGroup>
