# Semantic search setup

> Enable embeddings-based MCP search ranking with `ok embeddings set-key`, project-local `search.semantic.*` config, and content-egress constraints.

- Repository: sashimikun/open-knowledge
- GitHub: https://github.com/sashimikun/open-knowledge
- Human docs: https://grok-wiki.com/public/docs/sashimikun-open-knowledge-5c45105c876e
- Complete Markdown: https://grok-wiki.com/public/docs/sashimikun-open-knowledge-5c45105c876e/llms-full.txt

## Source Files

- `packages/cli/src/commands/embeddings/index.ts`
- `packages/server/src/mcp/tools/search.ts`
- `packages/core/src/config/schema.ts`
- `docs/content/reference/configuration.mdx`
- `packages/server/src/api-search-semantic.test.ts`
- `packages/cli/scripts/build-config-schema.mjs`

---

---
title: "Semantic search setup"
description: "Enable embeddings-based MCP search ranking with `ok embeddings set-key`, project-local `search.semantic.*` config, and content-egress constraints."
---

Semantic search adds an **embeddings-based ranking signal** to workspace search. When enabled, the MCP `search` tool fuses a vector similarity score with the existing lexical engine (title boost, body BM25, recency) so conceptually related pages surface even when they share no keywords — for example, a query about "auth retries" can rank a page titled "Credential Rotation" above unrelated hits.

The feature is **off by default** and **additive**. With it disabled, search is pure lexical ranking. When enabled but not ready (no API key, provider error, or index still warming), search **silently degrades to lexical** — it never blocks agent workflows or returns errors.

<Callout type="warn" title="Content egress">
When semantic search is **enabled** and an API key is set, the **search query** and **page content** are sent to the configured embeddings provider (OpenAI by default). Only markdown pages already in your searchable corpus are embedded; paths excluded by `.okignore` / `.gitignore` are never sent. Embedding is **lazy**: nothing leaves your machine until a search runs with `semantic: true`, which is the MCP `search` default and an explicit opt-in for HTTP callers. The API key is stored only in `~/.ok/secrets.yml` (mode `0600`) or the `OK_EMBEDDINGS_API_KEY` environment variable, never in `config.yml`, logs, or telemetry.
</Callout>

## Prerequisites

- A running OpenKnowledge project with the collaboration server (`ok start`).
- An OpenAI-compatible embeddings API key (OpenAI, Azure OpenAI, or a self-hosted endpoint that implements the same `/embeddings` contract).
- Project-local config scope — semantic settings live in `.ok/local/config.yml` (gitignored, per machine). Keys set in committed `.ok/config.yml` are **ignored** for egress safety.

## Enable semantic search

<Steps>

<Step title="Store the embeddings API key">

Run `ok embeddings set-key` from any directory. The command reads the key from a hidden prompt (TTY) or stdin (piped input) and writes it to `~/.ok/secrets.yml` under the `OPENAI_API_KEY` field.

```bash
ok embeddings set-key
```

<RequestExample>

```bash
echo "sk-your-key" | ok embeddings set-key
```

</RequestExample>

For CI or scripted runs, export `OK_EMBEDDINGS_API_KEY` instead. The file store takes precedence when both are present.
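
The precedence rule above can be sketched as a small resolver. This is an illustration only, not the CLI's actual code: `resolveEmbeddingsKey`, `ResolvedKey`, and the `"env"` source label are hypothetical names, and treating blank values as unset is an assumption.

```typescript
// Hypothetical label for where a resolved key came from.
type KeySource = "file" | "env";

interface ResolvedKey {
  key: string;
  source: KeySource;
}

// Sketch of "the file store takes precedence when both are present".
// Assumption: blank or whitespace-only values count as "not set".
function resolveEmbeddingsKey(
  fileKey: string | undefined,
  envKey: string | undefined,
): ResolvedKey | null {
  const fromFile = fileKey?.trim();
  if (fromFile) return { key: fromFile, source: "file" };

  const fromEnv = envKey?.trim();
  if (fromEnv) return { key: fromEnv, source: "env" };

  // No key anywhere: search would stay lexical.
  return null;
}
```

Passing both values returns the file-store key; the environment variable is only used when the file store has nothing.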

</Step>

<Step title="Enable semantic search for this project">

Turn on semantic search in **project-local** config. The CLI is the fastest path from a terminal:

```bash
cd /path/to/your/project
ok embeddings enable
```

This writes `search.semantic.enabled: true` to `.ok/local/config.yml`. You can also toggle **Settings → This project → Search** in OK Desktop; the UI shows an egress-confirmation prompt before enabling.

</Step>

<Step title="Start the server and verify capability">

```bash
ok start
ok embeddings status
```

<ResponseExample>

```text
Semantic search
  project:     /path/to/your/project

  This machine (all projects):
    API key:    set — ~/.ok/secrets.yml

  This project:
    enabled:    yes
    capability: AVAILABLE
    coverage:   42 / 42 pages embedded
    provider:   https://api.openai.com/v1
    model:      text-embedding-3-small
    dimensions: native (1536)
```

</ResponseExample>

`capability: AVAILABLE` means semantic search is enabled for the project and an API key is present. If the server is not running, coverage shows as unavailable until you start it and the index embeds.

Use `ok embeddings status --json` for machine-readable output, or `GET /api/semantic-status` while the server is up.

</Step>

<Step title="Confirm MCP search uses vectors">

From a connected agent, run a `search` call with a conceptually related query. The MCP `search` tool opts in to semantic ranking by default (`semantic` omitted or `true`).

<ResponseExample>

```json
{
  "query": "auth retries",
  "intent": "full_text",
  "resultCount": 1,
  "results": [
    {
      "kind": "page",
      "path": "guides/credential-rotation",
      "score": 8.42,
      "signals": {
        "lexical": 0,
        "fullText": 0,
        "recency": 0.12,
        "vector": 0.71
      }
    }
  ],
  "semantic": {
    "capable": true,
    "applied": true,
    "coverage": { "embedded": 42, "total": 42 }
  }
}
```

</ResponseExample>

A non-zero `signals.vector` on a hit confirms the embeddings signal contributed. The text response includes a one-line semantic note (for example, `Semantic: on — vector signal contributed (42/42 pages embedded).`).
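
A script can run the same check on a parsed response. The sketch below assumes only the fields visible in the example above; `SearchHit`, `SearchResponse`, and `pathsWithVectorSignal` are hypothetical names, not exported types.

```typescript
// Minimal, assumed view of the response: only fields shown in the example.
interface SearchHit {
  path: string;
  score: number;
  signals: { vector?: number };
}

interface SearchResponse {
  results: SearchHit[];
}

// Return the paths of hits whose ranking included a non-zero vector signal.
function pathsWithVectorSignal(response: SearchResponse): string[] {
  return response.results
    .filter((hit) => (hit.signals.vector ?? 0) > 0)
    .map((hit) => hit.path);
}
```

An empty result means no hit in that response carried a vector signal, so the ranking was effectively lexical.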

</Step>

</Steps>

## Configuration keys

All `search.semantic.*` keys are **project-local** scope — they belong in `.ok/local/config.yml`, not committed `.ok/config.yml`. Wrong-scope keys are ignored; `ok config validate` reports them.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `search.semantic.enabled` | boolean | `false` | Master toggle. Must be `true` for any embeddings egress. |
| `search.semantic.baseUrl` | string | `https://api.openai.com/v1` | OpenAI-compatible embeddings API base URL. |
| `search.semantic.model` | string | `text-embedding-3-small` | Model id served at `baseUrl`. Changing it invalidates the on-disk cache. |
| `search.semantic.dimensions` | number | native (1536) | Optional output vector size. Smaller values shrink `.ok/local/embeddings/` at some quality cost. |
| `search.semantic.similarityFloor` | number | `0` (off) | Optional cosine-similarity cutoff. Docs below the floor are excluded from vector candidacy. Most setups should leave this unset. |

<ParamField body="search.semantic.enabled" type="boolean" required={false}>
Project-local master switch. When `true` and a key is present, semantic ranking is available to searches that pass `semantic: true`. Default `false`.
</ParamField>

<ParamField body="search.semantic.baseUrl" type="string" required={false}>
Base URL for an OpenAI-compatible `/embeddings` endpoint. Use for Azure OpenAI, self-hosted models, or other providers. The API key is **not** stored here.
</ParamField>

<ParamField body="search.semantic.model" type="string" required={false}>
Embeddings model identifier. Must match a model your provider serves. The vector cache is keyed by provider URL, model, dimensions, and chunking config; changing any of these triggers re-embedding.
</ParamField>

<ParamField body="search.semantic.dimensions" type="number" required={false}>
Optional reduced output dimensionality (for example `512` or `1024` with `text-embedding-3-small`). Omit to use the model's native size (1536 for the default model).
</ParamField>

<ParamField body="search.semantic.similarityFloor" type="number" required={false}>
Hard cosine-similarity floor in `[0, 1]`. Vector-only candidates below this value are dropped before ranking. Default behavior (`0`) is rank-based — the closest pages are returned regardless of absolute score.
</ParamField>
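
To make the floor concrete, here is a minimal sketch of cosine similarity and a floor filter. It is not the server's implementation: `cosineSimilarity`, `VectorCandidate`, and `applySimilarityFloor` are hypothetical names, and keeping candidates that sit exactly on the floor is an assumption.

```typescript
// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // A zero vector has no direction; treat it as unrelated.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

interface VectorCandidate {
  path: string;
  similarity: number;
}

// A floor of 0 means "off": ranking stays purely rank-based.
function applySimilarityFloor(
  candidates: VectorCandidate[],
  floor: number,
): VectorCandidate[] {
  if (floor <= 0) return candidates;
  return candidates.filter((candidate) => candidate.similarity >= floor);
}
```

With a floor of `0.5`, a candidate at `0.71` survives and one at `0.2` is dropped; with the default `0`, both stay and only their rank matters.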

Example project-local config for a non-OpenAI provider:

```yaml
search:
  semantic:
    enabled: true
    baseUrl: https://your-azure-endpoint.openai.azure.com/openai/deployments/embeddings
    model: text-embedding-3-small
    dimensions: 1024
```

Published JSON Schema for project-local keys is generated at build time (`config.project-local.schema.json`).
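
The defaults in the table can be mirrored in code when a script needs to reason about effective settings. The sketch below is an assumption-laden illustration, not the real schema in `packages/core/src/config/schema.ts`: `SemanticConfig`, `SEMANTIC_DEFAULTS`, and `resolveSemanticConfig` are hypothetical, and the positive-integer check on `dimensions` is a guess at a sensible rule.

```typescript
interface SemanticConfig {
  enabled: boolean;
  baseUrl: string;
  model: string;
  dimensions?: number; // omitted means the model's native size
  similarityFloor: number;
}

// Defaults as documented in the configuration table.
const SEMANTIC_DEFAULTS: SemanticConfig = {
  enabled: false,
  baseUrl: "https://api.openai.com/v1",
  model: "text-embedding-3-small",
  similarityFloor: 0,
};

// Merge project-local overrides over the defaults and sanity-check ranges.
function resolveSemanticConfig(local: Partial<SemanticConfig>): SemanticConfig {
  const merged: SemanticConfig = { ...SEMANTIC_DEFAULTS, ...local };
  if (
    merged.dimensions !== undefined &&
    (!Number.isInteger(merged.dimensions) || merged.dimensions <= 0)
  ) {
    // Assumption: dimensions must be a positive integer.
    throw new Error("search.semantic.dimensions must be a positive integer");
  }
  if (merged.similarityFloor < 0 || merged.similarityFloor > 1) {
    throw new Error("search.semantic.similarityFloor must be in [0, 1]");
  }
  return merged;
}
```

Calling `resolveSemanticConfig` with only `enabled` and `dimensions` set keeps the default `baseUrl` and `model`.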

## CLI reference

| Command | Purpose |
| --- | --- |
| `ok embeddings set-key` | Store the embeddings API key in `~/.ok/secrets.yml` (`0600`). |
| `ok embeddings clear-key` | Remove the stored key from all backends. |
| `ok embeddings enable` | Set `search.semantic.enabled: true` in project-local config. |
| `ok embeddings disable` | Set `search.semantic.enabled: false` in project-local config. |
| `ok embeddings status` | Report key presence, enabled state, capability, coverage, and provider settings. |

Global flags `--cwd <path>` and `--json` apply to `enable`, `disable`, and `status`.

## How ranking works

```mermaid
flowchart LR
  A[Search request] --> B{semantic: true?}
  B -->|no| C[Lexical only]
  B -->|yes| D{enabled + key?}
  D -->|no| C
  D -->|yes| E{intent full_text<br/>query len ≥ 3?}
  E -->|no| C
  E -->|yes| F[Embed query]
  F --> G[Fuse vector + lexical via RRF]
  G --> H[Ranked results + semantic block]
  C --> H
```

Semantic ranking applies only when **all** of the following are true:

1. `search.semantic.enabled` is `true` in project-local config.
2. An API key is available (`~/.ok/secrets.yml` or `OK_EMBEDDINGS_API_KEY`).
3. The request passes `semantic: true` (MCP `search` defaults to `true`; HTTP callers must set it explicitly).
4. `intent` is `full_text` (not `omnibar` or `autocomplete`).
5. The trimmed query length is at least **3** characters.
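
The five conditions above can be read as a single predicate. This is a sketch of the documented gate, not the server's code; `SemanticGateInput` and `shouldApplySemantic` are hypothetical names.

```typescript
type SearchIntent = "full_text" | "omnibar" | "autocomplete";

interface SemanticGateInput {
  enabled: boolean; // search.semantic.enabled in project-local config
  keyPresent: boolean; // key found in the secrets file or environment
  semanticRequested: boolean; // the request's `semantic` flag
  intent: SearchIntent;
  query: string;
}

const MIN_SEMANTIC_QUERY_LENGTH = 3;

// True only when every documented condition holds; otherwise lexical only.
function shouldApplySemantic(input: SemanticGateInput): boolean {
  return (
    input.enabled &&
    input.keyPresent &&
    input.semanticRequested &&
    input.intent === "full_text" &&
    input.query.trim().length >= MIN_SEMANTIC_QUERY_LENGTH
  );
}
```

Any single failing condition routes the request to lexical ranking, which matches the flowchart's fallback arrows.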

The cmd-K omnibar stays **purely lexical** on per-keystroke queries. A deliberate "search by meaning" submit from the omnibar sends `intent: full_text` with `semantic: true` and does fuse vectors.
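
The flowchart names RRF as the fusion step. Assuming that refers to reciprocal rank fusion, the idea can be sketched as follows: each document scores the sum of `1 / (k + rank)` over every ranking it appears in. The function name and the constant `k = 60` (a common convention) are assumptions; the server's actual weights and final score scale are not documented here.

```typescript
// Reciprocal rank fusion over several ranked lists of document ids.
// Each list is ordered best-first; rank is 1-based.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60, // assumed smoothing constant, not taken from the project
): Array<[string, number]> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries()).sort((a, b) => b[1] - a[1]);
}
```

Because only ranks are used, a page that appears solely in the vector list can still surface, which is how a page with no keyword overlap ends up in the results.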

### Indexing and cache

- The first semantic search triggers a **background embed** of markdown pages in the searchable corpus.
- Vectors are cached incrementally under `.ok/local/embeddings/`. Only changed documents re-embed.
- The cache key includes provider URL, model, dimensions, and chunking config — changing any invalidates and rebuilds.
- Embedding batches run in the background; subsequent searches report `coverage.embedded / coverage.total` so agents know when vectors are still filling in.
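
The cache rules above can be sketched as two small helpers: one fingerprint over the settings that invalidate the whole cache, and one comparison that picks only changed pages. All names here are hypothetical, the `chunking` field is an opaque stand-in for whatever chunking settings exist, and content hashes are assumed to be computed elsewhere.

```typescript
interface EmbeddingCacheConfig {
  baseUrl: string;
  model: string;
  dimensions: number | null; // null means native size
  chunking: string; // hypothetical opaque descriptor of chunking settings
}

// If this string changes, the whole cache is treated as stale.
function cacheFingerprint(config: EmbeddingCacheConfig): string {
  return JSON.stringify([
    config.baseUrl,
    config.model,
    config.dimensions,
    config.chunking,
  ]);
}

// Maps go from page path to content hash. Returns new or changed pages.
function pagesToEmbed(
  cached: Map<string, string>,
  current: Map<string, string>,
): string[] {
  const stale: string[] = [];
  current.forEach((hash, path) => {
    if (cached.get(path) !== hash) stale.push(path);
  });
  return stale;
}
```

When the fingerprint is unchanged, only the paths returned by `pagesToEmbed` need another provider call.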

### What is never embedded

| Content | Lexical search | Embedded for semantic |
| --- | --- | --- |
| Pages under `.okignore` / `.gitignore` exclusions | Excluded from index | Never sent |
| Dot-path segments (`.cursor/`, `.claude/`, etc.) | Searchable with penalty | Never embedded |
| Non-markdown files | Name/path only | Not embedded |

Hidden dot-path files can still appear in lexical results but never carry `signals.vector`.
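
The table can be summarized as a path filter. This is a sketch under stated assumptions: `isEmbeddable` is a hypothetical name, the ignore decision is passed in rather than computed, paths are relative with `/` separators, and "markdown" is assumed to mean `.md` or `.mdx`.

```typescript
// Decide whether a page may be sent to the embeddings provider.
function isEmbeddable(relativePath: string, isIgnored: boolean): boolean {
  // Excluded by .okignore / .gitignore: never sent.
  if (isIgnored) return false;

  // Any dot-prefixed segment (.cursor/, .claude/, ...) is never embedded.
  const segments = relativePath.split("/");
  if (segments.some((segment) => segment.startsWith("."))) return false;

  // Assumption: markdown pages are files ending in .md or .mdx.
  return /\.mdx?$/i.test(relativePath);
}
```

A file such as `.cursor/rules.md` fails the dot-segment check even though it is markdown, so it can only ever match lexically.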

## MCP `search` behavior

The MCP `search` tool routes through `POST /api/search` on the collaboration server. Key parameters:

<ParamField body="semantic" type="boolean" required={false}>
Per-call override. Set `false` to force pure-lexical ranking even when semantic search is enabled. Omit it or pass `true` to use semantic ranking when available. MCP defaults to `true`.
</ParamField>

<ParamField body="intent" type="string" required={false}>
`full_text` (default) includes body content and is required for semantic fusion. `omnibar` is title/path/folder only and never applies vectors.
</ParamField>

<ResponseField name="semantic" type="object">
Present when semantic search is enabled for the workspace. Fields: `capable` (key loaded and service ready), `applied` (at least one result carried a vector signal this call), `coverage` (`embedded` / `total` page counts).
</ResponseField>

<ResponseField name="signals.vector" type="number">
Cosine similarity for this hit, present only when semantic ranking contributed for that document.
</ResponseField>

Pair `search` with `exec` `grep` for exhaustive literal-string coverage across all file types. Semantic search improves **ranking** for markdown pages; it does not replace grep for code or config files.

## HTTP endpoints

:::endpoint GET /api/semantic-status
Returns embedding index status for the running server. Used by `ok embeddings status` (live coverage) and the editor settings pane.

<ResponseExample>

```json
{
  "enabled": true,
  "keyPresent": true,
  "keySource": "file",
  "keyHint": "-key",
  "ready": true,
  "capable": true,
  "embedded": 42,
  "total": 42
}
```

</ResponseExample>
:::
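
A script that polls this endpoint might reduce the payload to a single readiness label. The sketch below uses only fields from the example response; `SemanticStatus`, `Readiness`, `summarizeStatus`, and the label strings are hypothetical.

```typescript
// Subset of the status payload shown in the example response.
interface SemanticStatus {
  enabled: boolean;
  keyPresent: boolean;
  capable: boolean;
  embedded: number;
  total: number;
}

// Hypothetical labels for scripting; not part of the API.
type Readiness = "disabled" | "missing-key" | "not-capable" | "warming" | "ready";

function summarizeStatus(status: SemanticStatus): Readiness {
  if (!status.enabled) return "disabled";
  if (!status.keyPresent) return "missing-key";
  if (!status.capable) return "not-capable";
  // Background embedding has not covered every page yet.
  if (status.embedded < status.total) return "warming";
  return "ready";
}
```

For the example payload, with 42 of 42 pages embedded, the label is `ready`; a partially filled index reports `warming`, during which results may mix lexical-only and vector-assisted hits.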

:::endpoint POST /api/search
Workspace search. Pass `semantic: true` with `intent: "full_text"` to opt into vector fusion. The MCP tool sets `source: "mcp"` automatically.

<RequestExample>

```json
{
  "query": "session token refresh",
  "intent": "full_text",
  "semantic": true,
  "limit": 20
}
```

</RequestExample>
:::

The editor can set keys via `POST /api/local-op/embeddings/set-key` and `POST /api/local-op/embeddings/clear-key` without shell access.

## Troubleshooting

<AccordionGroup>

<Accordion title="Search stays lexical after enabling">
Check `ok embeddings status`. Common causes:
- No API key — run `ok embeddings set-key`.
- `enabled: no` — run `ok embeddings enable` or toggle **Settings → This project → Search** in OK Desktop.
- `enabled: true` in committed `.ok/config.yml` instead of `.ok/local/config.yml` — move it to project-local scope.
- Query shorter than 3 characters, or `intent: "omnibar"`.
- Per-call `semantic: false` override.
</Accordion>

<Accordion title="coverage.embedded is less than total">
Normal during warm-up. The first semantic search kicks off background embedding. Retry after a few seconds; the `semantic` block reports progress. Changed docs re-embed incrementally on subsequent searches.
</Accordion>

<Accordion title="capable: false with a key set">
The embedder failed to initialize (invalid key, network error, dimension mismatch). Search degrades to lexical. Check server logs. A dimension mismatch error suggests setting `search.semantic.dimensions` to match the provider's output.
</Accordion>

<Accordion title="Disable semantic search completely">
Run `ok embeddings disable` in the project folder, or toggle it off in Settings. Optionally run `ok embeddings clear-key` to remove the machine-wide key. Existing cache files under `.ok/local/embeddings/` remain on disk but are unused.
</Accordion>

</AccordionGroup>

## Related pages

<Cards>
  <Card title="Configuration reference" href="/configuration-reference">
    Full config schema, precedence order, and environment variables including `OK_EMBEDDINGS_API_KEY`.
  </Card>
  <Card title="MCP tools reference" href="/mcp-tools-reference">
    `search` parameters, response shape, and the two-tier `search` + `exec` grep model.
  </Card>
  <Card title="Project scaffold" href="/project-scaffold">
    `.ok/local/` layout, project-local scope, and `.okignore` exclusion semantics.
  </Card>
  <Card title="Collaboration server" href="/collaboration-server">
    Server lifecycle, `ok start`, and which MCP reads require Hocuspocus.
  </Card>
</Cards>
