# Semantic search setup

> Enable embeddings-based MCP search ranking with `ok embeddings set-key`, project-local `search.semantic.*` config, and content-egress constraints.

- Repository: sashimikun/open-knowledge
- GitHub: https://github.com/sashimikun/open-knowledge
- Human docs: https://grok-wiki.com/public/docs/sashimikun-open-knowledge-5c45105c876e
- Complete Markdown: https://grok-wiki.com/public/docs/sashimikun-open-knowledge-5c45105c876e/llms-full.txt

## Source Files

- `packages/cli/src/commands/embeddings/index.ts`
- `packages/server/src/mcp/tools/search.ts`
- `packages/core/src/config/schema.ts`
- `docs/content/reference/configuration.mdx`
- `packages/server/src/api-search-semantic.test.ts`
- `packages/cli/scripts/build-config-schema.mjs`

---

Semantic search adds an **embeddings-based ranking signal** to workspace search. When enabled, the MCP `search` tool fuses a vector similarity score with the existing lexical engine (title boost, body BM25, recency) so conceptually related pages surface even when they share no keywords. For example, a query about "auth retries" can rank a page titled "Credential Rotation" above unrelated hits.

The feature is **off by default** and **additive**. With it disabled, search is pure lexical ranking. When enabled but not ready (no API key, provider error, or index still warming), search **silently degrades to lexical**: it never blocks agent workflows or returns errors.

When semantic search is **enabled** and an API key is set, the **search query** and **matching page content** are sent to the configured embeddings provider (OpenAI by default). Only content already in your searchable corpus is embedded; paths excluded by `.okignore` / `.gitignore` are never sent.
Embedding is **lazy**: nothing leaves your machine until a search explicitly opts in with `semantic: true`. The API key is stored only in `~/.ok/secrets.yml` (mode `0600`) or the `OK_EMBEDDINGS_API_KEY` environment variable, never in `config.yml`, logs, or telemetry.

## Prerequisites

- A running OpenKnowledge project with the collaboration server (`ok start`).
- An OpenAI-compatible embeddings API key (OpenAI, Azure OpenAI, or a self-hosted endpoint that implements the same `/embeddings` contract).
- Project-local config scope: semantic settings live in `.ok/local/config.yml` (gitignored, per machine). Keys set in committed `.ok/config.yml` are **ignored** for egress safety.

## Enable semantic search

Run `ok embeddings set-key` from any directory. The command reads the key from a hidden prompt (TTY) or stdin (piped input) and writes it to `~/.ok/secrets.yml` under the `OPENAI_API_KEY` field.

```bash
# Interactive (TTY): the key is read from a hidden prompt
ok embeddings set-key
```

```bash
# Piped input: the key is read from stdin
echo "sk-your-key" | ok embeddings set-key
```

For CI or scripted runs, export `OK_EMBEDDINGS_API_KEY` instead. The file store takes precedence when both are present.

Turn on semantic search in **project-local** config. The CLI is the fastest path from a terminal:

```bash
cd /path/to/your/project
ok embeddings enable
```

This writes `search.semantic.enabled: true` to `.ok/local/config.yml`. You can also toggle **Settings → This project → Search** in OK Desktop; the UI shows an egress-confirmation prompt before enabling.

Start the server, then check status:

```bash
ok start
ok embeddings status
```

```text
Semantic search
  project: /path/to/your/project

This machine (all projects):
  API key: set — ~/.ok/secrets.yml

This project:
  enabled: yes
  capability: AVAILABLE
  coverage: 42 / 42 pages embedded
  provider: https://api.openai.com/v1
  model: text-embedding-3-small
  dimensions: native (1536)
```

`capability: AVAILABLE` means both `enabled` and a key are present. If the server is not running, coverage shows as unavailable until you start it and the index embeds.
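The key lookup order and the `capability: AVAILABLE` rule described above can be pictured with a small TypeScript sketch. It is illustrative only: `resolveKey`, `capability`, the `"env"` source label, and the `UNAVAILABLE` label are hypothetical names chosen for the example, not the CLI's actual code. Only the file-store-first precedence and the meaning of `AVAILABLE` come from this page.

```typescript
// Hypothetical sketch of key precedence and capability; all names are illustrative.
type KeySource = "file" | "env"; // "env" is an assumed label for OK_EMBEDDINGS_API_KEY

interface ResolvedKey {
  key: string;
  source: KeySource;
}

// The file store (~/.ok/secrets.yml) wins over the environment variable when both are set.
function resolveKey(fileKey?: string, envKey?: string): ResolvedKey | null {
  if (fileKey && fileKey.trim() !== "") {
    return { key: fileKey.trim(), source: "file" };
  }
  if (envKey && envKey.trim() !== "") {
    return { key: envKey.trim(), source: "env" };
  }
  return null;
}

// AVAILABLE requires both the project-local toggle and a key.
// "UNAVAILABLE" is an assumed label for every other case.
function capability(
  enabled: boolean,
  key: ResolvedKey | null
): "AVAILABLE" | "UNAVAILABLE" {
  return enabled && key !== null ? "AVAILABLE" : "UNAVAILABLE";
}

const resolved = resolveKey("sk-file-key", "sk-env-key");
console.log(resolved?.source, capability(true, resolved)); // prints: file AVAILABLE
```

Keeping key resolution separate from the capability check mirrors the split in the status output between the machine-wide key and the per-project toggle.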
Use `ok embeddings status --json` for machine-readable output, or `GET /api/semantic-status` while the server is up.

From a connected agent, run a `search` call with a conceptually related query. The MCP `search` tool opts in to semantic ranking by default (`semantic` omitted or `true`).

```json
{
  "query": "auth retries",
  "intent": "full_text",
  "resultCount": 3,
  "results": [
    {
      "kind": "page",
      "path": "guides/credential-rotation",
      "score": 8.42,
      "signals": {
        "lexical": 0,
        "fullText": 0,
        "recency": 0.12,
        "vector": 0.71
      }
    }
  ],
  "semantic": {
    "capable": true,
    "applied": true,
    "coverage": { "embedded": 42, "total": 42 }
  }
}
```

A non-zero `signals.vector` on a hit confirms the embeddings signal contributed. The text response includes a one-line semantic note (for example, `Semantic: on — vector signal contributed (42/42 pages embedded).`).

## Configuration keys

All `search.semantic.*` keys are **project-local** scope: they belong in `.ok/local/config.yml`, not committed `.ok/config.yml`. Wrong-scope keys are ignored; `ok config validate` reports them.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `search.semantic.enabled` | boolean | `false` | Master toggle. Must be `true` for any embeddings egress. |
| `search.semantic.baseUrl` | string | `https://api.openai.com/v1` | OpenAI-compatible embeddings API base URL. |
| `search.semantic.model` | string | `text-embedding-3-small` | Model id served at `baseUrl`. Changing it invalidates the on-disk cache. |
| `search.semantic.dimensions` | number | native (1536) | Optional output vector size. Smaller values shrink `.ok/local/embeddings/` at some quality cost. |
| `search.semantic.similarityFloor` | number | `0` (off) | Optional cosine-similarity cutoff. Docs below the floor are excluded from vector candidacy. Most setups should leave this unset. |

- `search.semantic.enabled`: Project-local master switch. When `true` and a key is present, semantic ranking is available to searches that pass `semantic: true`. Default `false`.
- `search.semantic.baseUrl`: Base URL for an OpenAI-compatible `/embeddings` endpoint. Use for Azure OpenAI, self-hosted models, or other providers. The API key is **not** stored here.
- `search.semantic.model`: Embeddings model identifier. Must match a model your provider serves. The vector cache is keyed by provider + model + dimensions; changing any of these triggers re-embedding.
- `search.semantic.dimensions`: Optional reduced output dimensionality (for example `512` or `1024` with `text-embedding-3-small`). Omit to use the model's native size (1536 for the default model).
- `search.semantic.similarityFloor`: Hard cosine-similarity floor in `[0, 1]`. Vector-only candidates below this value are dropped before ranking. The default (`0`) is rank-based: the closest pages are returned regardless of absolute score.

Example project-local config for a non-OpenAI provider:

```yaml
search:
  semantic:
    enabled: true
    baseUrl: https://your-azure-endpoint.openai.azure.com/openai/deployments/embeddings
    model: text-embedding-3-small
    dimensions: 1024
```

The published JSON Schema for project-local keys is generated at build time (`config.project-local.schema.json`).

## CLI reference

| Command | Purpose |
| --- | --- |
| `ok embeddings set-key` | Store the embeddings API key in `~/.ok/secrets.yml` (`0600`). |
| `ok embeddings clear-key` | Remove the stored key from all backends. |
| `ok embeddings enable` | Set `search.semantic.enabled: true` in project-local config. |
| `ok embeddings disable` | Set `search.semantic.enabled: false` in project-local config. |
| `ok embeddings status` | Report key presence, enabled state, capability, coverage, and provider settings. |

Global flags `--cwd` and `--json` apply to `enable`, `disable`, and `status`.

## How ranking works

```mermaid
flowchart LR
    A[Search request] --> B{semantic: true?}
    B -->|no| C[Lexical only]
    B -->|yes| D{enabled + key?}
    D -->|no| C
    D -->|yes| E{intent full_text
query len ≥ 3?}
    E -->|no| C
    E -->|yes| F[Embed query]
    F --> G[Fuse vector + lexical via RRF]
    G --> H[Ranked results + semantic block]
    C --> H
```

Semantic ranking applies only when **all** of the following are true:

1. `search.semantic.enabled` is `true` in project-local config.
2. An API key is available (`~/.ok/secrets.yml` or `OK_EMBEDDINGS_API_KEY`).
3. The request passes `semantic: true` (MCP `search` defaults to `true`; HTTP callers must set it explicitly).
4. `intent` is `full_text` (not `omnibar` or `autocomplete`).
5. The trimmed query length is at least **3** characters.

The cmd-K omnibar stays **purely lexical** on per-keystroke queries. A deliberate "search by meaning" submit from the omnibar sends `intent: full_text` with `semantic: true` and does fuse vectors.

### Indexing and cache

- The first semantic search triggers a **background embed** of markdown pages in the searchable corpus.
- Vectors are cached incrementally under `.ok/local/embeddings/`. Only changed documents re-embed.
- The cache key includes provider URL, model, dimensions, and chunking config; changing any of them invalidates the cache and triggers a rebuild.
- Embedding batches run in the background; subsequent searches report `coverage.embedded / coverage.total` so agents know when vectors are still filling in.

### What is never embedded

| Content | Lexical search | Embedded for semantic |
| --- | --- | --- |
| Pages under `.okignore` / `.gitignore` exclusions | Excluded from index | Never sent |
| Dot-path segments (`.cursor/`, `.claude/`, etc.) | Searchable with penalty | Never embedded |
| Non-markdown files | Name/path only | Not embedded |

Hidden dot-path files can still appear in lexical results but never carry `signals.vector`.

## MCP `search` behavior

The MCP `search` tool routes through `POST /api/search` on the collaboration server. Key parameters:

- `semantic`: Per-call override. Set `false` to force pure-lexical ranking even when semantic search is enabled. Omit (or `true`) to use semantic when available.
MCP defaults to `true`.
- `intent`: `full_text` (default) includes body content and is required for semantic fusion. `omnibar` is title/path/folder only and never applies vectors.
- `semantic` (response block): Present when semantic search is enabled for the workspace. Fields: `capable` (key loaded and service ready), `applied` (at least one result carried a vector signal this call), `coverage` (`embedded` / `total` page counts).
- `signals.vector` (per result): Cosine similarity for this hit, present only when semantic ranking contributed for that document.

Pair `search` with `exec` `grep` for exhaustive literal-string coverage across all file types. Semantic search improves **ranking** for markdown pages; it does not replace grep for code or config files.

## HTTP endpoints

:::endpoint GET /api/semantic-status
Returns embedding index status for the running server. Used by `ok embeddings status` (live coverage) and the editor settings pane.

```json
{
  "enabled": true,
  "keyPresent": true,
  "keySource": "file",
  "keyHint": "-key",
  "ready": true,
  "capable": true,
  "embedded": 42,
  "total": 42
}
```
:::

:::endpoint POST /api/search
Workspace search. Pass `semantic: true` with `intent: "full_text"` to opt into vector fusion. The MCP tool sets `source: "mcp"` automatically.

```json
{
  "query": "session token refresh",
  "intent": "full_text",
  "semantic": true,
  "limit": 20
}
```
:::

The editor can set keys via `POST /api/local-op/embeddings/set-key` and `POST /api/local-op/embeddings/clear-key` without shell access.

## Troubleshooting

If semantic ranking is not being applied to results, check `ok embeddings status`. Common causes:

- No API key: run `ok embeddings set-key`.
- Status shows `enabled: no`: run `ok embeddings enable` or toggle Settings → Search.
- `enabled: true` is set in committed `.ok/config.yml` instead of `.ok/local/config.yml`: move it to project-local scope.
- Query shorter than 3 characters, or `intent: "omnibar"`.
- Per-call `semantic: false` override.

Partial coverage is normal during warm-up. The first semantic search kicks off background embedding. Retry after a few seconds; the `semantic` block reports progress.
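A script or agent can decide whether to retry by reading the coverage counts. The following is a minimal sketch, assuming the field names shown in the `GET /api/semantic-status` example above (`enabled`, `keyPresent`, `capable`, `embedded`, `total`). The `SemanticStatus` type, the `describeCoverage` function, and the state labels are hypothetical and exist only for this example.

```typescript
// Hypothetical helper: classify warm-up progress from a semantic-status payload.
interface SemanticStatus {
  enabled: boolean;
  keyPresent: boolean;
  capable: boolean;
  embedded: number;
  total: number;
}

// Assumed labels, not values returned by the server.
type CoverageState = "off" | "not-capable" | "warming" | "complete";

function describeCoverage(s: SemanticStatus): { state: CoverageState; percent: number } {
  if (!s.enabled) {
    return { state: "off", percent: 0 };
  }
  if (!s.keyPresent || !s.capable) {
    return { state: "not-capable", percent: 0 };
  }
  // An empty corpus counts as fully covered; this also avoids dividing by zero.
  const percent = s.total === 0 ? 100 : Math.floor((s.embedded / s.total) * 100);
  return { state: s.embedded >= s.total ? "complete" : "warming", percent };
}

const sample: SemanticStatus = {
  enabled: true,
  keyPresent: true,
  capable: true,
  embedded: 21,
  total: 42,
};
console.log(describeCoverage(sample)); // state "warming", percent 50
```

A caller could poll until the state is `complete`, or simply proceed with whatever coverage exists, since search falls back to lexical ranking for pages that have no vectors yet.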
Changed docs re-embed incrementally on subsequent searches.

If the embedder failed to initialize (invalid key, network error, dimension mismatch), search degrades to lexical. Check the server logs. A dimension mismatch error suggests setting `search.semantic.dimensions` to match the provider's output.

To turn semantic search off, run `ok embeddings disable` in the project folder, or toggle it off in Settings. Optionally, run `ok embeddings clear-key` to remove the machine-wide key. Existing cache files under `.ok/local/embeddings/` remain on disk but are unused.

## Related pages

- Full config schema, precedence order, and environment variables including `OK_EMBEDDINGS_API_KEY`.
- `search` parameters, response shape, and the two-tier `search` + `exec` grep model.
- `.ok/local/` layout, project-local scope, and `.okignore` exclusion semantics.
- Server lifecycle, `ok start`, and which MCP reads require Hocuspocus.
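As a rough illustration of the flow under "How ranking works", the sketch below checks the five eligibility conditions and then merges two ranked lists with reciprocal rank fusion (RRF). It is not the server's implementation: `SearchRequest`, `Workspace`, `shouldApplySemantic`, `rrfFuse`, the sample paths, and the constant `k = 60` (a common choice for RRF) are assumptions made for the example, and the real engine also weighs signals such as title boost and recency.

```typescript
// Hypothetical request and workspace shapes, loosely based on the fields on this page.
interface SearchRequest {
  query: string;
  intent: "full_text" | "omnibar" | "autocomplete";
  semantic?: boolean; // assumed: an omitted flag means "not opted in" for HTTP callers
}

interface Workspace {
  enabled: boolean; // search.semantic.enabled in project-local config
  hasKey: boolean; // a key is available from the file store or the environment
}

// All five conditions must hold before any vector fusion happens.
function shouldApplySemantic(req: SearchRequest, ws: Workspace): boolean {
  return (
    ws.enabled &&
    ws.hasKey &&
    req.semantic === true &&
    req.intent === "full_text" &&
    req.query.trim().length >= 3
  );
}

// Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank), rank starting at 1.
function rrfFuse(rankings: string[][], k = 60): Array<{ path: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((path, index) => {
      scores.set(path, (scores.get(path) ?? 0) + 1 / (k + index + 1));
    });
  }
  return Array.from(scores.entries())
    .map(([path, score]) => ({ path, score }))
    .sort((a, b) => b.score - a.score);
}

// Made-up ranked lists: one from the lexical engine, one from vector similarity.
const lexical = ["guides/login", "guides/credential-rotation"];
const vector = ["guides/credential-rotation", "guides/session-tokens"];
const request: SearchRequest = { query: "auth retries", intent: "full_text", semantic: true };

const fused = shouldApplySemantic(request, { enabled: true, hasKey: true })
  ? rrfFuse([lexical, vector])
  : rrfFuse([lexical]); // lexical-only fallback keeps the original order
console.log(fused.map((r) => r.path));
```

In this sample, `guides/credential-rotation` appears in both lists, so its two reciprocal-rank contributions add up and it moves to the top. Because RRF uses only ranks, it needs no score normalization between the lexical and vector engines.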