Agent-readable wiki

Omni macOS: First 30 Minutes

Omni is a native macOS SwiftUI app for on-device semantic search over local files, powered by an in-process MLX-Swift port of jina-embeddings-v5-omni (text, image, video, and audio towers in one shared vector space). This wiki is a guided path from first glance to a working mental model of the engine, the app, and its surfaces.

Pages

Start Here: What Omni Is and the Read Order
The one-paragraph product idea (on-device, no server, airgap-capable semantic search), the three build targets (OmniKit library, Omni app, omni-verify), the vocabulary a new reader needs (towers, retrieval LoRA, bf16 store, priority gate, base+delta search), and the fastest order to read the files that follow.

Setup Signals: Model Discovery, Signing, and First Run
What actually has to be true before Omni runs: where the model directory is located (OMNI_MODEL_DIR, Application Support, HuggingFace cache via ModelLocator), the one-time model download flow, the Apple Team ID / code-signing requirement wired through project.yml, and the onboarding path the user sees on first launch.

OmniEngine: Embedding Towers, LoRA Merge, and the Priority Gate
The heart of OmniKit: how WeightStore loads HF safetensors and merges the retrieval LoRA, how the Qwen3 text tower and Qwen3-VL / Whisper-style towers land in one shared space, and how MLX calls are serialized through a priority gate so an interactive query jumps ahead of in-flight indexing work.

Indexing Pipeline: Crawl, Extract, Chunk, Embed, Store
The incremental ingestion path stage by stage: the file crawler and mtime/size change detection, text/PDF/media extraction, chunking, the concurrent decode stage feeding a single serialized GPU embed stage, and persistence into the SQLite-backed bf16 vector store. Includes the FSWatcher that triggers re-indexing.

Search Path: Query Qualifiers and bf16 Matmul Scoring
How a search-box string becomes results: the dependency-free SearchQueryParser that splits semantic text from key:value qualifiers (type, ext, in, date, after, score, sort), then exact brute-force cosine over the resident bf16 matrix split into a GPU-resident base plus a small delta, with top-K from a bounded min-heap and post-filtering.

The SwiftUI App: AppModel, Views, and Window Commands
The app shell that drives the engine: the @main scene and menu-bar command groups, the AppModel state object that owns indexing and search, the main content layout (sidebar, results list, Quick Look, settings), and how onboarding and updater hooks are wired into launch.

Local Embedding Server: OpenAI / Cohere / Gemini-Compatible APIs
A subsystem the README never mentions: an in-app HTTP server that exposes the engine as drop-in embedding APIs. Covers the Router and auth gate, the single ServingBackend seam onto OmniEngine + VectorStore, the per-provider SchemaAdapters (/v1/embeddings, /v1/embed, /v2/embed, Gemini :embedContent, /v1/search), and the controller/tab/log that manage it.

Verifying the Engine and Where to Go Next
The closing page: how numeric parity is proven rather than assumed (omni-verify against Python-generated fixtures, cosine >= 0.999 with matching token ids, image/video/audio matching the upstream model.py), how the test suite and fixture generators are run, and a short map of what to read next after the first 30 minutes.
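
As a rough illustration of the qualifier split described under Search Path, the following is a minimal Swift sketch of separating semantic text from key:value qualifiers. The names ParsedQuery, parseQuery, and knownKeys are hypothetical and are not Omni's SearchQueryParser API. Only the list of qualifier keys comes from the page description above. The whitespace tokenization and the rule that unknown keys stay in the semantic text are assumptions.

```swift
// Hypothetical result type for illustration; not Omni's actual parser output.
struct ParsedQuery: Equatable {
    var text: String                    // free text to embed
    var qualifiers: [String: String]    // key:value filters, keys lowercased
}

// Qualifier keys named in the Search Path page description.
let knownKeys: Set<String> = ["type", "ext", "in", "date", "after", "score", "sort"]

// Assumption: tokens are separated by spaces, and a token counts as a
// qualifier only when it has a known key and a non-empty value.
func parseQuery(_ input: String) -> ParsedQuery {
    var words: [String] = []
    var qualifiers: [String: String] = [:]
    for token in input.split(separator: " ") {
        // Split on the first colon only, so values may contain colons.
        let parts = token.split(separator: ":", maxSplits: 1,
                                omittingEmptySubsequences: false)
        if parts.count == 2, !parts[1].isEmpty,
           knownKeys.contains(parts[0].lowercased()) {
            qualifiers[parts[0].lowercased()] = String(parts[1])
        } else {
            // Unknown keys (for example "16:9") remain part of the semantic text.
            words.append(String(token))
        }
    }
    return ParsedQuery(text: words.joined(separator: " "), qualifiers: qualifiers)
}

let parsed = parseQuery("sunset photos ext:jpg after:2024-01-01")
print(parsed.text, parsed.qualifiers)
```

In this sketch only the text field would go to the embedding model, while the qualifiers would be applied as filters on the scored results. The real parser may handle quoting, repeated keys, and value validation differently.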

Complete Markdown

The complete agent-readable Markdown files are published separately from this HTML page.