# Why This Repo Matters: A Self-Hosted ChatGPT With Batteries Included

> The hook: Odysseus packs chat, an agent, a hardware-aware model recommender, deep research, memory, email, calendar, and notes into one FastAPI app you can run on your own box — what is special, and what to notice before opening the code.

- Repository: pewdiepie-archdaemon/odysseus
- GitHub: https://github.com/pewdiepie-archdaemon/odysseus
- Human wiki: https://grok-wiki.com/public/wiki/pewdiepie-archdaemon-odysseus-8b8805c93124
- Complete Markdown: https://grok-wiki.com/public/wiki/pewdiepie-archdaemon-odysseus-8b8805c93124/llms-full.txt

## Source Files

- `README.md`
- `pyproject.toml`
- `requirements.txt`
- `ACKNOWLEDGMENTS.md`
- `ROADMAP.md`

---

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:
- [README.md](README.md)
- [ACKNOWLEDGMENTS.md](ACKNOWLEDGMENTS.md)
- [ROADMAP.md](ROADMAP.md)
- [requirements.txt](requirements.txt)
- [docker-compose.yml](docker-compose.yml)
- [app.py](app.py)
- [services/hwfit/fit.py](services/hwfit/fit.py)
- [src/agent_loop.py](src/agent_loop.py)
- [src/deep_research.py](src/deep_research.py)
- [services/memory/memory_vector.py](services/memory/memory_vector.py)
</details>

Odysseus is a single FastAPI app that boots a ChatGPT/Claude-style workspace on your own box and then keeps going: an agent loop, a hardware-aware model recommender, a multi-step deep-research engine, vector memory, IMAP/SMTP email triage, CalDAV calendar sync, notes, and a scheduler — all bundled with three companion services (ChromaDB, SearXNG, ntfy) in one `docker compose up`. There are plenty of "local ChatGPT UI" projects; what makes this one worth a closer look is how much *integrated* surface area lives behind one login. This page is a tour of what is special, what to notice before reading the code, and where the seams are.

## The hook: one repo, one process, ten surfaces

Read the README's feature list as a load-bearing claim, not marketing: the same Python process exposes chat, an agent, a model "Cookbook," deep research, model compare, a documents editor, memory/skills, email, notes/tasks, calendar, and assorted extras (image editor, theme editor, vision uploads, web search, presets, sessions, 2FA). Sources: [README.md:10-22]()

That breadth is the design. `app.py` is described in its own header as a "slim orchestrator" — the work is split into `core/` (auth, DB, middleware, constants), `routes/` (one router file per feature: chat, document, memory, model, calendar, email, notes, vault, webhooks, MCP, …), `services/` (docs, memory, search, hwfit, research, …), and `src/` (LLM core, agent loop, agent tools, chat processor). The `routes/` directory alone holds 47 router modules. Sources: [README.md:184-198](), [app.py:1-48]()

## Why it matters: an admin console, not a chat box

The README and `SECURITY.md` are unusually blunt about this: Odysseus has shell access, file uploads, model downloads, email/calendar integrations, web research, and API tokens — "treat it like an admin console." That sentence is the right mental model for the whole codebase. Sources: [README.md:125-136]()

You can see it in `app.py` itself. There is a hard-timeout middleware that aborts requests after `REQUEST_HARD_TIMEOUT` seconds with a 504, *except* on an explicit allowlist of long-running paths:

```python
# app.py
_TIMEOUT_EXEMPT_PREFIXES = (
    "/api/chat",            # streaming
    "/api/shell/stream",    # SSE
    "/api/research",        # multi-minute jobs
    "/api/model/download",  # tmux setup may run pip installs
    "/api/model/probe",     # SSE; iterates models ...
    "/api/model-endpoints", # /probe sub-route ...
    "/api/cookbook/setup",  # remote pacman/apt installs
    "/api/upload",          # large files
    "/api/image",           # diffusion proxies (inpaint/harmonize/upscale/etc.)
)
```

Sources: [app.py:64-99]()
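The allowlist pattern can be sketched without any web framework. This is a minimal illustration, not the code in `app.py`: the function names, the shortened prefix tuple, and the tiny timeout value are assumptions chosen so the example runs quickly on its own.

```python
import asyncio
from typing import Awaitable, Callable, Tuple

# Hypothetical subset of the real allowlist, for illustration only.
EXEMPT_PREFIXES: Tuple[str, ...] = ("/api/chat", "/api/research", "/api/upload")
HARD_TIMEOUT_SECONDS = 0.05  # deliberately tiny so the demo finishes fast


def is_timeout_exempt(path: str, prefixes: Tuple[str, ...] = EXEMPT_PREFIXES) -> bool:
    """True when the request path starts with any allowlisted prefix."""
    return path.startswith(prefixes)


async def dispatch_with_timeout(
    path: str,
    handler: Callable[[], Awaitable[str]],
    timeout: float = HARD_TIMEOUT_SECONDS,
) -> Tuple[int, str]:
    """Run the handler; a non-exempt path that overruns yields a 504."""
    if is_timeout_exempt(path):
        # Long-running routes (streams, uploads, research jobs) are not capped.
        return 200, await handler()
    try:
        return 200, await asyncio.wait_for(handler(), timeout=timeout)
    except asyncio.TimeoutError:
        return 504, "Gateway Timeout"


async def slow_handler() -> str:
    """Stand-in for a request that takes longer than the hard timeout."""
    await asyncio.sleep(0.2)
    return "done"


if __name__ == "__main__":
    print(asyncio.run(dispatch_with_timeout("/api/notes", slow_handler)))
    print(asyncio.run(dispatch_with_timeout("/api/chat/stream", slow_handler)))
```

Prefix matching keeps the exemption list short, but it also means every sub-route under an exempt prefix inherits the exemption, which is worth remembering when adding new routes.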

Auth is a layered middleware with four entry points:

- cookie sessions for the UI;
- bearer tokens (`ody_…`) for integrations, cached in-process with bcrypt verification on a cache miss;
- a loopback-only `X-Odysseus-Internal-Tool` header so the agent can call admin-gated routes during a turn;
- an opt-in `LOCALHOST_BYPASS` for development.

Bearer-token requests are pinned to `current_user = "api"` so they cannot ride normal cookie routes. Sources: [app.py:104-271]()
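To make the "loopback-only header" idea concrete, here is a small sketch of how such a check could be written. Only the header name comes from the text above. The function names and the per-process shared secret are assumptions added for illustration, and the real middleware may validate the header differently.

```python
import hmac
import ipaddress
from typing import Mapping, Optional

INTERNAL_HEADER = "x-odysseus-internal-tool"  # header name from the text, lowercased


def is_loopback(host: Optional[str]) -> bool:
    """True for 127.0.0.0/8 and ::1; False for missing or unparsable hosts."""
    if not host:
        return False
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False


def internal_tool_allowed(
    client_host: Optional[str],
    headers: Mapping[str, str],
    expected_secret: str,  # hypothetical per-process secret, not confirmed by the source
) -> bool:
    """Honor the internal header only when the caller is on the loopback interface."""
    if not is_loopback(client_host):
        return False
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {k.lower(): v for k, v in headers.items()}
    presented = normalized.get(INTERNAL_HEADER)
    if presented is None:
        return False
    return hmac.compare_digest(presented.encode(), expected_secret.encode())
```

Checking the peer address first means a forged header from a remote client is rejected before its value is even read. Behind a reverse proxy the peer address is the proxy's, so a real deployment would need to account for that.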

## The mechanism, one layer at a time

### Provider-neutral LLM plumbing

Chat is wired around `stream_llm()` / `stream_llm_with_fallback()` in `src/llm_core`, with model endpoints stored in the DB and auto-detected at startup. The README lists vLLM, llama.cpp, Ollama, OpenRouter, and OpenAI as adapters; a startup task probes `http://localhost:11434/v1/models` and, if Ollama is running, silently inserts a `ModelEndpoint` row called "Ollama (local)". Sources: [README.md:11](), [app.py:899-930](), [src/agent_loop.py:17]()
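The startup probe can be approximated with the standard library alone. The sketch below assumes the endpoint returns an OpenAI-style model listing (`{"data": [{"id": ...}]}`). The function names are hypothetical, and the database insert that the real startup task performs is left out.

```python
import http.client
import json
import urllib.error
import urllib.request
from typing import Any, List, Optional

OLLAMA_MODELS_URL = "http://localhost:11434/v1/models"  # URL quoted in the text above


def parse_model_ids(payload: Any) -> List[str]:
    """Pull model ids out of an OpenAI-style {"data": [{"id": ...}]} listing."""
    if not isinstance(payload, dict):
        return []
    return [m["id"] for m in payload.get("data", []) if isinstance(m, dict) and "id" in m]


def probe_models(url: str = OLLAMA_MODELS_URL, timeout: float = 2.0) -> Optional[List[str]]:
    """Return the model ids served at url, or None when nothing usable answers."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_model_ids(json.load(resp))
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, bad status line, or non-JSON body:
        # treat all of them as "no local server", never as a startup failure.
        return None
```

Returning `None` instead of raising keeps the probe optional: a machine without a local server simply ends up with no auto-registered endpoint.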

There is no hard-coded provider in the agent loop. The system prompt is provider-agnostic; tools are surfaced through `agent_tools` and an MCP manager that's loaded at startup with a 20-second timeout so a flaky MCP server cannot block the UI. Sources: [app.py:712-725](), [src/agent_loop.py:22-35]()

### The agent loop: fenced blocks + MCP

`src/agent_loop.py` (≈2,100 lines) wraps streaming completions in a multi-round tool-execution loop. The model emits tool calls as fenced code blocks; the loop parses, executes, formats, and feeds results back. The same loop integrates a built-in MCP client, with per-server "disabled tools" loaded from the `McpServer` table — so an operator can ban specific tools without uninstalling the server. Sources: [src/agent_loop.py:1-56]()
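One round of a fenced-block tool loop might look like the sketch below. The `tool:<name>` info string, the JSON argument body, and the function names are assumptions made for this example. The source does not spell out the exact fence format `agent_loop.py` expects.

```python
import json
import re
from typing import Callable, Dict, List, Set, Tuple

FENCE = "`" * 3  # built programmatically so this example can sit inside a fence itself

# Hypothetical convention: a fence tagged "tool:<name>" whose body is a JSON object.
TOOL_BLOCK = re.compile(
    FENCE + r"tool:(?P<name>[\w.-]+)\n(?P<body>.*?)\n" + FENCE,
    re.DOTALL,
)


def parse_tool_calls(text: str) -> List[Tuple[str, dict]]:
    """Extract (tool_name, arguments) pairs from a model reply."""
    calls = []
    for match in TOOL_BLOCK.finditer(text):
        try:
            args = json.loads(match.group("body"))
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of aborting the round
        if isinstance(args, dict):
            calls.append((match.group("name"), args))
    return calls


def run_round(
    reply: str,
    tools: Dict[str, Callable[..., str]],
    disabled: Set[str],
) -> List[str]:
    """Execute each parsed call and format the result for the next model turn."""
    results = []
    for name, args in parse_tool_calls(reply):
        if name in disabled:
            results.append(f"[{name}] disabled by operator")
        elif name not in tools:
            results.append(f"[{name}] unknown tool")
        else:
            results.append(f"[{name}] {tools[name](**args)}")
    return results
```

The `disabled` set plays the role of the per-server "disabled tools" list: the tool stays registered, but calls to it are answered with a refusal that the model can read on the next round.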

### Cookbook: a hardware-aware model recommender

The "what model fits on my GPU?" feature is more concrete than most readmes admit. `services/hwfit/fit.py` ships a hard-coded GPU memory-bandwidth table (RTX 50/40/30/20/16-series, H100/H200, A100, L40S, Radeon 7000/6000, MI300/250/210/100, AMD 9070) and a per-use-case scoring weight vector for `general / coding / reasoning / chat / multimodal / embedding / tts / stt`. Speed is estimated from `bandwidth / model_size` with a 0.55 efficiency factor, then adjusted for MoE (active params only) and CPU offload. Sources: [services/hwfit/fit.py:9-86]()

That table is why the Cookbook can recommend a quant and serve mode (vLLM vs llama.cpp) instead of just listing model names. It is also why an operator on a mid-range or older card gets an estimate derived from that card's memory bandwidth rather than a vendor's headline throughput.
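The bandwidth-bound estimate described above reduces to a one-line formula. The sketch below is an illustration under stated assumptions: the function name and the `active_fraction` parameter are hypothetical, MoE is modeled simply as reading only the active share of the weights per token, and CPU offload is not modeled at all. Only the 0.55 efficiency factor and the `bandwidth / model_size` shape come from the text.

```python
EFFICIENCY = 0.55  # efficiency factor quoted in the text above


def estimate_tokens_per_second(
    bandwidth_gb_s: float,
    model_size_gb: float,
    active_fraction: float = 1.0,
) -> float:
    """Estimate decode speed for a memory-bandwidth-bound model.

    active_fraction is the share of weights read per token: 1.0 for a dense
    model, active_params / total_params for a mixture-of-experts model
    (an assumption about how the MoE adjustment could be expressed).
    """
    if bandwidth_gb_s <= 0 or model_size_gb <= 0:
        raise ValueError("bandwidth and model size must be positive")
    if not 0 < active_fraction <= 1:
        raise ValueError("active_fraction must be in (0, 1]")
    gb_read_per_token = model_size_gb * active_fraction
    return bandwidth_gb_s / gb_read_per_token * EFFICIENCY


if __name__ == "__main__":
    # 1008 GB/s is the published memory bandwidth of an RTX 4090; it is passed
    # in by hand here, not read from the repo's table.
    print(round(estimate_tokens_per_second(1008, 8), 1))         # dense 8 GB model
    print(round(estimate_tokens_per_second(1008, 26, 0.25), 1))  # MoE, 25% active
```

Under this formula a dense 8 GB model on a 1008 GB/s card comes out at roughly 69 tokens per second, which shows why the quant choice (and therefore the model size in memory) dominates the recommendation.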

### Deep Research: an iterative LLM-in-the-loop

`src/deep_research.py` describes itself as an "IterResearch-style" Think → Search → Extract → Synthesize loop, adapted from Alibaba's Tongyi DeepResearch (Apache-2.0). Round zero generates a research plan (sub-questions, key topics, success criteria) as JSON; subsequent rounds generate fresh queries against "what we know so far" and a round counter. The synthesis output flows into `src/visual_report.py`, which is why `markdown` is a hard core dependency, not optional. Sources: [src/deep_research.py:1-60](), [ACKNOWLEDGMENTS.md:33-39](), [requirements.txt:21-23]()
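The control flow of such a loop can be sketched with the LLM and the search backend injected as plain callables. Everything here is a simplified assumption: the prompt strings, the function names, and the stop condition (an empty query list) are invented for the example and do not mirror `deep_research.py`.

```python
import json
from typing import Callable, Dict, List


def run_research(
    question: str,
    llm: Callable[[str], str],
    search: Callable[[str], List[str]],
    max_rounds: int = 3,
) -> Dict[str, object]:
    """Plan once, then iterate: generate queries, search, extract, and finally synthesize."""
    # Round zero: ask for a JSON plan (sub-questions, key topics, success criteria).
    plan = json.loads(llm(f"PLAN: {question}"))
    findings: List[str] = []

    for round_no in range(1, max_rounds + 1):
        known = "\n".join(findings) or "(nothing yet)"
        # Fresh queries are conditioned on what is known so far and the round counter.
        queries = json.loads(llm(f"QUERIES round={round_no} known={known}"))
        if not queries:
            break  # the model signals it has nothing left to look up
        for query in queries:
            for snippet in search(query):
                extracted = llm(f"EXTRACT: {snippet}")
                if extracted and extracted not in findings:
                    findings.append(extracted)

    report = llm("SYNTHESIZE: " + "\n".join(findings))
    return {"plan": plan, "findings": findings, "report": report}


def fake_llm(prompt: str) -> str:
    """Deterministic stand-in for a model, keyed on the prompt prefix."""
    if prompt.startswith("PLAN:"):
        return json.dumps({"sub_questions": ["what is X?"], "key_topics": ["X"]})
    if prompt.startswith("QUERIES"):
        return json.dumps(["X definition"]) if "round=1 " in prompt else "[]"
    if prompt.startswith("EXTRACT: "):
        return prompt[len("EXTRACT: "):].upper()
    return "REPORT(" + prompt[len("SYNTHESIZE: "):] + ")"


def fake_search(query: str) -> List[str]:
    """Deterministic stand-in for a search client."""
    return [f"snippet about {query}"]


if __name__ == "__main__":
    print(run_research("What is X?", fake_llm, fake_search)["report"])
```

Injecting `llm` and `search` keeps the loop testable offline, and it matches the provider-neutral spirit of the rest of the codebase.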

### Memory: ChromaDB + ONNX embeddings, with a fallback

Semantic memory lives in `services/memory/memory_vector.py` on a Chroma collection literally named `odysseus_memories`, with cosine HNSW and pre-computed embeddings from `EmbeddingClient`. `chromadb-client` and `fastembed` are in `requirements.txt` with an explicit comment that they're "installed by default — the app still degrades to keyword fallback if they're ever missing." That graceful-degrade pattern shows up again in the roadmap's "Better degraded-state reporting" item. Sources: [services/memory/memory_vector.py:1-50](), [requirements.txt:13-19](), [ROADMAP.md:19]()
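The degrade-to-keywords pattern is easy to sketch in isolation. In this illustration the vector backend is passed in as an optional callable, so the fallback path can be exercised without ChromaDB installed. The function names and the word-overlap scoring are assumptions, not the scoring used in `memory_vector.py`.

```python
import re
from typing import Callable, List, Optional, Sequence

VectorSearch = Callable[[str, int], Sequence[str]]


def keyword_score(query: str, text: str) -> int:
    """Count distinct words shared by the query and the text (case-insensitive)."""
    query_words = set(re.findall(r"\w+", query.lower()))
    text_words = set(re.findall(r"\w+", text.lower()))
    return len(query_words & text_words)


def search_memories(
    query: str,
    memories: Sequence[str],
    vector_search: Optional[VectorSearch] = None,
    top_k: int = 3,
) -> List[str]:
    """Prefer the vector backend; fall back to keyword overlap if it is absent or fails."""
    if vector_search is not None:
        try:
            return list(vector_search(query, top_k))
        except Exception:
            pass  # backend down or misconfigured: degrade instead of failing the request
    scored = [(keyword_score(query, m), i, m) for i, m in enumerate(memories)]
    # Highest score first; ties keep their original order via the index.
    hits = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [memory for _, _, memory in hits[:top_k]]
```

The broad `except` is deliberate in a sketch like this, because any backend failure should lead to the same fallback. A production version would at least log the exception so the degraded state is visible.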

### Email, calendar, notes, tasks

Email is IMAP/SMTP with per-account routing and AI triage (urgency, auto-tag, auto-summary, auto-reply, auto-spam) — see the polling/handler split in `routes/email_pollers.py` and `routes/email_helpers.py`. Calendar uses `icalendar` for `.ics` import/export and the `caldav` library for PROPFIND/REPORT sync against Radicale, Nextcloud, Apple, and Fastmail; `ACKNOWLEDGMENTS.md` notes that `caldav` is dual-licensed and used under Apache-2.0 specifically to keep the core permissive. Notes/tasks ship with a `croniter`-based scheduler and ntfy/browser/email channels. Sources: [requirements.txt:24-35](), [ACKNOWLEDGMENTS.md:111-115](), [ACKNOWLEDGMENTS.md:139-155]()

## Surprising details a README reader will miss

- **It's not just MIT throughout; license hygiene is explicit.** `pypdf` (BSD) handles PDF text in place of PyMuPDF (AGPL), `charset-normalizer` (MIT) replaces `chardet` (LGPL), and PyMuPDF is quarantined to `src/pdf_forms.py` and the optional requirements file so the MIT core can run without it. AGPL only "activates" if you install the optional form-filling feature. Sources: [ACKNOWLEDGMENTS.md:139-155]()
- **Startup kicks off a background-job monitor and three warm-ups in parallel before the first request.** The monitor re-invokes the agent when a `#!bg` shell job completes. The warm-ups are an MCP `register_builtin_servers` + `connect_all_enabled` pass with a 20-second budget, a RAG tool-index warmup that pre-loads embeddings, opens Chroma, and indexes built-in tools, and an LLM endpoint warmup ping (looped every 60s). This is the kind of work an "MVP" usually skips. Sources: [app.py:700-773]()
- **Static assets are served with `Cache-Control: no-cache` for `.js/.css/.html`.** A custom `_RevalidatingStatic` mount forces conditional revalidation because the app ships raw ES modules with no build step or versioned URLs — without it, deploys would not show up until a manual hard refresh. Sources: [app.py:277-293]()
- **Bearer tokens are cached in-process with explicit invalidation.** `_token_cache` maps prefix → list of `(id, hash, owner, scopes)`. The DB rebuild is triggered by `app.state._token_cache_dirty`, flipped by API-token routes on create/revoke. `last_used_at` is updated fire-and-forget on a `to_thread` task so bcrypt + DB write doesn't sit on the request. Sources: [app.py:136-250]()
- **Nightly "skill audit" job.** At ~02:00 local, the scheduler tests + judges the least-recently-checked entries in the skill library and auto-fixes/escalates weak ones (never deletes). Gated by `skill_audit_nightly`, hour by `skill_audit_hour`, batch by `skill_audit_batch`. Sources: [app.py:875-898]()
- **Disk-backed skill files get repaired on boot.** Orphaned/`test-owner` `SKILL.md` files get reassigned to the primary admin so a fresh deploy doesn't see an empty library. A periodic null-owner sweep does the same for anything created while auth was disabled or localhost-bypassed. Sources: [app.py:822-868]()
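
The token-cache bullet above describes a pattern that is reusable on its own: index token hashes by prefix, rebuild lazily when a dirty flag is set, and only run the slow hash against the few candidates sharing that prefix. The sketch below is a hedged illustration. The class and field names are hypothetical, the prefix length is arbitrary, and PBKDF2 from the standard library stands in for bcrypt so the example has no third-party dependency.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, Tuple

PREFIX_LEN = 8  # hypothetical: number of leading characters used as the cache key


@dataclass(frozen=True)
class TokenRow:
    id: int
    salt: bytes
    hash: bytes
    owner: str
    scopes: Tuple[str, ...]


def slow_hash(token: str, salt: bytes) -> bytes:
    """Stand-in for bcrypt: a deliberately slow, salted hash from the stdlib."""
    return hashlib.pbkdf2_hmac("sha256", token.encode(), salt, 10_000)


class TokenCache:
    """Prefix-indexed cache that is rebuilt from storage only when marked dirty."""

    def __init__(self, load_rows: Callable[[], Iterable[Tuple[str, TokenRow]]]):
        self._load_rows = load_rows  # returns (prefix, row) pairs from the database
        self._by_prefix: Dict[str, List[TokenRow]] = {}
        self.dirty = True            # create/revoke routes would set this to True
        self.rebuilds = 0

    def _rebuild(self) -> None:
        self._by_prefix = {}
        for prefix, row in self._load_rows():
            self._by_prefix.setdefault(prefix, []).append(row)
        self.dirty = False
        self.rebuilds += 1

    def verify(self, token: str) -> Optional[TokenRow]:
        """Return the matching row, or None if the token is unknown."""
        if self.dirty:
            self._rebuild()
        for row in self._by_prefix.get(token[:PREFIX_LEN], []):
            if hmac.compare_digest(slow_hash(token, row.salt), row.hash):
                return row
        return None
```

The dirty flag trades a brief window of staleness for never touching the database on the hot path, which is why the create and revoke routes have to flip it explicitly.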

## Tradeoffs the README and ROADMAP openly admit

ROADMAP.md is unusually honest about where the seams are: "SQUASH BUGS" is item 1, Cookbook reliability "across different machines, GPUs, drivers, shells, and Python environments" is called out as the area most likely to break, popup/dropdown placement inside transformed modals is a known brute-force fix, and `static/style.css` is "basically Calypso's island atm." The author also notes that "most of Odysseus's code was written *with* AI models, not just by a human." That context matters when reading the code — expect lots of inline comments explaining *why*, lots of route files, and the occasional load-bearing comment a maintainer left for themselves. Sources: [ROADMAP.md:8-30](), [ACKNOWLEDGMENTS.md:158-168]()

The security posture is the other big tradeoff. The README repeats it three times in one section: keep `AUTH_ENABLED=true`, don't expose to the public internet without HTTPS + a trusted reverse proxy, and review per-user privileges before exposing a deployment. The non-admin default is sensible (no shell/Python/file read-write, admin-gated MCP/API-token/webhook/cookbook/backup/settings), but the surface area is real. Sources: [README.md:125-149]()

## How the pieces fit (one diagram)

```mermaid
flowchart LR
    subgraph Client["Browser PWA"]
      UI["static/index.html + js/"]
    end

    subgraph Process["FastAPI process (app.py)"]
      MW["Middleware:\nSecurityHeaders + Timeout + Auth\n(cookie / Bearer ody_… / loopback)"]
      subgraph Routes["routes/ (47 router files)"]
        Chat["chat_routes"]
        Doc["document_routes"]
        Mem["memory_routes"]
        Cal["calendar_routes"]
        Mail["email_routes"]
        Cook["cookbook_routes / hwfit_routes"]
        Res["research_routes"]
      end
      subgraph SrcCore["src/ engine"]
        Loop["agent_loop.py\n(stream + tool rounds)"]
        Tools["agent_tools / tool_index\n+ MCP manager"]
        LLM["llm_core (provider-neutral)"]
        DR["deep_research.py\n(IterResearch loop)"]
      end
      subgraph Svc["services/"]
        Hw["hwfit/ fit + hardware\n(GPU bandwidth table)"]
        MemSvc["memory/ memory_vector\n(ChromaDB)"]
        Search["search/ (SearXNG client)"]
      end
    end

    subgraph Companions["Compose-bundled"]
      Chroma["ChromaDB :8000"]
      Sx["SearXNG :8080"]
      Ntfy["ntfy :80"]
    end

    subgraph External["External / BYOK"]
      Provs["LLM endpoints\nvLLM / llama.cpp / Ollama /\nOpenAI / OpenRouter / Anthropic / Gemini"]
      IMAP["IMAP / SMTP"]
      DAV["CalDAV (Radicale / Nextcloud / Apple / Fastmail)"]
    end

    UI --> MW --> Routes
    Routes --> SrcCore
    SrcCore --> Svc
    MemSvc --> Chroma
    Search --> Sx
    Routes --> Ntfy
    LLM --> Provs
    Mail --> IMAP
    Cal --> DAV
```

Sources: [README.md:184-198](), [app.py:104-271](), [docker-compose.yml:1-79](), [src/agent_loop.py:17-35](), [services/memory/memory_vector.py:14-50]()

## What builders should notice

If you are evaluating Odysseus to fork, extend, or just borrow ideas:

| What | Where | Why it's worth a closer look |
|---|---|---|
| Provider-neutral LLM core | `src/llm_core.py`, `src/model_discovery.py`, `app.py:899-930` | Same code path serves vLLM, llama.cpp, Ollama, and OpenAI-compatible HTTP. No vendor lock. |
| Fenced-block agent loop with MCP | `src/agent_loop.py` | Multi-round tool execution that works with any model that can write code fences, plus an MCP manager for external tools. |
| Hardware-aware recommender | `services/hwfit/fit.py`, `services/hwfit/hardware.py` | Concrete bandwidth table + quant-aware speed/quality model. Reusable outside the UI via `scripts/odysseus-cookbook`. |
| Iterative research engine | `src/deep_research.py`, `services/research/` | A real Think → Search → Extract → Synthesize loop, seeded by a round-zero research plan; reuses the same LLM/search infra. |
| ChromaDB + fastembed memory with keyword fallback | `services/memory/memory_vector.py`, `requirements.txt:13-19` | Graceful degrade is a design principle here, not an afterthought. |
| Auth + token caching pattern | `app.py:136-271` | Worth reading even if you don't use the rest — it's a clean example of bcrypt-cache + invalidation + loopback bypass for in-process tools. |
| Bundled-but-detachable companion services | `docker-compose.yml`, `ACKNOWLEDGMENTS.md:42-53` | ChromaDB, SearXNG, ntfy are pulled as upstream images, not vendored. You can swap or remove them. |
| Honest roadmap | `ROADMAP.md` | The author flags where the code is brittle. Take it at face value. |

Odysseus is interesting less because any one of its features is novel and more because the integration cost (auth, scheduling, memory, search, tools, providers, calendar, email) has actually been paid in one repo, under a permissive core license, on a stack you can run on a laptop or a home server. If you're considering self-hosted AI workspaces, it's worth at least a `docker compose up` before you build your own.