# The Python Backend: FastAPI Server

> The FastAPI server is the engine room: it hosts the REST + SSE API, manages the SQLite/Alembic database, runs background export tasks, validates LLM responses, and serves static assets. Every LLM call, slide write, and export goes through here.

- Repository: presenton/presenton
- GitHub: https://github.com/presenton/presenton
- Human wiki: https://grok-wiki.com/public/wiki/presenton-presenton-f6685dc028cc
- Complete Markdown: https://grok-wiki.com/public/wiki/presenton-presenton-f6685dc028cc/llms-full.txt

---

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:

- [servers/fastapi/api/main.py](servers/fastapi/api/main.py)
- [servers/fastapi/api/lifespan.py](servers/fastapi/api/lifespan.py)
- [servers/fastapi/api/middlewares.py](servers/fastapi/api/middlewares.py)
- [servers/fastapi/services/database.py](servers/fastapi/services/database.py)
- [servers/fastapi/services/export_task_service.py](servers/fastapi/services/export_task_service.py)
- [servers/fastapi/api/v1/ppt/router.py](servers/fastapi/api/v1/ppt/router.py)
- [servers/fastapi/api/v1/ppt/endpoints/presentation.py](servers/fastapi/api/v1/ppt/endpoints/presentation.py)
- [servers/fastapi/api/v1/ppt/endpoints/outlines.py](servers/fastapi/api/v1/ppt/endpoints/outlines.py)
- [servers/fastapi/models/sql/presentation.py](servers/fastapi/models/sql/presentation.py)
- [servers/fastapi/models/sql/slide.py](servers/fastapi/models/sql/slide.py)
- [servers/fastapi/models/sse_response.py](servers/fastapi/models/sse_response.py)
- [servers/fastapi/utils/user_config.py](servers/fastapi/utils/user_config.py)
- [servers/fastapi/utils/llm_calls/generate_presentation_outlines.py](servers/fastapi/utils/llm_calls/generate_presentation_outlines.py)
- [servers/fastapi/migrations.py](servers/fastapi/migrations.py)
</details>

The FastAPI server at `servers/fastapi/` is the central engine of Presenton. Every user action (typing a topic, picking a template, watching slides appear one by one, downloading a finished deck) flows through this server. It hosts the REST and Server-Sent Events (SSE) API, manages the SQLite database, routes calls to the configured LLM provider, runs Node.js export subprocesses, and serves static assets to the Next.js frontend.

This page walks through how all those responsibilities are organized: application startup, middleware, routing, database, LLM calls, streaming, and the export pipeline.

---

## Application Entry Point

The application object is created in `main.py` using FastAPI's standard constructor, wired to a lifespan context manager for startup/shutdown logic:

```python
# servers/fastapi/api/main.py:57
app = FastAPI(lifespan=app_lifespan)
```

Four top-level routers are registered immediately after:

| Router import | Mounted prefix |
|---|---|
| `API_V1_PPT_ROUTER` | `/api/v1/ppt` |
| `API_V1_WEBHOOK_ROUTER` | `/api/v1/webhook` |
| `API_V1_MOCK_ROUTER` | `/api/v1/mock` |
| `API_V1_AUTH_ROUTER` | `/api/v1/auth` |

Two `StaticFiles` mounts are added next: one for `/app_data` (user-generated images, exported files) and one for `/static` (bundled UI icons and assets). A `static_icon_fallback_middleware` intercepts any 404 under `/static/icons/` and serves `placeholder.svg` instead, so the UI does not show a broken image when Phosphor icon names change between versions.

```python
# servers/fastapi/api/main.py:89-101
@app.middleware("http")
async def static_icon_fallback_middleware(request: Request, call_next):
    ...
    if not path.startswith("/static/icons/"):
        return response
    placeholder = get_resource_path("static/icons/placeholder.svg")
    ...
    return FileResponse(placeholder, media_type="image/svg+xml")
```

Sentry error tracking is initialized before the app object is built (`_maybe_init_sentry()`), pulling `SENTRY_DSN`, `SENTRY_TRACES_SAMPLE_RATE`, and `SENTRY_SEND_DEFAULT_PII` from environment variables. The SDK import is guarded in a try/except so Sentry remains optional in builds that omit it.
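The optional-dependency guard described above can be sketched as follows. This is a minimal illustration, not the project's `_maybe_init_sentry()`: the function name `maybe_init_sentry`, the default values, and the boolean return are assumptions made for the example.

```python
import importlib
import os
from typing import Mapping


def maybe_init_sentry(environ: Mapping[str, str] = os.environ) -> bool:
    """Initialise Sentry only when a DSN is set and the SDK can be imported."""
    dsn = environ.get("SENTRY_DSN")
    if not dsn:
        return False  # Nothing configured: error tracking stays off.
    try:
        # Import lazily so builds without the SDK still start.
        sentry_sdk = importlib.import_module("sentry_sdk")
    except ImportError:
        return False
    sentry_sdk.init(
        dsn=dsn,
        # Defaults below are assumptions for this sketch.
        traces_sample_rate=float(environ.get("SENTRY_TRACES_SAMPLE_RATE", "0")),
        send_default_pii=environ.get("SENTRY_SEND_DEFAULT_PII", "false").lower() == "true",
    )
    return True
```

Calling the guard before constructing the app keeps startup identical whether or not the SDK is installed.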

Sources: [servers/fastapi/api/main.py:1-101]()

---

## Lifespan: Startup and Shutdown

`app_lifespan` is an `asynccontextmanager` that runs once at process start and once at shutdown. In order, it:

1. Configures Python logging from the `LOG_LEVEL` environment variable (default `INFO`).
2. Ensures the `APP_DATA_DIRECTORY` folder exists on disk.
3. Calls `migrate_database_on_startup()` (Alembic migrations, when `MIGRATE_DATABASE_ON_STARTUP=true`).
4. Calls `create_db_and_tables()` (SQLModel `CREATE TABLE IF NOT EXISTS` for all registered models).
5. Bootstraps single-user credentials from `AUTH_USERNAME`/`AUTH_PASSWORD` env vars if provided.
6. Calls `check_llm_and_image_provider_api_or_model_availability()` to warn early if an API key or model is missing.
7. On shutdown, disposes the async SQLAlchemy connection pool via `dispose_engines()`.

```python
# servers/fastapi/api/lifespan.py:83-100
@asynccontextmanager
async def app_lifespan(_: FastAPI):
    _configure_application_logging()
    os.makedirs(get_app_data_directory_env(), exist_ok=True)
    await migrate_database_on_startup()
    await create_db_and_tables()
    _bootstrap_auth_from_env()
    await check_llm_and_image_provider_api_or_model_availability()
    yield
    await dispose_engines()
```

The auth bootstrap supports three environment-driven scenarios: `RESET_AUTH=true` wipes stored credentials (recovery), setting `AUTH_USERNAME`+`AUTH_PASSWORD` on a fresh instance seeds credentials without visiting the UI (first-run preseed), and `AUTH_OVERRIDE_FROM_ENV=true` forces an overwrite of existing credentials. Errors in this block are caught and logged rather than crashing startup.
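The three scenarios can be read as a small decision function. The sketch below is illustrative only: `plan_auth_bootstrap`, `bootstrap_auth_safely`, the action names, and the precedence between the flags are assumptions, not taken from `lifespan.py`.

```python
from typing import Callable, Mapping


def _flag(env: Mapping[str, str], name: str) -> bool:
    """Treat the string "true" (any case) as an enabled flag."""
    return env.get(name, "").strip().lower() == "true"


def plan_auth_bootstrap(env: Mapping[str, str], has_stored_credentials: bool) -> str:
    """Return the bootstrap action: "reset", "seed", "override", or "noop"."""
    if _flag(env, "RESET_AUTH"):
        return "reset"  # Recovery: wipe stored credentials.
    have_env_creds = bool(env.get("AUTH_USERNAME")) and bool(env.get("AUTH_PASSWORD"))
    if not have_env_creds:
        return "noop"
    if not has_stored_credentials:
        return "seed"  # First-run preseed on a fresh instance.
    if _flag(env, "AUTH_OVERRIDE_FROM_ENV"):
        return "override"  # Forced overwrite of existing credentials.
    return "noop"


def bootstrap_auth_safely(
    env: Mapping[str, str],
    has_stored_credentials: bool,
    apply: Callable[[str], None],
    log: Callable[[str], None] = print,
) -> None:
    """Apply the planned action; log failures instead of crashing startup."""
    try:
        apply(plan_auth_bootstrap(env, has_stored_credentials))
    except Exception as exc:  # Startup must continue even if bootstrap fails.
        log(f"auth bootstrap failed: {exc}")
```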

Sources: [servers/fastapi/api/lifespan.py:36-100]()

---

## Middleware Stack

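Two custom `BaseHTTPMiddleware` classes sit in the stack. Starlette, which FastAPI builds on, wraps each newly registered middleware around the ones registered before it, so the last-registered middleware is outermost and sees each request first.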
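The ordering rule is easy to get backwards, so here is a framework-free sketch of it. It models middleware as plain wrapper functions; the names `make_middleware`, `build_stack`, and the registration labels are hypothetical and only mimic how a Starlette-style stack nests.

```python
from typing import Callable, List

Handler = Callable[[str, List[str]], str]


def make_middleware(name: str) -> Callable[[Handler], Handler]:
    """Build a wrapper that records when the request enters and leaves it."""
    def wrap(inner: Handler) -> Handler:
        def handler(request: str, trace: List[str]) -> str:
            trace.append(f"{name}:in")
            response = inner(request, trace)
            trace.append(f"{name}:out")
            return response
        return handler
    return wrap


def build_stack(endpoint: Handler, registered: List[str]) -> Handler:
    """Wrap the endpoint so each later registration encloses the earlier ones."""
    app = endpoint
    for name in registered:
        app = make_middleware(name)(app)
    return app


def endpoint(request: str, trace: List[str]) -> str:
    trace.append("endpoint")
    return "ok"


trace: List[str] = []
stack = build_stack(endpoint, ["registered_first", "registered_second"])
result = stack("GET /", trace)
# trace now shows registered_second entering before registered_first.
```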

### `UserConfigEnvUpdateMiddleware`

Runs on every request (unless `CAN_CHANGE_KEYS=false`). It reads a `userConfig.json` file from disk and merges the stored LLM/image provider credentials into the process's environment variables. This is how a user can configure API keys through the UI without restarting the container: the settings are persisted to disk and re-applied from the JSON file on each request.

```python
# servers/fastapi/api/middlewares.py:15-19
class UserConfigEnvUpdateMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        if get_can_change_keys_env() != "false":
            update_env_with_user_config()
        return await call_next(request)
```
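What `update_env_with_user_config()` does internally is not shown here, so the following is only a sketch of the general idea: read a JSON file and copy selected values into an environment mapping. The helper name `merge_user_config`, the allowlist, and the assumption that JSON keys match environment variable names are all hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import List, MutableMapping

# Hypothetical allowlist; the real set of keys lives in utils/user_config.py.
ALLOWED_KEYS = {"LLM", "OPENAI_API_KEY", "OPENAI_MODEL", "ANTHROPIC_API_KEY", "ANTHROPIC_MODEL"}


def merge_user_config(config_path: Path, environ: MutableMapping[str, str]) -> List[str]:
    """Copy allowed, non-empty string values from the JSON file into environ."""
    try:
        stored = json.loads(config_path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return []  # Missing or corrupt file: leave the environment untouched.
    if not isinstance(stored, dict):
        return []
    updated = []
    for key in sorted(ALLOWED_KEYS.intersection(stored)):
        value = stored[key]
        if isinstance(value, str) and value:
            environ[key] = value
            updated.append(key)
    return updated


# Usage with a throwaway file and a plain dict standing in for os.environ.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "userConfig.json"
    path.write_text(
        json.dumps({"LLM": "openai", "OPENAI_API_KEY": "sk-test", "UNRELATED": "x"}),
        encoding="utf-8",
    )
    demo_env: dict = {}
    changed = merge_user_config(path, demo_env)
```

Because the merge runs per request, a file read on every call is the cost of picking up UI changes without a restart.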

### `SessionAuthMiddleware`

Guards all `/api/` routes and `/app_data/` paths (except `/app_data/images/`, which must be accessible by the export subprocess). The check is skipped entirely when `DISABLE_AUTH=true`.

The gate works in two layers:

1. **Session token check**: reads a cookie (`SESSION_COOKIE_NAME`) and validates it against stored credentials. Returns HTTP status 428 ("setup required") if no credentials exist yet.
2. **Basic auth fallback**: if no session cookie is present, the middleware tries HTTP Basic credentials from the `Authorization` header, so headless API callers (including the PPTX exporter) can authenticate without a session cookie.

The `/api/v1/auth/` prefix is always exempt so the login endpoint itself can be reached.
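The path rules and the Basic auth fallback can be sketched with the standard library alone. The helpers `requires_auth`, `parse_basic_auth`, and `credentials_match` are hypothetical names, and comparing plaintext credentials is a simplification for the example; how the real middleware stores and checks credentials is not shown in this page.

```python
import base64
import binascii
import hmac
from typing import Optional, Tuple


def requires_auth(path: str, disable_auth: bool = False) -> bool:
    """Decide whether a request path must pass the auth gate."""
    if disable_auth:
        return False
    if path.startswith("/api/v1/auth/"):
        return False  # Login endpoints must stay reachable.
    if path.startswith("/app_data/images/"):
        return False  # The export subprocess needs to load images.
    return path.startswith("/api/") or path.startswith("/app_data/")


def parse_basic_auth(header: Optional[str]) -> Optional[Tuple[str, str]]:
    """Extract (username, password) from an Authorization header, if valid."""
    if not header:
        return None
    scheme, _, encoded = header.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None
    username, separator, password = decoded.partition(":")
    if not separator:
        return None
    return username, password


def credentials_match(supplied: Tuple[str, str], expected: Tuple[str, str]) -> bool:
    """Compare both fields in constant time to avoid timing leaks."""
    user_ok = hmac.compare_digest(supplied[0].encode("utf-8"), expected[0].encode("utf-8"))
    pass_ok = hmac.compare_digest(supplied[1].encode("utf-8"), expected[1].encode("utf-8"))
    return user_ok and pass_ok
```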

Sources: [servers/fastapi/api/middlewares.py:1-83]()

---

## Database Layer

The database layer is built on **SQLAlchemy async** + **SQLModel** (a thin Pydantic-aware wrapper).

```python
# servers/fastapi/services/database.py:27-36
database_url, connect_args = get_database_url_and_connect_args()
_pool_kwargs = get_pool_kwargs() if "sqlite" not in database_url else {}

sql_engine: AsyncEngine = create_async_engine(
    database_url, connect_args=connect_args, **_pool_kwargs
)
async_session_maker = async_sessionmaker(sql_engine, expire_on_commit=False)
```

Pool configuration (`pool_size`, `max_overflow`, etc.) is applied only for server-class databases. For SQLite URLs the pool kwargs are left empty, since SQLite relies on file locking rather than a sized connection pool. The `get_async_session()` generator is injected into endpoints through FastAPI's `Depends`.

### Schema

The following SQLModel tables are created on startup:

| Table | Model class | Purpose |
|---|---|---|
| `presentations` | `PresentationModel` | One row per presentation; stores content, outline, layout, structure as JSON |
| `slides` | `SlideModel` | One row per slide, foreign key to `presentations` with `CASCADE DELETE` |
| `key_value` | `KeyValueSqlModel` | General key-value store |
| `chat_history_messages` | `ChatHistoryMessageModel` | Per-presentation chat thread |
| `image_assets` | `ImageAsset` | Tracks generated/fetched images |
| `presentation_layout_codes` | `PresentationLayoutCodeModel` | Custom template code + font lists |
| `template_create_infos` | `TemplateCreateInfoModel` | Template creation metadata |
| `templates` | `TemplateModel` | User-created layout templates |
| `webhook_subscriptions` | `WebhookSubscription` | Registered webhook URLs |
| `async_presentation_generation_tasks` | `AsyncPresentationGenerationTaskModel` | Background task status rows |
| `ollama_pull_status` | `OllamaPullStatus` | Tracks Ollama model download progress |

`PresentationModel` stores complex objects (outlines, layout, structure) as JSON blobs and exposes typed accessors like `get_layout() -> PresentationLayoutModel` and `get_structure() -> PresentationStructureModel`.
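The "JSON column plus typed accessor" pattern can be illustrated without SQLModel. The classes below are stand-ins built on dataclasses; `LayoutSketch`, `PresentationRowSketch`, and their fields are hypothetical and much smaller than the real `PresentationLayoutModel` and `PresentationModel`.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class LayoutSketch:
    """Typed view over the layout blob (fields are illustrative)."""
    name: str
    slides: List[Dict[str, Any]] = field(default_factory=list)


@dataclass
class PresentationRowSketch:
    """Stand-in for a table row whose `layout` column holds plain JSON."""
    id: str
    layout: Optional[Dict[str, Any]] = None

    def get_layout(self) -> LayoutSketch:
        # Validate and convert the raw dict only when a caller needs it.
        if self.layout is None:
            raise ValueError("presentation has no layout yet")
        return LayoutSketch(**self.layout)

    def set_layout(self, layout: LayoutSketch) -> None:
        # Store a JSON-serialisable dict, never the typed object itself.
        self.layout = {"name": layout.name, "slides": layout.slides}


row = PresentationRowSketch(id="p1")
row.set_layout(LayoutSketch(name="general", slides=[{"id": "title"}]))
stored_json = json.dumps(row.layout)  # what would be written to the column
```

Keeping the column as a plain dict means the row serialises cleanly, while callers still get attribute access and early failure on malformed data.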

Sources: [servers/fastapi/services/database.py:1-77](), [servers/fastapi/models/sql/presentation.py:1-82](), [servers/fastapi/models/sql/slide.py:1-33]()

---

## Routing: The `/api/v1/ppt` Tree

The `API_V1_PPT_ROUTER` aggregates twenty sub-routers under `/api/v1/ppt`. Each major feature area has its own `APIRouter`:

| Sub-router | Typical responsibility |
|---|---|
| `PRESENTATION_ROUTER` | Create, fetch, stream, generate, edit, derive presentations |
| `OUTLINES_ROUTER` | Stream outline generation over SSE |
| `SLIDE_ROUTER` | Fetch/update individual slides |
| `CHAT_ROUTER` | Per-presentation conversational chat |
| `LAYOUT_MANAGEMENT_ROUTER` | Slide-to-HTML rendering |
| `PPTX_SLIDES_ROUTER` | Import slides from an uploaded PPTX |
| `PDF_SLIDES_ROUTER` | Import slides from an uploaded PDF |
| `FILES_ROUTER` | Upload supporting documents |
| `IMAGES_ROUTER` | Image search/generation |
| `ICONS_ROUTER` | Icon search |
| `FONTS_ROUTER` | Font listing |
| `OLLAMA_ROUTER` | Ollama model management |
| `OPENAI_ROUTER` / `ANTHROPIC_ROUTER` / `GOOGLE_ROUTER` | Provider-specific API key validation |
| `CODEX_AUTH_ROUTER` | OAuth token exchange for hosted Codex provider |
| `TEMPLATE_ROUTER` | Custom template CRUD |
| `THEMES_ROUTER` / `THEME_ROUTER` | Theme management and AI theme generation |

Sources: [servers/fastapi/api/v1/ppt/router.py:1-47]()

---

## Presentation Generation: End-to-End

The generation flow is the most complex part of the backend. Two endpoints share the same core `generate_presentation_handler` function:

- `POST /api/v1/ppt/presentation/generate` — **synchronous**: blocks until the full deck is generated and exported, then returns a file path.
- `POST /api/v1/ppt/presentation/generate/async` — **background**: immediately returns an `AsyncPresentationGenerationTaskModel` row; the actual work runs in FastAPI's `BackgroundTasks`. Clients poll `GET /api/v1/ppt/presentation/status/{id}` for progress.
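
A client of the background endpoint needs a polling loop. The sketch below keeps HTTP out of the picture by taking a `fetch_status` callable, which a real client would implement as a `GET` on the status endpoint. The function name, the `status` field, and the terminal values `"completed"` and `"error"` are assumptions, not the documented task schema.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical terminal states; check the task model for the real values.
TERMINAL_STATUSES = {"completed", "error"}


def poll_until_done(
    fetch_status: Callable[[str], Dict[str, Any]],
    task_id: str,
    interval_seconds: float = 2.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch_status until the task reaches a terminal state or times out."""
    deadline = clock() + timeout_seconds
    while True:
        task = fetch_status(task_id)
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish in {timeout_seconds}s")
        sleep(interval_seconds)
```

Injecting `sleep` and `clock` keeps the loop testable without real delays.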

A third interactive flow uses SSE for the step-by-step UI experience:

- `GET /api/v1/ppt/outlines/stream/{id}` — streams outline text tokens over SSE as the LLM generates them.
- `GET /api/v1/ppt/presentation/stream/{id}` — streams completed slide objects one at a time over SSE, with parallel asset fetching in the background.

### Step-by-step inside `generate_presentation_handler`

```
1. Validate inputs (slide count limits, template existence)
2. If file attachments are provided → load and extract document text
3. Call LLM to generate outlines (streamed internally, collected into text)
4. Parse outlines using `dirtyjson` (tolerates LLM formatting imperfections)
5. Ask LLM to map each outline to a slide layout index (structure)
6. Insert table-of-contents placeholder slides if requested
7. Generate slide content: batches of 10 concurrently via asyncio.gather
8. For each batch: immediately start asset fetch tasks in parallel
9. await all asset tasks
10. Persist PresentationModel + SlideModel + ImageAsset rows to DB
11. Call ExportTaskService.export_from_url() → spawns Node.js subprocess → returns file path
12. Fire webhook (success or failure) via ConcurrentService
```
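Steps 7 to 9 describe a common asyncio pattern: generate a batch concurrently, kick off follow-up work without waiting for it, then await everything at the end. The sketch below shows that pattern in isolation; `generate_slides_in_batches` and its callback parameters are hypothetical stand-ins for the LLM and asset-fetching code in the handler.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

BATCH_SIZE = 10  # Matches the batch size named in step 7.

Slide = Dict[str, Any]


async def generate_slides_in_batches(
    outlines: List[str],
    generate_slide: Callable[[str], Awaitable[Slide]],
    fetch_assets: Callable[[Slide], Awaitable[None]],
    batch_size: int = BATCH_SIZE,
) -> List[Slide]:
    """Generate slides batch by batch while asset fetches run in the background."""
    slides: List[Slide] = []
    asset_tasks: List["asyncio.Task[None]"] = []
    for start in range(0, len(outlines), batch_size):
        batch = outlines[start:start + batch_size]
        # gather() runs the batch concurrently and preserves input order.
        batch_slides = await asyncio.gather(*(generate_slide(o) for o in batch))
        slides.extend(batch_slides)
        # Start asset fetches now, but do not await them before the next batch.
        asset_tasks.extend(asyncio.create_task(fetch_assets(s)) for s in batch_slides)
    # Step 9: wait for every outstanding asset fetch before persisting.
    await asyncio.gather(*asset_tasks)
    return slides
```

Bounding concurrency by batch keeps the number of simultaneous LLM calls predictable, at the cost of waiting for the slowest slide in each batch.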

Sources: [servers/fastapi/api/v1/ppt/endpoints/presentation.py:628-989]()

---

## Server-Sent Events (SSE) Protocol

Both streaming endpoints return `StreamingResponse` with `media_type="text/event-stream"`. The wire format is defined by four Pydantic models:

```text
# servers/fastapi/models/sse_response.py
SSEResponse         → event: response\ndata: {"type": "chunk", "chunk": "..."}
SSEStatusResponse   → event: response\ndata: {"type": "status", "status": "..."}
SSEErrorResponse    → event: response\ndata: {"type": "error", "detail": "..."}
SSECompleteResponse → event: response\ndata: {"type": "complete", "<key>": {...}}
```

The outline stream emits one `chunk` event per text token from the LLM, then a single `complete` event holding the saved `PresentationModel`. The slide stream emits `chunk` events containing raw slide JSON as each slide finishes LLM generation, then additional `slide_assets` events as background image/icon downloads complete, so the frontend can progressively enrich slides without waiting for all assets.

```python
# servers/fastapi/api/v1/ppt/endpoints/presentation.py:400-403
yield SSEResponse(
    event="response",
    data=json.dumps({"type": "chunk", "chunk": '{ "slides": [ '}),
).to_string()
```
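For readers unfamiliar with SSE framing, here is a small sketch of how such events look on the wire and how a client might decode them. It follows the standard `event:` / `data:` / blank-line framing; `format_sse` and `parse_sse` are hypothetical helpers, and the exact output of the project's `to_string()` is assumed rather than confirmed.

```python
import json
from typing import Any, Dict, Iterator


def format_sse(event: str, payload: Dict[str, Any]) -> str:
    """Serialise one event; a blank line terminates the frame."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def parse_sse(stream: str) -> Iterator[Dict[str, Any]]:
    """Yield the decoded JSON payload of each frame in a buffered stream."""
    for frame in stream.split("\n\n"):
        data_lines = [
            line[len("data:"):].strip()
            for line in frame.splitlines()
            if line.startswith("data:")
        ]
        if data_lines:
            yield json.loads("\n".join(data_lines))


wire = (
    format_sse("response", {"type": "chunk", "chunk": '{ "slides": [ '})
    + format_sse("response", {"type": "complete", "presentation": {"id": "p1"}})
)
events = list(parse_sse(wire))
```

Because each `chunk` payload is itself a fragment of a larger JSON document, the client concatenates the `chunk` strings and parses the result only once the stream completes.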

Sources: [servers/fastapi/models/sse_response.py:1-50](), [servers/fastapi/api/v1/ppt/endpoints/outlines.py:103-127](), [servers/fastapi/api/v1/ppt/endpoints/presentation.py:365-519]()

---

## LLM Provider Abstraction

All LLM calls go through a thin `llmai` abstraction layer (`from llmai import get_client`). The actual provider is selected at runtime from the `LLM` environment variable. The `UserConfigEnvUpdateMiddleware` keeps these variables current on every request, so a user can switch providers through the UI without restarting the server.

Providers verified in `utils/user_config.py` (env vars read/written per request):

| Provider | Main env vars |
|---|---|
| OpenAI | `OPENAI_API_KEY`, `OPENAI_MODEL` |
| Anthropic | `ANTHROPIC_API_KEY`, `ANTHROPIC_MODEL` |
| Google Gemini | `GOOGLE_API_KEY`, `GOOGLE_MODEL` |
| Google Vertex AI | `VERTEX_*` |
| Azure OpenAI | `AZURE_OPENAI_*` |
| AWS Bedrock | `BEDROCK_*` |
| Ollama (local) | `OLLAMA_URL`, `OLLAMA_MODEL` |
| OpenRouter | `OPENROUTER_*` |
| Fireworks, Together, Cerebras | provider-specific vars |
| LiteLLM, LM Studio | proxy URL + API key |
| Custom OpenAI-compatible | `CUSTOM_LLM_URL`, `CUSTOM_LLM_API_KEY` |
| Codex (hosted) | OAuth tokens via `CODEX_*` |

This is a BYOK (Bring Your Own Key) design: the server never hard-codes a provider. Users supply keys through the settings UI or via environment variables; the middleware propagates them into `os.environ` before each request.
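An environment-driven provider switch also makes early validation simple. The sketch below checks that the variables for the selected provider are present, in the spirit of the startup availability check. The values accepted for `LLM` (`"openai"`, `"anthropic"`, and so on) and the helper name `missing_provider_settings` are assumptions; only the variable names come from the table above.

```python
from typing import Dict, List, Mapping

# Provider identifiers are hypothetical; variable names follow the table above.
REQUIRED_ENV: Dict[str, List[str]] = {
    "openai": ["OPENAI_API_KEY", "OPENAI_MODEL"],
    "anthropic": ["ANTHROPIC_API_KEY", "ANTHROPIC_MODEL"],
    "google": ["GOOGLE_API_KEY", "GOOGLE_MODEL"],
    "ollama": ["OLLAMA_URL", "OLLAMA_MODEL"],
    "custom": ["CUSTOM_LLM_URL", "CUSTOM_LLM_API_KEY"],
}


def missing_provider_settings(environ: Mapping[str, str]) -> List[str]:
    """Return the required variables that are unset for the selected provider."""
    provider = environ.get("LLM", "").strip().lower()
    if provider not in REQUIRED_ENV:
        raise ValueError(f"unsupported or unset LLM provider: {provider!r}")
    return [name for name in REQUIRED_ENV[provider] if not environ.get(name)]
```

Passing `os.environ` after the middleware has run would report exactly what the next LLM call is missing.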

Sources: [servers/fastapi/utils/user_config.py:1-76](), [servers/fastapi/utils/llm_calls/generate_presentation_outlines.py:1-24]()

---

## Export Task Service

Exporting a finished presentation to PPTX or PDF is delegated to a Node.js runtime (`presentation-export/index.cjs`). The Python side spawns it as an async subprocess and communicates through temporary JSON files:

```
Python                          Node.js (presentation-export)
──────                          ─────────────────────────────
write export_task.json    →     read task, run headless export
                                write export_task.response.json
read response.json        ←     (process exits 0 on success)
```

The `ExportTaskService` class (`services/export_task_service.py`) handles:

- **Runtime discovery**: finds `index.cjs` (or `index.js`) by probing several candidate paths under `EXPORT_RUNTIME_DIR`, `EXPORT_PACKAGE_ROOT`, or relative to the working directory.
- **Converter binary**: selects a platform+arch-specific native binary (`convert-linux-x64`, `convert-darwin-arm64`, etc.) for the PPTX-to-HTML conversion step.
- **Child process management**: uses `asyncio.create_subprocess_exec` with a 300-second timeout and a `BoundedTextBuffer` to capture stdout/stderr without unbounded memory growth.
- **Three task types**: `export` (render slides URL → PDF/PPTX), `pptx-to-html` (parse an uploaded PPTX into slide HTML), `extract-schema` (derive a JSON schema from a template URL).
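
The idea behind a bounded capture buffer can be shown in a few lines. This is not the project's `BoundedTextBuffer`: the class name `BoundedTextBufferSketch`, the default limit, and the choice to keep the most recent output (rather than the earliest) are assumptions made for illustration.

```python
class BoundedTextBufferSketch:
    """Keep at most max_chars of the most recently appended text."""

    def __init__(self, max_chars: int = 64_000) -> None:
        if max_chars <= 0:
            raise ValueError("max_chars must be positive")
        self._max_chars = max_chars
        self._text = ""
        self.truncated = False  # True once older output has been dropped.

    def append(self, chunk: str) -> None:
        self._text += chunk
        if len(self._text) > self._max_chars:
            # Drop the oldest characters so memory use stays bounded.
            self._text = self._text[-self._max_chars:]
            self.truncated = True

    def getvalue(self) -> str:
        return self._text
```

Keeping the tail is a natural choice for diagnostics, since the last lines of stderr usually explain why a child process failed.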

```python
# servers/fastapi/services/export_task_service.py:291-344
process = await asyncio.create_subprocess_exec(
    *command,
    cwd=self.export_dir,
    stdout=asyncio.subprocess.PIPE,
    stderr=asyncio.subprocess.PIPE,
    env=env,
    **_windows_hidden_subprocess_kwargs(),
)
```

On Windows, `CREATE_NO_WINDOW` is passed to prevent a console window from appearing. The `EXPORT_TASK_SERVICE` singleton is instantiated at module level and shared across all requests.
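The JSON-file handshake with a timeout can be tried end to end without Node.js. In the sketch below a tiny Python child process stands in for the export runtime; `run_task`, `CHILD_SCRIPT`, and the response fields are hypothetical, while the file names and the 300-second default mirror the description above.

```python
import asyncio
import json
import sys
import tempfile
from pathlib import Path

# Stand-in for the Node.js runtime: read the task file, write a response file.
CHILD_SCRIPT = """
import json, sys
with open(sys.argv[1], encoding="utf-8") as fh:
    task = json.load(fh)
with open(sys.argv[2], "w", encoding="utf-8") as fh:
    json.dump({"ok": True, "type": task["type"]}, fh)
"""


async def run_task(task: dict, timeout_seconds: float = 300.0) -> dict:
    """Write the task file, run the child, and read back its response file."""
    with tempfile.TemporaryDirectory() as tmp:
        task_path = Path(tmp) / "export_task.json"
        response_path = Path(tmp) / "export_task.response.json"
        task_path.write_text(json.dumps(task), encoding="utf-8")
        process = await asyncio.create_subprocess_exec(
            sys.executable, "-c", CHILD_SCRIPT, str(task_path), str(response_path),
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        try:
            _, stderr = await asyncio.wait_for(process.communicate(), timeout_seconds)
        except asyncio.TimeoutError:
            process.kill()  # Do not leave a hung child behind.
            await process.wait()
            raise
        if process.returncode != 0:
            raise RuntimeError(stderr.decode("utf-8", errors="replace"))
        return json.loads(response_path.read_text(encoding="utf-8"))


response = asyncio.run(run_task({"type": "export"}))
```

Exchanging files instead of parsing stdout keeps the result channel separate from log output, so noisy logging in the child cannot corrupt the response.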

Sources: [servers/fastapi/services/export_task_service.py:70-467]()

---

## Request Lifecycle Summary

```text
HTTP Request
    │
    ▼
CORSMiddleware          (allow all origins)
    │
    ▼
SessionAuthMiddleware   (cookie or Basic auth; 401/428 on failure)
    │
    ▼
UserConfigEnvUpdateMiddleware  (sync userConfig.json → os.environ)
    │
    ▼
static_icon_fallback_middleware (404 on /static/icons/ → placeholder.svg)
    │
    ▼
Router dispatch
    ├── /api/v1/auth/    (exempt from auth)
    ├── /api/v1/ppt/     (presentation, slides, chat, export…)
    ├── /api/v1/webhook/
    └── /api/v1/mock/
    │
    ▼
Endpoint handler
    ├── Depends(get_async_session) → SQLAlchemy AsyncSession
    ├── LLM call via llmai abstraction
    └── ExportTaskService (Node.js subprocess, when export needed)
```

The stack is designed to work with any supported LLM provider and with either SQLite (the default) or a server database such as PostgreSQL or MySQL, for which pool settings apply. It does not depend on a specific cloud vendor, which keeps the backend self-hostable and BYOK-friendly. The `lifespan` hook runs migrations (when enabled) and table creation before the first request is served, and releases connection pools cleanly on shutdown.

Sources: [servers/fastapi/api/main.py:57-101](), [servers/fastapi/api/lifespan.py:83-100](), [servers/fastapi/services/database.py:27-76]()