# From Prompt to Slides: The Generation Pipeline

> Step by step: how a user's topic travels through outline generation, layout selection, per-slide content calls, image fetching, and final assembly — all coordinated by the FastAPI backend.

- Repository: presenton/presenton
- GitHub: https://github.com/presenton/presenton
- Human wiki: https://grok-wiki.com/public/wiki/presenton-presenton-f6685dc028cc
- Complete Markdown: https://grok-wiki.com/public/wiki/presenton-presenton-f6685dc028cc/llms-full.txt

## Source Files

- `servers/fastapi/api/v1/ppt/endpoints/presentation.py`
- `servers/fastapi/utils/llm_calls/generate_presentation_structure.py`
- `servers/fastapi/utils/outline_utils.py`
- `servers/fastapi/api/v1/ppt/background_tasks.py`
- `servers/fastapi/services/concurrent_service.py`

---

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:

- [servers/fastapi/api/v1/ppt/endpoints/presentation.py](servers/fastapi/api/v1/ppt/endpoints/presentation.py)
- [servers/fastapi/utils/llm_calls/generate_presentation_outlines.py](servers/fastapi/utils/llm_calls/generate_presentation_outlines.py)
- [servers/fastapi/utils/llm_calls/generate_presentation_structure.py](servers/fastapi/utils/llm_calls/generate_presentation_structure.py)
- [servers/fastapi/utils/llm_calls/generate_slide_content.py](servers/fastapi/utils/llm_calls/generate_slide_content.py)
- [servers/fastapi/utils/outline_utils.py](servers/fastapi/utils/outline_utils.py)
- [servers/fastapi/utils/process_slides.py](servers/fastapi/utils/process_slides.py)
- [servers/fastapi/services/concurrent_service.py](servers/fastapi/services/concurrent_service.py)
- [servers/fastapi/api/v1/ppt/background_tasks.py](servers/fastapi/api/v1/ppt/background_tasks.py)
</details>


When a user types a topic and clicks "Generate," Presenton turns that plain-text prompt into a fully populated, visually designed slide deck. This page traces every step of that journey — from the first HTTP request to the final file written to disk — as it actually runs inside the FastAPI backend.

Understanding this pipeline matters if you want to extend Presenton (adding a new layout, swapping the LLM provider, tuning asset fetching), debug a broken generation, or just reason about where latency comes from.

---

## Overview: The Three Main Phases

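The pipeline has three conceptually distinct phases. Planning and content run in sequence; asset fetching from Phase 3 starts as soon as each content batch finishes, so it overlaps with later Phase 2 batches: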

```text
┌─────────────────────────────────────────────────────────────────────┐
│ Phase 1 – PLANNING                                                  │
│   Outline generation  →  TOC injection  →  Layout selection         │
├─────────────────────────────────────────────────────────────────────┤
│ Phase 2 – CONTENT                                                   │
│   Per-slide content calls (batched, concurrent)                     │
├─────────────────────────────────────────────────────────────────────┤
│ Phase 3 – ASSEMBLY                                                  │
│   Asset fetching (images + icons)  →  DB persist  →  Export         │
└─────────────────────────────────────────────────────────────────────┘
```

There are two entry points, both calling the same shared handler:

| Endpoint | Mode | Returns |
|---|---|---|
| `POST /presentation/generate` | Synchronous | `PresentationPathAndEditPath` once generation finishes |
| `POST /presentation/generate/async` | Background task | Task record with a polling `id` |

Both resolve to `generate_presentation_handler()` in `presentation.py`.

Sources: [servers/fastapi/api/v1/ppt/endpoints/presentation.py:1018-1075]()

---

## Phase 1: Planning — Outlines, TOC, and Layout

### Step 1.1 — Validation and Setup

Before any LLM call, `check_if_api_request_is_valid()` enforces business rules:

- At least one of `content`, `slides_markdown`, or `files` must be present.
- `n_slides` must be between 1 and `MAX_NUMBER_OF_SLIDES`.
- If table-of-contents is requested, at least 3 slides are required.
- The `template` must be a known built-in name or a `custom-<uuid>` pointing to a real DB record.
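
The rules above can be sketched as a plain function. This is a minimal illustration, not the body of `check_if_api_request_is_valid()`: the `GenerateRequest` class, the `include_table_of_contents` field name, the value of `MAX_NUMBER_OF_SLIDES`, and the built-in template names are all assumptions, and the database lookup is replaced by a set of known IDs.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

# Assumed cap for illustration; the real MAX_NUMBER_OF_SLIDES lives in the backend.
MAX_NUMBER_OF_SLIDES = 50
# Hypothetical set of built-in template names.
BUILT_IN_TEMPLATES = {"general", "modern"}


@dataclass
class GenerateRequest:
    """Hypothetical stand-in for the real request model."""
    content: Optional[str] = None
    slides_markdown: Optional[List[str]] = None
    files: Optional[List[str]] = None
    n_slides: int = 8
    include_table_of_contents: bool = False
    template: str = "general"


def validate_request(req: GenerateRequest, known_custom_ids: Set[str]) -> List[str]:
    """Return a list of rule violations; an empty list means the request is valid."""
    errors = []
    if not (req.content or req.slides_markdown or req.files):
        errors.append("one of content, slides_markdown, or files is required")
    if not 1 <= req.n_slides <= MAX_NUMBER_OF_SLIDES:
        errors.append("n_slides out of range")
    if req.include_table_of_contents and req.n_slides < 3:
        errors.append("table of contents needs at least 3 slides")
    if req.template.startswith("custom-"):
        # Stand-in for the database lookup of a custom template record.
        if req.template[len("custom-"):] not in known_custom_ids:
            errors.append("unknown custom template")
    elif req.template not in BUILT_IN_TEMPLATES:
        errors.append("unknown built-in template")
    return errors
```

Collecting every violation rather than raising on the first one is a choice made for the sketch; the real handler may fail fast instead.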

Sources: [servers/fastapi/api/v1/ppt/endpoints/presentation.py:570-625]()

### Step 1.2 — Document Loading (optional)

When the request includes uploaded `files`, a `DocumentsLoader` extracts their text and concatenates it into `additional_context`. This context is passed verbatim to the outline prompt, letting the LLM draw from an uploaded PDF or DOCX without the user needing to paste the text manually.

Sources: [servers/fastapi/api/v1/ppt/endpoints/presentation.py:652-661]()

### Step 1.3 — Outline Generation via Streaming LLM Call

The outline step asks the LLM to produce a structured list of slide summaries — one per slide — using `generate_ppt_outline()`.

The function builds two prompt halves:

- **System prompt** (`get_system_prompt()`): Sets verbosity target (concise ≈ 20 words/slide, standard ≈ 40 words, text-heavy ≈ 60 words), tone, language, title-slide rules, and Markdown format requirements.
- **User prompt** (`get_user_prompt()`): Injects the user's topic, desired slide count, language, tone, today's date, and the extracted document context.

The call uses **streaming JSON**: the LLM emits tokens incrementally and the backend accumulates them as they arrive. If `web_search=True`, a `WebSearchTool` is attached so the model can fetch live facts before drafting slides.

```python
# generate_presentation_outlines.py:205-229
async for event in stream_generate_events(client, **get_generate_kwargs(
    model=model,
    messages=get_messages(...),
    response_format=response_format,
    tools=([WebSearchTool()] if use_search_tool else None),
    stream=True,
)):
    if getattr(event, "type", None) == "content":
        yield event.chunk
```

The streamed text is parsed with `dirtyjson` (a lenient JSON parser), then validated against a `PresentationOutlineModel` with exactly `n_slides` entries.
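
The accumulate, parse, and validate step might look roughly like the sketch below. It uses the standard-library `json` module as a strict stand-in for `dirtyjson`, and a plain length check in place of `PresentationOutlineModel`; the `parse_outline` name and the `content` key are assumptions for illustration.

```python
import json
from typing import Iterable, List


def parse_outline(chunks: Iterable[str], n_slides: int) -> List[str]:
    """Join streamed text chunks, parse them, and enforce the slide count."""
    raw = "".join(chunks)
    # The real backend uses the lenient dirtyjson parser; strict json is a stand-in.
    data = json.loads(raw)
    slides = data["slides"]
    if len(slides) != n_slides:
        raise ValueError(f"expected {n_slides} slides, got {len(slides)}")
    # "content" is an assumed key name for each slide's markdown summary.
    return [slide["content"] for slide in slides]
```

Chunk boundaries can fall anywhere, including mid-key, which is why parsing only happens after the full text has been joined.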

Sources: [servers/fastapi/utils/llm_calls/generate_presentation_outlines.py:172-237](), [servers/fastapi/api/v1/ppt/endpoints/presentation.py:701-744]()

### Step 1.4 — Slide Count Adjustment for Table of Contents

If the user requested a TOC, the backend subtracts the number of TOC slides from the outline count so the final deck still totals the requested `n_slides`. The helpers in `outline_utils.py` do the math:

- `get_no_of_outlines_to_generate_for_n_slides()` — how many content outlines to request from the LLM when some slides will be TOC placeholders.
- `get_no_of_toc_required_for_n_outlines()` — how many TOC slides to insert, given the outline count.
- `get_presentation_outline_model_with_toc()` — inserts synthetic TOC `SlideOutlineModel` entries at the correct position (after the title slide) with page-number annotations.
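
The arithmetic behind the first two helpers can be illustrated with a small sketch. The real formulas in `outline_utils.py` are not reproduced here: this version assumes each TOC slide lists a fixed number of entries (`ENTRIES_PER_TOC_SLIDE`, a made-up constant), and both function names are hypothetical.

```python
import math

# Assumed capacity of one TOC slide; the real value is defined in outline_utils.py.
ENTRIES_PER_TOC_SLIDE = 10


def toc_slides_for_outlines(n_outlines: int) -> int:
    """TOC slides needed to list n_outlines entries."""
    return math.ceil(n_outlines / ENTRIES_PER_TOC_SLIDE)


def outlines_for_total_slides(n_slides: int) -> int:
    """Largest outline count whose content plus TOC slides fits in n_slides."""
    n_outlines = n_slides
    while n_outlines > 0 and n_outlines + toc_slides_for_outlines(n_outlines) > n_slides:
        n_outlines -= 1
    return n_outlines
```

Under these assumptions, a request for 8 slides yields 7 content outlines plus 1 TOC slide.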

Sources: [servers/fastapi/utils/outline_utils.py:44-137]()

### Step 1.5 — Layout Selection (`generate_presentation_structure`)

With outlines ready, the backend asks the LLM to assign a **slide layout index** to each outline — choosing from the slide templates available in the selected theme.

Two prompts exist:

- **Standard prompt** (`GET_MESSAGES_SYSTEM_PROMPT`): Encourages visual variety, content-driven layout choices (process → process layout, data → chart layout, etc.), and alternating adjacent layouts.
- **Markdown input prompt** (`STRUCTURE_FROM_SLIDES_MARKDOWN_SYSTEM_PROMPT`): Used when the user provided raw slide markdown instead of a topic; enforces stricter table-and-chart selection rules.

The response is a JSON array of integers, one layout index per slide, validated against a dynamically built Pydantic schema that has exactly `n_slides` entries.

```python
# generate_presentation_structure.py:135-184
async def generate_presentation_structure(
    presentation_outline, presentation_layout, instructions, using_slides_markdown
) -> PresentationStructureModel:
    ...
    content = await generate_structured_with_schema_retries(
        client, model, messages=messages,
        response_format=response_format, ...
    )
    return PresentationStructureModel(**content)
```

If the layout is marked `ordered` (a fixed-sequence theme), the LLM step is skipped and indices are derived directly from the theme definition.
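
The "exactly `n_slides` entries" constraint can be expressed directly in a schema. The sketch below builds a plain JSON Schema dict and checks it by hand, as a stand-in for the Pydantic model the backend generates; the function names and the `slides` key are assumptions.

```python
from typing import Any, Dict, List


def build_structure_schema(n_slides: int, n_layouts: int) -> Dict[str, Any]:
    """JSON Schema for an array of exactly n_slides valid layout indices."""
    return {
        "type": "object",
        "properties": {
            "slides": {
                "type": "array",
                "items": {"type": "integer", "minimum": 0, "maximum": n_layouts - 1},
                # Equal bounds pin the array to one length.
                "minItems": n_slides,
                "maxItems": n_slides,
            }
        },
        "required": ["slides"],
    }


def check_structure(indices: List[int], schema: Dict[str, Any]) -> bool:
    """Minimal hand-rolled check of the constraints encoded in the schema."""
    spec = schema["properties"]["slides"]
    if not spec["minItems"] <= len(indices) <= spec["maxItems"]:
        return False
    item = spec["items"]
    return all(
        isinstance(i, int) and item["minimum"] <= i <= item["maximum"]
        for i in indices
    )
```

Rebuilding the schema per request means a reply with too few or too many indices fails validation instead of silently misaligning slides and layouts.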

Sources: [servers/fastapi/utils/llm_calls/generate_presentation_structure.py:135-184](), [servers/fastapi/api/v1/ppt/endpoints/presentation.py:797-840]()

---

## Phase 2: Content — Per-Slide LLM Calls

### Step 2.1 — Batched Concurrent Content Generation

Each slide now has an outline (what to say) and a layout index (how to display it). The per-slide content step fills in the layout's schema with real, structured JSON.

Slides are processed in **batches of 10**. Within each batch, all content calls run concurrently via `asyncio.gather`:

```python
# presentation.py:881-898
batch_size = 10
for start in range(0, len(slide_layouts), batch_size):
    end = min(start + batch_size, len(slide_layouts))
    content_tasks = [
        get_slide_content_from_type_and_outline(
            slide_layouts[i], presentation_outlines.slides[i],
            language_to_use, request.tone.value, request.verbosity.value, request.instructions,
        )
        for i in range(start, end)
    ]
    batch_contents = await asyncio.gather(*content_tasks)
```
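
The batching pattern in the excerpt can be isolated into a runnable sketch. `gather_in_batches` and `fake_slide_call` are hypothetical names; the fake worker stands in for `get_slide_content_from_type_and_outline()`.

```python
import asyncio
from typing import Awaitable, Callable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def gather_in_batches(
    items: List[T],
    worker: Callable[[T], Awaitable[R]],
    batch_size: int = 10,
) -> List[R]:
    """Run worker over items, at most batch_size concurrently, preserving order."""
    results: List[R] = []
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        # gather returns results in the order the awaitables were passed in.
        results.extend(await asyncio.gather(*(worker(item) for item in batch)))
    return results


async def fake_slide_call(index: int) -> str:
    """Stand-in for a per-slide LLM call."""
    await asyncio.sleep(0)
    return f"slide-{index}"
```

Calling `asyncio.run(gather_in_batches(list(range(23)), fake_slide_call))` processes batches of 10, 10, and 3. The batch size caps how many requests hit the LLM provider at once.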

Sources: [servers/fastapi/api/v1/ppt/endpoints/presentation.py:879-935]()

### Step 2.2 — Slide Content LLM Call

For each slide, `get_slide_content_from_type_and_outline()` drives a separate LLM call:

1. The slide layout's `json_schema` is stripped of asset placeholder fields (`__image_url__`, `__icon_url__`) — those will be filled later.
2. A `__speaker_note__` field (100–500 chars of plain text) is injected into the schema.
3. The LLM receives the slide's markdown outline plus the schema and must return JSON that matches it exactly.

```python
# generate_slide_content.py:172-187
response_schema = remove_fields_from_schema(
    slide_layout.json_schema, ["__image_url__", "__icon_url__"]
)
response_schema = add_field_in_schema(response_schema, {
    "__speaker_note__": {"type": "string", "minLength": 100, "maxLength": 500, ...}
}, True)
```
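
To make the schema-stripping step concrete, here is one way such a helper could work. This is a hypothetical re-implementation for illustration, not the repository's `remove_fields_from_schema()`; it assumes asset fields may be nested at any depth.

```python
import copy
from typing import Any, Dict, List


def strip_fields(schema: Dict[str, Any], names: List[str]) -> Dict[str, Any]:
    """Return a copy of schema with the named properties removed at every depth."""
    schema = copy.deepcopy(schema)

    def walk(node: Any) -> None:
        if isinstance(node, dict):
            props = node.get("properties")
            if isinstance(props, dict):
                for name in names:
                    props.pop(name, None)
            # Keep "required" consistent with the remaining properties.
            if isinstance(node.get("required"), list):
                node["required"] = [r for r in node["required"] if r not in names]
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for value in node:
                walk(value)

    walk(schema)
    return schema
```

Working on a deep copy leaves the layout's original schema intact, so the asset fields are still available when URLs are filled in later.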

The call uses `generate_structured_with_schema_retries` — if the model returns malformed JSON, it retries automatically.
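
The retry behaviour can be approximated as follows. The internals of `generate_structured_with_schema_retries` are not shown on this page, so this is only a sketch of the general pattern: `call_with_json_retries`, `max_attempts`, and `retry_delay` are assumed names, and the check is limited to "parses as a JSON object" rather than full schema validation.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable, Dict


async def call_with_json_retries(
    call_model: Callable[[], Awaitable[str]],
    max_attempts: int = 3,
    retry_delay: float = 0.0,
) -> Dict[str, Any]:
    """Call the model until its reply parses as a JSON object or attempts run out."""
    last_error: Exception = ValueError("no attempts made")
    for attempt in range(max_attempts):
        if attempt and retry_delay:
            # Optional pause between attempts.
            await asyncio.sleep(retry_delay)
        reply = await call_model()
        try:
            parsed = json.loads(reply)
        except json.JSONDecodeError as exc:
            last_error = exc
            continue
        if isinstance(parsed, dict):
            return parsed
        last_error = ValueError("reply was not a JSON object")
    raise last_error
```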

Sources: [servers/fastapi/utils/llm_calls/generate_slide_content.py:161-215]()

---

## Phase 3: Assembly — Assets, Persistence, and Export

### Step 3.1 — Placeholder Injection (Streaming Path)

In the streaming variant (`GET /presentation/stream/{id}`), slides are streamed to the frontend as they complete. To avoid stalling the stream while images load, the backend calls `process_slide_add_placeholder_assets()` immediately after each slide is generated. This writes `/static/images/placeholder.jpg` and `/static/icons/placeholder.svg` into the slide content so the UI can render something right away.
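
A placeholder pass of this kind could be written as a recursive walk over the slide content. The sketch assumes that each URL field sits in the same dict as its `__image_prompt__` or `__icon_query__` sibling; the function name is hypothetical and the real `process_slide_add_placeholder_assets()` may differ.

```python
from typing import Any

PLACEHOLDER_IMAGE = "/static/images/placeholder.jpg"
PLACEHOLDER_ICON = "/static/icons/placeholder.svg"


def add_placeholder_assets(node: Any) -> None:
    """Walk slide content in place, adding placeholder URLs next to asset prompts."""
    if isinstance(node, dict):
        # Keys are added before iterating, so the dict is not mutated mid-loop.
        if "__image_prompt__" in node:
            node.setdefault("__image_url__", PLACEHOLDER_IMAGE)
        if "__icon_query__" in node:
            node.setdefault("__icon_url__", PLACEHOLDER_ICON)
        for value in node.values():
            add_placeholder_assets(value)
    elif isinstance(node, list):
        for value in node:
            add_placeholder_assets(value)
```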

Sources: [servers/fastapi/utils/process_slides.py:220-239](), [servers/fastapi/api/v1/ppt/endpoints/presentation.py:432-433]()

### Step 3.2 — Asset Fetching (Images and Icons)

`process_slide_and_fetch_assets()` resolves every `__image_prompt__` and `__icon_query__` field in each slide's content dict:

- **Images**: If the outline already contained an image URL (parsed by `get_images_for_slides_from_outline()` using a regex that finds `.jpg/.png/.webp` links), that URL is used directly. Otherwise, `ImageGenerationService.generate_image()` is called with the prompt text.
- **Icons**: `ICON_FINDER_SERVICE.search_icons()` takes the icon query string and an `icon_weight` from the layout theme. The first result URL is written back into `__icon_url__`. If nothing is found, a static placeholder SVG is used.
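
The URL detection described for images can be sketched with a small regex. The pattern below is an assumption written for illustration; the actual expression used by `get_images_for_slides_from_outline()` is not shown on this page and may handle more cases (for example query strings, which this version cuts off).

```python
import re
from typing import List

# Assumed pattern; the real regex in outline_utils.py may differ.
IMAGE_URL_RE = re.compile(r"https?://[^\s)<>]+?\.(?:jpg|png|webp)\b", re.IGNORECASE)


def find_image_urls(markdown: str) -> List[str]:
    """Return image URLs found in a slide's markdown, in order of appearance."""
    return IMAGE_URL_RE.findall(markdown)
```

When the returned list is non-empty, those URLs would take the place of generated images for that slide.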

All image and icon fetches within a slide run concurrently via `asyncio.gather`. Asset tasks for one batch are scheduled as soon as that batch's content is ready, so they run **while the next batch's LLM calls are still in flight** and asset latency overlaps with LLM latency:

```python
# presentation.py:923-935
asset_tasks = [
    asyncio.create_task(
        process_slide_and_fetch_assets(
            image_generation_service, slide,
            outline_image_urls=image_urls_for_batch[offset],
            icon_weight=layout_model.icon_weight,
        )
    )
    for offset, slide in enumerate(batch_slides)
]
async_assets_generation_tasks.extend(asset_tasks)
```

Sources: [servers/fastapi/utils/process_slides.py:16-90](), [servers/fastapi/api/v1/ppt/endpoints/presentation.py:923-944]()

### Step 3.3 — Database Persistence

Once all slides and assets are ready, everything is written to the database in a single commit:

```python
# presentation.py:950-953
sql_session.add(presentation)
sql_session.add_all(slides)
sql_session.add_all(generated_assets)
await sql_session.commit()
```

`ImageAsset` records are stored alongside the slides so the file paths survive server restarts.

Sources: [servers/fastapi/api/v1/ppt/endpoints/presentation.py:949-953]()

### Step 3.4 — Export

`export_presentation()` converts the stored presentation into the requested format (`pptx`, PDF, or others). It receives the presentation ID and a cookie header forwarded from the original request, so the export worker can authenticate against the same session.

The completed path and an edit URL (`/presentation?id=<uuid>`) are returned to the caller.

Sources: [servers/fastapi/api/v1/ppt/endpoints/presentation.py:961-971]()

### Step 3.5 — Webhook Notification

After success (or failure), `CONCURRENT_SERVICE.run_task()` fires a webhook in the background without blocking the response. The `ConcurrentService` wraps each call in an `asyncio.Task` and keeps a reference set to prevent garbage collection before the task completes.

```python
# concurrent_service.py:16-37
def run_task(self, delay, callable, *args, **kwargs):
    async def wrapper():
        if delay: await asyncio.sleep(delay)
        await callable(*args, **kwargs)
    task = asyncio.create_task(wrapper())
    self._background_tasks.add(task)
    task.add_done_callback(self.on_task_done)
```
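
For readers who want to run the pattern in isolation, here is a self-contained variant of the excerpt. `BackgroundRunner`, `fake_webhook`, and `demo` are hypothetical names, and the done callback is assumed to simply discard the finished task; the repository's `on_task_done` may do more.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Set, Tuple


class BackgroundRunner:
    """Hypothetical minimal version of ConcurrentService."""

    def __init__(self) -> None:
        self._background_tasks: Set[asyncio.Task] = set()

    def run_task(
        self, delay: float, func: Callable[..., Awaitable[Any]], *args: Any, **kwargs: Any
    ) -> asyncio.Task:
        async def wrapper() -> None:
            if delay:
                await asyncio.sleep(delay)
            await func(*args, **kwargs)

        task = asyncio.create_task(wrapper())
        # The event loop holds only weak references to tasks, so keep a strong one.
        self._background_tasks.add(task)
        task.add_done_callback(self._background_tasks.discard)
        return task


async def demo() -> Tuple[List[str], int]:
    """Fire a fake webhook in the background and report what was sent."""
    sent: List[str] = []

    async def fake_webhook(event: str) -> None:
        sent.append(event)

    runner = BackgroundRunner()
    task = runner.run_task(0, fake_webhook, "presentation.completed")
    await task
    await asyncio.sleep(0)  # let any pending done callbacks run
    return sent, len(runner._background_tasks)
```

Returning the task is a convenience for the demo; a fire-and-forget caller can ignore it because the set keeps the task alive until it finishes.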

Sources: [servers/fastapi/services/concurrent_service.py:6-40]()

---

## Concurrency Model

```text
Batch 1 LLM calls (10 slides) ──────┐
                                     ├─ asyncio.gather ─► Batch 1 slides ready
Batch 1 asset tasks (start now) ─────┘   │
                                          │ (running in background)
Batch 2 LLM calls (10 slides) ──────┐    │
                                     ├─ asyncio.gather ─► Batch 2 slides ready
Batch 2 asset tasks (start now) ─────┘
        ...
await asyncio.gather(*all_asset_tasks)   ← waits for all assets at the end
```

This design means image generation and icon fetching for slide batch N overlap with LLM content generation for slide batch N+1, keeping GPU/network I/O from becoming a sequential bottleneck.
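
The overlap can be reproduced with stand-in coroutines. In this sketch `generate_content` and `fetch_assets` are fakes that only sleep, and `run_pipeline` is a hypothetical name; the point is the scheduling shape, not the real work.

```python
import asyncio
from typing import List


async def generate_content(index: int) -> str:
    """Stand-in for one slide's LLM content call."""
    await asyncio.sleep(0.01)
    return f"slide-{index}"


async def fetch_assets(slide: str) -> str:
    """Stand-in for image and icon fetching for one slide."""
    await asyncio.sleep(0.01)
    return f"{slide}+assets"


async def run_pipeline(n_slides: int, batch_size: int = 10) -> List[str]:
    """Generate content batch by batch while asset tasks run in the background."""
    asset_tasks: List[asyncio.Task] = []
    for start in range(0, n_slides, batch_size):
        end = min(start + batch_size, n_slides)
        slides = await asyncio.gather(*(generate_content(i) for i in range(start, end)))
        # Schedule asset work now; it runs while the next batch's content is awaited.
        asset_tasks.extend(asyncio.create_task(fetch_assets(s)) for s in slides)
    # Wait once, at the end, for every asset task.
    return list(await asyncio.gather(*asset_tasks))
```

`create_task` schedules the coroutine without awaiting it, which is what lets the loop move on to the next batch immediately.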

Sources: [servers/fastapi/api/v1/ppt/endpoints/presentation.py:880-944]()

---

## The Streaming Alternative

The `GET /presentation/stream/{id}` endpoint offers a different delivery model. Instead of waiting for all slides, it yields Server-Sent Events (SSE) in real time:

| SSE event type | When sent |
|---|---|
| `chunk` (opening brace) | Before any slide |
| `chunk` (slide JSON) | After each slide's content is generated |
| `slide_assets` | After each slide's assets resolve |
| `chunk` (closing brace) | After all slides |
| `complete` | Full `PresentationWithSlides` payload |

Asset tasks fire immediately for each slide and their results are flushed to the client as soon as they resolve, even while later slides are still being generated.
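
The event sequence in the table can be sketched as a generator of SSE frames. The wire format here (an `event:` line followed by a JSON `data:` line) follows the general Server-Sent Events convention; the exact payload shape Presenton sends, the `chunk` key, and both function names are assumptions, and `slide_assets` events are omitted for brevity.

```python
import json
from typing import Any, Dict, Iterator, List


def sse_event(event_type: str, payload: Dict[str, Any]) -> str:
    """Format one Server-Sent Event frame: an event line, a data line, a blank line."""
    return f"event: {event_type}\ndata: {json.dumps(payload)}\n\n"


def stream_slides(slides: List[Dict[str, Any]]) -> Iterator[str]:
    """Yield chunk frames that concatenate into one JSON document, then a final frame."""
    yield sse_event("chunk", {"chunk": '{ "slides": [ '})
    for index, slide in enumerate(slides):
        prefix = ", " if index else ""
        yield sse_event("chunk", {"chunk": prefix + json.dumps(slide)})
    yield sse_event("chunk", {"chunk": " ] }"})
    # Assumed shape for the final payload.
    yield sse_event("complete", {"slides": slides})
```

A client that concatenates the `chunk` payloads in order ends up with a single parseable JSON document, which is why the opening and closing braces are sent as their own events.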

Sources: [servers/fastapi/api/v1/ppt/endpoints/presentation.py:385-519]()

---

## Alternate Input: Slides Markdown

Instead of a topic, callers can submit `slides_markdown` — a list of pre-written Markdown strings, one per slide. The pipeline adapts:

- Outline generation is **skipped**; the markdown is wrapped directly into `SlideOutlineModel` instances.
- A different system prompt is used for layout selection that focuses on matching layouts to markdown structure (table detection, image detection, etc.).
- `get_images_for_slides_from_outline()` scans each markdown string for embedded image URLs and passes them into asset fetching so they are used instead of generated images.

Sources: [servers/fastapi/api/v1/ppt/endpoints/presentation.py:748-756](), [servers/fastapi/utils/outline_utils.py:184-205]()

---

## Three-Step UI Flow (Create + Prepare + Stream)

The API also exposes a more interactive, step-by-step path for the UI:

1. `POST /presentation/create` — saves the raw request and returns a `PresentationModel` ID.
2. `POST /presentation/prepare` — takes user-edited outlines and a chosen layout, runs layout selection, inserts TOC entries, and saves the structure.
3. `GET /presentation/stream/{id}` — streams slide content and assets using the prepared structure.

This allows the UI to show the generated outlines and let the user revise them before content is committed, without re-running the expensive outline LLM call.

Sources: [servers/fastapi/api/v1/ppt/endpoints/presentation.py:232-363](), [servers/fastapi/api/v1/ppt/endpoints/presentation.py:365-519]()

---

## Summary

A user's prompt travels through three kinds of LLM interaction (one outline call, one layout-selection call, and one content call per slide), all coordinated within a single async Python process. The FastAPI backend pipelines these calls so that asset fetching for one batch overlaps with content generation for the next, which keeps total wall-clock time well below the sum of individual call latencies. The final output is a file on disk plus a database record that the frontend uses to render the editable presentation view. The full orchestration lives in `generate_presentation_handler()` at [servers/fastapi/api/v1/ppt/endpoints/presentation.py:628-989]().
