# Dashboard, Observability, Tunneling & Deployment

> The built-in real-time monitoring dashboard, vendor-neutral OpenTelemetry tracing, call metrics and cost tracking, Cloudflare quick-tunnel vs. static webhook URL trade-offs, test mode (no phone required), Docker Compose setup, and the agent skills bundle for AI coding assistants — the complete operational surface for running Patter in development and production.

- Repository: PatterAI/Patter
- GitHub: https://github.com/PatterAI/Patter
- Human wiki: https://grok-wiki.com/public/wiki/patterai-patter-57d14e233afc
- Complete Markdown: https://grok-wiki.com/public/wiki/patterai-patter-57d14e233afc/llms-full.txt

## Source Files

- `libraries/python/getpatter/observability/`
- `libraries/python/getpatter/tunnel.py`
- `libraries/python/getpatter/tunnels/`
- `libraries/python/getpatter/test_mode.py`
- `dashboard-app/src/App.tsx`
- `Dockerfile`
- `docker-compose.yml`

---

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:

- [libraries/python/getpatter/observability/__init__.py](libraries/python/getpatter/observability/__init__.py)
- [libraries/python/getpatter/observability/tracing.py](libraries/python/getpatter/observability/tracing.py)
- [libraries/python/getpatter/observability/metric_types.py](libraries/python/getpatter/observability/metric_types.py)
- [libraries/python/getpatter/observability/event_bus.py](libraries/python/getpatter/observability/event_bus.py)
- [libraries/python/getpatter/observability/attributes.py](libraries/python/getpatter/observability/attributes.py)
- [libraries/python/getpatter/tunnel.py](libraries/python/getpatter/tunnel.py)
- [libraries/python/getpatter/tunnels/__init__.py](libraries/python/getpatter/tunnels/__init__.py)
- [libraries/python/getpatter/test_mode.py](libraries/python/getpatter/test_mode.py)
- [libraries/python/getpatter/dashboard/routes.py](libraries/python/getpatter/dashboard/routes.py)
- [libraries/python/getpatter/dashboard/store.py](libraries/python/getpatter/dashboard/store.py)
- [libraries/python/getpatter/dashboard/auth.py](libraries/python/getpatter/dashboard/auth.py)
- [libraries/python/getpatter/dashboard/export.py](libraries/python/getpatter/dashboard/export.py)
- [libraries/python/getpatter/dashboard/persistence.py](libraries/python/getpatter/dashboard/persistence.py)
- [libraries/python/getpatter/dashboard/ui.py](libraries/python/getpatter/dashboard/ui.py)
- [libraries/python/getpatter/pricing.py](libraries/python/getpatter/pricing.py)
- [libraries/python/getpatter/server.py](libraries/python/getpatter/server.py)
- [dashboard-app/src/App.tsx](dashboard-app/src/App.tsx)
- [dashboard-app/src/hooks/useDashboardData.ts](dashboard-app/src/hooks/useDashboardData.ts)
- [Dockerfile](Dockerfile)
- [docker-compose.yml](docker-compose.yml)
- [skills-lock.json](skills-lock.json)
</details>


Patter's operational surface spans four concerns that work together during both development and production: a real-time web dashboard backed by a Server-Sent Events (SSE) feed, a layered observability stack (in-process event bus, typed metrics, and optional OpenTelemetry tracing), automatic HTTPS tunneling for webhook delivery, and a terminal test mode that lets developers iterate without a phone or SIM card. Together they form a self-contained stack that can run on a developer laptop via Docker Compose or be promoted to a hosted server without changing application code.

This page documents each subsystem in depth — its architecture, configuration surface, data flow, and operational trade-offs — covering both the Python SDK implementation and the React dashboard SPA.

---

## Real-Time Monitoring Dashboard

### Overview and Architecture

The dashboard is a Vite + React SPA compiled into a single self-contained HTML file (`ui.html`) by `vite-plugin-singlefile`. At runtime the file is loaded from the Python package via `importlib.resources` and served as the root route of the embedded FastAPI server.

```text
┌─────────────────────────────────────────────────────┐
│  Embedded FastAPI server (server.py)                │
│                                                     │
│  GET  /                         → ui.html (SPA)    │
│  GET  /api/dashboard/calls      → call list JSON   │
│  GET  /api/dashboard/calls/:id  → single call JSON │
│  GET  /api/dashboard/active     → live calls JSON  │
│  GET  /api/dashboard/aggregates → aggregate stats  │
│  GET  /api/dashboard/events     → SSE stream       │
│  GET  /api/dashboard/export/calls → CSV / JSON     │
│  DELETE /api/dashboard/calls/:id  → soft delete    │
│  POST /api/dashboard/calls/delete → batch delete   │
└─────────────────────────────────────────────────────┘
```

The SPA asset is shipped inside the wheel via `[tool.setuptools.package-data]` in `pyproject.toml` and regenerated by running `npm run build && npm run sync` inside `dashboard-app/`.

Sources: [libraries/python/getpatter/dashboard/ui.py:1-35](), [libraries/python/getpatter/dashboard/routes.py:1-30]()

### MetricsStore — In-Memory Call State

`MetricsStore` is the single source of truth for the dashboard's data layer. It is thread-safe, uses a `threading.Lock` for all mutations, and publishes events via asyncio queues so SSE subscribers never block the call path.

| Method | Description |
|---|---|
| `record_call_initiated(data)` | Pre-registers an outbound call before any media arrives |
| `record_call_start(data)` | Marks media stream start; upgrades `initiated` → `in-progress` |
| `record_turn(data)` | Appends a completed turn; publishes `turn_complete` |
| `record_call_end(data, metrics)` | Moves the call to history with final `CallMetrics`; publishes `call_end` |
| `update_call_status(call_id, status)` | Handles Twilio-style status callbacks; moves terminal statuses to history |
| `get_aggregates()` | Computes total calls, total cost, avg duration, avg p95 latency |
| `hydrate(log_root)` | Replays `metadata.json` and `transcript.jsonl` files from disk on restart |
| `delete_calls(call_ids)` | Soft-deletes calls: hidden from UI, on-disk files untouched |

The store keeps at most `max_calls` (default 500) completed calls in memory. Calls are stored oldest-first; all read paths (`get_calls`, `get_aggregates`, `get_calls_in_range`) return newest-first and filter soft-deleted IDs.

**Persistence across restarts.** At startup, `hydrate(log_root)` walks the `<log_root>/calls/YYYY/MM/DD/<call_id>/metadata.json` files written by `CallLogger` and reconstructs the in-memory call list. Soft-deleted IDs are persisted to `<log_root>/.deleted_call_ids.json`, written atomically via `os.replace`, so deletions survive restarts. A corrupt entry is skipped individually with a `DEBUG` log rather than aborting hydration.
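
The atomic write of the soft-delete list can be sketched as follows. This is a minimal illustration rather than the SDK's code: the function names `save_deleted_ids` and `load_deleted_ids` are hypothetical, and only the file name and the use of `os.replace` come from the description above.

```python
import json
import os
import tempfile
from pathlib import Path

DELETED_FILE = ".deleted_call_ids.json"  # file name taken from the text above


def save_deleted_ids(log_root: Path, deleted: set[str]) -> None:
    """Write the ID set to a temp file, then swap it into place (hypothetical helper)."""
    target = log_root / DELETED_FILE
    # Keep the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=log_root, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(sorted(deleted), fh)
        os.replace(tmp_name, target)  # readers see the old file or the new one, never a partial write
    except BaseException:
        # Remove the temp file if anything failed before the swap.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def load_deleted_ids(log_root: Path) -> set[str]:
    """Return the persisted ID set, or an empty set if the file is missing or corrupt."""
    try:
        return set(json.loads((log_root / DELETED_FILE).read_text(encoding="utf-8")))
    except (FileNotFoundError, json.JSONDecodeError, TypeError):
        return set()
```

Writing to a sibling temp file first means a crash mid-write leaves the previous list intact instead of a truncated JSON file.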

Sources: [libraries/python/getpatter/dashboard/store.py:50-120](), [libraries/python/getpatter/dashboard/store.py:380-430]()

### Server-Sent Events (SSE) Feed

The dashboard SPA opens a persistent `EventSource('/api/dashboard/events')` connection. The server keeps an `asyncio.Queue(maxsize=100)` per subscriber and broadcasts events from `MetricsStore._publish`.

**Event types published over SSE:**

| Event | Trigger |
|---|---|
| `call_initiated` | `record_call_initiated` — outbound dial pre-registered |
| `call_start` | `record_call_start` — media stream begins |
| `call_status` | `update_call_status` — Twilio status callback |
| `turn_complete` | `record_turn` — one conversation turn logged |
| `call_end` | `record_call_end` — call finalised with metrics |
| `calls_deleted` | `delete_calls` — one or more calls soft-deleted |

The server sends a `: keepalive\n\n` comment every 30 seconds to keep proxies from timing out the connection. If a subscriber's queue is full (100 events), the subscriber is dropped silently.
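
The bounded-queue fan-out described above might look roughly like the sketch below. `Broadcaster` and `format_sse` are hypothetical names, and framing each message with a named `event:` field is an assumption; the queue size of 100 and the drop-on-full behaviour come from the text.

```python
import asyncio
import json


class Broadcaster:
    """Fan events out to per-subscriber bounded queues (illustrative only)."""

    def __init__(self, maxsize: int = 100) -> None:
        self._maxsize = maxsize
        self._subscribers: set[asyncio.Queue] = set()

    def subscribe(self) -> asyncio.Queue:
        queue: asyncio.Queue = asyncio.Queue(maxsize=self._maxsize)
        self._subscribers.add(queue)
        return queue

    def publish(self, event: str, data: dict) -> None:
        # Iterate over a copy so slow subscribers can be removed during the loop.
        for queue in list(self._subscribers):
            try:
                queue.put_nowait((event, data))
            except asyncio.QueueFull:
                # Drop the slow consumer instead of blocking the publisher.
                self._subscribers.discard(queue)

    @property
    def subscriber_count(self) -> int:
        return len(self._subscribers)


def format_sse(event: str, data: dict) -> str:
    """Serialise one message in the text/event-stream wire format."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


KEEPALIVE = ": keepalive\n\n"  # SSE comment line; EventSource clients ignore it
```

Using `put_nowait` keeps `publish` non-blocking, which is what lets the call path stay independent of how quickly any browser drains its queue.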

The SPA (`useDashboardData.ts`) reconnects with exponential backoff (1 s → 30 s cap, max 5 attempts) before falling back to polling every 5 seconds.
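
As a rough illustration of that reconnect schedule, the delays can be computed as below. The SPA itself is TypeScript; Python is used here only for consistency with the other examples. The function name `backoff_delays` and the doubling factor are assumptions, since the text states only the 1 s start, the 30 s cap, and the five-attempt limit.

```python
def backoff_delays(
    base_s: float = 1.0, cap_s: float = 30.0, max_attempts: int = 5
) -> list[float]:
    """Delay before each reconnect attempt: base * 2**n, clamped to cap_s."""
    return [min(cap_s, base_s * 2**attempt) for attempt in range(max_attempts)]
```

Under the doubling assumption the five attempts wait 1, 2, 4, 8, and 16 seconds, so the 30-second cap only comes into play if the attempt limit is raised.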

Sources: [libraries/python/getpatter/dashboard/routes.py:105-140](), [dashboard-app/src/hooks/useDashboardData.ts:14-55]()

### SPA Metrics and UI

The React SPA (`App.tsx`) renders four headline metric tiles (total calls in range, avg p95 latency, spend, active now), a call table with search, and a right-side detail panel showing live transcript and per-call metrics.

Key computed values:
- **Range filtering** — calls are bucketed into 1 h / 24 h / 7 d / All-time windows. Live calls (`status === 'live'`) are always shown regardless of range.
- **Sparklines** — computed by `computeSparkline` in `lib/mappers.ts` aligned to natural time boundaries (e.g. full hours).
- **SDK version pill** — sourced from `/api/dashboard/aggregates`'s `sdk_version` field, which reflects the installed `getpatter.__version__` at runtime.
- **Phone number masking** — the topbar reveals or hides phone numbers via a toggle stored in `useUiPrefs`.

Sources: [dashboard-app/src/App.tsx:1-120]()

### Dashboard Authentication

When `token` is non-empty, every dashboard route requires a valid token, which is compared using the constant-time `hmac.compare_digest`. Two delivery mechanisms are supported so that plain browser navigation still works:

- `Authorization: Bearer <token>` header
- `?token=<token>` query parameter

An empty token disables authentication entirely (suitable for local development).
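
A token check along these lines could be written as follows. This is a sketch under stated assumptions: `is_authorized` is a hypothetical helper, and how the real `auth.py` plugs into FastAPI is not shown in this page.

```python
import hmac
from typing import Optional


def is_authorized(
    expected: str, auth_header: Optional[str], query_token: Optional[str]
) -> bool:
    """Return True if auth is disabled or either supplied credential matches."""
    if not expected:
        return True  # an empty token disables authentication
    candidates = []
    if auth_header and auth_header.startswith("Bearer "):
        candidates.append(auth_header[len("Bearer "):])
    if query_token:
        candidates.append(query_token)
    # compare_digest takes time independent of where the two values differ.
    return any(
        hmac.compare_digest(candidate.encode("utf-8"), expected.encode("utf-8"))
        for candidate in candidates
    )
```

Keep in mind that a `?token=` query parameter can end up in browser history and proxy logs, so the header form is the safer choice for anything scripted.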

Sources: [libraries/python/getpatter/dashboard/auth.py:1-35]()

### Data Export

`GET /api/dashboard/export/calls` supports `?format=csv` and `?format=json` with optional `?from=<ISO8601>&to=<ISO8601>` date filtering. Soft-deleted calls are excluded from all exports.

CSV columns: `call_id`, `caller`, `callee`, `direction`, `started_at`, `ended_at`, `duration_s`, `cost_total`, `cost_stt`, `cost_tts`, `cost_llm`, `cost_telephony`, `avg_latency_ms`, `turns_count`, `provider_mode`.
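
The export behaviour can be approximated with the standard `csv` module. The sketch below is illustrative: `export_csv` is a hypothetical function, and it assumes each call is a dict whose `started_at` is an ISO 8601 string and that the date filter applies to that field.

```python
import csv
import io
from datetime import datetime
from typing import Iterable, Mapping, Optional, Set

CSV_COLUMNS = [
    "call_id", "caller", "callee", "direction", "started_at", "ended_at",
    "duration_s", "cost_total", "cost_stt", "cost_tts", "cost_llm",
    "cost_telephony", "avg_latency_ms", "turns_count", "provider_mode",
]


def export_csv(
    calls: Iterable[Mapping[str, object]],
    deleted_ids: Set[str],
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> str:
    """Render non-deleted calls inside [start, end] as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=CSV_COLUMNS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for call in calls:
        if call["call_id"] in deleted_ids:
            continue  # soft-deleted calls never appear in exports
        started = datetime.fromisoformat(str(call["started_at"]))
        if start is not None and started < start:
            continue
        if end is not None and started > end:
            continue
        writer.writerow(call)  # missing columns are written as empty cells
    return buffer.getvalue()
```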

Sources: [libraries/python/getpatter/dashboard/export.py:1-55](), [libraries/python/getpatter/dashboard/routes.py:150-190]()

---

## Observability

Patter's observability stack has three independent layers that can be used in any combination.

```text
┌───────────────────────────────────────────────────────────────────┐
│  Layer 1: In-process EventBus                                     │
│  Synchronous + async handlers; fire-and-forget; never blocks call │
├───────────────────────────────────────────────────────────────────┤
│  Layer 2: Typed Metric Dataclasses                                │
│  Frozen, provider-neutral; fed into MetricsStore + exporters      │
├───────────────────────────────────────────────────────────────────┤
│  Layer 3: OpenTelemetry Tracing (opt-in, PATTER_OTEL_ENABLED=1)   │
│  Standard OTLP export; no PII in span attributes                  │
└───────────────────────────────────────────────────────────────────┘
```

### Layer 1 — In-Process EventBus

`EventBus` is a lightweight pub-sub emitter for pipeline-internal events. Handlers are fire-and-forget: exceptions are caught and logged so a misbehaving observer never disrupts the call.

```python
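from getpatter.observability.event_bus import EventBus  # module path per the source listing

bus = EventBus()
# on() returns a callable that removes the listener again.
unsub = bus.on("turn_ended", lambda payload: print(payload))
bus.emit("turn_ended", {"turn_index": 0})  # the sync handler runs inline
unsub()  # remove listener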
```

Async callbacks are scheduled via `asyncio.create_task` (requires a running event loop). Sync callbacks are called inline.
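
To make the dispatch rules concrete, here is a minimal emitter with the same shape. `MiniEventBus` is a hypothetical stand-in written for this page, not the SDK's `EventBus`, and details such as the logging call are assumptions.

```python
import asyncio
import inspect
import logging
from collections import defaultdict
from typing import Any, Callable

logger = logging.getLogger(__name__)


class MiniEventBus:
    """Illustrative pub-sub emitter: sync handlers inline, async ones as tasks."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[Any], Any]]] = defaultdict(list)

    def on(self, event: str, handler: Callable[[Any], Any]) -> Callable[[], None]:
        self._handlers[event].append(handler)

        def unsubscribe() -> None:
            if handler in self._handlers[event]:
                self._handlers[event].remove(handler)

        return unsubscribe

    def emit(self, event: str, payload: Any = None) -> None:
        # Copy the list so handlers may unsubscribe while we iterate.
        for handler in list(self._handlers[event]):
            try:
                if inspect.iscoroutinefunction(handler):
                    # Raises RuntimeError (caught below) when no loop is running.
                    asyncio.get_running_loop().create_task(handler(payload))
                else:
                    handler(payload)
            except Exception:
                # A failing observer is logged and never propagates to the caller.
                logger.exception("handler for %r failed", event)
```

Note that the `try` block only covers synchronous failures; an exception raised later inside a scheduled task surfaces through the task, not through `emit`.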

**Supported event types** (`PatterEventType`):

| Event | When emitted |
|---|---|
| `turn_started` / `turn_ended` | Around each user/agent turn |
| `eou_metrics` | End-of-utterance timing captured |
| `interruption` | Barge-in detected |
| `llm_metrics` / `stt_metrics` / `tts_metrics` | Provider-stage metrics |
| `metrics_collected` | Full turn metrics aggregated |
| `call_ended` | Call teardown |
| `transcript_partial` / `transcript_final` | STT transcript events |
| `llm_chunk` / `tts_chunk` | Streaming token / audio chunk |
| `tool_call_started` | Tool invocation begins |

Sources: [libraries/python/getpatter/observability/event_bus.py:1-70]()

### Layer 2 — Typed Metric Dataclasses

All metric payloads are frozen dataclasses (no Pydantic dependency) defined in `metric_types.py`. They form the canonical observability surface consumed by the dashboard, EventBus handlers, and exporters.

| Type | Fields |
|---|---|
| `EOUMetrics` | `end_of_utterance_delay`, `transcription_delay`, `on_user_turn_completed_delay` (all ms) |
| `InterruptionMetrics` | `total_duration`, `detection_delay`, `num_interruptions`, `num_backchannels` (seconds) |
| `TTFBMetrics` | `processor`, `value`, `model` |
| `ProcessingMetrics` | `processor`, `value`, `model` |
| `LLMUsage` | `prompt_tokens`, `completion_tokens`, `total_tokens`, cached/creation/read tokens |
| `RealtimeUsage` | `session_duration_seconds`, `tokens_per_second`, `InputTokenDetails`, `OutputTokenDetails` |

All timestamps default to `time.time()` at dataclass construction, keeping the pipeline code free of explicit clock calls.
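
The pattern of a frozen dataclass with a construction-time timestamp can be sketched as below. `TTFBSample` is a hypothetical class that mirrors the `TTFBMetrics` fields from the table; the field types and the `timestamp` field name are assumptions.

```python
import time
from dataclasses import FrozenInstanceError, dataclass, field
from typing import Optional


@dataclass(frozen=True)
class TTFBSample:
    """Immutable time-to-first-byte sample (illustrative, not the SDK type)."""

    processor: str
    value: float
    model: Optional[str] = None
    # default_factory runs once per instance, so each sample is stamped at creation.
    timestamp: float = field(default_factory=time.time)


sample = TTFBSample(processor="tts", value=0.182)
try:
    sample.value = 1.0  # type: ignore[misc]
except FrozenInstanceError:
    pass  # frozen dataclasses reject mutation after construction
```

Using `default_factory` rather than `default=time.time()` matters: the latter would evaluate once at class definition and give every instance the same timestamp.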

Sources: [libraries/python/getpatter/observability/metric_types.py:1-115]()

### Layer 3 — OpenTelemetry Tracing (Opt-In)

OTel tracing is disabled by default. It activates only when `PATTER_OTEL_ENABLED=1` is set **and** the `opentelemetry-sdk` package is installed. No telemetry is emitted without explicit opt-in.

**Enabling tracing:**

```bash
pip install "getpatter[tracing]"
export PATTER_OTEL_ENABLED=1
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
```

```python
from getpatter.observability import init_tracing
init_tracing(service_name="my-agent", resource_attributes={"deployment.env": "prod"})
```

**Span hierarchy for a single call:**

```text
getpatter.call          ← top-level call span (stream handler)
  getpatter.endpoint    ← silence-detected → LLM-dispatch window
  getpatter.stt         ← one STT inference
  getpatter.llm         ← one LLM completion
    getpatter.tool      ← one tool invocation
  getpatter.tts         ← one TTS synthesis
  getpatter.bargein     ← interrupt-detected → TTS-stopped window
```

**Privacy guarantee:** Only sizes and provider identifiers are recorded as span attributes. User utterances and tool payloads are never included.

**`start_span` context manager** is a cheap no-op when tracing is disabled, so callers do not need to guard against the disabled state:

```python
with start_span("getpatter.llm", attributes={"llm.provider": "openai"}) as span:
    ...  # span is None when tracing is off
```
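
One way such a no-op-when-disabled context manager could be built is sketched below. `start_span_sketch` and `tracing_enabled` are hypothetical names, and the tracer name `"getpatter"` is an assumption; the environment variable and the `None` value when tracing is off come from the text.

```python
import os
from contextlib import contextmanager
from typing import Any, Iterator, Mapping, Optional


def tracing_enabled() -> bool:
    return os.environ.get("PATTER_OTEL_ENABLED") == "1"


@contextmanager
def start_span_sketch(
    name: str, attributes: Optional[Mapping[str, Any]] = None
) -> Iterator[Optional[Any]]:
    """Yield a real span when tracing is on, otherwise yield None."""
    if not tracing_enabled():
        yield None
        return
    try:
        from opentelemetry import trace  # optional dependency
    except ImportError:
        trace = None
    if trace is None:
        yield None
        return
    tracer = trace.get_tracer("getpatter")
    with tracer.start_as_current_span(name, attributes=dict(attributes or {})) as span:
        yield span
```

Importing `opentelemetry` lazily inside the function keeps the package optional: installations without the tracing extra never touch it.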

**`patter_call_scope`** binds a `call_id` and an optional `side` label to the current asyncio task tree using `ContextVar`, so spans from deeply nested provider code automatically inherit the call identity.

**`attach_span_exporter`** wires a custom `SpanExporter` into the global `TracerProvider` via `SimpleSpanProcessor`. The call is idempotent: attaching the same exporter object a second time has no further effect.

Sources: [libraries/python/getpatter/observability/tracing.py:1-160](), [libraries/python/getpatter/observability/attributes.py:1-120]()

### Cost Tracking

`pricing.py` maintains a versioned `DEFAULT_PRICING` table (version `2026.3`, last updated 2026-05-08). Each provider entry carries provider-level defaults plus an optional `models` dict for model-specific overrides. Lookup uses longest-prefix matching so versioned model IDs like `claude-haiku-4-5-20251001` resolve correctly against `claude-haiku-4-5`.
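
Longest-prefix matching over a pricing table can be sketched in a few lines. `longest_prefix_match` is a hypothetical helper, not the function in `pricing.py`.

```python
from typing import Mapping, Optional, TypeVar

T = TypeVar("T")


def longest_prefix_match(model_id: str, table: Mapping[str, T]) -> Optional[T]:
    """Return the entry whose key is the longest prefix of model_id, if any."""
    best: Optional[str] = None
    for key in table:
        if model_id.startswith(key) and (best is None or len(key) > len(best)):
            best = key
    return table[best] if best is not None else None
```

Preferring the longest key means a specific entry such as `claude-haiku-4-5` wins over a broader `claude-haiku` entry when both are present.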

Cost fields flow through `CallMetrics.cost` into `MetricsStore` and are surfaced in:
- The dashboard's per-call detail panel and aggregate `total_cost`
- The CSV export columns `cost_total`, `cost_stt`, `cost_tts`, `cost_llm`, `cost_telephony`
- OTel `patter.cost.*` span attributes via `record_patter_attrs`

Operator overrides:

```python
Patter(pricing={
    "elevenlabs": {"models": {"my_custom_model": {"price": 0.075}}}
})
```

Sources: [libraries/python/getpatter/pricing.py:1-50]()

---

## Tunneling

Patter needs a publicly reachable HTTPS URL to receive webhook callbacks from telephony carriers. The `getpatter.tunnels` module defines three tunnel strategies, of which `Ngrok` is currently a marker only.

### Strategy Comparison

| Strategy | Class | Public URL | Account required | Stable URL | Process managed |
|---|---|---|---|---|---|
| Cloudflare Quick Tunnel | `CloudflareTunnel` | `*.trycloudflare.com` | No | No (random each run) | Yes (`cloudflared` subprocess) |
| Static / bring-your-own | `Static` | Any hostname | Depends | Yes | No |
| Ngrok (directive only) | `Ngrok` | `*.ngrok.io` | Yes | Yes (with reserved) | No (Phase 1a marker) |

### CloudflareTunnel — Automatic Setup

When `tunnel=True` or `tunnel=CloudflareTunnel()` is passed to the server, `start_tunnel(port)` spawns a `cloudflared tunnel --url http://localhost:<port>` subprocess, reads the assigned `*.trycloudflare.com` hostname from `cloudflared`'s stderr, and registers an `atexit` handler to stop the process on exit.

```python
from getpatter.tunnels import CloudflareTunnel
phone = Patter(mode="local", tunnel=CloudflareTunnel())
```

Requirements: `cloudflared` binary on `PATH`.

```bash
# macOS
brew install cloudflared
# Debian/Ubuntu (requires Cloudflare's apt repository to be configured first)
sudo apt install cloudflared
```

The URL regex used to extract the hostname is `https://([a-zA-Z0-9._-]+\.trycloudflare\.com)`, applied to both stdout and stderr streams concurrently. A `TimeoutError` is raised if no URL appears within 30 seconds.
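
Applied to a stream of log lines, the extraction step could look like this. The regex is the one quoted above; `extract_tunnel_hostname` is a hypothetical helper, and the 30-second timeout and concurrent stream reading are left out.

```python
import re
from typing import Iterable, Optional

TUNNEL_URL_RE = re.compile(r"https://([a-zA-Z0-9._-]+\.trycloudflare\.com)")


def extract_tunnel_hostname(lines: Iterable[str]) -> Optional[str]:
    """Return the first trycloudflare.com hostname found, or None."""
    for line in lines:
        match = TUNNEL_URL_RE.search(line)
        if match:
            return match.group(1)  # hostname without the https:// scheme
    return None
```

Because the function accepts any iterable of strings, it works the same whether the lines come from stdout, stderr, or a merged stream.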

**Trade-off:** The URL is ephemeral — it changes every time the process restarts. For production or long-lived development sessions use a `Static` hostname.

Sources: [libraries/python/getpatter/tunnel.py:1-115]()

### Static — Stable Webhook URL

`Static(hostname="...")` tells the server to use a pre-existing public hostname. The SDK will not spawn any subprocess. The hostname must already route HTTPS traffic to the local port (via ngrok, Cloudflare Tunnel with a named route, a reverse proxy, or a direct public IP).

```python
from getpatter.tunnels import Static
phone = Patter(mode="local", tunnel=Static(hostname="agent.example.com"))
```

`Static` validates that `hostname` is non-empty at construction time.

Sources: [libraries/python/getpatter/tunnels/__init__.py:40-60]()

### Ngrok (Planned)

`Ngrok` is currently a marker dataclass — it records an optional reserved `hostname` but does not launch a subprocess. Programmatic `ngrok` integration via the `ngrok` Python package is noted as a future addition. Users who already run `ngrok` manually should use `Static` with the public hostname.

Sources: [libraries/python/getpatter/tunnels/__init__.py:18-38]()

---

## Test Mode — No Phone Required

`TestSession.run(agent)` starts an interactive terminal REPL that simulates a phone call without requiring telephony, STT, or TTS. This is the recommended first step for iterating on agent logic.

```python
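import asyncio

# Patter, DeepgramSTT and ElevenLabsTTS are assumed to be imported from the SDK.

async def main() -> None:
    phone = Patter(mode="local", phone_number="+15550001234")
    agent = phone.agent(
        system_prompt="You are helpful.",
        stt=DeepgramSTT(api_key="..."),
        tts=ElevenLabsTTS(api_key="..."),
    )
    await phone.test(agent)  # opens the interactive terminal REPL

asyncio.run(main())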
```

### Session Behaviour

On entry the REPL:
1. Generates a synthetic `call_id` (`test_<12 hex chars>`), caller (`+15550000001`), and callee (`+15550000002`).
2. Fires `on_call_start` if registered, accepting the same override dict as a real call.
3. Prints the agent's `first_message` if set.
4. Enters a `readline`-based REPL in a thread-pool executor (so the async event loop remains unblocked).

**Built-in LLM fallback.** If no `on_message` handler is provided but `openai_key` is supplied, an `LLMLoop` is instantiated using `gpt-4o-mini` (Realtime models are silently swapped to chat-completions for test mode compatibility). System prompt variable substitution runs before the loop starts.

**REPL commands:**

| Command | Effect |
|---|---|
| `/quit` | End session immediately |
| `/hangup` | Simulate caller hangup |
| `/transfer <number>` | Simulate a transfer (agent-initiated) |
| `/history` | Print conversation history so far |

On exit the REPL fires `on_call_end` with the full transcript so end-of-call hooks are exercised.
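
The command handling and the non-blocking read can be sketched as follows. `ReplInput`, `parse_repl_line`, and `read_line` are hypothetical names, and treating an unknown or incomplete command as ordinary user text is an assumption about behaviour the page does not specify.

```python
import asyncio
from typing import NamedTuple, Optional


class ReplInput(NamedTuple):
    kind: str  # "quit", "hangup", "transfer", "history", or "message"
    argument: Optional[str] = None


def parse_repl_line(line: str) -> ReplInput:
    """Classify one line of terminal input as a command or a user message."""
    text = line.strip()
    if not text.startswith("/"):
        return ReplInput("message", text)
    command, _, rest = text.partition(" ")
    name = command[1:].lower()
    if name in {"quit", "hangup", "history"}:
        return ReplInput(name)
    if name == "transfer" and rest.strip():
        return ReplInput("transfer", rest.strip())
    # Assumption: anything else is passed through as a normal message.
    return ReplInput("message", text)


async def read_line(prompt: str = "you> ") -> str:
    """Read stdin in the default thread pool so the event loop stays responsive."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, input, prompt)
```

Running `input` in an executor is what allows timers, callbacks, and streaming LLM responses to keep progressing while the terminal waits for the next line.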

Sources: [libraries/python/getpatter/test_mode.py:1-175]()

---

## Docker Compose Deployment

### Dockerfile

The provided `Dockerfile` uses `python:3.13-slim` as the base, installs `getpatter[local]` and `python-dotenv`, copies the working directory, exposes port 8000, and defaults to running `python/main.py`:

```dockerfile
FROM python:3.13-slim
WORKDIR /app
RUN pip install --no-cache-dir "getpatter[local]" python-dotenv
COPY . .
EXPOSE 8000
CMD ["python", "python/main.py"]
```

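Override the default command (`CMD`) to run any agent script. The example assumes the image was built with the tag `patter`, for instance via `docker build -t patter .`: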

```bash
docker run patter python my_agent.py
```

Sources: [Dockerfile:1-14]()

### docker-compose.yml

The minimal Compose file builds the image from the local context, forwards port `8000`, loads environment from `.env`, and restarts unless manually stopped:

```yaml
services:
  patter:
    build: .
    ports:
      - "8000:8000"
    env_file: .env
    restart: unless-stopped
```

```bash
cp .env.example .env   # fill in API keys
docker compose up --build
```

The dashboard will be accessible at `http://localhost:8000` once the agent starts. In this configuration the Cloudflare tunnel is optional — a `Static` hostname pointing to your public IP or a deployed ingress is preferred for production.

Sources: [docker-compose.yml:1-7]()

### Sequence: Development Startup

```mermaid
sequenceDiagram
    participant Dev as Developer
    participant Patter as EmbeddedServer
    participant CF as cloudflared
    participant Carrier as Telephony Carrier
    participant Browser as Dashboard Browser

    Dev->>Patter: python main.py (or docker compose up)
    Patter->>CF: spawn cloudflared tunnel --url http://localhost:8000
    CF-->>Patter: trycloudflare.com URL extracted from stderr
    Patter->>Carrier: register webhook_url = https://<hostname>
    Patter->>Browser: serve GET / → ui.html (SPA)
    Browser->>Patter: EventSource /api/dashboard/events
    Carrier->>Patter: POST /webhook (inbound call)
    Patter->>Browser: SSE: call_start, turn_complete, call_end
```

---

## Agent Skills Bundle

The repository ships one locked skill for AI coding assistants:

| Skill | Source | Hash |
|---|---|---|
| `line-voice-agent` | `cartesia-ai/skills` (GitHub) | `554487...` |

The skill is declared in `skills-lock.json` and the resolved artefact lives under `.agents/skills/line-voice-agent/`. This provides AI coding assistants (such as Goose and pi) with context-aware guidance for building voice agents with the Cartesia Line SDK, tool calling, multi-agent handoffs, and real-time interruption handling — all within the Patter development environment.

Sources: [skills-lock.json:1-11]()

---

## Summary

Patter's operational layer is designed around three principles: **zero-cost defaults** (tracing is off unless `PATTER_OTEL_ENABLED=1` is set; the dashboard requires no external database), **bring-your-own infrastructure** (tunnel strategy, OTLP backend, pricing overrides, and metrics backends are all pluggable), and **rapid local iteration** (test mode collapses the telephony/STT/TTS stack to a terminal REPL with a one-liner). In production the same FastAPI server that handles webhooks also serves the React dashboard, exposes the SSE feed, and writes per-call JSONL logs — all within a single process that fits in a `python:3.13-slim` container.