# Carrier & Telephony Layer

> How Twilio and Telnyx carrier adapters are structured, the WebSocket and webhook call-control handshakes, inbound vs. outbound call flows, DTMF, AMD, call transfer, voicemail drop, and recording parity between carriers.

- Repository: PatterAI/Patter
- GitHub: https://github.com/PatterAI/Patter
- Human wiki: https://grok-wiki.com/public/wiki/patterai-patter-57d14e233afc
- Complete Markdown: https://grok-wiki.com/public/wiki/patterai-patter-57d14e233afc/llms-full.txt

## Source Files

- `libraries/python/getpatter/carriers/twilio.py`
- `libraries/python/getpatter/carriers/telnyx.py`
- `libraries/python/getpatter/telephony/twilio.py`
- `libraries/python/getpatter/telephony/telnyx.py`
- `libraries/python/getpatter/telephony/common.py`
- `libraries/python/getpatter/server.py`
- `libraries/typescript/src/server.ts`

---

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:

- [libraries/python/getpatter/carriers/twilio.py](libraries/python/getpatter/carriers/twilio.py)
- [libraries/python/getpatter/carriers/telnyx.py](libraries/python/getpatter/carriers/telnyx.py)
- [libraries/python/getpatter/providers/twilio_adapter.py](libraries/python/getpatter/providers/twilio_adapter.py)
- [libraries/python/getpatter/providers/telnyx_adapter.py](libraries/python/getpatter/providers/telnyx_adapter.py)
- [libraries/python/getpatter/telephony/twilio.py](libraries/python/getpatter/telephony/twilio.py)
- [libraries/python/getpatter/telephony/telnyx.py](libraries/python/getpatter/telephony/telnyx.py)
- [libraries/python/getpatter/telephony/common.py](libraries/python/getpatter/telephony/common.py)
- [libraries/python/getpatter/server.py](libraries/python/getpatter/server.py)
- [libraries/typescript/src/server.ts](libraries/typescript/src/server.ts)
- [libraries/python/getpatter/client.py](libraries/python/getpatter/client.py)
</details>


The carrier and telephony layer is the interface between Patter's AI pipeline and the public switched telephone network. It provides a uniform abstraction over Twilio and Telnyx so the rest of the system — stream handlers, metrics, observability — can operate without knowing which carrier is active. The layer covers credential management, webhook signature verification, WebSocket media-stream bridging, audio codec transcoding, DTMF, answering-machine detection (AMD), call transfer, voicemail drop, and recording.

Both Python (`libraries/python/getpatter/`) and TypeScript (`libraries/typescript/src/`) runtimes implement the full Twilio and Telnyx surface with close behavioural parity; this page is grounded primarily in the Python implementation, with TypeScript differences noted where they exist.

---

## Carrier Credentials and the `kind` Discriminator

Each carrier is represented by a frozen, self-validating dataclass in `getpatter/carriers/`.

```python
# libraries/python/getpatter/carriers/twilio.py
from dataclasses import dataclass


@dataclass(frozen=True)
class Carrier:
    account_sid: str = ""
    auth_token: str = ""

    @property
    def kind(self) -> str:
        return "twilio"
```

```python
# libraries/python/getpatter/carriers/telnyx.py
from dataclasses import dataclass


@dataclass(frozen=True)
class Carrier:
    api_key: str = ""
    connection_id: str = ""
    public_key: str = ""   # Ed25519 public key for webhook verification

    @property
    def kind(self) -> str:
        return "telnyx"
```

`__post_init__` on both classes falls back to environment variables (`TWILIO_ACCOUNT_SID` / `TWILIO_AUTH_TOKEN`; `TELNYX_API_KEY` / `TELNYX_CONNECTION_ID`) and raises `ValueError` if required fields are missing. The `kind` property is the stable discriminator used downstream by Phase 2 dispatch to instantiate the correct `TwilioAdapter` or `TelnyxAdapter`.

Sources: [carriers/twilio.py:10-43](), [carriers/telnyx.py:10-50]()
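
The environment fallback described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's code: the class name `TwilioCarrierSketch` is hypothetical, and assigning through `object.__setattr__` is an assumption about how a frozen dataclass would populate fields inside `__post_init__`.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class TwilioCarrierSketch:
    """Hypothetical stand-in for the Twilio carrier dataclass."""

    account_sid: str = ""
    auth_token: str = ""

    def __post_init__(self) -> None:
        # Frozen dataclasses block normal attribute assignment, so the
        # environment fallback goes through object.__setattr__.
        for field_name, env_var in (
            ("account_sid", "TWILIO_ACCOUNT_SID"),
            ("auth_token", "TWILIO_AUTH_TOKEN"),
        ):
            if not getattr(self, field_name):
                object.__setattr__(self, field_name, os.environ.get(env_var, ""))
            if not getattr(self, field_name):
                raise ValueError(f"{field_name} is required (or set {env_var})")

    @property
    def kind(self) -> str:
        # Stable discriminator used to pick the matching adapter.
        return "twilio"
```

Explicit constructor arguments win over the environment, and a field that is empty in both places fails at construction time rather than at the first API call.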

---

## Provider Adapters

Both adapters implement the common `TelephonyProvider` interface (`providers/base.py`) with four operations: `provision_number`, `configure_number`, `initiate_call`, and `end_call`.

| Capability | `TwilioAdapter` | `TelnyxAdapter` |
|---|---|---|
| Client library | `twilio.rest.Client` (sync, run in executor) | `httpx.AsyncClient` (native async) |
| Provision | `available_phone_numbers.local.list` → `incoming_phone_numbers.create` | `GET /available_phone_numbers` → `POST /number_orders` |
| Configure | `incoming_phone_numbers[n].update(voice_url=…)` | `PATCH /phone_numbers/{id}/voice` with `connection_id` |
| Initiate call | `calls.create(twiml=<Connect><Stream>…)` | `POST /calls` (stream attached later, see below) |
| End call | `calls(sid).update(status="completed")` | `POST /calls/{id}/actions/hangup` |
| Observability | emits `patter.cost.telephony_minutes` on active span | same |

`TwilioAdapter.initiate_call` builds a TwiML `<Connect><Stream url="wss://…/outbound"/>` inline and passes it to `calls.create`, so the outbound call is already wired to the WebSocket URL at dial time.

`TelnyxAdapter.initiate_call` posts only to `POST /calls`. Telnyx does **not** accept stream parameters at dial time; media streaming is attached separately after the `call.answered` webhook fires (via `actions/answer` with inline stream params). The `stream_url` argument to `initiate_call` is intentionally unused and retained only for interface parity.

Sources: [providers/twilio_adapter.py:34-161](), [providers/telnyx_adapter.py:19-185]()
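
The "sync, run in executor" entry in the table refers to a common asyncio pattern: a blocking SDK call is pushed onto a worker thread so the event loop can keep servicing media WebSockets. The sketch below shows only that pattern; `BlockingClientStub` and its `create_call` method are hypothetical stand-ins, not the Twilio SDK or the adapter's real signature.

```python
import asyncio
import time


class BlockingClientStub:
    """Hypothetical stand-in for a synchronous SDK client."""

    def create_call(self, to: str, from_: str) -> str:
        time.sleep(0.01)  # simulates blocking network I/O
        return f"call:{from_}->{to}"


async def initiate_call(client: BlockingClientStub, to: str, from_: str) -> str:
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default thread pool; the coroutine
    # suspends here while the blocking call runs on a worker thread.
    return await loop.run_in_executor(None, client.create_call, to, from_)
```

An adapter built on a native async HTTP client, as described for Telnyx, would `await` the request directly and skip the executor hop.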

---

## Webhook and WebSocket Route Map

The `EmbeddedServer` (Python: `server.py`, TypeScript: `server.ts`) registers all carrier routes on a single FastAPI / Express application.

```text
HTTP Webhooks
  POST /webhooks/twilio/voice          ← inbound call arrival (returns TwiML)
  POST /webhooks/twilio/status         ← call lifecycle transitions
  POST /webhooks/twilio/amd            ← Async AMD result
  POST /webhooks/twilio/recording      ← recording completion
  POST /webhooks/telnyx/voice          ← all Telnyx Call Control events (single endpoint)

WebSocket Streams
  WS  /ws/stream/{call_id}             ← Twilio Media Stream (inbound)
  WS  /ws/stream/outbound              ← Twilio Media Stream (outbound)
  WS  /ws/telnyx/stream/{call_id}      ← Telnyx bidirectional media (inbound)
  WS  /ws/telnyx/stream/outbound       ← Telnyx bidirectional media (outbound)
```

Sources: [server.py:448-470](), [server.py:727-760](), [server.py:851-1001]()

---

## Webhook Signature Verification

### Twilio — HMAC-SHA1

The Twilio path uses the `twilio-python` `RequestValidator` to verify the `X-Twilio-Signature` header. The validator is instantiated from `twilio_token` and checks the signature against the reconstructed `https://` URL and the POSTed form parameters. If the `twilio` package is missing while a `twilio_token` is configured, the server **rejects** the request with HTTP 503 rather than silently skipping validation.
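
For readers unfamiliar with the scheme, the check that `RequestValidator` performs for form-encoded POST webhooks can be reproduced with the standard library. The sketch below follows Twilio's published algorithm; the function names are hypothetical, and it is meant as an explanation of the signature, not as a replacement for the SDK validator.

```python
import base64
import hashlib
import hmac
from typing import Dict


def twilio_signature(auth_token: str, url: str, params: Dict[str, str]) -> str:
    # Twilio's documented scheme: the full request URL, followed by each
    # POST field name and value, concatenated in sorted field-name order.
    payload = url + "".join(key + params[key] for key in sorted(params))
    digest = hmac.new(auth_token.encode(), payload.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


def is_valid_signature(
    auth_token: str, url: str, params: Dict[str, str], header_value: str
) -> bool:
    expected = twilio_signature(auth_token, url, params)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, header_value)
```

Because the URL is part of the signed payload, the server has to reconstruct the exact public `https://` URL Twilio called; a mismatch caused by a proxy or tunnel rewriting the scheme or host makes every signature fail.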

### Telnyx — Ed25519

Telnyx signs every webhook with an Ed25519 private key. The server verifies the `telnyx-signature-ed25519` header using the base64-encoded DER public key stored in `LocalConfig.telnyx_public_key`. The signed payload is `timestamp + "|" + raw_body`. Timestamp staleness is checked with a 300-second tolerance (handling the Telnyx seconds-vs-milliseconds epoch ambiguity with a heuristic). Multiple comma-separated signatures in the header are each tried in order to support key rotation; the webhook is accepted if any one verifies.

If `require_signature=True` (the default) and the public key is absent, the webhook returns HTTP 503.
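
The timestamp handling in this subsection can be sketched without a cryptography dependency. The code below is a minimal sketch and covers only the staleness check and the construction of the signed bytes; the `MS_THRESHOLD` cutoff for the seconds-vs-milliseconds heuristic and all function names are assumptions, and the Ed25519 verification itself would be delegated to a signature library.

```python
import time
from typing import Optional

TOLERANCE_S = 300        # tolerance stated in the text above
MS_THRESHOLD = 10 ** 11  # assumption: larger values are epoch milliseconds


def normalise_epoch(raw_timestamp: str) -> float:
    value = float(raw_timestamp)
    # Epoch seconds stay below 1e11 for millennia; epoch milliseconds
    # have exceeded it since 1973.
    return value / 1000.0 if value > MS_THRESHOLD else value


def is_fresh(raw_timestamp: str, now: Optional[float] = None) -> bool:
    current = time.time() if now is None else now
    return abs(current - normalise_epoch(raw_timestamp)) <= TOLERANCE_S


def signed_payload(raw_timestamp: str, raw_body: bytes) -> bytes:
    # The signature covers the timestamp exactly as received, a pipe,
    # and the unparsed request body.
    return raw_timestamp.encode() + b"|" + raw_body
```

The body must be the raw bytes from the request; re-serialising parsed JSON can change whitespace or key order and invalidate an otherwise good signature.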

Sources: [server.py:151-207]() (Telnyx Ed25519), [server.py:454-510]() (Twilio HMAC-SHA1)

---

## Inbound Call Flow

### Twilio

```mermaid
sequenceDiagram
    participant Carrier as Twilio
    participant Server as EmbeddedServer
    participant Bridge as twilio_stream_bridge
    participant AI as StreamHandler

    Carrier->>Server: POST /webhooks/twilio/voice (CallSid, From, To)
    Server->>Carrier: 200 TwiML <Connect><Stream url="wss://…/ws/stream/{sid}"><Parameter caller/callee>
    Carrier-->>Bridge: WebSocket connect /ws/stream/{call_sid}
    Bridge->>Bridge: event="start" → read customParameters (caller, callee, callSid)
    Bridge->>AI: handler.start()
    loop media
        Carrier-->>Bridge: event="media" (mulaw 8kHz, base64)
        Bridge->>AI: handler.on_audio_received(mulaw)
    end
    Carrier-->>Bridge: event="stop"
    Bridge->>Bridge: flush + cleanup + metrics finalization
```

**Critical detail on `<Parameter>` tags**: Twilio's Media Stream implementation strips query-string parameters from the `<Stream url=…>` before opening the WebSocket. Caller and callee must be forwarded as `<Parameter name="caller" value="…"/>` children of `<Stream>`; the bridge reads them from `start.customParameters` on the WS `start` frame.

Sources: [telephony/twilio.py:67-97](), [telephony/twilio.py:268-316]()
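
A TwiML response that carries caller and callee as `<Parameter>` children, as described above, could be generated like this. It is a sketch, assuming a hypothetical helper name `build_stream_twiml`; the repository may build the XML differently, for example by string formatting.

```python
import xml.etree.ElementTree as ET


def build_stream_twiml(ws_url: str, caller: str, callee: str) -> str:
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    stream = ET.SubElement(connect, "Stream", {"url": ws_url})
    # Query strings on the Stream URL do not reach the WebSocket, so the
    # metadata travels as Parameter children instead.
    ET.SubElement(stream, "Parameter", {"name": "caller", "value": caller})
    ET.SubElement(stream, "Parameter", {"name": "callee", "value": callee})
    return ET.tostring(response, encoding="unicode")
```

Building the document with `ElementTree` rather than string interpolation means attribute values are XML-escaped automatically, which matters because the caller number originates from an external request.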

### Telnyx

Telnyx uses a REST command model. All call lifecycle events arrive at the single `POST /webhooks/telnyx/voice` endpoint as JSON with an `event_type` discriminator.

```mermaid
sequenceDiagram
    participant Carrier as Telnyx
    participant Server as EmbeddedServer
    participant Bridge as telnyx_stream_bridge
    participant AI as StreamHandler

    Carrier->>Server: POST /webhooks/telnyx/voice {event_type: "call.initiated"}
    Server->>Carrier: POST /calls/{id}/actions/answer (inline stream_url, PCMU 8kHz)
    Carrier->>Server: POST /webhooks/telnyx/voice {event_type: "call.answered"} (no-op, stream already active)
    Carrier-->>Bridge: WebSocket connect /ws/telnyx/stream/{call_control_id}?caller=…&callee=…
    Bridge->>Bridge: event="start" → extract call_control_id, from, to
    Bridge->>AI: handler.start()
    loop media (inbound_track only)
        Carrier-->>Bridge: event="media" {track: "inbound", payload: base64}
        Bridge->>AI: handler.on_audio_received(mulaw)
    end
    Carrier-->>Bridge: event="stop"
    Bridge->>Bridge: flush + cleanup + metrics finalization
```

**Inline stream optimisation**: Rather than answering the call and then POSTing `streaming_start` as two separate REST calls, the server folds both into a single `actions/answer` body. This removes one `call.answered` webhook round-trip and one HTTP POST, saving approximately 100–200 ms per inbound call.

The `inbound_track` stream filter is set on Telnyx to halve upstream WebSocket bandwidth; outbound echo frames (track=`outbound`) are discarded in the bridge even when `both_tracks` is negotiated.

Sources: [server.py:851-888](), [telephony/telnyx.py:134-163](), [telephony/telnyx.py:375-384]()
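
The track filtering in the Telnyx bridge amounts to a small decode-and-discard step per WebSocket message. The sketch below assumes the media fields are nested under a `media` key in the JSON frame; that nesting and the function name `inbound_audio` are assumptions for illustration.

```python
import base64
import json
from typing import Optional


def inbound_audio(raw_message: str) -> Optional[bytes]:
    """Return decoded caller audio, or None if the frame should be ignored."""
    frame = json.loads(raw_message)
    if frame.get("event") != "media":
        return None  # start / stop / dtmf frames are handled elsewhere
    media = frame.get("media", {})  # assumed nesting of track and payload
    if media.get("track") != "inbound":
        return None  # outbound echo of the agent's own audio: drop it
    return base64.b64decode(media.get("payload", ""))
```

Dropping outbound frames in the bridge is a second line of defence: even if the carrier is configured to send both tracks, the agent's own speech never reaches the speech-to-text path.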

---

## Outbound Call Flow

Both carriers share the same `Patter.call()` entry point in `client.py`. The dispatch switches on `config.telephony_provider`.

**Twilio outbound**: `TwilioAdapter.initiate_call()` calls `twilio.calls.create()` with an inline TwiML body that points to `/ws/stream/outbound`. All extra parameters (AMD, ring timeout, status callback) are passed as snake_case kwargs; the twilio-python SDK translates them to PascalCase on the wire. Passing PascalCase directly would raise `TypeError`.

**Telnyx outbound**: `TelnyxAdapter.initiate_call()` posts to `POST /calls` with no stream URL (unsupported at dial time). The `call.answered` webhook triggers `actions/answer` with the stream parameters. Telnyx then opens the WebSocket connection to `/ws/telnyx/stream/outbound`.

Sources: [client.py:628-757](), [providers/twilio_adapter.py:74-104](), [providers/telnyx_adapter.py:94-163]()

---

## Audio Codec and Transcoding

Both carriers negotiate PCMU (G.711 μ-law) 8 kHz on the RTP / WebSocket leg. Audio frames are base64-encoded in JSON.

### `TwilioAudioSender`

Twilio Media Streams deliver mulaw 8 kHz. Outbound direction depends on the provider mode:

- **`openai_realtime` / `openai_realtime_2`**: OpenAI Realtime is configured with `audio_format="g711_ulaw"`, so it emits 8 kHz mulaw directly. The sender sets `input_is_mulaw_8k=True` and forwards bytes as-is, avoiding a 24 kHz → 16 kHz → 8 kHz resample chain that would produce audibly slurred output.
- **`pipeline` / `elevenlabs_convai`**: TTS providers emit PCM16 at 16 kHz. The sender transcodes using a `StatefulResampler` (preserves IIR filter state across chunks to avoid aliasing artifacts from restarting on each frame) and a `PcmCarry` buffer that aligns odd-length chunks before resampling.

The sender also implements playback marks (`send_mark`, `on_mark_confirmed`) for tracking TTS playback completion, and a `flush()` method to drain the resampler tail at call end.

### `TelnyxAudioSender`

Structurally identical to the Twilio sender. Telnyx does not support playback marks, so `send_mark` is a documented no-op. `send_clear` emits `{"event": "clear"}`.

Sources: [telephony/twilio.py:107-222](), [telephony/telnyx.py:182-266]()
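
The role of the `PcmCarry` buffer is easiest to see in code. PCM16 samples are two bytes each, so a chunk with an odd byte count ends in half a sample that must wait for the next chunk. The class below is a hypothetical re-implementation of that idea (named `PcmCarrySketch` to make clear it is not the repository's class).

```python
class PcmCarrySketch:
    """Holds back a dangling byte so downstream code sees whole samples."""

    def __init__(self) -> None:
        self._carry = b""

    def align(self, chunk: bytes) -> bytes:
        data = self._carry + chunk
        usable = len(data) - (len(data) % 2)  # largest even prefix
        self._carry = data[usable:]           # zero or one leftover byte
        return data[:usable]
```

For example, feeding three bytes returns the first two and keeps the third; the next call prepends that byte to the incoming chunk, so no sample is ever split across a resampler call.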

---

## DTMF

### Inbound DTMF

| Carrier | Delivery mechanism | Bridge handling |
|---|---|---|
| Twilio | In-band `event="dtmf"` on the media-stream WebSocket | `handler.on_dtmf(digit)` + optional `on_transcript` |
| Telnyx | In-band `event="dtmf"` on the media-stream WebSocket **and** out-of-band `call.dtmf.received` REST webhook | In-band: `handler.on_dtmf(digit)`; Webhook: acknowledged with HTTP 200 (no duplicate processing) |

Both paths also fire `on_transcript` with `{"role": "user", "text": "[DTMF: {digit}]"}` for observability.

### Outbound DTMF (Telnyx only)

The `_telnyx_send_dtmf` helper posts one `actions/send_dtmf` command per digit to the Telnyx Call Control REST API, with a configurable inter-digit delay (default 300 ms). Allowed characters are `0–9`, `*`, `#`, `A–D`, `a–d`, `w`, `W`; `w`/`W` are Telnyx pause characters (500 ms each). Duration is clamped to 100–500 ms per digit.

Sources: [telephony/twilio.py:392-409](), [telephony/telnyx.py:432-455](), [telephony/telnyx.py:293-344]()
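
The validation rules for outbound DTMF listed above translate into two small checks. This is a sketch with hypothetical function names; only the allowed character set and the 100 to 500 ms clamp come from the description, and the error message wording is illustrative.

```python
ALLOWED_DTMF = frozenset("0123456789*#ABCDabcdwW")


def validate_digits(digits: str) -> str:
    invalid = sorted(set(digits) - ALLOWED_DTMF)
    if not digits or invalid:
        raise ValueError(f"invalid DTMF sequence; rejected characters: {invalid}")
    return digits


def clamp_duration_ms(duration_ms: int) -> int:
    # Keep each tone within the 100-500 ms window described above.
    return max(100, min(500, duration_ms))
```

Validating the whole string before sending the first digit avoids a partially keyed sequence, which is hard to recover from once an IVR on the far end has started interpreting it.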

---

## Answering Machine Detection (AMD)

AMD is **enabled by default** on outbound calls (`machine_detection=True`). It can be disabled per call to avoid AMD charges on destinations known to be human.

| | Twilio | Telnyx |
|---|---|---|
| Activation | `machine_detection="DetectMessageEnd"` + `async_amd="true"` + `async_amd_status_callback` | `answering_machine_detection="greeting_end"` in `POST /calls` |
| Answer latency | Zero additional latency on human pickup (Async AMD) | Zero additional latency |
| Result delivery | `POST /webhooks/twilio/amd` (`AnsweredBy` field) | `POST /webhooks/telnyx/voice` with `event_type="call.machine.detection.ended"` (`result` field) |
| Machine values | `machine_end_beep`, `machine_end_silence` | `machine`, `machine_detected` |

Both paths normalise to a carrier-agnostic `MachineDetectionResult` with classification `"human" | "machine" | "fax" | "unknown"`:

```python
# server.py
def _classify_twilio_amd(answered_by: str) -> str:
    if answered_by == "human": return "human"
    if answered_by.startswith("machine_"): return "machine"
    if answered_by == "fax": return "fax"
    return "unknown"
```

Sources: [server.py:103-122](), [client.py:651-657](), [providers/telnyx_adapter.py:136-140]()
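
On the Twilio side, the activation row of the table corresponds to three extra keyword arguments on the outbound dial. A minimal sketch, assuming a hypothetical helper `twilio_amd_kwargs` whose result would be merged into the arguments for `calls.create`:

```python
from typing import Dict


def twilio_amd_kwargs(machine_detection: bool, amd_callback_url: str) -> Dict[str, str]:
    if not machine_detection:
        return {}  # AMD disabled for this call: send no AMD parameters
    return {
        # snake_case keys, as the SDK expects; values follow the table above
        "machine_detection": "DetectMessageEnd",
        "async_amd": "true",
        "async_amd_status_callback": amd_callback_url,
    }
```

With async AMD the call connects to the media stream immediately and the classification arrives later on the callback URL, which is why a human pickup sees no added delay.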

---

## Voicemail Drop

When AMD classifies the callee as a machine and a `voicemail_message` is configured, the server executes a carrier-specific drop.

**Twilio**: POSTs a TwiML update to `Calls/{sid}.json` with `<Response><Say>{message}</Say><Hangup/></Response>`. Validates the CallSid format (34 chars, `CA` prefix) before interpolating it into the REST URL to prevent path traversal.

**Telnyx** (`handle_amd_result`): Posts to `calls/{id}/actions/speak` with the message text, then after a heuristic sleep (~150 ms per character, capped at 30 s), posts to `actions/hangup`. A `client_state` marker (`voicemail-drop`, base64-encoded) is included so a future `call.speak.ended` webhook can trigger the hangup with exact timing instead of the heuristic.

Sources: [server.py:674-703](), [telephony/telnyx.py:46-103]()
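
The Telnyx timing heuristic and the `client_state` marker reduce to two one-line computations. The sketch below uses hypothetical names; the 150 ms per character rate, the 30 s cap, and the `voicemail-drop` marker are taken from the description above.

```python
import base64

SECONDS_PER_CHAR = 0.150  # approximate speaking rate used by the heuristic
MAX_WAIT_S = 30.0         # upper bound on the sleep before hanging up


def speak_wait_seconds(message: str) -> float:
    return min(len(message) * SECONDS_PER_CHAR, MAX_WAIT_S)


def encode_client_state(marker: str = "voicemail-drop") -> str:
    # client_state is carried as base64 text and echoed on later webhooks.
    return base64.b64encode(marker.encode()).decode()
```

A 100-character message therefore waits about 15 seconds before the hangup command; any message of 200 characters or more hits the 30-second cap, so long messages risk being cut off under the heuristic.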

---

## Call Transfer

Both bridges expose a `transfer_fn` injected into the stream handler that fires when the agent invokes the `transfer_call` tool.

**Twilio** (`_twilio_transfer`): POSTs TwiML `<Response><Dial>{number}</Dial></Response>` to `Calls/{sid}.json`, after validating E.164 format and the CallSid 34-char format.

**Telnyx** (`_telnyx_transfer`): POSTs `{"to": number}` to `calls/{id}/actions/transfer`. Accepts either a validated E.164 number or a SIP URI (`sip:user@host` / `sips:user@host`). An optional `client_state` string is base64-encoded per Telnyx contract and echoed on subsequent webhooks.

Sources: [telephony/twilio.py:420-452](), [telephony/telnyx.py:461-495]()
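
Transfer target validation can be sketched with two regular expressions. The E.164 pattern below is the conventional one (a plus sign and up to 15 digits, no leading zero); the SIP pattern is deliberately loose and is an assumption, as is the function name. The repository's exact patterns may differ.

```python
import re

E164_RE = re.compile(r"\+[1-9]\d{1,14}")
SIP_RE = re.compile(r"sips?:[^\s@]+@[^\s@]+")


def is_valid_transfer_target(target: str, allow_sip: bool) -> bool:
    if E164_RE.fullmatch(target):
        return True
    # SIP URIs are only acceptable on the carrier path that supports them.
    return allow_sip and SIP_RE.fullmatch(target) is not None
```

Following the description above, the Twilio path would call this with `allow_sip=False` and the Telnyx path with `allow_sip=True`. Validating before the REST call matters because the target is chosen by the agent's tool invocation, not by trusted configuration.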

---

## Recording

Both carriers support optional call recording (enabled with `recording=True` at `Patter` construction).

| Step | Twilio | Telnyx |
|---|---|---|
| Start | `POST /Accounts/{sid}/Calls/{call_sid}/Recordings.json` at stream start | `POST /calls/{id}/actions/record_start` (`format=mp3`, `channels=single`) at stream start |
| Stop | Automatic on call end (Twilio manages) | `POST /calls/{id}/actions/record_stop` in the `finally` cleanup block |
| Completion webhook | `POST /webhooks/twilio/recording` (`RecordingSid`, `RecordingUrl`) | `call.recording.saved` event at `/webhooks/telnyx/voice` (`recording_urls.mp3`, `public_recording_urls.mp3`) |
| Failure | Non-fatal warning logged | Non-fatal warning logged |

Sources: [telephony/twilio.py:305-319](), [telephony/telnyx.py:346-387](), [server.py:560-568](), [server.py:968-985]()

---

## Security Guardrails

Several defences are embedded throughout the layer:

- **Twilio SID validation** (`_validate_twilio_sid`): All REST calls that interpolate a `CallSid` into a URL validate it as exactly 34 characters with a two-letter prefix and 32 lowercase hex digits. This prevents path traversal / SSRF against the Twilio API.
- **SSRF protection** (`validate_webhook_url`): Blocks non-HTTP(S) schemes and all private, loopback, link-local, and cloud-metadata IP ranges / hostnames before any fetch. Mirrored between Python (`server.py:44-101`) and TypeScript (`server.ts:130-202`).
- **WebSocket message size cap**: Both Twilio and Telnyx bridges reject messages over 1 MB (Twilio audio frames are ~160 bytes; Telnyx 640 bytes) to prevent memory exhaustion from malformed or malicious stream peers.
- **Per-IP WebSocket connection cap**: `MAX_WS_PER_IP = 10` enforced in both Python and TypeScript; excess connections are closed with code 1008 before acceptance.
- **Fail-closed signature enforcement**: Missing `twilio_token` or `telnyx_public_key` with `require_signature=True` (the default) returns HTTP 503 rather than accepting unsigned webhooks.

Sources: [telephony/twilio.py:46-55](), [server.py:44-101](), [server.py:37-43]()
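
Two of the guardrails above, SID validation and the URL block-list, can be approximated with the standard library. This is a simplified sketch with hypothetical function names: it assumes the SID prefix is two uppercase letters, and it only inspects IP literals and `localhost`, without resolving DNS names as a complete SSRF check would need to.

```python
import ipaddress
import re
from urllib.parse import urlsplit

SID_RE = re.compile(r"[A-Z]{2}[0-9a-f]{32}")  # 34 characters in total


def is_valid_sid(sid: str) -> bool:
    return SID_RE.fullmatch(sid) is not None


def is_blocked_url(url: str) -> bool:
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return True
    host = parts.hostname
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name; this sketch does not resolve it
    # Link-local covers 169.254.169.254, the usual cloud metadata address.
    return ip.is_private or ip.is_loopback or ip.is_link_local
```

Because the SID pattern admits no slashes, dots, or query characters, a value that passes it cannot change the path of the REST URL it is interpolated into.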

---

## Provider Mode Selection

All four provider modes work identically on both carriers. The mode is resolved from `agent.provider` at stream start:

| Mode | STT | LLM+TTS | `audio_format` to AI provider |
|---|---|---|---|
| `openai_realtime` (default) | Built into OpenAI Realtime | OpenAI Realtime | `"g711_ulaw"` (mulaw bypass) |
| `openai_realtime_2` | Built into OpenAI Realtime (GA API) | OpenAI Realtime | `"g711_ulaw"` |
| `elevenlabs_convai` | ElevenLabs | ElevenLabs | PCM16 |
| `pipeline` | Configurable (`deepgram`, `whisper`, `cartesia`, `soniox`, `speechmatics`, `assemblyai`) | Configurable LLM + TTS | PCM16 |

For OpenAI Realtime modes, `TwilioAudioSender` and `TelnyxAudioSender` are both created with `input_is_mulaw_8k=True`, so the 8 kHz mulaw that OpenAI emits is forwarded to the carrier as-is, bypassing the stateful resampler chain.

Sources: [telephony/twilio.py:334-345](), [telephony/telnyx.py:287-292](), [telephony/twilio.py:346-411]()

---

## Metrics and Observability

Both bridge cleanup paths (`finally`) follow the same sequence:

1. Flush the `AudioSender` resampler tail (prevents clipping the last audio frame).
2. Call `handler.cleanup()` on the stream handler.
3. Emit `patter.cost.telephony_minutes` on the active OTel span via the adapter's `record_call_end_cost`.
4. Query the actual telephony cost from the carrier REST API (`Calls/{sid}.json` for Twilio; `GET /calls/{id}` for Telnyx) and write it to the metrics accumulator.
5. Query Deepgram STT cost if applicable.
6. Finalize metrics with `metrics.end_call()`.
7. Fire `on_call_end` with the structured result (call_id, transcript, metrics).
8. Close the `patter_call_scope` OTel context — done last so all cleanup spans inherit `patter.call_id` and `patter.side`.

Sources: [telephony/twilio.py:495-570](), [telephony/telnyx.py:580-660]()

---

## Summary

The carrier and telephony layer provides a symmetric dual-adapter design: `twilio.Carrier` / `TwilioAdapter` and `telnyx.Carrier` / `TelnyxAdapter` implement the same `TelephonyProvider` interface and expose functionally equivalent bridges (`twilio_stream_bridge` / `telnyx_stream_bridge`) over different wire protocols. Twilio uses TwiML webhooks that hand off to Media Streams; Telnyx uses a REST command model with Call Control events and inline stream negotiation. Codec transcoding, inbound DTMF, AMD, voicemail drop, call transfer, recording, security guardrails, and metrics finalization are implemented on both carriers with close behavioural parity; the documented exceptions are outbound DTMF (Telnyx only) and playback marks (Twilio only). Provider mode selection (OpenAI Realtime, ElevenLabs ConvAI, or pipeline STT+LLM+TTS) is orthogonal to carrier selection and applies uniformly to both bridges.
