# Nine Providers, One Interface: What Must Every Adapter Guarantee?

> The providers/ directory contains adapters for Anthropic, OpenAI (completions, responses, Codex), Azure OpenAI, Google AI, Google Vertex, Mistral, Amazon Bedrock, Cloudflare, and a faux test provider. This page asks what the common stream/streamSimple contract is, where adapters diverge (Bedrock's Node-only constraint, GitHub Copilot's custom headers, OpenAI prompt-cache specifics), and what the faux provider reveals about testability.

- Repository: earendil-works/pi
- GitHub: https://github.com/earendil-works/pi
- Human wiki: https://grok-wiki.com/public/wiki/earendil-works-pi-8b87608fc234
- Complete Markdown: https://grok-wiki.com/public/wiki/earendil-works-pi-8b87608fc234/llms-full.txt

## Source Files

- `packages/ai/src/providers/anthropic.ts`
- `packages/ai/src/providers/amazon-bedrock.ts`
- `packages/ai/src/providers/faux.ts`
- `packages/ai/src/providers/github-copilot-headers.ts`
- `packages/ai/src/providers/transform-messages.ts`
- `packages/ai/src/providers/openai-prompt-cache.ts`

---

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:

- [packages/ai/src/types.ts](packages/ai/src/types.ts)
- [packages/ai/src/providers/anthropic.ts](packages/ai/src/providers/anthropic.ts)
- [packages/ai/src/providers/amazon-bedrock.ts](packages/ai/src/providers/amazon-bedrock.ts)
- [packages/ai/src/providers/faux.ts](packages/ai/src/providers/faux.ts)
- [packages/ai/src/providers/github-copilot-headers.ts](packages/ai/src/providers/github-copilot-headers.ts)
- [packages/ai/src/providers/transform-messages.ts](packages/ai/src/providers/transform-messages.ts)
- [packages/ai/src/providers/openai-prompt-cache.ts](packages/ai/src/providers/openai-prompt-cache.ts)
- [packages/ai/src/providers/openai-responses.ts](packages/ai/src/providers/openai-responses.ts)
- [packages/ai/src/providers/register-builtins.ts](packages/ai/src/providers/register-builtins.ts)
- [packages/ai/src/utils/event-stream.ts](packages/ai/src/utils/event-stream.ts)
</details>

The `packages/ai/src/providers/` directory contains adapters for nine distinct AI backends: Anthropic (native messages API), OpenAI (completions, responses, and Codex variants), Azure OpenAI Responses, Google AI, Google Vertex, Mistral, Amazon Bedrock, Cloudflare, and a faux test provider. Each adapter exposes a narrow, stable interface — `stream` and `streamSimple` — that the rest of the codebase calls without needing to know which provider sits underneath.

This page asks three questions: what is the minimum a provider must guarantee, where do individual adapters diverge from the common mold, and what does the faux provider reveal about how the system is tested and reasoned about?

---

## What Is the Common Contract?

### The StreamFunction type

Every provider adapter reduces to a single type:

```typescript
// packages/ai/src/types.ts:206-210
export type StreamFunction<TApi extends Api = Api, TOptions extends StreamOptions = StreamOptions> = (
  model: Model<TApi>,
  context: Context,
  options?: TOptions,
) => AssistantMessageEventStream;
```

The function takes three inputs (a typed model, the conversation context, and options) and returns one output, an `AssistantMessageEventStream`. The call itself does not throw; every failure path is encoded as a stream event.

Each provider registers two functions:

| Function | Options type | Purpose |
|---|---|---|
| `stream` | Provider-specific (e.g., `AnthropicOptions`, `BedrockOptions`) | Full control — enables provider-specific knobs |
| `streamSimple` | `SimpleStreamOptions` | Portable subset: `reasoning` level, budgets, standard options |

`streamSimple` is the portability surface. A caller using `streamSimple` can switch providers without learning each provider's native option vocabulary.
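To make the two-function registration concrete, here is a minimal sketch of a registry keyed by API name. It is not the repository's implementation: `SketchStream`, `ProviderEntry`, `providerRegistry`, `registerProvider`, and `dispatchSimple` are hypothetical names, and the event stream is reduced to a plain array so the example stays synchronous.

```typescript
// Hypothetical, simplified types. The real Model, Context, and event stream are richer.
type SketchEvent = { type: "start" } | { type: "done"; text: string };
type SketchModel = { api: string; id: string };
type SketchContext = { messages: string[] };
type SketchStream<TOptions> = (
  model: SketchModel,
  context: SketchContext,
  options?: TOptions,
) => SketchEvent[];

interface ProviderEntry {
  stream: SketchStream<Record<string, unknown>>; // provider-specific options
  streamSimple: SketchStream<{ reasoning?: string }>; // portable options
}

const providerRegistry = new Map<string, ProviderEntry>();

function registerProvider(api: string, entry: ProviderEntry): void {
  providerRegistry.set(api, entry);
}

// Callers pass a model; the registry selects the adapter from model.api.
function dispatchSimple(
  model: SketchModel,
  context: SketchContext,
  options?: { reasoning?: string },
): SketchEvent[] {
  const entry = providerRegistry.get(model.api);
  if (!entry) {
    // How the real registry treats an unknown API is not shown on this page.
    throw new Error(`no provider registered for api "${model.api}"`);
  }
  return entry.streamSimple(model, context, options);
}

// A toy adapter that echoes the conversation back.
const echoAdapter: SketchStream<unknown> = (_model, context) => [
  { type: "start" },
  { type: "done", text: context.messages.join(" ") },
];
registerProvider("echo", { stream: echoAdapter, streamSimple: echoAdapter });
```

Because the caller only names a model, swapping `"echo"` for another registered API changes the adapter without changing the call.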

Sources: [packages/ai/src/types.ts:192-210]()

### The event stream protocol

`AssistantMessageEventStream` is a push-based async iterable. All adapters must emit events in a defined order:

```
start → (text_start → text_delta* → text_end)*
      → (thinking_start → thinking_delta* → thinking_end)*
      → (toolcall_start → toolcall_delta* → toolcall_end)*
      → done | error
```

The `done` event carries the final `AssistantMessage`; the `error` event carries a partial one with `stopReason: "error" | "aborted"` and an `errorMessage`.

Crucially, providers never throw. If an error occurs mid-stream, the adapter catches it, sets `stopReason`, and emits `error` before calling `stream.end()`. The Anthropic adapter's catch block shows the pattern:

```typescript
// packages/ai/src/providers/anthropic.ts:692-703
} catch (error) {
  for (const block of output.content) {
    delete (block as { index?: number }).index;
    delete (block as { partialJson?: string }).partialJson;
  }
  output.stopReason = options?.signal?.aborted ? "aborted" : "error";
  output.errorMessage = error instanceof Error ? error.message : JSON.stringify(error);
  stream.push({ type: "error", reason: output.stopReason, error: output });
  stream.end();
}
```

The Bedrock adapter does the same: `stream.push({ type: "error" ... }); stream.end()` (amazon-bedrock.ts:259-262). The faux provider follows suit (faux.ts:311-323 for abort handling; faux.ts:381-385 for ordinary error propagation). The pattern is consistent across the three adapters inspected for this page: Anthropic, Bedrock, and faux.
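The event grammar above can be checked mechanically. The following is a sketch of a validator, not code from the repository: `checkEventOrder` is a hypothetical helper, event payloads are omitted, and it assumes that blocks never nest and that an `error` may arrive while a block is still open. It does not enforce any ordering between block kinds.

```typescript
// Event names come from the grammar above; payloads are omitted for brevity.
type BlockKind = "text" | "thinking" | "toolcall";
type EventType =
  | "start"
  | `${BlockKind}_start`
  | `${BlockKind}_delta`
  | `${BlockKind}_end`
  | "done"
  | "error";

// Returns null for a well-formed sequence, otherwise a description of the first violation.
function checkEventOrder(events: EventType[]): string | null {
  if (events.length === 0) return "empty stream";
  if (events[0] !== "start") return "first event must be start";
  let open: BlockKind | null = null; // the block currently being streamed, if any
  for (let i = 1; i < events.length; i++) {
    const ev = events[i];
    if (ev === "done" || ev === "error") {
      if (i !== events.length - 1) return `${ev} must be the final event`;
      // Assumption: an error may interrupt an open block, a clean finish may not.
      if (ev === "done" && open !== null) return `done while ${open} block is open`;
      return null;
    }
    if (ev === "start") return "duplicate start";
    const [kind, phase] = ev.split("_") as [BlockKind, "start" | "delta" | "end"];
    if (phase === "start") {
      if (open !== null) return `${kind}_start while ${open} block is open`;
      open = kind;
    } else {
      if (open !== kind) return `${ev} without matching ${kind}_start`;
      if (phase === "end") open = null;
    }
  }
  return "stream ended without done or error";
}
```

A check like this is most useful in adapter tests, where a recorded event sequence can be validated regardless of which provider produced it.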

Sources: [packages/ai/src/utils/event-stream.ts:69-82](), [packages/ai/src/providers/anthropic.ts:692-703](), [packages/ai/src/providers/amazon-bedrock.ts:253-263]()

### Shared usage shape

Every adapter produces the same `usage` structure on every `AssistantMessage`:

```typescript
usage: {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
  totalTokens: number;
  cost: { input, output, cacheRead, cacheWrite, total };
}
```

If a provider does not report cache tokens (e.g., an older API), those fields remain 0. The shape is always present — consumers never need to guard for `undefined`.
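One way to honor the always-present shape is to start from zeros and overwrite only what the provider reports. This is a sketch under assumptions: `emptyUsage` and `withReportedTokens` are hypothetical helpers, `totalTokens` is assumed to be the sum of the four token counters, and cost is left at zero because pricing is not covered on this page.

```typescript
interface Usage {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
  totalTokens: number;
  cost: { input: number; output: number; cacheRead: number; cacheWrite: number; total: number };
}

// Hypothetical helper: every field exists from the start, so consumers never see undefined.
function emptyUsage(): Usage {
  return {
    input: 0,
    output: 0,
    cacheRead: 0,
    cacheWrite: 0,
    totalTokens: 0,
    cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, total: 0 },
  };
}

type ReportedTokens = Partial<Pick<Usage, "input" | "output" | "cacheRead" | "cacheWrite">>;

// Hypothetical helper: copy whatever counters the provider sent, leave the rest at 0.
function withReportedTokens(reported: ReportedTokens): Usage {
  const usage = emptyUsage();
  usage.input = reported.input ?? 0;
  usage.output = reported.output ?? 0;
  usage.cacheRead = reported.cacheRead ?? 0;
  usage.cacheWrite = reported.cacheWrite ?? 0;
  // Assumption: the total is the sum of the four counters.
  usage.totalTokens = usage.input + usage.output + usage.cacheRead + usage.cacheWrite;
  return usage;
}

// A provider that reports no cache counters still yields a complete object.
const usageWithoutCache = withReportedTokens({ input: 10, output: 5 });
```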

Sources: [packages/ai/src/providers/anthropic.ts:455-471](), [packages/ai/src/providers/amazon-bedrock.ts:97-113]()

### What `transformMessages` does for every adapter

Before serializing messages to their wire formats, the Anthropic and Bedrock adapters both call `transformMessages`. This shared pre-processing step handles:

1. **Image downgrade**: if the target model does not include `"image"` in `model.input`, all images in user and tool-result messages are replaced with placeholder text.
2. **Cross-model thinking stripping**: redacted or signed thinking blocks are dropped or converted to plain text when replaying into a different model than the one that generated them.
3. **Tool call ID normalization**: the OpenAI Responses API generates IDs that can exceed 450 characters and contain `|`; Anthropic requires IDs matching `^[a-zA-Z0-9_-]+$` (max 64 chars). `normalizeToolCallId` re-encodes them.
4. **Orphaned tool call repair**: if an assistant message ended with tool calls but no matching results exist (e.g., after an aborted request), synthetic `toolResult` messages with `isError: true` are inserted so the conversation replay satisfies provider requirements.

```typescript
// packages/ai/src/providers/transform-messages.ts:64-67
export function transformMessages<TApi extends Api>(
  messages: Message[],
  model: Model<TApi>,
  normalizeToolCallId?: (id: string, model: Model<TApi>, source: AssistantMessage) => string,
): Message[]
```
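A callback passed as `normalizeToolCallId` could look roughly like the sketch below. It is an illustration, not the adapter's actual normalizer: `sanitizeToolCallId` is a hypothetical name, and plain truncation is a simplification, since two long IDs sharing a 64-character prefix would collide.

```typescript
const TOOL_CALL_ID_PATTERN = /^[a-zA-Z0-9_-]+$/;
const TOOL_CALL_ID_MAX_LENGTH = 64;

// Hypothetical normalizer: replace disallowed characters, then truncate.
// The mapping is deterministic, so a tool call and its tool result still
// end up with the same ID after both are normalized.
function sanitizeToolCallId(id: string): string {
  const replaced = id.replace(/[^a-zA-Z0-9_-]/g, "_");
  const truncated = replaced.slice(0, TOOL_CALL_ID_MAX_LENGTH);
  // Guard against an empty result, which would not match the pattern.
  return truncated.length > 0 ? truncated : "_";
}

// An ID in the style described above: very long, with pipe separators.
const longToolCallId = "call_" + "a|b".repeat(150);
const safeToolCallId = sanitizeToolCallId(longToolCallId);
```

A production version would likely mix a short hash of the original ID into the result to avoid collisions after truncation.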

Sources: [packages/ai/src/providers/transform-messages.ts:64-220]()

---

## Where Adapters Diverge

### Amazon Bedrock: Node-only constraint

Bedrock is the only adapter with an explicit Node.js/Bun environment check:

```typescript
// packages/ai/src/providers/amazon-bedrock.ts:141-176
if (typeof process !== "undefined" && (process.versions?.node || process.versions?.bun)) {
  // ... region resolution, proxy setup, HTTP handler config
} else {
  // Non-Node environment (browser): fall back to us-east-1
  config.region = configuredRegion || ... || "us-east-1";
}
```

Several capabilities only work in Node:

- **HTTP proxy support**: `NodeHttpHandler` wraps proxy agents for HTTP(S) tunneling. Without this, Bedrock traffic cannot be routed through corporate proxies.
- **AWS credential chain**: profile-based auth (`AWS_PROFILE`) relies on reading `~/.aws/config`, which is not available in the browser.
- **SigV4 signing**: the AWS SDK's default signing uses Node-native crypto.

In the browser, the adapter uses the configured region if one is set, falls back to `us-east-1` otherwise, and relies on externally injected credentials.
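The runtime check can be isolated into a small, testable function. The sketch below mirrors the condition in the excerpt but is otherwise hypothetical: `ProcessLike`, `isNodeLikeRuntime`, and `bedrockCapabilities` are illustrative names, and the capability flags only restate the Node-dependent features listed above.

```typescript
// Minimal structural type so the sketch compiles without Node type definitions.
type ProcessLike = { versions?: { node?: string; bun?: string } };

// Same condition as the excerpt: a process object that reports a Node or Bun version.
function isNodeLikeRuntime(proc: ProcessLike | undefined): boolean {
  return proc !== undefined && Boolean(proc.versions?.node || proc.versions?.bun);
}

interface BedrockCapabilities {
  proxySupport: boolean; // needs the Node HTTP handler
  profileCredentials: boolean; // needs access to ~/.aws/config
}

// Hypothetical summary of what the adapter can offer in a given runtime.
function bedrockCapabilities(proc: ProcessLike | undefined): BedrockCapabilities {
  const nodeLike = isNodeLikeRuntime(proc);
  return { proxySupport: nodeLike, profileCredentials: nodeLike };
}

// In real code the argument would be the global process object, if present.
const currentProcess = (globalThis as unknown as { process?: ProcessLike }).process;
const currentCapabilities = bedrockCapabilities(currentProcess);
```

Passing the process object in as a parameter, rather than reading the global inside the function, lets tests simulate a browser without changing the environment.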

Additionally, Bedrock has a `setBedrockProviderModule` escape hatch in `register-builtins.ts` (lines 125-130), which allows the Bedrock implementation to be overridden by an external module. This is useful for environments where dynamic import of the AWS SDK is impractical.

Sources: [packages/ai/src/providers/amazon-bedrock.ts:141-176](), [packages/ai/src/providers/register-builtins.ts:118-130]()

### Bedrock's content-block start divergence

Anthropic's stream announces each text block with a start event, but Bedrock does not send `contentBlockStart` for text blocks. The adapter compensates by lazily creating text blocks on the first `contentBlockDelta`:

```typescript
// packages/ai/src/providers/amazon-bedrock.ts:379-391
if (delta?.text !== undefined) {
  // If no text block exists yet, create one,
  // as handleContentBlockStart is not sent for text blocks
  if (!block) {
    const newBlock: Block = { type: "text", text: "", index: contentBlockIndex };
    output.content.push(newBlock);
    ...
    stream.push({ type: "text_start", ... });
  }
```

This is a per-adapter normalization: the event consumer always sees `text_start` before `text_delta`, regardless of the wire protocol.
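The lazy-start idea can be shown in isolation. The sketch below is a simplified stand-in, not the Bedrock adapter: `WireEvent`, `OutEvent`, and `normalizeTextEvents` are hypothetical, and only text blocks are modeled.

```typescript
// Hypothetical wire events: deltas and stops, but no start event for text.
type WireEvent =
  | { kind: "delta"; index: number; text: string }
  | { kind: "stop"; index: number };

type OutEvent =
  | { type: "text_start"; index: number }
  | { type: "text_delta"; index: number; delta: string }
  | { type: "text_end"; index: number; content: string };

function normalizeTextEvents(wire: WireEvent[]): OutEvent[] {
  const blocks = new Map<number, string>(); // accumulated text per content block index
  const out: OutEvent[] = [];
  for (const ev of wire) {
    if (ev.kind === "delta") {
      if (!blocks.has(ev.index)) {
        // First delta for this index: synthesize the missing start event.
        blocks.set(ev.index, "");
        out.push({ type: "text_start", index: ev.index });
      }
      blocks.set(ev.index, blocks.get(ev.index)! + ev.text);
      out.push({ type: "text_delta", index: ev.index, delta: ev.text });
    } else {
      const content = blocks.get(ev.index);
      // Only close blocks that were actually opened.
      if (content !== undefined) out.push({ type: "text_end", index: ev.index, content });
    }
  }
  return out;
}

const normalizedText = normalizeTextEvents([
  { kind: "delta", index: 0, text: "Hel" },
  { kind: "delta", index: 0, text: "lo" },
  { kind: "stop", index: 0 },
]);
```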

Sources: [packages/ai/src/providers/amazon-bedrock.ts:379-395]()

### GitHub Copilot: custom dynamic headers

The GitHub Copilot provider runs on top of the Anthropic messages API (same wire format) but requires several additional headers that must be computed per-request, not just once at client construction. These are produced in `github-copilot-headers.ts`:

```typescript
// packages/ai/src/providers/github-copilot-headers.ts:23-37
export function buildCopilotDynamicHeaders(params: {
  messages: Message[];
  hasImages: boolean;
}): Record<string, string> {
  const headers: Record<string, string> = {
    "X-Initiator": inferCopilotInitiator(params.messages),
    "Openai-Intent": "conversation-edits",
  };
  if (params.hasImages) {
    headers["Copilot-Vision-Request"] = "true";
  }
  return headers;
}
```

`X-Initiator` is `"user"` when the last message is a user turn, `"agent"` otherwise. `Copilot-Vision-Request: true` is added only when images are present in the conversation. These headers are merged into the Anthropic client's `defaultHeaders` at each invocation:

```typescript
// packages/ai/src/providers/anthropic.ts:484-490
if (model.provider === "github-copilot") {
  const hasImages = hasCopilotVisionInput(context.messages);
  copilotDynamicHeaders = buildCopilotDynamicHeaders({
    messages: context.messages,
    hasImages,
  });
}
```

The Copilot client authenticates with a bearer token rather than an API key: the adapter passes `authToken: apiKey` instead of `apiKey: apiKey` when constructing the client.
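The header rules described above fit in a few lines. The sketch below reimplements them with simplified types; `SketchMessage`, `inferInitiator`, and `buildDynamicHeaders` are hypothetical stand-ins for the real `Message`, `inferCopilotInitiator`, and `buildCopilotDynamicHeaders`. How the real helper treats an empty conversation is not described on this page, so in this sketch that case simply falls into the `"agent"` branch.

```typescript
// Hypothetical, reduced message type: only the role matters for this rule.
type SketchMessage = { role: "user" | "assistant" | "toolResult" };

// "user" when the last message is a user turn, "agent" otherwise.
function inferInitiator(messages: SketchMessage[]): "user" | "agent" {
  const last = messages[messages.length - 1];
  return last !== undefined && last.role === "user" ? "user" : "agent";
}

function buildDynamicHeaders(messages: SketchMessage[], hasImages: boolean): Record<string, string> {
  const headers: Record<string, string> = {
    "X-Initiator": inferInitiator(messages),
    "Openai-Intent": "conversation-edits",
  };
  // The vision header is only present when the conversation contains images.
  if (hasImages) {
    headers["Copilot-Vision-Request"] = "true";
  }
  return headers;
}

// A tool result as the latest message means the agent loop, not the user, is driving.
const agentTurnHeaders = buildDynamicHeaders(
  [{ role: "user" }, { role: "assistant" }, { role: "toolResult" }],
  false,
);
```

Because the initiator depends on the latest message, the headers have to be rebuilt for every request rather than fixed at client construction.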

Sources: [packages/ai/src/providers/github-copilot-headers.ts:1-37](), [packages/ai/src/providers/anthropic.ts:483-506]()

### OpenAI prompt cache: a 64-character key constraint

OpenAI prompt caching uses a cache key that must not exceed 64 Unicode code points. The `openai-prompt-cache.ts` utility enforces this:

```typescript
// packages/ai/src/providers/openai-prompt-cache.ts:1-8
export const OPENAI_PROMPT_CACHE_KEY_MAX_LENGTH = 64;

export function clampOpenAIPromptCacheKey(key: string | undefined): string | undefined {
  if (key === undefined) return undefined;
  const chars = Array.from(key);
  if (chars.length <= OPENAI_PROMPT_CACHE_KEY_MAX_LENGTH) return key;
  return chars.slice(0, OPENAI_PROMPT_CACHE_KEY_MAX_LENGTH).join("");
}
```

Note that `Array.from(key)` iterates Unicode code points rather than UTF-16 code units, so characters outside the Basic Multilingual Plane (such as most emoji, which occupy two code units) are counted once and never split in half. This function is called by `openai-responses.ts` when constructing the OpenAI client with a session-based cache key.
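The difference between the two ways of counting shows up as soon as a key contains emoji. The sketch below restates the clamp with a local constant and contrasts it with a naive `slice`; `clampByCodePoints` and `clampByCodeUnits` are illustrative names, not exports of the repository.

```typescript
const KEY_MAX_LENGTH = 64; // same limit as OPENAI_PROMPT_CACHE_KEY_MAX_LENGTH above

// Counts and cuts by code point, as the utility above does.
function clampByCodePoints(key: string, max: number = KEY_MAX_LENGTH): string {
  const chars = Array.from(key); // one array entry per code point
  return chars.length <= max ? key : chars.slice(0, max).join("");
}

// Naive alternative for comparison: cuts by UTF-16 code unit and can split a surrogate pair.
function clampByCodeUnits(key: string, max: number = KEY_MAX_LENGTH): string {
  return key.slice(0, max);
}

const grinning = "\u{1F600}"; // one code point, two UTF-16 code units
const emojiKey = grinning.repeat(65); // 65 code points, 130 code units
const clampedKey = clampByCodePoints(emojiKey); // 64 code points, 128 code units

// With an odd offset, the naive cut lands in the middle of a surrogate pair.
const brokenKey = clampByCodeUnits("a" + grinning.repeat(40));
const brokenKeyEndsInLoneSurrogate =
  brokenKey.charCodeAt(brokenKey.length - 1) >= 0xd800 &&
  brokenKey.charCodeAt(brokenKey.length - 1) <= 0xdbff;
```

Here `brokenKeyEndsInLoneSurrogate` is `true`: the naive version ends on a dangling high surrogate, which is exactly what the code-point version avoids.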

Sources: [packages/ai/src/providers/openai-prompt-cache.ts:1-8]()

### Anthropic: stealth mode and OAuth identity headers

The Anthropic adapter has a "stealth mode" for OAuth token users. When the API key starts with `sk-ant-oat`, the adapter identifies itself as Claude Code:

```typescript
// packages/ai/src/providers/anthropic.ts:844-863
if (isOAuthToken(apiKey)) {
  const client = new Anthropic({
    ...
    defaultHeaders: mergeHeaders({
      "anthropic-beta": ["claude-code-20250219", "oauth-2025-04-20", ...betaFeatures].join(","),
      "user-agent": `claude-cli/${claudeCodeVersion}`,
      "x-app": "cli",
    }, ...),
  });
  return { client, isOAuthToken: true };
}
```

When `isOAuthToken` is true, tool names are translated to Claude Code's canonical casing (`Read`, `Write`, `Edit`, `Bash`, etc.) before being sent, and mapped back when tool calls are received. This allows the OAuth path to match Claude Code's tool namespace exactly.
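A two-way name translation of this kind can be sketched with case-insensitive lookups. This is an assumption-laden illustration: `CANONICAL_TOOL_NAMES` contains only the four names mentioned above, the lowercase local names are invented for the example, and `toCanonicalToolName` and `fromCanonicalToolName` are hypothetical helpers.

```typescript
// Only the names mentioned above; the real list is longer ("etc.").
const CANONICAL_TOOL_NAMES = ["Read", "Write", "Edit", "Bash"];

// Outbound: map a local tool name to the canonical casing, if one exists.
function toCanonicalToolName(name: string): string {
  const match = CANONICAL_TOOL_NAMES.find((c) => c.toLowerCase() === name.toLowerCase());
  return match ?? name;
}

// Inbound: map a name from the response back to the caller's own tool name.
function fromCanonicalToolName(name: string, localToolNames: string[]): string {
  const match = localToolNames.find((t) => t.toLowerCase() === name.toLowerCase());
  return match ?? name;
}

// Hypothetical local tool set with lowercase names.
const localTools = ["read", "bash", "my_custom_tool"];
const outboundNames = localTools.map(toCanonicalToolName);
const inboundNames = outboundNames.map((n) => fromCanonicalToolName(n, localTools));
```

Tools without a canonical counterpart pass through unchanged in both directions, so the round trip returns the original names.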

Sources: [packages/ai/src/providers/anthropic.ts:69-106](), [packages/ai/src/providers/anthropic.ts:844-864]()

### Cache control placement: last-block injection

Both Anthropic and Bedrock apply cache control markers at the same logical position: the last content block of the last user message in the conversation. For Anthropic this means attaching `cache_control: { type: "ephemeral" }` to the final content block. For Bedrock this means appending a `cachePoint` block after the last user content block. The caller never supplies per-block cache annotations to either provider; the adapters inject them deterministically.
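For the Anthropic-style marker, the placement rule can be sketched as a pure function over a simplified message list. `SketchBlock`, `SketchTurn`, and `markLastUserBlock` are hypothetical; real content blocks also include images, tool results, and other types not modeled here.

```typescript
type SketchBlock = { type: "text"; text: string; cache_control?: { type: "ephemeral" } };
type SketchTurn = { role: "user" | "assistant"; content: SketchBlock[] };

// Returns a copy of the conversation with the marker on the last block
// of the last user message. The input is not mutated.
function markLastUserBlock(messages: SketchTurn[]): SketchTurn[] {
  let lastUser = -1;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === "user") {
      lastUser = i;
      break;
    }
  }
  if (lastUser === -1) return messages; // no user message, nothing to mark

  return messages.map((message, i) => {
    if (i !== lastUser || message.content.length === 0) return message;
    const lastBlock = message.content.length - 1;
    const content = message.content.map((block, j) =>
      j === lastBlock ? { ...block, cache_control: { type: "ephemeral" as const } } : block,
    );
    return { ...message, content };
  });
}

const conversation: SketchTurn[] = [
  { role: "user", content: [{ type: "text", text: "first question" }] },
  { role: "assistant", content: [{ type: "text", text: "first answer" }] },
  { role: "user", content: [{ type: "text", text: "context" }, { type: "text", text: "follow-up" }] },
];
const markedConversation = markLastUserBlock(conversation);
```

Only the `follow-up` block carries the marker in `markedConversation`; earlier turns and the original array are left untouched.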

Sources: [packages/ai/src/providers/anthropic.ts:1136-1158](), [packages/ai/src/providers/amazon-bedrock.ts:762-773]()

---

## Provider × Feature Divergence Matrix

| Feature | Anthropic | Bedrock | OpenAI Responses | Copilot | Faux |
|---|---|---|---|---|---|
| Wire format | SSE (own decoder) | AWS SDK stream | OpenAI SDK stream | Anthropic SDK | In-memory |
| Auth | API key / Bearer / OAuth | SigV4 / Bearer token | API key | Bearer (OAuth) | None |
| Cache control | `cache_control` on blocks | `cachePoint` blocks | Session key (24h / default) | Via Anthropic | Simulated |
| Thinking | adaptive / budget / disabled | adaptive / budget (Claude only) | `reasoningEffort` string | Via Anthropic | Pass-through |
| Proxy support | Via `baseUrl` | `NodeHttpHandler` only | Via `baseUrl` | Via `baseUrl` | N/A |
| Custom headers | `defaultHeaders` merge | Not supported (SDK auth) | Client headers | Dynamic per-request | N/A |
| Node-only | No | Yes | No | No | No |
| Vision detection | `model.input` check | `model.input` check | `model.input` check | `hasCopilotVisionInput` + header | Supported |

---

## What the Faux Provider Reveals About Testability

### What problem does the faux provider solve?

Testing code that calls a real LLM is slow, expensive, and non-deterministic. But replacing the LLM with a mock at the HTTP layer means you're testing the wrong boundary — the real interface is the `AssistantMessageEventStream`, not HTTP responses.

The faux provider tests the streaming contract from the inside. It registers as a real provider via `registerApiProvider`, meaning callers cannot distinguish it from Anthropic at the call site. It emits the exact same event sequence: `start`, `text_start`, `text_delta` chunks, `text_end`, `done`. The chunk size is randomized between `min` and `max` token sizes to shake out partial-text handling.
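Randomized chunking is easy to reproduce in isolation. The sketch below is not the faux provider's code: `chunkText` is a hypothetical helper that measures chunks in characters rather than tokens, and it accepts an injectable random source so tests can be deterministic.

```typescript
// Splits text into chunks whose sizes are drawn uniformly from [min, max].
// rng must return a number in [0, 1), like Math.random.
function chunkText(
  text: string,
  min: number,
  max: number,
  rng: () => number = Math.random,
): string[] {
  if (min < 1 || max < min) throw new RangeError("expected 1 <= min <= max");
  const chunks: string[] = [];
  let pos = 0;
  while (pos < text.length) {
    const size = min + Math.floor(rng() * (max - min + 1));
    chunks.push(text.slice(pos, pos + size)); // the final chunk may be shorter
    pos += size;
  }
  return chunks;
}

// With a fixed rng of 0.5 and sizes in [2, 4], every chunk has size 3.
const fixedChunks = chunkText("abcdefgh", 2, 4, () => 0.5);
// fixedChunks is ["abc", "def", "gh"]
```

Whatever the chunk boundaries, concatenating the deltas must reproduce the original text, which is the property a consumer's partial-text handling has to preserve.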

### What the faux provider exposes

```typescript
// packages/ai/src/providers/faux.ts:96-101
export type FauxResponseFactory = (
  context: Context,
  options: StreamOptions | undefined,
  state: { callCount: number },
  model: Model<string>,
) => AssistantMessage | Promise<AssistantMessage>;
```

Tests can pass either a prebuilt `AssistantMessage` or a factory function. The factory receives the full `Context` and call count, so tests can assert on what the provider received (e.g., the accumulated tool results in `context.messages`), return different responses on each invocation, and simulate stateful multi-turn conversations.

The faux provider also simulates prompt caching:

```typescript
// packages/ai/src/providers/faux.ts:215-225
const previousPrompt = promptCache.get(sessionId);
if (previousPrompt) {
  const cachedChars = commonPrefixLength(previousPrompt, promptText);
  cacheRead = estimateTokens(previousPrompt.slice(0, cachedChars));
  cacheWrite = estimateTokens(promptText.slice(cachedChars));
  input = Math.max(0, promptTokens - cacheRead);
}
```

It tracks session prompts in memory and computes cache hits based on common prefix length — a faithful-enough model to test cache-aware billing logic without a real API.
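The arithmetic in the excerpt can be worked through with small inputs. In this sketch, `commonPrefixLength` and `estimateTokens` are re-implemented under assumptions (the real token estimator is not shown here; four characters per token is a placeholder), and `simulateUsage` is a hypothetical wrapper. What the faux provider reports on the first call of a session is also not shown, so the sketch counts everything as plain input in that case.

```typescript
function commonPrefixLength(a: string, b: string): number {
  const limit = Math.min(a.length, b.length);
  let i = 0;
  while (i < limit && a[i] === b[i]) i++;
  return i;
}

// Assumption: roughly four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function simulateUsage(
  promptCache: Map<string, string>,
  sessionId: string,
  promptText: string,
): { input: number; cacheRead: number; cacheWrite: number } {
  const promptTokens = estimateTokens(promptText);
  let input = promptTokens;
  let cacheRead = 0;
  let cacheWrite = 0;
  const previousPrompt = promptCache.get(sessionId);
  if (previousPrompt) {
    // Same arithmetic as the excerpt above.
    const cachedChars = commonPrefixLength(previousPrompt, promptText);
    cacheRead = estimateTokens(previousPrompt.slice(0, cachedChars));
    cacheWrite = estimateTokens(promptText.slice(cachedChars));
    input = Math.max(0, promptTokens - cacheRead);
  }
  promptCache.set(sessionId, promptText); // remember the prompt for the next call
  return { input, cacheRead, cacheWrite };
}

const sessionCache = new Map<string, string>();
const firstCallUsage = simulateUsage(sessionCache, "s1", "abcdefgh");
// firstCallUsage: { input: 2, cacheRead: 0, cacheWrite: 0 }
const secondCallUsage = simulateUsage(sessionCache, "s1", "abcdefghijkl");
// secondCallUsage: { input: 1, cacheRead: 2, cacheWrite: 1 }
```

On the second call the shared eight-character prefix is read from cache (2 tokens), the four new characters are written (1 token), and only the remainder of the 3-token prompt counts as fresh input.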

Token throughput can be rate-limited via `tokensPerSecond` to test streaming cancellation and abort handling. The abort signal is checked before every content block, making this a strong regression harness for abort-mid-stream edge cases.

Sources: [packages/ai/src/providers/faux.ts:96-101](), [packages/ai/src/providers/faux.ts:201-239](), [packages/ai/src/providers/faux.ts:296-389]()

---

## Lifecycle of a Streaming Request

```text
Caller
  │
  ▼
stream(model, context, options)          ← StreamFunction signature
  │
  ├─ transformMessages()                 ← shared pre-processing (image downgrade,
  │                                         thinking strip, tool ID normalization)
  │
  ├─ Provider-specific client setup
  │   Anthropic  → new Anthropic({ apiKey|authToken, defaultHeaders, betaFeatures })
  │   Bedrock    → new BedrockRuntimeClient({ region, credentials, requestHandler })
  │   OpenAI     → new OpenAI({ apiKey, baseURL, defaultHeaders })
  │   Faux       → in-memory queue
  │
  ├─ Wire call (SSE / AWS SDK stream / OpenAI stream / microtask)
  │
  ├─ Event normalization loop:
  │   content_block_start / delta / stop → text_start / text_delta / text_end
  │                                      → toolcall_start / toolcall_delta / toolcall_end
  │                                      → thinking_start / thinking_delta / thinking_end
  │
  └─ stream.push(done | error) + stream.end()
```

The normalization loop is the largest per-provider divergence point. Anthropic uses a hand-rolled SSE decoder (`iterateSseMessages` → `iterateAnthropicEvents`). Bedrock drives `for await (const item of response.stream!)` over the AWS SDK's async iterable. The OpenAI Responses adapter uses the OpenAI SDK's streaming client. All three ultimately produce the same event vocabulary before the events reach the caller.

Sources: [packages/ai/src/providers/anthropic.ts:347-445](), [packages/ai/src/providers/amazon-bedrock.ts:213-241]()

---

## Summary

Every adapter in `packages/ai/src/providers/` must satisfy four non-negotiable guarantees: return an `AssistantMessageEventStream` synchronously, emit all failures through the stream (never throw), populate the standard `usage` shape, and apply `transformMessages` before serializing conversation history. The shared `stream` / `streamSimple` pair is the boundary that makes providers interchangeable.

Adapters diverge exactly where their host platforms force them to: Bedrock's Node-only AWS SDK features and proxy model, GitHub Copilot's per-request dynamic headers and bearer auth, Anthropic's OAuth stealth identity and per-block `cache_control`, and OpenAI's 64-code-point prompt cache key limit. The faux provider strips away all of these specifics and reduces the adapter to its logical core: a push-based, abort-aware, cache-simulating event emitter. That it can stand in for a real provider suggests the `StreamFunction` contract is enough to exercise the streaming pipeline without any real network call.

Sources: [packages/ai/src/providers/faux.ts:391-499](), [packages/ai/src/types.ts:199-210]()
