# Configure providers

> Set up LLM and embedder clients via `he config init`, per-service `he config llm` / `he config embedder`, environment variables, or programmatic `create_client()` for mixed cloud and local vLLM deployments.

- Repository: yifanfeng97/Hyper-Extract
- GitHub: https://github.com/yifanfeng97/Hyper-Extract
- Human docs: https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf
- Complete Markdown: https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/llms-full.txt

## Source Files

- `hyperextract/cli/commands/config.py`
- `hyperextract/cli/config.py`
- `hyperextract/utils/client.py`
- `hyperextract/cli/README.md`
- `.env.example`

---

Hyper-Extract resolves LLM and embedder clients from `~/.he/config.toml`, environment-variable fallbacks, or the Python factory API (`create_client`, `create_llm`, `create_embedder`, `get_client`). CLI commands such as `he parse` call `validate_config()` before running and exit if credentials or vLLM `base_url` values are missing.

## Configuration surfaces

| Surface | Entry point | Persists to disk | Typical use |
|---------|-------------|------------------|---------------|
| Interactive CLI | `he config init` | Yes (`~/.he/config.toml`) | First-time setup |
| Per-service CLI | `he config llm`, `he config embedder` | Yes | Mixed providers, model overrides |
| Environment variables | `OPENAI_API_KEY`, `OPENAI_BASE_URL` | No | CI/CD, temporary overrides |
| Python factory | `create_client()`, `get_client()` | No (reads file if using `get_client`) | Scripts, notebooks, custom deployments |

```mermaid
flowchart LR
  subgraph cli ["CLI layer"]
    init["he config init"]
    llmCmd["he config llm"]
    embCmd["he config embedder"]
  end

  subgraph store ["~/.he/config.toml"]
    llmSec["[llm]"]
    embSec["[embedder]"]
  end

  subgraph resolve ["ConfigManager.get_*_config()"]
    fileVal["File values"]
    envFallback["OPENAI_API_KEY / OPENAI_BASE_URL"]
    preset["PROVIDER_PRESETS base_url"]
  end

  subgraph runtime ["Client factory"]
    getClient["get_client()"]
    createClient["create_client()"]
    chat["ChatOpenAI"]
    embed["OpenAIEmbeddings / CompatibleEmbeddings"]
  end

  init --> store
  llmCmd --> store
  embCmd --> store
  store --> fileVal
  fileVal --> envFallback
  envFallback --> preset
  preset --> getClient
  createClient --> chat
  createClient --> embed
  getClient --> chat
  getClient --> embed
```

<Note>
`Template.create()` loads clients from `get_client()` when `llm_client` and `embedder` are omitted, so file-based configuration applies to both CLI and Python template workflows.
</Note>

## Provider presets

Three built-in presets are available. `openai` and `bailian` supply default models and base URLs. The `vllm` preset has no defaults, so you must set `model` and `base_url` explicitly.

| Provider | Default LLM | Default embedder | Default `base_url` |
|----------|-------------|------------------|--------------------|
| `openai` | `gpt-4o-mini` | `text-embedding-3-small` | `https://api.openai.com/v1` |
| `bailian` | `qwen3.6-plus` | `text-embedding-v4` | `https://dashscope.aliyuncs.com/compatible-mode/v1` |
| `vllm` | — | — | — (required) |

Interactive `he config init` also offers a **custom** OpenAI-compatible option. It behaves like a provider without preset defaults: you supply model names and `base_url` values manually.
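The preset table can be pictured as a plain lookup. The sketch below is illustrative only: `PRESETS` and `preset_defaults` are hypothetical names that mirror the table, and the real `PROVIDER_PRESETS` structure in the library may be shaped differently.

```python
from typing import Dict, Optional

# Hypothetical mirror of the preset table above; None means "no default".
PRESETS: Dict[str, Dict[str, Optional[str]]] = {
    "openai": {
        "llm": "gpt-4o-mini",
        "embedder": "text-embedding-3-small",
        "base_url": "https://api.openai.com/v1",
    },
    "bailian": {
        "llm": "qwen3.6-plus",
        "embedder": "text-embedding-v4",
        "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
    },
    "vllm": {"llm": None, "embedder": None, "base_url": None},
}


def preset_defaults(provider: str) -> Dict[str, Optional[str]]:
    """Return defaults for a provider; unknown names get no defaults at all."""
    empty = {"llm": None, "embedder": None, "base_url": None}
    return PRESETS.get(provider, empty)
```

Treating an unknown provider like one with no defaults is an assumption here; it matches how the custom option is described, where every value is supplied manually.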

## CLI setup

<Steps>
<Step title="Initialize configuration">

Run interactive setup or pass flags for non-interactive configuration.

<CodeGroup>
```bash Interactive
he config init
```

```bash OpenAI one-liner
he config init -p openai -k sk-your-key
```

```bash Bailian one-liner
he config init -p bailian -k sk-your-key
```

```bash API key only (OpenAI defaults)
he config init -k sk-your-key
```
</CodeGroup>

Quick mode (`-k`, optionally with `-p`; the provider defaults to `openai`) writes both `[llm]` and `[embedder]` sections using preset default models. For `vllm`, run interactive init or configure each service separately.

</Step>

<Step title="Configure services individually (optional)">

Use per-service commands when LLM and embedder run on different providers or endpoints.

```bash
# LLM only
he config llm -p bailian -k sk-your-key -m qwen-plus

# Embedder only
he config embedder -p vllm -u http://localhost:8001/v1 -k dummy -m BAAI/bge-m3
```

<ParamField body="--provider" type="string">
Provider preset: `openai`, `bailian`, or `vllm`.
</ParamField>

<ParamField body="--api-key" type="string">
API key for the service. vLLM accepts `dummy` when the server does not enforce keys.
</ParamField>

<ParamField body="--model" type="string">
Model name served by the endpoint.
</ParamField>

<ParamField body="--base-url" type="string">
OpenAI-compatible API root (for example `http://localhost:8000/v1`). Required for `vllm`.
</ParamField>

<ParamField body="--show" type="boolean">
Display current settings for the service without writing changes.
</ParamField>

<ParamField body="--unset" type="boolean">
Reset the service section to defaults and save.
</ParamField>

</Step>

<Step title="Verify configuration">

```bash
he config show
he config llm --show
he config embedder --show
```

`he config show` prints a table with provider, model, masked API key, and base URL for both services.

</Step>
</Steps>

### Config file format

`he config init` and per-service commands persist settings to `~/.he/config.toml` (Windows: `%USERPROFILE%\.he\config.toml`).

```toml
[llm]
provider = "bailian"
model = "qwen3.6-plus"
api_key = "sk-your-api-key"
base_url = ""

[embedder]
provider = "vllm"
model = "BAAI/bge-m3"
api_key = "dummy"
base_url = "http://localhost:8001/v1"
```

Empty `base_url` fields resolve at runtime from `OPENAI_BASE_URL` if it is set, and otherwise from the provider preset. For `vllm`, an empty `base_url` fails validation.

## Environment variables

`.env.example` documents the two credential-related variables:

```bash
OPENAI_API_KEY=sk-your-api-key-here
OPENAI_BASE_URL=https://api.openai.com/v1
```

| Variable | Applies to | Resolution |
|----------|------------|------------|
| `OPENAI_API_KEY` | `[llm].api_key`, `[embedder].api_key` | Used when the corresponding config field is empty |
| `OPENAI_BASE_URL` | `[llm].base_url`, `[embedder].base_url` | Used when the corresponding config field is empty, before preset resolution |

<Warning>
Config file values take precedence over environment variables. Empty fields in `config.toml` fall back to `OPENAI_API_KEY` and `OPENAI_BASE_URL`, not the other way around.
</Warning>
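The precedence rule can be expressed in a few lines. The sketch below is an assumption about the shape of the logic, not the actual `ConfigManager` code; `resolve_field` is a hypothetical name.

```python
import os
from typing import Mapping, Optional


def resolve_field(
    file_value: str,
    env_name: str,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Prefer a non-empty config value; otherwise fall back to the environment."""
    env = os.environ if environ is None else environ
    # An empty string in config.toml counts as "not set".
    return file_value or env.get(env_name, "")
```

Passing `environ` explicitly makes the rule easy to test. Called as `resolve_field("", "OPENAI_API_KEY")`, it reads the real process environment.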

`create_llm()` and `create_embedder()` also read `OPENAI_API_KEY` when no `api_key` is passed in the spec or kwargs.

Logging is controlled separately:

| Variable | Purpose |
|----------|---------|
| `HYPER_EXTRACT_LOG_LEVEL` | Root log level (`DEBUG`, `INFO`, `WARNING`, `ERROR`) |
| `HYPER_EXTRACT_LOG_FILE` | Optional log file path |
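One way such variables are typically consumed with the standard `logging` module is sketched below. How Hyper-Extract reads them internally is not shown in this page, so `logging_settings` and the `INFO` default are assumptions made for illustration.

```python
import logging
import os
from typing import Mapping, Optional, Tuple

# Only the four level names documented in the table above.
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def logging_settings(
    environ: Optional[Mapping[str, str]] = None,
) -> Tuple[int, Optional[str]]:
    """Return (level, log_file) derived from the two logging variables."""
    env = os.environ if environ is None else environ
    name = env.get("HYPER_EXTRACT_LOG_LEVEL", "INFO").upper()
    level = LEVELS.get(name, logging.INFO)  # assumed default: INFO
    log_file = env.get("HYPER_EXTRACT_LOG_FILE") or None
    return level, log_file
```

The returned pair could then be passed to `logging.basicConfig(level=level, filename=log_file)`.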

## Programmatic client factory

The SDK exports four factory functions from `hyperextract`:

```python
from hyperextract import create_client, create_llm, create_embedder, get_client
```

| Function | Returns | Config source |
|----------|---------|---------------|
| `create_client()` | `(llm, embedder)` tuple | Arguments only |
| `create_llm()` | `ChatOpenAI` | Spec string or dict |
| `create_embedder()` | `OpenAIEmbeddings` or `CompatibleEmbeddings` | Spec string or dict |
| `get_client()` | `(llm, embedder)` tuple | `~/.he/config.toml` (or custom path) |

### String shorthand

Specs use `provider:model@url` syntax:

| Format | Example | Behavior |
|--------|---------|----------|
| `provider` | `"bailian"` | Preset defaults for model and URL |
| `provider:model` | `"bailian:qwen-plus"` | Override model, keep preset URL |
| `provider:model@url` | `"vllm:Qwen3.5-9B@http://localhost:8000/v1"` | Full manual specification |

Dict specs pass through with the same keys: `provider`, `model`, `base_url`, `api_key`.
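The shorthand grammar is small enough to parse with two `str.partition` calls. The following is a minimal sketch of that idea; `Spec` and `parse_spec` are hypothetical names, and the library's own parser may handle edge cases differently.

```python
from typing import NamedTuple, Optional


class Spec(NamedTuple):
    provider: str
    model: Optional[str]
    base_url: Optional[str]


def parse_spec(spec: str) -> Spec:
    """Split 'provider[:model][@url]' into its parts."""
    # Split off the URL first: URLs contain ':' and would confuse the model split.
    head, _, url = spec.partition("@")
    provider, _, model = head.partition(":")
    if not provider:
        raise ValueError(f"Spec {spec!r} is missing a provider")
    return Spec(provider, model or None, url or None)
```

For example, `parse_spec("vllm:Qwen3.5-9B@http://localhost:8000/v1")` yields the provider `vllm`, the model `Qwen3.5-9B`, and the URL `http://localhost:8000/v1`, while `parse_spec("bailian")` leaves model and URL as `None` so that preset defaults can apply.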

### Deployment patterns

<Tabs>
<Tab title="Single cloud provider">

```python
llm, emb = create_client("openai", api_key="sk-xxx")
# or
llm, emb = create_client("bailian", api_key="sk-xxx")
```

Both services share the provider preset defaults.

</Tab>
<Tab title="Local vLLM (two services)">

```python
llm, emb = create_client(
    llm="vllm:Qwen3.5-9B@http://localhost:8000/v1",
    embedder="vllm:bge-m3@http://localhost:8001/v1",
    api_key="dummy",
)
```

LLM and embedder typically run on separate ports. See `examples/providers/vllm_demo.py`.

</Tab>
<Tab title="Mixed cloud + local">

```python
llm, emb = create_client(
    llm="bailian:qwen-plus",
    embedder="vllm:bge-m3@http://localhost:8001/v1",
    api_key="sk-xxx",
)
```

This pairs a cloud LLM with on-premise embeddings, a common cost/latency split.

</Tab>
<Tab title="Config file">

```python
llm, emb = get_client()  # reads ~/.he/config.toml
# or
llm, emb = get_client("/path/to/config.toml")
```

Equivalent CLI setup:

```bash
he config init -p bailian -k sk-xxx
```

Then use `Template.create("general/biography_graph", language="en")` without passing clients explicitly.

</Tab>
</Tabs>

### Embedder selection

`create_embedder()` chooses the implementation based on `base_url`:

- **Official OpenAI URL** (`https://api.openai.com/v1`) → `OpenAIEmbeddings` (native tiktoken batching)
- **Any other URL** → `CompatibleEmbeddings` (string-only input, conservative batch size of 10, tiktoken chunking)

Endpoints other than OpenAI itself (Bailian, vLLM, Ollama, LiteLLM proxies) require `CompatibleEmbeddings` because most OpenAI-compatible providers reject pre-tokenized integer lists.
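The selection rule amounts to a single URL comparison. The sketch below returns class names as strings so it runs without any dependencies; `embedder_class_name` is a hypothetical helper, and ignoring a trailing slash is an assumption rather than documented behavior.

```python
OFFICIAL_OPENAI_URL = "https://api.openai.com/v1"


def embedder_class_name(base_url: str) -> str:
    """Name the embedder implementation that the rule above would select."""
    # Assumption: a trailing slash should not change the outcome.
    if base_url.rstrip("/") == OFFICIAL_OPENAI_URL:
        return "OpenAIEmbeddings"
    return "CompatibleEmbeddings"
```

Under this rule a local vLLM endpoint such as `http://localhost:8001/v1` maps to `CompatibleEmbeddings`.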

Extra keyword arguments passed to `create_client()` (for example `temperature=0.5`) are forwarded to `ChatOpenAI`.

## Mixed deployment examples

<Tabs>
<Tab title="CLI">

```bash
# Cloud LLM + local embedder
he config llm -p bailian -k sk-your-key
he config embedder -p vllm \
  -u http://localhost:8001/v1 \
  -k dummy \
  -m BAAI/bge-m3

# Local LLM + cloud embedder
he config llm -p vllm \
  -u http://localhost:8000/v1 \
  -k dummy \
  -m Qwen/Qwen3.5-9B
he config embedder -p bailian -k sk-your-key
```

</Tab>
<Tab title="Python">

```python
from hyperextract import create_client, AutoGraph

llm, emb = create_client(
    llm="bailian",
    embedder="vllm:bge-m3@http://localhost:8001/v1",
    api_key="sk-xxx",
)

graph = AutoGraph(
    instruction="Extract people and their relationships",
    llm_client=llm,
    embedder=emb,
    node_key_extractor=lambda n: n.name,
    edge_key_extractor=lambda e: (e.source, e.target, e.type),
    nodes_in_edge_extractor=lambda e: (e.source, e.target),
)
```

</Tab>
</Tabs>

## Validation and CLI enforcement

`ConfigManager.validate()` checks resolved configuration before extraction commands run:

| Condition | Result |
|-----------|--------|
| vLLM LLM with empty `base_url` | Fails with `vLLM provider requires base_url.` |
| Non-vLLM LLM with empty `api_key` | Fails and suggests `he config llm --api-key YOUR_KEY` |
| vLLM embedder with empty `base_url` | Fails with `vLLM embedder requires base_url.` |
| Non-vLLM embedder with empty `api_key` | Fails and suggests `he config embedder --api-key YOUR_KEY` |

`validate_config()` in the CLI prints the error and exits with code 1. Commands that call it include `he parse`, `he feed`, `he build-index`, `he search`, and `he talk`.
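The four rows of the table reduce to one check applied to each service. This is a rough sketch of that logic under the rules stated above; `check_service` and its return codes are hypothetical and do not reproduce the real `ConfigManager.validate()` messages.

```python
from typing import Optional


def check_service(provider: str, api_key: str, base_url: str) -> Optional[str]:
    """Return a short failure code for one service section, or None if it passes."""
    if provider == "vllm":
        # vLLM has no preset URL, so the endpoint must be given explicitly.
        return None if base_url else "missing_base_url"
    # Other providers have a usable URL from their preset but still need a key.
    return None if api_key else "missing_api_key"
```

Running the same function over both `[llm]` and `[embedder]` values covers all four conditions.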

<Check>
After configuration, confirm services respond before running extraction:

```bash
curl http://localhost:8000/v1/models   # vLLM LLM
curl http://localhost:8001/v1/models   # vLLM embedder
he config show
```
</Check>

## Common failure modes

<AccordionGroup>
<Accordion title="Missing API key on cloud provider">

```text
Error: LLM API key is not configured. Run 'he config llm --api-key YOUR_KEY'
```

Set the key via CLI or export `OPENAI_API_KEY` when the config field is empty.

</Accordion>

<Accordion title="vLLM missing base_url">

```text
Error: vLLM provider requires base_url.
```

Set `--base-url` on `he config llm` / `he config embedder`, or use the full `provider:model@url` shorthand in Python.

</Accordion>

<Accordion title="Provider requires explicit base_url at resolution time">

```text
ValueError: Provider 'vllm' requires explicit base_url.
```

Raised by `_resolve_base_url()` when a vLLM provider has no URL in config, environment, or preset.

</Accordion>

<Accordion title="create_client() called with no arguments">

```text
ValueError: Must provide llm=, embedder=, or provider= argument.
```

Pass a provider shorthand, separate `llm`/`embedder` specs, or use `get_client()` for file-based config.

</Accordion>
</AccordionGroup>

## Next

<CardGroup>
<Card title="Provider system" href="/provider-system">
BYOC/BYOK model, `provider:model@url` shorthand, `CompatibleEmbeddings`, and verified model compatibility.
</Card>
<Card title="Configuration reference" href="/configuration-reference">
Full `~/.he/config.toml` schema, defaults, and environment variable precedence rules.
</Card>
<Card title="Quickstart" href="/quickstart">
First extraction after `he config init`: parse, search, and visualize a Knowledge Abstract.
</Card>
<Card title="Python API reference" href="/python-api-reference">
`create_client`, `get_client`, `Template.create`, and AutoType lifecycle methods.
</Card>
<Card title="Troubleshooting" href="/troubleshooting">
Debug logging, template errors, and provider connection failures.
</Card>
</CardGroup>
