# Configure providers

> Set up LLM and embedder clients via `he config init`, per-service `he config llm` / `he config embedder`, environment variables, or programmatic `create_client()` for mixed cloud and local vLLM deployments.

- Repository: yifanfeng97/Hyper-Extract
- GitHub: https://github.com/yifanfeng97/Hyper-Extract
- Human docs: https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf
- Complete Markdown: https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/llms-full.txt

## Source Files

- `hyperextract/cli/commands/config.py`
- `hyperextract/cli/config.py`
- `hyperextract/utils/client.py`
- `hyperextract/cli/README.md`
- `.env.example`

---

Hyper-Extract resolves LLM and embedder clients from `~/.he/config.toml`, environment-variable fallbacks, or the Python factory API (`create_client`, `create_llm`, `create_embedder`, `get_client`). CLI commands such as `he parse` call `validate_config()` before running and exit if credentials or vLLM `base_url` values are missing.

## Configuration surfaces

| Surface | Entry point | Persists to disk | Typical use |
|---------|-------------|------------------|-------------|
| Interactive CLI | `he config init` | Yes (`~/.he/config.toml`) | First-time setup |
| Per-service CLI | `he config llm`, `he config embedder` | Yes | Mixed providers, model overrides |
| Environment variables | `OPENAI_API_KEY`, `OPENAI_BASE_URL` | No | CI/CD, temporary overrides |
| Python factory | `create_client()`, `get_client()` | No (`get_client()` reads the file) | Scripts, notebooks, custom deployments |

```mermaid
flowchart LR
    subgraph cli ["CLI layer"]
        init["he config init"]
        llmCmd["he config llm"]
        embCmd["he config embedder"]
    end
    subgraph store ["~/.he/config.toml"]
        llmSec["[llm]"]
        embSec["[embedder]"]
    end
    subgraph resolve ["ConfigManager.get_*_config()"]
        fileVal["File values"]
        envFallback["OPENAI_API_KEY / OPENAI_BASE_URL"]
        preset["PROVIDER_PRESETS base_url"]
    end
    subgraph runtime ["Client factory"]
        getClient["get_client()"]
        createClient["create_client()"]
        chat["ChatOpenAI"]
        embed["OpenAIEmbeddings / CompatibleEmbeddings"]
    end
    init --> store
    llmCmd --> store
    embCmd --> store
    store --> fileVal
    fileVal --> envFallback
    envFallback --> preset
    preset --> getClient
    createClient --> chat
    createClient --> embed
    getClient --> chat
    getClient --> embed
```

`Template.create()` loads clients from `get_client()` when `llm_client` and `embedder` are omitted, so file-based configuration applies to both CLI and Python template workflows.

## Provider presets

Three built-in providers are available. The `openai` and `bailian` presets supply default models and base URLs; the `vllm` preset has no defaults, so you must set `model` and `base_url` explicitly.

| Provider | Default LLM | Default embedder | Default `base_url` |
|----------|-------------|------------------|--------------------|
| `openai` | `gpt-4o-mini` | `text-embedding-3-small` | `https://api.openai.com/v1` |
| `bailian` | `qwen3.6-plus` | `text-embedding-v4` | `https://dashscope.aliyuncs.com/compatible-mode/v1` |
| `vllm` | none | none | none (must be set) |

Interactive `he config init` also offers a **custom** OpenAI-compatible option. It behaves like a provider without preset defaults: you supply model names and `base_url` values manually.

## CLI setup

Run interactive setup or pass flags for non-interactive configuration.

```bash Interactive
he config init
```

```bash OpenAI one-liner
he config init -p openai -k sk-your-key
```

```bash Bailian one-liner
he config init -p bailian -k sk-your-key
```

```bash API key only (OpenAI defaults)
he config init -k sk-your-key
```

Quick mode (`-p` + `-k`) writes both `[llm]` and `[embedder]` sections using preset default models. For `vllm`, run interactive init or configure each service separately.

Use per-service commands when LLM and embedder run on different providers or endpoints.

```bash
# LLM only
he config llm -p bailian -k sk-your-key -m qwen-plus

# Embedder only
he config embedder -p vllm -u http://localhost:8001/v1 -k dummy -m BAAI/bge-m3
```

The per-service commands accept the following options:

- `-p`: Provider preset: `openai`, `bailian`, or `vllm`.
- `-k` (`--api-key`): API key for the service. vLLM accepts `dummy` when the server does not enforce keys.
- `-m`: Model name served by the endpoint.
- `-u` (`--base-url`): OpenAI-compatible API root (for example `http://localhost:8000/v1`). Required for `vllm`.
- `--show`: Display current settings for the service without writing changes.
- Reset option: Reset the service section to defaults and save.

```bash
he config show
he config llm --show
he config embedder --show
```

`he config show` prints a table with provider, model, masked API key, and base URL for both services.

### Config file format

`he config init` and per-service commands persist settings to `~/.he/config.toml` (Windows: `%USERPROFILE%\.he\config.toml`).

```toml
[llm]
provider = "bailian"
model = "qwen3.6-plus"
api_key = "sk-your-api-key"
base_url = ""

[embedder]
provider = "vllm"
model = "BAAI/bge-m3"
api_key = "dummy"
base_url = "http://localhost:8001/v1"
```

Empty `base_url` fields resolve from the provider preset at runtime. For `vllm`, an empty `base_url` fails validation.

## Environment variables

`.env.example` documents the two credential-related variables:

```bash
OPENAI_API_KEY=sk-your-api-key-here
OPENAI_BASE_URL=https://api.openai.com/v1
```

| Variable | Applies to | Resolution |
|----------|------------|------------|
| `OPENAI_API_KEY` | `[llm].api_key`, `[embedder].api_key` | Used when the corresponding config field is empty |
| `OPENAI_BASE_URL` | `[llm].base_url`, `[embedder].base_url` | Used when the corresponding config field is empty, before preset resolution |

Config file values take precedence over environment variables. Empty fields in `config.toml` fall back to `OPENAI_API_KEY` and `OPENAI_BASE_URL`, not the other way around.

`create_llm()` and `create_embedder()` also read `OPENAI_API_KEY` when no `api_key` is passed in the spec or kwargs.

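The precedence described above (config file value, then environment variable, then provider preset) can be sketched in a few lines. This is a minimal illustration, not the library's actual `ConfigManager` code: the `resolve_field` and `resolve_service` helpers and the `PRESET_BASE_URLS` dict are hypothetical names, the preset URLs are copied from the provider table, and the fallback to `openai` when no provider is given is an assumption.

```python
import os

# Hypothetical copy of the preset base URLs from the provider table.
# The vllm provider has no preset, so it is deliberately absent.
PRESET_BASE_URLS = {
    "openai": "https://api.openai.com/v1",
    "bailian": "https://dashscope.aliyuncs.com/compatible-mode/v1",
}


def resolve_field(file_value, env_name, default="", env=None):
    """Return the config-file value, else the environment variable, else default."""
    env = os.environ if env is None else env
    if file_value:
        return file_value
    return env.get(env_name) or default


def resolve_service(section, env=None):
    """Resolve api_key and base_url for one [llm] or [embedder] section."""
    # Assumption: a missing provider is treated as "openai".
    provider = section.get("provider", "openai")
    return {
        "provider": provider,
        "model": section.get("model", ""),
        "api_key": resolve_field(
            section.get("api_key", ""), "OPENAI_API_KEY", env=env
        ),
        "base_url": resolve_field(
            section.get("base_url", ""),
            "OPENAI_BASE_URL",
            default=PRESET_BASE_URLS.get(provider, ""),
            env=env,
        ),
    }
```

In this sketch, a `vllm` section with an empty `base_url` and no `OPENAI_BASE_URL` resolves to an empty string, which is the situation validation rejects.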
Logging is controlled separately:

| Variable | Purpose |
|----------|---------|
| `HYPER_EXTRACT_LOG_LEVEL` | Root log level (`DEBUG`, `INFO`, `WARNING`, `ERROR`) |
| `HYPER_EXTRACT_LOG_FILE` | Optional log file path |

## Programmatic client factory

The SDK exports four factory functions from `hyperextract`:

```python
from hyperextract import create_client, create_llm, create_embedder, get_client
```

| Function | Returns | Config source |
|----------|---------|---------------|
| `create_client()` | `(llm, embedder)` tuple | Arguments only |
| `create_llm()` | `ChatOpenAI` | Spec string or dict |
| `create_embedder()` | `OpenAIEmbeddings` or `CompatibleEmbeddings` | Spec string or dict |
| `get_client()` | `(llm, embedder)` tuple | `~/.he/config.toml` (or custom path) |

### String shorthand

Specs use `provider:model@url` syntax:

| Format | Example | Behavior |
|--------|---------|----------|
| `provider` | `"bailian"` | Preset defaults for model and URL |
| `provider:model` | `"bailian:qwen-plus"` | Override model, keep preset URL |
| `provider:model@url` | `"vllm:Qwen3.5-9B@http://localhost:8000/v1"` | Full manual specification |

Dict specs pass through with the same keys: `provider`, `model`, `base_url`, `api_key`.

### Deployment patterns

```python
from hyperextract import create_client

llm, emb = create_client("openai", api_key="sk-xxx")
# or
llm, emb = create_client("bailian", api_key="sk-xxx")
```

Both services share the provider preset defaults.

```python
from hyperextract import create_client

llm, emb = create_client(
    llm="vllm:Qwen3.5-9B@http://localhost:8000/v1",
    embedder="vllm:bge-m3@http://localhost:8001/v1",
    api_key="dummy",
)
```

LLM and embedder typically run on separate ports. See `examples/providers/vllm_demo.py`.

```python
from hyperextract import create_client

llm, emb = create_client(
    llm="bailian:qwen-plus",
    embedder="vllm:bge-m3@http://localhost:8001/v1",
    api_key="sk-xxx",
)
```

Cloud LLM with on-premise embeddings, a common cost/latency split.

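The spec strings used in these patterns follow the `provider:model@url` shorthand. As a rough illustration of how such a string breaks down into the dict keys listed earlier, here is a sketch with a hypothetical `parse_spec` helper. It is not the library's own parser, and it leaves preset lookup and `api_key` handling out entirely.

```python
def parse_spec(spec):
    """Split 'provider[:model][@url]' into a dict; missing parts are None.

    Hypothetical helper for illustration only.
    """
    # Split off the URL first: the URL itself contains ':' characters,
    # so the '@' separator has to be handled before the ':' separator.
    rest, _, url = spec.partition("@")
    provider, _, model = rest.partition(":")
    if not provider:
        raise ValueError(f"Spec {spec!r} has no provider")
    return {
        "provider": provider,
        "model": model or None,
        "base_url": url or None,
    }


if __name__ == "__main__":
    for example in (
        "bailian",
        "bailian:qwen-plus",
        "vllm:Qwen3.5-9B@http://localhost:8000/v1",
    ):
        print(example, "->", parse_spec(example))
```

Fields that come back as `None` are the ones a preset would need to fill in, which is why the bare `"vllm"` form is not enough on its own.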
```python
from hyperextract import get_client

llm, emb = get_client()  # reads ~/.he/config.toml
# or
llm, emb = get_client("/path/to/config.toml")
```

Equivalent CLI setup:

```bash
he config init -p bailian -k sk-xxx
```

Then use `Template.create("general/biography_graph", language="en")` without passing clients explicitly.

### Embedder selection

`create_embedder()` chooses the implementation based on `base_url`:

- **Official OpenAI URL** (`https://api.openai.com/v1`) → `OpenAIEmbeddings` (native tiktoken batching)
- **Any other URL** → `CompatibleEmbeddings` (string-only input, conservative batch size of 10, tiktoken chunking)

OpenAI-compatible endpoints other than OpenAI itself (Bailian, vLLM, Ollama, LiteLLM proxies) require `CompatibleEmbeddings` because most of these providers reject pre-tokenized integer lists.

Extra kwargs on `create_client()` (for example `temperature=0.5`) forward to `ChatOpenAI`.

## Mixed deployment examples

```bash
# Cloud LLM + local embedder
he config llm -p bailian -k sk-your-key
he config embedder -p vllm \
  -u http://localhost:8001/v1 \
  -k dummy \
  -m BAAI/bge-m3

# Local LLM + cloud embedder
he config llm -p vllm \
  -u http://localhost:8000/v1 \
  -k dummy \
  -m Qwen/Qwen3.5-9B
he config embedder -p bailian -k sk-your-key
```

```python
from hyperextract import create_client, AutoGraph

llm, emb = create_client(
    llm="bailian",
    embedder="vllm:bge-m3@http://localhost:8001/v1",
    api_key="sk-xxx",
)

graph = AutoGraph(
    instruction="Extract people and their relationships",
    llm_client=llm,
    embedder=emb,
    node_key_extractor=lambda n: n.name,
    edge_key_extractor=lambda e: (e.source, e.target, e.type),
    nodes_in_edge_extractor=lambda e: (e.source, e.target),
)
```

## Validation and CLI enforcement

`ConfigManager.validate()` checks resolved configuration before extraction commands run:

| Condition | Result |
|-----------|--------|
| `provider == "vllm"` and empty `base_url` | Fails with `vLLM provider requires base_url.` |
| Non-vLLM LLM with empty `api_key` | Fails; suggests `he config llm --api-key YOUR_KEY` |
| vLLM embedder with empty `base_url` | Fails with `vLLM embedder requires base_url.` |
| Non-vLLM embedder with empty `api_key` | Fails; suggests `he config embedder --api-key YOUR_KEY` |

`validate_config()` in the CLI prints the error and exits with code 1. Commands that call it include `he parse`, `he feed`, `he build-index`, `he search`, and `he talk`.

After configuration, confirm services respond before running extraction:

```bash
curl http://localhost:8000/v1/models   # vLLM LLM
curl http://localhost:8001/v1/models   # vLLM embedder
he config show
```

## Common failure modes

```text
Error: LLM API key is not configured. Run 'he config llm --api-key YOUR_KEY'
```

Set the key via CLI or export `OPENAI_API_KEY` when the config field is empty.

```text
Error: vLLM provider requires base_url.
```

Set `--base-url` on `he config llm` / `he config embedder`, or use the full `provider:model@url` shorthand in Python.

```text
ValueError: Provider 'vllm' requires explicit base_url.
```

Raised by `_resolve_base_url()` when a vLLM provider has no URL in config, environment, or preset.

```text
ValueError: Must provide llm=, embedder=, or provider= argument.
```

Pass a provider shorthand, separate `llm`/`embedder` specs, or use `get_client()` for file-based config.

## Next

- BYOC/BYOK model, `provider:model@url` shorthand, `CompatibleEmbeddings`, and verified model compatibility.
- Full `~/.he/config.toml` schema, defaults, and environment variable precedence rules.
- First extraction after `he config init`: parse, search, and visualize a Knowledge Abstract.
- `create_client`, `get_client`, `Template.create`, and AutoType lifecycle methods.
- Debug logging, template errors, and provider connection failures.