# Hyper-Extract Documentation

> Reference for the Hyper-Extract LLM knowledge extraction framework: CLI (`he`), Python API (`Template`, AutoTypes, `create_client`), YAML templates, extraction methods, and Knowledge Abstract lifecycle.

This is a Grok-Wiki source-grounded repository documentation set. Use the complete Markdown link when an agent needs the full repo context.

## Context Links

- [Complete Markdown docs](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/llms-full.txt)
- [Complete Markdown alias](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf.md)
- [Human interactive docs](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf)
- [GitHub repository](https://github.com/yifanfeng97/Hyper-Extract)

## Repository

- Repository: yifanfeng97/Hyper-Extract
- Generated: 2026-06-18T20:59:59.470Z
- Updated: 2026-06-18T21:02:44.802Z
- Runtime: Grok CLI
- Format: Documentation
- Pages: 22

## Pages

- [Overview](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/01-overview.md): What Hyper-Extract exposes (CLI `he`, Python `Template` API, 8 AutoTypes, 80+ YAML presets, 9 extraction methods), runtime assumptions (Python 3.11+, structured LLM output), and the shortest path from install to a queryable Knowledge Abstract.
- [Installation](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/02-installation.md): Install via `uv tool install hyperextract` or `uv pip install hyperextract`, Python version constraints, optional provider extras (`anthropic`, `google`, `all`), and first-run configuration prerequisites.
- [Quickstart](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/03-quickstart.md): First successful extraction: `he config init`, `he parse` with a preset template, `he search` / `he show`, and the equivalent Python `Template.create` + `feed_text` path using the Tesla biography example.
- [Knowledge Abstracts](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/04-knowledge-abstracts.md): The on-disk Knowledge Abstract (KA) model: `data.json`, `metadata.json`, and `index/` layout; lifecycle methods (`parse`, `feed_text`, `dump`, `load`, `build_index`); and incremental evolution via `he feed`.
- [Auto-Types](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/05-auto-types.md): Eight strongly-typed extraction primitives (`AutoModel`, `AutoList`, `AutoSet`, `AutoGraph`, `AutoHypergraph`, `AutoTemporalGraph`, `AutoSpatialGraph`, `AutoSpatioTemporalGraph`): structure, merge behavior, indexing, and when to pick each type.
- [Templates vs methods](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/06-templates-vs-methods.md): Domain YAML templates (`general/biography_graph`, `finance/earnings_summary`, etc.) versus algorithm-driven method templates (`method/light_rag`, `method/atom`); language requirements (`--lang` for templates, English-only for methods); and selection criteria.
- [Provider system](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/07-provider-system.md): BYOC/BYOK provider model: `openai`, `bailian`, and `vllm` presets; `provider:model@url` shorthand; `CompatibleEmbeddings` for non-OpenAI endpoints; and verified model compatibility requirements (`json_schema` / function calling).
- [Configure providers](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/08-configure-providers.md): Set up LLM and embedder clients via `he config init`, per-service `he config llm` / `he config embedder`, environment variables, or programmatic `create_client()` for mixed cloud and local vLLM deployments.
- [Extract and evolve knowledge](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/09-extract-and-evolve-knowledge.md): Run `he parse` (single file, directory of `.md`/`.txt`, or stdin), choose templates interactively or by ID, control indexing with `--no-index`, append documents with `he feed`, and rebuild indexes with `he build-index`.
- [Search, chat, and visualize](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/10-search-chat-and-visualize.md): Query Knowledge Abstracts with `he search` and `he talk` (single query or `-i` interactive mode), inspect stats via `he info`, and render graphs through OntoSight with `he show` or `AutoType.show()`.
- [Create custom templates](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/11-create-custom-templates.md): Author domain YAML templates: type selection, field and identifier design, multilingual `language` blocks, merge strategies, and validation workflow per the design guide and preset base templates.
- [Use extraction methods](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/12-use-extraction-methods.md): Invoke algorithm templates via `he parse -m light_rag` or `Template.create("method/hyper_rag")`; direct method classes (`Light_RAG`, `Atom`, etc.); and method-specific kwargs such as `observation_time` for temporal extractors.
- [Template design skills](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/13-template-design-skills.md): Agent-assisted template authoring with `hyperextract-skills`: brainstorm requirements, record/graph designers, yaml-validator rules, template-optimizer fixes, and multilingual conversion workflows.
- [CLI reference](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/14-cli-reference.md): Complete `he` command surface: `parse`, `feed`, `build-index`, `search`, `talk`, `show`, `info`, `list template`, `list method`, `config` subcommands, flags, defaults, exit conditions, and input/output contracts.
- [Python API reference](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/15-python-api-reference.md): Exported SDK: `Template.create/get/list`, `BaseAutoType` lifecycle (`parse`, `feed_text`, `search`, `chat`, `dump`, `load`, `build_index`, `show`), `create_client` / `create_llm` / `create_embedder` / `get_client`, and logging helpers.
- [Configuration reference](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/16-configuration-reference.md): `~/.he/config.toml` schema for `[llm]` and `[embedder]`, provider presets and default models, environment variable precedence (`OPENAI_API_KEY`, `OPENAI_BASE_URL`, `HYPER_EXTRACT_LOG_LEVEL`, `HYPER_EXTRACT_LOG_FILE`), and validation rules.
- [Template schema reference](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/17-template-schema-reference.md): YAML template fields (`language`, `name`, `type`, `tags`, `description`, `output`, `guideline`, `identifiers`, `options`, `display`), valid autotypes and field types, merge strategies, and identifier patterns.
- [Extraction methods reference](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/18-extraction-methods-reference.md): Registered methods (`graph_rag`, `light_rag`, `hyper_rag`, `hypergraph_rag`, `cog_rag`, `itext2kg`, `itext2kg_star`, `kg_gen`, `atom`): autotype output, descriptions, registry API, and constructor kwargs.
- [Tesla biography recipe](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/19-tesla-biography-recipe.md): End-to-end CLI and Python workflow using `examples/en/tesla.md` with `general/biography_graph`: parse, visualize, semantic search, and Q&A with expected artifacts under the output directory.
- [Method demos](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/20-method-demos.md): Runnable scripts under `examples/en/methods/` for each extraction engine: instantiate method classes, `feed_text`, `chat`, and `show` with LangChain clients and dotenv configuration.
- [Troubleshooting](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/21-troubleshooting.md): Common failure modes: missing API keys, vLLM `base_url` requirements, `--lang` required for knowledge templates, empty output directory conflicts, missing `data.json` or index for `search`/`talk`, template resolution errors, and debug logging via `HYPER_EXTRACT_LOG_LEVEL`.
- [Contributing](https://grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/pages/22-contributing.md): Development setup with `uv`, running `pytest` and coverage, CI matrix (Python 3.11–3.12, Ubuntu/macOS), lint workflow, optional integration tests, and how to add templates or register new extraction methods.

## Source Files

- `.env.example`
- `.github/workflows/integration.yml`
- `.github/workflows/lint.yml`
- `.github/workflows/test.yml`
- `.python-version`
- `examples/en/methods/atom_demo.py`
- `examples/en/methods/graph_rag_demo.py`
- `examples/en/methods/hyper_rag_demo.py`
- `examples/en/methods/kg_gen_demo.py`
- `examples/en/methods/light_rag_demo.py`
- `examples/en/tesla_question.md`
- `examples/en/tesla.md`
- `hyperextract-skills/graph-designer/SKILL.md`
- `hyperextract-skills/README.md`
- `hyperextract-skills/SKILL.md`
- `hyperextract-skills/template-optimizer/SKILL.md`
- `hyperextract-skills/yaml-validator/SKILL.md`
- `hyperextract/__init__.py`
- `hyperextract/cli/__main__.py`
- `hyperextract/cli/cli.py`
- `hyperextract/cli/commands/config.py`
- `hyperextract/cli/commands/list.py`
- `hyperextract/cli/config.py`
- `hyperextract/cli/README.md`
- `hyperextract/cli/utils.py`
- `hyperextract/methods/rag/graph_rag.py`
- `hyperextract/methods/rag/hyper_rag.py`
- `hyperextract/methods/rag/light_rag.py`
- `hyperextract/methods/registry.py`
- `hyperextract/methods/typical/atom.py`
- `hyperextract/methods/typical/kg_gen.py`
- `hyperextract/templates/DESIGN_GUIDE.md`
- `hyperextract/templates/presets/finance/earnings_summary.yaml`
- `hyperextract/templates/presets/general/base_graph.yaml`
- `hyperextract/templates/presets/general/biography_graph.yaml`
- `hyperextract/templates/README.md`
- `hyperextract/types/__init__.py`
- `hyperextract/types/base.py`
- `hyperextract/types/graph.py`
- `hyperextract/types/hypergraph.py`
- `hyperextract/types/model.py`
- `hyperextract/types/spatial_graph.py`
- `hyperextract/types/spatio_temporal_graph.py`
- `hyperextract/types/temporal_graph.py`
- `hyperextract/utils/client.py`
- `hyperextract/utils/logging.py`
- `hyperextract/utils/template_engine/factory.py`
- `hyperextract/utils/template_engine/gallery.py`
- `hyperextract/utils/template_engine/parsers/loader.py`
- `hyperextract/utils/template_engine/parsers/schemas/base.py`
- `hyperextract/utils/template_engine/parsers/schemas/graph.py`
- `hyperextract/utils/template_engine/parsers/schemas/naive.py`
- `hyperextract/utils/template_engine/template.py`
- `pyproject.toml`
- `README.md`
- `tests/cli/test_verbose.py`
- `tests/conftest.py`