# Helper detection

> How transpiler helpers (Babel, TypeScript/tslib, SWC) are matched by AST body shape across imported, inlined, hoisted, and minified forms; MatchContext and helper lifecycle layers.

- Repository: pionxzh/wakaru
- GitHub: https://github.com/pionxzh/wakaru
- Human docs: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b
- Complete Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/llms-full.txt

## Source Files

- `docs/helper-detection.md`
- `docs/learnings/helper-detection-pattern-engine.md`
- `crates/core/src/rules/helper_matcher.rs`
- `crates/core/src/rules/match_context.rs`
- `crates/core/src/rules/transpiler_helper_utils/mod.rs`

---


Wakaru recovers transpiler runtime helpers by matching **AST body shape** and **import paths**, not function names. Detection runs in `transpiler_helper_utils` (`collect_transpiler_helpers`, `LocalHelperContext`), binding identity is tracked via `MatchContext` and `helper_matcher.rs`, and per-helper restoration rules consume the cached context during the `Helpers` pipeline stage.

## Four helper forms in bundled output

Transpilers (Babel, TypeScript/tslib, SWC) inject runtime helpers that appear in bundled JavaScript in four shapes:

| Form | Example | Detection strategy |
|---|---|---|
| **Imported** | `require("@babel/runtime/helpers/interopRequireDefault")` | Import path in `paths.rs` maps to `TranspilerHelperKind` |
| **Inlined** | `function _x(obj) { return obj && obj.__esModule ? obj : { default: obj }; }` at module top | Body-shape matcher (`fn(&Function) -> bool`) |
| **Hoisted** | Shared webpack module accessed via `require(42)`; name lost | Body shape after unpack; cross-module facts link numeric refs |
| **Minified** | `function(e){return e&&e.__esModule?e:{default:e}}` | Same body-shape matchers; names ignored, `SyntaxContext` preserved |

<Note>
esbuild bundler helpers (`__commonJS`, `__esm`, `__toESM`, `__toCommonJS`) are handled in the unpacker, not by transpiler helper detection.
</Note>

## Three-layer architecture

Helper recovery is split into three deliberate layers. Together they give more structure than scattered tuple checks, without introducing a general AST pattern DSL.

```mermaid
flowchart TB
  subgraph detect ["Detection"]
    COLLECT["collect_transpiler_helpers / LocalHelperContext"]
    MATCHERS["Rule-local body-shape matchers"]
    PATHS["paths.rs import-path tables"]
  end

  subgraph identity ["Binding identity"]
    MC["MatchContext — named binding slots"]
    HM["helper_matcher.rs — BindingKey primitives"]
  end

  subgraph lifecycle ["Lifecycle"]
    LC["lifecycle.rs — ref tracking, dependency graph"]
    RESTORE["Per-helper VisitMut rules"]
  end

  PATHS --> COLLECT
  MATCHERS --> COLLECT
  MC --> MATCHERS
  HM --> MATCHERS
  HM --> LC
  COLLECT --> RESTORE
  LC --> RESTORE
```

### Layer 1: `MatchContext` (binding-aware matching)

`MatchContext` (`match_context.rs`) maps function parameters and discovered locals to **named slots**, then checks binding identity with `SyntaxContext`:

<ResponseField name="from_params" type="fn(&Function, &[&str]) -> Option<MatchContext>">
Extracts simple-ident params into named slots. Returns `None` if param count or shape does not match.
</ResponseField>

<ResponseField name="is_binding" type="fn(&Expr, &str) -> bool">
True when `expr` is an identifier matching the slot's `(sym, ctxt)`.
</ResponseField>

<ResponseField name="is_member_of" type="fn(&Expr, &str, &str) -> bool">
True when `expr` is `<slot>.prop_name` (supports ident and string-literal computed props).
</ResponseField>

Use `MatchContext` when several identifiers must refer to the same binding — e.g. `_classCallCheck`, `_inherits`, `_objectWithoutProperties`. Do **not** use it as a full pattern engine; surrounding matchers remain ordinary Rust over SWC nodes.
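
A minimal sketch of the slot idea, using hypothetical stand-in types (`Expr`, `Function`, and a `u32` syntax context) rather than the real SWC nodes. The method names follow the fields above; the bodies are assumptions about their behavior, not the crate's implementation.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for SWC's `SyntaxContext`.
type SyntaxContext = u32;

// Hypothetical miniature expression type; the real code matches SWC nodes.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Ident { sym: String, ctxt: SyntaxContext },
    Member { obj: Box<Expr>, prop: String },
}

struct Function {
    // Simple identifier params only, as (symbol, syntax context).
    params: Vec<(String, SyntaxContext)>,
}

struct MatchContext {
    slots: HashMap<String, (String, SyntaxContext)>,
}

impl MatchContext {
    /// Bind each parameter to a named slot; fail on a parameter-count mismatch.
    fn from_params(func: &Function, names: &[&str]) -> Option<MatchContext> {
        if func.params.len() != names.len() {
            return None;
        }
        let slots = names
            .iter()
            .zip(&func.params)
            .map(|(name, param)| (name.to_string(), param.clone()))
            .collect();
        Some(MatchContext { slots })
    }

    /// True when `expr` is an identifier with the slot's exact (sym, ctxt).
    fn is_binding(&self, expr: &Expr, slot: &str) -> bool {
        match (expr, self.slots.get(slot)) {
            (Expr::Ident { sym, ctxt }, Some((s, c))) => sym == s && ctxt == c,
            _ => false,
        }
    }

    /// True when `expr` is `<slot>.prop_name`.
    fn is_member_of(&self, expr: &Expr, slot: &str, prop_name: &str) -> bool {
        match expr {
            Expr::Member { obj, prop } => prop == prop_name && self.is_binding(obj, slot),
            _ => false,
        }
    }
}
```

In this sketch a minified parameter `e` bound to the slot `obj` matches only identifiers that carry the same context, so a shadowing `e` from an inner scope is rejected.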

### Layer 2: `helper_matcher.rs` (binding lifecycle)

Shared low-level primitives for scope-sensitive helper work:

| Primitive | Purpose |
|---|---|
| `BindingKey` = `(Atom, SyntaxContext)` | Unique binding identity |
| `ident_matches_binding` / `expr_matches_binding` | Binding-safe identifier checks |
| `member_of_binding` | `key.prop` member access |
| `remaining_refs_outside_*` | Reference counting, skipping helper declarations |
| `remove_fn_decls_by_binding` / `remove_var_declarators_by_binding` | Declaration cleanup after rewrite |

Every binding match checks **both** the symbol and the `SyntaxContext`. Matching by `sym` alone can hit an unrelated inner-scope local that happens to share the helper's name.
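
The difference can be sketched with a plain map lookup. This assumes `Atom` and `SyntaxContext` are modelled as `String` and `u32`; both lookup functions are hypothetical illustrations, not primitives from `helper_matcher.rs`.

```rust
use std::collections::HashMap;

// Hypothetical simplification of `(Atom, SyntaxContext)`.
type BindingKey = (String, u32);

/// Look up a helper by full binding identity (symbol plus syntax context).
fn lookup(
    helpers: &HashMap<BindingKey, &'static str>,
    sym: &str,
    ctxt: u32,
) -> Option<&'static str> {
    helpers.get(&(sym.to_string(), ctxt)).copied()
}

/// Deliberately unsafe variant that ignores the syntax context.
fn lookup_by_sym_only(
    helpers: &HashMap<BindingKey, &'static str>,
    sym: &str,
) -> Option<&'static str> {
    helpers
        .iter()
        .find(|((s, _), _)| s == sym)
        .map(|(_, kind)| *kind)
}
```

With a helper registered as `("e", 1)`, an inner-scope local `("e", 2)` misses the context-aware lookup but is wrongly reported as a helper by the symbol-only variant.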

### Layer 3: Rule-local matching

Each helper kind owns its semantic shape recognition. The central scanner (`detect_helper_from_fn`) dispatches to per-helper predicates; some rules keep **stateful** or **marker-based** detection locally:

| Rule module | Detection style |
|---|---|
| `transpiler_helper_utils/matchers.rs` | Babel/SWC body shapes + import paths |
| `un_typeof_polyfill.rs` | TypeScript `typeof Symbol.iterator` polyfill |
| `un_to_consumable_array.rs` | TypeScript `__spreadArray` |
| `un_template_literal.rs` | Tagged-template cache factories (esbuild aliases globals) |
| `un_webpack_interop.rs` | `require.n`, `require.t`, `require.o` |
| `un_object_spread.rs` | esbuild `__spreadValues` / `__spreadProps` (module-wide alias state) |

<Warning>
The central scanner's matchers are `fn(&Function) -> bool` and must not depend on bundler-specific module state. esbuild object-spread detection stays in `un_object_spread.rs` by design.
</Warning>

## Detection scan

`collect_transpiler_helpers()` walks module-level declarations and returns `HashMap<BindingKey, TranspilerHelperKind>`:

```text
scan module-level declarations
  → for each function body, run shape matchers
  → for each Babel runtime import, map path → TranspilerHelperKind
  → for each tslib import/require alias, map raw TsHelperKind
  → collect (BindingKey, TranspilerHelperKind) pairs
```
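
The scan can be sketched as a single pass that merges both detection sources into one map. The `ModuleItem` and `Function` types below are hypothetical, pre-digested stand-ins for real module declarations, and the one-line shape check is a placeholder for a real body-shape matcher.

```rust
use std::collections::HashMap;

// Hypothetical simplification: (symbol, syntax context) as (String, u32).
type BindingKey = (String, u32);

#[derive(Debug, Clone, Copy, PartialEq)]
enum TranspilerHelperKind {
    InteropRequireDefault,
}

// Hypothetical function summary standing in for a real AST body.
struct Function {
    param_count: usize,
    tests_es_module: bool,
}

// Hypothetical flattened module-level declarations.
enum ModuleItem {
    FnDecl { key: BindingKey, func: Function },
    RuntimeImport { key: BindingKey, path: String },
    Other,
}

/// Placeholder shape check; real matchers inspect the function body.
fn detect_helper_from_fn(func: &Function) -> Option<TranspilerHelperKind> {
    (func.param_count == 1 && func.tests_es_module)
        .then_some(TranspilerHelperKind::InteropRequireDefault)
}

fn detect_helper_from_path(path: &str) -> Option<TranspilerHelperKind> {
    match path {
        "@babel/runtime/helpers/interopRequireDefault" => {
            Some(TranspilerHelperKind::InteropRequireDefault)
        }
        _ => None,
    }
}

/// Walk module-level items and record every binding recognized as a helper.
fn collect_transpiler_helpers(
    items: &[ModuleItem],
) -> HashMap<BindingKey, TranspilerHelperKind> {
    let mut helpers = HashMap::new();
    for item in items {
        let (key, kind) = match item {
            ModuleItem::FnDecl { key, func } => (key, detect_helper_from_fn(func)),
            ModuleItem::RuntimeImport { key, path } => (key, detect_helper_from_path(path)),
            ModuleItem::Other => continue,
        };
        if let Some(kind) = kind {
            helpers.insert(key.clone(), kind);
        }
    }
    helpers
}
```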

### Scan targets

`collect.rs` inspects:

- `function` declarations and `export function`
- `var x = function…` / arrow assignments
- `var x = require("@babel/runtime/helpers/…")` and `.default` chains
- `import … from "@babel/runtime/helpers/…"` and `tslib` named imports
- `Object.assign || function(target)…` extends polyfill forms

### Babel sub-helper gating

Babel 7+ uses thin OR-chain dispatchers (`return f(x) || g(x) || h(x)`). The scanner only accepts these when `module_has_babel_sub_helper_signals` finds `Array.isArray`, `Array.from`, or `Symbol.iterator` elsewhere in the module — preventing false positives on unrelated OR chains.
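
A rough sketch of the gate, assuming a hypothetical miniature `Expr` type and a `ModuleSummary` that lists the global member references seen in the rest of the module. Only `module_has_babel_sub_helper_signals` is a name from the text; its signature here is invented for illustration.

```rust
// Hypothetical miniature expression type; the real matcher walks SWC nodes.
enum Expr {
    Call { callee: String },
    Or(Box<Expr>, Box<Expr>),
    Other,
}

// Hypothetical summary of member references found elsewhere in the module.
struct ModuleSummary {
    global_member_refs: Vec<String>,
}

/// True for left-nested chains such as `f(x) || g(x) || h(x)`.
fn is_or_chain_of_calls(expr: &Expr) -> bool {
    match expr {
        Expr::Or(left, right) => {
            let left_ok =
                matches!(**left, Expr::Call { .. }) || is_or_chain_of_calls(left);
            left_ok && matches!(**right, Expr::Call { .. })
        }
        _ => false,
    }
}

fn module_has_babel_sub_helper_signals(module: &ModuleSummary) -> bool {
    const SIGNALS: [&str; 3] = ["Array.isArray", "Array.from", "Symbol.iterator"];
    module
        .global_member_refs
        .iter()
        .any(|r| SIGNALS.contains(&r.as_str()))
}

/// Accept the dispatcher shape only when the module also shows sub-helper signals.
fn accept_or_chain_dispatcher(return_expr: &Expr, module: &ModuleSummary) -> bool {
    is_or_chain_of_calls(return_expr) && module_has_babel_sub_helper_signals(module)
}
```

The same OR chain is rejected in a module that never touches `Array.isArray`, `Array.from`, or `Symbol.iterator`, which is the false-positive case the gate exists for.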

### `LocalHelperContext`

`LocalHelperContext` extends the scan with TypeScript/tslib state:

<ResponseField name="helpers" type="HashMap<BindingKey, TranspilerHelperKind>">
Babel/SWC semantic helpers from body shape and import paths.
</ResponseField>

<ResponseField name="ts_helpers" type="HashMap<BindingKey, TsHelperInfo>">
Raw TypeScript helpers (`__awaiter`, `__generator`, `__spreadArray`, …) tracked separately from semantic kinds.
</ResponseField>

<ResponseField name="tslib_namespaces" type="HashSet<BindingKey>">
Bindings that namespace-import or require `tslib`.
</ResponseField>

Pipeline consumers call `RuleRunContext::local_helpers()`, which **lazily builds** one `LocalHelperContext` per module (with `unresolved_mark`) and reuses it across helper rules in the same pass. Direct rule tests build the context themselves.
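
The build-once, reuse-many behavior can be sketched with `std::cell::OnceCell`. Whether the crate uses `OnceCell` is an assumption; the `cache` and `scans` fields and the source-text check are hypothetical stand-ins for the real scan.

```rust
use std::cell::{Cell, OnceCell};
use std::collections::HashMap;

// Hypothetical simplification: (symbol, syntax context) as (String, u32).
type BindingKey = (String, u32);

#[derive(Debug, Default)]
struct LocalHelperContext {
    helpers: HashMap<BindingKey, &'static str>,
}

struct RuleRunContext {
    module_source: String,
    scans: Cell<u32>, // counts how often the scan actually ran
    cache: OnceCell<LocalHelperContext>,
}

impl RuleRunContext {
    fn new(module_source: &str) -> Self {
        Self {
            module_source: module_source.to_string(),
            scans: Cell::new(0),
            cache: OnceCell::new(),
        }
    }

    /// Build the context on first use, then hand out the cached value.
    fn local_helpers(&self) -> &LocalHelperContext {
        self.cache.get_or_init(|| {
            self.scans.set(self.scans.get() + 1);
            // Stand-in for the real module scan.
            let mut ctx = LocalHelperContext::default();
            if self.module_source.contains("__esModule") {
                ctx.helpers
                    .insert(("_x".to_string(), 0), "InteropRequireDefault");
            }
            ctx
        })
    }
}
```

Two helper rules calling `local_helpers()` on the same `RuleRunContext` receive the same reference, and the scan counter stays at one.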

Utility methods:

| Method | Behavior |
|---|---|
| `helpers_of_kind(kind)` | Filter helpers by `TranspilerHelperKind` |
| `is_helper_callee(expr, kind)` | Match local binding, tslib namespace member, or `require("tslib").helper` |
| `remove_helpers_with_dependencies` | Remove helper + transitive `HelperDependency` bindings when unreferenced |
| `remove_unused_inline_ts_helpers` | Drop inlined TS helpers no longer referenced |

## Matching strategies

### Body-shape predicates

Shape matchers are plain `fn(&Function) -> bool` functions. They check essential structural elements and ignore variable names.

**`interopRequireDefault`** — single param, `__esModule` test, returns `{default: obj}`:

```javascript
// Babel, SWC, minified — same shape, different names
function _interopRequireDefault(obj) {
  return obj && obj.__esModule ? obj : { default: obj };
}
```

The matcher uses `MatchContext::from_params(func, &["obj"])` and accepts either the ternary-return form or the if/return form. Inline IIFEs are classified via `classify_inline_helper_call`.
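
A simplified sketch of such a predicate over a hypothetical toy AST. It compares parameter names as plain strings for brevity, whereas the real matcher also gates on `SyntaxContext`, and the `DefaultObj` variant is an invented shorthand for the object literal `{ default: <expr> }`.

```rust
// Hypothetical toy AST; the real matcher operates on SWC nodes.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Ident(String),
    Member(Box<Expr>, String),
    And(Box<Expr>, Box<Expr>),
    Cond(Box<Expr>, Box<Expr>, Box<Expr>),
    DefaultObj(Box<Expr>), // shorthand for `{ default: <expr> }`
}

enum Stmt {
    Return(Expr),
    If { test: Expr, then_return: Expr },
}

struct Function {
    params: Vec<String>,
    body: Vec<Stmt>,
}

fn is_param(e: &Expr, p: &str) -> bool {
    matches!(e, Expr::Ident(n) if n == p)
}

/// Matches `p && p.__esModule`.
fn is_es_module_test(e: &Expr, p: &str) -> bool {
    match e {
        Expr::And(l, r) => {
            is_param(l, p)
                && matches!(&**r, Expr::Member(o, prop) if is_param(o, p) && prop == "__esModule")
        }
        _ => false,
    }
}

/// Matches `{ default: p }`.
fn is_default_wrap(e: &Expr, p: &str) -> bool {
    matches!(e, Expr::DefaultObj(inner) if is_param(inner, p))
}

fn is_interop_require_default(f: &Function) -> bool {
    let [p] = f.params.as_slice() else { return false };
    match f.body.as_slice() {
        // return p && p.__esModule ? p : { default: p };
        [Stmt::Return(Expr::Cond(test, cons, alt))] => {
            is_es_module_test(test, p) && is_param(cons, p) && is_default_wrap(alt, p)
        }
        // if (p && p.__esModule) return p; return { default: p };
        [Stmt::If { test, then_return }, Stmt::Return(alt)] => {
            is_es_module_test(test, p) && is_param(then_return, p) && is_default_wrap(alt, p)
        }
        _ => false,
    }
}
```

The parameter name never appears in the predicate, so `_interopRequireDefault(obj)` and a minified `function(e)` with the same body both match.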

**Marker accumulation** — complex helpers scan the body for signals anywhere, not at fixed positions:

| Helper | Key signals |
|---|---|
| `toConsumableArray` | `Array.isArray` + `Array.from` |
| `slicedToArray` | `Symbol.iterator` + `Array.isArray`, or OR-chain with sub-helpers |
| `extends` | `Object.assign` + `.apply(this, arguments)` |
| `objectSpread` | `arguments` ref + `Object.defineProperty` or descriptor APIs |
| `interopRequireWildcard` | `__esModule` + for-in or `Object.keys`/`getOwnPropertyDescriptor` |

`scan_stmts_for_markers` and `BodyMarkerState` implement this accumulation pattern.
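
As an illustration of the accumulation pattern, the sketch below scans nested statements and sets flags wherever a signal appears. The names match the text, but the signature, the fields, and the use of source-text snippets instead of AST nodes are simplifying assumptions.

```rust
#[derive(Debug, Default)]
struct BodyMarkerState {
    array_is_array: bool,
    array_from: bool,
    symbol_iterator: bool,
}

// Hypothetical statements: either a source snippet or a nested block.
enum Stmt {
    Expr(&'static str),
    Block(Vec<Stmt>),
}

/// Accumulate markers from anywhere in the body, at any nesting depth.
fn scan_stmts_for_markers(stmts: &[Stmt], state: &mut BodyMarkerState) {
    for stmt in stmts {
        match stmt {
            Stmt::Expr(src) => {
                state.array_is_array |= src.contains("Array.isArray");
                state.array_from |= src.contains("Array.from");
                state.symbol_iterator |= src.contains("Symbol.iterator");
            }
            Stmt::Block(inner) => scan_stmts_for_markers(inner, state),
        }
    }
}

fn markers_of(body: &[Stmt]) -> BodyMarkerState {
    let mut state = BodyMarkerState::default();
    scan_stmts_for_markers(body, &mut state);
    state
}

/// `toConsumableArray` signals: `Array.isArray` + `Array.from`.
fn looks_like_to_consumable_array(body: &[Stmt]) -> bool {
    let m = markers_of(body);
    m.array_is_array && m.array_from
}

/// `slicedToArray` signals: `Symbol.iterator` + `Array.isArray`.
fn looks_like_sliced_to_array(body: &[Stmt]) -> bool {
    let m = markers_of(body);
    m.symbol_iterator && m.array_is_array
}
```

Because the flags are order-independent, a minifier that reorders or nests the statements does not change the verdict.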

**Tagged template literal** — signal-based matching on a 2-param function:

| Variant | Required signals |
|---|---|
| Babel spec | `slice_copy` + `freeze_define_raw` |
| Babel loose | `slice_copy` + `raw_assignment` |
| TypeScript | `define_property_raw` |

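The table above reads naturally as a small decision function. The sketch below is an assumption about how the signals combine, including the order in which variants are tried; the `Signals` struct and `classify_tagged_template` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
enum TaggedTemplateVariant {
    BabelSpec,
    BabelLoose,
    TypeScript,
}

// Hypothetical record of the signals found in a candidate function body.
#[derive(Debug, Default)]
struct Signals {
    slice_copy: bool,
    freeze_define_raw: bool,
    raw_assignment: bool,
    define_property_raw: bool,
}

/// Classify a candidate; only 2-param functions are considered at all.
fn classify_tagged_template(param_count: usize, s: &Signals) -> Option<TaggedTemplateVariant> {
    if param_count != 2 {
        return None;
    }
    if s.slice_copy && s.freeze_define_raw {
        Some(TaggedTemplateVariant::BabelSpec)
    } else if s.slice_copy && s.raw_assignment {
        Some(TaggedTemplateVariant::BabelLoose)
    } else if s.define_property_raw {
        Some(TaggedTemplateVariant::TypeScript)
    } else {
        None
    }
}
```
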
### Import-path matching

`paths.rs` maps known runtime package paths to helper kinds. Babel and SWC paths are unified per kind:

```javascript
// Both resolve to TranspilerHelperKind::InteropRequireDefault
"@babel/runtime/helpers/interopRequireDefault"
"@swc/helpers/_/_interop_require_default"
```

### Generated-name fallback

When body shape is ambiguous, SWC/esbuild generated names provide a secondary signal — e.g. `_object_spread`, `__spreadValues`, `__objRest` map to `ObjectSpread` / `ObjectWithoutProperties` when the init is a function.
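
A sketch of the fallback order, assuming the shape result is tried first and the name is consulted only when it is inconclusive. The pairing of each name with a kind is inferred from the sentence above, and both function names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum TranspilerHelperKind {
    ObjectSpread,
    ObjectWithoutProperties,
}

/// Secondary signal: only trusted when the initializer is a function.
fn kind_from_generated_name(name: &str, init_is_function: bool) -> Option<TranspilerHelperKind> {
    if !init_is_function {
        return None;
    }
    match name {
        // Assumed pairing of generated names to kinds.
        "_object_spread" | "__spreadValues" => Some(TranspilerHelperKind::ObjectSpread),
        "__objRest" => Some(TranspilerHelperKind::ObjectWithoutProperties),
        _ => None,
    }
}

/// Prefer the body-shape verdict; fall back to the generated name.
fn detect_with_fallback(
    shape_result: Option<TranspilerHelperKind>,
    name: &str,
    init_is_function: bool,
) -> Option<TranspilerHelperKind> {
    shape_result.or_else(|| kind_from_generated_name(name, init_is_function))
}
```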

### TypeScript/tslib channel

`ts_helpers.rs` tracks raw `TsHelperKind` values separately. Rules like `UnAsyncAwait` match detected `__awaiter` / `__generator` aliases directly rather than renaming to canonical globals. `is_helper_callee` also resolves `tslib` namespace members and `require("tslib").__awaiter` patterns.

## `TranspilerHelperKind` coverage

| `TranspilerHelperKind` | Babel | tslib | SWC | Restoration rule |
|---|---|---|---|---|
| `InteropRequireDefault` | `_interopRequireDefault` | — | `_interop_require_default` | `UnInteropRequireDefault` |
| `InteropRequireWildcard` | `_interopRequireWildcard` | — | `_interop_require_wildcard` | `UnInteropRequireWildcard` |
| `ToConsumableArray` | `_toConsumableArray` | `__spreadArray` | `_to_consumable_array` | `UnToConsumableArray` |
| `Extends` | `_extends` | `__assign` | `_extends` | (inline in spread/rest rules) |
| `SlicedToArray` | `_slicedToArray` | `__read` | `_sliced_to_array` | `UnSlicedToArray` |
| `ObjectSpread` | `_objectSpread` / `_objectSpread2` | — | `_object_spread` / `_object_spread_props` | `UnObjectSpread` |
| `ObjectWithoutProperties` | `_objectWithoutProperties` | `__rest` | `_object_without_properties` | `UnObjectRest` |
| `ClassCallCheck` | `_classCallCheck` | — | `_class_call_check` | `UnClassCallCheck` |
| `AsyncToGenerator` | `_asyncToGenerator` | `__awaiter`+`__generator` | `_async_to_generator` | `UnAsyncAwait` |
| `TaggedTemplateLiteral` | `_taggedTemplateLiteral` | — | `_tagged_template_literal` | `UnTemplateLiteral` |
| `HelperDependency` | `_defineProperty`, `ownKeys`, … | — | sub-helpers | Removed with parent helper |

Kinds without a dedicated rewrite rule (`DefineProperty`, `Typeof`, `HelperDependency`) still participate in `is_helper_module` detection for cross-module fact collection.

## Pipeline placement

Helper detection and restoration run in the **`Helpers` stage** of `apply_rules()`, after **Syntax** normalization. Syntax-stage (Stage 1) rules such as `UnIndirectCall` and `UnBracketNotation` must run first so that patterns such as `(0, x.default)()` and `["default"]` are normalized before the matchers run.

```text
Syntax stage          Helpers stage                    Structural stage
─────────────────     ─────────────────────────────    ─────────────────
UnIndirectCall    →   UnInteropRequireDefault      →   UnTemplateLiteral
UnBracketNotation →   UnToConsumableArray          →   UnAsyncAwait
                      UnObjectSpread / Rest
                      UnSlicedToArray
                      … UnEsm (requires webpack interop)
                      UnObjectSpread2 / Rest2 / SlicedToArray2
```

`UnInteropRequireDefault` explicitly requires `UnIndirectCall` and `UnBracketNotation`. Late helper passes (`UnObjectSpread2`, `UnObjectRest2`, `UnSlicedToArray2`) run after `UnEsm` converts `require()` to `import`.

## Restoration flow

Each helper kind has a dedicated `VisitMut` rule. The typical cycle:

<Steps>
<Step title="Detect bindings">
`LocalHelperContext` identifies helper `(BindingKey, kind)` pairs.
</Step>
<Step title="Rewrite call sites">
The rule rewrites helper invocations to idiomatic syntax. For example, `_interopRequireDefault(require("a"))` becomes `require("a")`, and the subsequent `.default` accesses become direct references.
</Step>
<Step title="Remove declarations">
`remove_helpers_with_dependencies` drops helper functions and transitive `HelperDependency` bindings when no external references remain.
</Step>
</Steps>

Reference tracking uses `remaining_refs_outside_declarations` to exclude the helper's own binding and self-references from the "still in use" count.
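
The counting rule can be sketched as a filter over identifier occurrences. The `IdentUse` record and `can_remove_helper` are hypothetical, and the signature of `remaining_refs_outside_declarations` is an assumption made for the example.

```rust
// Hypothetical simplification: (symbol, syntax context) as (String, u32).
type BindingKey = (String, u32);

// Hypothetical flattened view of one identifier occurrence in the module.
struct IdentUse {
    key: BindingKey,
    // True for the declaration name and for self-references in the helper body.
    inside_helper_decl: bool,
}

/// Count references to `helper` that live outside its own declaration.
fn remaining_refs_outside_declarations(uses: &[IdentUse], helper: &BindingKey) -> usize {
    uses.iter()
        .filter(|u| &u.key == helper && !u.inside_helper_decl)
        .count()
}

/// A helper declaration is safe to drop once nothing outside it refers to it.
fn can_remove_helper(uses: &[IdentUse], helper: &BindingKey) -> bool {
    remaining_refs_outside_declarations(uses, helper) == 0
}
```

A recursive helper therefore does not keep itself alive: its self-references sit inside the declaration and are excluded from the count.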

## Cross-module helper facts

`collect_module_facts()` records two export channels and one module-level flag after Stage 2:

| Field | Content |
|---|---|
| `helper_exports` | Semantic helpers as public `HelperKind` |
| `ts_helper_exports` | Raw TypeScript helpers as `TypeScriptHelperKind` |
| `is_helper_module` | True when module exports any recognized helper (including dependency-only kinds) |

Phase 2 rules (e.g. `UnSlicedToArray` with `ModuleFactsMap`) resolve cross-module helper refs when a consumer imports from a hoisted helper module. See the Cross-module facts page for the two-phase unpack barrier.

## Version drift and relaxed matching

Bundled code often strips version markers, and inlined helpers erase them entirely. Wakaru therefore uses **relaxed matching**: matchers check essential semantic structure, not exact AST equality.

Tolerated variation includes:

- Ternary vs if/else conditional forms
- `.default` vs `["default"]` property access (after `UnBracketNotation`)
- Extra `Object.defineProperty` for non-configurable exports
- Added null checks

If a future helper version fundamentally changes semantics, it needs a new matcher — not a version gate.

## What not to build

<Info>
A corpus matcher, ast-grep-style DSL, and skeleton-pattern engine were prototyped, measured against real bundles, and **reverted**. ~93% of detection is marker-based or stateful and cannot be expressed as fixed patterns; the migratable remainder (~10–14 of ~209 functions) is too small for a shared engine to pay off.
</Info>

Rejected approaches:

| Approach | Why rejected |
|---|---|
| Custom IR layer | SWC AST is already high-level; second IR duplicates cost |
| CFG hashing | Minifier changes scramble naive hashes; canonicalization is the hard part |
| Version auto-detect | Version strings stripped in real bundles |
| Configurable pass graphs | Current fixed-order pipeline is intentional |

Real line savings come from **targeted deduplication** (e.g. consolidating inline-vs-declaration detection), not a generic matcher framework.

## Add a new helper matcher

<Steps>
<Step title="Write a failing test">
Add cases to `crates/core/tests/` or `transpiler_helper_utils/tests.rs` with the exact input AST shape.
</Step>
<Step title="Implement the body-shape predicate">
Add `is_my_helper_fn(func: &Function) -> bool` in `matchers.rs`. Use `MatchContext::from_params` when multiple params must correlate. Use `scan_stmts_for_markers` when signals appear anywhere in the body.
</Step>
<Step title="Wire detection">
Register in `detect_helper_from_fn` and, if applicable, `detect_helper_from_path` path tables.
</Step>
<Step title="Add restoration rule">
Create a `VisitMut` rule and register it in `pipeline.rs` under the `Helpers` stage at the correct dependency position.
</Step>
<Step title="Verify">
Run focused rule tests, then pipeline snapshots (`noop_pipeline`, webpack4/esbuild unpack tests).
</Step>
</Steps>

<Check>
Every binding match must gate on `SyntaxContext`. Use `BindingRenamer` for renames — never rename by `sym` alone.
</Check>

## Related pages

<CardGroup>
<Card title="Decompile pipeline" href="/decompile-pipeline">
Stage ordering, `unresolved_mark` scope gating, and where helper rules sit relative to syntax normalization.
</Card>
<Card title="Cross-module facts" href="/cross-module-facts">
Two-phase unpack barrier, `ModuleFactsMap`, and hoisted helper module resolution.
</Card>
<Card title="Rule pipeline reference" href="/rule-pipeline-reference">
Full `RuleDescriptor` registry with `Helpers` stage entries and cross-rule dependencies.
</Card>
<Card title="Develop transformation rules" href="/develop-rules">
Test-first workflow, pipeline placement, and definition-of-done checklist for new rules.
</Card>
<Card title="Debug regressions" href="/debugging-regressions">
Bisect helper-rule regressions with `--trace-rules` and snapshot diff workflow.
</Card>
</CardGroup>
