# Wakaru Documentation

> Reference for Wakaru's JavaScript decompiler and bundle splitter: CLI flags, Rust and WASM APIs, unpack formats, rewrite levels, rule pipeline, and contributor workflows.

## Context Links

- [Agent index](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/llms.txt)
- [Human interactive docs](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b)
- [GitHub repository](https://github.com/pionxzh/wakaru)

## Repository Metadata

- Repository: pionxzh/wakaru

- Generated: 2026-06-28T01:17:23.575Z
- Updated: 2026-06-28T01:23:52.756Z
- Runtime: Grok CLI
- Format: Documentation
- Pages: 24

## Page Index

- 01. [Overview](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/01-overview.md) - What Wakaru decompiles and unpacks, the three-crate workspace layout (core, CLI, WASM), primary entry points, and the shortest path from input JavaScript to readable output.
- 02. [Installation](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/02-installation.md) - Install via npm optional platform packages, npx, or GitHub release binaries; Rust toolchain requirements for building from source; Node engine constraints.
- 03. [Quickstart](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/03-quickstart.md) - First successful runs: decompile a single file, unpack a bundle to a directory, verify stdout vs -o output, and expected success signals for each mode.
- 04. [Decompile pipeline](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/04-decompile-pipeline.md) - Single-file flow: parse, resolver marks, staged rule application, optional source-map rename, fixer, emit; parallel execution during unpack; unresolved_mark scope gating.
- 05. [Bundle formats and unpacking](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/05-bundle-formats-and-unpacking.md) - Detection order and BundleFormat variants (webpack4/5, browserify, SystemJS, esbuild/Bun, AMD, scope-hoisted); raw vs full unpack; multi-file and directory scan semantics.
- 06. [Rewrite levels and assumptions](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/06-rewrite-levels-and-assumptions.md) - RewriteLevel (minimal, standard, aggressive), DceMode and --dce behavior, named rewrite assumptions (e.g. no_document_all), and reproduce-first policy for new heuristics.
- 07. [Helper detection](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/07-helper-detection.md) - How transpiler helpers (Babel, TypeScript/tslib, SWC) are matched by AST body shape across imported, inlined, hoisted, and minified forms; MatchContext and helper lifecycle layers.
- 08. [Cross-module facts](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/08-cross-module-facts.md) - Two-phase unpack barrier: Phase 1 fact collection after UnEsm, ModuleFactsMap shape, and Phase 2 rules (namespace_decomposition, cross-module helper refs) that read other modules' import/export facts.
- 09. [Unpack bundles](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/09-unpack-bundles.md) - Operational guide for --unpack modes (auto vs strict), --raw extraction, multi-file entry+chunk inputs, directory scanning rules, and --force overwrite protection.
- 10. [Use source maps](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/10-use-source-maps.md) - Provide --source-map for identifier recovery and import dedup, emit decompiled maps with --emit-source-map, extract embedded sources via wakaru extract, and pipeline ordering constraints.
- 11. [JSON output and CI integration](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/11-json-output-and-ci-integration.md) - Machine-readable --json stdout schema for decompile and unpack, warning kinds and is_error flags, elapsed_ms timing, and piping patterns for automation pipelines.
- 12. [WASM and playground](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/12-wasm-and-playground.md) - Build wakaru-wasm for the browser playground, decompile/unpack JS bindings, TypeScript result types, and Vite integration for the online demo.
- 13. [Develop transformation rules](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/13-develop-transformation-rules.md) - Add or modify VisitMut rules: test-first workflow, pipeline placement in RuleDescriptor order, unresolved_mark guards, BindingRenamer for renames, and definition-of-done verification checklist.
- 14. [Trace the rule pipeline](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/14-trace-the-rule-pipeline.md) - Use debug trace (and --profile / --profile-rules) to bisect single-file regressions with per-rule diffs, --from/--until ranges, and limitations for bundle unpack debugging.
- 15. [CLI reference](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/15-cli-reference.md) - Complete wakaru command surface: global flags, unpack modes, subcommands (extract, debug trace/normalize), stdin/stdout behavior, formatter, diagnostics, and profiling options.
- 16. [Core API reference](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/16-core-api-reference.md) - Exported wakaru-core functions (decompile, unpack, unpack_files, unpack_raw, trace_rules), DecompileOptions fields, DceMode, UnpackOutput warnings, and RewriteLevel defaults.
- 17. [WASM API reference](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/17-wasm-api-reference.md) - wasm_bindgen exports decompile, unpack, and ruleNames; parameter types, WakaruDecompileResult and WakaruUnpackResult JSON shapes, and warning kind strings.
- 18. [Rule pipeline reference](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/18-rule-pipeline-reference.md) - Ordered RuleDescriptor registry, RuleStage groupings, rule_names() identifiers, RulePipelineOptions ranges, and documented cross-rule dependencies from the inventory.
- 19. [Webpack bundle recipe](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/19-webpack-bundle-recipe.md) - End-to-end workflow using webpack4 and webpack5 testcases: build fixtures, run wakaru --unpack, compare against dist/*.pretty.js reference output, and multi-chunk inputs.
- 20. [Esbuild and Browserify recipe](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/20-esbuild-and-browserify-recipe.md) - Unpack browserify standalone bundles and esbuild/Bun scope-hoisted output: detection markers (__export, __commonJS), strict vs auto heuristic split, and testcase verification commands.
- 21. [Troubleshooting](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/21-troubleshooting.md) - Common failure modes: overwrite protection, unpack directory skip behavior, UnpackWarningKind codes, TDZ and parse-recovery warnings, formatter failures, and bug report fields.
- 22. [Debug regressions](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/22-debug-regressions.md) - Investigate snapshot drift, raw vs final webpack4 layers, rule trace bisection, profile export, and symptom-to-cause mapping (unresolved_mark, early-rule cascades).
- 23. [Contributing](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/23-contributing.md) - Fork-and-branch workflow, required cargo fmt/clippy/test checks, conventional commits, areas where contributions are most valuable, and links to architecture and testing docs.
- 24. [Testing and snapshots](https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/24-testing-and-snapshots.md) - cargo nextest vs cargo test, insta snapshot workflow, required pipeline test binaries, rule-level test patterns, Test262 round-trip coverage, and pre-commit verification matrix.

## Source File Index

- `.cargo/config.toml`
- `.config/nextest.toml`
- `.github/ISSUE_TEMPLATE/bug_report.yml`
- `.github/workflows/playground.yml`
- `.github/workflows/rust-ci.yml`
- `.github/workflows/rust-release.yml`
- `AGENTS.md`
- `Cargo.toml`
- `CONTRIBUTING.md`
- `crates/cli/src/discovery.rs`
- `crates/cli/src/formatter.rs`
- `crates/cli/src/json_output.rs`
- `crates/cli/src/main.rs`
- `crates/cli/src/output.rs`
- `crates/core/src/driver.rs`
- `crates/core/src/driver/types.rs`
- `crates/core/src/driver/unpack.rs`
- `crates/core/src/facts.rs`
- `crates/core/src/lib.rs`
- `crates/core/src/namespace_decomposition.rs`
- `crates/core/src/rules/cross_module_helper_refs.rs`
- `crates/core/src/rules/helper_matcher.rs`
- `crates/core/src/rules/import_dedup.rs`
- `crates/core/src/rules/match_context.rs`
- `crates/core/src/rules/mod.rs`
- `crates/core/src/rules/pipeline.rs`
- `crates/core/src/rules/rename_utils.rs`
- `crates/core/src/rules/transpiler_helper_utils/mod.rs`
- `crates/core/src/sourcemap_rename.rs`
- `crates/core/src/tdz_check.rs`
- `crates/core/src/unpacker/browserify.rs`
- `crates/core/src/unpacker/esbuild.rs`
- `crates/core/src/unpacker/mod.rs`
- `crates/core/src/unpacker/scope_hoist.rs`
- `crates/core/src/unpacker/webpack4.rs`
- `crates/core/src/unpacker/webpack5.rs`
- `crates/formatter/src/lib.rs`
- `crates/wasm/src/lib.rs`
- `docs/architecture.md`
- `docs/debugging.md`
- `docs/fact-system.md`
- `docs/helper-detection.md`
- `docs/learnings/helper-detection-pattern-engine.md`
- `docs/releasing.md`
- `docs/rewrite-assumptions.md`
- `docs/rule-dependency-inventory.md`
- `docs/test262-roundtrip.md`
- `docs/testing.md`
- `npm/bin/wakaru`
- `npm/package.json`
- `playground/package.json`
- `playground/scripts/build-wasm.mjs`
- `playground/vite.config.ts`
- `README.md`
- `scripts/correctness/test262-roundtrip.mjs`
- `testcases/browserify/dist/index.js`
- `testcases/browserify/README.md`
- `testcases/webpack4/dist/index.js`
- `testcases/webpack4/README.md`
- `testcases/webpack4/webpack.config.js`
- `testcases/webpack5/dist/index.js`
- `testcases/webpack5/README.md`
- `testcases/webpack5/webpack.config.mjs`

---

## 01. Overview

> What Wakaru decompiles and unpacks, the three-crate workspace layout (core, CLI, WASM), primary entry points, and the shortest path from input JavaScript to readable output.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/01-overview.md
- Generated: 2026-06-28T01:05:15.031Z

### Source Files

- `README.md`
- `docs/architecture.md`
- `crates/core/src/lib.rs`
- `crates/core/src/driver.rs`
- `crates/cli/src/main.rs`
- `crates/wasm/src/lib.rs`


Wakaru is a Rust/SWC-based JavaScript decompiler and bundle splitter: `wakaru-core` owns parsing, format detection, ~60 AST rewrite rules, and emission; `wakaru-cli` and `wakaru-wasm` are thin frontends over the same `decompile` and `unpack` APIs.

## What Wakaru transforms

Production JavaScript often passes through three layers before it reaches Wakaru:

| Layer | Typical tools | What Wakaru reverses |
|-------|---------------|----------------------|
| Bundlers | webpack 4/5, Browserify, SystemJS, esbuild, Bun, Rollup/Vite | Splits collapsed modules, removes runtime wrappers, recovers `import`/`export` |
| Transpilers | Babel, TypeScript/tslib, SWC | Unwraps helper functions (`__extends`, `__awaiter`, tslib imports, etc.) and restores modern syntax |
| Minifiers | Terser and bundler-integrated minification | Expands booleans, template literals, control flow, and identifier patterns |

Wakaru handles all three layers in a single run. Feed it a minified module or a production bundle; it returns readable, ESNext-style JavaScript.

## Two operations

| Operation | Input | Output | Core API |
|-----------|-------|--------|----------|
| **Decompile** | One `.js`/`.mjs`/`.cjs` file (or stdin) | One readable file (or stdout) | `decompile(source, DecompileOptions)` |
| **Unpack + decompile** | One or more bundle/chunk files, or a directory scan | Many module files under an output directory | `unpack`, `unpack_files`, `unpack_raw`, `unpack_files_raw` |

Unpackers detect the bundle format and extract raw module code strings. The driver then runs the decompile rule pipeline on each module. With `--raw`, extraction stops before readability rules run.

<Note>
Directory inputs require `--unpack`. Wakaru recursively scans `.js`, `.mjs`, and `.cjs` files, skips hidden paths and `node_modules`, and includes only files detected as bundles or chunks. Explicit file inputs fall back to single-file decompile when no bundle format matches.
</Note>
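The path-level part of that scan can be sketched as a small predicate. This is a minimal illustration, assuming only the rules stated in the note; `is_scan_candidate` is a hypothetical name, not a function from `crates/cli/src/discovery.rs`, and the real scan additionally keeps only files that bundle detection recognizes.

```rust
use std::path::{Component, Path};

/// Hypothetical helper (not Wakaru's actual code): decides whether a path
/// found during a recursive scan should be handed to bundle detection.
fn is_scan_candidate(path: &Path) -> bool {
    // Skip anything that is hidden or sits under a hidden directory
    // or `node_modules`.
    let skipped = path.components().any(|c| match c {
        Component::Normal(name) => {
            let name = name.to_string_lossy();
            name.starts_with('.') || name == "node_modules"
        }
        _ => false,
    });
    if skipped {
        return false;
    }
    // Keep only the three JavaScript extensions the scan looks at.
    matches!(
        path.extension().and_then(|e| e.to_str()),
        Some("js" | "mjs" | "cjs")
    )
}
```

Under these assumptions, `is_scan_candidate(Path::new("dist/app.mjs"))` is accepted, while paths such as `dist/node_modules/x/index.js` or `dist/.cache/app.js` are rejected before any parsing happens.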

## Supported bundle formats

Detection runs in fixed order; first match wins. `BundleFormat` variants:

| Order | Format | Identifier |
|-------|--------|------------|
| 1 | webpack 5 entry/runtime | `webpack5` |
| 2 | webpack 4 | `webpack4` |
| 3 | webpack 5 JSONP chunk | `webpack5` |
| 4 | Browserify standalone | `browserify` |
| 5 | SystemJS `System.register` | `systemjs` |
| 6 | esbuild / Bun (CJS helpers) | `esbuild` |
| 7 | AMD `define` | `amd` |
| — | Scope-hoisted ESM (heuristic, `--unpack=auto`) | `scope-hoisted` |

Pure ESM scope-hoisted output without `__export` or `__commonJS` markers may not structurally unpack; it falls through to single-file decompile unless heuristic splitting is enabled.
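The "fixed order, first match wins" behavior can be modeled as an ordered list of detectors. The sketch below is an assumption-heavy illustration: the `Format` enum stands in for the real `BundleFormat`, and the substring checks are crude placeholders chosen for readability, not the structural checks Wakaru performs. The heuristic scope-hoisted fallback is left out.

```rust
/// Stand-in for Wakaru's `BundleFormat`; variant names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Format {
    Webpack5,
    Webpack4,
    Browserify,
    SystemJs,
    Esbuild,
    Amd,
}

type Detector = fn(&str) -> bool;

// Placeholder probes: plain substring checks, assumed for illustration only.
fn webpack5_runtime(s: &str) -> bool { s.contains("__webpack_modules__") }
fn webpack4(s: &str) -> bool { s.contains("__webpack_require__") }
fn webpack5_chunk(s: &str) -> bool { s.contains("webpackChunk") }
fn browserify(s: &str) -> bool { s.contains("MODULE_NOT_FOUND") }
fn systemjs(s: &str) -> bool { s.contains("System.register(") }
fn esbuild(s: &str) -> bool { s.contains("__commonJS") }
fn amd(s: &str) -> bool { s.contains("define(") }

/// Detectors listed in the documented order (rows 1 to 7 of the table).
const DETECTORS: [(Format, Detector); 7] = [
    (Format::Webpack5, webpack5_runtime),
    (Format::Webpack4, webpack4),
    (Format::Webpack5, webpack5_chunk),
    (Format::Browserify, browserify),
    (Format::SystemJs, systemjs),
    (Format::Esbuild, esbuild),
    (Format::Amd, amd),
];

/// Returns the format of the first detector that matches, or `None`
/// when the input should fall through to single-file decompile.
fn detect(source: &str) -> Option<Format> {
    DETECTORS
        .iter()
        .find(|(_, probe)| probe(source))
        .map(|(format, _)| *format)
}
```

Because the list is walked in order, an input that satisfies both the first and second probe is reported as `Format::Webpack5`; later detectors never run for it.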

## Workspace layout

The Cargo workspace (`members = ["crates/*"]`) centers on three publishable surfaces plus a shared formatter:

:::files
crates/
  core/          # wakaru-core — parser, unpackers, rules, driver, facts
    src/
      lib.rs           — public API re-exports
      driver/          — decompile, unpack, trace orchestration
      rules/           — ~60 VisitMut transformation rules
      unpacker/        — per-format bundle splitters
      facts.rs         — cross-module import/export facts
  cli/           # wakaru-cli — `wakaru` binary (clap + rayon I/O)
    src/main.rs
  wasm/          # wakaru-wasm — wasm-bindgen bindings for browser playground
    src/lib.rs
  formatter/     # wakaru-formatter — optional oxc post-format pass (CLI + WASM)
:::

| Crate | Package | Role |
|-------|---------|------|
| `core` | `wakaru-core` | All transformation logic; consumed by CLI, WASM, and tests |
| `cli` | `wakaru-cli` / `@wakaru/cli` on npm | Command-line entry, file I/O, JSON output, profiling |
| `wasm` | `wakaru-wasm` | Browser bindings: `decompile`, `unpack`, `ruleNames` |
| `formatter` | `wakaru-formatter` | Optional `--formatter` pass after decompilation |

## Architecture

```mermaid
flowchart TB
    subgraph inputs [Inputs]
        SF[single file]
        BD[bundle / chunk / directory]
    end

    subgraph cli_wasm [Frontends]
        CLI[wakaru-cli]
        WASM[wakaru-wasm]
    end

    subgraph core [wakaru-core]
        DRV[driver]
        UNP[unpacker]
        RUL[rules pipeline]
        FAC[facts + cross-module pass]
    end

    SF --> CLI
    BD --> CLI
    SF --> WASM
    BD --> WASM
    CLI --> DRV
    WASM --> DRV

    BD --> UNP
    UNP -->|module strings| DRV
    SF --> DRV

    DRV -->|parse → resolver → rules| RUL
    RUL -->|unpack only| FAC
    FAC --> RUL
    RUL -->|fixer → emit| OUT[readable JS]
```

**Single-file decompile** (`driver/single_file.rs`):

```
parse → resolver(unresolved_mark) → apply_rules → [optional source-map rename] → fixer → print
```

**Unpack + decompile** (`driver/unpack.rs`) adds a two-phase parallel pipeline after `unpack_bundle`:

1. **Phase 1** — parse each module, run rules through `UnEsm`, collect cross-module facts.
2. **Phase 2** — re-parse, cross-module late pass (namespace decomposition, re-export consolidation), run remaining rules, emit.

Phase 1 and Phase 2 each parse modules independently so `SyntaxContext` stays continuous within the emitted pipeline.
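The barrier between the two phases can be illustrated with a toy model. Everything here is hypothetical: `ModuleFacts`, `collect_facts`, `run_phase_one`, and `exported_by` are invented names, and the "parser" only recognizes lines starting with `export const`. The point is the shape of the flow, where facts for every module are gathered before any module consults another module's facts.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily simplified stand-in for per-module facts.
#[derive(Debug, Default, Clone, PartialEq)]
struct ModuleFacts {
    exports: Vec<String>,
}

/// Phase 1 (toy version): inspect one module in isolation and record
/// the names it exports via `export const NAME`.
fn collect_facts(source: &str) -> ModuleFacts {
    let exports = source
        .lines()
        .filter_map(|line| line.trim().strip_prefix("export const "))
        .filter_map(|rest| rest.split(|c: char| !c.is_alphanumeric() && c != '_').next())
        .filter(|name| !name.is_empty())
        .map(str::to_string)
        .collect();
    ModuleFacts { exports }
}

/// Barrier: Phase 1 finishes for every module before Phase 2 starts.
fn run_phase_one(modules: &[(&str, &str)]) -> HashMap<String, ModuleFacts> {
    modules
        .iter()
        .map(|(id, src)| (id.to_string(), collect_facts(src)))
        .collect()
}

/// Phase 2 lookup: a module asks which names another module exports.
fn exported_by<'a>(facts: &'a HashMap<String, ModuleFacts>, module: &str) -> &'a [String] {
    facts.get(module).map(|f| f.exports.as_slice()).unwrap_or(&[])
}
```

A caller would build the map once with `run_phase_one`, then pass it read-only to each Phase 2 job, which is what makes the second phase safe to run in parallel in this model.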

## Primary entry points

### CLI (`wakaru`)

The `wakaru` binary in `crates/cli/src/main.rs` routes three top-level paths:

| Path | Trigger | Behavior |
|------|---------|----------|
| Default | `wakaru input.js` | Single-file decompile |
| Unpack | `wakaru bundle.js --unpack -o out/` | Bundle split + per-module decompile |
| Subcommands | `wakaru extract`, `wakaru debug trace` | Source-map extraction; rule-pipeline debugging |

<ParamField body="--unpack" type="auto | strict">
`auto` (default): structural detection plus heuristic scope-hoisted fallback. `strict`: structural detectors only, no heuristic fallback.
</ParamField>

<ParamField body="--level" type="minimal | standard | aggressive" default="standard">
Controls rewrite aggressiveness. Rules gate risky subpatterns internally rather than skipping entire rules.
</ParamField>

<ParamField body="--raw" type="flag">
With `--unpack`, write the extractor's output as-is, skipping the decompile rule pipeline.
</ParamField>

### Core API (`wakaru-core`)

Public exports from `crates/core/src/lib.rs`:

<ResponseField name="decompile" type="(source, DecompileOptions) → DecompileOutput">
Single-file transformation. Options include `sourcemap`, `dce_mode`, `level`, `diagnostics`, `emit_source_map`.
</ResponseField>

<ResponseField name="unpack" type="(source, DecompileOptions) → UnpackOutput">
Detect bundle format, split modules, run two-phase decompile pipeline.
</ResponseField>

<ResponseField name="unpack_files" type="(Vec&lt;UnpackInput&gt;, DecompileOptions) → UnpackOutput">
Multi-source unpack (entry + chunks). Merges module sets and stabilizes webpack numeric IDs across files.
</ResponseField>

<ResponseField name="trace_rules" type="(source, options, RuleTraceOptions) → Vec&lt;RuleTraceEvent&gt;">
Per-rule before/after snapshots for single-file bisection.
</ResponseField>

### WASM API (`wakaru-wasm`)

`wasm_bindgen` exports mirror the core driver:

- `decompile(source, level?, sourcemap?, diagnostics?, formatter?, emitSourceMap?)` → `WakaruDecompileResult`
- `unpack(source, level?, heuristicSplit?, diagnostics?, formatter?, emitSourceMap?)` → `WakaruUnpackResult`
- `ruleNames()` → `string[]`

These power the [online playground](https://wakaru.vercel.app/playground).

## Shortest path to readable output

<Steps>
<Step title="Install or run without install">

<CodeGroup>
```bash title="npx (no install)"
npx @wakaru/cli input.js -o output.js
```

```bash title="global npm"
npm install -g @wakaru/cli@latest
wakaru input.js -o output.js
```

```bash title="cargo (from source)"
cargo run -p wakaru-cli -- input.js -o output.js
```
</CodeGroup>

</Step>

<Step title="Decompile a single file">

```bash
wakaru input.js -o output.js
```

Without `-o`, output goes to stdout. Stdin works via pipe or `-`:

```bash
cat input.js | wakaru > output.js
```

**Verification:** stderr shows warnings (if any); the output file contains expanded syntax, unwrapped transpiler helpers, and readable identifiers at the default `standard` level.

</Step>

<Step title="Unpack a production bundle">

```bash
wakaru bundle.js --unpack -o out/
```

For a build output directory:

```bash
wakaru dist/ --unpack -o out/
```

**Verification:** stderr reports `detected: webpack5` (or matching format) and `total: N module(s)`. `out/` contains one file per recovered module.

</Step>

<Step title="Optional enhancements">

```bash
# Recover original names from a source map
wakaru input.js --source-map input.js.map -o output.js

# Stronger readability (may alter edge-case behavior)
wakaru input.js --level aggressive -o output.js

# Machine-readable summary for CI
wakaru bundle.js --unpack --json -o out/
```

</Step>
</Steps>

## Rule pipeline at a glance

`apply_rules` in `crates/core/src/rules/pipeline.rs` runs ~60 `VisitMut` rules in a fixed order across six stages:

1. Syntax normalization (sequences, booleans, bracket notation)
2. Transpiler helper unwrapping + module reconstruction (`UnEsm`, webpack interop)
3. Structural restoration (template literals, spread, optional chaining)
4. Complex patterns (IIFEs, classes, JSX, regenerator, async/await)
5. Modernization (`let`/`const`, arrow functions, `for..of`)
6. Cleanup (import dedup, smart inline/rename, optional dead-code removal)

Rules that match identifiers by name **must** gate on `unresolved_mark` from the SWC resolver to avoid renaming inner-scope bindings. See the decompile pipeline page for stage detail and the cross-module barrier during unpack.
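Why the gate matters can be shown with a toy identifier model. This is a sketch under stated assumptions: `Ident`, `UNRESOLVED`, and `is_global_require` are hypothetical, the integer `ctxt` stands in for SWC's `SyntaxContext`, and `require` is just an example of a name a rule might match.

```rust
/// Toy identifier: a name plus the scope context a resolver assigned to it.
/// In SWC this role is played by `SyntaxContext`; here it is a plain integer.
#[derive(Debug, Clone, PartialEq)]
struct Ident {
    sym: String,
    ctxt: u32,
}

/// Hypothetical context value meaning "no binding found in any scope",
/// i.e. the identifier refers to a global.
const UNRESOLVED: u32 = 0;

/// A name-based rule should fire only for the free (global) `require`,
/// never for a local binding that happens to share the name.
fn is_global_require(id: &Ident) -> bool {
    id.sym == "require" && id.ctxt == UNRESOLVED
}
```

With this check, a minified local variable or parameter that was renamed to `require` carries a different context and is left alone, while the genuine global reference is rewritten.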

<Info>
Default dead-code behavior (`DceMode::TransformOnly`) removes only transform-induced dead code. Pass `--dce` for a full reachability sweep.
</Info>

## Distribution surfaces

| Surface | Package / artifact | Typical use |
|---------|-------------------|-------------|
| npm | `@wakaru/cli` | Local CLI via `npx` or global install |
| GitHub Releases | Pre-built binaries | CI or environments without Node |
| WASM | Built from `crates/wasm` | Browser playground and JS integrations |
| Rust crate | `wakaru-core` (path dep) | Custom tooling, tests, rule development |

## Related pages

<CardGroup>
<Card title="Quickstart" href="/quickstart">
First successful runs for decompile and unpack modes with expected success signals.
</Card>
<Card title="Installation" href="/installation">
npm optional platform packages, release binaries, and Rust toolchain for source builds.
</Card>
<Card title="Decompile pipeline" href="/decompile-pipeline">
Parse, resolver marks, staged rules, source-map rename ordering, and `unresolved_mark` gating.
</Card>
<Card title="Bundle formats and unpacking" href="/bundle-formats-and-unpacking">
Detection order, raw vs full unpack, multi-file and directory scan semantics.
</Card>
<Card title="CLI reference" href="/cli-reference">
Complete `wakaru` command surface, flags, and subcommands.
</Card>
<Card title="Core API reference" href="/core-api-reference">
`DecompileOptions`, `UnpackOutput`, warning kinds, and exported driver functions.
</Card>
</CardGroup>

---

## 02. Installation

> Install via npm optional platform packages, npx, or GitHub release binaries; Rust toolchain requirements for building from source; Node engine constraints.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/02-installation.md
- Generated: 2026-06-28T01:05:25.891Z

### Source Files

- `README.md`
- `npm/package.json`
- `npm/bin/wakaru`
- `Cargo.toml`
- `.github/workflows/rust-release.yml`


`@wakaru/cli` ships a Node launcher (`npm/bin/wakaru`) that resolves a prebuilt native binary from optional platform packages. End users typically install through npm or `npx`; contributors build `wakaru-cli` from the Rust workspace.

## Supported platforms

Prebuilt npm and GitHub release binaries cover four `platform-arch` pairs:

| Platform | npm optional package | GitHub release artifact |
|----------|----------------------|-------------------------|
| macOS ARM64 (`darwin-arm64`) | `@wakaru/cli-darwin-arm64` | `wakaru-darwin-arm64.tar.gz` |
| Linux x64 (`linux-x64`) | `@wakaru/cli-linux-x64` | `wakaru-linux-x64.tar.gz` |
| Linux ARM64 (`linux-arm64`) | `@wakaru/cli-linux-arm64` | `wakaru-linux-arm64.tar.gz` |
| Windows x64 (`win32-x64`) | `@wakaru/cli-win32-x64` | `wakaru-win32-x64.zip` |

<Warning>
Intel macOS (`darwin-x64`), Windows ARM64, and other unlisted combinations are not distributed on npm. The launcher exits with `Unsupported platform` when no mapping exists. Build from source on those targets, or use a supported host.
</Warning>

## Node.js requirement

<ParamField body="engines.node" type="semver" required>
Minimum Node.js version enforced by `@wakaru/cli`.
</ParamField>

```json
"engines": { "node": ">=16.0.0" }
```

Node is required for the npm entrypoint even though decompilation runs in the native Rust binary. `npx` and global installs both need a compatible Node runtime.

## Install methods

<Tabs>
<Tab title="npm (global)">

Install the launcher and let npm pull the matching optional platform binary:

```bash
npm install -g @wakaru/cli@latest
```

Pin a specific release by replacing `@latest` with a version tag (for example `@1.5.0`).

</Tab>
<Tab title="npx (one-off)">

Run without a global install:

```bash
npx @wakaru/cli input.js -o output.js
npx @wakaru/cli bundle.js --unpack -o out/
```

`npx` downloads `@wakaru/cli` and the platform package for the current host on each invocation (subject to npm cache).

</Tab>
<Tab title="GitHub releases">

Download a prebuilt binary from [GitHub Releases](https://github.com/pionxzh/wakaru/releases). Release tags (`v*`) publish the four archives listed in the platform table.

<Steps>
<Step title="Download the archive for your platform">

Pick `wakaru-darwin-arm64.tar.gz`, `wakaru-linux-x64.tar.gz`, `wakaru-linux-arm64.tar.gz`, or `wakaru-win32-x64.zip`.

</Step>
<Step title="Extract and place on PATH">

<CodeGroup>
```bash title="macOS / Linux"
tar xzf wakaru-<platform>.tar.gz
chmod +x wakaru
sudo mv wakaru /usr/local/bin/   # or another directory on PATH
```

```powershell title="Windows"
Expand-Archive wakaru-win32-x64.zip -DestinationPath .
# Add the folder containing wakaru.exe to PATH
```
</CodeGroup>

</Step>
<Step title="Verify">

```bash
wakaru --help
```

</Step>
</Steps>

</Tab>
</Tabs>

## How npm optional platform packages work

`@wakaru/cli` declares four `optionalDependencies`, one per published platform pair. npm installs only the package whose `os` and `cpu` fields match the host.

```text
@wakaru/cli
├── bin/wakaru          # Node shim (require.resolve + spawn)
└── optionalDependencies
    ├── @wakaru/cli-darwin-arm64   → wakaru
    ├── @wakaru/cli-linux-x64      → wakaru
    ├── @wakaru/cli-linux-arm64    → wakaru
    └── @wakaru/cli-win32-x64      → wakaru.exe
```

At runtime the shim:

1. Reads `process.platform` and `process.arch`.
2. Resolves the platform package binary with `require.resolve`.
3. Spawns it with `stdio: "inherit"` and forwards `process.argv.slice(2)`.
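The lookup in steps 1 and 2 amounts to a four-entry table. The real shim is a Node script; the sketch below models the same mapping in Rust for consistency with the rest of the codebase. `platform_package` is a hypothetical name, and the inputs are assumed to be the strings Node reports in `process.platform` and `process.arch`.

```rust
/// Maps a `process.platform` / `process.arch` pair to the optional
/// package name from the platform table. Returns `None` for hosts the
/// launcher would reject as unsupported.
fn platform_package(platform: &str, arch: &str) -> Option<&'static str> {
    match (platform, arch) {
        ("darwin", "arm64") => Some("@wakaru/cli-darwin-arm64"),
        ("linux", "x64") => Some("@wakaru/cli-linux-x64"),
        ("linux", "arm64") => Some("@wakaru/cli-linux-arm64"),
        ("win32", "x64") => Some("@wakaru/cli-win32-x64"),
        _ => None,
    }
}
```

In this model, `platform_package("darwin", "x64")` returns `None`, which corresponds to the `Unsupported platform` exit described in the warning above.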

If resolution fails (optional install skipped or corrupted `node_modules`), stderr suggests:

```text
Try reinstalling: npm install @wakaru/cli@next
```

<Tip>
In monorepos, ensure `@wakaru/cli` is a direct dependency (or devDependency) so the optional platform package installs for the machine running the command. Hoisting alone does not guarantee the platform tarball is present.
</Tip>

## Build from source

A source build compiles `wakaru-cli` from the workspace root. CI and releases use the **stable** Rust toolchain via `dtolnay/rust-toolchain@stable`; the workspace targets **edition 2021**.

<Steps>
<Step title="Install Rust">

Install [rustup](https://rustup.rs/) and select the stable channel:

```bash
rustup default stable
```

</Step>
<Step title="Clone and build">

```bash
git clone https://github.com/pionxzh/wakaru.git
cd wakaru
cargo build --release -p wakaru-cli
```

The binary lands at `target/release/wakaru` (or `target/release/wakaru.exe` on Windows).

</Step>
<Step title="Run without installing">

```bash
cargo run -p wakaru-cli -- input.js -o output.js
cargo run -p wakaru-cli -- bundle.js --unpack -o out/
```

</Step>
<Step title="Optional: faster test iteration">

For development, `docs/testing.md` documents a `dev-release` profile:

```bash
cargo build --profile dev-release -p wakaru-cli
```

</Step>
</Steps>

### Cross-compilation targets

Release automation builds these Rust triples (matching the npm platform matrix):

| Rust target | npm / release label |
|-------------|---------------------|
| `aarch64-apple-darwin` | `darwin-arm64` |
| `x86_64-unknown-linux-gnu` | `linux-x64` |
| `aarch64-unknown-linux-gnu` | `linux-arm64` |
| `x86_64-pc-windows-msvc` | `win32-x64` |

Example cross-build:

```bash
rustup target add aarch64-unknown-linux-gnu
cargo build --release --target aarch64-unknown-linux-gnu -p wakaru-cli
```

Linux release binaries link against **glibc** (`*-unknown-linux-gnu`). Minimal musl-based images may need a local build rather than the published tarball.

### Development dependencies

Not required to run `wakaru`, but common when hacking on the repo:

| Tool | Purpose |
|------|---------|
| `cargo-nextest` | Faster parallel test runs (`cargo nextest run --workspace`) |
| `cargo-insta` | Review and accept snapshot diffs |
| `wasm-pack` + `wasm32-unknown-unknown` | Build `wakaru-wasm` for the browser playground |

See <Card title="Contributing" href="/contributing">Contributing</Card> and <Card title="Testing and snapshots" href="/testing-and-snapshots">Testing and snapshots</Card> for the full verification matrix.

<Note>
The workspace is not set up for `cargo install wakaru-cli` from crates.io. Use `cargo build -p wakaru-cli`, npm, or GitHub release binaries.
</Note>

## Verify installation

<RequestExample>
```bash
wakaru --help
echo 'var a=1;' | wakaru
```
</RequestExample>

<Check>
A successful install prints CLI help and decompiled output on stdout. The workspace `Cargo.toml`, `npm/package.json`, and the platform packages share one semver version (`1.5.0` at time of writing).
</Check>

## Troubleshooting

| Symptom | Likely cause | Action |
|---------|--------------|--------|
| `Unsupported platform: …` | Host not in the four supported pairs | Build from source or use GitHub releases on a supported machine |
| `Could not find the wakaru binary for …` | Optional platform package missing | Reinstall: `npm install @wakaru/cli@latest` (or `@next` per launcher hint) |
| `engine` warnings from npm | Node &lt; 16 | Upgrade Node to `>=16.0.0` |
| `command not found` after GitHub install, though the binary runs via its full path | Not on PATH | Move `wakaru` / `wakaru.exe` into a PATH directory |

For overwrite protection, unpack skips, and warning codes after install, see <Card title="Troubleshooting" href="/troubleshooting">Troubleshooting</Card>.

## Related pages

<CardGroup>
<Card title="Quickstart" href="/quickstart">
First decompile and unpack commands, stdout vs `-o`, and success signals.
</Card>
<Card title="CLI reference" href="/cli-reference">
Full flag surface, subcommands, and stdin/stdout behavior.
</Card>
<Card title="WASM and playground" href="/wasm-and-playground">
Build `wakaru-wasm` and run the browser demo locally.
</Card>
<Card title="Contributing" href="/contributing">
Fork workflow, `cargo fmt` / clippy / test gates, and where to send PRs.
</Card>
</CardGroup>

---

## 03. Quickstart

> First successful runs: decompile a single file, unpack a bundle to a directory, verify stdout vs -o output, and expected success signals for each mode.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/03-quickstart.md
- Generated: 2026-06-28T01:08:06.079Z

### Source Files

- `README.md`
- `crates/cli/src/main.rs`
- `testcases/webpack5/README.md`
- `testcases/webpack5/dist/index.js`
- `docs/architecture.md`


The `wakaru` CLI exposes two primary operations: **decompile** (transform one JavaScript file through the rule pipeline) and **unpack** (split a bundle into modules, then decompile each). Decompile writes to stdout by default; unpack always requires `-o/--output` as a destination directory.

## Prerequisites

Choose one way to run Wakaru before trying the commands below. Three entry points are supported:

<Tabs>
  <Tab title="npx (no install)">
    ```bash
    npx @wakaru/cli <input> [flags]
    ```
  </Tab>
  <Tab title="npm global">
    ```bash
    npm install -g @wakaru/cli@latest
    wakaru <input> [flags]
    ```
  </Tab>
  <Tab title="cargo (from source)">
    ```bash
    cargo run -p wakaru-cli -- <input> [flags]
    ```
    Run all `cargo` commands from the repository root.
  </Tab>
</Tabs>

<CardGroup>
  <Card title="Installation" href="/installation">
    Platform packages, Node engine constraints, and GitHub release binaries.
  </Card>
  <Card title="Overview" href="/overview">
    What Wakaru decompiles, unpacks, and how the core/cli/wasm crates fit together.
  </Card>
</CardGroup>

## Decompile a single file

Single-file mode runs `decompile()` on one input: parse → resolver marks → ~60 rewrite rules → fixer → emit. No bundle detection runs unless you pass `--unpack`.

<Steps>
  <Step title="Pick an input">
    Pass a file path, pipe stdin, or use `-` explicitly:

    <CodeGroup>
      ```bash File path
      wakaru input.js
      ```
      ```bash Stdin pipe
      cat input.js | wakaru
      ```
      ```bash Explicit stdin
      wakaru - < input.js
      ```
    </CodeGroup>

    If no file is given and stdin is not a terminal, Wakaru reads stdin automatically. If no file is given and stdin is a terminal, it exits with `no input specified; pass a file path or pipe code on stdin`.
  </Step>

  <Step title="Choose stdout or a file">
    <ParamField body="-o, --output" type="path">
      Optional output file. When omitted, decompiled JavaScript is printed to **stdout**.
    </ParamField>

    <CodeGroup>
      ```bash Stdout (default)
      wakaru input.js
      ```
      ```bash Write to file
      wakaru input.js -o output.js
      ```
    </CodeGroup>
  </Step>

  <Step title="Verify success">
    A successful decompile has these signals:

    | Signal | Stdout (`-o` omitted) | File (`-o` set) |
    |--------|----------------------|-----------------|
    | Exit code | `0` | `0` |
    | Primary output | Decompiled JS on stdout | File created at `-o` path |
    | Warnings | Printed to stderr (when not `--json`) | Same |
    | Summary line | None | None |

    <RequestExample>
      ```bash
      echo 'var a=1,b=2;console.log(a+b);' | wakaru
      ```
    </RequestExample>

    <ResponseExample>
      ```javascript
      const a = 1;
      const b = 2;
      console.log(a + b);
      ```
    </ResponseExample>

    Warnings use `warning:` labels on stderr. Error-level warnings (for example parse-recovery failures) print `error:` and cause a non-zero exit.
  </Step>
</Steps>

<Note>
Passing a **directory** without `--unpack` fails with `cannot decompile a directory. Pass a JavaScript file or use --unpack`. Directory inputs are valid only with `--unpack`.
</Note>
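
The input rules in this section can be summarized as a small decision function. The sketch below is illustrative Rust, not the CLI's actual code: `InputSource`, `resolve_input`, and `check_directory_input` are hypothetical names, and the terminal check is passed in as a boolean instead of being queried from the OS.

```rust
/// Where the decompiler reads its source from (hypothetical type).
#[derive(Debug, PartialEq)]
enum InputSource {
    File(String),
    Stdin,
}

/// Mirrors the documented rules:
/// - an explicit path reads that file,
/// - `-` forces stdin,
/// - no argument falls back to stdin only when stdin is piped.
fn resolve_input(arg: Option<&str>, stdin_is_terminal: bool) -> Result<InputSource, String> {
    match arg {
        Some("-") => Ok(InputSource::Stdin),
        Some(path) => Ok(InputSource::File(path.to_string())),
        None if !stdin_is_terminal => Ok(InputSource::Stdin),
        None => Err("no input specified; pass a file path or pipe code on stdin".to_string()),
    }
}

/// Directory inputs are only valid together with `--unpack`.
fn check_directory_input(is_dir: bool, unpack: bool) -> Result<(), String> {
    if is_dir && !unpack {
        Err("cannot decompile a directory. Pass a JavaScript file or use --unpack".to_string())
    } else {
        Ok(())
    }
}
```

Passing the terminal state as an argument keeps the function pure, so each row of the behavior described above can be checked without a real TTY.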

## Unpack a bundle to a directory

Unpack mode detects the bundle format (webpack 4/5, browserify, SystemJS, esbuild/Bun, AMD, or heuristic scope-hoisted), splits the bundle into modules, and decompiles each module in parallel. Output is always written as files under the `-o` directory.

<Steps>
  <Step title="Provide bundle input and output directory">
    <ParamField body="--unpack" type="enum (auto | strict)" required>
      Enables bundle splitting. `--unpack` and `--unpack=auto` (default) use structural detection plus heuristic fallback for scope-hoisted bundles. `--unpack=strict` uses structural detection only.
    </ParamField>

    <ParamField body="-o, --output" type="directory" required>
      Destination directory. **Required** with `--unpack`; omitting it fails immediately with `--unpack requires -o/--output to choose an output directory`.
    </ParamField>

    ```bash
    wakaru bundle.js --unpack -o out/
    ```
  </Step>

  <Step title="Inspect the output layout">
    Wakaru creates one `.js` file per extracted module, preserving relative paths from the bundle (for example `src/a.js`, `entry.js`).

    <RequestExample>
      ```bash
      wakaru testcases/webpack5/dist/index.js --unpack -o /tmp/wakaru-out
      ls /tmp/wakaru-out/src/
      ```
    </RequestExample>

    <ResponseExample>
      ```javascript
      // /tmp/wakaru-out/src/a.js
      class A {
          constructor(){
              this.label = 'a';
          }
          print() {
              console.log('a', this.version);
          }
      }
      class A_A {
          constructor(){
              this.label = 'a_a';
          }
      }
      export { A };
      export { A_A };
      ```
    </ResponseExample>

    :::files
    out/
    ├── entry.js
    └── src/
        ├── 1.js
        ├── a.js
        ├── b.js
        └── ...
    :::
  </Step>

  <Step title="Verify success">
    When stderr is a terminal and `--json` is not set, Wakaru prints a summary to **stderr** (not stdout):

    <ResponseExample>
      ```
      detected: webpack5
      total: 7 module(s) in 30ms
      ```
    </ResponseExample>

    For directory inputs, an additional scan line appears:

    <ResponseExample>
      ```
      scanned: 4 file(s), detected: 4 bundle/chunk file(s), skipped: 0 file(s)
      detected: webpack5, scope-hoisted
      total: 85 module(s) in 1.05s
      ```
    </ResponseExample>

    | Signal | Meaning |
    |--------|---------|
    | Exit code `0` | All modules decompiled without error-level warnings |
    | `detected: <formats>` | Bundle formats matched during detection |
    | `total: N module(s)` | Module files written under `-o` |
    | `(K failed)` suffix | Present when `K` modules had error-level warnings; exit is non-zero |
    | Files on disk | One `.js` per module under the output directory |

    <Warning>
      Summary lines are suppressed when stderr is not a terminal (for example in CI pipes). Use `--json` for machine-readable status in that case.
    </Warning>
  </Step>
</Steps>

### Additional unpack inputs

<CodeGroup>
  ```bash Directory scan
  wakaru dist/ --unpack -o out/
  ```
  ```bash Multiple explicit files
  wakaru entry.js chunk.js --unpack -o out/
  ```
  ```bash Raw extraction (no decompile rules)
  wakaru bundle.js --unpack --raw -o out/
  ```
</CodeGroup>

Directory scans walk the tree recursively, consider `.js`, `.mjs`, and `.cjs` files, skip hidden paths and `node_modules`, and include only files detected as bundles or chunks. Non-matching files are skipped; they are neither copied nor decompiled.

## Stdout vs `-o` behavior

| Mode | `-o` omitted | `-o` set |
|------|--------------|----------|
| Decompile | JS to **stdout** | JS to file; stdout silent |
| Unpack | **Error** — `-o` required | Module files to directory; stdout silent |
| `--json` decompile | JSON to stdout (includes `code`) | JSON to stdout (`code` omitted; file written) |
| `--json` unpack | JSON to stdout | JSON to stdout; modules written to `-o` |

<Info>
With `--json`, structured output always goes to **stdout**, and warnings are folded into the JSON object instead of being printed on stderr. Without `--json`, human-readable summaries and warnings go to **stderr**.
</Info>

### JSON success shapes

<ParamField body="--json" type="flag">
  Emit machine-readable JSON to stdout instead of human-readable summaries.
</ParamField>

**Single-file decompile** (stdout mode — `code` included):

<ResponseExample>
```json
{"code":"const a = 1;\n","warnings":[],"elapsed_ms":10}
```
</ResponseExample>

**Single-file decompile** (with `-o` — `code` omitted, file on disk):

<ResponseExample>
```json
{"warnings":[],"elapsed_ms":9}
```
</ResponseExample>

**Unpack** (`modules` array abbreviated; `total` counts all written modules):

<ResponseExample>
```json
{
  "detected_formats": ["webpack5"],
  "modules": [
    {"filename": "src/1.js"},
    {"filename": "src/a.js"},
    {"filename": "entry.js"}
  ],
  "warnings": [],
  "total": 7,
  "failed": 0,
  "elapsed_ms": 19
}
```
</ResponseExample>

<ResponseField name="total" type="number">
  Module files written to the output directory.
</ResponseField>

<ResponseField name="failed" type="number">
  Modules with error-level warnings. Non-zero `failed` causes a non-zero exit code.
</ResponseField>

<ResponseField name="detected_formats" type="string[]">
  Bundle formats detected (for example `webpack5`, `scope-hoisted`).
</ResponseField>

## Run against repository testcases

The repo ships webpack, browserify, and other fixtures under `testcases/`. Use them to confirm a local build before working on your own bundles.

<CodeGroup>
  ```bash Decompile (cargo)
  cargo run -p wakaru-cli -- testcases/webpack5/dist/index.js -o /tmp/out.js
  ```
  ```bash Unpack webpack5 bundle
  cargo run -p wakaru-cli -- testcases/webpack5/dist/index.js --unpack -o /tmp/unpacked/
  ```
  ```bash Unpack entire dist directory
  cargo run -p wakaru-cli -- testcases/webpack5/dist/ --unpack -o /tmp/unpacked/
  ```
</CodeGroup>

Reference output for webpack5 lives in `testcases/webpack5/dist/index.pretty.js` (prettier-formatted bundle, not per-module unpack output).

## Common first-run failures

| Error | Cause | Fix |
|-------|-------|-----|
| `--unpack requires -o/--output` | Unpack without output directory | Add `-o out/` |
| `no input specified` | No file and stdin is a terminal | Pass a file or pipe stdin |
| `output file … already exists` | `-o` points to an existing file | Delete the file or pass `--force` |
| `output directory … is not empty` | Unpack into a populated directory | Use an empty directory or pass `--force` |
| `errors in N module(s): …` | Error-level warnings during decompile/unpack | Inspect stderr or `--json` warnings; see [Troubleshooting](/troubleshooting) |

<Tip>
Default rewrite level is `standard`. Pass `--level minimal` for near-zero semantic changes or `--level aggressive` for maximum readability. See [Rewrite levels and assumptions](/rewrite-levels-and-assumptions).
</Tip>

## Next

<CardGroup>
  <Card title="Unpack bundles" href="/unpack-bundles">
    Operational guide for `--unpack` modes, `--raw`, multi-file inputs, directory scanning, and `--force`.
  </Card>
  <Card title="Decompile pipeline" href="/decompile-pipeline">
    Parse → rules → fixer flow for single-file decompilation.
  </Card>
  <Card title="CLI reference" href="/cli-reference">
    Complete flag surface, subcommands, stdin/stdout behavior, and profiling options.
  </Card>
  <Card title="Webpack bundle recipe" href="/webpack-bundle-recipe">
    End-to-end workflow with webpack4/webpack5 testcases.
  </Card>
  <Card title="Troubleshooting" href="/troubleshooting">
    Overwrite protection, skip behavior, warning kinds, and bug-report fields.
  </Card>
</CardGroup>

---

## 04. Decompile pipeline

> Single-file flow: parse, resolver marks, staged rule application, optional source-map rename, fixer, emit; parallel execution during unpack; unresolved_mark scope gating.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/04-decompile-pipeline.md
- Generated: 2026-06-28T01:05:18.416Z

### Source Files

- `docs/architecture.md`
- `crates/core/src/driver.rs`
- `crates/core/src/rules/pipeline.rs`
- `crates/core/src/rules/mod.rs`
- `docs/rule-dependency-inventory.md`

---
title: "Decompile pipeline"
description: "Single-file flow: parse, resolver marks, staged rule application, optional source-map rename, fixer, emit; parallel execution during unpack; unresolved_mark scope gating."
---

Wakaru's decompile pipeline turns a single JavaScript module AST into readable ESNext by running a fixed-order rule registry after SWC's resolver. The public entry point is `decompile()` in `crates/core/src/driver/single_file.rs`; bundle unpacking reuses the same rule machinery per module inside a two-phase, rayon-parallel driver in `crates/core/src/driver/unpack/phases.rs`.

## Pipeline overview

```mermaid
flowchart TB
    subgraph single["Single-file decompile()"]
        IN[input string] --> PARSE[parse_js_with_recovery]
        PARSE --> RES[resolver marks]
        RES --> RULES[apply_rules — full registry]
        RULES --> SM{sourcemap bytes?}
        SM -->|yes| RENAME[ImportDedup → apply_sourcemap_renames → UnImportRename]
        SM -->|no| FIX
        RENAME --> FIX[apply_fixer]
        FIX --> EMIT[print_js / print_js_with_srcmap]
        EMIT --> OUT[DecompileOutput]
    end

    subgraph unpack["Unpack multi-module"]
        SPLIT[unpacker extract] --> P1[Phase 1: par_iter until UnEsm + facts]
        P1 --> BARRIER[ModuleFactsMap barrier]
        BARRIER --> P2[Phase 2: par_iter late pass + UnObjectSpread2..UnReturn]
        P2 --> OUT2[UnpackOutput modules]
    end
```

| Mode | Entry function | Rule scope per module |
|------|----------------|----------------------|
| Single file | `decompile(source, DecompileOptions)` | Full registry (`start_from` / `stop_after` optional via trace) |
| Unpack Phase 1 | `unpack_multi_module_with_plan` | `RulePipelineOptions::until("UnEsm")` + fact extraction |
| Unpack Phase 2 | same | `between("UnObjectSpread2", "UnReturn")` + cross-module late pass |
| Rule trace | `trace_rules` | Configurable range; single-file only |

<Note>
Rule tracing rejects bundle inputs. Use normal `decompile` or `unpack` for bundles; see [Trace the rule pipeline](/trace-rule-pipeline).
</Note>

## Single-file stages

`decompile()` runs entirely inside `GLOBALS.set` so SWC `SyntaxContext` values stay consistent for one module.

<Steps>
<Step title="Parse">
`parse_js_with_recovery` in `crates/core/src/driver/io.rs` builds an SWC `Module` from the input string. Syntax is chosen from the filename extension (`.ts`, `.tsx`, `.jsx`, or default ES+JSX). Recoverable parse errors are collected for optional diagnostics; unrecoverable parse failures abort.

<ParamField body="filename" type="string">
Used for syntax detection, diagnostic messages, and output source-map `file` field.
</ParamField>
</Step>

<Step title="Resolver marks">
SWC's `resolver(unresolved_mark, top_level_mark, false)` assigns `SyntaxContext` to every identifier. Free variables (globals like `Object`, `require`) receive `unresolved_mark` as their outer context; bound locals and parameters get module-scoped contexts.

This mark is threaded through every rule runner and is the primary scope gate for identifier matching.
</Step>

<Step title="Rule application">
`apply_rules` walks the ordered `RULE_DESCRIPTORS` registry in `crates/core/src/rules/pipeline.rs`. Each descriptor has an `id`, `RuleStage`, optional `requires` metadata, an `enabled` gate, and a `run` function.

<ParamField body="dce_mode" type="DceMode">
Controls late `DeadDecls` / `DeadImports` passes. API default is `Off`; the CLI defaults to `TransformOnly` and uses `Full` with `--dce`.
</ParamField>

<ParamField body="level" type="RewriteLevel">
`minimal`, `standard` (default), or `aggressive`. Gates risky subpatterns inside rules and whole rules such as `UnJsx`, `ArrowFunction`, and `UnDestructuring`.
</ParamField>
</Step>

<Step title="Optional source-map rename">
When `DecompileOptions.sourcemap` is set, three passes run **after** the main rule pipeline and **before** the fixer:

1. `ImportDedup` — merge repeated imports from the same specifier
2. `apply_sourcemap_renames` — recover original names via position lookup in `sourcesContent`
3. `UnImportRename` — clean import aliases after rename

Renaming runs late for two reasons: rules detect patterns by the helper names that appear in the input (`require`, `__generator`, `__esModule`), and `ImportDedup` needs `UnEsm` to have converted `require()` to `import` first.
</Step>

<Step title="Fixer and emit">
`apply_fixer` normalizes the AST for printing. Output is produced by `print_js` or, when `emit_source_map` is true, `print_js_with_srcmap` plus `build_output_sourcemap` (v3 JSON mapping decompiled output back to input positions).

Optional `diagnostics` adds TDZ checks, duplicate-declaration scans, and output parse verification as `UnpackWarning` entries in `DecompileOutput.warnings`.
</Step>
</Steps>

### Example invocation

```bash
cargo run -p wakaru-cli -- minified.js -o readable.js
```

### Output shape

`DecompileOutput` fields:

| Field | Type | Value |
|-------|------|---------|
| `code` | `String` | emitted JavaScript |
| `warnings` | `Vec<UnpackWarning>` | empty unless `diagnostics: true` |
| `source_map` | `Option<String>` | `None` unless `emit_source_map: true` |

## Rule registry and stages

Roughly 60 `VisitMut` rules are registered via `define_rule_registry!`. Order is fixed; `RuleDescriptor::requires` documents ordering constraints, which `docs/rule-dependency-inventory.md` expands on.

| `RuleStage` | Role | Examples |
|-------------|------|----------|
| `Syntax` | Minified syntax normalization | `SimplifySequence`, `UnBracketNotation`, `FlipComparisons` |
| `Helpers` | Transpiler helper unwrapping, module reconstruction | `UnInteropRequireDefault`, `UnEsm`, `UnWebpackInterop` |
| `Structural` | Pattern restoration | `UnTemplateLiteral`, `UnOptionalChaining`, `ObjectAssignSpread` |
| `Complex` | Higher-level recovery | `UnIife`, `UnEs6Class`, `UnRegenerator`, `UnAsyncAwait` |
| `Modernization` | ESNext upgrades | `VarDeclToLetConst`, `ArrowFunction`, `UnForOf` |
| `Cleanup` | Renaming, inlining, DCE | `SmartInline`, `SmartRename`, `DeadDecls`, `UnReturn` |

Several rules run multiple times under suffixed ids (`UnWebpackInterop2`, `UnIife2`, `UnParameters3`, etc.) because earlier passes expose shapes later passes must handle.

`RulePipelineOptions` controls partial execution:

<ParamField body="start_from" type="Option<&str>">
First rule id to run (inclusive). Used by `trace_rules` and tests.
</ParamField>

<ParamField body="stop_after" type="Option<&str>">
Last rule id to run (inclusive). Helpers: `until("UnEsm")`, `between("UnObjectSpread2", "UnReturn")`.
</ParamField>

<ParamField body="module_facts" type="Option<&ModuleFactsMap>">
Injected during unpack Phase 2 so fact-aware rules (`UnTemplateLiteral`, `UnForOf`, `UnRegenerator`, helper rules) can read cross-module import/export data.
</ParamField>

<Info>
Full rule order and dependency edges live in [Rule pipeline reference](/rule-pipeline-reference). Semantic assumptions per rewrite level are in [Rewrite levels and assumptions](/rewrite-levels-and-assumptions).
</Info>
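
To make the inclusive `start_from` / `stop_after` semantics concrete, here is a minimal sketch of range selection over an ordered list of rule ids. It is not the code in `pipeline.rs`: `RangeOptions` and `select_rules` are hypothetical stand-ins, and `RULE_IDS` is a four-entry illustrative subset whose relative order is assumed from the phase boundaries described on this page.

```rust
/// Illustrative subset of rule ids in assumed pipeline order (not the full registry).
const RULE_IDS: &[&str] = &["SimplifySequence", "UnEsm", "UnObjectSpread2", "UnReturn"];

/// Hypothetical stand-in for the range part of `RulePipelineOptions`.
#[derive(Default, Clone, Copy)]
struct RangeOptions<'a> {
    start_from: Option<&'a str>,
    stop_after: Option<&'a str>,
}

impl<'a> RangeOptions<'a> {
    /// Run from the first rule through `stop` (inclusive).
    fn until(stop: &'a str) -> Self {
        Self { start_from: None, stop_after: Some(stop) }
    }

    /// Run from `start` through `stop` (both inclusive).
    fn between(start: &'a str, stop: &'a str) -> Self {
        Self { start_from: Some(start), stop_after: Some(stop) }
    }
}

/// Return the slice of the registry selected by `opts`, or an error for
/// unknown ids and inverted ranges.
fn select_rules<'r>(
    registry: &'r [&'r str],
    opts: RangeOptions<'_>,
) -> Result<&'r [&'r str], String> {
    let position = |id: &str| {
        registry
            .iter()
            .position(|r| *r == id)
            .ok_or_else(|| format!("unknown rule id: {id}"))
    };
    let start = match opts.start_from {
        Some(id) => position(id)?,
        None => 0,
    };
    let end = match opts.stop_after {
        Some(id) => position(id)? + 1, // +1 makes the stop rule inclusive
        None => registry.len(),
    };
    if start >= end {
        return Err("start rule comes after stop rule".to_string());
    }
    Ok(&registry[start..end])
}
```

Under these assumptions, `until("UnEsm")` selects everything up to and including `UnEsm`, and `between("UnObjectSpread2", "UnReturn")` selects the remainder, which is how the two unpack phases divide the registry.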

## `unresolved_mark` scope gating

After `resolver()` runs, rules that match identifiers **by name** must distinguish free globals from bound locals. A webpack factory parameter `e` and an inner function parameter `e` share a symbol but not a `SyntaxContext`.

Every new visitor that matches by name should take `unresolved_mark: Mark` and gate matches:

```rust
if id.ctxt.outer() != self.unresolved_mark {
    return; // bound local — do not transform
}
```

Rules that use this pattern include `SimplifySequence`, `FlipComparisons`, `UnWebpackInterop`, `UnArgumentSpread`, `UnNullishCoalescing`, `UnJsx`, `SmartInline`, and `UnWebpackDefineGetters`. Renames must go through `rename_utils::BindingRenamer` (`rename_bindings_in_module`), never a custom rename by `sym` alone — that would hit unrelated inner-scope bindings.

<Warning>
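Skipping the `unresolved_mark` guard lets a rule that targets a free global also rewrite a bound local with the same name. Renaming by `sym` alone fails the same way: a rename aimed at webpack param `e` would also rename `e` inside `function inner(e) { ... }`.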
</Warning>
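
The difference between matching by name and matching by name plus context can be shown with toy types. The sketch below does not use SWC: `Ident`, `UNRESOLVED`, `rewrite_free_global`, and `rename_binding` are hypothetical stand-ins in which a plain `u32` plays the role of `SyntaxContext`.

```rust
/// Toy stand-in for an SWC identifier: a symbol plus a scope context id.
#[derive(Debug, Clone, PartialEq)]
struct Ident {
    sym: String,
    ctxt: u32,
}

/// Context id this sketch reserves for unresolved (free) identifiers.
const UNRESOLVED: u32 = 0;

/// Rewrite free references to `from`; bound locals with the same name are skipped.
fn rewrite_free_global(idents: &mut [Ident], from: &str, to: &str) {
    for id in idents.iter_mut() {
        if id.ctxt != UNRESOLVED {
            continue; // bound local: do not transform
        }
        if id.sym == from {
            id.sym = to.to_string();
        }
    }
}

/// Rename one binding, identified by (symbol, context) and never by symbol alone.
fn rename_binding(idents: &mut [Ident], sym: &str, ctxt: u32, to: &str) {
    for id in idents.iter_mut().filter(|id| id.sym == sym && id.ctxt == ctxt) {
        id.sym = to.to_string();
    }
}
```

With two identifiers named `e` in contexts 1 and 2, `rename_binding(&mut ids, "e", 1, "module")` changes only the first; a rename keyed on `sym` alone would change both.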

## Unpack: parallel two-phase execution

When a bundle is unpacked, the driver does **not** call `decompile()` directly on each raw string. Instead `unpack_multi_module_with_plan` in `phases.rs` runs:

**Phase 1** (`modules.par_iter()`):
- Parse each extracted module
- `resolver` + optional numeric webpack ID rewrites
- `apply_rules` with `until("UnEsm")`
- ESM recovery on a clone (or in-place when AST won't be reused) for fact extraction
- `collect_module_facts` → `ModuleFactsMap`
- Optionally retain the through-`UnEsm` AST for Phase 2 reuse

**Cross-module barrier** (sequential):
- Merge per-module facts into `ModuleFactsMap`
- Build filename rename map from provenance markers (`standard` and `aggressive` levels only)

**Phase 2** (`phase2_inputs.into_par_iter()`):
- Resume from Phase 1 AST when no input source map and no `emit_source_map`; otherwise reparse and rerun through `UnEsm`
- Cross-module late pass: `run_reexport_consolidation`, `run_namespace_decomposition`
- `apply_rules` with `between("UnObjectSpread2", "UnReturn")` + `with_module_facts`
- Targeted cleanup (`SimplifySequence`, `UnAssignmentMerging`, factory-IIFE ESM recovery, export pruning)
- Same source-map rename trio as single-file when `sourcemap` is provided
- `apply_fixer` → emit

Both phases use rayon; on targets without threading, rayon falls back to sequential execution.

<Note>
When Phase 2 reparses a module, the through-`UnEsm` range runs a second time for that module because `SyntaxContext` must stay continuous within the emitted pipeline. Reusing a Phase 1 AST after a fresh parse would break ctxt-sensitive rename rules.
</Note>

Phase 1 fact collection uses `RewriteLevel::Standard` for ESM recovery regardless of output level, so facts stay stable; Phase 2 applies the caller's `options.level`.

Individual module failures are best-effort: parse or decompile errors preserve raw extracted code and append `UnpackWarning` entries (`FactCollectionParseFailed`, `DecompileFailed`) rather than aborting the whole unpack.
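
The phase structure (parallel map, sequential barrier, parallel map with shared facts, best-effort failures) can be sketched with the standard library alone. This is not the driver in `phases.rs`: it uses `std::thread::scope` where the real code uses rayon, and `Module`, `Facts`, `collect_facts`, and `run_two_phase` are hypothetical names with a toy notion of "facts".

```rust
use std::collections::HashMap;
use std::thread;

/// Toy cross-module facts: filename -> export names found in Phase 1.
type Facts = HashMap<String, Vec<String>>;

struct Module {
    filename: String,
    code: String,
}

/// Phase 1 stand-in: "parse" a module and collect facts; empty input fails.
fn collect_facts(m: &Module) -> Result<Vec<String>, String> {
    if m.code.trim().is_empty() {
        return Err(format!("{}: parse failed", m.filename));
    }
    Ok(m.code
        .split_whitespace()
        .filter(|w| w.starts_with("export_"))
        .map(String::from)
        .collect())
}

/// Returns (emitted code per module, warnings).
fn run_two_phase(modules: &[Module]) -> (Vec<String>, Vec<String>) {
    // Phase 1: one scoped thread per module.
    let phase1: Vec<Result<Vec<String>, String>> = thread::scope(|s| {
        let handles: Vec<_> = modules
            .iter()
            .map(|m| s.spawn(move || collect_facts(m)))
            .collect();
        handles.into_iter().map(|h| h.join().expect("worker panicked")).collect()
    });

    // Barrier: merge sequentially; failures become warnings, not aborts.
    let mut facts = Facts::new();
    let mut warnings = Vec::new();
    for (m, result) in modules.iter().zip(phase1) {
        match result {
            Ok(exports) => {
                facts.insert(m.filename.clone(), exports);
            }
            Err(e) => warnings.push(e),
        }
    }

    // Phase 2: every worker reads the merged facts; failed modules keep raw code.
    let facts = &facts;
    let outputs = thread::scope(|s| {
        let handles: Vec<_> = modules
            .iter()
            .map(|m| {
                s.spawn(move || {
                    if facts.contains_key(&m.filename) {
                        format!("// {} modules known\n{}", facts.len(), m.code)
                    } else {
                        m.code.clone()
                    }
                })
            })
            .collect();
        handles.into_iter().map(|h| h.join().expect("worker panicked")).collect()
    });
    (outputs, warnings)
}
```

The barrier is the only sequential step, which is why Phase 2 workers can share the merged map by immutable reference without locking.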

## Dead-code elimination in the pipeline

Late cleanup rules `DeadDecls` and `DeadImports` are gated by `dce_mode.is_enabled()`:

| `DceMode` | Behavior |
|-----------|----------|
| `Off` | No dead-code passes (API `decompile` default) |
| `TransformOnly` | Remove only transform-induced dead code; spans that are already dead in the input are snapshotted at pipeline start and left in place |
| `Full` | Full reachability sweep |

CLI default is `TransformOnly`; pass `--dce` for `Full`. Tests often use `DceMode::Off` to isolate structural restoration from cleanup.
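
A minimal sketch of the mode gate and the CLI mapping follows. The enum here is a local re-declaration for illustration only; the body of `is_enabled` is an assumption based on the table above, and `cli_dce_mode` is a hypothetical helper name.

```rust
/// Local stand-in for the DCE mode described above.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
enum DceMode {
    /// API default: no dead-code passes.
    #[default]
    Off,
    TransformOnly,
    Full,
}

impl DceMode {
    /// Assumed gate: any mode other than `Off` enables the late passes.
    fn is_enabled(self) -> bool {
        !matches!(self, DceMode::Off)
    }
}

/// Hypothetical mapping from the CLI's `--dce` flag to a mode.
fn cli_dce_mode(dce_flag: bool) -> DceMode {
    if dce_flag {
        DceMode::Full
    } else {
        DceMode::TransformOnly
    }
}
```

The split between the API default (`Off`) and the CLI default (`TransformOnly`) means library callers opt in to cleanup explicitly, while CLI users get transform-induced cleanup without a flag.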

## Debugging and partial runs

`trace_rules` replays the single-file pipeline with `apply_rules_with_observer`, capturing per-rule before/after snapshots as git-style diffs via `format_trace_events`. It shares the resolver and rule machinery, skips source-map rename and fixer-adjacent source-map output, and rejects detected bundles.

Use `RulePipelineOptions::between(start, stop)` in tests or trace mode to bisect regressions without running the full ~60-rule chain.
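
One way to bisect is a binary search over stop points. The sketch below is a generic illustration, not a Wakaru API: `first_bad_rule` is a hypothetical helper, and the `is_bad_after` callback stands in for "run the pipeline up to and including this rule and check the output". It assumes the regression is monotonic, meaning that once a stop point produces bad output, every later stop point does too.

```rust
/// Find the first rule id whose inclusion makes the output bad.
/// `rules` is in pipeline order; `is_bad_after(id)` reports whether running
/// through `id` (inclusive) yields bad output. Returns None if no stop point is bad.
fn first_bad_rule<'a>(rules: &[&'a str], is_bad_after: impl Fn(&str) -> bool) -> Option<&'a str> {
    let (mut lo, mut hi) = (0usize, rules.len());
    // Invariant: stop points before `lo` are good; stop points at or after `hi` are bad.
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        if is_bad_after(rules[mid]) {
            hi = mid;
        } else {
            lo = mid + 1;
        }
    }
    rules.get(lo).copied()
}
```

For a chain of about 60 rules this needs at most six or seven partial runs, compared with one run per rule for a linear scan.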

## Key files

```text
crates/core/src/
  driver/
    single_file.rs      decompile() orchestration
    unpack/phases.rs    two-phase parallel module pipeline
    io.rs               parse, print, fixer bridge
    trace.rs            per-rule observer tracing
  rules/
    pipeline.rs         RULE_DESCRIPTORS registry, apply_rules
    mod.rs              RewriteLevel, rule exports
  sourcemap_rename.rs   late rename after rules
```

## Related pages

<CardGroup>
<Card title="Overview" href="/overview">
Workspace layout, unpack vs decompile, and primary entry points.
</Card>
<Card title="Rule pipeline reference" href="/rule-pipeline-reference">
Ordered `RuleDescriptor` list, stage groupings, and dependency inventory.
</Card>
<Card title="Cross-module facts" href="/cross-module-facts">
Phase 1 fact collection, `ModuleFactsMap`, and Phase 2 fact-aware rules.
</Card>
<Card title="Use source maps" href="/use-source-maps">
Input `--source-map` rename constraints and `--emit-source-map` output.
</Card>
<Card title="Trace the rule pipeline" href="/trace-rule-pipeline">
Per-rule diffs, `--from`/`--until` ranges, and bisection workflow.
</Card>
<Card title="Develop transformation rules" href="/develop-rules">
Adding rules: `unresolved_mark` guards, pipeline placement, test-first workflow.
</Card>
</CardGroup>

---

## 05. Bundle formats and unpacking

> Detection order and BundleFormat variants (webpack4/5, browserify, SystemJS, esbuild/Bun, AMD, scope-hoisted); raw vs full unpack; multi-file and directory scan semantics.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/05-bundle-formats-and-unpacking.md
- Generated: 2026-06-28T01:06:34.399Z

### Source Files

- `docs/architecture.md`
- `crates/core/src/unpacker/mod.rs`
- `crates/core/src/unpacker/webpack4.rs`
- `crates/core/src/unpacker/webpack5.rs`
- `crates/core/src/driver/unpack.rs`
- `crates/cli/src/discovery.rs`

---
title: Bundle formats and unpacking
description: Detection order and BundleFormat variants (webpack4/5, browserify, SystemJS, esbuild/Bun, AMD, scope-hoisted); raw vs full unpack; multi-file and directory scan semantics.
---

Wakaru splits bundled JavaScript into per-module source files, then runs the decompile rule pipeline on each module. Unpacking is format-driven: the core unpacker parses the input once, tries detectors in a fixed order, and returns an `UnpackResult` with extracted module code strings. The driver then either emits those strings as-is (`unpack_raw`) or runs the two-phase multi-module decompile pipeline (`unpack`).

## Detection pipeline

Detection starts in `try_unpack_bundle`, which parses the source as an ES module and walks a fixed candidate list. **First match wins** — later detectors never run once a format matches.

```mermaid
flowchart TD
    A[Parse source as ES module] --> B{detect_bundle_candidate<br/>on top-level module}
    B -->|match| Z[Return UnpackResult]
    B -->|no match| C[Collect UMD/AMD unwrap candidates]
    C --> D{detect_bundle_candidate<br/>on each unwrapped module}
    D -->|match| Z
    D -->|no match| E[AMD detector on top-level module]
    E -->|match| Z
    E -->|no match| F[No bundle detected]
```

On the **top-level module**, `detect_bundle_candidate` runs in this order:

1. **webpack5** — IIFE/arrow bootstrap with module factory array or object
2. **webpack5 runtime entry** — standalone runtime IIFE (`require.m`, `require.f`, `require.e`, …) when allowed
3. **webpack4** — `(function(modules){…})([…])` or object-form with `__webpack_require__`
4. **webpack5 chunk** — JSONP chunk push (`webpackChunk…push([[ids], {modules}])`)
5. **browserify** — nested IIFE with module object and entry array
6. **SystemJS** — top-level `System.register(…)` calls
7. **esbuild / Bun** — `__export`, `__commonJS`, or `__esm` helper shapes

If the top-level module does not match, Wakaru unwraps **UMD and AMD wrapper factories** and retries the same candidate list on the inner module bodies (without runtime-entry detection). If that still fails, a dedicated **AMD** pass handles `define(…)`-only modules and plain UMD bodies.

When structural detection returns `None`, the driver can still split the file in **auto mode** using the scope-hoisted heuristic (see below).
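
The first-match-wins control flow can be sketched independently of any real detector. In the code below the detectors are toy substring checks on marker strings mentioned in this section, whereas the real detectors match AST shapes; `Detector`, `CANDIDATES`, `first_match`, and `detect_format` are hypothetical names, and only a subset of the candidate list is included.

```rust
/// A detector reports whether its format matches the given source text.
type Detector = fn(&str) -> bool;

// Toy detectors: substring checks standing in for AST matching.
fn webpack5_chunk(src: &str) -> bool {
    src.contains("webpackChunk")
}
fn systemjs(src: &str) -> bool {
    src.contains("System.register(")
}
fn esbuild(src: &str) -> bool {
    src.contains("__export") || src.contains("__commonJS") || src.contains("__esm")
}
fn amd(src: &str) -> bool {
    src.trim_start().starts_with("define(")
}

/// Subset of the candidate list, kept in the documented relative order.
const CANDIDATES: &[(&str, Detector)] = &[
    ("webpack5", webpack5_chunk),
    ("systemjs", systemjs),
    ("esbuild", esbuild),
];

/// Try candidates in order and stop at the first match.
fn first_match(src: &str) -> Option<&'static str> {
    CANDIDATES
        .iter()
        .find(|(_, detect)| detect(src))
        .map(|(name, _)| *name)
}

/// Top-level candidates, then unwrapped wrapper bodies, then the AMD pass.
fn detect_format(top_level: &str, unwrapped: &[&str]) -> Option<&'static str> {
    first_match(top_level)
        .or_else(|| unwrapped.iter().find_map(|inner| first_match(inner)))
        .or_else(|| amd(top_level).then_some("amd"))
}
```

Because the list is ordered, an input that satisfies two detectors is always reported under the earlier format, so the ordering itself is part of the contract.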

## BundleFormat variants

`BundleFormat` is the tag recorded on every successful unpack. It appears in `UnpackOutput.detected_formats` and drives format-specific normalization.

| `BundleFormat` | `as_str()` | Typical input shape | Notes |
|---|---|---|---|
| `Webpack5` | `webpack5` | IIFE bootstrap, runtime entry, or JSONP chunk | Runtime-only files become a single `entry.js` module |
| `Webpack4` | `webpack4` | IIFE with array- or object-form module table | Renames factory params, rewrites `require(id)`, normalizes runtime helpers |
| `Browserify` | `browserify` | `(function e(t,n,r){…})({id:[fn,deps]}, …, [entries])` | Standalone browserify bundles |
| `SystemJs` | `systemjs` | `System.register(name, deps, factory)` | May re-detect nested dynamic-export bundles |
| `Esbuild` | `esbuild` | `__export` / `__commonJS` / `__esm` helpers | Bun emits the same helper shapes; pure ESM without markers falls through |
| `Amd` | `amd` | `define(id, deps, factory)` or UMD-wrapped factories | Runs after wrapper unwrap attempts |
| `ScopeHoisted` | `scope-hoisted` | Large flat top-level with reference-graph clusters | Heuristic split, not a bundler-specific AST matcher |

Each extracted module is an `UnpackedModule` with `id`, `is_entry`, `code`, and `filename`. Filenames come from bundler paths (webpack object keys), numeric indices (`module-N.js`, `entry.js`), or heuristic cluster names.

## Format-specific behavior

### Webpack 5

Three detection paths share the `Webpack5` format tag:

- **Bootstrap bundle**: empty-param IIFE whose body contains the module table and `__webpack_require__`-style runtime.
- **Runtime entry**: IIFE whose body matches webpack5 runtime signatures (`require.m`, `require.f`, `require.e`, `require.u`, `require.t`). The entire source is kept as one `entry.js` module so chunk-loading code stays intact for multi-file unpack.
- **JSONP chunk**: `(self.webpackChunk… = … || []).push([[chunkIds], { numericId: factory, … }])`. Multiple push statements in one file are merged.

For multi-file inputs, `detect_chunk_ids` harvests chunk IDs from each input so numeric `require` references can be rewritten across entry and chunk files.

### Webpack 4

Detects the classic webpack4 IIFE: unary `!function(…)` or plain call, with array-form (`[factory, …]`) or object-form (`{"./src/index.js": factory, …}`) module tables.

Extraction runs `normalize_extracted_webpack_module` on each factory:

- Rename `module` / `exports` / `require` factory parameters
- Rewrite numeric or string `require(id)` to output filenames
- Normalize `require.n` getter access
- Apply `WebpackRuntimeNormalizer` while **preserving** `require.r` / `require.d` ESM markers for the later rule pipeline

### Browserify

Matches the browserify standalone pattern: an outer call whose callee is itself a call, whose first argument is the modules object (`{id: [factory, deps], …}`), and whose third argument is the entry-id array.

### SystemJS

Collects `System.register` calls. Each register becomes a module; dynamic-export bundles may be unpacked again by re-invoking the main detector on reconstructed source.

### Esbuild and Bun

Looks for bundler helper symbols and factory declarations (`var X = __commonJS({ "path"(exports, module) { … } })`). Scope-hoisted ESM is detected via `__export(ns, { name: () => binding })` namespace boundaries.

Pure scope-hoisted ESM from Rollup or Vite **without** `__export` or `__commonJS` markers does not match this detector. Those bundles rely on the scope-hoisted heuristic in auto mode (or single-file decompile).

### AMD

Handles files that are only `define(…)` calls, plus plain UMD module bodies exposed after wrapper peeling.

### Scope-hoisted (heuristic)

`scope_hoist::split_scope_hoisted` is a **reference-graph clusterer**, not a bundler signature matcher. It:

1. Optionally unwraps a top-level IIFE
2. Requires at least 10 top-level declarations
3. Builds a reference graph and union-finds clusters
4. Emits modules only when **two or more** clusters are found

This path sets `allow_cycle_premerge: false` because import cycles among heuristic splits are not safe to pre-merge.
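
The numbered steps above amount to connected-component clustering with two rejection rules. The following is a minimal sketch under that reading, not the implementation in `scope_hoist`: declarations are plain indices, `references` is a list of index pairs, and `UnionFind`, `MIN_TOP_LEVEL_DECLS`, and `split_clusters` are hypothetical names. IIFE unwrapping is omitted.

```rust
use std::collections::BTreeMap;

/// Minimal union-find over declaration indices.
struct UnionFind {
    parent: Vec<usize>,
}

impl UnionFind {
    fn new(n: usize) -> Self {
        Self { parent: (0..n).collect() }
    }

    fn find(&mut self, mut x: usize) -> usize {
        while self.parent[x] != x {
            let grand = self.parent[self.parent[x]];
            self.parent[x] = grand; // path halving
            x = grand;
        }
        x
    }

    fn union(&mut self, a: usize, b: usize) {
        let (ra, rb) = (self.find(a), self.find(b));
        if ra != rb {
            self.parent[ra] = rb;
        }
    }
}

/// Threshold taken from the text: at least 10 top-level declarations.
const MIN_TOP_LEVEL_DECLS: usize = 10;

/// Cluster declarations by reference edges. Returns None when there are too few
/// declarations or fewer than two clusters. Indices must be < `decl_count`.
fn split_clusters(decl_count: usize, references: &[(usize, usize)]) -> Option<Vec<Vec<usize>>> {
    if decl_count < MIN_TOP_LEVEL_DECLS {
        return None;
    }
    let mut uf = UnionFind::new(decl_count);
    for &(from, to) in references {
        uf.union(from, to);
    }
    // Group members by root; BTreeMap keeps cluster order deterministic.
    let mut clusters: BTreeMap<usize, Vec<usize>> = BTreeMap::new();
    for i in 0..decl_count {
        let root = uf.find(i);
        clusters.entry(root).or_default().push(i);
    }
    (clusters.len() >= 2).then(|| clusters.into_values().collect())
}
```

A fully connected reference graph collapses to one cluster and is rejected, which matches the rule that modules are emitted only when two or more clusters exist.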

## Auto vs strict detection

CLI unpack mode controls `DecompileOptions.heuristic_split`:

<ParamField body="--unpack / --unpack=auto" type="flag" default>
Structural detectors run first. When they return `None`, scope-hoisted heuristic splitting is attempted. Directory scans also accept heuristic scope-hoisted files.
</ParamField>

<ParamField body="--unpack=strict" type="flag">
Structural detection only. No scope-hoisted fallback. Unrecognized bundles fall back to single-file decompile (explicit file inputs) or are skipped (directory scans).
</ParamField>

Nested re-splitting of already-detected modules is a separate, stricter gate: it requires **both** `heuristic_split` and `RewriteLevel::Aggressive`. When enabled, each extracted module is re-examined with the scope-hoisted clusterer and split further only if resulting import paths resolve.
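
The gates in this section reduce to a few boolean decisions. The sketch below restates them as code; the enums are local stand-ins, and `heuristic_split`, `nested_resplit_enabled`, and `on_no_structural_match` are hypothetical helper names that are not taken from the codebase.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum UnpackMode {
    Auto,
    Strict,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum RewriteLevel {
    Minimal,
    Standard,
    Aggressive,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    ScopeHoistedSplit,
    SingleFileDecompile,
    Skipped,
}

/// Auto mode enables the heuristic fallback; strict mode does not.
fn heuristic_split(mode: UnpackMode) -> bool {
    mode == UnpackMode::Auto
}

/// Nested re-splitting needs both the heuristic gate and the aggressive level.
fn nested_resplit_enabled(mode: UnpackMode, level: RewriteLevel) -> bool {
    heuristic_split(mode) && level == RewriteLevel::Aggressive
}

/// What happens to an input that no structural detector matched.
fn on_no_structural_match(
    mode: UnpackMode,
    from_directory_scan: bool,
    heuristic_finds_modules: bool,
) -> Outcome {
    if heuristic_split(mode) && heuristic_finds_modules {
        Outcome::ScopeHoistedSplit
    } else if from_directory_scan {
        Outcome::Skipped
    } else {
        Outcome::SingleFileDecompile
    }
}
```

Note that strict mode ignores the heuristic result entirely, so the same file can be split in auto mode and decompiled as a single file in strict mode.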

## Raw vs full unpack

Detection is identical for raw and full unpack — `try_unpack_bundle_raw` delegates to `try_unpack_bundle`. The difference is what happens **after** extraction.

### Full unpack (`unpack` / `unpack_files`)

```
detect → extract modules → two-phase decompile pipeline per module → emit
```

Phase 1 (parallel): parse each module → rules through `UnEsm` → collect cross-module facts.

Phase 2 (parallel): resume from the retained Phase 1 AST, or reparse and rerun through `UnEsm` → cross-module late pass → `UnObjectSpread2` through `UnReturn` → emit.

When no bundle is detected in auto mode, Wakaru tries scope-hoisted splitting; if that yields multiple modules it unpacks them (with `DceMode::Off` for the heuristic path). Otherwise it falls back to **single-file decompile** and returns one `module.js`.

### Raw unpack (`unpack_raw` / `unpack_files_raw`)

```
detect → extract modules → emit code strings (no rule pipeline)
```

Raw output keeps bundler-specific shapes the decompile pipeline is meant to recover — webpack ESM markers (`require.r`, `require.d`), export getters, and similar. Extractors still apply **bundler-coupled normalization** at the extraction boundary (factory param rename, module ID rewrites, runtime wrapper removal).

For heuristic scope-hoisted splits in raw mode, Wakaru applies a narrow `UnEsm`-only normalization pass so the output is runnable standalone. Normalization failures produce `RawNormalizationFailed` warnings and preserve unparsed code.

If nothing is detected, raw unpack returns the original source as a single `module.js` without running rules.

<AccordionGroup>
<Accordion title="When to use raw vs full">
Use **raw** to inspect extraction boundaries, debug detector output, or compare against `webpack4_unpack_raw` snapshots. Use **full** for readable, idiomatic source — that is the normal `wakaru --unpack` path.
</Accordion>
</AccordionGroup>

## Multi-file unpack

Pass multiple explicit files (entry + chunks) or a scanned directory. `unpack_files` / `unpack_files_raw` run detection and extraction on every input independently, then merge the extracted modules into one set.

<Steps>
<Step title="Detect each input">
Each file runs the same detector chain. Non-bundle files in multi-file mode become **fallback modules** (their source is kept under the input basename) rather than being dropped.
</Step>

<Step title="Stabilize the merged module set">
`prepare_multi_source_modules` assigns unique output filenames across inputs and builds a `NumericRewritePlan`. Unambiguous numeric webpack module IDs map to final filenames so `require(529)` in an entry can point at `529.js` from a chunk file. **Duplicate numeric IDs across inputs are left unrewritten** to avoid merging unrelated webpack runtimes from the same directory scan.
</Step>

<Step title="Run the pipeline once">
The combined module set goes through either the two-phase decompile pipeline (full) or raw emission with numeric rewrites (raw). Cross-module facts in full unpack can see imports/exports from every input file.
</Step>
</Steps>

A single-input call delegates to `unpack` / `unpack_raw` and does not run merge preparation.
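
The numeric-ID rule from the merge step can be sketched as follows. This is a minimal illustration, not Wakaru's implementation: `ExtractedModule`, `build_numeric_plan`, `rewrite_require`, and the `require("./<file>")` output shape are hypothetical names and choices; only the rule itself (unambiguous IDs are mapped, duplicated IDs are left alone) comes from the description above.

```rust
use std::collections::BTreeMap;

/// One extracted module (hypothetical shape): source input, numeric webpack ID,
/// and the unique output filename already assigned to it.
pub struct ExtractedModule {
    pub input: &'static str,
    pub numeric_id: u32,
    pub filename: &'static str,
}

/// Build an ID -> filename map, keeping only IDs that appear exactly once
/// across all inputs. Ambiguous IDs are omitted, so references to them are
/// left unrewritten.
pub fn build_numeric_plan(modules: &[ExtractedModule]) -> BTreeMap<u32, &'static str> {
    let mut owners: BTreeMap<u32, Vec<&'static str>> = BTreeMap::new();
    for module in modules {
        owners.entry(module.numeric_id).or_default().push(module.filename);
    }
    owners
        .into_iter()
        .filter(|(_, files)| files.len() == 1)
        .map(|(id, files)| (id, files[0]))
        .collect()
}

/// Rewrite a `require(<id>)` target if the plan knows it; otherwise keep the ID.
/// The "./file" output form is an assumption made for this sketch.
pub fn rewrite_require(plan: &BTreeMap<u32, &'static str>, id: u32) -> String {
    match plan.get(&id) {
        Some(file) => format!("require(\"./{file}\")"),
        None => format!("require({id})"),
    }
}
```

With an entry and a second entry that both define module `1`, plus a chunk that defines `529`, only `529` ends up in the plan; `require(1)` stays numeric in both runtimes.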

## Directory scan semantics

When `--unpack` receives a **directory**, the CLI walks it recursively:

- Includes `.js`, `.mjs`, `.cjs`
- Skips hidden files/directories (names starting with `.`)
- Skips `node_modules`
- Sorts discovered paths for stable ordering

Each candidate is filtered through `is_detected_unpack_input`:

- Structural `try_unpack_bundle` match, **or**
- In auto mode, scope-hoisted heuristic that yields more than one module

Non-matching files are **skipped** — they are not decompiled or copied. If a directory scan finds no detected files, the CLI errors with `no bundle or chunk files detected in directory input`.

Explicit **file** inputs do not use detect-only filtering: unrecognized bundles fall back to single-file decompile (auto/full) or raw passthrough.
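
The walk rules listed above (extensions, hidden entries, `node_modules`, sorted output) can be approximated with the standard library alone. The function name `scan_candidates` is hypothetical, and the sketch covers only candidate discovery; it does not model the `is_detected_unpack_input` filter that runs afterwards.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect candidate bundle files under `root`.
/// Skips hidden entries and `node_modules`, keeps .js/.mjs/.cjs files,
/// and sorts the result for stable ordering.
pub fn scan_candidates(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    let mut stack = vec![root.to_path_buf()];
    while let Some(dir) = stack.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let path = entry.path();
            let name = entry.file_name().to_string_lossy().into_owned();
            // Hidden files/directories and node_modules are never descended into.
            if name.starts_with('.') || name == "node_modules" {
                continue;
            }
            if entry.file_type()?.is_dir() {
                stack.push(path);
            } else if matches!(
                path.extension().and_then(|ext| ext.to_str()),
                Some("js" | "mjs" | "cjs")
            ) {
                found.push(path);
            }
        }
    }
    found.sort();
    Ok(found)
}
```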

## Entry points

<CodeGroup>
```bash CLI
# Full unpack with auto detection
wakaru --unpack bundle.js -o out/

# Structural detection only
wakaru --unpack=strict bundle.js -o out/

# Extraction without decompile rules
wakaru --unpack --raw bundle.js -o out/

# Entry + chunk files
wakaru --unpack entry.js chunk.123.js -o out/
```

```rust Core API
use wakaru_core::{unpack, unpack_files, unpack_raw, DecompileOptions, UnpackInput};

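// Full unpack of a single source string with default options.
let output = unpack(source, DecompileOptions::default())?;
// Full unpack of several inputs (entry + chunks), merged into one module set.
let output = unpack_files(inputs, options)?;
// Extraction only: no decompile rules are applied.
let output = unpack_raw(source, &options)?;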
```
</CodeGroup>

<ResponseField name="UnpackOutput" type="object">
<Expandable title="fields">
<ResponseField name="modules" type="Vec<(String, String)>">
Output filename paired with decompiled or raw module source.
</ResponseField>
<ResponseField name="detected_formats" type="Vec<BundleFormat>">
Formats matched across all inputs (deduplicated).
</ResponseField>
<ResponseField name="warnings" type="Vec<UnpackWarning>">
Per-module parse, normalization, or cycle warnings. Consult troubleshooting for `UnpackWarningKind` codes.
</ResponseField>
</Expandable>
</ResponseField>

## Related pages

<CardGroup cols={2}>
<Card title="Unpack bundles" href="/unpack-bundles">
Operational guide for `--unpack` modes, `--raw`, multi-file inputs, directory rules, and `--force`.
</Card>
<Card title="Decompile pipeline" href="/decompile-pipeline">
What happens after extraction: resolver marks, staged rules, and parallel execution.
</Card>
<Card title="Cross-module facts" href="/cross-module-facts">
Two-phase unpack barrier and Phase 2 rules that consume facts from other modules.
</Card>
<Card title="Esbuild and Browserify recipe" href="/esbuild-browserify-recipe">
End-to-end unpacking for `__export` / `__commonJS` bundles and browserify standalone output.
</Card>
<Card title="Webpack bundle recipe" href="/webpack-bundle-recipe">
Entry + chunk workflows for webpack4 and webpack5 test fixtures.
</Card>
<Card title="Core API reference" href="/core-api-reference">
`unpack`, `unpack_files`, `unpack_raw`, and `DecompileOptions` fields.
</Card>
</CardGroup>

---

## 06. Rewrite levels and assumptions

> RewriteLevel (minimal, standard, aggressive), DceMode and --dce behavior, named rewrite assumptions (e.g. no_document_all), and reproduce-first policy for new heuristics.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/06-rewrite-levels-and-assumptions.md
- Generated: 2026-06-28T01:06:52.631Z

### Source Files

- `docs/rewrite-assumptions.md`
- `crates/core/src/rules/mod.rs`
- `crates/core/src/driver/types.rs`
- `README.md`
- `docs/rule-dependency-inventory.md`

---
title: Rewrite levels and assumptions
description: RewriteLevel (minimal, standard, aggressive), DceMode and --dce behavior, named rewrite assumptions, and the reproduce-first policy for new heuristics.
---

Wakaru's rewrite policy controls how aggressively the decompiler recovers readable source from minified or transpiled JavaScript. Two knobs work together: **rewrite level** (`minimal`, `standard`, `aggressive`) gates which recovery patterns run, and **dead-code elimination (DCE) mode** controls whether late cleanup removes only transform-induced dead code or performs a full reachability sweep.

Rewrite level answers *how much* to recover. Named **rewrite assumptions** explain *why* a pattern is safe when the AST alone cannot prove it. Together they form the semantic contract between Wakaru and callers who need predictable output.

## Rewrite levels

`RewriteLevel` is an ordered enum: `Minimal` < `Standard` < `Aggressive`. It flows from the CLI (`--level`), the core API (`DecompileOptions.level`), and the WASM bindings (`level` parameter). Unrecognized level strings fall back to `standard`.

| Level | Goal | Typical use |
|-------|------|-------------|
| `minimal` | Near-zero semantic change within documented dynamic-scope limits. Prefer binding proofs and strict-check patterns over assumptions. | Auditing, diffing, correctness checks (Test262 round-trip uses `minimal` by default). |
| `standard` | Default. Recover common generated-source patterns when evidence is strong and local. May rely on `no_document_all`; builtin alias inlining also runs here. | Everyday decompilation and bundle unpacking. |
| `aggressive` | Speculative, compiler-intent-heavy recovery when patterns are promising but proof is weaker. Enables `pure_getters` and `stable_builtins` assumption flags. | Reading unfamiliar bundles where cleaner output outweighs edge-case fidelity. |

<ParamField body="--level" type="enum">
Rewrite aggressiveness. Values: `minimal`, `standard` (default), `aggressive`.
</ParamField>

<ParamField body="level" type="RewriteLevel">
Core API and WASM equivalent of `--level`. Default: `RewriteLevel::Standard`.
</ParamField>
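
As an illustration of the ordering and fallback behavior described above, here is a self-contained sketch. It is a hypothetical mirror of the enum, not the definition in `wakaru-core`; `parse_level` and `standard_or_above` are illustrative helper names, and exact-match parsing is an assumption.

```rust
/// Hypothetical mirror of the ordered level enum; variant order defines `<`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Default)]
pub enum RewriteLevel {
    Minimal,
    #[default]
    Standard,
    Aggressive,
}

/// Parse a level string, falling back to `Standard` for unrecognized input.
pub fn parse_level(raw: &str) -> RewriteLevel {
    match raw {
        "minimal" => RewriteLevel::Minimal,
        "aggressive" => RewriteLevel::Aggressive,
        // "standard" and anything unrecognized both land here.
        _ => RewriteLevel::Standard,
    }
}

/// The "standard+" gate referred to throughout this page.
pub fn standard_or_above(level: RewriteLevel) -> bool {
    level >= RewriteLevel::Standard
}
```

Deriving `PartialOrd`/`Ord` on the variant order is what lets gates be written as simple comparisons instead of per-variant matches.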

### How level gating works

Rules do not all move in or out of the pipeline as a block. Wakaru uses two gating styles:

1. **Whole-rule gating** — the rule descriptor's `is_enabled` callback skips the rule below a threshold. Examples at `standard+`: `UnJsx`, `ArrowFunction`, `UnDestructuring`, `SmartRename`, `UnUseStrict`, `UnEsbuildCjsWrapper`.
2. **Subpattern gating** — the rule always runs, but individual pattern matchers check `RewriteLevel` or `RewritePolicy` internally. Examples: `UnNullishCoalescing` (strict vs loose null checks), `UnIndirectCall` (identifier vs member indirect calls), `UnEsm` (entire CJS→ESM conversion disabled below `standard`), `SmartInline` (temp inlining and builtin alias inlining).

The internal **Safety** field in the rule inventory describes how risky a rewrite is in principle. **Rewrite level** describes what end users get by default. A rule can be labeled `heuristic` internally while still running at `minimal` with only its safe subpatterns enabled.
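
The two gating styles can be contrasted in a small sketch. The `RuleDescriptor` shape below, the callback signature, and the pattern labels are assumptions made for illustration; only the idea (an `is_enabled` callback for whole rules, internal level checks for subpatterns) is taken from the text above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum RewriteLevel {
    Minimal,
    Standard,
    Aggressive,
}

/// Hypothetical rule descriptor: a name plus an enablement callback.
pub struct RuleDescriptor {
    pub name: &'static str,
    pub is_enabled: fn(RewriteLevel) -> bool,
}

pub fn always(_level: RewriteLevel) -> bool {
    true
}

pub fn standard_or_above(level: RewriteLevel) -> bool {
    level >= RewriteLevel::Standard
}

/// Whole-rule gating: rules whose callback returns false are skipped entirely.
pub fn enabled_rules(rules: &[RuleDescriptor], level: RewriteLevel) -> Vec<&'static str> {
    rules
        .iter()
        .filter(|rule| (rule.is_enabled)(level))
        .map(|rule| rule.name)
        .collect()
}

/// Subpattern gating: the rule runs at every level, but the loose `== null`
/// matcher is only active at standard and above.
pub fn nullish_patterns(level: RewriteLevel) -> Vec<&'static str> {
    let mut patterns = vec!["strict-null-check", "isolated-temp"];
    if level >= RewriteLevel::Standard {
        patterns.push("loose-null-check");
    }
    patterns
}
```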

### What changes at each level

The table below summarizes high-impact differences verified by level-gating tests. Many syntax-normalization rules (bracket notation, void removal, helper unwrapping) run at all levels.

| Capability | `minimal` | `standard` | `aggressive` |
|------------|-----------|------------|--------------|
| Loose `== null` → `?.` / `??` | Off (needs strict checks or isolated temps) | On (`no_document_all`) | On + looser temp naming |
| Member-expression read collapse | Off | Identifier bases only | On (`pure_getters`) |
| Builtin alias inlining (`const E = TypeError`) | Preserved | Inlined (level-gated) | Inlined |
| `require` → `import` / ESM reconstruction | Off | On | On |
| JSX syntax recovery | Off (runtime calls preserved) | On for strong shapes | On + dynamic tag aliases |
| Arrow function conversion | Off | On | On |
| `fn.apply(undefined, args)` → spread | Off | On | On |
| Default parameter recovery (arguments-based) | Off | On | On |
| Class field / static field recovery | Off | On | On |
| `SmartRename` readable names | Off | On | On |
| `SmartInline` temp/single-use inlining | Off | On | On |
| Index destructuring grouping (`obj[0]`, `obj[1]`) | Off | Off | On |

At `minimal`, `VarDeclToLetConst` still runs but keeps exported `var` bindings to avoid changing module surface shape.

### Unpack-only level interactions

During multi-module unpack, two additional level gates apply:

- **Filename recovery** from provenance markers requires `standard+` (readability rewrite, not structural).
- **Dead helper-module elimination** requires DCE enabled *and* `standard+`, because dropping modules is structural and depends on the binding→side-effect import downgrade that only runs when DCE is on.

At `aggressive` with heuristic unpack enabled, Wakaru may retry scope-hoist splitting inside modules already extracted by a structural bundle detector.

## Named rewrite assumptions

`RewriteAssumptions` and `RewritePolicy` bundle the level with the assumptions it may rely on. Rules that need assumption-aware logic construct `RewritePolicy::from_level(level)` rather than checking the enum alone.

| Assumption | `minimal` | `standard` | `aggressive` |
|------------|-----------|------------|--------------|
| `no_document_all` | `false` | `true` | `true` |
| `pure_getters` | `false` | `false` | `true` |
| `stable_builtins` | `false` | `false` | `true` |

Some rules also gate on `RewriteLevel` directly (e.g. `SmartInline` builtin alias inlining at `standard+`) even when the corresponding assumption flag is still `false` at `standard`. The flags are the named semantic contract; level checks are the enforcement mechanism.
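
The assumption table maps directly onto a constructor. The sketch below is a hypothetical reconstruction: `RewritePolicy::from_level` is named in the text above, but the field layout shown here is an assumption, not the definition in `wakaru-core`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum RewriteLevel {
    Minimal,
    Standard,
    Aggressive,
}

/// Named assumptions a rewrite may rely on (field layout is hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RewriteAssumptions {
    pub no_document_all: bool,
    pub pure_getters: bool,
    pub stable_builtins: bool,
}

/// Level plus the assumptions that level permits.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RewritePolicy {
    pub level: RewriteLevel,
    pub assumptions: RewriteAssumptions,
}

impl RewritePolicy {
    /// Derive the assumption flags from the level, following the table above.
    pub fn from_level(level: RewriteLevel) -> Self {
        let standard = level >= RewriteLevel::Standard;
        let aggressive = level >= RewriteLevel::Aggressive;
        Self {
            level,
            assumptions: RewriteAssumptions {
                no_document_all: standard,
                pure_getters: aggressive,
                stable_builtins: aggressive,
            },
        }
    }
}
```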

### `no_document_all`

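The input does not depend on legacy `document.all` falsy-object behavior. Without this assumption, loose null checks like `x == null` and `x != null` are not strictly equivalent to strict null/undefined tests, because `document.all == null` is `true` while `document.all === null` and `document.all === undefined` are both `false`.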

**Affects:** `UnOptionalChaining`, `UnNullishCoalescing` (loose null-check recovery).

At `minimal`, these rules still recover from strict checks (`x === null || x === undefined`) and from temp-based patterns where binding analysis proves single evaluation.

```js
// Loose — requires no_document_all (standard+)
x != null ? x : fallback   //  →  x ?? fallback

// Strict — runs at all levels
x === null || x === undefined ? fallback : x  //  →  x ?? fallback
```
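
To see why the loose form needs an assumption, the following toy model encodes the relevant JavaScript semantics in Rust. `JsValue` and the four functions are hypothetical names for this sketch; the only fact relied on is that `document.all` is loosely equal to `null` but is not treated as nullish by `??`.

```rust
/// Tiny model of the JavaScript values that matter for this rewrite.
#[derive(Debug, Clone, PartialEq)]
pub enum JsValue {
    Null,
    Undefined,
    DocumentAll,
    Number(f64),
}

/// `v == null`: true for null, undefined, and the legacy `document.all` object.
pub fn loose_eq_null(v: &JsValue) -> bool {
    matches!(v, JsValue::Null | JsValue::Undefined | JsValue::DocumentAll)
}

/// `v === null || v === undefined`: the test that `??` and `?.` perform.
pub fn strict_nullish(v: &JsValue) -> bool {
    matches!(v, JsValue::Null | JsValue::Undefined)
}

/// Lowered shape: `x != null ? x : fallback`.
pub fn lowered(x: JsValue, fallback: JsValue) -> JsValue {
    if !loose_eq_null(&x) { x } else { fallback }
}

/// Recovered shape: `x ?? fallback`.
pub fn recovered(x: JsValue, fallback: JsValue) -> JsValue {
    if strict_nullish(&x) { fallback } else { x }
}
```

The two shapes agree for every input except `DocumentAll`, which is exactly the case `no_document_all` rules out.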

### `pure_getters`

Property reads on the rewritten base are stable and side-effect-free. Recovery that collapses multiple reads of the same base into one (e.g. `obj.value != null ? obj.value : fb` → `obj.value ?? fb`) changes behavior if the property is a getter with side effects.

**Affects:** `UnOptionalChaining`, `UnNullishCoalescing` (repeated-base forms). The `pure_getters` flag is `true` only at `aggressive`; identifier-base collapses at `standard` use separate level checks.

Rules prefer **temp-based recovery** when available — a compiler temp that proves single evaluation is safer than assuming `pure_getters`:

```js
var _a;
(_a = obj.value) != null ? _a : fallback  //  →  obj.value ?? fallback  (safe at minimal)
```

Member-expression bases (e.g. `a.b.prop`) require `aggressive` unless a temp proves single evaluation.
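
The observable difference that `pure_getters` papers over is the number of property reads. The sketch below models a getter with a read counter; `Obj`, `lowered`, and `recovered` are hypothetical names, and `Option<i32>` stands in for a value that may be null or undefined.

```rust
use std::cell::Cell;

/// Toy object whose `value` property is a getter that counts its reads.
pub struct Obj {
    reads: Cell<u32>,
    value: Option<i32>,
}

impl Obj {
    pub fn new(value: Option<i32>) -> Self {
        Self { reads: Cell::new(0), value }
    }

    /// Models `obj.value`; `None` stands in for null/undefined.
    pub fn value(&self) -> Option<i32> {
        self.reads.set(self.reads.get() + 1);
        self.value
    }

    pub fn reads(&self) -> u32 {
        self.reads.get()
    }
}

/// `obj.value != null ? obj.value : fb`: two reads when the value is present.
pub fn lowered(obj: &Obj, fb: i32) -> i32 {
    if obj.value().is_some() {
        obj.value().unwrap_or(fb) // second read of the property
    } else {
        fb
    }
}

/// `obj.value ?? fb`: a single read.
pub fn recovered(obj: &Obj, fb: i32) -> i32 {
    obj.value().unwrap_or(fb)
}
```

Both functions return the same result here, but the getter runs twice in the lowered form and once in the recovered form, which matters as soon as the getter has side effects.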

### `stable_builtins`

Global builtins and their methods are not patched between alias capture and later use. Minifiers emit aliases like `const E = TypeError` or `const O = Object` to save bytes; inlining them re-reads the global at the use site.

**Affects:** `SmartInline` builtin/global alias inlining (`standard+`, documented under this assumption).

At `minimal`, captured builtin aliases are preserved.

### Generated temporaries (hard rule, not an assumption)

Compiler-introduced temporaries may be removed only when reference analysis proves they are isolated to the matched pattern. If a temp is read or written outside the pattern, it stays — no level or assumption overrides this.

```js
var _tmp;
const out = (_tmp = obj.value) == null ? fallback : _tmp;
console.log(_tmp);  // temp escapes — do not remove
```
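
The isolation check behind this hard rule can be sketched with byte spans. `Span` and `temp_is_isolated` are hypothetical names for this illustration; the real reference analysis works on bindings in the AST rather than raw offsets.

```rust
/// Byte range of a matched pattern or of one identifier reference (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub lo: u32,
    pub hi: u32,
}

impl Span {
    /// True when `other` lies entirely inside `self`.
    pub fn contains(&self, other: Span) -> bool {
        self.lo <= other.lo && other.hi <= self.hi
    }
}

/// A temp may be removed only if every reference to it sits inside the
/// matched pattern. A single outside read or write keeps the temp.
pub fn temp_is_isolated(pattern: Span, references: &[Span]) -> bool {
    references.iter().all(|reference| pattern.contains(*reference))
}
```

In the JavaScript example above, the `console.log(_tmp)` read falls outside the conditional expression, so the check fails and `_tmp` stays.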

### Dynamic scope limits

Wakaru does not model `eval`, `with`, or host-level observation of generated temporaries (e.g. top-level script `var` leaking to `globalThis`). Rules perform binding/reference analysis within the containing function or module scope. Users of `minimal` who expect strict runtime equivalence should account for this limitation.

## Dead-code elimination (DCE)

`DceMode` controls the optional late cleanup phase (`DeadDecls`, `DeadImports`) near the end of the rule pipeline. `DeadUninitializedDecls` always runs (it removes isolated compiler temps left by optional-chaining/nullish recovery) regardless of DCE mode.

| Mode | Behavior |
|------|----------|
| `Off` | No `DeadDecls` / `DeadImports` pass. All dead code preserved, including transform-induced leftovers. |
| `TransformOnly` | **Delta DCE.** Snapshot pre-pipeline dead declarations and imports; after all rules run, remove only bindings that became dead because of transforms. Pre-existing dead code in the input is preserved. |
| `Full` | Full reachability sweep. Removes all dead declarations and imports, including code that was already dead in the input. |

<ParamField body="--dce" type="flag">
Opt into `DceMode::Full`. Without this flag, the CLI uses `TransformOnly`.
</ParamField>

<ParamField body="dce_mode" type="DceMode">
Core API field on `DecompileOptions`. Default: `DceMode::Off`.
</ParamField>

### Defaults by entry point

| Entry point | Default `dce_mode` | Notes |
|-------------|-------------------|-------|
| CLI (`wakaru input.js`) | `TransformOnly` | Pass `--dce` for `Full`. |
| CLI (`wakaru --unpack`) | `TransformOnly` | Same as single-file. |
| `DecompileOptions::default()` | `Off` | API callers opt in explicitly. Tests use `Off` to snapshot structural restoration separately. |
| WASM `decompile()` | `TransformOnly` | Playground gets transform-induced cleanup by default. |
| WASM `unpack()` | `Off` (via `Default`) | Set `dce_mode` in options if binding from Rust. |

Delta DCE records pre-dead spans at pipeline start when mode is `TransformOnly`:

```js
// Transform-induced dead helper: removed in TransformOnly
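function _classCallCheck(instance, Constructor) {
  if (!(instance instanceof Constructor)) throw new TypeError("Cannot call a class as a function");
}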
class Foo { constructor() { _classCallCheck(this, Foo); } }
// → class Foo { constructor() {} }  (_classCallCheck declaration dropped)

// Pre-existing dead helper: preserved in TransformOnly
function _unusedHelper(x) { return x + 1; }
export const value = 42;
// → _unusedHelper stays unless --dce / DceMode::Full
```

`DeadDecls` runs before `DeadImports` because removing dead helpers can leave import specifiers unreferenced.
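
The three modes reduce to a set computation over "dead before" and "dead after" snapshots. This sketch tracks binding names for readability, whereas the text above describes recording spans; `bindings_to_remove` is a hypothetical function name and the enum is a local mirror of `DceMode`.

```rust
use std::collections::BTreeSet;

/// Local mirror of the three DCE modes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DceMode {
    Off,
    TransformOnly,
    Full,
}

/// Decide which dead bindings to remove.
/// `dead_before`: bindings already unreferenced in the input (snapshot at pipeline start).
/// `dead_after`: bindings unreferenced once all rules have run.
pub fn bindings_to_remove(
    mode: DceMode,
    dead_before: &BTreeSet<String>,
    dead_after: &BTreeSet<String>,
) -> BTreeSet<String> {
    match mode {
        DceMode::Off => BTreeSet::new(),
        // Delta DCE: only what became dead because of transforms.
        DceMode::TransformOnly => dead_after.difference(dead_before).cloned().collect(),
        // Full sweep: everything that is dead now.
        DceMode::Full => dead_after.clone(),
    }
}
```

Applied to the JavaScript example above, `TransformOnly` removes `_classCallCheck` and keeps `_unusedHelper`, while `Full` removes both.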

## Choosing a configuration

<Steps>
<Step title="Pick a rewrite level">
Start with `standard` for readable output. Use `minimal` when behavioral fidelity matters more than readability (auditing, round-trip checks). Use `aggressive` for heavily minified bundles where you mainly want to read the code.
</Step>
<Step title="Decide on dead-code cleanup">
Use CLI defaults (`TransformOnly`) for everyday work — transform leftovers are removed while input dead code stays visible. Add `--dce` when you want a full sweep. API integrators should set `dce_mode` explicitly; default is `Off`.
</Step>
<Step title="Verify output">
Compare against expectations for your level. If loose nullish or optional-chaining recovery surprises you at `standard`, try `minimal`. If output still has unused helpers that were dead before decompilation, that is expected without `--dce`.
</Step>
</Steps>

<CodeGroup>

```bash title="Conservative audit"
wakaru input.js --level minimal -o output.js
```

```bash title="Default decompile"
wakaru input.js -o output.js
# equivalent to --level standard with TransformOnly DCE
```

```bash title="Maximum recovery + full DCE"
wakaru input.js --level aggressive --dce -o output.js
```

```bash title="Unpack with aggressive heuristics"
wakaru bundle.js --unpack --level aggressive -o unpacked/
```

</CodeGroup>

## Reproduce-first policy

New generated-code recovery heuristics should start from a **reproduced compiler, bundler, or minifier shape** — a small input snippet plus the tool and version that produced the lowered code.

Good reproduction sources: Babel, TypeScript, SWC, esbuild, terser, webpack, Rollup, and emitted helper/runtime code from real packages.

A bug report alone does not justify a new heuristic if the producing tool and shape cannot be reproduced. Patterns that look generated but cannot be traced to a known toolchain belong in `aggressive` at most, with a test comment noting the shape is speculative and why reproduction was unavailable.

The repository ships reproduction matrices under `scripts/repro/` (optional/nullish, parameters, for-of, enum, and others) that accept `--level` to compare recovery across levels.

## Rule author checklist

Before adding or widening a rewrite:

1. **Reproduce** the lowered shape from a known toolchain, or place the rewrite in `aggressive` and note it is speculative.
2. **Decide** the lowest level where the rewrite belongs.
3. **Name the assumption** (`no_document_all`, `pure_getters`, `stable_builtins`) in a test name or code comment when the transform is not provable from the AST alone.
4. **Prefer binding/reference proof** over assumptions. A temp proving single evaluation beats relying on `pure_getters`.
5. **Never override concrete observed use** — a temp read outside the matched pattern means the temp stays.

Mixed rules gate subpatterns inside the rule rather than switching the whole rule on or off. Rules that stay enabled at `minimal` despite a heuristic safety label include `UnConditionals`; rules disabled entirely below `standard` include `SmartRename` and several modernization passes listed in the rule inventory.

## Related pages

<CardGroup cols={2}>
<Card title="Decompile pipeline" href="/decompile-pipeline">
How `RewriteLevel` and `DceMode` flow through parse, rule application, and emit — including where `DeadDecls` and `DeadImports` sit in the ordered pipeline.
</Card>
<Card title="Develop transformation rules" href="/develop-rules">
Test-first workflow for adding rules, pipeline placement, and the `unresolved_mark` guard every identifier-matching visitor needs.
</Card>
<Card title="Rule pipeline reference" href="/rule-pipeline-reference">
Ordered `RuleDescriptor` registry with `standard_or_above` gates and per-rule level notes.
</Card>
<Card title="Core API reference" href="/core-api-reference">
`DecompileOptions` fields, `DceMode`, `RewriteLevel`, and `RewriteAssumptions` exports from `wakaru-core`.
</Card>
<Card title="CLI reference" href="/cli-reference">
Complete `--level` and `--dce` flag documentation alongside unpack modes and diagnostics.
</Card>
<Card title="Trace the rule pipeline" href="/trace-rule-pipeline">
Bisect level-related regressions with per-rule diffs and `--from`/`--until` ranges.
</Card>
</CardGroup>

---

## 07. Helper detection

> How transpiler helpers (Babel, TypeScript/tslib, SWC) are matched by AST body shape across imported, inlined, hoisted, and minified forms; MatchContext and helper lifecycle layers.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/07-helper-detection.md
- Generated: 2026-06-28T01:06:51.552Z

### Source Files

- `docs/helper-detection.md`
- `docs/learnings/helper-detection-pattern-engine.md`
- `crates/core/src/rules/helper_matcher.rs`
- `crates/core/src/rules/match_context.rs`
- `crates/core/src/rules/transpiler_helper_utils/mod.rs`

---
title: "Helper detection"
description: "How transpiler helpers (Babel, TypeScript/tslib, SWC) are matched by AST body shape across imported, inlined, hoisted, and minified forms; MatchContext and helper lifecycle layers."
---

Wakaru recovers transpiler runtime helpers by matching **AST body shape** and **import paths**, not function names. Detection runs in `transpiler_helper_utils` (`collect_transpiler_helpers`, `LocalHelperContext`), binding identity is tracked via `MatchContext` and `helper_matcher.rs`, and per-helper restoration rules consume the cached context during the `Helpers` pipeline stage.

## Four helper forms in bundled output

Transpilers (Babel, TypeScript/tslib, SWC) inject runtime helpers that appear in bundled JavaScript in four shapes:

| Form | Example | Detection strategy |
|---|---|---|
| **Imported** | `require("@babel/runtime/helpers/interopRequireDefault")` | Import path in `paths.rs` maps to `TranspilerHelperKind` |
| **Inlined** | `function _x(obj) { return obj && obj.__esModule ? obj : { default: obj }; }` at module top | Body-shape matcher (`fn(&Function) -> bool`) |
| **Hoisted** | Shared webpack module accessed via `require(42)`; name lost | Body shape after unpack; cross-module facts link numeric refs |
| **Minified** | `function(e){return e&&e.__esModule?e:{default:e}}` | Same body-shape matchers; names ignored, `SyntaxContext` preserved |

<Note>
esbuild bundler helpers (`__commonJS`, `__esm`, `__toESM`, `__toCommonJS`) are handled in the unpacker, not by transpiler helper detection.
</Note>

## Three-layer architecture

Helper recovery splits into three intentional layers — more structure than scattered tuple checks, without a general AST pattern DSL.

```mermaid
flowchart TB
  subgraph detect ["Detection"]
    COLLECT["collect_transpiler_helpers / LocalHelperContext"]
    MATCHERS["Rule-local body-shape matchers"]
    PATHS["paths.rs import-path tables"]
  end

  subgraph identity ["Binding identity"]
    MC["MatchContext — named binding slots"]
    HM["helper_matcher.rs — BindingKey primitives"]
  end

  subgraph lifecycle ["Lifecycle"]
    LC["lifecycle.rs — ref tracking, dependency graph"]
    RESTORE["Per-helper VisitMut rules"]
  end

  PATHS --> COLLECT
  MATCHERS --> COLLECT
  MC --> MATCHERS
  HM --> MATCHERS
  HM --> LC
  COLLECT --> RESTORE
  LC --> RESTORE
```

### Layer 1: `MatchContext` (binding-aware matching)

`MatchContext` (`match_context.rs`) maps function parameters and discovered locals to **named slots**, then checks binding identity with `SyntaxContext`:

<ResponseField name="from_params" type="fn(&Function, &[&str]) -> Option<MatchContext>">
Extracts simple-ident params into named slots. Returns `None` if param count or shape does not match.
</ResponseField>

<ResponseField name="is_binding" type="fn(&Expr, &str) -> bool">
True when `expr` is an identifier matching the slot's `(sym, ctxt)`.
</ResponseField>

<ResponseField name="is_member_of" type="fn(&Expr, &str, &str) -> bool">
True when `expr` is `<slot>.prop_name` (supports ident and string-literal computed props).
</ResponseField>

Use `MatchContext` when several identifiers must refer to the same binding — e.g. `_classCallCheck`, `_inherits`, `_objectWithoutProperties`. Do **not** use it as a full pattern engine; surrounding matchers remain ordinary Rust over SWC nodes.
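
The slot idea can be shown on a toy AST. Everything below is a simplified stand-in: the real `MatchContext` works on SWC `Function` and `Expr` nodes, whereas this sketch uses a two-variant `Expr` enum, plain `(String, u32)` parameters, and a `u32` in place of `SyntaxContext`.

```rust
use std::collections::HashMap;

/// Stand-in for SWC's `SyntaxContext`.
pub type SyntaxContext = u32;

/// Toy expression type: identifiers and `obj.prop` member reads only.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Ident { sym: String, ctxt: SyntaxContext },
    Member { obj: Box<Expr>, prop: String },
}

/// Named slots mapped to concrete `(sym, ctxt)` bindings.
pub struct MatchContext {
    slots: HashMap<String, (String, SyntaxContext)>,
}

impl MatchContext {
    /// Bind each parameter to a named slot; `None` if the counts differ.
    pub fn from_params(params: &[(String, SyntaxContext)], names: &[&str]) -> Option<Self> {
        if params.len() != names.len() {
            return None;
        }
        let slots = names
            .iter()
            .map(|name| name.to_string())
            .zip(params.iter().cloned())
            .collect();
        Some(Self { slots })
    }

    /// True when `expr` is an identifier with the slot's symbol and context.
    pub fn is_binding(&self, expr: &Expr, slot: &str) -> bool {
        match (expr, self.slots.get(slot)) {
            (Expr::Ident { sym, ctxt }, Some((s, c))) => sym == s && ctxt == c,
            _ => false,
        }
    }

    /// True when `expr` is `<slot>.prop_name`.
    pub fn is_member_of(&self, expr: &Expr, slot: &str, prop_name: &str) -> bool {
        match expr {
            Expr::Member { obj, prop } => prop == prop_name && self.is_binding(obj, slot),
            _ => false,
        }
    }
}
```

A matcher can then ask "is this `obj.__esModule`?" without caring that the minifier renamed `obj` to `e`.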

### Layer 2: `helper_matcher.rs` (binding lifecycle)

Shared low-level primitives for scope-sensitive helper work:

| Primitive | Purpose |
|---|---|
| `BindingKey` = `(Atom, SyntaxContext)` | Unique binding identity |
| `ident_matches_binding` / `expr_matches_binding` | Binding-safe identifier checks |
| `member_of_binding` | `key.prop` member access |
| `remaining_refs_outside_*` | Reference counting, skipping helper declarations |
| `remove_fn_decls_by_binding` / `remove_var_declarators_by_binding` | Declaration cleanup after rewrite |

Every binding match checks **both** symbol and `SyntaxContext`. Matching by `sym` alone can hit an unrelated inner-scope local that happens to share the helper's name.
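
A short sketch of the difference, using stand-in types: `String` replaces SWC's `Atom`, `u32` replaces `SyntaxContext`, and the function signatures are assumptions rather than the ones in `helper_matcher.rs`. `ident_matches_sym_only` exists only for contrast.

```rust
/// Stand-ins for the SWC types behind the real `BindingKey`.
pub type SyntaxContext = u32;
pub type BindingKey = (String, SyntaxContext);

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Ident {
    pub sym: String,
    pub ctxt: SyntaxContext,
}

/// Binding-safe check: both the symbol and the syntax context must match.
pub fn ident_matches_binding(ident: &Ident, key: &BindingKey) -> bool {
    ident.sym == key.0 && ident.ctxt == key.1
}

/// Name-only comparison, shown for contrast; it cannot tell bindings apart.
pub fn ident_matches_sym_only(ident: &Ident, key: &BindingKey) -> bool {
    ident.sym == key.0
}
```

For a module-level helper `e` and an inner parameter also named `e`, the name-only check matches both while the binding-safe check matches only the helper.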

### Layer 3: Rule-local matching

Each helper kind owns its semantic shape recognition. The central scanner (`detect_helper_from_fn`) dispatches to per-helper predicates; some rules keep **stateful** or **marker-based** detection locally:

| Rule module | Detection style |
|---|---|
| `transpiler_helper_utils/matchers.rs` | Babel/SWC body shapes + import paths |
| `un_typeof_polyfill.rs` | TypeScript `typeof Symbol.iterator` polyfill |
| `un_to_consumable_array.rs` | TypeScript `__spreadArray` |
| `un_template_literal.rs` | Tagged-template cache factories (esbuild aliases globals) |
| `un_webpack_interop.rs` | `require.n`, `require.t`, `require.o` |
| `un_object_spread.rs` | esbuild `__spreadValues` / `__spreadProps` (module-wide alias state) |

<Warning>
The central scanner's matchers are `fn(&Function) -> bool` and must not depend on bundler-specific module state. esbuild object-spread detection stays in `un_object_spread.rs` by design.
</Warning>

## Detection scan

`collect_transpiler_helpers()` walks module-level declarations and returns `HashMap<BindingKey, TranspilerHelperKind>`:

```text
scan module-level declarations
  → for each function body, run shape matchers
  → for each Babel runtime import, map path → TranspilerHelperKind
  → for each tslib import/require alias, map raw TsHelperKind
  → collect (BindingKey, TranspilerHelperKind) pairs
```

### Scan targets

`collect.rs` inspects:

- `function` declarations and `export function`
- `var x = function…` / arrow assignments
- `var x = require("@babel/runtime/helpers/…")` and `.default` chains
- `import … from "@babel/runtime/helpers/…"` and `tslib` named imports
- `Object.assign || function(target)…` extends polyfill forms

### Babel sub-helper gating

Babel 7+ uses thin OR-chain dispatchers (`return f(x) || g(x) || h(x)`). The scanner only accepts these when `module_has_babel_sub_helper_signals` finds `Array.isArray`, `Array.from`, or `Symbol.iterator` elsewhere in the module — preventing false positives on unrelated OR chains.

### `LocalHelperContext`

`LocalHelperContext` extends the scan with TypeScript/tslib state:

<ResponseField name="helpers" type="HashMap<BindingKey, TranspilerHelperKind>">
Babel/SWC semantic helpers from body shape and import paths.
</ResponseField>

<ResponseField name="ts_helpers" type="HashMap<BindingKey, TsHelperInfo>">
Raw TypeScript helpers (`__awaiter`, `__generator`, `__spreadArray`, …) tracked separately from semantic kinds.
</ResponseField>

<ResponseField name="tslib_namespaces" type="HashSet<BindingKey>">
Bindings that namespace-import or require `tslib`.
</ResponseField>

Pipeline consumers call `RuleRunContext::local_helpers()`, which **lazily builds** one `LocalHelperContext` per module (with `unresolved_mark`) and reuses it across helper rules in the same pass. Direct rule tests build context themselves.

Utility methods:

| Method | Behavior |
|---|---|
| `helpers_of_kind(kind)` | Filter helpers by `TranspilerHelperKind` |
| `is_helper_callee(expr, kind)` | Match local binding, tslib namespace member, or `require("tslib").helper` |
| `remove_helpers_with_dependencies` | Remove helper + transitive `HelperDependency` bindings when unreferenced |
| `remove_unused_inline_ts_helpers` | Drop inlined TS helpers no longer referenced |

## Matching strategies

### Body-shape predicates

Shape matchers are plain `fn(&Function) -> bool` functions. They check essential structural elements and ignore variable names.

**`interopRequireDefault`** — single param, `__esModule` test, returns `{default: obj}`:

```javascript
// Babel, SWC, minified — same shape, different names
function _interopRequireDefault(obj) {
  return obj && obj.__esModule ? obj : { default: obj };
}
```

Matcher uses `MatchContext::from_params(func, &["obj"])` and accepts ternary-return or if/return forms. Inline IIFEs are classified via `classify_inline_helper_call`.

**Marker accumulation** — complex helpers scan the body for signals anywhere, not at fixed positions:

| Helper | Key signals |
|---|---|
| `toConsumableArray` | `Array.isArray` + `Array.from` |
| `slicedToArray` | `Symbol.iterator` + `Array.isArray`, or OR-chain with sub-helpers |
| `extends` | `Object.assign` + `.apply(this, arguments)` |
| `objectSpread` | `arguments` ref + `Object.defineProperty` or descriptor APIs |
| `interopRequireWildcard` | `__esModule` + for-in or `Object.keys`/`getOwnPropertyDescriptor` |

`scan_stmts_for_markers` and `BodyMarkerState` implement this accumulation pattern.
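
Marker accumulation can be illustrated on a flattened list of member accesses. This is a simplified sketch: the fields of `BodyMarkerState`, the `HelperGuess` enum, and the precedence between the two classifications are assumptions, and the real scanner walks SWC statements rather than strings.

```rust
/// Signals accumulated while scanning a helper body (hypothetical subset).
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct BodyMarkerState {
    pub array_is_array: bool,
    pub array_from: bool,
    pub symbol_iterator: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum HelperGuess {
    ToConsumableArray,
    SlicedToArray,
}

/// Record signals wherever they occur; position in the body does not matter.
pub fn scan_for_markers(accesses: &[&str]) -> BodyMarkerState {
    let mut state = BodyMarkerState::default();
    for access in accesses {
        match *access {
            "Array.isArray" => state.array_is_array = true,
            "Array.from" => state.array_from = true,
            "Symbol.iterator" => state.symbol_iterator = true,
            _ => {}
        }
    }
    state
}

/// Classify from the accumulated signals, following the table above.
/// Checking `SlicedToArray` first is an assumption of this sketch.
pub fn classify(state: BodyMarkerState) -> Option<HelperGuess> {
    if state.symbol_iterator && state.array_is_array {
        Some(HelperGuess::SlicedToArray)
    } else if state.array_is_array && state.array_from {
        Some(HelperGuess::ToConsumableArray)
    } else {
        None
    }
}
```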

**Tagged template literal** — signal-based matching on a 2-param function:

| Variant | Required signals |
|---|---|
| Babel spec | `slice_copy` + `freeze_define_raw` |
| Babel loose | `slice_copy` + `raw_assignment` |
| TypeScript | `define_property_raw` |

### Import-path matching

`paths.rs` maps known runtime package paths to helper kinds. Babel and SWC paths are unified per kind:

```javascript
// Both resolve to TranspilerHelperKind::InteropRequireDefault
"@babel/runtime/helpers/interopRequireDefault"
"@swc/helpers/_/_interop_require_default"
```
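
A path table of this kind can be as simple as a `match`. The sketch below includes only the two paths shown above and a single enum variant; `helper_kind_for_path` is a hypothetical function name, not the lookup exported by `paths.rs`.

```rust
/// Single-variant mirror of the helper kind enum, enough for this example.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TranspilerHelperKind {
    InteropRequireDefault,
}

/// Map a runtime import path to a helper kind; unknown paths yield `None`.
pub fn helper_kind_for_path(path: &str) -> Option<TranspilerHelperKind> {
    match path {
        // Babel and SWC spell the path differently but mean the same helper.
        "@babel/runtime/helpers/interopRequireDefault"
        | "@swc/helpers/_/_interop_require_default" => {
            Some(TranspilerHelperKind::InteropRequireDefault)
        }
        _ => None,
    }
}
```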

### Generated-name fallback

When body shape is ambiguous, SWC/esbuild generated names provide a secondary signal — e.g. `_object_spread`, `__spreadValues`, `__objRest` map to `ObjectSpread` / `ObjectWithoutProperties` when the init is a function.

### TypeScript/tslib channel

`ts_helpers.rs` tracks raw `TsHelperKind` values separately. Rules like `UnAsyncAwait` match detected `__awaiter` / `__generator` aliases directly rather than renaming to canonical globals. `is_helper_callee` also resolves `tslib` namespace members and `require("tslib").__awaiter` patterns.

## `TranspilerHelperKind` coverage

| `TranspilerHelperKind` | Babel | tslib | SWC | Restoration rule |
|---|---|---|---|---|
| `InteropRequireDefault` | `_interopRequireDefault` | — | `_interop_require_default` | `UnInteropRequireDefault` |
| `InteropRequireWildcard` | `_interopRequireWildcard` | — | `_interop_require_wildcard` | `UnInteropRequireWildcard` |
| `ToConsumableArray` | `_toConsumableArray` | `__spreadArray` | `_to_consumable_array` | `UnToConsumableArray` |
| `Extends` | `_extends` | `__assign` | `_extends` | (inline in spread/rest rules) |
| `SlicedToArray` | `_slicedToArray` | `__read` | `_sliced_to_array` | `UnSlicedToArray` |
| `ObjectSpread` | `_objectSpread(2)` | — | `_object_spread(_props)` | `UnObjectSpread` |
| `ObjectWithoutProperties` | `_objectWithoutProperties` | `__rest` | `_object_without_properties` | `UnObjectRest` |
| `ClassCallCheck` | `_classCallCheck` | — | `_class_call_check` | `UnClassCallCheck` |
| `AsyncToGenerator` | `_asyncToGenerator` | `__awaiter`+`__generator` | `_async_to_generator` | `un_async_await.rs` |
| `TaggedTemplateLiteral` | `_taggedTemplateLiteral` | — | `_tagged_template_literal` | `UnTemplateLiteral` |
| `HelperDependency` | `_defineProperty`, `ownKeys`, … | — | sub-helpers | Removed with parent helper |

Kinds without a dedicated rewrite rule (`DefineProperty`, `Typeof`, `HelperDependency`) still participate in `is_helper_module` detection for cross-module fact collection.

## Pipeline placement

Helper detection and restoration run in the **`Helpers` stage** of `apply_rules()`, after **Syntax** normalization. Syntax-stage (Stage 1) rules such as `UnIndirectCall` and `UnBracketNotation` must run first so that patterns such as `(0, x.default)()` and `["default"]` are normalized before the matchers run.

```text
Syntax stage          Helpers stage                    Structural stage
─────────────────     ─────────────────────────────    ─────────────────
UnIndirectCall    →   UnInteropRequireDefault      →   UnTemplateLiteral
UnBracketNotation →   UnToConsumableArray          →   UnAsyncAwait
                      UnObjectSpread / Rest
                      UnSlicedToArray
                      … UnEsm (requires webpack interop)
                      UnObjectSpread2 / Rest2 / SlicedToArray2
```

`UnInteropRequireDefault` explicitly requires `UnIndirectCall` and `UnBracketNotation`. Late helper passes (`UnObjectSpread2`, `UnObjectRest2`, `UnSlicedToArray2`) run after `UnEsm` converts `require()` to `import`.

## Restoration flow

Each helper kind has a dedicated `VisitMut` rule. The typical cycle:

<Steps>
<Step title="Detect bindings">
`LocalHelperContext` identifies helper `(BindingKey, kind)` pairs.
</Step>
<Step title="Rewrite call sites">
The rule rewrites helper invocations to idiomatic syntax, e.g. `_interopRequireDefault(require("a"))` → `require("a")`, then `.default` → direct reference.
</Step>
<Step title="Remove declarations">
`remove_helpers_with_dependencies` drops helper functions and transitive `HelperDependency` bindings when no external references remain.
</Step>
</Steps>

Reference tracking uses `remaining_refs_outside_declarations` to exclude the helper's own binding and self-references from the "still in use" count.

## Cross-module helper facts

`collect_module_facts()` records two helper export channels and a module-level flag after Stage 2:

| Field | Content |
|---|---|
| `helper_exports` | Semantic helpers as public `HelperKind` |
| `ts_helper_exports` | Raw TypeScript helpers as `TypeScriptHelperKind` |
| `is_helper_module` | True when module exports any recognized helper (including dependency-only kinds) |

Phase 2 rules (e.g. `UnSlicedToArray` with `ModuleFactsMap`) resolve cross-module helper refs when a consumer imports from a hoisted helper module. See [Cross-module facts](/cross-module-facts) for the two-phase unpack barrier.

## Version drift and relaxed matching

Bundled code often strips version markers; inlined helpers erase them entirely. Wakaru uses **relaxed matching** — check essential semantic structure, not exact AST equality.

Tolerated variation includes:

- Ternary vs if/else conditional forms
- `.default` vs `["default"]` property access (after `UnBracketNotation`)
- Extra `Object.defineProperty` for non-configurable exports
- Added null checks

If a future helper version fundamentally changes semantics, it needs a new matcher — not a version gate.

## What not to build

<Info>
A corpus matcher, ast-grep-style DSL, and skeleton-pattern engine were prototyped, measured against real bundles, and **reverted**. ~93% of detection is marker-based or stateful and cannot be expressed as fixed patterns; the migratable remainder (~10–14 of ~209 functions) is too small for a shared engine to pay off.
</Info>

Rejected approaches:

| Approach | Why rejected |
|---|---|
| Custom IR layer | SWC AST is already high-level; second IR duplicates cost |
| CFG hashing | Minifier changes scramble naive hashes; canonicalization is the hard part |
| Version auto-detect | Version strings stripped in real bundles |
| Configurable pass graphs | Current fixed-order pipeline is intentional |

Real line savings come from **targeted deduplication** (e.g. consolidating inline-vs-declaration detection), not a generic matcher framework.

## Add a new helper matcher

<Steps>
<Step title="Write a failing test">
Add cases to `crates/core/tests/` or `transpiler_helper_utils/tests.rs` with the exact input AST shape.
</Step>
<Step title="Implement the body-shape predicate">
Add `is_my_helper_fn(func: &Function) -> bool` in `matchers.rs`. Use `MatchContext::from_params` when multiple params must correlate. Use `scan_stmts_for_markers` when signals appear anywhere in the body.
</Step>
<Step title="Wire detection">
Register in `detect_helper_from_fn` and, if applicable, `detect_helper_from_path` path tables.
</Step>
<Step title="Add restoration rule">
Create a `VisitMut` rule and register it in `pipeline.rs` under the `Helpers` stage at the correct dependency position.
</Step>
<Step title="Verify">
Run focused rule tests, then pipeline snapshots (`noop_pipeline`, webpack4/esbuild unpack tests).
</Step>
</Steps>

<Check>
Every binding match must gate on `SyntaxContext`. Use `BindingRenamer` for renames — never rename by `sym` alone.
</Check>

## Related pages

<CardGroup>
<Card title="Decompile pipeline" href="/decompile-pipeline">
Stage ordering, `unresolved_mark` scope gating, and where helper rules sit relative to syntax normalization.
</Card>
<Card title="Cross-module facts" href="/cross-module-facts">
Two-phase unpack barrier, `ModuleFactsMap`, and hoisted helper module resolution.
</Card>
<Card title="Rule pipeline reference" href="/rule-pipeline-reference">
Full `RuleDescriptor` registry with `Helpers` stage entries and cross-rule dependencies.
</Card>
<Card title="Develop transformation rules" href="/develop-rules">
Test-first workflow, pipeline placement, and definition-of-done checklist for new rules.
</Card>
<Card title="Debug regressions" href="/debugging-regressions">
Bisect helper-rule regressions with `--trace-rules` and snapshot diff workflow.
</Card>
</CardGroup>

---

## 08. Cross-module facts

> Two-phase unpack barrier: Phase 1 fact collection after UnEsm, ModuleFactsMap shape, and Phase 2 rules (namespace_decomposition, cross-module helper refs) that read other modules' import/export facts.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/08-cross-module-facts.md
- Generated: 2026-06-28T01:07:36.951Z

### Source Files

- `docs/fact-system.md`
- `crates/core/src/facts.rs`
- `crates/core/src/driver/unpack.rs`
- `crates/core/src/namespace_decomposition.rs`
- `crates/core/src/rules/cross_module_helper_refs.rs`

---
title: "Cross-module facts"
description: "Two-phase unpack barrier: Phase 1 fact collection after UnEsm, ModuleFactsMap shape, and Phase 2 rules (namespace_decomposition, cross-module helper refs) that read other modules' import/export facts."
---

Multi-module `unpack()` runs a two-phase pipeline with a cross-module barrier: Phase 1 normalizes each extracted module through `UnEsm`, extracts import/export facts, and assembles a shared `ModuleFactsMap`; Phase 2 re-runs the through-`UnEsm` range, applies barrier passes that read other modules' facts, then continues with fact-aware helper rules. Single-file `decompile()` does not use this system.

<Info>
Facts are read-only snapshots derived from the post-Stage-2 AST. Rules mutate only the current module's AST — they never write back to `ModuleFactsMap`.
</Info>

## When facts apply

| Mode | Uses `ModuleFactsMap` |
|------|----------------------|
| `unpack()` / `unpack_files()` | Yes — all modules share one map at the barrier |
| `decompile()` (single file) | No — `RulePipelineOptions::module_facts` stays `None` |
| `unpack_raw()` | No — raw extraction skips the rule pipeline |

Facts exist because transpiler helpers and namespace imports are often split into separate bundle modules. A consumer module's rewrite (for example `import h from "./helpers"; h.default(...)`) requires proof that the target module exports a known helper shape.

## Two-phase barrier

`unpack_multi_module_with_plan` in `crates/core/src/driver/unpack/phases.rs` orchestrates both phases. Each phase processes modules in parallel via Rayon.

```mermaid
flowchart TB
    subgraph phase1 ["Phase 1 — per module, parallel"]
        P1A[parse + resolver]
        P1B[rules until UnEsm]
        P1C[recover_late_esm_from_factory_iifes]
        P1D["collect_module_facts(module)"]
        P1A --> P1B --> P1C --> P1D
    end

    subgraph barrier ["Cross-module barrier"]
        MAP["ModuleFactsMap::insert(filename, facts)"]
    end

    subgraph phase2 ["Phase 2 — per module, parallel"]
        P2A[resume or re-parse + rules until UnEsm]
        P2B[run_reexport_consolidation]
        P2C[run_namespace_decomposition]
        P2D["rules UnObjectSpread2 → UnReturn<br/>with_module_facts(map)"]
        P2E[late cleanup + emit]
        P2A --> P2B --> P2C --> P2D --> P2E
    end

    phase1 --> barrier --> phase2
```

### Why the through-`UnEsm` range runs twice

SWC's `SyntaxContext` must remain continuous within the emitted module pipeline. Phase 1 may discard its AST (source-map mode always re-parses; the no-sourcemap path can reuse the Phase 1 AST when `can_reuse_phase1_ast` is true). Either way, Phase 2 needs a fresh or carefully resumed AST so ctxt-sensitive rules (`UnImportRename`, `BindingRenamer`, and others) see consistent binding identities.

<Note>
Phase 1 clones the module before `recover_late_esm_from_factory_iifes` when the AST will be reused, so fact extraction sees recovered ESM shapes while Phase 2 resumes from the pre-recovery through-`UnEsm` state and runs its own recovery at `options.level`.
</Note>

### Phase 1 failure semantics

If a module fails to parse during fact collection, Wakaru records an `UnpackWarning` with kind `FactCollectionParseFailed` and inserts empty facts for that module. The unpack continues best-effort rather than aborting the entire bundle.
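The best-effort behavior can be pictured with a minimal, sequential sketch. Everything in it is hypothetical: `FactsSketch`, `WarningSketch`, and `collect_all_facts` are stand-ins invented for illustration, the parser is passed in as a closure, and the real driver runs this step in parallel rather than in a plain loop.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `ModuleFacts`; `Default` plays the role of "empty facts".
#[derive(Debug, Default, PartialEq)]
pub struct FactsSketch {
    pub exports: Vec<String>,
}

/// Hypothetical stand-in for an `UnpackWarning`.
#[derive(Debug, PartialEq)]
pub struct WarningSketch {
    pub filename: String,
    pub kind: &'static str,
}

/// Collect facts for every module. A parse failure records a warning and
/// inserts empty facts instead of aborting the whole run.
pub fn collect_all_facts<F>(
    modules: &[(&str, &str)],
    parse_and_collect: F,
) -> (HashMap<String, FactsSketch>, Vec<WarningSketch>)
where
    F: Fn(&str) -> Result<FactsSketch, String>,
{
    let mut map = HashMap::new();
    let mut warnings = Vec::new();
    for &(filename, source) in modules {
        let facts = match parse_and_collect(source) {
            Ok(facts) => facts,
            Err(_) => {
                warnings.push(WarningSketch {
                    filename: filename.to_string(),
                    kind: "FactCollectionParseFailed",
                });
                FactsSketch::default()
            }
        };
        map.insert(filename.to_string(), facts);
    }
    (map, warnings)
}
```

The point of the design is that every module ends up with an entry in the map, so Phase 2 lookups for a broken module return empty facts rather than failing.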

## `ModuleFacts` shape

`collect_module_facts` in `crates/core/src/facts.rs` is a pure function over the post-Stage-2 AST. Call it immediately after `UnEsm` and ESM recovery, before later rules mutate import/export structure.

<ResponseField name="imports" type="Vec<ImportFact>">
Each `ImportFact` records `local`, `source`, and `kind` (`Default`, `Namespace`, or `Named(imported)`).
</ResponseField>

<ResponseField name="exports" type="Vec<ExportFact>">
Each `ExportFact` records `exported`, optional `local`, and `kind` (`Default` or `Named`). The exported name `"default"` marks default exports.
</ResponseField>

<ResponseField name="helper_exports" type="Vec<HelperExportFact>">
Transpiler helper identity proven from exported binding body shape. Kinds include `Extends`, `ObjectSpread`, `AsyncToGenerator`, `InteropRequireDefault`, and others defined by `HelperKind`.
</ResponseField>

<ResponseField name="default_object_helper_exports" type="Vec<HelperExportFact>">
Helpers exported as properties on a default-export object literal (`export default { extends: _extends, ... }`).
</ResponseField>

<ResponseField name="ts_helper_exports" type="Vec<TypeScriptHelperExportFact>">
Raw TypeScript/tslib helper identity (`__awaiter`, `__generator`, `__spreadArray`, etc.) proven from export shape or tslib registrar patterns.
</ResponseField>

<ResponseField name="passthrough_target" type="Option<Atom>">
Set when the module is a pure passthrough: body is exactly `export default require("./X.js")` with no other statements. Importers can be redirected to `./X.js`.
</ResponseField>

<ResponseField name="is_helper_module" type="bool">
True when the module exports any recognized transpiler helper, including kinds with no rewrite mapping (for example `_defineProperty`). Used by dead helper-module elimination.
</ResponseField>

Helper export facts are conservative: they record identity only when the exported local binding matches a known helper body or runtime export shape. They do not speculate from consumer-side usage patterns.

## `ModuleFactsMap` lookup

`ModuleFactsMap` stores facts keyed by **canonical module filename** (the unpacked output path, not the import specifier string alone).

| Operation | Behavior |
|-----------|----------|
| `insert(key, facts)` | Normalizes `key` by stripping leading `./` |
| `get(specifier)` | Tries canonical form, then common extension variants (`.js`, `.jsx`), then extension-stripped forms |

This handles specifier variants like `./lib/foo.js`, `lib/foo.js`, and `lib/foo` resolving to the same module.
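A rough sketch of this normalize-then-fall-back lookup is shown here. It is not the actual `ModuleFactsMap` code: `FactsMapSketch`, `canonical`, and `strip_ext` are hypothetical names, the value type is generic, and the exact order of fallbacks is an assumption based on the table above.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `ModuleFactsMap`; `V` replaces `ModuleFacts`.
pub struct FactsMapSketch<V> {
    inner: HashMap<String, V>,
}

/// Strip leading `./` so `./lib/foo.js` and `lib/foo.js` share one key.
fn canonical(key: &str) -> &str {
    let mut k = key;
    while let Some(rest) = k.strip_prefix("./") {
        k = rest;
    }
    k
}

/// Drop a trailing `.js` or `.jsx` extension if present.
fn strip_ext(key: &str) -> &str {
    key.strip_suffix(".jsx")
        .or_else(|| key.strip_suffix(".js"))
        .unwrap_or(key)
}

impl<V> FactsMapSketch<V> {
    pub fn new() -> Self {
        Self { inner: HashMap::new() }
    }

    pub fn insert(&mut self, key: &str, facts: V) {
        self.inner.insert(canonical(key).to_string(), facts);
    }

    pub fn get(&self, specifier: &str) -> Option<&V> {
        let key = canonical(specifier);
        // 1. Canonical form exactly as written.
        if let Some(v) = self.inner.get(key) {
            return Some(v);
        }
        // 2. Common extension variants appended.
        for ext in &[".js", ".jsx"] {
            let candidate = format!("{}{}", key, ext);
            if let Some(v) = self.inner.get(&candidate) {
                return Some(v);
            }
        }
        // 3. Extension-stripped form.
        let bare = strip_ext(key);
        if bare != key {
            return self.inner.get(bare);
        }
        None
    }
}
```

With this sketch, a module inserted as `./lib/foo.js` is found by `get("lib/foo.js")` and by `get("./lib/foo")`.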

<Warning>
Filename recovery (`build_rename_map`) runs at the barrier but is kept separate from the fact map. Fact-driven passes operate on provisional filenames; only the final emit step rewrites import sources to recovered names.
</Warning>

## Phase 2 barrier passes

These free functions take `(&mut Module, &ModuleFactsMap)` and run sequentially before the fact-aware rule range.

### `run_reexport_consolidation`

Redirects imports from passthrough modules to their actual target. When a default import from a passthrough is used only via member access, the import source is rewritten:

```
import x from "./passthrough.js"  →  import * as x from "./target.js"
```

`resolve_passthrough` follows chains of `passthrough_target` facts transitively, detecting cycles.
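Transitive resolution with cycle detection can be sketched as below. This is an illustration only: `resolve_passthrough_sketch` is a hypothetical function, the facts map is reduced to a plain `module -> passthrough target` table, and returning `None` on a cycle is an assumption about the desired behavior.

```rust
use std::collections::{HashMap, HashSet};

/// Follow passthrough links from `start` until reaching a module that is not
/// itself a passthrough. Returns `None` when `start` is not a passthrough or
/// when the chain loops back on itself.
pub fn resolve_passthrough_sketch(
    targets: &HashMap<String, String>,
    start: &str,
) -> Option<String> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut current: &str = start;
    seen.insert(current);
    while let Some(next) = targets.get(current) {
        // `insert` returns false when the module was already visited: a cycle.
        if !seen.insert(next.as_str()) {
            return None;
        }
        current = next.as_str();
    }
    if current == start {
        None
    } else {
        Some(current.to_string())
    }
}
```

The visited set bounds the walk by the number of modules, so a malformed bundle with circular re-exports cannot hang the pass.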

### `run_namespace_decomposition`

Rewrites namespace-like imports into named imports when usage is property-access only and the target module exports those names:

```
import r from "./x"; r.foo()  →  import { foo } from "./x"; foo()
```

Safety checks include inner-scope shadowing, mixed default+named imports on the same declaration, JSX intrinsic vs component distinction, and readability backoff when too many collision aliases would be needed. Reused pre-existing named specifiers propagate their real `SyntaxContext` so downstream `(sym, ctxt)` passes match rewritten usages.
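The core eligibility test (property-access only, every accessed name exported) can be sketched in a few lines. The types here are hypothetical and greatly simplified: `Usage` and `decomposable_names` are invented for illustration, and the shadowing, JSX, and alias-collision checks listed above are deliberately left out.

```rust
use std::collections::HashSet;

/// Hypothetical summary of how a namespace-like binding `r` is used.
pub enum Usage<'a> {
    /// `r.foo` read or call.
    Member(&'a str),
    /// Bare `r` reference.
    Bare,
    /// `r[expr]` computed access.
    Computed,
    /// `r.foo = value` assignment.
    MemberAssign(&'a str),
}

/// Return the names to import when every usage is a plain member access to a
/// named export of the target module; otherwise return `None` (no rewrite).
pub fn decomposable_names<'a>(
    usages: &[Usage<'a>],
    named_exports: &HashSet<String>,
) -> Option<Vec<&'a str>> {
    let mut names: Vec<&'a str> = Vec::new();
    for usage in usages {
        match usage {
            Usage::Member(prop) if named_exports.contains(*prop) => {
                if !names.contains(prop) {
                    names.push(*prop);
                }
            }
            // Bare, computed, assigned, or unknown-property usage blocks the rewrite.
            _ => return None,
        }
    }
    if names.is_empty() {
        None
    } else {
        Some(names)
    }
}
```

A single disqualifying usage rejects the whole import, which mirrors the all-or-nothing nature of replacing one binding with several named specifiers.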

## Fact-aware pipeline rules

After the barrier passes, Phase 2 runs:

```rust
RulePipelineOptions::between("UnObjectSpread2", "UnReturn")
    .with_module_facts(facts_ref)
```

`RulePipelineOptions::with_module_facts` threads the map into `RuleRunContext`. Rules that accept facts use optional `new_with_facts` constructors; single-file `decompile()` keeps the normal constructors with `module_facts: None`.

### `collect_cross_module_helper_refs`

`crates/core/src/rules/cross_module_helper_refs.rs` bridges consumer import specifiers to producer `helper_exports` / `default_object_helper_exports` / `ts_helper_exports`:

| Import shape | Lookup |
|--------------|--------|
| Default | `helper_exports` where `exported == "default"`, plus `default_object_helper_exports` as a namespace map |
| Named | `helper_exports` matched by exported name |
| Namespace | All helper exports chained into a member-access namespace map |

Returns `CrossModuleHelperRefs { direct, namespaces }` keyed by `(sym, ctxt)` binding keys.

### Rules that consume cross-module facts

| Rule | Cross-module behavior |
|------|----------------------|
| `UnObjectSpread` / `UnObjectSpread2` | Recognizes `extends` / `objectSpread` helpers imported from a separate helper module |
| `UnObjectRest` / `UnObjectRest2` | Same pattern for object-rest helpers |
| `UnSlicedToArray2` | Cross-module `slicedToArray` / tslib `__read` refs |
| `UnTemplateLiteral` | Cross-module `taggedTemplateLiteral` refs |
| `UnRegenerator` | Proves `AsyncToGenerator` default export on imported helper modules (including interop `require()` aliases) |

<Note>
Cross-module helper recognition does not remove the consumer import when helper identity alone cannot prove the helper module is side-effect-free. `DeadImports` may downgrade binding imports to side-effect imports; dead helper-module elimination (when DCE is enabled at standard+ rewrite level) can then drop unused helper modules.
</Note>

## Adding a fact-reading pass

<Steps>
<Step title="Implement a barrier function">

Add a free function in `crates/core/src/` with signature `fn run_my_pass(module: &mut Module, module_facts: &ModuleFactsMap)`. Derive all conclusions locally from facts you read — do not write back to the map.

</Step>

<Step title="Wire into Phase 2">

Call it from `run_phase2_tail` in `phases.rs` between `apply_rules(..., until("UnEsm"))` and the `UnObjectSpread2`–`UnReturn` range, alongside `run_reexport_consolidation` and `run_namespace_decomposition`.

</Step>

<Step title="Or extend an existing pipeline rule">

For rules that must stay at their current pipeline position, add an optional `new_with_facts` constructor and read `ctx.module_facts` in the pipeline runner. Thread facts only through the multi-module rule runner.

</Step>

<Step title="Add unit tests">

Follow `crates/core/tests/namespace_decomposition_rule.rs`: use `facts_for(source)` to synthesize a target module's facts, build a `ModuleFactsMap`, and assert the rewrite output.

```rust
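fn facts_for(source: &str) -> ModuleFacts {
    // parse → resolver → collect_module_facts(&module)
    todo!("elided here: parse `source`, run the resolver, then collect facts")
}

// Inside a test body, with `module` already parsed and resolved:
let mut map = ModuleFactsMap::new();
map.insert("target.js", facts_for(r#"export function foo() {}"#));
run_namespace_decomposition(&mut module, &map);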
```

</Step>
</Steps>

## Implementation constraints

### Identifier and span gotchas

| Constraint | Reason |
|------------|--------|
| Use `DUMMY_SP` for new import specifiers and rewritten usage idents | `apply_sourcemap_renames()` skips idents only when `span.is_dummy()` |
| Propagate `SyntaxContext` when reusing an existing binding | Downstream `(sym, ctxt)` passes (for example `UnImportRename`) must see rewrites as the same binding |
| Use `SyntaxContext::empty()` for freshly created import specifiers | New bindings match each other without re-running the resolver |

### Non-goals

- No shared mutable state between rules in the same phase
- No multi-round fact merging
- No speculative facts — a fact holds only if the post-Stage-2 AST proves it
- Rules derive heavier semantic conclusions (for example namespace-projection equivalence) internally; they do not emit observations back into the map

## Debugging fact-driven rewrites

Fact collection happens at a fixed pipeline point. To bisect a single-file regression, use `--trace-rules` with `--from` / `--until` ranges. Bundle unpack debugging cannot trace per-module fact assembly directly — compare Phase 1 fact output via `ModuleFactsMap` display formatting or add targeted unit tests with synthetic `facts_for` maps.

<AccordionGroup>
<Accordion title="Namespace decomposition skipped for a valid-looking import">

Check that `module_facts.get(source)` resolves the target specifier, that all accessed properties appear in `target_facts.exports` with `ExportKind::Named`, and that usage is property-access only (no bare binding reference, no computed access, no assignment to members).

</Accordion>

<Accordion title="Cross-module helper not recognized">

Verify the helper module's `helper_exports` or `ts_helper_exports` contain the expected kind after Phase 1. Helper identity must be proven from the producer's export AST, not inferred from consumer call shape alone.

</Accordion>

<Accordion title="Empty facts for a module">

Look for `FactCollectionParseFailed` in unpack warnings. The module may contain invalid standalone JS after extraction; other modules still decompile.

</Accordion>
</AccordionGroup>

## Related pages

<CardGroup>
<Card title="Bundle formats and unpacking" href="/bundle-formats-and-unpacking">
Detection order, raw vs full unpack, and when multi-module fact collection activates.
</Card>
<Card title="Decompile pipeline" href="/decompile-pipeline">
Single-file pipeline flow, `unresolved_mark` gating, and how unpack parallelizes per module.
</Card>
<Card title="Helper detection" href="/helper-detection">
How helper body shapes are matched before they become `helper_exports` facts.
</Card>
<Card title="Rule pipeline reference" href="/rule-pipeline-reference">
Ordered `RuleDescriptor` registry, Stage 2 boundary at `UnEsm`, and `with_module_facts` threading.
</Card>
<Card title="Develop transformation rules" href="/develop-rules">
Test-first workflow, pipeline placement, and `BindingRenamer` conventions for new rules.
</Card>
</CardGroup>

---

## 09. Unpack bundles

> Operational guide for --unpack modes (auto vs strict), --raw extraction, multi-file entry+chunk inputs, directory scanning rules, and --force overwrite protection.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/09-unpack-bundles.md
- Generated: 2026-06-28T01:07:47.227Z

### Source Files

- `README.md`
- `crates/cli/src/main.rs`
- `crates/cli/src/discovery.rs`
- `crates/core/src/driver.rs`
- `docs/architecture.md`
- `crates/cli/src/output.rs`

---
title: Unpack bundles
description: Operational guide for --unpack modes (auto vs strict), --raw extraction, multi-file entry+chunk inputs, directory scanning rules, and --force overwrite protection.
---

Unpack mode splits bundled JavaScript into separate module files, then runs the decompile pipeline on each module in parallel. Use it when you have a webpack, esbuild, Browserify, or similar bundle — or a `dist/` directory full of entry and chunk files — and want readable source back on disk.

## Prerequisites

- An output directory via `-o` / `--output`. `--unpack` always requires it; Wakaru refuses to unpack to stdout.
- One or more inputs: a single bundle file, multiple explicit files (entry + chunks), a directory path, or stdin (`-` or piped input when no file is given).

<Steps>
<Step title="Choose an unpack mode">

| Mode | Flag | Behavior |
|------|------|----------|
| Auto (default) | `--unpack` or `--unpack=auto` | Structural bundle detection first; if no format matches, heuristic scope-hoisted splitting for Rollup/Vite/flat esbuild output |
| Strict | `--unpack=strict` | Structural detection only — webpack, Browserify, esbuild/Bun markers, SystemJS, AMD, webpack chunks. No heuristic fallback |

Auto mode is appropriate for most real-world bundles. Use strict when you want predictable, marker-based splitting only — for example, when heuristic scope splitting produces false positives.

</Step>

<Step title="Run unpack">

<CodeGroup>

```bash title="Full unpack (default)"
wakaru bundle.js --unpack -o out/
```

```bash title="Raw extraction only"
wakaru bundle.js --unpack --raw -o out/
```

```bash title="Strict structural detection"
wakaru bundle.js --unpack=strict -o out/
```

```bash title="Webpack entry + chunk files"
wakaru entry.js chunk.abc123.js --unpack -o out/
```

```bash title="Scan a dist directory"
wakaru dist/ --unpack -o out/
```

</CodeGroup>

</Step>

<Step title="Verify output">

On a TTY, Wakaru prints a summary to stderr:

```
scanned: 4 file(s), detected: 2 bundle/chunk file(s), skipped: 2 file(s)
detected: webpack5, webpack5_chunk
total: 47 module(s) in 1.23s
```

- **scanned / detected / skipped** — only shown when at least one directory input was scanned
- **detected** — bundle formats recognized across all inputs
- **total** — module files written under the output directory

The process exits non-zero if any module produced an error-level warning (for example, a parse failure during decompilation).

</Step>
</Steps>

## Unpack modes in detail

### Auto mode (`--unpack` / `--unpack=auto`)

Auto mode sets `heuristic_split: true` in the core driver. The unpacker tries structural detectors in order (webpack5, webpack4, webpack5 chunk, Browserify, SystemJS, esbuild/Bun). When none match:

1. **Single explicit file** — Wakaru attempts scope-hoisted splitting. If that yields more than one module, those modules are unpacked. Otherwise the file falls back to single-file decompilation and is written as `module.js`.
2. **Directory scan** — files that fail both structural and heuristic detection are **skipped**, not copied or decompiled.

At `--level aggressive`, auto mode also retries scope-hoist splitting **inside** modules already extracted by a structural detector (nested scope split). This can recover additional modules from large webpack output that still contains scope-hoisted regions.

### Strict mode (`--unpack=strict`)

Strict mode disables heuristic splitting entirely. Only files that match a structural bundle or chunk shape are unpacked. Scope-hoisted Rollup/Vite output without `__export` or `__commonJS` markers will not split in strict mode.

For explicit file inputs that are not bundles, strict mode still falls back to single-file decompilation (same as auto). Directory scans use the same detection gate but never apply the heuristic — plain `.js` files in `dist/` are skipped.
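The fallback rules for both modes fit in one small decision function. The sketch below is a reading of the prose above, not the driver's code: `UnpackMode`, `InputKind`, `Action`, and `plan_input` are hypothetical names, and detection results are passed in as plain values.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum UnpackMode {
    Auto,
    Strict,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum InputKind {
    /// A file path given explicitly on the command line.
    ExplicitFile,
    /// A file discovered while scanning a directory input.
    DirectoryCandidate,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Action {
    UnpackStructural,
    UnpackHeuristic,
    /// Written as `module.js` after single-file decompilation.
    DecompileAsSingleModule,
    Skip,
}

/// `structural_match`: a structural detector recognized the file.
/// `heuristic_modules`: module count scope-hoisted splitting would produce.
pub fn plan_input(
    mode: UnpackMode,
    input: InputKind,
    structural_match: bool,
    heuristic_modules: usize,
) -> Action {
    if structural_match {
        return Action::UnpackStructural;
    }
    // Only auto mode consults the heuristic, and only a real split counts.
    if mode == UnpackMode::Auto && heuristic_modules > 1 {
        return Action::UnpackHeuristic;
    }
    match input {
        InputKind::ExplicitFile => Action::DecompileAsSingleModule,
        InputKind::DirectoryCandidate => Action::Skip,
    }
}
```

Note that the two modes differ only in the middle branch; the final fallback depends on how the file was supplied, not on the mode.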

## Raw extraction (`--raw`)

`--raw` requires `--unpack`. It runs extraction and bundler-coupled normalization only. It does **not** run the full decompile rule pipeline (the ~60 rewrite rules, cross-module fact collection, and the Phase 2 late pass).

Use raw extraction when you want to:

- Inspect what the unpacker extracted before readability transforms
- Debug bundle detection or extraction boundaries
- Compare against reference fixtures at the raw layer

Raw output still includes webpack-specific normalization tied to extraction (factory param renaming, `require()` rewriting, runtime helper removal). Webpack ESM markers and export getters remain so a later full decompile pass can recover live exports.

```bash
wakaru bundle.js --unpack --raw -o raw/
wakaru entry.js chunk.js --unpack --raw -o raw/
```

`--formatter` still applies to raw output if passed. `--emit-source-map` does not produce maps in raw mode.

## Multi-file entry and chunk inputs

Pass multiple file paths to combine entry runtime and async chunks in one unpack operation:

```bash
wakaru main.bundle.js src_greet_js.bundle.js --unpack -o out/
```

When more than one input is provided, the core `unpack_files` (or `unpack_files_raw`) path:

1. Detects each file independently
2. Merges extracted modules into a single module set
3. Stabilizes filenames and rewrites unambiguous numeric webpack module IDs across physical files so entry→chunk references resolve
4. Runs the two-phase parallel decompile pipeline once over the combined set

Duplicate numeric webpack IDs across unrelated runtimes (common when scanning a whole `dist/` tree) are treated as ambiguous and are **not** rewritten globally — this prevents merging unrelated bundles.
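One way to picture the ambiguity rule is as an ownership table that keeps a numeric ID only when a single physical file defines it. This is a simplified assumption for illustration: `unambiguous_owners` is a hypothetical function, and the real unpacker may also take runtime relationships between files into account.

```rust
use std::collections::HashMap;

/// Map each numeric module ID to the file that defines it, keeping only IDs
/// defined by exactly one file. IDs seen in several files are dropped as
/// ambiguous and would not be rewritten.
pub fn unambiguous_owners(files: &[(&str, &[u32])]) -> HashMap<u32, String> {
    // `None` marks an ID that has been seen in more than one file.
    let mut owners: HashMap<u32, Option<String>> = HashMap::new();
    for &(file, ids) in files {
        for &id in ids {
            owners
                .entry(id)
                .and_modify(|owner| {
                    if owner.as_deref() != Some(file) {
                        *owner = None;
                    }
                })
                .or_insert_with(|| Some(file.to_string()));
        }
    }
    owners
        .into_iter()
        .filter_map(|(id, owner)| owner.map(|file| (id, file)))
        .collect()
}
```

Dropping ambiguous IDs leaves those references untouched, which is the conservative choice when two unrelated runtimes both happen to define module `2`.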

Files that are not detected as bundles still participate as fallback modules (basename used as filename) when passed explicitly. This differs from directory scanning, which skips non-detected files entirely.

<ParamField body="--source-map" type="path">
Supported only with a **single** input file. Multi-file unpack rejects `--source-map` / `-m`.
</ParamField>

## Directory scanning

Directory inputs work **only** with `--unpack`. Passing a directory without `--unpack` fails with an error that directs you to add `--unpack`.

```bash
wakaru dist/ --unpack -o out/
```

### What gets scanned

The CLI recursively walks the directory and collects candidates matching:

| Rule | Detail |
|------|--------|
| Extensions | `.js`, `.mjs`, `.cjs` (case-insensitive) |
| Skipped directories | Hidden directories (name starts with `.`) and `node_modules` |
| Skipped files | Hidden files (name starts with `.`) |
| Non-JS files | Ignored entirely (`.map`, `.ts`, `.txt`, etc.) |

Results are sorted by path string before processing.
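The candidate rules above can be approximated with the standard library alone. This is a sketch under assumptions, not the CLI's discovery code: `collect_candidates` and its helpers are hypothetical names, and the detect-only filter described in the next section is not applied here.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hidden files and directories start with a dot.
fn is_hidden(name: &str) -> bool {
    name.starts_with('.')
}

/// Accept `.js`, `.mjs`, and `.cjs`, ignoring case.
fn has_js_extension(path: &Path) -> bool {
    path.extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| {
            let ext = ext.to_ascii_lowercase();
            ext == "js" || ext == "mjs" || ext == "cjs"
        })
        .unwrap_or(false)
}

/// Recursively collect candidate files under `root`, sorted by path string.
pub fn collect_candidates(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut out = Vec::new();
    let mut stack = vec![root.to_path_buf()];
    while let Some(dir) = stack.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let path = entry.path();
            let name = entry.file_name().to_string_lossy().into_owned();
            if is_hidden(&name) {
                continue;
            }
            if entry.file_type()?.is_dir() {
                if name != "node_modules" {
                    stack.push(path);
                }
            } else if has_js_extension(&path) {
                out.push(path);
            }
        }
    }
    out.sort_by(|a, b| a.to_string_lossy().cmp(&b.to_string_lossy()));
    Ok(out)
}
```

Sorting at the end keeps the result independent of the order in which the operating system lists directory entries.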

### Detect-only filter

Each candidate is read and passed to `is_detected_unpack_input`. A file is included only when:

- Structural bundle detection succeeds (`try_unpack_bundle`), **or**
- Auto mode is on and scope-hoisted heuristic splitting would produce more than one module

Plain application source, runtime-like stubs without bundle markers, and non-bundle chunks are **skipped** — not copied, not decompiled. If every file in a directory input is skipped, Wakaru exits with:

```
no bundle or chunk files detected in directory input
```

### Scan statistics

When at least one directory was scanned, stderr reports:

```
scanned: N file(s), detected: M bundle/chunk file(s), skipped: K file(s)
```

`scanned` counts all `.js`/`.mjs`/`.cjs` candidates read; `detected` counts those kept; `skipped` counts those filtered out.

## Output directory and path safety

`-o` / `--output` must be a directory path. Wakaru creates it if missing.

Module filenames come from bundle contents and may contain traversal attempts. Before writing, the CLI:

1. Lexically validates each filename (rejects `../`, absolute paths, Windows drive prefixes)
2. Deduplicates collisions (`index.js` → `index_2.js`, etc.)
3. Canonicalizes parent directories and rejects symlink escapes outside the output root

Untrusted paths like `../node_modules/@wakaru/cli/bin/wakaru` are written **inside** the output directory as ordinary relative paths, never outside it.
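Steps 1 and 2 are purely lexical and can be sketched without touching the filesystem. The functions below are hypothetical (`is_lexically_safe`, `dedupe_filename`), they only classify and rename, and they say nothing about what the CLI does with a name that fails validation. The symlink check in step 3 needs real I/O and is not shown.

```rust
use std::collections::HashSet;

/// Lexical check only: no `..` segments, no absolute paths, no drive prefixes.
/// Both `/` and `\` are treated as separators so the result does not depend
/// on the host platform.
pub fn is_lexically_safe(name: &str) -> bool {
    let bytes = name.as_bytes();
    let has_drive_prefix =
        bytes.len() >= 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':';
    let is_absolute = name.starts_with('/') || name.starts_with('\\');
    let has_parent_segment = name
        .split(|c: char| c == '/' || c == '\\')
        .any(|segment| segment == "..");
    !name.is_empty() && !has_drive_prefix && !is_absolute && !has_parent_segment
}

/// Return `name` if unused, otherwise `stem_2.ext`, `stem_3.ext`, and so on.
pub fn dedupe_filename(name: &str, taken: &mut HashSet<String>) -> String {
    if taken.insert(name.to_string()) {
        return name.to_string();
    }
    // Split before the extension of the last path segment, if it has one.
    let base_start = name.rfind('/').map_or(0, |i| i + 1);
    let split_at = match name[base_start..].rfind('.') {
        Some(i) if i > 0 => base_start + i,
        _ => name.len(),
    };
    let (stem, ext) = name.split_at(split_at);
    let mut n = 2;
    loop {
        let candidate = format!("{}_{}{}", stem, n, ext);
        if taken.insert(candidate.clone()) {
            return candidate;
        }
        n += 1;
    }
}
```

Running the lexical check before any filesystem call means a hostile module name is classified without ever being resolved against the disk.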

## Overwrite protection (`--force`)

Wakaru refuses to clobber existing output unless `--force` is passed.

| Target | Without `--force` | With `--force` |
|--------|-------------------|----------------|
| Single output file (decompile mode) | Error if file exists | Overwrites |
| Output directory (empty or new) | Creates and writes | Creates and writes |
| Output directory (non-empty) | Error: `output directory … is not empty` | Writes into directory |
| Output path exists but is a file (unpack `-o`) | Error: `exists and is not a directory` | N/A |

When `--force` writes into an **existing non-empty** directory, Wakaru uses a write-if-changed fast path: identical files (same length and bytes) are not rewritten, which avoids touching timestamps on unchanged modules during re-runs.

```bash
wakaru dist/ --unpack -o out/              # fails if out/ has files
wakaru dist/ --unpack -o out/ --force      # updates changed modules only
```
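A write-if-changed helper of the kind described above might look like the following. It is a minimal sketch, not the CLI's implementation: `write_if_changed` is a hypothetical name, and it compares length first and bytes second, as the description suggests.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Write `contents` to `path` unless the file already holds exactly those
/// bytes. Returns `Ok(true)` when a write happened, `Ok(false)` when skipped.
pub fn write_if_changed(path: &Path, contents: &[u8]) -> io::Result<bool> {
    if let Ok(meta) = fs::metadata(path) {
        // Cheap length check first; read the file only when lengths match.
        if meta.is_file() && meta.len() == contents.len() as u64 {
            if fs::read(path)? == contents {
                return Ok(false);
            }
        }
    }
    fs::write(path, contents)?;
    Ok(true)
}
```

Skipping identical files leaves their modification times alone, which keeps downstream tools that watch timestamps from reprocessing unchanged modules.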

`--force` is a global flag and also applies to `wakaru extract` and `wakaru debug trace` output paths.

## Combining with other flags

| Flag | With `--unpack` |
|------|-----------------|
| `--level minimal\|standard\|aggressive` | Controls rewrite aggressiveness per module; `aggressive` + auto enables nested scope split |
| `--dce` | Full dead-code sweep on each decompiled module (not raw) |
| `--formatter` | Final format pass on each output file |
| `--emit-source-map` | Writes `.map` alongside each module (full unpack only) |
| `--json` | Machine-readable JSON summary to stdout; human summary still on stderr unless `--json` suppresses styled warnings |
| `--diagnostics` | Post-transform checks; warnings in stderr or JSON |
| `--profile` | Chrome trace including parallel module work |

Example CI invocation:

```bash
wakaru dist/ --unpack --json --force -o out/ > summary.json 2> stderr.log
```

See [JSON output and CI](/json-output-and-ci) for the stdout schema (`detected_formats`, `modules`, `warnings`, `total`, `failed`, `elapsed_ms`).

## Raw vs full unpack

```mermaid
flowchart TD
    A["Input file(s)"] --> B{"--raw?"}
    B -->|no| C[Detect bundle format]
    B -->|yes| D[Detect + extract only]
    C --> E[Phase 1: parallel fact collection]
    E --> F[Phase 2: cross-module decompile]
    F --> G[Write modules to -o]
    D --> H[Bundler-coupled normalization]
    H --> G
```

Full unpack runs the complete two-phase pipeline described in [Decompile pipeline](/decompile-pipeline) and [Cross-module facts](/cross-module-facts). Raw unpack stops after extraction and its bundler-coupled normalization.

## Common failure modes

<AccordionGroup>
<Accordion title="--unpack requires -o/--output">

Unpack always needs an output directory. Stdout is not supported for multi-module output.

```bash
wakaru bundle.js --unpack        # error
wakaru bundle.js --unpack -o out/  # ok
```

</Accordion>

<Accordion title="Directory contains no detected bundles">

All candidates were read but none matched bundle/chunk shapes (strict) or heuristic scope split (auto). Add explicit file paths, switch to auto mode, or verify the input is actually bundled output.

</Accordion>

<Accordion title="Output directory is not empty">

Pass `--force` to write into an existing directory, or choose a new output path.

</Accordion>

<Accordion title="Single module.js for an expected bundle">

The input may be scope-hoisted ESM without structural markers. Try `--unpack` (auto) instead of strict, or `--level aggressive` for nested splitting inside webpack output. See [Bundle formats and unpacking](/bundle-formats-and-unpacking) for detection order and format markers.

</Accordion>

<Accordion title="errors in N module(s)">

One or more modules failed during decompilation (parse recovery, TDZ, etc.). Warnings name the failing module filename. See [Troubleshooting](/troubleshooting) for `UnpackWarningKind` codes.

</Accordion>
</AccordionGroup>

## Related pages

<Card href="/quickstart" title="Quickstart" icon="play">
First successful unpack runs and expected success signals.
</Card>

<Card href="/bundle-formats-and-unpacking" title="Bundle formats and unpacking" icon="package">
Detection order, format variants, and raw vs full semantics.
</Card>

<Card href="/webpack-bundle-recipe" title="Webpack bundle recipe" icon="book">
End-to-end webpack4/webpack5 entry + chunk workflow.
</Card>

<Card href="/cli-reference" title="CLI reference" icon="terminal">
Complete flag surface including unpack sub-options.
</Card>

<Card href="/troubleshooting" title="Troubleshooting" icon="alert-circle">
Overwrite protection, directory skip behavior, and warning kinds.
</Card>

---

## 10. Use source maps

> Provide --source-map for identifier recovery and import dedup, emit decompiled maps with --emit-source-map, extract embedded sources via wakaru extract, and pipeline ordering constraints.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/10-use-source-maps.md
- Generated: 2026-06-28T01:08:30.101Z

### Source Files

- `README.md`
- `crates/core/src/sourcemap_rename.rs`
- `crates/cli/src/main.rs`
- `docs/architecture.md`
- `crates/core/src/rules/import_dedup.rs`


Wakaru supports source maps in three distinct ways: **input maps** that improve decompiled readability, **output maps** that link decompiled code back to the minified input, and **`wakaru extract`** to recover original source files embedded in a map. Input and output maps are independent — you can use either, both, or neither.

## When to use each feature

| Feature | Flag / command | Purpose |
|---------|----------------|---------|
| Input source map | `--source-map` / `-m` | Recover original identifier names; deduplicate and merge imports before rename |
| Output source map | `--emit-source-map` | Emit a v3 `.map` alongside each decompiled output file |
| Extract embedded sources | `wakaru extract` | Write `sourcesContent` entries from a map to disk |

The strongest results come from **single-file decompile** with a companion `.map` from the same build that produced the minified JavaScript. Identifier positions in the AST must align with the map's generated positions.

## Provide an input source map

Pass the v3 source map that corresponds to the minified or bundled file you are decompiling.

<Steps>
<Step title="Locate the companion map">

Find the `.map` file emitted by your bundler or minifier alongside the JavaScript you feed to Wakaru. The map must describe the **same generated file** as the input — not a parent bundle when you are decompiling an already-extracted module.

</Step>

<Step title="Run decompile with --source-map">

<CodeGroup>

```bash title="Single file"
wakaru input.js --source-map input.js.map -o output.js
```

```bash title="Short alias"
wakaru input.js -m input.js.map -o output.js
```

```bash title="Unpack a single bundle"
wakaru bundle.js --unpack --source-map bundle.js.map -o out/
```

</CodeGroup>

</Step>

<Step title="Verify improved names">

Compare output with and without the map. With `--source-map`, minified single-letter bindings should recover readable names where mappings and `sourcesContent` are present. Duplicate imports from scope-hoisted bundles should collapse into canonical bindings.

</Step>
</Steps>

<ParamField body="--source-map" type="path" required={false}>
Path to a v3 source map (`.map`). Aliases: `-m`, `--sourcemap`. Read as raw bytes and passed to `DecompileOptions::sourcemap`.
</ParamField>

### CLI constraints

- **Single input file only.** Wakaru rejects `--source-map` when multiple explicit inputs are passed (for example `entry.js chunk.js --unpack`). Directory scans that resolve to multiple bundle files also conflict with this flag.
- **Works with `--unpack`** when exactly one bundle file is the input. The same bundle-level map is applied to each extracted module during Phase 2. Rename quality depends on whether identifier spans in the extracted module still align with bundle positions; single-file decompile is the reliable path.
- **`debug trace` accepts `--source-map` but does not run the source-map rename pass.** Rule tracing covers the normal `apply_rules` pipeline only, not the post-pipeline `ImportDedup` → rename → `UnImportRename` sequence.

## What --source-map enables

When `DecompileOptions::sourcemap` is set, Wakaru runs a dedicated post-rules pass that does two things:

1. **Import deduplication** — merges repeated imports of the same specifier from the same module and consolidates multiple `import` statements targeting one source into a single declaration.
2. **Identifier recovery** — looks up each binding's generated position in the map, reads the original name from the `names` array or from `sourcesContent`, and renames bindings via `BindingRenamer`.

Import dedup also runs inside the normal rule pipeline (Stage 6), but the source-map pass runs **`ImportDedup` again** immediately before rename so duplicates are collapsed to one canonical binding before names are recovered.
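
The merge described above can be pictured with a small model. This is a simplified sketch, not the Rust rule in `import_dedup.rs`: the `ImportDecl` shape and the `mergeImports` name are hypothetical, and real import declarations also carry aliases, default imports, and namespace imports that this model ignores.

```typescript
// Hypothetical, simplified view of one import declaration.
interface ImportDecl {
  source: string;  // module specifier, e.g. "lodash"
  names: string[]; // imported names
}

// Collapse all declarations that target the same source into one,
// dropping repeated names while keeping first-seen order.
function mergeImports(decls: ImportDecl[]): ImportDecl[] {
  const bySource = new Map<string, string[]>();
  for (const decl of decls) {
    const merged = bySource.get(decl.source) ?? [];
    for (const name of decl.names) {
      if (merged.indexOf(name) === -1) {
        merged.push(name);
      }
    }
    bySource.set(decl.source, merged);
  }
  const result: ImportDecl[] = [];
  bySource.forEach((names, source) => {
    result.push({ source, names });
  });
  return result;
}
```

Running the merge before rename means each surviving binding is looked up in the map once, which is the ordering constraint this page describes.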

### Pipeline ordering

Source-map work runs **after** the full transformation pipeline and **before** the fixer and emitter. Order is fixed for correctness:

```mermaid
flowchart TD
    A[parse + resolver] --> B[apply_rules — full rule pipeline]
    B --> C{sourcemap provided?}
    C -->|yes| D[ImportDedup]
    D --> E[apply_sourcemap_renames]
    E --> F[UnImportRename]
    C -->|no| G[strip Sentry markers — standard+]
    F --> G
    G --> H[fixer]
    H --> I{emit_source_map?}
    I -->|yes| J[print with srcmap + build v3 map]
    I -->|no| K[print_js]
```

**Why rename must come last among transforms:**

- Rules match patterns by the names that appear in the **minified input** (`require`, `__generator`, `__esModule`). Renaming first would break helper detection and module-system reconstruction.
- `ImportDedup` needs **`UnEsm`** to have converted `require()` calls into `import` declarations. That happens in Stage 2 of the main pipeline.
- Dedup must run **before** rename so five duplicate imports of `lodash` become one binding that receives a single recovered name instead of five separate renames.

During **unpack**, the same source-map trio (`ImportDedup` → `apply_sourcemap_renames` → `UnImportRename`) runs at the end of Phase 2, after cross-module rules (`namespace_decomposition`, re-export consolidation) and the `UnTemplateLiteral` through `UnReturn` range. Supplying a source map or requesting `--emit-source-map` forces Phase 2 to **re-parse** each module rather than reuse the Phase 1 AST, because rename lookup depends on the Phase 2 parser `SourceMap` and emitted mappings depend on a fresh parse.

### How identifier recovery works

For each non-global identifier (bindings where `ctxt.outer() != unresolved_mark`):

1. Convert the identifier's `span` to a generated `(line, col)` in the parsed input file.
2. Call `lookup_token` on the v3 map to find the original source file, line, and column.
3. Prefer the map's `names` entry when present; otherwise extract the identifier text from `sourcesContent` at the original position (works when `names` is empty, as in esbuild output).
4. **Vote** per `(sym, SyntaxContext)` binding — plurality wins.
5. Apply renames with scope-aware disambiguation:
   - **Local bindings** (params, block-scoped locals): all claimants receive the bare recovered name; nested scopes shadow as in original source.
   - **Module-level bindings** (imports, top-level declarations): names must be unique in the merged flat namespace; collisions get suffixes (`name_2`, `name_3`, …).

Globals and unresolved references (`Object`, `Symbol`, etc.) are never renamed. Identifiers with `DUMMY_SP` spans are skipped so synthetically inserted bindings are not voted on.
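
The voting and module-level disambiguation steps can be sketched as follows. This is an illustrative TypeScript model, not the code in `sourcemap_rename.rs`. The `Candidate` type, the string `bindingKey` standing in for `(sym, SyntaxContext)`, and the tie-break rule (the first name to reach the top count wins) are assumptions made for the example.

```typescript
// One recovered name for one occurrence of a binding.
interface Candidate {
  bindingKey: string;    // hypothetical encoding of (sym, SyntaxContext), e.g. "a#3"
  recoveredName: string; // name found via the map for this occurrence
}

// Plurality vote: for each binding, keep the most frequent recovered name.
function voteNames(candidates: Candidate[]): Map<string, string> {
  const tallies = new Map<string, Map<string, number>>();
  for (const { bindingKey, recoveredName } of candidates) {
    const tally = tallies.get(bindingKey) ?? new Map<string, number>();
    tally.set(recoveredName, (tally.get(recoveredName) ?? 0) + 1);
    tallies.set(bindingKey, tally);
  }
  const winners = new Map<string, string>();
  tallies.forEach((tally, bindingKey) => {
    let best = "";
    let bestCount = 0;
    tally.forEach((count, name) => {
      if (count > bestCount) {
        best = name;
        bestCount = count;
      }
    });
    winners.set(bindingKey, best);
  });
  return winners;
}

// Module-level bindings share one flat namespace, so later claimants of a
// taken name receive numeric suffixes: name, name_2, name_3, ...
function disambiguateModuleLevel(winners: Map<string, string>): Map<string, string> {
  const used = new Set<string>();
  const result = new Map<string, string>();
  winners.forEach((name, bindingKey) => {
    let finalName = name;
    for (let n = 2; used.has(finalName); n++) {
      finalName = `${name}_${n}`;
    }
    used.add(finalName);
    result.set(bindingKey, finalName);
  });
  return result;
}
```

Local bindings would skip the second function entirely, since nested scopes are allowed to shadow one another with the bare recovered name.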

After rename, **`UnImportRename`** removes redundant import aliases (`import { foo as bar }` → `import { foo }` when safe).

## Emit a decompiled source map

`--emit-source-map` generates a v3 source map mapping **decompiled output** positions back to the **parsed input** (the minified or extracted module code Wakaru received), not to original TypeScript sources.

<RequestExample>

```bash
wakaru input.js --emit-source-map -o output.js
```

</RequestExample>

This writes two files:

- `output.js` — decompiled JavaScript
- `output.js.map` — v3 map with `sourcesContent` embedding the input source

<ParamField body="--emit-source-map" type="boolean" required={false}>
Off by default. When set, each output file gets a companion `.map` (for example `foo.js` → `foo.js.map`). With `--unpack`, one map is emitted per kept module. The map's `file` field uses the final module filename, including names recovered from provenance markers.
</ParamField>

### Output map contents

Wakaru collects byte-position mappings during `JsWriter` emission, then builds a v3 map via `SourceMapBuilder`:

- **`file`** — output filename (`DecompileOptions::filename` or recovered module name).
- **`sources`** — the parsed input source name from SWC's `SourceMap`.
- **`sourcesContent`** — full text of the input as Wakaru parsed it.
- **Mappings** — tokens linking decompiled output lines/columns to input lines/columns.

<ResponseExample>

```json title="--json stdout (single file, no -o)"
{
  "code": "const greeting = \"hello\";\n",
  "source_map": "{\"version\":3,\"file\":\"input.js\",...}",
  "warnings": [],
  "elapsed_ms": 12
}
```

</ResponseExample>

With `-o`, the map is written to disk. With `--json` and no `-o`, the map JSON is included in the `source_map` field on stdout. Human-readable stdout mode (no `-o`, no `--json`) prints code only and discards the map.
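
Because `source_map` is a JSON string nested inside the JSON object, a consumer has to parse twice. A minimal sketch, assuming the stdout shape shown in the example above: `readEmittedMap` is a hypothetical helper name, and the `SourceMapV3` interface lists only standard v3 fields.

```typescript
// Standard v3 source map fields relevant to this page.
interface SourceMapV3 {
  version: number;
  file?: string;
  sources: string[];
  sourcesContent?: (string | null)[];
  mappings: string;
}

// Hypothetical helper: takes the captured stdout of a `--json` run and
// returns the decoded map, or undefined when no map was emitted.
function readEmittedMap(stdout: string): SourceMapV3 | undefined {
  const outer = JSON.parse(stdout) as { source_map?: string };
  if (outer.source_map === undefined) {
    return undefined;
  }
  // Second parse: the field holds the map serialized as a string.
  return JSON.parse(outer.source_map) as SourceMapV3;
}
```

The decoded `sourcesContent[0]` should then hold the input text as Wakaru parsed it, which makes it easy to diff against the file you passed in.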

## Extract embedded sources

When a bundler embeds original sources in `sourcesContent`, recover them without running the decompiler:

<Steps>
<Step title="Run extract">

```bash
wakaru extract input.js.map -o src/
```

</Step>

<Step title="Check output">

Wakaru prints a count to stderr when attached to a terminal:

```text
extracted 42 source file(s) to src/
```

Only entries with **both** a source path and non-empty `sourcesContent` are written. Entries missing content are skipped silently.

</Step>
</Steps>
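
The skip rule can be modeled as a filter over the parallel `sources` and `sourcesContent` arrays of a v3 map. This is a sketch under stated assumptions: `RawMap` and `writableEntries` are hypothetical names, and the real selection happens in the Rust helpers listed later on this page.

```typescript
// Minimal slice of a v3 source map needed for extraction.
interface RawMap {
  sources: (string | null)[];
  sourcesContent?: (string | null)[];
}

// Keep only entries that have both a source path and non-empty content.
// Everything else is dropped without a warning, matching the behavior above.
function writableEntries(map: RawMap): Array<{ path: string; content: string }> {
  const entries: Array<{ path: string; content: string }> = [];
  map.sources.forEach((path, i) => {
    const content = map.sourcesContent?.[i];
    if (path && content) {
      entries.push({ path, content });
    }
  });
  return entries;
}
```

If the reported count is lower than the number of `sources` in the map, the difference is the set of entries that had no embedded content.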

### Path sanitization

Source paths from webpack and other bundlers often carry prefixes or traversal segments. `resolve_source_path` normalizes them before writing:

- Strips `webpack://`, `webpack:///`, and leading `/`
- Resolves path components under the output directory
- **Drops `..` components** — source paths never escape the output root

Parent directories are created as needed. Overwrite protection follows the global `--force` flag.
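
The three rules above can be approximated in a few lines. This is an illustrative model, not the Rust `resolve_source_path`: the function name `sanitizeSourcePath` is hypothetical, it returns a path relative to the output root instead of joining it, and dropping `.` components is an assumption of the sketch.

```typescript
// Illustrative model of the sanitization rules; returns a relative path.
function sanitizeSourcePath(source: string): string {
  // Strip the webpack scheme ("webpack://" or "webpack:///") if present.
  let rest = source.startsWith("webpack://")
    ? source.slice("webpack://".length)
    : source;
  // Remove any remaining leading slashes so the path is relative.
  rest = rest.replace(/^\/+/, "");
  // Drop empty, "." and ".." components so the result never leaves the root.
  const parts = rest
    .split("/")
    .filter((part) => part !== "" && part !== "." && part !== "..");
  return parts.join("/");
}
```

Note that `..` is dropped, not resolved against the previous component, so a traversal-heavy path flattens toward the root instead of climbing out of it.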

## Core API and WASM

<ParamField body="DecompileOptions::sourcemap" type="Option<Vec<u8>>">
Raw v3 source map bytes. Enables the post-rules import dedup + rename pass.
</ParamField>

<ParamField body="DecompileOptions::emit_source_map" type="bool">
When `true`, populates `DecompileOutput::source_map` (single file) or `UnpackOutput::source_maps` (per-module `(filename, json)` pairs).
</ParamField>

The WASM `decompile` binding accepts `sourcemap?: Uint8Array` and `emitSourceMap?: boolean`; results surface in `WakaruDecompileResult.source_map`. The `unpack` binding supports `emitSourceMap` and returns `source_maps` per module.

Public helpers in `wakaru-core`:

- `parse_sourcemap(data)` — parse v3 bytes
- `extract_source_entries(sm, out_dir)` — list `(path, content)` pairs without writing
- `resolve_source_path(out_dir, source)` — sanitize a single source path

## Limitations and troubleshooting

| Situation | Behavior |
|-----------|----------|
| No `--source-map` | Skip import-dedup/rename pass; `ImportDedup` in the main pipeline still runs at Stage 6 |
| Map missing `sourcesContent` and empty `names` | Fewer identifiers recover; position fallback cannot read original text |
| Multiple CLI inputs + `--source-map` | Hard error |
| `debug trace` + `--source-map` | Map bytes accepted but rename pass not traced |
| stdin input (`<stdin>`) | Rename still works if map positions match; output map uses `<stdin>` as source name |
| Non-ASCII characters in source | Column lookup uses an ASCII fast path; UTF-16 column offsets may not match when the source contains non-ASCII characters |

If recovered names look wrong, confirm that the map targets the exact file Wakaru parses, not an intermediate bundle from which modules were pre-extracted. To investigate regressions in the rename pass, compare decompile output with and without `--source-map`; the source-map steps fall outside the range covered by rule tracing.

## Related pages

<CardGroup>
<Card title="Decompile pipeline" href="/decompile-pipeline">
Parse, resolver marks, staged rules, optional source-map rename, fixer, and emit order.
</Card>

<Card title="CLI reference" href="/cli-reference">
Full flag list including `-m`, `--emit-source-map`, and the `extract` subcommand.
</Card>

<Card title="Core API reference" href="/core-api-reference">
`DecompileOptions` fields, `DecompileOutput::source_map`, and `UnpackOutput::source_maps`.
</Card>

<Card title="Unpack bundles" href="/unpack-bundles">
Using `--unpack` with single-file inputs and Phase 2 re-parse behavior when maps are involved.
</Card>

<Card title="Trace the rule pipeline" href="/trace-rule-pipeline">
Per-rule diffs for the main pipeline; source-map passes are not included in trace output.
</Card>

<Card title="JSON output and CI" href="/json-output-and-ci">
Machine-readable `source_map` field in `--json` decompile responses.
</Card>
</CardGroup>

---

## 11. JSON output and CI integration

> Machine-readable --json stdout schema for decompile and unpack, warning kinds and is_error flags, elapsed_ms timing, and piping patterns for automation pipelines.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/11-json-output-and-ci-integration.md
- Generated: 2026-06-28T01:11:00.052Z

### Source Files

- `crates/cli/src/json_output.rs`
- `crates/cli/src/main.rs`
- `README.md`
- `crates/core/src/driver/types.rs`


The `wakaru` CLI exposes a `--json` flag that serializes operation results to **stdout** as compact JSON (single line, no pretty-printing). Warnings, timing, and metadata live in the JSON object; decompiled or unpacked **code** is written to `-o` when provided, or embedded in JSON for single-file decompile without an output path. Human-readable summaries and styled warning lines are suppressed on stderr when `--json` is active.

<ParamField body="--json" type="boolean">
Emit machine-readable JSON to stdout instead of human-readable summaries. Warnings and errors are included in the JSON object. Compatible with both single-file decompile and `--unpack` modes. Does not affect fatal pre-run errors (missing input, overwrite protection, invalid flag combinations), which still print to stderr and exit before JSON is written.
</ParamField>

## Stdout and stderr contract

| Stream | `--json` off | `--json` on |
|--------|--------------|-------------|
| **stdout** | Decompiled code (no `-o`), or nothing (with `-o` / unpack) | Single JSON object |
| **stderr** | Warnings, scan stats, format summaries | Mostly quiet; `--formatter` failures still print to stderr |

<Note>
JSON is emitted **before** a non-zero exit when non-fatal module errors exist. Pipelines should parse stdout first, then check the process exit code.
</Note>

```mermaid
sequenceDiagram
    participant CI as CI job
    participant Wakaru as wakaru --json
    participant FS as Filesystem (-o)

    CI->>Wakaru: input file or stdin
    Wakaru->>FS: write code / modules (if -o set)
    Wakaru-->>CI: JSON on stdout
    alt warnings with is_error true
        Wakaru-->>CI: exit code 1
    else success or diagnostics only
        Wakaru-->>CI: exit code 0
    end
```

## Decompile schema

Triggered when `--unpack` is absent. Applies to a single file or stdin (`-` or piped input).

<RequestExample>
```bash title="Decompile stdin to JSON"
echo 'var a=1;' | wakaru --json
```
</RequestExample>

<ResponseExample>
```json title="JsonDecompileOutput (stdout, no -o)"
{"code":"const a = 1;\n","warnings":[],"elapsed_ms":9}
```
</ResponseExample>

<ResponseField name="code" type="string | omitted">
Decompiled JavaScript. Present only when **no** `-o` / `--output` is set. When `-o` is set, code is written to the output file and this field is omitted from JSON.
</ResponseField>

<ResponseField name="source_map" type="string | omitted">
v3 source map JSON string. Present when `--emit-source-map` produced a map. Included in JSON regardless of whether `-o` is set; when `-o` is set, the `.map` file is also written alongside the output file.
</ResponseField>

<ResponseField name="warnings" type="JsonWarning[]">
Array of warning objects. See [Warning objects](#warning-objects).
</ResponseField>

<ResponseField name="elapsed_ms" type="number">
Wall-clock milliseconds for the decompile operation (`u64`).
</ResponseField>

<CodeGroup>
```bash title="Write code to file, JSON metadata only"
wakaru input.js --json -o output.js
```

```json title="Stdout when -o is set"
{"warnings":[],"elapsed_ms":10}
```
</CodeGroup>

## Unpack schema

Requires `-o` / `--output` (a directory). Module **code** is written under that directory; JSON lists filenames and aggregate stats only.

<RequestExample>
```bash title="Unpack bundle with JSON summary"
wakaru bundle.js --unpack --json -o unpacked/
```
</RequestExample>

<ResponseExample>
```json title="JsonUnpackOutput (stdout)"
{"detected_formats":["webpack4"],"modules":[{"filename":"module-0.js"},{"filename":"entry.js"}],"warnings":[],"total":2,"failed":0,"elapsed_ms":1768}
```
</ResponseExample>

<ResponseField name="detected_formats" type="string[]">
Bundle format identifiers detected during unpack. Values come from `BundleFormat::as_str()`: `webpack5`, `webpack4`, `browserify`, `systemjs`, `esbuild`, `amd`, `scope-hoisted`.
</ResponseField>

<ResponseField name="modules" type="JsonModule[]">
One entry per extracted module. Each object has only `filename` (logical module name, not the on-disk path under `-o`).
</ResponseField>

<ResponseField name="warnings" type="JsonWarning[]">
Per-module warnings from the unpack pipeline.
</ResponseField>

<ResponseField name="total" type="number">
Count of modules written to the output directory.
</ResponseField>

<ResponseField name="failed" type="number">
Count of **distinct module filenames** that have at least one warning with `is_error: true`. Matches the human-mode `"(N failed)"` suffix.
</ResponseField>

<ResponseField name="elapsed_ms" type="number">
Wall-clock milliseconds for the full unpack operation.
</ResponseField>
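
For typed consumers, the unpack fields above translate to the following TypeScript shapes, along with a helper that recomputes `failed` from the warnings. The interface names follow the schema titles on this page; `countFailed` is a hypothetical helper, and the warning fields are the ones documented under Warning objects.

```typescript
interface JsonWarning {
  filename: string;
  kind: string;
  is_error: boolean;
  message: string;
}

interface JsonUnpackOutput {
  detected_formats: string[];
  modules: { filename: string }[];
  warnings: JsonWarning[];
  total: number;
  failed: number;
  elapsed_ms: number;
}

// Recompute `failed`: distinct filenames with at least one is_error warning.
// Two error warnings on the same module count once.
function countFailed(warnings: JsonWarning[]): number {
  const failedFiles = new Set<string>();
  for (const warning of warnings) {
    if (warning.is_error) {
      failedFiles.add(warning.filename);
    }
  }
  return failedFiles.size;
}
```

A CI check can compare `countFailed(output.warnings)` with `output.failed` as a cheap sanity test on the parsed payload.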

## Warning objects

Every warning in CLI JSON uses the same shape:

| Field | Type | Description |
|-------|------|-------------|
| `filename` | string | Module or input filename the warning applies to |
| `kind` | string | Machine-readable warning code (see table below) |
| `is_error` | boolean | `true` when the warning is a hard failure; `false` for diagnostics |
| `message` | string | Human-readable detail |

`is_error` is derived from `UnpackWarningKind::is_error()` in wakaru-core: diagnostic kinds return `false`; all others return `true`.

### Warning kinds

| `kind` | `is_error` | Meaning |
|--------|------------|---------|
| `raw_normalization_failed` | `true` | Raw extraction normalization failed |
| `fact_collection_parse_failed` | `true` | Phase 1 fact-collection parse failed |
| `decompile_failed` | `true` | Per-module decompile failed (fallback to raw code) |
| `input_parse_recovered` | `false` | Input parsed with error recovery |
| `tdz_violation` | `false` | Temporal dead zone violation detected after transform (`--diagnostics`) |
| `duplicate_declaration` | `true` | Duplicate binding in same scope (`--diagnostics`) |
| `import_cycle` | `false` | Import cycle detected |
| `output_parse_recovered` | `true` | Output parse required recovery |
| `output_parse_failed` | `true` | Output could not be parsed |

<Warning>
`--formatter` failures are **not** included in JSON warnings. When `--formatter` is enabled, formatter failures print to stderr and the unformatted code is preserved. For programmatic formatter-error handling, use the [WASM API](/wasm-api-reference) (`formatter_failed` kind) or parse stderr.
</Warning>

### Exit code behavior

| Condition | Exit code |
|-----------|-----------|
| Success, or only diagnostic warnings (`is_error: false`) | `0` |
| Any warning with `is_error: true` | `1` (after JSON is printed) |
| Fatal CLI error (missing input, overwrite without `--force`, invalid flags) | `1` (no JSON) |

Diagnostic warnings (`input_parse_recovered`, `tdz_violation`, `import_cycle`) do **not** cause a non-zero exit on their own.
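
The exit-code table can be turned into a small classifier for scripts that capture both stdout and the exit status. This is a sketch, not part of Wakaru: `classifyRun` and the `Verdict` labels are hypothetical names chosen for the example.

```typescript
type Verdict = "ok" | "module_errors" | "fatal";

// Classify one `wakaru --json` run from its captured stdout and exit code.
function classifyRun(stdout: string, exitCode: number): Verdict {
  // Fatal CLI errors exit before any JSON is written, so stdout is empty.
  if (stdout.trim() === "") {
    return "fatal";
  }
  const parsed = JSON.parse(stdout) as { warnings: { is_error: boolean }[] };
  const hasErrors = parsed.warnings.some((warning) => warning.is_error);
  // JSON plus a non-zero exit means at least one module-level error.
  return hasErrors || exitCode !== 0 ? "module_errors" : "ok";
}
```

Diagnostic-only runs land in `"ok"`, which matches the rule that `is_error: false` warnings never change the exit code.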

## CI integration patterns

<Steps>
<Step title="Gate on exit code and errors">
Run wakaru with `--json`, capture stdout, then check `$?`. For stricter gates, also assert `failed == 0` (unpack) or no warnings with `is_error: true` (decompile).
</Step>

<Step title="Extract fields with jq">
Parse the JSON object from stdout. Compact output is valid `jq` input without preprocessing.
</Step>

<Step title="Separate code from metadata">
For unpack, read module bodies from the `-o` directory. For decompile without `-o`, read `code` from JSON; with `-o`, read the output file.
</Step>
</Steps>

<CodeGroup>
```bash title="Fail CI on decompile errors"
result=$(wakaru input.js --json)
echo "$result" | jq -e '.warnings | all(.is_error == false)' > /dev/null
```

```bash title="Unpack and assert zero failures"
wakaru bundle.js --unpack --json -o out/ | jq -e '.failed == 0'
```

```bash title="List error warnings"
wakaru bundle.js --unpack --json -o out/ | jq '.warnings[] | select(.is_error)'
```

```bash title="Capture timing for benchmarks"
wakaru input.js --json | jq '.elapsed_ms'
```

```bash title="Stdin pipeline"
echo "$SOURCE" | wakaru --json | jq -r '.code'
```
</CodeGroup>

<Tip>
Redirect stderr to its own file in CI (`2>wakaru.log`) rather than merging it into stdout, so `--formatter` failures and other stderr output do not corrupt the JSON you parse.
</Tip>

### Baseline regression stats

The repository's reproduction matrices (`scripts/repro/*-matrix/matrix.mjs`) support their own `--json` flag with a **different schema** (`summary`, `rows`, etc.). `scripts/repro/collect-stats.mjs` runs all matrices with `--json`, aggregates pass rates into `scripts/repro/stats.json`, and supports `--check` to fail CI when the baseline is stale:

```bash
node scripts/repro/collect-stats.mjs          # refresh stats.json
node scripts/repro/collect-stats.mjs --check  # exit non-zero if stale
```

This is separate from `wakaru --json` but follows the same automation pattern: machine-readable stdout, exit-code gating.

## CLI JSON vs WASM API

The browser/WASM bindings return JSON-shaped objects via `serde_wasm_bindgen` with different field sets:

| Aspect | CLI `--json` | WASM `decompile` / `unpack` |
|--------|--------------|----------------------------|
| Unpack module code | On disk only (`-o`) | Inline in `modules[].code` |
| `detected_formats` | Yes (unpack) | No |
| `total` / `failed` | Yes (unpack) | No |
| `elapsed_ms` | Yes | No |
| `is_error` on warnings | Yes | No (check `kind` manually) |
| `formatter_failed` kind | No (stderr only) | Yes |

See [WASM API reference](/wasm-api-reference) for `WakaruDecompileResult` and `WakaruUnpackResult` TypeScript shapes.
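
Since WASM warnings carry no `is_error` flag, a consumer that wants CLI-like gating has to derive it from `kind`. The sketch below mirrors the Warning kinds table on this page. `isErrorKind` is a hypothetical helper, and treating `formatter_failed` as non-fatal is an assumption based on the CLI keeping the unformatted code when formatting fails.

```typescript
// Kinds the CLI table marks as diagnostics (is_error: false), plus
// formatter_failed, which is assumed non-fatal in this sketch.
const DIAGNOSTIC_KINDS = new Set<string>([
  "input_parse_recovered",
  "tdz_violation",
  "import_cycle",
  "formatter_failed",
]);

// Diagnostic kinds return false; every other kind is treated as an error.
function isErrorKind(kind: string): boolean {
  return !DIAGNOSTIC_KINDS.has(kind);
}
```

Defaulting unknown kinds to errors is the conservative choice for CI: a newly added kind fails the gate until someone classifies it.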

## Common pitfalls

<AccordionGroup>
<Accordion title="Empty stdout on fatal errors">
Overwrite protection, missing `-o` for unpack, and directory-as-input for decompile fail before JSON is emitted. Check stderr for the error message.
</Accordion>

<Accordion title="Expecting code in unpack JSON">
Unpack JSON lists `filename` only. Module source lives under the `-o` directory.
</Accordion>

<Accordion title="Treating diagnostics as CI failures">
`tdz_violation` and `import_cycle` set `is_error: false` and do not change the exit code. Gate on `is_error` or `failed`, not warning count alone.
</Accordion>

<Accordion title="Parsing stderr as JSON">
Only stdout is JSON when `--json` is set. Human scan stats (`scanned:`, `detected:`, `total:`) are suppressed in JSON mode but formatter warnings are not.
</Accordion>
</AccordionGroup>

## Related pages

<CardGroup>
<Card title="CLI reference" href="/cli-reference">
Complete flag surface including `--json`, stdin behavior, `--force`, and profiling options.
</Card>
<Card title="Unpack bundles" href="/unpack-bundles">
Operational unpack guide: `--unpack` modes, `-o` directory layout, and `--raw` extraction.
</Card>
<Card title="Quickstart" href="/quickstart">
First successful decompile and unpack runs with expected success signals.
</Card>
<Card title="Troubleshooting" href="/troubleshooting">
UnpackWarningKind codes, parse-recovery warnings, and bug-report fields.
</Card>
<Card title="Core API reference" href="/core-api-reference">
`UnpackOutput`, `DecompileOutput`, and `has_errors()` semantics in wakaru-core.
</Card>
<Card title="WASM API reference" href="/wasm-api-reference">
In-process JSON result types for browser and Node WASM consumers.
</Card>
</CardGroup>

---

## 12. WASM and playground

> Build wakaru-wasm for the browser playground, decompile/unpack JS bindings, TypeScript result types, and Vite integration for the online demo.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/12-wasm-and-playground.md
- Generated: 2026-06-28T01:09:00.577Z

### Source Files

- `crates/wasm/src/lib.rs`
- `playground/package.json`
- `playground/scripts/build-wasm.mjs`
- `playground/vite.config.ts`
- `.github/workflows/playground.yml`


The `wakaru-wasm` crate (`crates/wasm`) compiles `wakaru-core` to WebAssembly via `wasm-bindgen` and exposes `decompile`, `unpack`, and `ruleNames` to JavaScript. The React playground in `playground/` loads the generated `crates/wasm/pkg` bundle through a Vite alias, runs decompilation in a dedicated Web Worker, and deploys to Vercel at `/playground/`.

## Architecture

```mermaid
flowchart TB
  subgraph browser["Browser (playground)"]
    UI["React UI (App.tsx)"]
    Bridge["WasmBridge"]
    Worker["Web Worker (worker.ts)"]
    UI --> Bridge
    Bridge <-->|postMessage| Worker
  end

  subgraph wasm_pkg["crates/wasm/pkg (wasm-pack output)"]
    Init["default init()"]
    Decompile["decompile()"]
    Unpack["unpack()"]
    RuleNames["ruleNames()"]
  end

  subgraph rust["Rust workspace"]
    Core["wakaru-core"]
    Formatter["wakaru-formatter"]
    Lib["crates/wasm/src/lib.rs"]
    Lib --> Core
    Lib --> Formatter
  end

  Worker --> Init
  Worker --> Decompile
  wasm_pkg -.-> Lib
```

The playground UI calls only `decompile` today. `unpack` and `ruleNames` are exported for programmatic use and match the CLI/core semantics.

| Layer | Path | Role |
|---|---|---|
| WASM crate | `crates/wasm/` | `cdylib` bindings, serde result types, embedded TypeScript defs |
| Generated JS/WASM | `crates/wasm/pkg/` | `wasm-pack` output (gitignored) |
| Build script | `playground/scripts/build-wasm.mjs` | Invokes `wasm-pack` before Vite |
| Playground app | `playground/src/` | Monaco editors, controls, share URLs, source-map overlay |
| Worker bridge | `playground/src/wasm/` | Off-main-thread WASM init and request routing |

## Build wakaru-wasm

### Prerequisites

- Rust stable with `wasm32-unknown-unknown` target
- [`wasm-pack`](https://rustwasm.github.io/wasm-pack/)
- Node.js (playground uses Node 24 in CI)

<Steps>
<Step title="Install the WASM target">

```bash
rustup target add wasm32-unknown-unknown
```

</Step>
<Step title="Build the package">

From the repository root:

```bash
wasm-pack build crates/wasm --target web --out-dir pkg --release
```

Or from `playground/`:

```bash
npm run build:wasm
```

`build-wasm.mjs` runs the same `wasm-pack` invocation with `--target web` and writes to `crates/wasm/pkg`.

</Step>
<Step title="Verify output">

After a successful build, `crates/wasm/pkg/` contains at least:

- `wakaru_wasm.js` — ES module glue code
- `wakaru_wasm_bg.wasm` — compiled binary
- `wakaru_wasm.d.ts` — generated types (supplemented by the crate's `typescript_custom_section`)

</Step>
</Steps>

<Note>
`crates/wasm/Cargo.toml` sets `wasm-opt = false` for the release profile. Binary size is traded for faster CI builds and predictable output.
</Note>

### Crate configuration

`wakaru-wasm` is a workspace member packaged as `cdylib`. Dependencies include `wakaru-core`, `wakaru-formatter`, `wasm-bindgen`, `serde-wasm-bindgen`, `console_error_panic_hook`, and `getrandom` with the `wasm_js` feature for browser entropy.

The `#[wasm_bindgen(start)]` `init` function installs `console_error_panic_hook` so Rust panics surface in the browser console.

## JavaScript bindings

Three functions are exported from `crates/wasm/src/lib.rs`. All return structured objects serialized through `serde_wasm_bindgen`. Errors from `wakaru-core` propagate as thrown `JsValue` strings.

### `decompile`

Single-file decompilation with optional source-map input and emitted output map.

<ParamField body="source" type="string" required>
Input JavaScript source.
</ParamField>

<ParamField body="level" type='"minimal" | "standard" | "aggressive"'>
Rewrite level. Defaults to `standard` when omitted or invalid (via `RewriteLevel::from_str_or_default`).
</ParamField>

<ParamField body="sourcemap" type="Uint8Array">
Raw bytes of a v3 source map for identifier recovery and import deduplication.
</ParamField>

<ParamField body="diagnostics" type="boolean">
When `true`, runs post-transform checks (TDZ, output parse). Default: `false`.
</ParamField>

<ParamField body="formatter" type="boolean">
When `true`, runs the Oxc formatter after decompilation. Default: `false` (no formatting).
</ParamField>

<ParamField body="emitSourceMap" type="boolean">
When `true`, returns a v3 source map mapping output back to input. Default: `false`.
</ParamField>

<ResponseField name="code" type="string">
Decompiled (and optionally formatted) JavaScript.
</ResponseField>

<ResponseField name="source_map" type="string">
Emitted source map JSON. Omitted when `emitSourceMap` is `false` or no map was produced.
</ResponseField>

<ResponseField name="warnings" type="WakaruWarning[]">
Non-fatal pipeline and formatter warnings.
</ResponseField>

WASM `decompile` sets `dce_mode: DceMode::TransformOnly` (unlike the CLI default of `DceMode::Off`) and hardcodes `filename: "input.js"`.

<RequestExample>

```javascript title="Decompile in the browser"
import init, { decompile } from "wakaru-wasm";

await init();
const result = decompile(
  "var n=function(){return 1};", // source
  "standard",                    // level
  undefined,                     // sourcemap (no input map)
  true,                          // diagnostics
  true,                          // formatter
  true                           // emitSourceMap
);
console.log(result.code);
console.log(result.warnings);
```

</RequestExample>

### `unpack`

Bundle unpacking for browser-side automation. Mirrors `wakaru_core::unpack` with per-module formatting.

<ParamField body="source" type="string" required>
Bundle JavaScript source.
</ParamField>

<ParamField body="level" type='"minimal" | "standard" | "aggressive"'>
Rewrite level.
</ParamField>

<ParamField body="heuristicSplit" type="boolean">
Enables scope-hoisted heuristic splitting when no structural bundle is detected. Default: `true`.
</ParamField>

<ParamField body="diagnostics" type="boolean">
Post-transform diagnostic checks. Default: `false`.
</ParamField>

<ParamField body="formatter" type="boolean">
Oxc formatting per extracted module. Default: `false`.
</ParamField>

<ParamField body="emitSourceMap" type="boolean">
Per-module emitted source maps. Default: `false`.
</ParamField>

<ResponseField name="modules" type="WakaruModule[]">
Extracted modules with `filename` and `code`.
</ResponseField>

<ResponseField name="source_maps" type="WakaruSourceMap[]">
Per-module maps (`filename`, `map`). Omitted when empty.
</ResponseField>

<ResponseField name="warnings" type="WakaruWarning[]">
Unpack and formatter warnings.
</ResponseField>

### `ruleNames`

Returns the ordered pipeline rule identifiers from `wakaru_core::rule_names()` as a `string[]`. Useful for debugging and UI that lists active rules.

## TypeScript result types

Types are declared in two places:

1. **Embedded section** — `#[wasm_bindgen(typescript_custom_section)]` in `lib.rs` augments `wakaru_wasm.d.ts` with canonical interfaces and camelCase parameter names on exported functions.
2. **Playground shim** — `playground/src/vite-env.d.ts` declares the `wakaru-wasm` module for the Vite alias, including `default init()` and snake_case fields on result objects (`source_map` on `WakaruDecompileResult`).

### Core interfaces

```typescript title="Embedded TypeScript definitions (lib.rs)"
export interface WakaruDecompileResult {
    code: string;
    source_map?: string;
    warnings: WakaruWarning[];
}

export interface WakaruUnpackResult {
    modules: WakaruModule[];
    source_maps?: WakaruSourceMap[];
    warnings: WakaruWarning[];
}

export type WakaruWarningKind =
    | "raw_normalization_failed"
    | "fact_collection_parse_failed"
    | "decompile_failed"
    | "tdz_violation"
    | "output_parse_failed"
    | "formatter_failed";
```

Formatter failures add warnings with `kind: "formatter_failed"` and a message naming the formatter (`oxc`) and underlying error.

<Warning>
Serde field names use snake_case (`source_map`, `source_maps`). The playground worker maps `result.source_map` to `sourceMap` in its internal bridge types. When consuming `wakaru-wasm` directly, read `source_map` from the returned object.
</Warning>

## Vite integration

`playground/vite.config.ts` wires the WASM artifact into the React app.

| Setting | Value | Purpose |
|---|---|---|
| `resolve.alias["wakaru-wasm"]` | `../crates/wasm/pkg` | Import the local wasm-pack output |
| `plugins` | `react()`, `wasm()` | `vite-plugin-wasm` enables `.wasm` imports |
| `worker.plugins` | `[wasm()]` | WASM support inside Web Workers |
| `optimizeDeps.exclude` | `["wakaru-wasm"]` | Skip dependency pre-bundling for the WASM package |
| `server.fs.allow` | Repository root | Dev server can read `crates/wasm/pkg` |
| `build.target` | `esnext` | Modern JS for worker modules |
| `base` | `/playground/` (build) / `/` (dev) | Production assets served under `/playground/` |

Build-time defines inject version metadata from the workspace `Cargo.toml` and current git hash:

- `import.meta.env.VITE_WAKARU_VERSION`
- `import.meta.env.VITE_WAKARU_GIT_HASH`

The header displays `v{version}` and links to the commit on GitHub.

### npm scripts

```json title="playground/package.json scripts"
"build:wasm": "node scripts/build-wasm.mjs",
"dev": "vite",
"build": "npm run build:wasm && vite build",
"preview": "vite preview",
"test": "vitest run"
```

Production `npm run build` always rebuilds WASM before the Vite bundle.

## Playground runtime

### Web Worker isolation

`WasmBridge` spawns a module worker at `playground/src/wasm/worker.ts`. The worker:

1. Handles `{ type: "init" }` by calling `await init()` from `wakaru-wasm`.
2. Handles `{ type: "decompile", ... }` by calling `decompile()` and posting results back with a request `id`.

This keeps SWC/WASM work off the UI thread. Monaco editors and source-map hover overlays stay responsive during decompilation.

### Auto-run behavior

After WASM init, `App.tsx` debounces input changes and re-runs decompilation automatically. Delay scales with the last run duration (60–300 ms). The playground passes `diagnostics: true`, `formatter` from the controls toggle, and `emitSourceMap: true` to power the mapping overlay.
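
The adaptive delay can be pictured as a clamp. This is a minimal sketch under one assumption: that the delay simply tracks the previous run's duration and is bounded to the 60–300 ms window. The exact scaling used in `App.tsx` is not documented here, and `debounce_delay_ms` is a hypothetical name. The playground itself is TypeScript; Rust is used only to match the other sketches in this reference.

```rust
/// Hypothetical model of the playground's adaptive debounce.
/// Assumption: the delay follows the last run duration, clamped to 60..=300 ms.
fn debounce_delay_ms(last_run_ms: u64) -> u64 {
    const MIN_DELAY_MS: u64 = 60;
    const MAX_DELAY_MS: u64 = 300;
    last_run_ms.clamp(MIN_DELAY_MS, MAX_DELAY_MS)
}

fn main() {
    // Fast runs wait the minimum; slow runs are capped at the maximum.
    for last in [5, 120, 900] {
        println!("last run {} ms -> wait {} ms", last, debounce_delay_ms(last));
    }
}
```

Bounding the delay on both sides keeps small inputs feeling live while preventing large inputs from queuing a new run on every keystroke.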

### Controls and features

| Control | WASM parameter | Notes |
|---|---|---|
| Level selector | `level` | `minimal`, `standard`, `aggressive` |
| Formatter toggle | `formatter` | Disabled when Mapping overlay is on |
| Mapping toggle | `emitSourceMap` (always on during run) | Visualizes input↔output line links via emitted map |
| Share button | — | Encodes gzip-compressed state in URL hash (`#state=1\|...`) |

Share limits: source max 1,000,000 characters; encoded state max 200,000 characters. Oversized input shows "Input is too large to share".

<Info>
The live demo is linked from the README at `https://wakaru.vercel.app/playground`. Bug report templates accept playground share URLs as reproductions.
</Info>

## Deployment

`.github/workflows/playground.yml` deploys on pushes to `main` that touch `crates/core/**`, `crates/wasm/**`, `playground/**`, or the workflow itself.

<Steps>
<Step title="CI build pipeline">

1. Install Rust + `wasm32-unknown-unknown`
2. Install `wasm-pack`
3. `wasm-pack build crates/wasm --target web --out-dir pkg --release`
4. `npm install` in `playground/`
5. `vercel build --prod` then `vercel deploy --prebuilt --prod`

</Step>
<Step title="Routing">

`playground/vercel.json` rewrites `/playground` and `/playground/*` to the Vite `dist` output. The main `website/` project proxies `/playground` requests to the dedicated Vercel playground project.

</Step>
</Steps>

Concurrency group `playground-deploy` cancels in-progress deploys when new commits land.

## Local development

<Steps>
<Step title="First-time setup">

```bash
rustup target add wasm32-unknown-unknown
curl https://rustwasm.github.io/wasm-pack/installer/init.sh -sSf | sh
cd playground && npm install
```

</Step>
<Step title="Run dev server">

```bash
cd playground
npm run build:wasm   # required before first dev session
npm run dev
```

Open the URL Vite prints (dev uses `base: "/"`).

</Step>
<Step title="Production preview">

```bash
cd playground
npm run build
npm run preview
```

</Step>
</Steps>

<Tip>
Rebuild WASM after changing `crates/core` or `crates/wasm`. The dev server does not watch Rust sources.
</Tip>

## WASM vs CLI defaults

| Option | WASM `decompile` | WASM `unpack` | CLI default |
|---|---|---|---|
| `dce_mode` | `TransformOnly` | `Off` (via `..Default::default()`) | `Off` |
| `heuristic_split` | N/A | `true` | `false` |
| `diagnostics` | `false` unless passed | `false` unless passed | CLI flag |
| `formatter` | Oxc when `true` | Oxc per module when `true` | `--formatter` flag |
| `filename` | `"input.js"` | `"input.js"` | From input path |

For full flag surfaces and JSON stdout schemas, see the CLI and core API reference pages.

## Troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| `Failed to resolve import "wakaru-wasm"` | Missing `crates/wasm/pkg` | Run `npm run build:wasm` |
| `WASM not initialized` in worker | `init()` not completed | Ensure `WasmBridge.waitForInit()` resolves before decompile |
| Stale playground output after core changes | WASM not rebuilt | Re-run `npm run build:wasm` |
| Dev server cannot read pkg | `server.fs.allow` misconfigured | Keep Vite config allowing repo root |
| Formatter warnings with `formatter_failed` | Oxc could not format output | Output is preserved; warning is non-fatal |
| Share URL fails | Input exceeds size limits | Reduce source or share a gist link instead |

<AccordionGroup>
<Accordion title="Why is unpack not in the playground UI?">
`unpack` is exported and fully wired in the WASM crate, but the current playground only exercises single-file `decompile`. Bundle unpacking remains a CLI workflow (`wakaru --unpack`) or a custom integration calling `unpack()` from JavaScript.
</Accordion>
<Accordion title="Why run WASM in a worker?">
Decompilation is CPU-heavy. The worker boundary prevents blocking React rendering and Monaco input while `wakaru-core` runs inside WASM.
</Accordion>
</AccordionGroup>

## Related pages

<CardGroup>
<Card title="WASM API reference" href="/wasm-api-reference">
Per-export signatures, JSON shapes, and warning kind strings for `decompile`, `unpack`, and `ruleNames`.
</Card>
<Card title="Core API reference" href="/core-api-reference">
`DecompileOptions`, `DceMode`, `RewriteLevel`, and unpack output types that the WASM layer wraps.
</Card>
<Card title="Rewrite levels and assumptions" href="/rewrite-levels-and-assumptions">
What `minimal`, `standard`, and `aggressive` change in the pipeline.
</Card>
<Card title="Use source maps" href="/use-source-maps">
Input source maps for rename recovery and emitted maps from `--emit-source-map` / `emitSourceMap`.
</Card>
<Card title="Overview" href="/overview">
Three-crate workspace layout and the path from input JavaScript to readable output.
</Card>
<Card title="Troubleshooting" href="/troubleshooting">
`UnpackWarningKind` codes, formatter failures, and bug-report fields.
</Card>
</CardGroup>

---

## 13. Develop transformation rules

> Add or modify VisitMut rules: test-first workflow, pipeline placement in RuleDescriptor order, unresolved_mark guards, BindingRenamer for renames, and definition-of-done verification checklist.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/13-develop-transformation-rules.md
- Generated: 2026-06-28T01:09:06.520Z

### Source Files

- `AGENTS.md`
- `CONTRIBUTING.md`
- `docs/architecture.md`
- `docs/testing.md`
- `crates/core/src/rules/pipeline.rs`
- `crates/core/src/rules/rename_utils.rs`


Wakaru's decompile pipeline applies roughly 60 SWC `VisitMut` transformation rules from `crates/core/src/rules/`, executed in a fixed `RuleDescriptor` order defined in `pipeline.rs`. Each rule lives in one Rust file and is registered through a `mod.rs` export, a `runner!` function, and a `RuleDescriptor` entry inside `define_rule_registry!`. Validate rule behavior with isolated `render_rule` tests before reviewing pipeline snapshot regressions.

## Rule file layout

| Path | Role |
|---|---|
| `crates/core/src/rules/<rule>.rs` | Rule implementation (`VisitMut`) |
| `crates/core/src/rules/mod.rs` | `mod` declaration + `pub use` export |
| `crates/core/src/rules/pipeline.rs` | `runner!` function, `RuleDescriptor` entry, registry order |
| `crates/core/tests/<rule>_rule.rs` | Isolated unit tests (required for every change) |

A minimal rule struct implements `VisitMut` and visits children before applying its own transform:

```rust
use swc_core::ecma::ast::Expr;
use swc_core::ecma::visit::{VisitMut, VisitMutWith};

pub struct MyRule;

impl VisitMut for MyRule {
    fn visit_mut_expr(&mut self, expr: &mut Expr) {
        expr.visit_mut_children_with(self);
        // transformation logic
    }
}
```

Rules that match identifiers by name take `unresolved_mark: Mark` and receive it from `RuleRunContext` in the pipeline runner (see [Scope-aware matching](#scope-aware-matching-with-unresolved_mark)).

<Note>
Second pipeline passes reuse the same runner but get a suffixed registry ID — for example `UnWebpackInterop2`, `UnIife2`, `UnParameters3`. Test helpers and `debug trace` use these suffixed names.
</Note>

## Test-first workflow

Every rule change requires a focused unit test. Pipeline snapshot updates alone do not satisfy coverage — they exercise the whole pipeline, not the individual rule.

<Steps>
<Step title="Write failing tests">

Create or extend `crates/core/tests/my_rule_rule.rs` with positive cases (pattern transforms) and negative cases (unrelated code unchanged):

```rust
mod common;

use common::{assert_eq_normalized, render_rule};
use wakaru_core::rules::MyRule;

fn apply(input: &str) -> String {
    render_rule(input, |unresolved_mark| MyRule::new(unresolved_mark))
}

#[test]
fn transforms_target_pattern() {
    let input = r#"/* minified input */"#;
    let expected = r#"/* readable output */"#;
    assert_eq_normalized(&apply(input), expected);
}

#[test]
fn leaves_unrelated_code_alone() {
    let input = r#"/* code that should not change */"#;
    assert_eq_normalized(&apply(input), input);
}
```

Run only your test file while iterating:

```bash
cargo test -p wakaru-core --test my_rule_rule
```

</Step>

<Step title="Implement the rule">

Add `crates/core/src/rules/my_rule.rs` and wire it into `mod.rs` and `pipeline.rs` (next section). Iterate until focused tests pass.

</Step>

<Step title="Check pipeline regressions">

Run the required pipeline test binaries and review any snapshot drift:

```bash
cargo test -p wakaru-core --test noop_pipeline
cargo test -p wakaru-core --test webpack4_unpack
cargo test -p wakaru-core --test webpack4_unpack_raw
cargo test -p wakaru-core --test bundle_unpack
cargo test -p wakaru-core --test esbuild_unpack
```

</Step>
</Steps>

### Choosing a test helper

| Helper | When to use |
|---|---|
| `render_rule(source, builder)` | Default — single rule in isolation (resolver + rule + fixer) |
| `render(source)` | Rule depends on earlier normalization (helper detection, Stage 1 transforms) |
| `render_pipeline_until(source, stop_after)` | Verify AST shape at a specific pipeline point |
| `render_pipeline_between(source, start, stop)` | Test a rule given realistic pre-processed input without downstream effects |
| `assert_eq_normalized(actual, expected)` | Compare output after whitespace normalization |
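
To make "whitespace normalization" concrete, here is a minimal sketch of what such a comparison could look like. It assumes normalization means collapsing every whitespace run to a single space; the real helper in `crates/core/tests/common/mod.rs` may normalize differently, and both function names below are hypothetical.

```rust
/// Hypothetical normalizer: collapse each whitespace run to one space and trim the ends.
fn normalize_ws(s: &str) -> String {
    s.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Sketch of a normalized comparison; panics with both normalized strings on mismatch.
fn assert_eq_normalized_sketch(actual: &str, expected: &str) {
    assert_eq!(normalize_ws(actual), normalize_ws(expected));
}

fn main() {
    // Indentation and line breaks differ, but the token sequence is the same.
    assert_eq_normalized_sketch("const x =\n    1;\n", "const x = 1;");
    println!("{}", normalize_ws("  a  =>\n\tb  "));
}
```

Comparing this way lets test expectations be written with readable indentation without coupling them to the emitter's exact layout.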

For rules needing `unresolved_mark`, pass it through the builder closure:

```rust
fn apply(input: &str) -> String {
    render_rule(input, |unresolved_mark| MyRule::new(unresolved_mark))
}
```

For `.ts`/`.tsx` inputs, use `render_rule_with_filename`.

<Warning>
Do not use bare expression statements as test inputs (e.g. `65536;`) — `SimplifySequence` drops them as dead code. Wrap in a binding: `const x = 65536;`.
</Warning>

Bugfixes to existing rules need a regression test that reproduces the exact failure before the fix.

## Register a rule in the pipeline

<Steps>
<Step title="Export from mod.rs">

```rust
mod my_rule;
pub use my_rule::MyRule;
```

</Step>

<Step title="Add a runner in pipeline.rs">

Use the `runner!` macro. Pass `unresolved_mark` or other context from `RuleRunContext` when needed:

```rust
runner!(run_my_rule, |ctx| MyRule::new(ctx.unresolved_mark));
```

For rules without context:

```rust
runner!(run_my_rule, MyRule);
```

Helper-dependent rules may need a custom runner function (see `run_un_es6_class` or `run_un_template_literal` for patterns that call `ctx.local_helpers(module)`).

</Step>

<Step title="Insert a RuleDescriptor entry">

Add an entry inside `define_rule_registry!` at the position where upstream dependencies are satisfied:

```rust
("MyRule", Structural, run_my_rule, always_enabled, requires: [
    "UnBracketNotation"
]),
```

Each descriptor specifies:

<ParamField body="id" type="&'static str" required>
Registry name used by `rule_names()`, `debug trace`, and `render_pipeline_until`. Must be unique; second passes append a numeric suffix (`UnIife2`).
</ParamField>

<ParamField body="stage" type="RuleStage" required>
One of `Syntax`, `Helpers`, `Structural`, `Complex`, `Modernization`, `Cleanup`. Metadata for grouping; execution order follows registry position, not stage enum order alone.
</ParamField>

<ParamField body="runner" type="RuleRunner" required>
Function produced by `runner!` or a custom `fn run_*` that calls `module.visit_mut_with`.
</ParamField>

<ParamField body="enabled" type="RuleEnabled" required>
Gate function: `always_enabled`, `standard_or_above`, or `dead_code_elimination_enabled`.
</ParamField>

<ParamField body="requires" type="&[&'static str]">
Optional, documentation-only dependency list. It does not enforce ordering; placement in the registry array determines run order. Comments in the registry explain why a rule sits where it does.
</ParamField>

</Step>
</Steps>
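
Because `requires` is not enforced, a misplaced entry fails silently. The sketch below shows one way to reason about placement: walk an ordered list and report every requirement that does not appear earlier. `Descriptor` and `unmet_requirements` are hypothetical stand-ins written for this example, not Wakaru types or helpers.

```rust
/// Simplified stand-in for a registry entry (not Wakaru's real `RuleDescriptor`).
struct Descriptor {
    id: &'static str,
    requires: &'static [&'static str],
}

/// Returns every (rule, requirement) pair where the requirement
/// does not appear earlier in the registry slice.
fn unmet_requirements(registry: &[Descriptor]) -> Vec<(&'static str, &'static str)> {
    let mut seen: Vec<&'static str> = Vec::new();
    let mut unmet = Vec::new();
    for d in registry {
        for &req in d.requires {
            if !seen.contains(&req) {
                unmet.push((d.id, req));
            }
        }
        // A rule counts as "seen" only after its own requirements were checked.
        seen.push(d.id);
    }
    unmet
}

fn main() {
    let registry = [
        Descriptor { id: "UnBracketNotation", requires: &[] },
        Descriptor { id: "MyRule", requires: &["UnBracketNotation"] },
    ];
    println!("unmet: {:?}", unmet_requirements(&registry));
}
```

Swapping the two entries would report `("MyRule", "UnBracketNotation")`, which is the kind of ordering mistake that otherwise only surfaces as snapshot drift.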

### Pipeline stages

Rules group into six stages. Placement within the ordered registry matters more than the stage label:

| Stage | Examples | Typical purpose |
|---|---|---|
| `Syntax` | `SimplifySequence`, `FlipComparisons`, `UnBracketNotation` | Normalize minified syntax |
| `Helpers` | `UnInteropRequireDefault`, `UnCurlyBraces`, `UnEsm` | Unwrap transpiler helpers, reconstruct modules |
| `Structural` | `UnTemplateLiteral`, `UnNullishCoalescing`, `UnOptionalChaining` | Restore language constructs |
| `Complex` | `UnIife`, `UnParameters`, `UnEs6Class`, `UnAsyncAwait` | Multi-pattern recovery |
| `Modernization` | `ArrowFunction`, `VarDeclToLetConst`, `UnForOf` | ESNext idioms |
| `Cleanup` | `SmartInline`, `SmartRename`, `DeadDecls`, `UnReturn` | Rename, inline, dead-code cleanup |

```text
parse → resolver(unresolved_mark) → apply_rules(RULE_DESCRIPTORS) → fixer → emit
                              ↑
                    RulePipelineOptions controls
                    start/stop range, rewrite level,
                    DCE mode, module facts
```

### Placement heuristics

When choosing registry position, ask what AST shape your rule expects and what later rules consume:

| Requirement | Place after / before |
|---|---|
| `["default"]` normalized to `.default` | After `UnBracketNotation` |
| `require()` calls still present | Before `UnEsm` |
| Rule creates new IIFEs | Before `UnIife2` (second pass) |
| Alias `var` declarations must remain | Before `SmartInline` (removes `var h = p`) |
| Export specifiers must reference real bindings | After `SmartInline` |
| `UnEsm` prerequisites (braces, `__esModule`, assignment splitting) | After `UnCurlyBraces`, `UnEsmoduleFlag`, `UnAssignmentMerging`, etc. |

Use `cargo run -p wakaru-cli -- debug trace input.js` or `render_pipeline_until` to confirm the AST shape your rule receives. See the [rule pipeline reference](/rule-pipeline-reference) for the full ordered registry and documented `requires` edges.

<Info>
`RulePipelineOptions` supports `until(stop_after)`, `between(start, stop)`, `with_rewrite_level`, `with_dce_mode`, and `with_module_facts` for tests and the unpack driver's two-phase execution.
</Info>
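
The interaction between rewrite level and the `enabled` gate can be sketched as a filter over the ordered registry. This is an illustration under assumptions: the gate signature, the `Gate` alias, and the rule names are invented for the example, and the DCE gate is left out because its exact semantics are not described on this page.

```rust
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum RewriteLevel {
    Minimal,
    Standard,
    Aggressive,
}

/// Hypothetical gate signature; the real `RuleEnabled` type may take other inputs.
type Gate = fn(RewriteLevel) -> bool;

fn always_enabled(_level: RewriteLevel) -> bool {
    true
}

fn standard_or_above(level: RewriteLevel) -> bool {
    // Derived ordering follows declaration order: Minimal < Standard < Aggressive.
    level >= RewriteLevel::Standard
}

/// Keeps registry order and drops rules whose gate rejects the level.
fn enabled_rules(registry: &[(&'static str, Gate)], level: RewriteLevel) -> Vec<&'static str> {
    registry
        .iter()
        .filter(|(_, gate)| gate(level))
        .map(|(id, _)| *id)
        .collect()
}

fn main() {
    // Rule names here are placeholders, not real registry IDs.
    let registry = [
        ("AlwaysOnRule", always_enabled as Gate),
        ("StandardOnlyRule", standard_or_above as Gate),
    ];
    for level in [RewriteLevel::Minimal, RewriteLevel::Standard, RewriteLevel::Aggressive] {
        println!("{:?}: {:?}", level, enabled_rules(&registry, level));
    }
}
```

Filtering never reorders: a gated-off rule is skipped, and the remaining rules still run in registry position.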

## Scope-aware matching with unresolved_mark

After `resolver(unresolved_mark, top_level_mark)`, every identifier carries a `SyntaxContext`. Free variables (globals like `Object`, `require`) have `unresolved_mark` as their outer mark; locally bound identifiers do not.

Rules that match identifiers **by name** must gate on `SyntaxContext` to avoid transforming unrelated inner-scope bindings:

```rust
use swc_core::common::Mark;
use swc_core::ecma::ast::Ident;
use swc_core::ecma::visit::VisitMut;

pub struct MyRule {
    unresolved_mark: Mark,
}

impl MyRule {
    pub fn new(unresolved_mark: Mark) -> Self {
        Self { unresolved_mark }
    }

    /// True when `id` is a free (global) reference rather than a local binding.
    fn is_free_variable(&self, id: &Ident) -> bool {
        id.ctxt.outer() == self.unresolved_mark
    }
}

impl VisitMut for MyRule {
    fn visit_mut_ident(&mut self, id: &mut Ident) {
        // Guard pattern: skip bound locals.
        if !self.is_free_variable(id) {
            return;
        }
        // Name-based matching and rewriting go here.
    }
}
```

Without this guard, a rule matching webpack factory param `e` would also rewrite `e` inside `function inner(e) { ... }`.

<Warning>
Every new visitor that matches identifiers by name must take `unresolved_mark: Mark` and gate on it. Missing guards are a common source of snapshot cascades and incorrect renames.
</Warning>
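
The effect of the guard is easiest to see without SWC in the way. The following sketch models marks and identifiers with plain structs; `Mark`, `Ident`, and `RequireMatcher` here are toy stand-ins defined for this example, not the `swc_core` types.

```rust
/// Toy stand-in for SWC's `Mark`.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct Mark(u32);

/// Toy identifier: a name plus the outer mark of its syntax context.
#[derive(Debug)]
struct Ident {
    sym: &'static str,
    outer_mark: Mark,
}

struct RequireMatcher {
    unresolved_mark: Mark,
}

impl RequireMatcher {
    /// Matches only a free `require`, never a local binding that shares the name.
    fn matches(&self, id: &Ident) -> bool {
        id.sym == "require" && id.outer_mark == self.unresolved_mark
    }
}

fn main() {
    let unresolved = Mark(1);
    let matcher = RequireMatcher { unresolved_mark: unresolved };

    // Global `require(...)`: resolved as a free variable.
    let global_require = Ident { sym: "require", outer_mark: unresolved };
    // Parameter named `require`, as in `function f(require) { ... }`.
    let local_require = Ident { sym: "require", outer_mark: Mark(7) };

    println!("global matches: {}", matcher.matches(&global_require));
    println!("local matches: {}", matcher.matches(&local_require));
}
```

Both identifiers have the same `sym`, so a name-only check cannot tell them apart; the mark comparison is what separates them.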

## Renaming with BindingRenamer

Never rename identifiers by `sym` alone with a custom `VisitMut` — that hits inner-scope locals and parameters sharing the same name. Use `rename_utils::BindingRenamer`, which keys renames on `(Atom, SyntaxContext)` bindings.

### Core types and functions

```rust
use wakaru_core::rules::rename_utils::{
    rename_bindings_in_module, rename_bindings,
    BindingRename, BindingId,
};

// BindingId = (Atom, SyntaxContext)
let renames = vec![BindingRename {
    old: (old_sym, old_ctxt),
    new: new_name,
}];
rename_bindings_in_module(module, &renames);
// Or on a subtree:
rename_bindings(&mut stmts, &renames);
```

`BindingRenamer` handles declaration sites, import/export specifiers, object shorthand, and destructuring patterns — cases a naive ident swap misses.
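
Why the key must include the syntax context can be shown with a toy model. This sketch uses `(String, u32)` in place of `(Atom, SyntaxContext)` and a flat list of identifiers in place of an AST; the types and the `rename_bindings` function below are simplified illustrations, not the `rename_utils` API.

```rust
/// Toy binding id: (name, scope context), standing in for `(Atom, SyntaxContext)`.
type BindingId = (String, u32);

#[derive(Debug, Clone, PartialEq)]
struct Ident {
    sym: String,
    ctxt: u32,
}

/// Renames only identifiers whose name AND context match a rename entry.
fn rename_bindings(idents: &mut [Ident], renames: &[(BindingId, String)]) {
    for id in idents.iter_mut() {
        for ((old_sym, old_ctxt), new_name) in renames {
            if id.sym == *old_sym && id.ctxt == *old_ctxt {
                id.sym = new_name.clone();
                break; // apply at most one rename per identifier
            }
        }
    }
}

fn main() {
    // Two different bindings that both happen to be called `e`.
    let mut idents = vec![
        Ident { sym: "e".into(), ctxt: 2 }, // outer factory parameter
        Ident { sym: "e".into(), ctxt: 5 }, // inner function parameter
    ];
    let renames = vec![(("e".to_string(), 2), "module".to_string())];
    rename_bindings(&mut idents, &renames);
    println!("{:?}", idents);
}
```

Only the `ctxt: 2` identifier is renamed; a rename keyed on the name alone would have rewritten both.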

### Shadow safety

Before applying a rename, check whether the new name would be captured by a nested scope:

| Function | Purpose |
|---|---|
| `rename_causes_shadowing(module, old, new_name)` | Whole-module shadow check for one binding |
| `binding_replacement_would_be_shadowed(module, old, replacement_name)` | Targeted check before raw `Expr::Ident` substitution |
| `RenameShadowIndex::for_bindings(module, bindings)` | Batch forbidden-name index for multiple renames |

`SmartRename`, `UnImportRename`, `ImportDedup`, and `UnParameters` all route through these utilities.
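
As a rough intuition for what a shadow check guards against, consider this simplified model. It assumes a rename is unsafe whenever a scope both references the old binding and declares the new name; the real utilities work on SWC syntax contexts and handle more cases. `Scope` and `would_shadow` are hypothetical names for this sketch.

```rust
/// Toy scope: the names it declares, plus the outer bindings it references.
struct Scope {
    declares: Vec<&'static str>,
    references: Vec<(&'static str, u32)>,
}

/// A rename is unsafe if some scope references the old binding and also declares
/// `new_name`: the renamed reference would resolve to the inner declaration.
fn would_shadow(scopes: &[Scope], old: (&'static str, u32), new_name: &str) -> bool {
    scopes.iter().any(|s| {
        s.references.contains(&old) && s.declares.iter().any(|d| *d == new_name)
    })
}

fn main() {
    let scopes = vec![
        // function f() { var helper; return a; }
        Scope { declares: vec!["helper"], references: vec![("a", 1)] },
        // function g() { var x; }
        Scope { declares: vec!["x"], references: vec![] },
    ];
    let old = ("a", 1);
    // Renaming `a` to `helper` would be captured inside `f`.
    println!("a -> helper: {}", would_shadow(&scopes, old, "helper"));
    // `x` is declared only where `a` is not referenced, so it is safe in this model.
    println!("a -> x: {}", would_shadow(&scopes, old, "x"));
}
```

The batch index serves the same purpose for many renames at once: compute the forbidden names per binding up front instead of rescanning the module for each candidate.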

## Modify an existing rule

Default: add tests to the existing `crates/core/tests/<rule>_rule.rs` file rather than creating a new one. Only create a new `*_rule.rs` file when adding an entirely new rule.

For bugfixes:

1. Add a regression test reproducing the exact broken input/output pair.
2. Fix the rule implementation.
3. Run focused tests, then pipeline tests.
4. If snapshots change, inspect diffs — confirm output is semantically better, not merely different.

## Definition of done

Before opening a PR, complete this checklist:

<Steps>
<Step title="Focused rule tests">

```bash
cargo test -p wakaru-core --test my_rule_rule
```

</Step>

<Step title="Pipeline integration tests">

```bash
cargo test -p wakaru-core --test noop_pipeline
cargo test -p wakaru-core --test webpack4_unpack
cargo test -p wakaru-core --test webpack4_unpack_raw
cargo test -p wakaru-core --test bundle_unpack
cargo test -p wakaru-core --test esbuild_unpack
```

</Step>

<Step title="Formatting and lint">

```bash
cargo fmt --check
cargo clippy -p wakaru-core --all-targets -- -D warnings
```

Use `cargo clippy --workspace --all-targets -- -D warnings` when touching non-core crates.

</Step>

<Step title="Review snapshot diffs">

`.cargo/config.toml` sets `INSTA_UPDATE=new` — changed snapshots fail tests and write `.snap.new` files. Review each diff; accept intentional changes with `cargo insta accept`. No stale `.snap.new` files should remain (`git status --short`).

</Step>

<Step title="Optional fixture matrix">

If you have the sibling `wakaru-fixtures` repo and the change affects decompile output:

```bash
../wakaru-fixtures/run.sh --check
```

</Step>
</Steps>

Prefer `cargo nextest run -p wakaru-core` for faster iteration during development; `cargo test` remains valid for single-file focus.

## Debugging rule behavior

When a rule does not fire or produces unexpected output:

| Symptom | Likely cause | Action |
|---|---|---|
| Rule not firing | Earlier rule changed AST shape | `debug trace` to see input at your rule's position |
| Wrong variable renamed | Missing `unresolved_mark` guard | Add `ctxt.outer()` check |
| Many snapshots changed | Early rule cascading | Bisect with `render_pipeline_until` or `--from`/`--until` trace ranges |
| `render_rule` unchanged but `render` works | Depends on earlier normalization | Use `render` or pre-normalize test input |
| Test hangs | Infinite recursion in visitor | `RUST_BACKTRACE=1 cargo test -- --nocapture` |

<CodeGroup>

```bash title="Rule trace CLI"
cargo run -p wakaru-cli -- debug trace path/to/input.js
```

```bash title="Trace a rule range"
cargo run -p wakaru-cli -- debug trace path/to/input.js --from RemoveVoid --until UnEsm
```

```bash title="Show all rules including no-ops"
cargo run -p wakaru-cli -- debug trace path/to/input.js --all
```

</CodeGroup>

## Related pages

<CardGroup>
<Card title="Rule pipeline reference" href="/rule-pipeline-reference">
Ordered `RuleDescriptor` registry, `rule_names()` identifiers, `RuleStage` groupings, and documented cross-rule dependencies.
</Card>
<Card title="Trace the rule pipeline" href="/trace-rule-pipeline">
Per-rule diffs with `debug trace`, `--from`/`--until` ranges, and bisection workflow for single-file regressions.
</Card>
<Card title="Testing and snapshots" href="/testing-and-snapshots">
`cargo nextest` vs `cargo test`, insta snapshot workflow, test helpers, and the pre-commit verification matrix.
</Card>
<Card title="Decompile pipeline" href="/decompile-pipeline">
Parse → resolver → staged rules → fixer → emit flow, including `unresolved_mark` scope gating in context.
</Card>
<Card title="Helper detection" href="/helper-detection">
`LocalHelperContext`, body-shape matchers, and helper lifecycle for rules in the Helpers stage.
</Card>
<Card title="Debug regressions" href="/debugging-regressions">
Snapshot drift investigation, `unresolved_mark` symptom mapping, and early-rule cascade diagnosis.
</Card>
</CardGroup>

---

## 14. Trace the rule pipeline

> Use debug trace (and --profile / --profile-rules) to bisect single-file regressions with per-rule diffs, --from/--until ranges, and limitations for bundle unpack debugging.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/14-trace-the-rule-pipeline.md
- Generated: 2026-06-28T01:09:52.771Z

### Source Files

- `docs/debugging.md`
- `crates/cli/src/main.rs`
- `crates/core/src/driver.rs`
- `README.md`


Wakaru exposes two complementary debugging surfaces for the single-file rule pipeline: `wakaru debug trace` prints per-rule before/after unified diffs via `trace_rules` / `format_trace_events`, and global `--profile` / `--profile-rules` flags emit a Chrome-compatible trace with optional per-rule `debug_span` timing. Bundle unpack uses a two-phase cross-module pipeline that `debug trace` does not model.

## Debug trace command

`debug trace` is a hidden subcommand under `wakaru debug`. It runs parse → resolver marks → `apply_rules_with_observer`, snapshots rendered output after each rule, and prints git-style unified diffs for rules that change the code.

<CodeGroup>

```bash title="Trace all changed rules"
cargo run -p wakaru-cli -- debug trace path/to/module.js
```

```bash title="Write trace to file"
cargo run -p wakaru-cli -- debug trace path/to/module.js -o trace.txt
```

```bash title="Using a built wakaru binary"
wakaru debug trace path/to/module.js
```

</CodeGroup>

### Flags

<ParamField body="input" type="path" required>
  Input JavaScript or TypeScript file. Bundle-shaped inputs are rejected.
</ParamField>

<ParamField body="-o, --output" type="path">
  Write trace output to a file. Prints to stdout when omitted.
</ParamField>

<ParamField body="-m, --source-map" type="path">
  Source map path. Accepted by the CLI but not applied during trace (see limitations below).
</ParamField>

<ParamField body="--all" type="boolean">
  Include rules that ran but did not change rendered output. Default: only changed rules.
</ParamField>

<ParamField body="--from" type="string">
  First rule to run (inclusive). Must match a `rule_names()` identifier such as `RemoveVoid` or `UnEsm`.
</ParamField>

<ParamField body="--until" type="string">
  Last rule to run (inclusive). Unknown names produce an error.
</ParamField>

<ParamField body="--level" type="RewriteLevel" default="standard">
  Rewrite aggressiveness: `minimal`, `standard`, or `aggressive`. Controls which rules are enabled.
</ParamField>

### Trace output format

`format_trace_events` renders events as:

1. One `=== initial ===` block with the post-resolver source.
2. For each changed rule: `=== RuleName ===` followed by a unified diff (`@@` hunks, `-`/`+` lines).
3. For unchanged rules (with `--all`): `=== RuleName (unchanged) ===` with no diff body.

The `before` text for each rule is implied by the previous event's `after` string, so intermediate states are not duplicated as full source blocks.
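
The "previous `after` is the next `before`" idea can be sketched as a small renderer. This is an illustration, not the real `format_trace_events`: `TraceEvent`, `naive_diff`, and `format_events` are hypothetical, and the diff is a naive same-index line comparison without `@@` hunks.

```rust
/// Hypothetical event: the rule name and the rendered source after it ran.
struct TraceEvent {
    rule: &'static str,
    after: String,
}

/// Placeholder diff: emits `-`/`+` for lines that differ at the same index.
/// A real unified diff aligns lines and prints `@@` hunk headers.
fn naive_diff(before: &str, after: &str) -> String {
    let b: Vec<&str> = before.lines().collect();
    let a: Vec<&str> = after.lines().collect();
    let mut out = String::new();
    for i in 0..b.len().max(a.len()) {
        let (old, new) = (b.get(i), a.get(i));
        if old != new {
            if let Some(l) = old {
                out.push_str(&format!("-{}\n", l));
            }
            if let Some(l) = new {
                out.push_str(&format!("+{}\n", l));
            }
        }
    }
    out
}

fn format_events(initial: &str, events: &[TraceEvent], show_unchanged: bool) -> String {
    let mut out = format!("=== initial ===\n{}\n", initial);
    // Only the running `before` text is kept; full sources are never re-printed.
    let mut before = initial.to_string();
    for ev in events {
        if ev.after != before {
            out.push_str(&format!("\n=== {} ===\n{}", ev.rule, naive_diff(&before, &ev.after)));
            before = ev.after.clone();
        } else if show_unchanged {
            out.push_str(&format!("\n=== {} (unchanged) ===\n", ev.rule));
        }
    }
    out
}

fn main() {
    let events = vec![
        TraceEvent { rule: "RemoveVoid", after: "const x = undefined;".into() },
        TraceEvent { rule: "SomeLaterRule", after: "const x = undefined;".into() },
    ];
    print!("{}", format_events("const x = void 0;", &events, true));
}
```

Passing `false` for `show_unchanged` corresponds to the default output, where rules that leave the rendered source untouched are omitted.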

<RequestExample>

```bash
cargo run -p wakaru-cli -- debug trace fixture.js --from RemoveVoid --until UnEsm
```

</RequestExample>

<ResponseExample>

```text
=== initial ===
const x = void 0;

=== RemoveVoid ===
@@ -1 +1 @@
-const x = void 0;
+const x = undefined;

=== UnEsm ===
...
```

</ResponseExample>

### Rule names

Rule identifiers match `rule_names()` and `RuleDescriptor.id` values from the pipeline registry. Second-pass rules use suffixed names (for example `UnIife2`, `UnWebpackInterop2`, `UnParameters2`). Consult the [rule pipeline reference](/rule-pipeline-reference) for the full ordered list and stage groupings.

<Warning>
`--from` and `--until` must use exact `rule_names()` strings. Typos fail fast with `unknown trace start rule` or `unknown trace stop rule`.
</Warning>
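
Inclusive range selection over an ordered rule list can be sketched as follows. The function name, the exact error formatting (the rule name appended after a colon), and the handling of a reversed range are assumptions of this sketch; only the two `unknown trace ... rule` phrases come from the CLI behavior described above. The rule list is an illustrative subset whose order may differ from the real registry.

```rust
/// Selects the inclusive `from..=until` slice of an ordered rule list.
fn select_range<'a>(
    rules: &'a [&'a str],
    from: Option<&str>,
    until: Option<&str>,
) -> Result<&'a [&'a str], String> {
    let start = match from {
        Some(name) => rules
            .iter()
            .position(|r| *r == name)
            .ok_or_else(|| format!("unknown trace start rule: {}", name))?,
        None => 0,
    };
    let stop = match until {
        Some(name) => rules
            .iter()
            .position(|r| *r == name)
            .ok_or_else(|| format!("unknown trace stop rule: {}", name))?,
        None => rules.len().saturating_sub(1),
    };
    if rules.is_empty() {
        return Ok(rules);
    }
    if start > stop {
        // Assumption: a reversed range is treated as an error in this sketch.
        return Err("trace start rule comes after stop rule".to_string());
    }
    Ok(&rules[start..=stop])
}

fn main() {
    // Illustrative subset only; not the real registry order.
    let rules = ["RemoveVoid", "UnBracketNotation", "UnEsm"];
    println!("{:?}", select_range(&rules, Some("RemoveVoid"), Some("UnEsm")));
    println!("{:?}", select_range(&rules, Some("RemoveVod"), None));
}
```

Setting `--from` and `--until` to the same name selects exactly one rule, which is how a single second-pass rule is isolated.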

## Bisect a single-file regression

<Steps>

<Step title="Capture a minimal reproduction">
  Reduce the failing input to a single file. For bundle issues, extract one module with `--unpack --raw` first (see bundle limitations).
</Step>

<Step title="Run a full trace">
  Run `debug trace` without range flags. Scan diffs for the first rule that introduces the bad output.
</Step>

<Step title="Narrow the range">
  Re-run with `--from` and `--until` around the suspect rule. Repeat until one rule is isolated.
</Step>

<Step title="Confirm with test helpers">
  Use `render_pipeline_until` to capture cumulative output up to the rule before the regression, and `render_pipeline_between` to run only the suspect range in a unit test.
</Step>

<Step title="Check for early cascades">
  If many rules look wrong, inspect early normalization rules (`SimplifySequence`, `FlipComparisons`, `RemoveVoid`) before blaming a late rule.
</Step>

</Steps>
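
The narrowing step is a binary search over stop points. The sketch below assumes the regression is monotonic: once the cumulative output is bad at some rule, it stays bad for every later stop point. `first_bad_rule` is a hypothetical helper; in practice the predicate would render the pipeline up to rule `i` (for example through `render_pipeline_until`) and check for the bad output.

```rust
/// Finds the index of the first rule whose cumulative output is bad.
/// `is_bad_until(i)` reports whether output is bad after running rules `0..=i`.
fn first_bad_rule<F>(rule_count: usize, mut is_bad_until: F) -> Option<usize>
where
    F: FnMut(usize) -> bool,
{
    // If the full pipeline is fine, there is nothing to bisect.
    if rule_count == 0 || !is_bad_until(rule_count - 1) {
        return None;
    }
    let (mut lo, mut hi) = (0, rule_count - 1);
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        if is_bad_until(mid) {
            hi = mid; // the culprit is at or before `mid`
        } else {
            lo = mid + 1; // output is still good through `mid`
        }
    }
    Some(lo)
}

fn main() {
    // Pretend the regression is introduced by the rule at index 6 of 10.
    let culprit = first_bad_rule(10, |i| i >= 6);
    println!("first bad rule index: {:?}", culprit);
}
```

With roughly 60 rules this needs about six or seven renders instead of reading every diff, but it is only valid when a later rule does not mask the bad output again.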

<CodeGroup>

```bash title="Show unchanged rules in a range"
cargo run -p wakaru-cli -- debug trace module.js --from SimplifySequence --until UnEsm --all
```

```bash title="Isolate one second-pass rule"
cargo run -p wakaru-cli -- debug trace module.js --from UnParameters2 --until UnParameters2 --all
```

</CodeGroup>

### Test helper equivalents

The `crates/core/tests/common/mod.rs` helpers mirror trace behavior for unit tests:

| Helper | Behavior |
| --- | --- |
| `trace_pipeline(source, RuleTraceOptions)` | Returns `Vec<RuleTraceEvent>` via `trace_rules` |
| `changed_rules(source)` | Rule names where rendered output changed |
| `render_pipeline_until(source, "RuleName")` | Full pipeline through named rule, then fixer + emit |
| `render_pipeline_between(source, "Start", "Stop")` | Only rules from `Start` through `Stop` inclusive |

`RulePipelineOptions::until`, `::between`, `::with_rewrite_level`, and `::with_dce_mode` provide the programmatic range API used internally by trace.

## Chrome profiling

Global flags on the main `wakaru` command (including when invoking `debug trace`) write a Chrome trace profile:

<CodeGroup>

```bash title="Decompile with phase spans"
wakaru input.js --profile trace.json
```

```bash title="Include per-rule timing spans"
wakaru input.js --profile trace.json --profile-rules
```

```bash title="Profile during debug trace"
wakaru --profile trace.json --profile-rules debug trace module.js
```

</CodeGroup>

<ParamField body="--profile" type="path">
  Write a Chrome trace file to the given path. Open it with `chrome://tracing`.
</ParamField>

<ParamField body="--profile-rules" type="boolean">
  Requires `--profile`. Sets the tracing filter to `DEBUG` so per-rule `debug_span!(name = descriptor.id)` events from `apply_rules_impl` are recorded. Without this flag, the filter is `INFO` and only coarser spans (for example `decompile`, `parse`, `rules`, `fixer`, `emit` during normal decompile) appear.
</ParamField>
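
The filter behavior can be modeled as a minimum-level cutoff. This is a toy sketch, not the tracing setup in `main.rs`: `Level`, `Span`, and `recorded` are defined only for this example, and the span names are taken from the description above.

```rust
/// Toy verbosity levels; declaration order makes `Debug < Info`.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum Level {
    Debug,
    Info,
}

struct Span {
    name: &'static str,
    level: Level,
}

/// Returns the span names that pass the filter implied by `--profile-rules`.
fn recorded(spans: &[Span], profile_rules: bool) -> Vec<&'static str> {
    let min = if profile_rules { Level::Debug } else { Level::Info };
    spans
        .iter()
        .filter(|s| s.level >= min)
        .map(|s| s.name)
        .collect()
}

fn main() {
    let spans = [
        Span { name: "decompile", level: Level::Info },
        Span { name: "rules", level: Level::Info },
        Span { name: "RemoveVoid", level: Level::Debug }, // per-rule span
    ];
    println!("--profile: {:?}", recorded(&spans, false));
    println!("--profile --profile-rules: {:?}", recorded(&spans, true));
}
```

Keeping per-rule spans at the more verbose level means the default profile stays small, and the detailed view is opt-in.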

<Tip>
For `debug trace` specifically, add `--profile-rules` to get per-rule spans. The trace path does not emit the same `info_span` hierarchy as `decompile`, so a profile without `--profile-rules` may be nearly empty.
</Tip>

### Trace vs profile

| Surface | Output | Best for |
| --- | --- | --- |
| `debug trace` | Per-rule unified diffs of rendered source | Finding which rule changed output |
| `--profile` | Chrome trace with phase timing | End-to-end latency, unpack parallelism |
| `--profile --profile-rules` | Chrome trace with per-rule spans | Slow-rule identification |

## Bundle unpack limitations

<Warning>
`trace_rules` rejects bundle-shaped input. The error is: `rule tracing currently supports single-file inputs only; use normal decompile or unpack for bundles`.
</Warning>

Bundle decompile runs a two-phase pipeline: Phase 1 collects `ModuleFactsMap` after `UnEsm`, then Phase 2 applies cross-module rules (`namespace_decomposition`, cross-module helper refs) with `module_facts` populated. `debug trace` runs the single-file pipeline with `module_facts: None` and cannot reproduce Phase 2 behavior.

### Workflow for bundle regressions

<Steps>

<Step title="Compare snapshot layers">
  For webpack4, diff raw (`webpack4_unpack_raw__*.snap`) vs final (`webpack4_unpack__*.snap`) snapshots. Raw unchanged + final changed → decompile pipeline regression. Raw changed → unpacker or bundler-coupled normalization.
</Step>

<Step title="Extract a single module">
  Unpack with `--raw` to get pre-pipeline module source, then trace that file in isolation.
</Step>

<Step title="Reduce to single-file">
  Copy the smallest module body that still reproduces the bug into a standalone fixture and trace it.
</Step>

</Steps>

```bash
# Extract raw modules, then trace one
wakaru bundle.js --unpack --raw -o raw/
wakaru debug trace raw/some-module.js
```

## Differences from normal decompile

`debug trace` is not a byte-for-byte preview of `wakaru input.js` output. Keep these gaps in mind when comparing trace diffs to CLI decompile results:

| Aspect | `debug trace` | Normal `wakaru input.js` |
| --- | --- | --- |
| Dead-code elimination | `DceMode::Off` (default; no `--dce` on trace) | `DceMode::TransformOnly` (or `Full` with `--dce`) |
| Source map renames | Not applied (CLI accepts `-m` but trace ignores it) | Applied when `--source-map` is set |
| Post-rule rendering | `print_trace_module` (fixer per snapshot, no final emit pass) | Full fixer + `print_js` / source map emit |
| Cross-module facts | Always `None` | Populated during unpack Phase 2 |

Trace snapshots use `apply_fixer` before each `print_js` call, so intra-pipeline rendering is consistent, but the final decompile path adds source-map passes and different DCE behavior.

## Symptom mapping

| Symptom | Likely cause | Trace action |
| --- | --- | --- |
| Many snapshots changed at once | Early rule cascade | Trace from start; inspect `SimplifySequence`, `FlipComparisons`, `RemoveVoid` |
| Rule not firing | AST already transformed by earlier pass | Trace with `--all` through the rule; check `before` snapshot |
| Wrong identifier renames | Missing `unresolved_mark` guard | Trace the rule range; compare binding contexts in diffs |
| Bundle-only regression | Cross-module or unpack layer | Do not use `debug trace` on the bundle; trace extracted raw module |
| Trace differs from `wakaru` output | DCE or source-map gap | Re-test with matching `--level`; compare against `render_pipeline_until` |

## Core API

For programmatic bisection outside the CLI:

```rust
use wakaru_core::{
    trace_rules, format_trace_events, DecompileOptions, RuleTraceOptions,
};

let events = trace_rules(
    source,
    DecompileOptions { filename: "module.js".into(), ..Default::default() },
    RuleTraceOptions {
        start_from: Some("RemoveVoid".into()),
        stop_after: Some("UnEsm".into()),
        only_changed: true,
    },
)?;
let report = format_trace_events(&events);
```

Exported types: `RuleTraceEvent` (`rule`, `changed`, `before`, `after`), `RuleTraceOptions` (`start_from`, `stop_after`, `only_changed`). `rule_names()` returns the canonical ordered name list.

## Related pages

<CardGroup>
<Card title="Rule pipeline reference" href="/rule-pipeline-reference">
  Ordered `RuleDescriptor` registry, `rule_names()` identifiers, and cross-rule dependencies.
</Card>
<Card title="Debug regressions" href="/debugging-regressions">
  Snapshot drift investigation, raw vs final layers, and symptom-to-cause mapping.
</Card>
<Card title="Develop transformation rules" href="/develop-rules">
  Test-first rule workflow and pipeline placement constraints.
</Card>
<Card title="CLI reference" href="/cli-reference">
  Full flag surface including `debug trace`, `--profile`, and `--profile-rules`.
</Card>
<Card title="Testing and snapshots" href="/testing-and-snapshots">
  `render_pipeline_until`, `trace_pipeline`, and insta snapshot workflow.
</Card>
<Card title="Cross-module facts" href="/cross-module-facts">
  Why bundle unpack cannot be traced as a single-file pipeline.
</Card>
</CardGroup>

---

## 15. CLI reference

> Complete wakaru command surface: global flags, unpack modes, subcommands (extract, debug trace/normalize), stdin/stdout behavior, formatter, diagnostics, and profiling options.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/15-cli-reference.md
- Generated: 2026-06-28T01:11:19.098Z

### Source Files

- `crates/cli/src/main.rs`
- `README.md`
- `crates/cli/src/json_output.rs`
- `crates/cli/src/discovery.rs`
- `crates/cli/src/formatter.rs`

---
title: "CLI reference"
description: "Complete wakaru command surface: global flags, unpack modes, subcommands (extract, debug trace/normalize), stdin/stdout behavior, formatter, diagnostics, and profiling options."
---

The `wakaru` binary (`wakaru-cli`, distributed as `@wakaru/cli`) is a single clap-driven entry point with two top-level shapes: a default path that decompiles one file or unpacks bundles, and optional subcommands (`extract`, hidden `debug`) that do not accept positional `inputs`. Root flags such as `--json`, `--formatter`, and `--profile` apply only on the default path; `--force` is global.

## Command synopsis

```bash
wakaru [OPTIONS] [INPUTS...]           # decompile or unpack (default)
wakaru extract <MAP> -o <DIR>          # extract sourcesContent from a .map
wakaru debug trace <INPUT> [OPTIONS]   # per-rule pipeline trace (hidden)
wakaru debug normalize [INPUT] [OPTS]  # structural canonicalization (hidden)
```

Positional `inputs` and subcommands are mutually exclusive (`args_conflicts_with_subcommands`). Run `wakaru --help`, `wakaru extract --help`, or `wakaru debug trace --help` for generated flag lists.

## Default mode: decompile

Without `--unpack`, Wakaru accepts exactly one JavaScript/TypeScript file (or stdin) and runs the single-file decompile pipeline.

```bash
wakaru input.js                        # decompiled code to stdout
wakaru input.js -o output.js           # write to file
cat input.js | wakaru > output.js      # stdin when piped or `-`
```

Constraints:

- Multiple positional inputs require `--unpack`.
- A directory argument without `--unpack` errors with *cannot decompile a directory*.
- `--emit-source-map` requires `-o`/`--output`.

## Default mode: unpack

`--unpack` (short `-u`) splits bundles into module files. It **requires** `-o`/`--output` as an output directory.

| Flag form | Behavior |
|-----------|----------|
| `--unpack` or `--unpack=auto` | Structural bundle detection plus heuristic fallback for scope-hoisted output (default) |
| `--unpack=strict` | Structural detection only; no heuristic scope-hoist splitting |

```bash
wakaru bundle.js --unpack -o out/
wakaru bundle.js --unpack --raw -o out/       # raw split, skip decompiler rules
wakaru bundle.js --unpack=strict -o out/
wakaru entry.js chunk.js --unpack -o out/     # multiple explicit files
wakaru dist/ --unpack -o out/                 # recursive directory scan
```

### `--raw`

Requires `--unpack`. Calls `unpack_raw` / `unpack_files_raw` instead of the full decompile pipeline—modules are split but not rewritten for readability. `--formatter` still runs on output when enabled.

### Directory scanning

When an input path is a directory, the CLI recursively collects `.js`, `.mjs`, and `.cjs` files, skips hidden entries and `node_modules`, and keeps only files that pass core bundle/chunk detection (`is_detected_unpack_input`). Non-matching files are skipped—not copied or decompiled. If a directory yields zero detected bundles, Wakaru exits with *no bundle or chunk files detected in directory input*.
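The path-level part of this filter can be sketched as a pure function. This is an assumption-laden illustration, not the CLI's code: `is_scan_candidate` is a hypothetical name, "hidden" is taken to mean a name starting with `.`, and the subsequent content check (`is_detected_unpack_input`) is not modeled here.

```rust
use std::path::{Component, Path};

/// Hypothetical sketch: decide from the path alone whether a file would be
/// considered during a directory scan. Bundle detection happens afterwards.
fn is_scan_candidate(relative: &Path) -> bool {
    // Skip anything under a hidden entry or a `node_modules` directory.
    // Assumption: "hidden" means the name starts with a dot.
    let skipped = relative.components().any(|c| match c {
        Component::Normal(name) => {
            let name = name.to_string_lossy();
            name.starts_with('.') || name == "node_modules"
        }
        _ => false,
    });
    // Only `.js`, `.mjs`, and `.cjs` files are collected.
    let js_ext = matches!(
        relative.extension().and_then(|e| e.to_str()),
        Some("js" | "mjs" | "cjs")
    );
    !skipped && js_ext
}

fn main() {
    for p in ["dist/app.js", "node_modules/x/index.js", ".cache/a.js", "dist/readme.md"] {
        println!("{} -> {}", p, is_scan_candidate(Path::new(p)));
    }
}
```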

Explicit file inputs (non-directory) are not filtered this way: when no structural format matches, they keep the normal unpack fallback behavior instead of being skipped.

On a TTY, unpack summaries on stderr include scan stats (`scanned`, `detected`, `skipped`), detected format names, module count, and elapsed time.

### Output path safety

Unpack resolves each module filename under the output directory, rejecting `..` traversal, absolute paths, drive prefixes, and symlink escapes. Duplicate module names receive deduplicated suffixes (e.g. `index_2.js`).
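A component-by-component check is one way to implement this kind of containment. The sketch below is illustrative only: `safe_join` is a hypothetical function, not Wakaru's `safe_relative_module_path`, and it does not cover symlink escapes, which need filesystem access to detect.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical sketch: join `module_name` under `out_dir`, refusing any
/// component that could escape the output directory.
fn safe_join(out_dir: &Path, module_name: &str) -> Option<PathBuf> {
    let mut resolved = out_dir.to_path_buf();
    let mut pushed = false;
    for component in Path::new(module_name).components() {
        match component {
            // Ordinary path segments are appended as-is.
            Component::Normal(part) => {
                resolved.push(part);
                pushed = true;
            }
            // `.` segments are harmless and ignored.
            Component::CurDir => {}
            // `..`, a leading `/`, or a drive prefix would leave `out_dir`.
            Component::ParentDir | Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    // An empty name resolves to nothing useful, so reject it too.
    if pushed {
        Some(resolved)
    } else {
        None
    }
}

fn main() {
    println!("{:?}", safe_join(Path::new("out"), "src/index.js"));
    println!("{:?}", safe_join(Path::new("out"), "../evil.js"));
}
```

Rejecting a name outright, rather than normalizing `..` away, keeps the rule easy to audit: any accepted path is built only from plain segments appended to the output directory.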

## Global flags (default path)

| Flag | Short | Default | Description |
|------|-------|---------|-------------|
| `--output` | `-o` | — | Output file (decompile) or directory (`--unpack`) |
| `--unpack` | `-u` | off | Unpack mode: `auto` (default) or `strict` |
| `--raw` | — | off | Raw unpack only; requires `--unpack` |
| `--source-map` | `-m` | — | Input `.map` for identifier recovery and import dedup. Alias: `--sourcemap`. Single input only. |
| `--level` | — | `standard` | Rewrite level: `minimal`, `standard`, `aggressive` |
| `--dce` | — | off | Full dead-code elimination (reachability sweep). Default without flag: transform-induced DCE only (`TransformOnly`). |
| `--diagnostics` | — | off | Post-transform checks; results appear as warnings |
| `--formatter` | — | off | Final oxc formatting pass |
| `--emit-source-map` | — | off | Write `output.js.map` alongside each output file |
| `--json` | — | off | Machine-readable JSON on stdout |
| `--profile` | — | — | Chrome trace file (`chrome://tracing`) |
| `--profile-rules` | — | off | Per-rule DEBUG spans in profile; requires `--profile` |
| `--force` | — | off | Overwrite existing outputs (global; all commands) |

## Stdin and stdout

| Scenario | Input | Primary output | Secondary output |
|----------|-------|----------------|------------------|
| Decompile, no `-o` | file or stdin | stdout (code) | warnings → stderr |
| Decompile, with `-o` | file or stdin | file | warnings → stderr; `.map` if `--emit-source-map` |
| Unpack | file(s), dir, or stdin | files under `-o` | summary → stderr (TTY); warnings → stderr |
| `--json` decompile, no `-o` | any | stdout (JSON with `code`) | — |
| `--json` decompile, with `-o` | any | stdout (JSON without `code`); file written | — |
| `--json` unpack | any | stdout (JSON metadata) | files still written to `-o` |

Stdin is read when the input is `-`, or when no input is given and stdin is not a terminal. Filename is recorded as `<stdin>`.
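The stdin rule above reduces to a small decision function. The following is a sketch under stated assumptions, not the CLI's actual code: `should_read_stdin` is a hypothetical name, and terminal detection uses the standard library's `IsTerminal` trait (Rust 1.70 or later).

```rust
use std::io::{self, IsTerminal};

/// Hypothetical sketch of the documented rule: read stdin when the input is
/// `-`, or when no input is given and stdin is not a terminal.
fn should_read_stdin(input: Option<&str>, stdin_is_terminal: bool) -> bool {
    match input {
        Some("-") => true,
        Some(_) => false,
        None => !stdin_is_terminal,
    }
}

fn main() {
    // First positional argument, if any.
    let arg = std::env::args().nth(1);
    let from_stdin = should_read_stdin(arg.as_deref(), io::stdin().is_terminal());
    println!("read from stdin: {}", from_stdin);
}
```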

With `--json`, human-readable warning lines are suppressed; warnings are embedded in the JSON object. Terminal styling (`NO_COLOR` respected) is disabled for JSON runs.

## Formatter

`--formatter` selects the oxc formatter (`CodeFormatter::Oxc`) for a final pass after decompilation. Off by default.

If formatting fails, Wakaru prints a stderr warning and preserves the unformatted code:

```
warning: oxc formatter failed for <file>, preserving output: <message>
```

## Diagnostics

`--diagnostics` enables post-transform checks in the core driver: temporal-dead-zone (use-before-declaration) detection, recovered input/output parse errors, and duplicate lexical declarations. Findings are returned as `UnpackWarning` entries. Error-severity diagnostics cause a non-zero exit after output is written.

## JSON output

`--json` prints a single JSON object to stdout.

**Decompile** (`JsonDecompileOutput`):

| Field | When present |
|-------|--------------|
| `code` | Only when `-o` is omitted (stdout mode) |
| `source_map` | When `--emit-source-map` produced a map |
| `warnings` | Always (may be empty) |
| `elapsed_ms` | Always |

**Unpack** (`JsonUnpackOutput`):

| Field | Description |
|-------|-------------|
| `detected_formats` | Bundle format strings detected |
| `modules` | `{ filename }` per module (code is written to disk, not inlined) |
| `warnings` | Per-module warnings |
| `total` | Module files written |
| `failed` | Count of modules with error-severity warnings |
| `elapsed_ms` | Wall-clock time |

Each warning object: `filename`, `kind`, `is_error`, `message`.
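As a rough illustration of how `failed` relates to the warning list, the sketch below counts distinct filenames that carry at least one error-severity warning. The struct mirrors the four documented warning fields, but the field types and the "distinct filenames" interpretation are assumptions, and `failed_modules` is a hypothetical helper.

```rust
use std::collections::HashSet;

/// Mirrors the documented warning fields; the Rust types are assumed.
#[allow(dead_code)]
struct JsonWarning {
    filename: String,
    kind: String,
    is_error: bool,
    message: String,
}

/// Hypothetical helper: number of modules with at least one error warning.
fn failed_modules(warnings: &[JsonWarning]) -> usize {
    warnings
        .iter()
        .filter(|w| w.is_error)
        .map(|w| w.filename.as_str())
        .collect::<HashSet<_>>()
        .len()
}

/// Small constructor to keep the example short.
fn warning(filename: &str, kind: &str, is_error: bool) -> JsonWarning {
    JsonWarning {
        filename: filename.to_string(),
        kind: kind.to_string(),
        is_error,
        message: String::new(),
    }
}

fn main() {
    let warnings = vec![
        warning("a.js", "decompile_failed", true),
        warning("a.js", "output_parse_failed", true),
        warning("b.js", "tdz_violation", false),
    ];
    println!("failed modules: {}", failed_modules(&warnings));
}
```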

See [JSON output and CI integration](/json-output-and-ci) for piping patterns and warning kinds.

## Profiling

```bash
wakaru input.js --profile trace.json
wakaru input.js --profile trace.json --profile-rules
```

`--profile` writes a Chrome trace event file. `--profile-rules` lowers the tracing filter to DEBUG so individual rule spans appear. The profile guard flushes on normal exit.

Profiling initializes before command dispatch, but `--profile` is a root-level flag—use it on the default decompile/unpack path, not on subcommands.

## Subcommand: `extract`

```bash
wakaru extract bundle.js.map -o src/
```

Reads a v3 source map, writes every `sourcesContent` entry under `-o`, creating parent directories as needed. Respects `--force` for output directory overwrite protection. Prints an extraction count to stderr on a TTY. The legacy `--extract` flag is no longer supported.

## Subcommand: `debug` (hidden)

Internal commands for regression matrices and pipeline bisection. Omitted from default `--help` output.

### `debug trace`

```bash
wakaru debug trace input.js
wakaru debug trace input.js -o trace.txt --from UnEsm --until SmartInline --all
wakaru debug trace input.js -m input.js.map --level aggressive
```

| Flag | Default | Description |
|------|---------|-------------|
| `-o` / `--output` | stdout | Trace output file |
| `-m` / `--source-map` | — | Input source map (accepted, but trace output does not apply source-map renames) |
| `--from` | pipeline start | First rule name (must match `rule_names()`) |
| `--until` | pipeline end | Last rule name |
| `--all` | off | Include unchanged rules (default: changed only) |
| `--level` | `standard` | Rewrite level |

Produces a git-style unified diff log: initial source, then per-rule deltas. **Bundle inputs are rejected**—tracing supports single-file inputs only. Use normal decompile or unpack for bundles.

### `debug normalize`

```bash
wakaru debug normalize input.js
wakaru debug normalize --rename --format < input.js
```

| Flag | Description |
|------|-------------|
| `input` | Optional file; pass `-` or omit it to read piped stdin |
| `--rename` | Alpha-rename locals to `$0`, `$1`, … for structural comparison |
| `--format` | Run oxc formatter after canonicalization |

Output always goes to stdout. Used by reproduction matrices to compare mangled and original code structurally.

## Overwrite protection

| Target | Without `--force` | With `--force` |
|--------|-------------------|----------------|
| Output file (`-o` decompile) | Error if file exists | Overwrite |
| Output directory (`--unpack`) | Error if directory exists and is non-empty | Write into directory |
| Non-empty directory with `--force` | — | Write-if-changed: files whose content is already identical are not rewritten |

Empty or newly created output directories write files directly.
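A write-if-changed step can be sketched with the standard library alone. This is one plausible reading of the behavior in the table, not Wakaru's implementation; `write_if_changed` is a hypothetical helper that compares full file contents before writing.

```rust
use std::{fs, io, path::Path};

/// Hypothetical helper: write `contents` to `path` unless the file already
/// holds exactly those bytes. Returns Ok(true) when a write happened.
fn write_if_changed(path: &Path, contents: &[u8]) -> io::Result<bool> {
    // A missing or unreadable file is treated as "changed" and written.
    if let Ok(existing) = fs::read(path) {
        if existing == contents {
            return Ok(false);
        }
    }
    fs::write(path, contents)?;
    Ok(true)
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("wakaru_write_if_changed_demo.js");
    let first = write_if_changed(&path, b"export {};\n")?;
    let second = write_if_changed(&path, b"export {};\n")?;
    println!("first write: {}, second write: {}", first, second);
    fs::remove_file(&path)
}
```

Skipping identical files keeps modification times stable, which can matter for tools that watch the output directory.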

## Exit codes

| Outcome | Code |
|---------|------|
| Success | 0 |
| CLI validation, I/O, or processing error (`bail!`) | 1 |
| Warnings with `is_error: true` after successful write | 1 |
| Internal panic | Non-zero; panic hook prints a GitHub issue URL with version and OS |

## Examples

```bash
# Minimal-change audit
wakaru audit.js --level minimal -o audit.pretty.js

# Aggressive readability with full DCE
wakaru min.js --level aggressive --dce --formatter -o readable.js

# CI-friendly unpack
wakaru dist/ --unpack --json -o out/ 2>log.txt

# Source-map-driven decompile
wakaru bundle.min.js -m bundle.min.js.map --emit-source-map -o restored.js

# Bisect a regression between two rules
wakaru debug trace broken.js --from UnBracketNotation --until SmartInline -o trace.txt
```

<Callout type="info">
`--source-map` applies to a single input file. Multi-file unpack with `-m` is rejected at the CLI layer.
</Callout>

## Related pages

<CardGroup>
  <Card title="Quickstart" href="/quickstart">
    First successful decompile and unpack runs with expected stdout and stderr signals.
  </Card>
  <Card title="Unpack bundles" href="/unpack-bundles">
    Operational guide for auto vs strict modes, raw extraction, multi-file inputs, and directory scanning.
  </Card>
  <Card title="Use source maps" href="/use-source-maps">
    Identifier recovery, import dedup, emit maps, and the extract subcommand.
  </Card>
  <Card title="JSON output and CI" href="/json-output-and-ci">
    Machine-readable schemas, warning kinds, and automation patterns.
  </Card>
  <Card title="Trace the rule pipeline" href="/trace-rule-pipeline">
    Bisect regressions with debug trace, profile export, and rule name ranges.
  </Card>
  <Card title="Rewrite levels" href="/rewrite-levels-and-assumptions">
    minimal / standard / aggressive tradeoffs and --dce semantics.
  </Card>
  <Card title="Troubleshooting" href="/troubleshooting">
    Overwrite errors, directory skip behavior, formatter failures, and warning codes.
  </Card>
</CardGroup>

---

## 16. Core API reference

> Exported wakaru-core functions (decompile, unpack, unpack_files, unpack_raw, trace_rules), DecompileOptions fields, DceMode, UnpackOutput warnings, and RewriteLevel defaults.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/16-core-api-reference.md
- Generated: 2026-06-28T01:10:21.819Z

### Source Files

- `crates/core/src/lib.rs`
- `crates/core/src/driver/types.rs`
- `crates/core/src/driver.rs`
- `crates/core/src/rules/mod.rs`

---
title: Core API reference
description: Exported wakaru-core functions (decompile, unpack, unpack_files, unpack_raw, trace_rules), DecompileOptions fields, DceMode, UnpackOutput warnings, and RewriteLevel defaults.
---

The `wakaru-core` crate (`wakaru_core` in Rust) is the programmatic entry point for Wakaru. It exposes single-file decompilation, bundle unpacking, raw extraction, and rule-pipeline tracing. All primary functions return `anyhow::Result` and share configuration through `DecompileOptions`.

Add the dependency from the workspace root:

```toml
[dependencies]
wakaru-core = { path = "crates/core" }
```

## Entry points

Wakaru exposes six driver functions for the main workflows. Each takes JavaScript source (or structured inputs) plus `DecompileOptions`, and returns typed output with optional warnings.

| Function | Signature | Purpose |
| --- | --- | --- |
| `decompile` | `(&str, DecompileOptions) -> Result<DecompileOutput>` | Parse one file, run the full rule pipeline, emit code |
| `unpack` | `(&str, DecompileOptions) -> Result<UnpackOutput>` | Detect and unpack a bundle; decompile extracted modules |
| `unpack_files` | `(Vec<UnpackInput>, DecompileOptions) -> Result<UnpackOutput>` | Multi-file unpack (entry + chunks, or directory scan results) |
| `unpack_raw` | `(&str, &DecompileOptions) -> Result<UnpackOutput>` | Extract modules without the decompile rule pipeline |
| `unpack_files_raw` | `(Vec<UnpackInput>, &DecompileOptions) -> Result<UnpackOutput>` | Multi-file raw extraction |
| `trace_rules` | `(&str, DecompileOptions, RuleTraceOptions) -> Result<Vec<RuleTraceEvent>>` | Per-rule before/after snapshots for single-file debugging |

### `decompile`

Runs the complete single-file pipeline: parse (with recovery), resolver marks, staged rule application, optional source-map rename passes, fixer, and emit.

```rust
use wakaru_core::{decompile, DecompileOptions};

let output = decompile(
    source,
    DecompileOptions {
        filename: "input.js".into(),
        ..Default::default()
    },
)?;
println!("{}", output.code);
```

When `DecompileOptions::sourcemap` is set, Wakaru runs `ImportDedup` and source-map-driven identifier rename after the main rule pipeline. When `emit_source_map` is set, `DecompileOutput::source_map` contains v3 source map JSON mapping decompiled output back to the input.

### `unpack`

Detects a structural bundle format, optionally applies scope-hoist splitting, then decompiles every extracted module in parallel (two-phase unpack with a cross-module facts barrier). Behavior when no bundle is detected:

- **`heuristic_split: false`** — falls back to `decompile`, returning a single module named `module.js`.
- **`heuristic_split: true`** — attempts scope-hoisted splitting; on success runs full unpack, otherwise falls back to `decompile`.

`UnpackOutput::detected_formats` records which `BundleFormat` variants were recognized (for example `webpack5`, `esbuild`, `scope-hoisted`).

### `unpack_files`

Accepts multiple `UnpackInput { filename, source }` values. A single input delegates to `unpack`. Multiple inputs merge detected modules, run cross-chunk numeric rewrite planning, and execute the parallel Phase 2 decompile pipeline.

Returns `Err` when `inputs` is empty or no modules could be extracted.

### `unpack_raw` and `unpack_files_raw`

Skip the decompile rule pipeline and cross-module fact collection. Output is detector-specific extraction plus bundler-coupled normalization only.

- No `source_maps` are emitted (always empty).
- Heuristic scope-hoisted fallback may run narrow runnable normalization; failures produce `raw_normalization_failed` warnings when `diagnostics` is enabled.
- When no bundle is detected, `unpack_raw` returns the input unchanged as a single `module.js` entry.

Use raw extraction when you need structural module boundaries without readability rewrites. Use `unpack` when you want decompiled, idiomatic source.

### `trace_rules`

Records per-rule rendered output for bisecting single-file regressions. Rejects bundle inputs with an explicit error — use `decompile` or `unpack` for bundles.

Pair with `format_trace_events` to render a git-style unified diff log:

```rust
use wakaru_core::{
    trace_rules, format_trace_events, DecompileOptions, RuleTraceOptions,
};

let events = trace_rules(
    source,
    DecompileOptions { filename: "input.js".into(), ..Default::default() },
    RuleTraceOptions {
        start_from: Some("UnEsm".into()),
        stop_after: None,
        only_changed: true,
    },
)?;
println!("{}", format_trace_events(&events));
```

<ParamField body="start_from" type="Option<String>">
First rule name to trace. Must appear in `rule_names()`. When omitted, tracing starts at the beginning of the pipeline.
</ParamField>

<ParamField body="stop_after" type="Option<String>">
Last rule name to trace. When omitted, tracing runs through the end of the pipeline.
</ParamField>

<ParamField body="only_changed" type="bool" default="true">
When `true`, only emit events whose rendered output changed. Set `false` to include unchanged rules.
</ParamField>

## `DecompileOptions`

Shared configuration for all driver entry points.

<ParamField body="filename" type="String" default='""'>
Logical filename for parse diagnostics, source-map paths, and provenance-based filename recovery during unpack.
</ParamField>

<ParamField body="sourcemap" type="Option<Vec<u8>>" default="None">
Raw bytes of a v3 source map. Enables import deduplication and source-map-driven identifier rename. Multi-file unpack with source maps requires reparsing in Phase 2.
</ParamField>

<ParamField body="dce_mode" type="DceMode" default="Off">
Controls late dead-code elimination (`DeadImports`, `DeadDecls`) and dead helper-module elimination during unpack.
</ParamField>

<ParamField body="level" type="RewriteLevel" default="Standard">
How aggressively rules recover likely original source patterns. Gates filename recovery, dead-module elimination, nested scope splitting, and several rule-specific heuristics.
</ParamField>

<ParamField body="heuristic_split" type="bool" default="false">
When `true`, attempt heuristic splitting of top-level scope-hoisted bundles when no structural bundle is detected. At `Aggressive` level, also retry scope-hoist splitting inside modules extracted by a structural detector.
</ParamField>

<ParamField body="diagnostics" type="bool" default="false">
Run post-transform checks: input parse recovery, TDZ violations, duplicate declarations, and output parse verification. Results appear in `warnings`; no warnings are collected when `false`.
</ParamField>

<ParamField body="emit_source_map" type="bool" default="false">
Generate v3 source maps mapping decompiled output to the input. Populates `DecompileOutput::source_map` or `UnpackOutput::source_maps`.
</ParamField>

### API defaults vs CLI and WASM

The Rust API defaults are conservative. Edge integrations often override fields:

| Field | `DecompileOptions::default()` | CLI | WASM `decompile` | WASM `unpack` |
| --- | --- | --- | --- | --- |
| `dce_mode` | `Off` | `TransformOnly` (`Full` with `--dce`) | `TransformOnly` | `Off` (via `..Default::default()`) |
| `level` | `Standard` | `standard` | parsed from arg | parsed from arg |
| `heuristic_split` | `false` | `true` for `--unpack` / `--unpack=auto`; `false` for `--unpack=strict` | `false` | `true` |
| `diagnostics` | `false` | `--diagnostics` flag | arg, default `false` | arg, default `false` |

When building integrations, set fields explicitly rather than relying on `Default` if you want CLI-equivalent behavior.

## `DceMode`

Controls dead-code elimination in the rule pipeline and unpack cleanup.

| Variant | Behavior |
| --- | --- |
| `Off` | No dead-code cleanup |
| `TransformOnly` | Remove only transform-induced dead code; pre-existing dead input code is preserved |
| `Full` | Full reachability sweep — remove all unreachable code |

`DceMode::is_enabled()` returns `true` for `TransformOnly` and `Full`. During unpack, dead helper-module elimination requires both `dce_mode.is_enabled()` and `level` at least `Standard`.
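The gating rule can be written out with stand-in types. The enums below are local mirrors defined only for this sketch (the real ones live in `wakaru_core`), and `eliminates_dead_helper_modules` is a hypothetical name for the condition described above.

```rust
// Local stand-ins for the documented enums, not the real wakaru_core types.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum RewriteLevel {
    Minimal,
    Standard,
    Aggressive,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DceMode {
    Off,
    TransformOnly,
    Full,
}

impl DceMode {
    /// True for `TransformOnly` and `Full`, as documented.
    fn is_enabled(self) -> bool {
        !matches!(self, DceMode::Off)
    }
}

/// Hypothetical helper: dead helper-module elimination needs DCE enabled
/// and a rewrite level of at least `Standard`.
fn eliminates_dead_helper_modules(dce: DceMode, level: RewriteLevel) -> bool {
    dce.is_enabled() && level >= RewriteLevel::Standard
}

fn main() {
    for dce in [DceMode::Off, DceMode::TransformOnly, DceMode::Full] {
        for level in [RewriteLevel::Minimal, RewriteLevel::Standard, RewriteLevel::Aggressive] {
            println!("{:?} + {:?} -> {}", dce, level, eliminates_dead_helper_modules(dce, level));
        }
    }
}
```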

## `RewriteLevel`

```rust
pub enum RewriteLevel {
    Minimal,
    Standard,  // #[default]
    Aggressive,
}
```

`RewriteLevel` implements `PartialOrd`: higher levels enable more speculative recovery. Parse from strings with `RewriteLevel::from_str_or_default(level)`, which accepts `"minimal"`, `"standard"`, and `"aggressive"`; any other value falls back to `Standard`.

Each level maps to `RewriteAssumptions` via `RewriteAssumptions::from_level`:

| Level | `no_document_all` | `pure_getters` | `stable_builtins` |
| --- | --- | --- | --- |
| `Minimal` | `false` | `false` | `false` |
| `Standard` | `true` | `false` | `false` |
| `Aggressive` | `true` | `true` | `true` |

`RewritePolicy::from_level` bundles the level with its assumptions for rules that need both.
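The parsing fallback and the assumptions table can be expressed together as a short sketch. The types are local stand-ins, the free functions `level_from_str_or_default` and `assumptions_from_level` are hypothetical equivalents of the documented methods, and exact lowercase matching is an assumption.

```rust
// Local stand-ins that mirror the documented enum and table.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq, PartialOrd, Ord)]
enum RewriteLevel {
    Minimal,
    #[default]
    Standard,
    Aggressive,
}

#[derive(Debug, PartialEq, Eq)]
struct RewriteAssumptions {
    no_document_all: bool,
    pure_getters: bool,
    stable_builtins: bool,
}

/// Unknown strings fall back to the default level (`Standard`).
/// Assumption: matching is exact and lowercase.
fn level_from_str_or_default(s: &str) -> RewriteLevel {
    match s {
        "minimal" => RewriteLevel::Minimal,
        "standard" => RewriteLevel::Standard,
        "aggressive" => RewriteLevel::Aggressive,
        _ => RewriteLevel::default(),
    }
}

/// Encodes the table: each assumption switches on at a threshold level.
fn assumptions_from_level(level: RewriteLevel) -> RewriteAssumptions {
    RewriteAssumptions {
        no_document_all: level >= RewriteLevel::Standard,
        pure_getters: level >= RewriteLevel::Aggressive,
        stable_builtins: level >= RewriteLevel::Aggressive,
    }
}

fn main() {
    for name in ["minimal", "standard", "aggressive", "unknown"] {
        let level = level_from_str_or_default(name);
        println!("{} -> {:?} -> {:?}", name, level, assumptions_from_level(level));
    }
}
```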

## Return types

### `DecompileOutput`

<ResponseField name="code" type="String">
Decompiled JavaScript source.
</ResponseField>

<ResponseField name="warnings" type="Vec<UnpackWarning>">
Non-fatal warnings from diagnostics and parse recovery.
</ResponseField>

<ResponseField name="source_map" type="Option<String>">
v3 source map JSON when `emit_source_map` is set.
</ResponseField>

`DecompileOutput::has_errors()` returns `true` when any warning has `kind.is_error()`.

### `UnpackOutput`

<ResponseField name="modules" type="Vec<(String, String)>">
Extracted modules as `(filename, code)` pairs.
</ResponseField>

<ResponseField name="warnings" type="Vec<UnpackWarning>">
Per-module warnings accumulated across extraction, fact collection, and decompile.
</ResponseField>

<ResponseField name="detected_formats" type="Vec<BundleFormat>">
Bundle formats recognized during detection (`webpack4`, `webpack5`, `browserify`, `systemjs`, `esbuild`, `amd`, `scope-hoisted`).
</ResponseField>

<ResponseField name="source_maps" type="Vec<(String, String)>">
Per-module source map JSON when `emit_source_map` is set.
</ResponseField>

`UnpackOutput::has_errors()` mirrors `DecompileOutput::has_errors()`.

### `UnpackInput`

```rust
pub struct UnpackInput {
    pub filename: String,
    pub source: String,
}
```

Used by `unpack_files` and `unpack_files_raw` for multi-file operations.

## Warnings

Warnings are non-fatal. Operations return `Ok` with warnings attached unless a hard error (parse failure, I/O, etc.) occurs.

```rust
pub struct UnpackWarning {
    pub filename: String,
    pub kind: UnpackWarningKind,
    pub message: String,
}
```

### Warning kinds

| `UnpackWarningKind` | `as_str()` | `is_error()` | Typical cause |
| --- | --- | --- | --- |
| `RawNormalizationFailed` | `raw_normalization_failed` | yes | Raw heuristic normalization failed; unparsed code preserved |
| `FactCollectionParseFailed` | `fact_collection_parse_failed` | yes | Module parse failed during Phase 1 fact collection |
| `DecompileFailed` | `decompile_failed` | yes | Per-module decompile error during unpack |
| `InputParseRecovered` | `input_parse_recovered` | no (diagnostic) | Parser recovered from input syntax errors |
| `TdzViolation` | `tdz_violation` | no (diagnostic) | Lexical use-before-declaration detected |
| `DuplicateDeclaration` | `duplicate_declaration` | yes | Duplicate lexical binding in output |
| `ImportCycle` | `import_cycle` | no (diagnostic) | Circular import dependency detected |
| `OutputParseRecovered` | `output_parse_recovered` | yes | Output failed strict parse but recovered |
| `OutputParseFailed` | `output_parse_failed` | yes | Output is not parseable JavaScript |

`UnpackWarningKind::is_diagnostic()` is `true` for `InputParseRecovered`, `TdzViolation`, and `ImportCycle`. Diagnostic warnings signal potential transform issues but do not indicate data loss. `is_error()` returns `!is_diagnostic()`.
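The relationship between `is_diagnostic()` and `is_error()` can be modeled with a local enum. `WarningKind` below is a stand-in written for this sketch, not the real `UnpackWarningKind`, and `has_errors` is a hypothetical free function that mirrors the documented output methods.

```rust
// Local stand-in for the documented warning kinds.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum WarningKind {
    RawNormalizationFailed,
    FactCollectionParseFailed,
    DecompileFailed,
    InputParseRecovered,
    TdzViolation,
    DuplicateDeclaration,
    ImportCycle,
    OutputParseRecovered,
    OutputParseFailed,
}

impl WarningKind {
    /// Diagnostic kinds flag potential issues without indicating data loss.
    fn is_diagnostic(self) -> bool {
        matches!(
            self,
            WarningKind::InputParseRecovered | WarningKind::TdzViolation | WarningKind::ImportCycle
        )
    }

    /// Every non-diagnostic kind counts as an error.
    fn is_error(self) -> bool {
        !self.is_diagnostic()
    }
}

/// Hypothetical equivalent of `has_errors()` over a list of kinds.
fn has_errors(kinds: &[WarningKind]) -> bool {
    kinds.iter().any(|k| k.is_error())
}

fn main() {
    let only_diagnostics = [WarningKind::TdzViolation, WarningKind::ImportCycle];
    let with_error = [WarningKind::TdzViolation, WarningKind::OutputParseFailed];
    println!("{} {}", has_errors(&only_diagnostics), has_errors(&with_error));
}
```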

For machine-readable output, the CLI maps warnings to JSON with `kind`, `is_error`, `filename`, and `message` fields via `JsonWarning::from_core`.

## Additional exports

Beyond the driver entry points, `wakaru_core` re-exports utilities useful for custom integrations:

- **Rules** — `apply_rules`, `rule_names`, `rule_descriptors`, `RulePipelineOptions`, `RuleStage`, `RewritePolicy`
- **Facts** — `ModuleFacts`, `ModuleFactsMap`, `collect_module_facts`, import/export fact types
- **Source maps** — `parse_sourcemap`, `extract_source_entries`, `resolve_source_path`
- **Diagnostics** — `check_tdz`, `TdzViolation`
- **Unpacker** — `BundleFormat`, `unpack_webpack4`, `scope_hoist`, `unpack_webpack4_raw`
- **I/O helpers** — `normalize`, `is_detected_unpack_input`, `deduplicate_path`, `safe_relative_module_path`

Low-level rule development typically imports individual rule types from `wakaru_core::rules`.

## Example: unpack with diagnostics

```rust
use wakaru_core::{
    unpack, DecompileOptions, DceMode, RewriteLevel,
};

let output = unpack(
    bundle_source,
    DecompileOptions {
        filename: "dist/bundle.js".into(),
        level: RewriteLevel::Standard,
        dce_mode: DceMode::TransformOnly,
        heuristic_split: true,
        diagnostics: true,
        ..Default::default()
    },
)?;

for (filename, code) in &output.modules {
    std::fs::write(format!("out/{filename}"), code)?;
}

if output.has_errors() {
    for w in &output.warnings {
        if w.kind.is_error() {
            eprintln!("[{}] {}: {}", w.kind.as_str(), w.filename, w.message);
        }
    }
}
```

## Related pages

<CardGroup>
<Card title="Decompile pipeline" href="/decompile-pipeline">
Single-file parse → rules → fixer → emit flow and `unresolved_mark` scope gating.
</Card>
<Card title="Rewrite levels and assumptions" href="/rewrite-levels-and-assumptions">
How `RewriteLevel` and `DceMode` affect rule behavior and CLI flags.
</Card>
<Card title="Bundle formats and unpacking" href="/bundle-formats-and-unpacking">
Detection order, `BundleFormat` variants, and raw vs full unpack semantics.
</Card>
<Card title="Trace the rule pipeline" href="/trace-rule-pipeline">
Using `trace_rules` and `format_trace_events` for regression bisection.
</Card>
<Card title="JSON output and CI" href="/json-output-and-ci">
Machine-readable warning schemas with `kind` and `is_error` for automation.
</Card>
<Card title="WASM API reference" href="/wasm-api-reference">
Browser bindings that wrap the same core types with JSON result shapes.
</Card>
</CardGroup>

---

## 17. WASM API reference

> wasm_bindgen exports decompile, unpack, and ruleNames; parameter types, WakaruDecompileResult and WakaruUnpackResult JSON shapes, and warning kind strings.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/17-wasm-api-reference.md
- Generated: 2026-06-28T01:12:25.800Z

### Source Files

- `crates/wasm/src/lib.rs`
- `crates/core/src/driver/types.rs`
- `crates/formatter/src/lib.rs`

---
title: "WASM API reference"
description: "wasm_bindgen exports decompile, unpack, and ruleNames; parameter types, WakaruDecompileResult and WakaruUnpackResult JSON shapes, and warning kind strings."
---

The `wakaru-wasm` crate (`crates/wasm`) compiles to a `cdylib` and exposes three JavaScript-callable functions — `decompile`, `unpack`, and `ruleNames` — via `#[wasm_bindgen]`. Successful calls return plain objects serialized with `serde_wasm_bindgen`; fatal failures throw a string error. A `#[wasm_bindgen(start)]` initializer installs `console_error_panic_hook` so Rust panics surface in the browser console. The playground builds the artifact with `wasm-pack build --target web` into `crates/wasm/pkg`.

## Initialization

`wasm-pack` emits a default `init` function that must run before any API call. It accepts an optional `RequestInfo`, `URL`, `Response`, `BufferSource`, or `WebAssembly.Module` and returns a promise (typed `Promise<InitOutput>` in the generated `.d.ts`) that resolves once the module is instantiated.

```typescript
import init, { decompile, unpack, ruleNames } from "wakaru-wasm";

await init();
```

Vite resolves `wakaru-wasm` to `crates/wasm/pkg` and excludes it from dependency pre-bundling so the `.wasm` binary loads correctly.
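
Because `init` must finish before any API call, callers that reach the bindings from several places may want a single shared initialization promise. The following is a minimal sketch, not part of `wakaru-wasm`: `createInitOnce` and `InitFn` are hypothetical names, and a stub stands in for the real default export.

```typescript
// Hypothetical helper; `InitFn` stands in for the default export of "wakaru-wasm".
type InitFn = () => Promise<unknown>;

function createInitOnce(init: InitFn): () => Promise<void> {
  let pending: Promise<void> | undefined;
  return () => {
    if (pending === undefined) {
      // Start loading on first use; later callers reuse the same promise.
      pending = init().then(() => undefined);
    }
    return pending;
  };
}

// Usage with a stub in place of the real initializer.
let initCalls = 0;
const ensureReady = createInitOnce(async () => {
  initCalls += 1;
});
const first = ensureReady();
const second = ensureReady();
```

In real code, pass the imported `init` instead of the stub and `await ensureReady()` before calling `decompile` or `unpack`. A rejected initialization stays cached in this sketch; reset `pending` on failure if retries are needed.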

## Exported functions

| Function | Rust symbol | Returns on success |
|----------|-------------|-------------------|
| `decompile` | `decompile` | `WakaruDecompileResult` object |
| `unpack` | `unpack` | `WakaruUnpackResult` object |
| `ruleNames` | `rule_names` (js_name `ruleNames`) | `string[]` |

`ruleNames` cannot fail under normal conditions; it returns `wakaru_core::rule_names()` — the ordered pipeline rule identifiers — or `null` if serialization fails.

None of `unpack_files`, `unpack_raw`, or `trace_rules` from `wakaru-core` is exposed on the WASM surface.

## `decompile`

Runs the single-file decompile pipeline on one source string.

### Parameters

| Position | JS name | Type | Default | Effect |
|----------|---------|------|---------|--------|
| 1 | `source` | `string` | — | Input JavaScript source |
| 2 | `level` | `"minimal" \| "standard" \| "aggressive"` or omitted | `"standard"` | Maps to `RewriteLevel::from_str_or_default`; unrecognized values fall back to `standard` |
| 3 | `sourcemap` | `Uint8Array` or omitted | none | Raw v3 source-map bytes; enables import dedup and source-map-driven identifier rename |
| 4 | `diagnostics` | `boolean` | `false` | Enables post-transform checks (TDZ, duplicate declarations, output parse verification) |
| 5 | `formatter` | `boolean` | `false` | When `true`, runs the Oxc formatter on output; failures become `formatter_failed` warnings |
| 6 | `emitSourceMap` | `boolean` | `false` | Emits a v3 source map mapping decompiled output back to input |

### Fixed internal options

The WASM binding sets options not exposed as parameters:

| Field | Value |
|-------|-------|
| `filename` | `"input.js"` |
| `dce_mode` | `DceMode::TransformOnly` |
| `heuristic_split` | `false` |

### Example

```typescript
const result = decompile(
  bundledSource,
  "standard",
  undefined,
  true,
  true,
  false
);
console.log(result.code);
for (const w of result.warnings) {
  console.warn(`${w.filename} [${w.kind}]: ${w.message}`);
}
```
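
Six positional parameters are easy to misorder, so a thin wrapper that accepts named options can help at call sites. This is a sketch under stated assumptions: `DecompileCallOptions`, `DecompileFn`, and `decompileWith` are hypothetical names, the defaults are copied from the parameter table above, and the binding is passed in as an argument so the example runs with a stub.

```typescript
type RewriteLevelName = "minimal" | "standard" | "aggressive";

// Hypothetical options bag; field names follow the documented JS parameter names.
interface DecompileCallOptions {
  level?: RewriteLevelName;
  sourcemap?: Uint8Array;
  diagnostics?: boolean;
  formatter?: boolean;
  emitSourceMap?: boolean;
}

// Positional order as listed in the parameter table.
type DecompileFn<R> = (
  source: string,
  level?: string,
  sourcemap?: Uint8Array,
  diagnostics?: boolean,
  formatter?: boolean,
  emitSourceMap?: boolean,
) => R;

function decompileWith<R>(
  fn: DecompileFn<R>,
  source: string,
  opts: DecompileCallOptions = {},
): R {
  return fn(
    source,
    opts.level ?? "standard",
    opts.sourcemap,
    opts.diagnostics ?? false,
    opts.formatter ?? false,
    opts.emitSourceMap ?? false,
  );
}

// Stub that records the positional arguments it receives.
const recorded: unknown[][] = [];
const stub: DecompileFn<string> = (...args) => {
  recorded.push(args);
  return "ok";
};
const out = decompileWith(stub, "var a = 1;", { diagnostics: true });
```

With the real binding, the call would read `decompileWith(decompile, source, { diagnostics: true })`.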

## `unpack`

Detects bundle format, extracts modules, and runs the decompile pipeline on each module.

### Parameters

| Position | JS name | Type | Default | Effect |
|----------|---------|------|---------|--------|
| 1 | `source` | `string` | — | Bundle source |
| 2 | `level` | rewrite level string | `"standard"` | Same parsing as `decompile` |
| 3 | `heuristicSplit` | `boolean` | `true` | Enables scope-hoisted bundle splitting when no structural bundle is detected |
| 4 | `diagnostics` | `boolean` | `false` | Per-module diagnostic checks |
| 5 | `formatter` | `boolean` | `false` | Oxc formatting per extracted module |
| 6 | `emitSourceMap` | `boolean` | `false` | Per-module emitted source maps |

### Fixed internal options

| Field | Value |
|-------|-------|
| `filename` | `"input.js"` |
| `dce_mode` | `DceMode::Off` (from `DecompileOptions::default`) |
| `sourcemap` | none |

### Example

```typescript
const result = unpack(bundleSource, "aggressive", true, true, true, false);
for (const mod of result.modules) {
  console.log(mod.filename, mod.code.length);
}
```

## `ruleNames`

Returns the ordered list of transformation rule identifiers from the core pipeline registry. Use it to populate UI pickers or to align browser-side tooling with `RuleDescriptor` order documented on the rule pipeline reference page.

```typescript
const names: string[] = ruleNames();
```

## Result shapes

Serde serializes Rust struct fields using **snake_case** JSON keys. Optional fields are omitted when empty.

### `WakaruDecompileResult`

| Field | Type | Presence |
|-------|------|----------|
| `code` | `string` | always |
| `source_map` | `string` | only when `emitSourceMap` is `true`; v3 source-map JSON |
| `warnings` | `WakaruWarning[]` | always (may be empty) |

```json
{
  "code": "export function greet() {\n  return \"hello\";\n}\n",
  "warnings": []
}
```

With diagnostics and source-map emission:

```json
{
  "code": "...",
  "source_map": "{\"version\":3,\"sources\":[\"input.js\"],...}",
  "warnings": [
    {
      "filename": "input.js",
      "kind": "tdz_violation",
      "message": "..."
    }
  ]
}
```

### `WakaruUnpackResult`

| Field | Type | Presence |
|-------|------|----------|
| `modules` | `WakaruModule[]` | always |
| `source_maps` | `WakaruSourceMap[]` | only when non-empty (`emitSourceMap` produced maps) |
| `warnings` | `WakaruWarning[]` | always (may be empty) |

#### `WakaruModule`

| Field | Type |
|-------|------|
| `filename` | `string` — virtual path for the extracted module |
| `code` | `string` — decompiled (and optionally formatted) source |

#### `WakaruSourceMap`

| Field | Type |
|-------|------|
| `filename` | `string` — module filename the map belongs to |
| `map` | `string` — v3 source-map JSON |

```json
{
  "modules": [
    { "filename": "src/index.js", "code": "..." },
    { "filename": "src/utils.js", "code": "..." }
  ],
  "warnings": [
    {
      "filename": "src/legacy.js",
      "kind": "decompile_failed",
      "message": "..."
    }
  ]
}
```

Unlike CLI `--json` output, WASM results do **not** include `elapsed_ms`, `detected_formats`, `total`, or `failed` counters.
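
Since `modules` and `source_maps` are separate arrays that share a `filename` key, consumers usually need to join them. The sketch below assumes the field names from the tables above; `UnpackResultLike`, `ModuleWithMap`, and `attachSourceMaps` are hypothetical names, and the sample data is illustrative only.

```typescript
interface WakaruModule { filename: string; code: string; }
interface WakaruSourceMap { filename: string; map: string; }

// Subset of the documented result shape; `source_maps` is absent when empty.
interface UnpackResultLike {
  modules: WakaruModule[];
  source_maps?: WakaruSourceMap[];
}

interface ModuleWithMap extends WakaruModule { map?: string; }

function attachSourceMaps(result: UnpackResultLike): ModuleWithMap[] {
  const byFilename = new Map<string, string>();
  for (const entry of result.source_maps ?? []) {
    byFilename.set(entry.filename, entry.map);
  }
  return result.modules.map((mod) => {
    const map = byFilename.get(mod.filename);
    // Leave `map` off entirely when no source map was emitted for the module.
    return map === undefined ? { ...mod } : { ...mod, map };
  });
}

// Illustrative data, not real Wakaru output.
const joined = attachSourceMaps({
  modules: [
    { filename: "src/index.js", code: "..." },
    { filename: "src/utils.js", code: "..." },
  ],
  source_maps: [{ filename: "src/index.js", map: "{\"version\":3}" }],
});
```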

## `WakaruWarning`

Every warning carries three fields:

| Field | Type | Description |
|-------|------|-------------|
| `filename` | `string` | Module or input file the warning refers to |
| `kind` | `string` | Machine-readable warning category (see table below) |
| `message` | `string` | Human-readable detail |

WASM results omit the CLI JSON `is_error` flag. Classify warnings client-side using the kind strings below.

### Warning kind strings

| `kind` | Origin | Treat as error? |
|--------|--------|-----------------|
| `raw_normalization_failed` | core `UnpackWarningKind` | yes |
| `fact_collection_parse_failed` | core | yes |
| `decompile_failed` | core | yes |
| `duplicate_declaration` | core | yes |
| `output_parse_recovered` | core | yes |
| `output_parse_failed` | core | yes |
| `formatter_failed` | WASM formatter layer only | yes |
| `input_parse_recovered` | core | no — diagnostic |
| `tdz_violation` | core | no — diagnostic |
| `import_cycle` | core | no — diagnostic |

Core classification follows `UnpackWarningKind::is_diagnostic()`: diagnostic kinds are `input_parse_recovered`, `tdz_violation`, and `import_cycle`. All other kinds, including `formatter_failed`, are errors (`is_error()` returns true for them).

`formatter_failed` messages follow the pattern `{formatter} formatter failed, preserving output: {detail}` where `formatter` is `oxc` when Oxc formatting was requested.

### Filtering example

```typescript
const DIAGNOSTIC_KINDS = new Set([
  "input_parse_recovered",
  "tdz_violation",
  "import_cycle",
]);

function isErrorWarning(kind: string): boolean {
  return !DIAGNOSTIC_KINDS.has(kind);
}

const errors = result.warnings.filter((w) => isErrorWarning(w.kind));
```

## Error handling

| Outcome | JS behavior |
|---------|-------------|
| Success | Plain object return value |
| Fatal error (parse failure, unpack failure, serialization error) | Thrown `string` (from `JsValue::from_str`) |

The playground worker pattern catches thrown strings and forwards them as `decompile-error` messages. Always wrap calls in `try/catch` when integrating outside the demo.

```typescript
try {
  const result = decompile(source);
} catch (err) {
  // err is a string, e.g. "failed to parse input.js"
  console.error(err);
}
```
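
When call sites prefer branching on a value over `try/catch`, the thrown string can be folded into a tagged result. A minimal sketch: `CallResult` and `safeCall` are hypothetical names, and the two calls below use stubs rather than the real bindings.

```typescript
type CallResult<T> = { ok: true; value: T } | { ok: false; error: string };

function safeCall<T>(run: () => T): CallResult<T> {
  try {
    return { ok: true, value: run() };
  } catch (err) {
    // The bindings throw plain strings; anything else is stringified defensively.
    return { ok: false, error: typeof err === "string" ? err : String(err) };
  }
}

// Stubs standing in for a successful and a failing binding call.
const good = safeCall(() => ({ code: "let a = 1;", warnings: [] }));
const bad = safeCall<string>(() => {
  throw "failed to parse input.js";
});
```

With the real binding, this would be `safeCall(() => decompile(source))`, followed by a check on `ok` before reading `value`.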

## TypeScript bindings

`#[wasm_bindgen(typescript_custom_section)]` in `crates/wasm/src/lib.rs` injects interface definitions into the generated `.d.ts`. Parameter names use **camelCase** (`emitSourceMap`, `heuristicSplit`); result field names use **snake_case** (`source_map`, `source_maps`).

The injected `WakaruWarningKind` union is a **subset** of runtime kinds — it lists `raw_normalization_failed`, `fact_collection_parse_failed`, `decompile_failed`, `tdz_violation`, `output_parse_failed`, and `formatter_failed` but omits `input_parse_recovered`, `duplicate_declaration`, `import_cycle`, and `output_parse_recovered`. Treat `kind` as `string` at runtime or extend local types to cover the full table above.

The playground's `playground/src/vite-env.d.ts` declares `kind: string` on `WakaruWarning`, which matches full runtime behavior.
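
One way to extend local types is a union that covers all ten kinds from the warning table, paired with a `Record` so the compiler flags any kind left unclassified. This is a sketch, not the generated typings: `WakaruWarningKindFull`, `SEVERITY`, and `severityOf` are hypothetical names, and treating unknown kinds as errors is an assumption that mirrors the "all other kinds are errors" rule.

```typescript
type WakaruWarningKindFull =
  | "raw_normalization_failed"
  | "fact_collection_parse_failed"
  | "decompile_failed"
  | "duplicate_declaration"
  | "output_parse_recovered"
  | "output_parse_failed"
  | "formatter_failed"
  | "input_parse_recovered"
  | "tdz_violation"
  | "import_cycle";

type Severity = "error" | "diagnostic";

// Record<...> forces an entry for every kind in the union.
const SEVERITY: Record<WakaruWarningKindFull, Severity> = {
  raw_normalization_failed: "error",
  fact_collection_parse_failed: "error",
  decompile_failed: "error",
  duplicate_declaration: "error",
  output_parse_recovered: "error",
  output_parse_failed: "error",
  formatter_failed: "error",
  input_parse_recovered: "diagnostic",
  tdz_violation: "diagnostic",
  import_cycle: "diagnostic",
};

function severityOf(kind: string): Severity {
  // Assumption: kinds missing from the table are treated as errors.
  return Object.prototype.hasOwnProperty.call(SEVERITY, kind)
    ? SEVERITY[kind as WakaruWarningKindFull]
    : "error";
}
```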

## WASM vs CLI surface

| Capability | WASM | CLI |
|------------|------|-----|
| Single-file decompile | `decompile` | default mode |
| Bundle unpack | `unpack` (single string) | `--unpack` |
| Multi-file / directory unpack | not exposed | `--unpack` with multiple paths |
| Raw extraction | not exposed | `--unpack --raw` |
| Rule trace | not exposed | `wakaru debug trace` |
| Machine-readable timing/metadata | not in result | `--json` |
| DCE mode control | fixed per function | `--dce` flag |
| stdin input | not applicable | supported |

For automation pipelines that need `elapsed_ms`, `detected_formats`, or `is_error` on every warning, use the CLI `--json` schema instead of WASM bindings.

## Build

From the repository root:

```bash
wasm-pack build crates/wasm --target web --out-dir crates/wasm/pkg --release
```

Or via the playground helper:

```bash
node playground/scripts/build-wasm.mjs
```

Release builds disable `wasm-opt` in `Cargo.toml` metadata to avoid over-aggressive size optimization that can break debugging.

## Related pages

<Card href="/wasm-and-playground" title="WASM and playground" description="Vite integration, worker bridge pattern, and online demo setup." />

<Card href="/core-api-reference" title="Core API reference" description="Full Rust API including unpack_files, unpack_raw, DecompileOptions, and DceMode." />

<Card href="/json-output-and-ci" title="JSON output and CI integration" description="CLI --json schema with is_error, elapsed_ms, and detected_formats for automation." />

<Card href="/rewrite-levels-and-assumptions" title="Rewrite levels and assumptions" description="What minimal, standard, and aggressive levels change in the pipeline." />

<Card href="/use-source-maps" title="Use source maps" description="Source-map bytes for rename recovery and emit_source_map output semantics." />

<Card href="/rule-pipeline-reference" title="Rule pipeline reference" description="Ordered rule registry that ruleNames() mirrors." />

---

## 18. Rule pipeline reference

> Ordered RuleDescriptor registry, RuleStage groupings, rule_names() identifiers, RulePipelineOptions ranges, and documented cross-rule dependencies from the inventory.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/18-rule-pipeline-reference.md
- Generated: 2026-06-28T01:12:29.460Z

### Source Files

- `crates/core/src/rules/pipeline.rs`
- `docs/rule-dependency-inventory.md`
- `docs/architecture.md`
- `crates/core/src/rules/mod.rs`

---
title: Rule pipeline reference
description: Ordered RuleDescriptor registry, RuleStage groupings, rule_names() identifiers, RulePipelineOptions ranges, and documented cross-rule dependencies.
---

The decompile pipeline applies **97** transformation rules in a fixed order. Each rule is registered as a `RuleDescriptor` in `crates/core/src/rules/pipeline.rs`. Order is not cosmetic: later rules pattern-match on shapes produced by earlier ones, and some descriptors declare explicit `requires` edges that must stay satisfied.

This page is the executable registry reference. For broader prose notes (safety classifications, downstream effects, experimental validation), see the companion inventory in the repository's `docs/rule-dependency-inventory.md`.

## How rules run

`apply_rules()` walks `RULE_DESCRIPTORS` sequentially. For each descriptor:

1. **Enable gate** — `always_enabled`, `standard_or_above`, or `dead_code_elimination_enabled` decides whether the rule runs for the current `RewriteLevel` / `DceMode`.
2. **Range gate** — `RulePipelineOptions::start_from` and `stop_after` skip rules outside the requested slice (inclusive on both ends).
3. **Execution** — the descriptor's runner mutates the module AST. Some runners invalidate the cached `LocalHelperContext` when helper body shapes change.

```mermaid
flowchart LR
  parse[parse_js] --> resolver[resolver marks]
  resolver --> rules[apply_rules]
  rules --> fixer[fixer]
  fixer --> emit[print_js]
```

During bundle unpack, the through-`UnEsm` slice runs twice per module (fact collection, then output). See [Cross-module facts](/cross-module-facts) for the barrier design.

## RuleDescriptor and RuleStage

Each `RuleDescriptor` carries:

| Field | Type | Purpose |
| --- | --- | --- |
| `id` | `&'static str` | Stable identifier used by `rule_names()`, CLI `--from`/`--until`, and trace output |
| `stage` | `RuleStage` | Logical grouping for documentation and pipeline reasoning |
| `requires` | `&[&str]` | Ordering constraints — every named prerequisite must appear **earlier** in the registry |
| `run` | `RuleRunner` | Function that applies the rule visitor |
| `enabled` | `RuleEnabled` | Level/DCE gate |

`RuleStage` values:

| Stage | Role |
| --- | --- |
| `Syntax` | Minified-syntax normalization (sequences, bracket notation, `void 0`, indirect calls) |
| `Helpers` | Transpiler helper unwrapping and module-system reconstruction through `UnEsm` |
| `Structural` | Structural restoration (spreads, template literals, variable merging, `?.` / `??`) |
| `Complex` | Higher-order patterns (IIFEs, classes, async/regenerator, second-pass interop) |
| `Modernization` | ESNext upgrades (arrows, `let`/`const`, `for…of`, rest params) |
| `Cleanup` | Import/export renaming, inlining, smart rename passes, optional DCE, tail cleanup |

Retrieve descriptors programmatically:

```rust
use wakaru_core::{rule_descriptors, rule_names, RuleStage};

let names = rule_names();           // ordered IDs
let descs = rule_descriptors();     // id + stage + requires per rule
```

The WASM binding exposes the same ordered list via `ruleNames()`.

## Repeat passes

Several rules run more than once under distinct IDs. Numbered suffixes (`2`, `3`) are separate registry entries that re-invoke the same runner after intermediate rules expose new shapes:

| ID | Re-runs | Why |
| --- | --- | --- |
| `SimplifySequence2` | `SimplifySequence` | Flatten sequences after `UnCurlyBraces` adds blocks |
| `UnObjectSpread2` / `UnObjectRest2` / `UnSlicedToArray2` | respective helpers | Post-`UnEsm` shapes for spread/rest/slice helpers |
| `UnOptionalChaining2` | `UnOptionalChaining` | After `UnConditionals` exposes new ternaries |
| `UnWebpackInterop2` / `UnWebpackInterop3` | `UnWebpackInterop` | After async recovery and after `UnEsm` import conversion |
| `UnArgumentSpread2` | `UnArgumentSpread` | After `UnAsyncAwait` exposes `.apply` patterns |
| `UnObjectRest3` | `UnObjectRest` | After async recovery exposes assignment-form rest |
| `UnParameters2` / `UnParameters3` | `UnParameters` | After destructuring and after `SmartRename` |
| `UnNullishCoalescing2` | `UnNullishCoalescing` | After `UnDestructuring` |
| `UnImportRename2` / `UnExportRename2` | rename passes | After `SmartRename` frees occupied alias names |
| `UnIife2` | `UnIife` | After `SmartRename` and after `SmartInline` creates IIFEs |
| `UnJsx2` | `UnJsx` | After `SmartRename` / `ExtractInlinedFunction` |
| `SmartRename2` | `SmartRenameSecondPass` | JSX-aware second rename pass |
| `ArrowReturn2` | `ArrowReturn` | After `UnParameters3` strips arrow block bodies |
| `UnConditionals2` | `UnConditionals` | Final pass after `UnReturn` for late ternaries |

The pipeline **starts** at `SimplifySequence` and **ends** at `UnConditionals2` (not `UnReturn`).

## Enable gates

Most rules use `always_enabled`. Two gates trim the effective pipeline:

### Rewrite level (`standard_or_above`)

Skipped entirely when `RewriteLevel::Minimal`. Affected registry IDs:

`UnUseStrict`, `UnJsx`, `ArrowFunction`, `UnDestructuring`, `UnToArray`, `MergeDeclarationInit`, `SmartRename`, `UnJsx2`, `UnEsbuildCjsWrapper`

Individual rules may also gate risky subpatterns internally by level. See [Rewrite levels and assumptions](/rewrite-levels-and-assumptions).

### Dead-code elimination (`dead_code_elimination_enabled`)

`DeadDecls` and `DeadImports` run only when `DceMode` is not `Off`. `TransformOnly` uses delta DCE (removes transform-induced dead code only); `Full` sweeps all unreachable bindings and imports.

| Caller | Default `DceMode` |
| --- | --- |
| `DecompileOptions` (API) | `Off` |
| CLI decompile/unpack | `TransformOnly` (`--dce` → `Full`) |
| `RulePipelineOptions::default()` | `Full` (overridden by drivers) |
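
The two gates can be modeled in a few lines. The sketch below is a TypeScript model of the behavior described in this section, not the Rust implementation: the `Gate` values reuse the abbreviations from the registry tables, the lowercase level and DCE names are hypothetical, and only three registry entries are sampled.

```typescript
type RewriteLevelName = "minimal" | "standard" | "aggressive";
type DceModeName = "off" | "transform_only" | "full";
type Gate = "always" | "std" | "dce";

function isEnabled(gate: Gate, level: RewriteLevelName, dce: DceModeName): boolean {
  switch (gate) {
    case "always":
      return true;
    case "std":
      // standard_or_above: skipped only at the minimal level.
      return level !== "minimal";
    case "dce":
      // dead_code_elimination_enabled: runs unless DCE is off.
      return dce !== "off";
  }
}

// Three entries sampled from the registry tables.
const sample: Array<{ id: string; gate: Gate }> = [
  { id: "UnUseStrict", gate: "std" },
  { id: "DeadDecls", gate: "dce" },
  { id: "UnReturn", gate: "always" },
];

function enabledIds(level: RewriteLevelName, dce: DceModeName): string[] {
  return sample.filter((rule) => isEnabled(rule.gate, level, dce)).map((rule) => rule.id);
}
```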

## RulePipelineOptions

```rust
pub struct RulePipelineOptions<'a> {
    pub start_from: Option<&'a str>,
    pub stop_after: Option<&'a str>,
    pub dce_mode: DceMode,
    pub rewrite_level: RewriteLevel,
    pub module_facts: Option<&'a ModuleFactsMap>,
}
```

| Constructor / method | Behavior |
| --- | --- |
| `Default::default()` | Full pipeline, `DceMode::Full`, `RewriteLevel::Standard`, no facts |
| `until(stop_after)` | Run from first rule through `stop_after` (inclusive) |
| `between(start, stop)` | Run from `start` through `stop` (inclusive); skip everything before `start` |
| `with_dce_mode` | Override DCE behavior |
| `with_rewrite_level` | Override rewrite aggressiveness |
| `with_module_facts` | Pass cross-module facts to fact-aware rules (`UnTemplateLiteral`, `UnForOf`, helper rules, etc.) |

**Range semantics:** `start_from` waits until an **enabled** descriptor with a matching `id` is reached, then runs it and all following enabled rules until `stop_after` (if set). Disabled rules are skipped before range matching — a `--from` or `--until` name that is gated off at the current level or DCE mode will never match.

**CLI / trace mapping:** `wakaru debug trace` accepts `--from` → `start_from` and `--until` → `stop_after`. Unknown rule names are rejected before tracing starts.
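
The range semantics above can be sketched as a small selection function. This is a TypeScript model rather than the code in `pipeline.rs`: `GatedRule` and `selectRules` are hypothetical names, the `enabled` flag is assumed to be precomputed by the enable gate, and checking `stopAfter` only once the range has started is an assumption.

```typescript
interface GatedRule { id: string; enabled: boolean; }

function selectRules(rules: GatedRule[], startFrom?: string, stopAfter?: string): string[] {
  const selected: string[] = [];
  let started = startFrom === undefined;
  for (const rule of rules) {
    // Enable gate first: a disabled id can never match a range bound.
    if (!rule.enabled) continue;
    if (!started && rule.id === startFrom) started = true;
    if (started) selected.push(rule.id);
    // Inclusive upper bound (assumption: only checked once the range has started).
    if (started && rule.id === stopAfter) break;
  }
  return selected;
}

// Slice of the registry as it would look at the minimal level (UnUseStrict gated off).
const slice: GatedRule[] = [
  { id: "RemoveVoid", enabled: true },
  { id: "UnUseStrict", enabled: false },
  { id: "UnAssignmentMerging", enabled: true },
  { id: "UnEsm", enabled: true },
  { id: "UnObjectSpread2", enabled: true },
];

const bounded = selectRules(slice, "RemoveVoid", "UnEsm");
const gatedStart = selectRules(slice, "UnUseStrict");
```

In this model, a gated-off `startFrom` selects nothing, while a gated-off `stopAfter` lets the run continue to the end of the registry.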

### Common ranges

| Range | Rules included | Typical use |
| --- | --- | --- |
| Full pipeline | `SimplifySequence` … `UnConditionals2` | Single-file `decompile()` |
| Through `UnEsm` | #1–28 | Unpack Phase 1 and Phase 2 pre-barrier |
| `UnObjectSpread2` … `UnReturn` | #29–96 | Unpack Phase 2 post-barrier tail |
| `UnCurlyBraces` … `UnEsm` | Helpers slice | Targeted module-system debugging |

Unpack Phase 2 appends manual cleanup after the `UnObjectSpread2`…`UnReturn` slice (extra `SimplifySequence`, `UnAssignmentMerging`, targeted ESM recovery, `UnOptionalChaining`, conditional passes). It does **not** run `UnConditionals2`.

## Ordered registry

Global index is stable across releases — tests assert `rule_descriptors()[i].id == rule_names()[i]`. Gate column: **std** = `standard_or_above`, **dce** = `dead_code_elimination_enabled`, blank = always enabled. **Requires** lists explicit `requires` edges only.

### Syntax (11)

| # | ID | Gate | Requires |
| --- | --- | --- | --- |
| 1 | `SimplifySequence` | | |
| 2 | `FlipComparisons` | | |
| 3 | `UnTypeofStrict` | | |
| 4 | `RemoveVoid` | | `SimplifySequence` |
| 5 | `UnminifyBooleans` | | |
| 6 | `UnDoubleNegation` | | |
| 7 | `UnInfinity` | | |
| 8 | `UnIndirectCall` | | |
| 9 | `UnTypeof` | | |
| 10 | `UnNumericLiteral` | | |
| 11 | `UnBracketNotation` | | |

### Helpers (20)

| # | ID | Gate | Requires |
| --- | --- | --- | --- |
| 12 | `UnInteropRequireDefault` | | `UnIndirectCall`, `UnBracketNotation` |
| 13 | `UnInteropRequireWildcard` | | `UnIndirectCall`, `UnBracketNotation` |
| 14 | `UnToConsumableArray` | | |
| 15 | `UnObjectSpread` | | |
| 16 | `UnObjectRest` | | `UnBracketNotation` |
| 17 | `UnSlicedToArray` | | |
| 18 | `UnClassCallCheck` | | |
| 19 | `UnPossibleConstructorReturn` | | |
| 20 | `UnTypeofPolyfill` | | |
| 21 | `UnCurlyBraces` | | |
| 22 | `SimplifySequence2` | | `UnCurlyBraces` |
| 23 | `UnEsmoduleFlag` | | |
| 24 | `UnUseStrict` | std | |
| 25 | `UnAssignmentMerging` | | `UnCurlyBraces` |
| 26 | `UnVariableMergingDeclsOnly` | | `UnAssignmentMerging` |
| 27 | `UnWebpackInterop` | | `UnBracketNotation`, `UnEsmoduleFlag` |
| 28 | `UnEsm` | | `UnCurlyBraces`, `UnEsmoduleFlag`, `UnUseStrict`, `UnAssignmentMerging`, `UnVariableMergingDeclsOnly`, `UnWebpackInterop` |
| 29 | `UnObjectSpread2` | | `UnEsm` |
| 30 | `UnObjectRest2` | | `UnObjectSpread2` |
| 31 | `UnSlicedToArray2` | | `UnObjectRest2` |

### Structural (10)

| # | ID | Gate | Requires |
| --- | --- | --- | --- |
| 32 | `UnTemplateLiteral` | | |
| 33 | `UnTypeConstructor` | | |
| 34 | `UnBuiltinPrototype` | | |
| 35 | `UnArgumentSpread` | | |
| 36 | `UnArrayConcatSpread` | | |
| 37 | `UnSpreadArrayLiteral` | | |
| 38 | `ObjectAssignSpread` | | |
| 39 | `UnVariableMerging` | | |
| 40 | `UnNullishCoalescing` | | |
| 41 | `UnOptionalChaining` | | |

### Complex (16)

| # | ID | Gate | Requires |
| --- | --- | --- | --- |
| 42 | `UnIife` | | |
| 43 | `UnConditionals` | | |
| 44 | `UnOptionalChaining2` | | `UnConditionals` |
| 45 | `UnParameters` | | `FlipComparisons`, `RemoveVoid` |
| 46 | `UnWhileLoop` | | `UnParameters` |
| 47 | `UnEnum` | | |
| 48 | `UnJsx` | std | |
| 49 | `UnEs6Class` | | |
| 50 | `UnAssertThisInitialized` | | `UnEs6Class` |
| 51 | `UnClassFields` | | |
| 52 | `UnDefineProperty` | | `UnConditionals`, `UnClassFields` |
| 53 | `UnRegenerator` | | |
| 54 | `UnAsyncAwait` | | |
| 55 | `UnObjectRest3` | | `UnAsyncAwait` |
| 56 | `UnArgumentSpread2` | | `UnAsyncAwait` |
| 57 | `UnWebpackInterop2` | | `UnObjectRest3` |

### Modernization (13)

| # | ID | Gate | Requires |
| --- | --- | --- | --- |
| 58 | `UnThenCatch` | | |
| 59 | `UnUndefinedInit` | | |
| 60 | `VarDeclToLetConst` | | |
| 61 | `ClassExpressionToDeclaration` | | `VarDeclToLetConst` |
| 62 | `ObjShorthand` | | |
| 63 | `ObjMethodShorthand` | | |
| 64 | `UnPrototypeClass` | | |
| 65 | `Exponent` | | |
| 66 | `ArgRest` | | |
| 67 | `UnRestArrayCopy` | | |
| 68 | `ArrowFunction` | std | |
| 69 | `ArrowReturn` | | |
| 70 | `UnForOf` | | |

### Cleanup (27)

| # | ID | Gate | Requires |
| --- | --- | --- | --- |
| 71 | `UnWebpackDefineGetters` | | |
| 72 | `UnWebpackObjectGetters` | | |
| 73 | `ImportDedup` | | |
| 74 | `UnExportRename` | | |
| 75 | `UnImportRename` | | `UnExportRename` |
| 76 | `UnWebpackInterop3` | | `UnEsm` |
| 77 | `UnDestructuring` | std | `UnImportRename`, `UnExportRename` |
| 78 | `UnNullishCoalescing2` | | `UnDestructuring` |
| 79 | `UnToArray` | std | `UnNullishCoalescing2` |
| 80 | `UnParameters2` | | `UnDestructuring` |
| 81 | `SmartInline` | | `UnDestructuring` |
| 82 | `MergeDeclarationInit` | std | `SmartInline` |
| 83 | `SmartRename` | std | `SmartInline` |
| 84 | `UnParameters3` | | `SmartRename` |
| 85 | `ArrowReturn2` | | `UnParameters3` |
| 86 | `UnExportRename2` | | `SmartRename` |
| 87 | `UnImportRename2` | | `SmartRename`, `UnExportRename2` |
| 88 | `UnIife2` | | `SmartRename` |
| 89 | `ExtractInlinedFunction` | | `UnIife2` |
| 90 | `UnJsx2` | std | `SmartRename`, `ExtractInlinedFunction` |
| 91 | `SmartRename2` | | `UnJsx2` |
| 92 | `DeadUninitializedDecls` | | `SmartRename2` |
| 93 | `UnEsbuildCjsWrapper` | std | `DeadUninitializedDecls` |
| 94 | `DeadDecls` | dce | |
| 95 | `DeadImports` | dce | `DeadDecls` |
| 96 | `UnReturn` | | |
| 97 | `UnConditionals2` | | `UnReturn` |

## Cross-rule dependencies

### Executable `requires` metadata

The `requires` field on each `RuleDescriptor` is the machine-checked subset of ordering constraints. A unit test verifies every named prerequisite appears at a lower index than its dependent rule.

Rules without `requires` may still have **suspected** dependencies documented in `docs/rule-dependency-inventory.md` (for example `UnBracketNotation` enabling dot-notation matching across many helpers). When adding or moving rules, update both the registry `requires` list and the inventory prose when the constraint is load-bearing.
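
The ordering check that the unit test performs can be illustrated with a short model. The sketch below is not the Rust test itself: `RuleDescriptorLike` and `findOrderViolations` are hypothetical names, and the sample registry contains only five entries taken from the Syntax and Helpers tables.

```typescript
interface RuleDescriptorLike { id: string; requires: string[]; }

function findOrderViolations(registry: RuleDescriptorLike[]): string[] {
  const indexById = new Map<string, number>();
  registry.forEach((rule, index) => indexById.set(rule.id, index));

  const violations: string[] = [];
  registry.forEach((rule, index) => {
    for (const dep of rule.requires) {
      const depIndex = indexById.get(dep);
      if (depIndex === undefined) {
        violations.push(`${rule.id}: unknown prerequisite ${dep}`);
      } else if (depIndex >= index) {
        // A prerequisite must sit at a strictly lower index than its dependent.
        violations.push(`${rule.id}: ${dep} must appear earlier`);
      }
    }
  });
  return violations;
}

// Five entries from the registry tables, in registry order.
const ordered: RuleDescriptorLike[] = [
  { id: "SimplifySequence", requires: [] },
  { id: "RemoveVoid", requires: ["SimplifySequence"] },
  { id: "UnIndirectCall", requires: [] },
  { id: "UnBracketNotation", requires: [] },
  { id: "UnInteropRequireDefault", requires: ["UnIndirectCall", "UnBracketNotation"] },
];

// Same entries with the dependent rule moved to the front.
const misordered: RuleDescriptorLike[] = [ordered[4]!, ...ordered.slice(0, 4)];
```

Running `findOrderViolations` on `ordered` yields no violations, while `misordered` reports both prerequisites of `UnInteropRequireDefault`.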

### Confirmed chains

These chains are experimentally validated (see inventory Step 3):

```mermaid
flowchart TD
  UB[UnBracketNotation] --> UID[UnInteropRequireDefault]
  UIC[UnIndirectCall] --> UID
  UB --> UIW[UnInteropRequireWildcard]
  UIC --> UIW
  UAM[UnAssignmentMerging] --> UESM[UnEsm]
  UEF[UnEsmoduleFlag] --> UESM
  UCB[UnCurlyBraces] --> UESM
  UWI1[UnWebpackInterop] --> UESM
  UAA[UnAsyncAwait] --> UOR3[UnObjectRest3]
  UOR3 --> UWI2[UnWebpackInterop2]
  UESM --> UAA
```

**`UnEsm` barrier** — In unpack mode, `UnEsm` (#28) is the last rule before cross-module fact collection. Phase 2 resumes with `UnObjectSpread2` (#29) after the late pass. `UnEsm` must run after the first `UnWebpackInterop` pass; `UnWebpackInterop2` after `UnAsyncAwait` is a **hard** prerequisite (fixture regressions without it).

**Late cleanup chain** — `UnDestructuring` → `SmartInline` → `SmartRename` → (`UnImportRename2`, `UnExportRename2`, `UnIife2`) → `ExtractInlinedFunction` → `UnJsx2` → `SmartRename2` → `DeadUninitializedDecls`. `SmartInline` must run after import/export renames so bindings are stable; it must run before `SmartRename`, which expects alias vars removed.

**DCE ordering** — `DeadDecls` before `DeadImports`: removing dead helper declarations can leave import specifiers unreferenced.

**Parameter recovery** — `FlipComparisons` + `RemoveVoid` before `UnParameters`; `UnWhileLoop` after `UnParameters` (loop initializer removal exposes `for(; test;)`).

### Placing new rules

When inserting a rule:

1. Identify shape prerequisites (what AST must exist) and downstream consumers.
2. Add `requires` entries for load-bearing ordering — not every suspected edge needs to be in `requires`, but hard failures must be.
3. If the rule creates IIFEs, place it before the second `UnIife` pass (`UnIife2`).
4. If it matches free identifiers by name, gate on `unresolved_mark`.
5. If it renames bindings, use `BindingRenamer` — see [Develop rules](/develop-rules).

## rule_names() identifiers

`rule_names()` returns `&'static [&'static str]` — the same order as execution (modulo enable gates). Use these exact strings for:

- `RulePipelineOptions::until("UnEsm")`
- `cargo run -p wakaru-cli -- debug trace input.js --from RemoveVoid --until UnEsm`
- Test helpers `render_pipeline_until` / `render_pipeline_between`
- WASM / playground `ruleNames()`

Trace mode validates names up front; a typo in `--from` or `--until` fails before any rule runs.
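
Browser-side tooling that accepts rule names can apply the same up-front check against the list returned by `ruleNames()`. A minimal sketch: `validateRuleRange` is a hypothetical helper, its error messages are illustrative and not the CLI's actual wording, and the name list is a short sample.

```typescript
function validateRuleRange(
  names: readonly string[],
  from?: string,
  until?: string,
): string[] {
  const known = new Set(names);
  const errors: string[] = [];
  // Names are compared exactly; rule identifiers are case-sensitive strings.
  if (from !== undefined && !known.has(from)) {
    errors.push(`unknown rule for --from: ${from}`);
  }
  if (until !== undefined && !known.has(until)) {
    errors.push(`unknown rule for --until: ${until}`);
  }
  return errors;
}

// Short sample; in practice pass the array returned by ruleNames().
const sampleNames = ["SimplifySequence", "RemoveVoid", "UnEsm", "UnReturn"];
const valid = validateRuleRange(sampleNames, "RemoveVoid", "UnEsm");
const typo = validateRuleRange(sampleNames, "RemoveVoid", "UnESM");
```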

At `RewriteLevel::Aggressive` with `DceMode::Off`, trace event order matches `rule_names()` minus `DeadDecls` and `DeadImports`.

## Related pages

<CardGroup>
  <Card title="Decompile pipeline" href="/decompile-pipeline">
    Parse → resolver → rules → fixer → emit flow and unresolved_mark gating.
  </Card>
  <Card title="Trace the rule pipeline" href="/trace-rule-pipeline">
    Per-rule diffs, --from/--until bisection, and trace limitations during unpack.
  </Card>
  <Card title="Cross-module facts" href="/cross-module-facts">
    Two-phase unpack barrier at UnEsm and Phase 2 late-pass rules.
  </Card>
  <Card title="Develop rules" href="/develop-rules">
    Test-first workflow, pipeline placement, and definition-of-done checks.
  </Card>
  <Card title="Core API reference" href="/core-api-reference">
    apply_rules, DecompileOptions, DceMode, and RewriteLevel defaults.
  </Card>
</CardGroup>

---

## 19. Webpack bundle recipe

> End-to-end workflow using webpack4 and webpack5 testcases: build fixtures, run wakaru --unpack, compare against dist/*.pretty.js reference output, and multi-chunk inputs.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/19-webpack-bundle-recipe.md
- Generated: 2026-06-28T01:14:21.168Z

### Source Files

- `testcases/webpack4/README.md`
- `testcases/webpack4/webpack.config.js`
- `testcases/webpack4/dist/index.js`
- `testcases/webpack5/README.md`
- `testcases/webpack5/webpack.config.mjs`
- `testcases/webpack5/dist/index.js`

---
title: "Webpack bundle recipe"
description: "End-to-end workflow using webpack4 and webpack5 testcases: build fixtures, run wakaru --unpack, compare against dist/*.pretty.js reference output, and multi-chunk inputs."
---

Wakaru ships two committed webpack fixtures under `testcases/webpack4` and `testcases/webpack5`. Pipeline tests read `dist/index.js`, unpack with `BundleFormat::Webpack4` or `BundleFormat::Webpack5`, and pin every extracted module in `crates/core/tests/snapshots/`. The CLI path is `wakaru <bundle> --unpack -o <dir>`.

## Fixture layout

:::files
testcases/
├── webpack4/
│   ├── src/                  # React 16 + Redux + TypeScript sources
│   ├── webpack.config.js     # production build, babel-loader for .tsx
│   ├── package.json          # `pnpm build` → dist/index.js
│   └── dist/
│       ├── index.js          # minified webpack4 IIFE array bundle (test input)
│       └── index.pretty.js   # formatted copy of index.js for inspection
└── webpack5/
    ├── src/                  # small ESM app (6 modules + entry)
    ├── webpack.config.mjs    # development + source-map
    ├── package.json          # `pnpm build` → dist/index.js
    └── dist/
        ├── index.js          # webpack5 __webpack_modules__ bundle (test input)
        ├── index.js.map      # external source map
        ├── index.pretty.js   # formatted copy of index.js
        ├── modules.js        # extra artifact; not used by wakaru tests
        └── output.js         # partial bundle variant; not used by wakaru tests
:::

| Fixture | Webpack version | Bundle shape | Module IDs in output | Typical module count |
|---------|-----------------|--------------|--------------------|----------------------|
| `testcases/webpack4` | 4.4 (React starter fork) | `!function(modules){...}(["module",...])` | Numeric → `module-N.js` | ≥ 50 (+ `entry.js`) |
| `testcases/webpack5` | 5.88 | `(() => { var __webpack_modules__ = { "./src/…": … } })()` | String paths → `src/*.js` | 7 (`entry.js` + 6 `src/` files) |

<Note>
Only `dist/index.js` from each testcase is wired into `webpack4_unpack`, `webpack4_unpack_raw`, and `bundle_unpack` tests. The other files in `testcases/webpack5/dist/` are committed for manual comparison but are not CI inputs.
</Note>

## Webpack4 vs webpack5 detection markers

| Signal | Webpack4 (`testcases/webpack4/dist/index.js`) | Webpack5 (`testcases/webpack5/dist/index.js`) |
|--------|-----------------------------------------------|-----------------------------------------------|
| Bootstrap | Single IIFE + numeric `n(s=51)` entry | `/******/` banner + `__webpack_modules__` object |
| Module table | Array of factory functions | Object keyed by `"./src/…"` strings |
| Harmony helpers | `n.d`, `n.r`, `n.n` on runtime `n` | `__webpack_require__.d`, `.r`, `.n` |
| Output filenames | `module-0.js` … `module-50.js`, `entry.js` | Preserved paths: `src/a.js`, `src/b.js`, … |
| Source map | Not committed | `index.js.map` available (`devtool: 'source-map'`) |

The webpack4 fixture bundles a full production React/Redux/Saga stack (Babel 6 + TypeScript 2.9). The webpack5 fixture is a minimal ESM exercise focused on harmony exports, default interop, and duplicate binding names (`A` from `d.js` vs `e.js`).

## Build fixtures

<Steps>
<Step title="Install Node dependencies">

From the repo root:

<CodeGroup>
```bash title="Webpack4"
cd testcases/webpack4
pnpm install
```

```bash title="Webpack5"
cd testcases/webpack5
pnpm install
```
</CodeGroup>

Both packages pin `pnpm@8.15.2` and use a placeholder `pnpm-workspace.yaml` to avoid pnpm workspace detection issues.

</Step>

<Step title="Produce dist bundles">

<CodeGroup>
```bash title="Webpack4 (production)"
cd testcases/webpack4
pnpm build
# runs: cross-env NODE_OPTIONS="--openssl-legacy-provider" webpack --mode production
```

```bash title="Webpack5 (development + source map)"
cd testcases/webpack5
pnpm build
# runs: webpack (mode: development, devtool: source-map)
```
</CodeGroup>

Webpack4 writes a single minified `dist/index.js`. Webpack5 writes `dist/index.js` plus `dist/index.js.map`.

</Step>

<Step title="Verify committed artifacts">

After a local rebuild, confirm the primary inputs still exist:

```bash
ls -l testcases/webpack4/dist/index.js testcases/webpack5/dist/index.js
```

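Wakaru tests read these inputs by relative path (`../../testcases/webpack4/dist/index.js`). That path reaches the repo-root `testcases/` tree only from the `crates/core/` package directory, which is the working directory Cargo uses when running that package's tests. If `pnpm build` changes bundle hashes, expect snapshot drift in the core test suite.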

</Step>
</Steps>

## Unpack a single bundle

<ParamField body="--unpack" type="flag" required>
Unpack mode. Requires `-o/--output` as a directory. Default mode is `auto` (structural detection + heuristic fallback).
</ParamField>

<ParamField body="-o / --output" type="path" required>
Output directory. Wakaru refuses to overwrite a non-empty directory unless `--force` is set.
</ParamField>

<ParamField body="--raw" type="flag">
Write extractor output before the decompile rule pipeline. Useful for bisecting webpack normalization vs rule regressions.
</ParamField>

<ParamField body="-m / --source-map" type="path">
Source map for identifier recovery. **Single input only** — CLI rejects `--source-map` with multiple bundle files.
</ParamField>

<ParamField body="--force" type="flag">
Overwrite existing output files or non-empty output directories.
</ParamField>

<CodeGroup>
```bash title="Webpack4 full unpack"
cargo run -p wakaru-cli -- \
  testcases/webpack4/dist/index.js \
  --unpack -o /tmp/wp4-out
```

```bash title="Webpack5 full unpack"
cargo run -p wakaru-cli -- \
  testcases/webpack5/dist/index.js \
  --unpack -o /tmp/wp5-out
```

```bash title="Webpack5 with source map"
cargo run -p wakaru-cli -- \
  testcases/webpack5/dist/index.js \
  -m testcases/webpack5/dist/index.js.map \
  --unpack -o /tmp/wp5-mapped-out
```

```bash title="Raw extraction (pre-pipeline)"
cargo run -p wakaru-cli -- \
  testcases/webpack4/dist/index.js \
  --unpack --raw -o /tmp/wp4-raw-out
```
</CodeGroup>

### Expected output layout

:::files
/tmp/wp4-out/                    # webpack4: flat numeric modules
├── entry.js
├── module-0.js
├── module-1.js
└── …                            # 52 files total (entry + 51 modules)

/tmp/wp5-out/                    # webpack5: source-tree paths
├── entry.js
└── src/
    ├── 1.js
    ├── a.js
    ├── b.js
    ├── c.js
    ├── d.js
    └── e.js
:::

Webpack4 `entry.js` recovers the React mount (imports from `./module-N.js`). Webpack5 `entry.js` mirrors `testcases/webpack5/src/index.js` structure with `import` statements and aliased bindings (`A as A_1`, `A as A_2`).
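
The two naming schemes in the layout above can be sketched as one mapping from module ID to output path. This is a minimal illustration, not Wakaru's code: `module_filename` is a hypothetical helper, and it assumes numeric IDs become `module-N.js` while string IDs drop a leading `./`. It does not model how `entry.js` is chosen.

```rust
/// Hypothetical helper (not part of Wakaru): maps a webpack module ID
/// to a filename relative to the output directory.
fn module_filename(id: &str) -> String {
    let is_numeric = !id.is_empty() && id.chars().all(|c| c.is_ascii_digit());
    if is_numeric {
        // Numeric IDs (webpack4 array indices) carry no path information.
        format!("module-{}.js", id)
    } else {
        // String IDs such as "./src/a.js" keep their source-tree path.
        id.strip_prefix("./").unwrap_or(id).to_string()
    }
}
```

Under these assumptions, `module_filename("7")` yields `module-7.js` and `module_filename("./src/a.js")` yields `src/a.js`.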

<RequestExample>
```bash
cargo run -p wakaru-cli -- testcases/webpack5/dist/index.js --unpack -o /tmp/wp5-out
head -5 /tmp/wp5-out/entry.js
```
</RequestExample>

<ResponseExample>
```javascript
import _1_js__WEBPACK_IMPORTED_MODULE_0__ from "./src/1.js";
import { A } from "./src/a.js";
import _b_js__WEBPACK_IMPORTED_MODULE_2__ from "./src/b.js";
import { getC } from "./src/c.js";
import { A as A_1 } from "./src/d.js";
```
</ResponseExample>

## Reference outputs and comparison

Three layers serve different comparison goals:

| Layer | Path | What it represents | How to use it |
|-------|------|--------------------|---------------|
| Pretty bundle | `dist/index.pretty.js` | Human-readable **input** bundle (formatted `dist/index.js`) | Inspect webpack runtime, module boundaries, and `__webpack_require__` wiring before unpacking |
| Original sources | `testcases/webpack5/src/` | Pre-bundle authoring files | Spot-check recovered imports/exports against known `src/` for webpack5 |
| Insta snapshots | `crates/core/tests/snapshots/webpack4_unpack__*.snap` | Pinned **decompiled** module output | Automated regression gate; authoritative for CI |

<Warning>
`dist/*.pretty.js` is not wakaru output. Do not diff it against unpack results. It exists so you can read the committed bundle without a formatter. Automated verification compares against insta snapshots, not pretty files.
</Warning>

### Manual diff workflow

1. Unpack to a scratch directory (`--unpack -o /tmp/out --force`).
2. For webpack5, compare `out/src/*.js` and `out/entry.js` against `testcases/webpack5/src/`.
3. For webpack4, use `out/entry.js` and representative `out/module-N.js` files; original TypeScript lives under `testcases/webpack4/src/` but module numbering does not map 1:1 to source paths.
4. When rule changes alter output, run focused tests and review snapshot diffs (see below).

### Snapshot-backed verification

```bash
# Webpack4: structural checks + per-module snapshots
cargo test -p wakaru-core --test webpack4_unpack
cargo test -p wakaru-core --test webpack4_unpack_raw

# Webpack5 single-file unpack
cargo test -p wakaru-core --test bundle_unpack

# Webpack5 JSONP chunk format (inline fixtures)
cargo test -p wakaru-core --test webpack5_chunk_unpack

# Entry + separate chunk files (multi-input)
cargo test -p wakaru-core --test multi_file_unpack
```

Snapshot drift fails the test and writes `.snap.new` (configured via `INSTA_UPDATE=new` in `.cargo/config.toml`). Review diffs, then accept intentional improvements:

```bash
cargo insta review
# or
cargo insta accept
```

<Check>
Success signals from `webpack4_unpack.rs`: unpack succeeds with no error-severity warnings, ≥ 50 modules extracted, every module non-empty, and `entry.js` present.
</Check>
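
The success signals above can be written as a small checker. This is a sketch rather than the assertions in `webpack4_unpack.rs`: `check_unpack_output` and its `(filename, code)` input are hypothetical, and it assumes the 50-module threshold counts every emitted file.

```rust
/// Hypothetical checker over (filename, code) pairs from an unpack run.
/// `error_count` is the number of error-severity warnings reported.
fn check_unpack_output(modules: &[(String, String)], error_count: usize) -> Result<(), String> {
    if error_count > 0 {
        return Err(format!("{} error-severity warning(s)", error_count));
    }
    if modules.len() < 50 {
        return Err(format!("expected at least 50 modules, got {}", modules.len()));
    }
    // Every module must contain something other than whitespace.
    if let Some((name, _)) = modules.iter().find(|(_, code)| code.trim().is_empty()) {
        return Err(format!("module {} is empty", name));
    }
    if !modules.iter().any(|(name, _)| name == "entry.js") {
        return Err("entry.js is missing".to_string());
    }
    Ok(())
}
```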

## Raw vs full unpack (webpack4)

Webpack4 has two regression layers:

| Mode | API / CLI | Output stage | Test binary |
|------|-----------|--------------|-------------|
| Raw | `unpack_raw` / `--unpack --raw` | Post-extraction normalization only; ESM markers remain | `webpack4_unpack_raw` |
| Full | `unpack` / `--unpack` | Two-phase pipeline through ~60 rewrite rules | `webpack4_unpack` |

`webpack4_unpack_raw.rs` pins pre-pipeline code via `unpack_webpack4_raw`. `webpack4_unpack.rs` asserts raw output differs from decompiled output for at least one module — confirming the rule pipeline adds value beyond extraction.

Use `--raw` to tell whether a regression originates in extraction or in the rule pipeline. Use full unpack for end-user-readable modules.

## Multi-chunk and multi-file inputs

Single-file testcases cover monolithic bundles. Split-chunk workflows use separate entry and chunk artifacts merged by `unpack_files`.

```mermaid
sequenceDiagram
    participant CLI as wakaru CLI
    participant Detect as is_detected_unpack_input
    participant Merge as unpack_files
    participant P1 as Phase 1 (facts)
    participant P2 as Phase 2 (cross-module rules)

    CLI->>Detect: scan each input file
    Detect-->>CLI: bundle/chunk candidates only
    CLI->>Merge: UnpackInput[] (entry + chunks)
    Merge->>P1: par_iter modules through UnEsm
    P1->>P2: ModuleFactsMap barrier
    P2-->>CLI: merged module set → output dir
```

### Inline JSONP chunks

`webpack5_chunk_unpack.rs` exercises the `(self.webpackChunk… || []).push([[id], { numericId: factory }])` format. Covered behaviors:

- Numeric module IDs become `module-11111.js`, `module-200.js`, etc.
- `require(N)` inside a chunk rewrites to `./module-N.js` imports
- Arrow and method factory forms unpack
- `window["webpackJsonp"]` chunk bases are accepted
- Webpack4-style `require.d(exports, "name", getter)` inside chunks normalizes

These tests use inline source strings, not the `testcases/` tree.

### Entry + external chunk files

`multi_file_unpack.rs` loads generated fixtures from `crates/core/tests/bundles/webpack-gen/dist/`:

| Test | Inputs | Key assertion |
|------|--------|---------------|
| `webpack5_dynamic_entry_and_chunk_unpack_together` | `bundle.js` + `src_greet_js.bundle.js` | `entry.js` imports `./src/greet.js`; chunk module extracted |
| `webpack5_dynamic_min_runtime_entry_and_chunk_unpack_together` | `bundle.js` + `529.bundle.js` | `entry.js` references `./module-529.js` |
| `webpack5_multi_file_rewrites_unambiguous_numeric_chunk_id` | synthetic entry + `529.bundle.js` | Numeric `529` rewritten; no raw `, 529,` left in entry |

CLI equivalent — pass multiple detected files:

```bash
cargo run -p wakaru-cli -- \
  path/to/bundle.js path/to/529.bundle.js \
  --unpack -o /tmp/multi-out
```

Or scan a directory (skips `node_modules`, hidden paths, and non-bundle `.js` files):

```bash
cargo run -p wakaru-cli -- \
  path/to/dist/ \
  --unpack -o /tmp/scanned-out
```

<Info>
`unpack_files` stabilizes filenames across inputs and rewrites **unambiguous** numeric webpack IDs to final module paths. Duplicate numeric IDs across unrelated runtimes in the same scan are left untouched to avoid merging incompatible bundles.
</Info>
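
One way to picture the "unambiguous" rule is to count how many inputs define each numeric ID. The sketch below is illustrative only: `unambiguous_ids` is a hypothetical function, and it assumes an ID is ambiguous exactly when more than one input file defines it. The real `unpack_files` logic may weigh runtimes differently.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical sketch: returns the numeric module IDs that are defined
/// by exactly one input file. Each input is (filename, defined IDs).
fn unambiguous_ids(inputs: &[(&str, Vec<u32>)]) -> BTreeSet<u32> {
    // Map each ID to the set of files that define it.
    let mut definers: HashMap<u32, BTreeSet<&str>> = HashMap::new();
    for (file, ids) in inputs {
        for id in ids {
            definers.entry(*id).or_default().insert(*file);
        }
    }
    // Keep only IDs with a single defining file.
    definers
        .into_iter()
        .filter(|(_, files)| files.len() == 1)
        .map(|(id, _)| id)
        .collect()
}
```

IDs outside the returned set would be left untouched rather than rewritten to a module path.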

## Definition-of-done checklist

Before merging webpack unpack changes, run the pipeline matrix from `docs/testing.md`:

```bash
cargo test -p wakaru-core --test noop_pipeline
cargo test -p wakaru-core --test webpack4_unpack
cargo test -p wakaru-core --test webpack4_unpack_raw
cargo test -p wakaru-core --test bundle_unpack
cargo test -p wakaru-core --test esbuild_unpack
cargo test -p wakaru-core --test webpack5_chunk_unpack
cargo test -p wakaru-core --test multi_file_unpack
cargo fmt --check
cargo clippy -p wakaru-core --all-targets -- -D warnings
```

Inspect every snapshot diff. Accept only when output is semantically better, not merely different. Ensure `git status --short` shows no stale `.snap.new` files.

<AccordionGroup>
<Accordion title="Overwrite protection on repeat runs">

If the output directory already contains files, wakaru exits unless `--force` is passed. Use a fresh `/tmp/…` path or add `--force` when re-running locally.

</Accordion>

<Accordion title="webpack4 build fails on OpenSSL 3">

The webpack4 `package.json` build script sets `NODE_OPTIONS="--openssl-legacy-provider"` because webpack 4 hashes with MD4, which OpenSSL 3 (bundled with Node 17 and later) no longer enables by default. Keep that flag when rebuilding locally.

</Accordion>

<Accordion title="Snapshot update workflow">

1. Run the failing test binary.
2. Open the `.snap.new` diff (`cargo insta review` or your diff tool).
3. Confirm the change is an improvement (e.g., cleaner imports, correct `unresolved_mark` behavior).
4. `cargo insta accept` and re-run the test binary to confirm green.

</Accordion>
</AccordionGroup>

## Related pages

<CardGroup>
<Card title="Bundle formats and unpacking" href="/bundle-formats-and-unpacking">
Detection order for `BundleFormat::Webpack4` and `Webpack5`, raw vs full unpack, and directory scan semantics.
</Card>
<Card title="Unpack bundles" href="/unpack-bundles">
Operational guide for `--unpack` modes, `--raw`, multi-file inputs, and `--force`.
</Card>
<Card title="Testing and snapshots" href="/testing-and-snapshots">
Insta workflow, required pipeline test binaries, and pre-commit verification matrix.
</Card>
<Card title="Debug regressions" href="/debugging-regressions">
Raw vs final webpack4 layers, rule trace bisection, and snapshot drift investigation.
</Card>
<Card title="Cross-module facts" href="/cross-module-facts">
Two-phase unpack barrier that enables cross-chunk import/export recovery.
</Card>
</CardGroup>

---

## 20. Esbuild and Browserify recipe

> Unpack browserify standalone bundles and esbuild/Bun scope-hoisted output: detection markers (__export, __commonJS), strict vs auto heuristic split, and testcase verification commands.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/20-esbuild-and-browserify-recipe.md
- Generated: 2026-06-28T01:15:25.440Z

### Source Files

- `testcases/browserify/README.md`
- `testcases/browserify/dist/index.js`
- `crates/core/src/unpacker/browserify.rs`
- `crates/core/src/unpacker/esbuild.rs`
- `crates/core/src/unpacker/scope_hoist.rs`

---
title: "Esbuild and Browserify recipe"
description: "Unpack browserify standalone bundles and esbuild/Bun scope-hoisted output: detection markers (__export, __commonJS), strict vs auto heuristic split, and testcase verification commands."
---

Wakaru splits Browserify and esbuild/Bun bundles through `crates/core/src/unpacker/`. Browserify is detected at position 4 in `detect_bundle_candidate` (after the webpack4/5 detectors and before SystemJS) by matching a three-argument outer call whose first argument is a numeric-key module map. esbuild/Bun is detected at position 6 via lazy-helper factories (`__esm`, `__commonJS`) and/or `__export` namespace boundaries. Extracted module strings then pass through the two-phase unpack driver (rules through `UnEsm`, cross-module facts, late cleanup). Bundles without structural markers can still split when `--unpack=auto` enables `heuristic_split`, which falls back to `scope_hoist::split_scope_hoisted` reference-graph clustering.

## Detection order

Structural detectors run in fixed order; first match wins:

| Order | Format | `BundleFormat` |
|-------|--------|----------------|
| 1–3 | webpack5, webpack4, webpack5 chunk | `Webpack5`, `Webpack4` |
| 4 | Browserify standalone | `Browserify` |
| 5 | SystemJS | `SystemJs` |
| 6 | esbuild / Bun | `Esbuild` |
| — | Heuristic fallback (`--unpack=auto` only) | `ScopeHoisted` |

Pure ESM scope-hoisted output from Rollup, Vite, or esbuild/Bun **without** `__export` or `__commonJS` markers does not match the esbuild structural detector and is only split by the heuristic path.

## Browserify standalone bundles

### Structural signature

Browserify emits a self-executing wrapper whose outer expression is a call with exactly three arguments:

1. **Module map** (argument 0): object literal keyed by non-negative integer module IDs; each value is a two-element array `[factory, deps]`.
2. **Cache object** (argument 1): typically `{}`; not inspected for detection.
3. **Entry IDs** (argument 2): array of numeric entry module IDs.

The detector scans top-level expression statements for `Callee::Expr` nested inside another `CallExpr`, then validates the module-map shape.

### Extraction and normalization

For each module entry, Wakaru:

- Extracts the factory function (named `FnExpr` or `ArrowExpr`).
- Builds a synthetic ES module from the factory body.
- Runs `resolver` marks and renames factory parameters to `require`, `module`, and `exports` when minified names differ.
- Applies the fixer and emits standalone module code.

Output filenames:

| Role | Filename |
|------|----------|
| Single entry | `entry.js` |
| Multiple entries | `entry-{id}.js` per entry ID |
| Dependency module | `module-{id}.js` |

The testcase bundle at `testcases/browserify/dist/index.js` is a minified standalone with numeric modules `1`–`3` and `(require, module, exports)` factories.
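
The filename table above reduces to a three-way decision. The following sketch restates it in code; `browserify_filename` is a hypothetical helper written for this page, not a function from `browserify.rs`.

```rust
/// Hypothetical helper mirroring the filename table: entry modules become
/// `entry.js` (single entry) or `entry-{id}.js` (several entries); every
/// other module becomes `module-{id}.js`.
fn browserify_filename(id: u32, entry_ids: &[u32]) -> String {
    if entry_ids.contains(&id) {
        if entry_ids.len() == 1 {
            "entry.js".to_string()
        } else {
            format!("entry-{}.js", id)
        }
    } else {
        format!("module-{}.js", id)
    }
}
```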

## Esbuild and Bun structural detection

Bun's bundler emits the same helper shapes as esbuild; both route through `esbuild::detect_from_module`.

### Phase 1: marker pre-check

Before cloning the AST, the detector scans top-level `var` initializers for:

- **Lazy helpers** — arrow with ≤2 params whose body is another arrow or function (`is_lazy_helper`). Covers minified `__esm` and non-minified `__require` wrappers.
- **`__export` helper shape** — two-param arrow whose body is a single `for…in` over the second parameter.

Detection aborts early when neither marker is found, that is, when `helper_syms` is empty and no `__export` helper shape was seen (`has_export_helper_shape`).

### `__commonJS` and `__esm` factories

Factories match `var X = helper(arg)` where `helper` is a collected lazy-helper symbol:

| Form | Pattern | Filename source |
|------|---------|-----------------|
| Non-minified CJS | `__commonJS({ "src/foo.js"(exports, module) { … } })` | Sanitized path key |
| Minified | `y(() => { … })` | `{var_name}.js` |

`__commonJS` helpers are distinguished by lazy-helper shape **plus** a reference to `.exports` in the helper body (`collect_commonjs_helper_syms`).

A bundle qualifies when it has CJS factories **or** at least **five** lazy factories (`has_factories`).

### `__export` scope-hoisted boundaries

Scope-hoisted modules are delimited by adjacent pairs:

```javascript
var ns_a = {};
__export(ns_a, { greet: () => greet, … });
```

`detect_export_helper` finds the `__export` binding; `collect_scope_hoisted_boundaries` enumerates namespace + call pairs. Exported bindings are promoted to `export` declarations; cross-boundary references become synthesized `import`/`export` edges. `__toESM` and dynamic-require helpers are detected by body shape and rewritten during emission (not left as synthetic imports).

Scope-only bundles (no factories) still match when `__export` boundaries exist and either multiple namespaces are present or a single namespace is re-exported at module scope.
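
Taken together, the factory and boundary rules form one qualification predicate. The sketch below is an assumption-laden summary: `EsbuildSignals`, its field names, and `qualifies_as_esbuild` are hypothetical, and the real detector works on the AST rather than on precomputed counts.

```rust
/// Hypothetical summary of what a detection pass found.
struct EsbuildSignals {
    cjs_factories: usize,
    lazy_factories: usize,
    export_namespaces: usize,
    /// True when the only namespace is re-exported at module scope.
    single_namespace_reexported: bool,
}

/// Returns true when the bundle would count as esbuild/Bun output under
/// the rules described above.
fn qualifies_as_esbuild(s: &EsbuildSignals) -> bool {
    // Factory path: any CJS factory, or at least five lazy factories.
    let has_factories = s.cjs_factories > 0 || s.lazy_factories >= 5;
    // Scope-only path: several namespaces, or one that is re-exported.
    let has_scope_boundaries = s.export_namespaces > 1
        || (s.export_namespaces == 1 && s.single_namespace_reexported);
    has_factories || has_scope_boundaries
}
```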

## Strict vs auto heuristic split

| Mode | CLI flag | `heuristic_split` | Behavior |
|------|----------|-------------------|----------|
| Auto (default) | `--unpack` / `--unpack=auto` | `true` | Structural detectors, then `scope_hoist` fallback |
| Strict | `--unpack=strict` | `false` | Structural detectors only; unmatched input decompiles as one file |

**Auto fallback flow** (`driver/unpack/mod.rs`):

1. `try_unpack_bundle` — webpack → browserify → systemjs → esbuild.
2. If `None` and `heuristic_split`: `scope_hoist::split_scope_hoisted`.
3. If split yields `>1` modules → unpack with `BundleFormat::ScopeHoisted`.
4. Otherwise → single-file `decompile` → `module.js`.
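
The four steps above amount to a short decision function. This is a sketch under stated assumptions, not the code in `driver/unpack/mod.rs`: `Outcome` and `choose_outcome` are hypothetical names, and the structural detectors and the heuristic splitter are reduced to precomputed inputs.

```rust
/// Hypothetical result of the unpack decision.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// A structural detector matched; carries the format name.
    Structural(&'static str),
    /// The heuristic splitter produced this many modules.
    ScopeHoisted(usize),
    /// Nothing matched; decompile the input as a single `module.js`.
    SingleFile,
}

fn choose_outcome(
    structural_match: Option<&'static str>,
    heuristic_split: bool,
    heuristic_module_count: usize,
) -> Outcome {
    // Step 1: a structural match always wins.
    if let Some(format) = structural_match {
        return Outcome::Structural(format);
    }
    // Steps 2 and 3: the fallback applies only in auto mode and only
    // when the split yields more than one module.
    if heuristic_split && heuristic_module_count > 1 {
        return Outcome::ScopeHoisted(heuristic_module_count);
    }
    // Step 4: single-file decompile.
    Outcome::SingleFile
}
```

In strict mode `heuristic_split` is `false`, so a markerless bundle always ends at `Outcome::SingleFile`.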

**Nested scope split** (auto + `--level aggressive`): after a structural match, `maybe_split_scope_hoisted_modules` re-runs the heuristic splitter on each extracted module. Import paths must resolve against sibling filenames or the parent module is kept intact.

Directory scanning (`is_detected_unpack_input`) uses the same gate: strict mode skips heuristic candidates; auto mode includes files that heuristic-split into multiple modules.

## Heuristic scope-hoist splitter

`scope_hoist::split_scope_hoisted` handles flat concatenated ESM without bundler markers (Rollup/Vite output, IIFE-wrapped esbuild ESM).

| Gate | Threshold |
|------|-----------|
| Minimum declarations | 10 top-level items with declared names |
| Minimum clusters | 2 after union-find clustering |
| Module cluster size | ≥2 declarations (else folded into entry) |

Clustering uses five merge signals: mutual references, adjacent dependency chains, inert helpers, adjacency + shared references, and exclusive-consumer merges (conservative fan-out guard). The entry cluster absorbs small clusters, `ModuleDecl` items, and startup code.
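
The gates and the union-find step can be sketched as follows. Only the three thresholds come from the table above; everything else is an assumption made for illustration. `split_by_clusters` is a hypothetical function, `merges` stands in for whatever the five merge signals decided, and the entry cluster is simplified to "all singleton declarations".

```rust
use std::collections::BTreeMap;

/// Minimal union-find over declaration indices.
struct UnionFind {
    parent: Vec<usize>,
}

impl UnionFind {
    fn new(n: usize) -> Self {
        UnionFind { parent: (0..n).collect() }
    }

    fn find(&mut self, x: usize) -> usize {
        let mut root = x;
        while self.parent[root] != root {
            root = self.parent[root];
        }
        // Path compression: point every visited node at the root.
        let mut cur = x;
        while self.parent[cur] != root {
            let next = self.parent[cur];
            self.parent[cur] = root;
            cur = next;
        }
        root
    }

    fn union(&mut self, a: usize, b: usize) {
        let ra = self.find(a);
        let rb = self.find(b);
        if ra != rb {
            self.parent[rb] = ra;
        }
    }
}

/// Hypothetical gate logic. Returns `(entry, modules)` or `None` when a
/// gate fails. Indices in `merges` must be below `decl_count`.
fn split_by_clusters(
    decl_count: usize,
    merges: &[(usize, usize)],
) -> Option<(Vec<usize>, Vec<Vec<usize>>)> {
    if decl_count < 10 {
        return None; // minimum-declarations gate
    }
    let mut uf = UnionFind::new(decl_count);
    for &(a, b) in merges {
        uf.union(a, b);
    }
    let mut clusters: BTreeMap<usize, Vec<usize>> = BTreeMap::new();
    for decl in 0..decl_count {
        let root = uf.find(decl);
        clusters.entry(root).or_default().push(decl);
    }
    if clusters.len() < 2 {
        return None; // minimum-clusters gate
    }
    let mut entry = Vec::new();
    let mut modules = Vec::new();
    for (_, members) in clusters {
        if members.len() >= 2 {
            modules.push(members);
        } else {
            entry.extend(members); // small clusters fold into the entry
        }
    }
    Some((entry, modules))
}
```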

During emission, esbuild-specific helpers are recovered when present:

- **Dynamic require** — `typeof require`, `require.apply(this, arguments)`, and `"Dynamic require of"` message.
- **`__toESM`** — `__esModule` member check plus `"default"` `defineProperty`.

## CLI commands

### Browserify testcase

```bash
# Rebuild fixture (optional)
cd testcases/browserify && pnpm install && pnpm build

# Unpack with full decompile pipeline
cargo run -p wakaru-cli -- testcases/browserify/dist/index.js --unpack -o /tmp/browserify-out/

# Raw extraction (no rule pipeline)
cargo run -p wakaru-cli -- testcases/browserify/dist/index.js --unpack --raw -o /tmp/browserify-raw/
```

Expected: multiple modules including `entry.js`; factory bodies decompiled to `import`/`export` rather than `require()` calls.

### Esbuild/Bun fixtures

Generated bundles live under `crates/core/tests/bundles/esbuild-gen/dist/`. Regenerate with Node.js and Bun:

```bash
cd crates/core/tests/bundles/esbuild-gen && bash generate.sh
```

```bash
# Mixed CJS factories + scope-hoisted namespaces
cargo run -p wakaru-cli -- \
  crates/core/tests/bundles/esbuild-gen/dist/es-mixed/bundle.js \
  --unpack -o /tmp/esbuild-mixed/

# Scope-only ESM (structural __export detection)
cargo run -p wakaru-cli -- \
  crates/core/tests/bundles/esbuild-gen/dist/es-scope-only/bundle.js \
  --unpack -o /tmp/esbuild-scope/

# Bun minified comparison fixture
cargo run -p wakaru-cli -- \
  crates/core/tests/bundles/esbuild-gen/dist/bun-mixed-min/bundle.js \
  --unpack -o /tmp/bun-mixed/
```

Compare strict vs auto on markerless bundles:

```bash
cargo run -p wakaru-cli -- flat-bundle.js --unpack=strict -o /tmp/strict/   # single module if no markers
cargo run -p wakaru-cli -- flat-bundle.js --unpack=auto   -o /tmp/auto/     # heuristic split when ≥10 decls
```

## Test verification

Run the focused test binaries after changes:

```bash
# Browserify pipeline snapshot
cargo test -p wakaru-core --test bundle_unpack browserify_unpack_extracts_multiple_modules

# Full browserify + webpack5 bundle suite
cargo test -p wakaru-core --test bundle_unpack

# Esbuild detection, scope-hoisted extraction, heuristic helpers
cargo test -p wakaru-core --test esbuild_unpack

# Generated esbuild/Bun fixture integration
cargo test -p wakaru-core --test esbuild_fixtures

# Heuristic scope-hoist unit tests
cargo test -p wakaru-core -- unpacker::scope_hoist
```

Faster iteration with nextest:

```bash
cargo nextest run -p wakaru-core -E 'test(bundle_unpack) | test(esbuild_unpack) | test(esbuild_fixtures)'
```

Success signals:

- `browserify_unpack_extracts_multiple_modules`: `pairs.len() > 1` and `entry.js` present.
- `esbuild_detects_minified_lazy_helper` / `esbuild_detects_minified_cjs_helper`: ≥6 modules (5 factories + entry).
- `esbuild_scope_hoisted_modules_are_extracted`: ≥8 modules including `ns_a.js`, `ns_b.js`.
- `heuristic_scope_hoist_restores_esbuild_dynamic_require_imports`: `heuristic_split: true`; `require("react")` restored, no synthetic `import { r }`.

Definition-of-done pipeline checks from `AGENTS.md` still apply after rule changes:

```bash
cargo test -p wakaru-core --test noop_pipeline
cargo test -p wakaru-core --test bundle_unpack
cargo fmt --check
cargo clippy -p wakaru-core --all-targets -- -D warnings
```

## Format comparison

```mermaid
flowchart TD
  input[Bundle input] --> structural[try_unpack_bundle]
  structural -->|Browserify match| bfy[browserify.rs extract]
  structural -->|esbuild markers| esb[esbuild.rs extract]
  structural -->|no match| heuristic{heuristic_split?}
  heuristic -->|yes| sh[scope_hoist.rs cluster]
  heuristic -->|no| single[decompile as module.js]
  bfy --> pipeline[Two-phase unpack driver]
  esb --> pipeline
  sh --> pipeline
```

| Aspect | Browserify | esbuild structural | Heuristic scope-hoist |
|--------|------------|--------------------|-----------------------|
| Primary signal | Numeric module map + triple-arg call | Lazy helpers / `__export` pairs | Reference graph on top-level decls |
| Typical input | `browserify -o bundle.js` | `esbuild --bundle --format=esm` | Rollup/Vite flat ESM, markerless esbuild ESM |
| Entry filename | `entry.js` | `entry.js` | `entry.js` |
| Strict mode | Detected | Detected (with markers) | Skipped |

## Failure modes

| Symptom | Likely cause |
|---------|--------------|
| Single `module.js` output | Strict mode on markerless bundle; or heuristic gates not met (<10 declarations, <2 clusters) |
| Browserify not detected | Input lacks the triple-arg outer call, or the module map has non-integer keys |
| esbuild not detected | Fewer than five lazy factories and no CJS factories; no `__export` boundaries |
| Empty or partial split | Parse failure in `try_unpack_bundle` (returns error, not `None`) |
| Directory scan skips files | `--unpack=strict` and file has no structural signature |

## Related pages

<Card href="/bundle-formats-and-unpacking" title="Bundle formats and unpacking" description="Full detection order, raw vs full unpack, and multi-file directory semantics." />

<Card href="/unpack-bundles" title="Unpack bundles" description="Operational guide for --unpack modes, --raw, --force, and directory scanning." />

<Card href="/webpack-bundle-recipe" title="Webpack bundle recipe" description="Parallel workflow for webpack4/webpack5 testcase verification." />

<Card href="/testing-and-snapshots" title="Testing and snapshots" description="nextest workflow, snapshot acceptance, and required pipeline test matrix." />

<Card href="/decompile-pipeline" title="Decompile pipeline" description="What happens to extracted modules in the two-phase unpack driver." />

---

## 21. Troubleshooting

> Common failure modes: overwrite protection, unpack directory skip behavior, UnpackWarningKind codes, TDZ and parse-recovery warnings, formatter failures, and bug report fields.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/21-troubleshooting.md
- Generated: 2026-06-28T01:14:07.556Z

### Source Files

- `README.md`
- `crates/core/src/driver/types.rs`
- `crates/core/src/tdz_check.rs`
- `docs/debugging.md`
- `.github/ISSUE_TEMPLATE/bug_report.yml`

---
title: Troubleshooting
description: Common failure modes — overwrite protection, unpack directory skip behavior, UnpackWarningKind codes, TDZ and parse-recovery warnings, formatter failures, and bug report fields.
---

Wakaru is designed to produce output even when individual modules fail: warnings are collected, raw code is preserved where transforms cannot complete, and the CLI exits non-zero only when at least one **error-severity** warning is present. This page maps symptoms to causes and shows what to include in a bug report.

## Quick symptom map

| Symptom | Likely cause | First step |
|---------|--------------|------------|
| `output file … already exists` | Overwrite protection | Add `--force` or choose a new `-o` path |
| `output directory … is not empty` | Non-empty unpack target | Add `--force` or use an empty directory |
| `no bundle or chunk files detected` | Directory scan found no bundles | Pass explicit bundle files, or see [Unpack bundles](/unpack-bundles) |
| `scanned: N … skipped: M` with low `detected` | Plain JS files in a `dist/` tree | Expected — only bundle/chunk shapes are unpacked |
| `error: … tdz_violation` | Transform reordered `let`/`const`/`class` | Run with `--diagnostics`; try `--level minimal` |
| `warning: oxc formatter failed` | Emitted code is not parseable by the formatter | Inspect decompiled output; file a bug with `--diagnostics` |
| `errors in N module(s)` at end | Non-diagnostic warning kinds | Inspect stderr or `--json` warnings where `is_error` is true |
| Internal panic with GitHub link | Unhandled Rust panic | Use the pre-filled issue URL from stderr |

## Overwrite protection

Wakaru refuses to clobber existing outputs unless you pass `--force`. This applies globally to decompile, unpack, `extract`, and `debug trace`.

<ParamField body="--force" type="flag">
Overwrite existing output **files** and write into **non-empty** output directories. Without it, Wakaru exits before processing.
</ParamField>

### Single-file decompile

When `-o output.js` points to a file that already exists:

```text
output file output.js already exists; pass --force to overwrite
```

<Steps>
<Step title="Choose a safe output path">

Run without `-o` to print to stdout, or pick a path that does not exist yet.

</Step>
<Step title="Or pass --force">

```bash
wakaru input.js -o output.js --force
```

</Step>
</Steps>

### Unpack to a directory

When `-o out/` exists and is not empty:

```text
output directory out/ is not empty; pass --force to write into it
```

With `--force` on a non-empty directory, Wakaru uses a **write-if-changed** fast path: files whose bytes are identical to the new output are left untouched (no redundant writes or timestamp updates). Changed files are overwritten.

```bash
wakaru bundle.js --unpack -o out/ --force
```
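
The write-if-changed idea can be sketched with the standard library alone. This is an illustration of the technique, not Wakaru's implementation; `write_if_changed` is a hypothetical helper name.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: writes `contents` only when the file is missing
/// or its bytes differ. Returns `true` when a write happened.
fn write_if_changed(path: &Path, contents: &[u8]) -> io::Result<bool> {
    match fs::read(path) {
        // Identical bytes: leave the file (and its timestamp) alone.
        Ok(existing) if existing == contents => Ok(false),
        // Different bytes: overwrite.
        Ok(_) => {
            fs::write(path, contents)?;
            Ok(true)
        }
        // Missing file: create it.
        Err(err) if err.kind() == io::ErrorKind::NotFound => {
            fs::write(path, contents)?;
            Ok(true)
        }
        // Any other I/O error is surfaced to the caller.
        Err(err) => Err(err),
    }
}
```

Skipping identical writes keeps modification times stable, which helps tools that watch the output directory.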

### Other subcommands

`wakaru extract` and `wakaru debug trace` (hidden) use the same `ensure_output_file` / `ensure_output_dir` checks. Pass `--force` when re-running against an existing target.

## Unpack directory skip behavior

Directory inputs work **only** with `--unpack`. Wakaru recursively scans for candidates, then keeps only files that match a bundle or chunk shape.

### What gets scanned

- **Extensions:** `.js`, `.mjs`, `.cjs`
- **Skipped entirely:** hidden files and directories (names starting with `.`), `node_modules` trees, non-JS extensions (e.g. `.js.map`, `.ts`)
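
The scan filter above can be approximated by a path predicate. The sketch below is not the CLI's directory walker: `is_scan_candidate` is a hypothetical function that assumes it receives a path relative to the scanned directory, and it covers only the extension, hidden-name, and `node_modules` rules.

```rust
use std::path::{Component, Path};

/// Hypothetical filter for one path found during a directory scan.
fn is_scan_candidate(relative: &Path) -> bool {
    // Reject hidden entries and node_modules anywhere along the path.
    for component in relative.components() {
        if let Component::Normal(name) = component {
            let name = name.to_string_lossy();
            if name.starts_with('.') || &*name == "node_modules" {
                return false;
            }
        }
    }
    // Only the final extension counts, so "index.js.map" has extension "map".
    matches!(
        relative.extension().and_then(|ext| ext.to_str()),
        Some("js") | Some("mjs") | Some("cjs")
    )
}
```

Passing this filter only makes a file a candidate; bundle detection still decides whether it is kept.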

### What gets kept vs skipped

Each scanned candidate is tested with `is_detected_unpack_input`. Files that do not match a bundle/chunk shape are **skipped** — they are not copied, decompiled, or written to the output directory.

On a TTY, stderr reports scan statistics:

```text
scanned: 4 file(s), detected: 2 bundle/chunk file(s), skipped: 2 file(s)
```

| Counter | Meaning |
|---------|---------|
| `scanned` | JS-like files read and evaluated |
| `detected` | Files accepted as bundle/chunk inputs |
| `skipped` | Scanned files that failed bundle detection |

### Directory vs explicit file inputs

| Input mode | Non-bundle `.js` behavior |
|------------|---------------------------|
| **Directory** (`dist/ --unpack`) | Skipped silently (counted in `skipped`) |
| **Explicit file** (`plain.js --unpack`) | Still processed — normal unpack fallback applies |

If a directory scan finds zero detected bundles:

```text
no bundle or chunk files detected in directory input
```

<Steps>
<Step title="Confirm bundles are present">

Point at the actual webpack chunk, runtime entry, or browserify standalone — not only application source files sitting beside bundles.

</Step>
<Step title="Try explicit file paths">

```bash
wakaru entry.js chunk.123.js --unpack -o out/
```

</Step>
<Step title="Adjust detection mode">

Use `--unpack=strict` for structural detection only, or default `--unpack` / `--unpack=auto` to allow heuristic scope-hoisted splitting. See [Bundle formats and unpacking](/bundle-formats-and-unpacking).

</Step>
</Steps>

### Unsafe module filenames

During unpack, module filenames are resolved under the output directory. Paths that escape the output root (e.g. `../outside.js` or symlink tricks) are rejected:

```text
unsafe module filename "../evil.js": path escapes output directory
```

This is intentional path-traversal protection, not a decompiler bug.
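
A lexical version of this check is sketched below. It is an illustration of the technique, not Wakaru's code: `resolve_under` is a hypothetical function, and because it never touches the filesystem it does not cover the symlink case mentioned above. The error text is copied from the message shown earlier.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical lexical check: joins `filename` under `out_dir` and
/// rejects any path that would climb above the output root.
fn resolve_under(out_dir: &Path, filename: &str) -> Result<PathBuf, String> {
    let unsafe_err = || {
        format!(
            "unsafe module filename {:?}: path escapes output directory",
            filename
        )
    };
    let mut depth: usize = 0; // how far below out_dir we currently are
    let mut resolved = out_dir.to_path_buf();
    for component in Path::new(filename).components() {
        match component {
            Component::Normal(part) => {
                resolved.push(part);
                depth += 1;
            }
            Component::CurDir => {}
            Component::ParentDir => {
                if depth == 0 {
                    return Err(unsafe_err());
                }
                resolved.pop();
                depth -= 1;
            }
            // Absolute paths ignore out_dir entirely, so reject them too.
            Component::RootDir | Component::Prefix(_) => return Err(unsafe_err()),
        }
    }
    Ok(resolved)
}
```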

## Warning kinds (`UnpackWarningKind`)

Warnings flow through the core API, CLI stderr, `--json` output, and the WASM bindings (with one extra kind — see [Formatter failures](#formatter-failures)).

### Severity model

<ResponseField name="is_diagnostic" type="boolean">
`true` for kinds that signal potential transform issues but **do not** fail the run. The CLI still exits 0 unless an error-severity warning is present.
</ResponseField>

<ResponseField name="is_error" type="boolean">
`true` when `!is_diagnostic`. Triggers `has_errors()`, a non-zero CLI exit, and the `errors in N module(s)` summary.
</ResponseField>

### Full kind reference

| Kind string | Severity | When it fires | Output preserved? |
|-------------|----------|---------------|-------------------|
| `raw_normalization_failed` | Error* | Raw unpack normalization could not parse a module | Yes — unparsed raw code kept |
| `fact_collection_parse_failed` | Error | Phase 1 fact collection could not parse a module | Yes — empty facts, pipeline continues |
| `decompile_failed` | Error | Phase 2 rule pipeline returned an error for a module | Yes — raw extracted code kept |
| `input_parse_recovered` | Diagnostic | Input parsed with recoverable SWC errors (`--diagnostics`) | Yes |
| `tdz_violation` | Diagnostic | Reference to `let`/`const`/`class` before declaration (`--diagnostics`) | Yes |
| `duplicate_declaration` | Error | Duplicate lexical binding in same scope (`--diagnostics`) | Yes |
| `import_cycle` | Diagnostic | Local import SCC not merged (size/safety limits) (`--diagnostics`) | Yes |
| `output_parse_recovered` | Error | Emitted code parsed with recoverable errors (`--diagnostics`) | Yes |
| `output_parse_failed` | Error | Emitted code is entirely unparseable (`--diagnostics`) | Yes — broken code still written |

\*`raw_normalization_failed` is recorded only when `--diagnostics` is enabled on the raw/heuristic normalization path. `fact_collection_parse_failed` and `decompile_failed` are always recorded during full unpack.
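
The severity split in the table can be modeled in a few lines of Rust. This is an illustrative mirror of the table only: `KindSketch` and the free function `has_errors` are hypothetical names, and the real `UnpackWarningKind` definition in `wakaru-core` may be shaped differently.

```rust
/// Illustrative mirror of the kind table; not the real `UnpackWarningKind`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum KindSketch {
    RawNormalizationFailed,
    FactCollectionParseFailed,
    DecompileFailed,
    InputParseRecovered,
    TdzViolation,
    DuplicateDeclaration,
    ImportCycle,
    OutputParseRecovered,
    OutputParseFailed,
}

impl KindSketch {
    /// Kind string as listed in the table.
    fn as_str(self) -> &'static str {
        match self {
            KindSketch::RawNormalizationFailed => "raw_normalization_failed",
            KindSketch::FactCollectionParseFailed => "fact_collection_parse_failed",
            KindSketch::DecompileFailed => "decompile_failed",
            KindSketch::InputParseRecovered => "input_parse_recovered",
            KindSketch::TdzViolation => "tdz_violation",
            KindSketch::DuplicateDeclaration => "duplicate_declaration",
            KindSketch::ImportCycle => "import_cycle",
            KindSketch::OutputParseRecovered => "output_parse_recovered",
            KindSketch::OutputParseFailed => "output_parse_failed",
        }
    }

    /// Diagnostic kinds are reported but do not fail the run.
    fn is_diagnostic(self) -> bool {
        matches!(
            self,
            KindSketch::InputParseRecovered | KindSketch::TdzViolation | KindSketch::ImportCycle
        )
    }

    /// Error severity is defined as "not diagnostic".
    fn is_error(self) -> bool {
        !self.is_diagnostic()
    }
}

/// Any error-severity warning makes the whole run count as failed.
fn has_errors(kinds: &[KindSketch]) -> bool {
    kinds.iter().any(|kind| kind.is_error())
}
```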

### JSON shape

With `--json`, each warning includes machine-readable severity:

<RequestExample>
```bash
wakaru bundle.js --unpack --diagnostics --json -o out/
```
</RequestExample>

<ResponseExample>
```json
{
  "warnings": [
    {
      "filename": "module-1.js",
      "kind": "tdz_violation",
      "is_error": false,
      "message": "reference to `x` before declaration"
    }
  ],
  "failed": 0,
  "total": 12,
  "elapsed_ms": 340
}
```
</ResponseExample>

See [JSON output and CI](/json-output-and-ci) for the full schema and piping patterns.

### Which warnings require `--diagnostics`

`DecompileOptions::diagnostics` defaults to `false`. The CLI flag `--diagnostics` sets it.

| Kind | Without `--diagnostics` | With `--diagnostics` |
|------|-------------------------|----------------------|
| `fact_collection_parse_failed` | Reported | Reported |
| `decompile_failed` | Reported | Reported |
| `tdz_violation` | Silent | Reported |
| `input_parse_recovered` | Silent | Reported |
| `duplicate_declaration` | Silent | Reported |
| `import_cycle` | Silent | Reported |
| `output_parse_recovered` / `output_parse_failed` | Silent | Reported |
| `raw_normalization_failed` | Silent | Reported (raw path) |

For bug reports and CI quality gates, always rerun with `--diagnostics` (and `--json` for automation).

## TDZ violations

Temporal dead zone (TDZ) checking runs on the **transformed AST** after the rule pipeline, using traversal order rather than source spans. This catches cases where rules reorder nodes without updating byte positions.

### What is detected

- References to `let`, `const`, or `class` bindings before their declaration in the same scope
- Self-references in initializers (`const x = x`)
- Class heritage references before the base binding (`class Foo extends Bar` before `let Bar`)
- Parameter default expressions referencing later parameters
- Destructuring default ordering (later defaults may reference earlier bindings; self-reference is still TDZ)

Nested functions and arrows are **not** checked against outer declarations (deferred execution).
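
The traversal-order idea can be illustrated with a toy model. The sketch below assumes the transformed AST has already been flattened into a sequence of declare/reference events in evaluation order for a single scope. The `Event` type and `tdz_violations` function are hypothetical and far simpler than the real checker, which works on the SWC AST and handles scopes and deferred execution.

```rust
use std::collections::HashSet;

/// One step of a flattened, single-scope traversal (hypothetical model).
enum Event<'a> {
    Declare(&'a str),
    Reference(&'a str),
}

/// Returns every `let`/`const`/`class` name that is referenced before its
/// declaration has been visited. `lexical` lists the names bound in the scope.
fn tdz_violations<'a>(lexical: &[&'a str], events: &[Event<'a>]) -> Vec<&'a str> {
    let lexical: HashSet<&'a str> = lexical.iter().copied().collect();
    let mut declared: HashSet<&'a str> = HashSet::new();
    let mut violations = Vec::new();
    for event in events {
        match event {
            Event::Declare(name) => {
                declared.insert(*name);
            }
            Event::Reference(name) => {
                // Only names bound lexically in this scope can be in the TDZ.
                if lexical.contains(*name) && !declared.contains(*name) {
                    violations.push(*name);
                }
            }
        }
    }
    violations
}
```

In this model, `const x = x` appears as `Reference("x")` followed by `Declare("x")`, because the initializer is evaluated before the binding is initialized, so it is reported.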

### Example

<Input>
```javascript
console.log(x);
let x = 1;
```
</Input>

With `--diagnostics`:

```text
warning: module.js: reference to `x` before declaration
```

### What to do

<AccordionGroup>
<Accordion title="TDZ warning after an upgrade">

1. Reproduce with `--level minimal` to see if aggressive heuristics caused the reorder.
2. Use [Trace the rule pipeline](/trace-rule-pipeline) to find which rule introduced the bad ordering.
3. File a bug with the minimal input, command, and `--diagnostics` output.

</Accordion>
<Accordion title="TDZ in unpack but not single-file">

Unpack uses the two-phase fact pipeline. Trace the **extracted module** as a single file; `debug trace` rejects whole-bundle input because per-rule tracing across the two phases would be misleading. See [Debug regressions](/debug-regressions).

</Accordion>
</AccordionGroup>

## Parse-recovery warnings

Wakaru uses SWC's recoverable parse mode in several places. Recovery lets the pipeline continue, but diagnostics flag suspect input or output.

### `input_parse_recovered`

Fires when the **input** to decompile/unpack had recoverable parse errors. Message prefix:

```text
input parse recovered from parser error: …
```

The transform pipeline still runs on the recovered AST.

### `output_parse_recovered` and `output_parse_failed`

After emit, `verify_output_parses` re-parses the **output** code (only with `--diagnostics`).

| Kind | Meaning |
|------|---------|
| `output_parse_recovered` | Output has recoverable parse issues — **error severity** |
| `output_parse_failed` | Output is wholly unparseable — **error severity** |

Both kinds cause a non-zero exit when `--diagnostics` is on, even though the (possibly broken) code is still written.

### `duplicate_declaration`

Separate from parse recovery: detects duplicate `let`/`const`/`class`/import bindings in the same lexical scope. Error severity under `--diagnostics`.

## Formatter failures

`--formatter` runs a final **oxc** formatting pass after decompilation. Formatting is off by default.

### CLI behavior

When oxc cannot format the emitted code, Wakaru **preserves the unformatted output** and prints a stderr warning:

```text
warning: oxc formatter failed for module.js, preserving output: …
```

Formatter failures do **not** cause a non-zero exit in the CLI. The decompiled code is still written.
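
The preserve-on-failure behavior follows a common fallback pattern, sketched below. The function `format_or_preserve` and the closure standing in for the formatter are assumptions for illustration and do not reflect how the CLI actually invokes oxc.

```rust
/// Hypothetical wrapper: run `format`; on failure keep `code` unchanged and
/// return a warning message shaped like the CLI's stderr line.
fn format_or_preserve<F>(filename: &str, code: &str, format: F) -> (String, Option<String>)
where
    F: Fn(&str) -> Result<String, String>,
{
    match format(code) {
        Ok(formatted) => (formatted, None),
        Err(reason) => (
            // The unformatted output survives; only a warning is produced.
            code.to_string(),
            Some(format!(
                "oxc formatter failed for {filename}, preserving output: {reason}"
            )),
        ),
    }
}
```

The caller writes the returned code in both cases and prints the warning, if any, without changing the exit status.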

### WASM / playground behavior

The WASM bindings surface formatter failures as a distinct warning kind not present in `UnpackWarningKind`:

| Kind | Surface |
|------|---------|
| `formatter_failed` | WASM and TypeScript types only |

Message shape:

```text
oxc formatter failed, preserving output: …
```

### When formatting fails

Formatting fails when the **emitted** JavaScript is syntactically invalid for oxc — often the same underlying issue as `output_parse_failed`. If you see formatter warnings:

1. Inspect the raw emitted code (disable `--formatter` to see pre-format output).
2. Rerun with `--diagnostics` to get `output_parse_*` warnings from the core pipeline.
3. File a bug — the root cause is usually a rule emit bug, not the formatter itself.

```bash
wakaru input.js --diagnostics -o output.js
wakaru input.js --formatter --diagnostics -o output.js
```

## Other CLI validation errors

| Error | Cause |
|-------|-------|
| `--unpack requires -o/--output` | Unpack always needs an output directory |
| `cannot decompile a directory` | Single-file mode received a directory — add `--unpack` |
| `multiple input files require --unpack` | Pass several files only with `--unpack` |
| `--source-map is only supported with a single input file` | Source maps apply to one input at a time |
| `no input specified` | No file argument and stdin is a TTY |
| `output path … exists and is not a directory` | `-o` for unpack must be a directory |
| Panic + GitHub issue URL | Unhandled internal error — use the pre-filled link |

## Exit codes and stderr labels

On a TTY, warnings print as `warning:` and errors as `error:` (styled when color is enabled). With `--json`, human stderr summaries are suppressed; warnings live in the JSON object.

Unpack summary line when modules fail:

```text
total: 12 module(s) (3 failed) in 1.24s
```

Followed by:

```text
errors in 3 module(s): chunk-1.js, chunk-4.js, vendor.js
```

Single-file decompile uses the same `errors in N module(s)` pattern when `has_errors()` is true.
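
As a rough sketch of how the summary line and exit status relate to warning severity, consider the following. `WarningSketch`, `error_summary`, and `exit_code` are hypothetical names. Deduplicating filenames in first-seen order and using `1` as the non-zero exit value are assumptions of this sketch, not documented behavior.

```rust
/// Hypothetical, trimmed-down warning record with only the fields needed here.
struct WarningSketch {
    filename: String,
    is_error: bool,
}

/// Builds the `errors in N module(s): ...` line, or `None` when no warning
/// has error severity.
fn error_summary(warnings: &[WarningSketch]) -> Option<String> {
    let mut failed: Vec<&str> = Vec::new();
    for warning in warnings.iter().filter(|w| w.is_error) {
        // Assumption: each module is listed once, in first-seen order.
        if !failed.contains(&warning.filename.as_str()) {
            failed.push(warning.filename.as_str());
        }
    }
    if failed.is_empty() {
        None
    } else {
        Some(format!(
            "errors in {} module(s): {}",
            failed.len(),
            failed.join(", ")
        ))
    }
}

/// Non-zero only when an error-severity warning is present (1 is an assumption).
fn exit_code(warnings: &[WarningSketch]) -> i32 {
    if warnings.iter().any(|w| w.is_error) {
        1
    } else {
        0
    }
}
```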

## Filing a bug report

Use the GitHub **Decompiler Bug Report** template. Required and recommended fields:

| Field | Required | What to provide |
|-------|----------|-----------------|
| Describe the bug | Yes | What you ran, what you expected, what happened |
| Expected behavior | Yes | Correct or desired output behavior |
| Version | No | Output of `wakaru --version` or npm package version |
| Command used | No | Full command; add `--diagnostics` for extra warnings |
| Input code | No | Minimal JS sample, or playground / gist link |
| Reproduction | No | Playground URL or gist — minimal repro strongly preferred |
| Steps to reproduce | No | Build steps, bundler version, flags |
| Actual behavior | No | Paste stderr, `--json` warnings, or bad output |

<Steps>
<Step title="Minimize the input">

Reduce to the smallest file or module that still shows the problem. For bundles, extract one failing module if possible.

</Step>
<Step title="Capture the exact command">

```bash
wakaru bundle.js --unpack --diagnostics --level standard -o out/ 2>stderr.txt
```

</Step>
<Step title="Include both bad and expected output">

Paste the failing module from `out/` and describe what readable structure you expected.

</Step>
<Step title="For panics">

Copy the full panic output. The CLI prints a pre-filled issue URL with version and OS.

</Step>
</Steps>

For contributor-side regression investigation, see [Debug regressions](/debug-regressions) and [Trace the rule pipeline](/trace-rule-pipeline).

## Related pages

<CardGroup>
<Card title="Unpack bundles" href="/unpack-bundles">
Operational guide for `--unpack` modes, directory scanning, and `--force`.
</Card>
<Card title="JSON output and CI" href="/json-output-and-ci">
Warning `kind` strings, `is_error`, and automation patterns.
</Card>
<Card title="CLI reference" href="/cli-reference">
Full flag surface including `--diagnostics`, `--formatter`, and `--force`.
</Card>
<Card title="Debug regressions" href="/debug-regressions">
Rule trace bisection, snapshot layers, and symptom-to-cause mapping.
</Card>
<Card title="Core API reference" href="/core-api-reference">
`UnpackWarningKind`, `has_errors()`, and `DecompileOptions::diagnostics`.
</Card>
</CardGroup>

---

## 22. Debug regressions

> Investigate snapshot drift, raw vs final webpack4 layers, rule trace bisection, profile export, and symptom-to-cause mapping (unresolved_mark, early-rule cascades).

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/22-debug-regressions.md
- Generated: 2026-06-28T01:14:05.618Z

### Source Files

- `docs/debugging.md`
- `docs/testing.md`
- `crates/cli/src/main.rs`
- `.cargo/config.toml`

---
title: "Debug regressions"
description: "Investigate snapshot drift, raw vs final webpack4 layers, rule trace bisection, profile export, and symptom-to-cause mapping (unresolved_mark, early-rule cascades)."
---

Wakaru regression debugging centers on insta snapshot failures in `crates/core/tests/`, the `debug trace` CLI for per-rule bisection on single files, and the webpack4 raw-vs-final snapshot split that separates unpack extraction from decompile pipeline changes. `.cargo/config.toml` sets `INSTA_UPDATE=new`, so drift fails tests and writes `.snap.new` files instead of silently updating committed snapshots.

## Snapshot drift workflow

When a test output changes, insta fails the test and writes a sibling `.snap.new` under `crates/core/tests/snapshots/`. Review every diff before accepting — a snapshot change is valid only when output is semantically better or the fixture expectation intentionally changed.

<Steps>
<Step title="Reproduce the failure">

Run the failing test binary directly:

```bash
cargo test -p wakaru-core --test webpack4_unpack
cargo test -p wakaru-core --test my_rule_rule -- my_specific_test
```

For faster iteration on the core suite, use nextest:

```bash
cargo nextest run -p wakaru-core --test webpack4_unpack
```

</Step>
<Step title="Inspect the diff">

Insta prints a unified diff between the committed `.snap` and the new output. A `.snap.new` file appears next to the original snapshot. Do not commit `.snap.new` files — accept or reject first.

</Step>
<Step title="Accept or reject">

```bash
cargo insta review    # interactive accept/reject per snapshot
cargo insta accept    # bulk-accept all pending .snap.new files
```

For a one-off bulk accept during a run:

```bash
INSTA_UPDATE=always cargo test -p wakaru-core --test webpack4_unpack
```

<Warning>
`INSTA_UPDATE=always` bypasses the review gate. The project deliberately uses `INSTA_UPDATE=new` so regressions cannot land green without human review.
</Warning>

</Step>
<Step title="Add a focused regression test">

Snapshot updates alone do not satisfy the definition of done for rule changes. Add or update a per-rule test in `crates/core/tests/*_rule.rs` that reproduces the exact bug or behavior change.

</Step>
</Steps>

## Webpack4 snapshot layers

Webpack4 unpack tests pin output at two boundaries. Comparing them localizes whether a regression originates in extraction or in the decompile pipeline.

| Layer | Test binary | Snapshot prefix | What it captures |
|-------|-------------|-----------------|------------------|
| Raw | `webpack4_unpack_raw` | `webpack4_unpack_raw__raw_*` | Module code after webpack extraction and bundler-coupled normalization, **before** decompile rules (`SimplifySequence`, `UnEsm`, etc.) |
| Final | `webpack4_unpack` | `webpack4_unpack__*` | Fully decompiled module output after the normal rule pipeline |

```text
webpack4 bundle (testcases/webpack4/dist/index.js)
        │
        ▼
  unpack_webpack4_raw()          ──► webpack4_unpack_raw__raw_*.snap
        │
        ▼
  full unpack() + decompile      ──► webpack4_unpack__*.snap
```

<Note>
Raw snapshots may still contain webpack markers such as `require.r(exports)` and `require.d(...)` getters. These are semantic inputs for later ESM recovery — they are expected in raw output, not raw-layer failures by themselves.
</Note>

**Decision tree when snapshots drift:**

- Raw unchanged, final changed → inspect the decompile pipeline (rule ordering, individual rules, `RewriteLevel`).
- Raw and final both changed → inspect the unpacker or bundler-coupled normalization first, then trace downstream rules.
- Many final snapshots changed at once → suspect an early pipeline rule cascade (see symptom table below).
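
The decision tree above can be written down as a small function, which is sometimes handy in a triage script. Everything here is hypothetical: the `FirstSuspect` enum, the function name, and especially `cascade_threshold`, which is a made-up knob for "many final snapshots changed at once".

```rust
/// Where to look first, following the decision tree (hypothetical names).
#[derive(Debug, PartialEq, Eq)]
enum FirstSuspect {
    Nothing,
    UnpackerOrNormalization,
    EarlyRuleCascade,
    DecompilePipeline,
}

/// `raw_changed` and `final_changed` are counts of drifted snapshots per layer.
fn first_suspect(
    raw_changed: usize,
    final_changed: usize,
    cascade_threshold: usize,
) -> FirstSuspect {
    if raw_changed > 0 {
        // Raw drift points at extraction or bundler-coupled normalization.
        FirstSuspect::UnpackerOrNormalization
    } else if final_changed > 0 && final_changed >= cascade_threshold {
        // Raw is stable but many finals moved: suspect an early rule.
        FirstSuspect::EarlyRuleCascade
    } else if final_changed > 0 {
        FirstSuspect::DecompilePipeline
    } else {
        FirstSuspect::Nothing
    }
}
```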

Run the layer-specific tests:

```bash
cargo test -p wakaru-core --test webpack4_unpack_raw
cargo test -p wakaru-core --test webpack4_unpack
```

## Rule trace bisection

`debug trace` runs the normal single-file rule pipeline and prints the initial source once, followed by a git-style unified diff for each rule that changes rendered output. Use it before manually bisecting with `apply_rules()` and `RulePipelineOptions::between(...)`.

<CodeGroup>

```bash title="Trace all changed rules"
cargo run -p wakaru-cli -- debug trace path/to/module.js
```

```bash title="Include unchanged rules"
cargo run -p wakaru-cli -- debug trace path/to/module.js --all
```

```bash title="Trace a rule range"
cargo run -p wakaru-cli -- debug trace path/to/module.js --from RemoveVoid --until UnEsm
```

```bash title="Write trace to file"
cargo run -p wakaru-cli -- debug trace path/to/module.js -o trace.txt
```

</CodeGroup>

<ParamField body="--from" type="string">
First rule to run. Must match a name from `rule_names()` (e.g. `RemoveVoid`, `UnIife`, `SmartInline`). Second passes use suffixed names like `UnIife2`, `UnParameters2`.
</ParamField>

<ParamField body="--until" type="string">
Last rule to run (inclusive). Unknown rule names produce an error.
</ParamField>

<ParamField body="--all" type="boolean">
Include rules that ran but left rendered output unchanged. These appear as `=== RuleName (unchanged) ===` headers with no diff hunk.
</ParamField>

<ParamField body="--level" type="RewriteLevel" default="standard">
Rewrite aggressiveness passed through to `DecompileOptions`. Affects which rules are enabled.
</ParamField>

<ParamField body="-m / --source-map" type="path">
Optional source map for identifier recovery during trace, same as normal decompile.
</ParamField>

<Warning>
`debug trace` is single-file only. `trace_rules()` rejects bundle inputs because unpack uses a two-phase fact-system pipeline where per-rule tracing would be misleading. For bundle regressions, trace an extracted raw module or reduce the issue to a single-file reproduction.
</Warning>

### Core API equivalent

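```rust
use wakaru_core::{trace_rules, format_trace_events, DecompileOptions, RuleTraceOptions};

// `source` holds the JavaScript text of a single module; bundle inputs are rejected.
// The `?` below assumes an enclosing function that returns a `Result`.
let events = trace_rules(
    source,
    DecompileOptions { filename: "module.js".into(), ..Default::default() },
    RuleTraceOptions {
        start_from: Some("RemoveVoid".into()),  // corresponds to --from
        stop_after: Some("UnEsm".into()),       // corresponds to --until (inclusive)
        only_changed: true,  // default
    },
)?;
// Changed events render as unified diffs, unchanged ones as header-only lines.
let output = format_trace_events(&events);
```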

`RuleTraceEvent` carries `rule`, `changed`, `before`, and `after` strings. `format_trace_events` renders changed events as unified diffs; unchanged events get header-only lines.
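
When a trace is long, the first rule that introduces a symptom can be found programmatically rather than by scrolling. The sketch below uses a stand-in struct, `TraceEventSketch`, with the four documented fields. The field types and the `introducing_rule` helper are assumptions, not part of the `wakaru_core` API.

```rust
/// Stand-in for the documented event fields; the field types are assumptions.
struct TraceEventSketch {
    rule: String,
    changed: bool,
    before: String,
    after: String,
}

/// Returns the first rule whose rendered output gained `symptom`, that is,
/// the text is absent in `before` and present in `after`.
fn introducing_rule<'a>(events: &'a [TraceEventSketch], symptom: &str) -> Option<&'a str> {
    events
        .iter()
        .find(|e| e.changed && !e.before.contains(symptom) && e.after.contains(symptom))
        .map(|e| e.rule.as_str())
}
```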

### Bisection workflow

<Steps>
<Step title="Find the introducing rule">

Run `debug trace` on a representative module. Scroll to the first diff that matches the regression symptom.

</Step>
<Step title="Confirm input shape">

Use `render_pipeline_until(source, "RuleName")` from `crates/core/tests/common/mod.rs` to capture cumulative output just before the suspect rule. Verify the AST shape is what you expect.

</Step>
<Step title="Isolate the rule">

Use `render_pipeline_between(source, "Start", "Stop")` to run only a narrow rule range without downstream effects.

</Step>
<Step title="Check pipeline ordering">

If the issue is ordering rather than a single rule, consult the rule dependency inventory and the ordered `RuleDescriptor` registry in `pipeline.rs`. Fragile orderings are documented with explicit `requires` chains.

</Step>
</Steps>

Test helpers available in `common/mod.rs`:

| Helper | Purpose |
|--------|---------|
| `render_pipeline_until(source, stop_after)` | Pipeline through named rule (inclusive), then emit |
| `render_pipeline_between(source, start, stop)` | Only rules from `start` through `stop` (inclusive) |
| `trace_pipeline(source, options)` | Collect `RuleTraceEvent`s programmatically |
| `changed_rules(source)` | List rule names that changed output |

## Chrome trace profiling

The global `--profile` flag writes a Chrome trace-format file suitable for `chrome://tracing`. Use it to measure parse, resolver, rule, and unpack phase timing rather than to inspect AST diffs.

```bash
cargo run -p wakaru-cli -- --profile trace.json input.js -o output.js
cargo run -p wakaru-cli -- --unpack --profile unpack-trace.json bundle.js -o out/
```

<ParamField body="--profile" type="path" required>
Output file for the Chrome trace profile. Creates the file via `tracing_chrome::ChromeLayerBuilder`.
</ParamField>

<ParamField body="--profile-rules" type="boolean">
Requires `--profile`. Sets subscriber level to `DEBUG` so per-rule `debug_span!("rule", name = ...)` spans appear in the trace. Without it, only `INFO`-level spans (parse, resolver, rules aggregate, emit, unpack phases) are recorded.
</ParamField>

Per-rule spans are emitted inside `apply_rules_impl` at `DEBUG` level. Unpack paths add `INFO` spans for detection (`detect_webpack4`, `detect_webpack5`, etc.) and phase boundaries (`phase1_collect_facts`, `phase2_decompile_modules`).

## Symptom-to-cause mapping

| Symptom | Likely cause | Investigation |
|---------|--------------|---------------|
| Unexpected variable names or wrong binding matches | Missing `unresolved_mark` guard, or matching by `sym` alone instead of `(sym, SyntaxContext)` | Check `id.ctxt.outer() != self.unresolved_mark` gate in the rule; use `BindingRenamer` for renames |
| Many snapshots changed at once | Early pipeline rule cascade | Trace a representative module; inspect `SimplifySequence`, `FlipComparisons`, `RemoveVoid` and other Stage 1 normalization rules |
| Rule not firing in isolated test | Input shape differs from real pipeline | Check raw snapshot or `debug trace` to see AST when the rule receives it; use `render()` instead of `render_rule()` if earlier normalization is required |
| `render_rule` passes but `render` fails | Rule depends on earlier normalization (e.g. helper body-shape matching after `SimplifySequence`) | Use `render_pipeline_until` or full `render()` in the test |
| `cargo test` hangs | Likely infinite recursion in a visitor | `RUST_BACKTRACE=1 cargo test -- --nocapture` |
| Raw snapshot changed unexpectedly | Unpacker or bundler-coupled normalization regression | Compare against `webpack4_unpack_raw`; inspect webpack extraction helpers |
| Final-only snapshot drift | Decompile rule or ordering change | Compare raw (stable) vs final; bisect with `debug trace` on extracted module code |
| Bundle regression, trace fails | Bundle input rejected by design | Extract raw module with `--unpack --raw`, then trace that single file |

### `unresolved_mark` in practice

After `resolver()` runs, free variables (globals like `Object`, `require`) carry `unresolved_mark` as their outer `SyntaxContext`. Rules that match identifiers by name must gate on this mark to avoid rewriting locals, parameters, or imports with the same symbol name. Every new visitor that matches identifiers by name should take `unresolved_mark: Mark` and check `id.ctxt.outer() == self.unresolved_mark` (or the inverse guard to skip bound identifiers).
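
A simplified model of the gate, without any SWC types. Here `IdentSketch::ctxt` is a plain integer standing in for the outer mark of a `SyntaxContext`, and `is_unresolved_global` is a hypothetical helper. Real rules compare `id.ctxt.outer()` against a `Mark`.

```rust
/// Toy identifier: `ctxt` stands in for the outer mark of a `SyntaxContext`.
#[derive(Debug, Clone, PartialEq, Eq)]
struct IdentSketch {
    sym: String,
    ctxt: u32,
}

/// Matches a free (unresolved) global by name. Bound identifiers that merely
/// share the same `sym` are skipped by the early return.
fn is_unresolved_global(id: &IdentSketch, name: &str, unresolved_mark: u32) -> bool {
    if id.ctxt != unresolved_mark {
        return false; // bound local, parameter, or import: leave it alone
    }
    id.sym == name
}
```

Matching on `sym` alone would treat a local variable named `require` the same as the global one, which is the failure mode listed first in the symptom table.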

## Bundle and cross-module regressions

Full bundle unpack runs Phase 1 fact collection after `UnEsm`, then Phase 2 decompilation with cross-module rules (`namespace_decomposition`, cross-module helper refs). Per-rule trace does not cross this barrier.

For bundle-level regressions:

1. Run the relevant pipeline test (`bundle_unpack`, `esbuild_unpack`, `webpack4_unpack`).
2. If webpack4, compare raw vs final layers first.
3. Extract the affected module with `--unpack --raw` and save it to a scratch file.
4. Run `debug trace` on that single module.
5. If the regression involves cross-module facts, inspect the `facts_rule` tests and the two-phase barrier described in the cross-module facts documentation.

## External validation

### Fixture repository

A sibling `wakaru-fixtures` repository (private) contains real-world bundles. After significant rule changes:

```bash
cd ../wakaru-my-worktree
../wakaru-fixtures/run.sh --check       # diff vs committed reference
../wakaru-fixtures/run.sh --update      # accept reviewed improvements
```

The script builds `wakaru-cli` with the `dev-release` profile from the checkout you launch it from, avoiding stale binaries.

### Reproduction matrices

Matrices under `scripts/repro/` test recovery across transpiler versions and minification levels. Current rates live in `scripts/repro/stats.json`.

```bash
cargo build --profile dev-release -p wakaru-cli
export WAKARU="$PWD/target/dev-release/wakaru"
node scripts/repro/array-spread-rest-matrix/matrix.mjs --details
node scripts/repro/collect-stats.mjs --check
```

Build the binary from the same worktree you are validating — a stale `main` binary can produce false pass/fail results.

## Required verification before commit

After fixing a regression, run the full relevant checklist:

```bash
cargo test -p wakaru-core --test my_rule_rule          # focused rule test
cargo test -p wakaru-core --test noop_pipeline
cargo test -p wakaru-core --test webpack4_unpack
cargo test -p wakaru-core --test webpack4_unpack_raw
cargo test -p wakaru-core --test bundle_unpack
cargo test -p wakaru-core --test esbuild_unpack
cargo fmt --check
cargo clippy -p wakaru-core --all-targets -- -D warnings
git status --short    # no stale .snap.new files
```

<Check>
A regression fix is complete when the focused rule test passes, pipeline snapshots are reviewed and accepted (or unchanged), formatting and clippy are clean, and no `.snap.new` files remain in the working tree.
</Check>
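
The "no `.snap.new` files remain" condition can also be checked mechanically. The following is a minimal sketch using only the standard library. The function name `stale_snapshots` is hypothetical, and `git status --short` remains the documented check.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects files whose name ends in `.snap.new` under `dir`.
fn stale_snapshots(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();
        if entry.file_type()?.is_dir() {
            // Descend into subdirectories such as `snapshots/`.
            found.extend(stale_snapshots(&path)?);
        } else if path.to_string_lossy().ends_with(".snap.new") {
            found.push(path);
        }
    }
    found.sort();
    Ok(found)
}
```

Pointing it at `crates/core/tests` and treating a non-empty result as a failure gives a simple local gate before committing.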

## Related pages

<CardGroup>
<Card title="Trace the rule pipeline" href="/trace-rule-pipeline">
CLI and API details for `debug trace`, `--from`/`--until` ranges, and trace output format.
</Card>
<Card title="Testing and snapshots" href="/testing-and-snapshots">
Insta workflow, nextest usage, test helpers, and the pre-commit verification matrix.
</Card>
<Card title="Develop transformation rules" href="/develop-rules">
Test-first rule development, pipeline placement, and `unresolved_mark` requirements.
</Card>
<Card title="Decompile pipeline" href="/decompile-pipeline">
Single-file parse → resolver → rules → fixer → emit flow and `unresolved_mark` scope gating.
</Card>
<Card title="Webpack bundle recipe" href="/webpack-bundle-recipe">
End-to-end webpack4/webpack5 fixture workflow and reference output comparison.
</Card>
<Card title="Troubleshooting" href="/troubleshooting">
Common failure modes, warning kinds, and bug report fields.
</Card>
</CardGroup>

---

## 23. Contributing

> Fork-and-branch workflow, required cargo fmt/clippy/test checks, conventional commits, areas where contributions are most valuable, and links to architecture and testing docs.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/23-contributing.md
- Generated: 2026-06-28T01:15:50.849Z

### Source Files

- `CONTRIBUTING.md`
- `AGENTS.md`
- `README.md`
- `.github/workflows/rust-ci.yml`
- `docs/releasing.md`

---
title: "Contributing"
description: "Fork-and-branch workflow, required cargo fmt/clippy/test checks, conventional commits, areas where contributions are most valuable, and links to architecture and testing docs."
---

Wakaru is a Rust Cargo workspace with three crates: `wakaru-core` (`crates/core/`), `wakaru-cli` (`crates/cli/`), and `wakaru-wasm` (`crates/wasm/`). Nearly all code contributions land in `wakaru-core`: `VisitMut` transformation rules under `src/rules/`, bundle unpackers under `src/unpacker/`, pipeline orchestration in `src/driver.rs`, and per-rule plus integration tests under `tests/`. Pull requests are validated by the `rust-ci` workflow (`cargo fmt --check`, `cargo clippy -- -D warnings`, `cargo nextest run --workspace --profile ci`, and `cargo test --workspace --doc`) on Ubuntu, macOS, and Windows.

## Fork-and-branch workflow

<Steps>
<Step title="Fork and branch">

Fork [pionxzh/wakaru](https://github.com/pionxzh/wakaru) and create a feature branch from `main`.

</Step>
<Step title="Verify the baseline">

From the workspace root, confirm the tree builds and tests pass:

```bash
cargo test
```

</Step>
<Step title="Implement with tests">

Add or update behavior in the relevant crate. Rule and bugfix changes require a focused unit test in `crates/core/tests/` — pipeline snapshot updates alone do not satisfy coverage for an individual rule change.

</Step>
<Step title="Run local checks">

Run the full pre-PR checklist (see [Required checks](#required-checks)).

</Step>
<Step title="Open a pull request">

Push the branch and open a PR against `main`. Use [Conventional Commits](#commit-message-format) and reference the related issue number in the commit message or PR description.

</Step>
</Steps>

<Note>
The `rust-ci` workflow ignores changes under `playground/**`, `website/**`, `*.md`, and `docs/**`. Documentation-only edits do not trigger Rust CI; code changes always do.
</Note>

## Development setup

| Requirement | Details |
|---|---|
| Rust toolchain | Stable Rust via [rustup](https://rustup.rs/) |
| Workspace root | All `cargo` commands run from the repository root |
| Optional: `cargo-insta` | `cargo install cargo-insta` for interactive snapshot review |
| Optional: faster tests | `cargo install cargo-nextest --locked` (CI uses nextest; local runs are ~25× faster than plain `cargo test` for the core suite) |

Build and exercise the CLI during development:

```bash
cargo build
cargo run -p wakaru-cli -- input.js -o output.js
cargo run -p wakaru-cli -- --unpack bundle.js -o unpacked/
cargo run -p wakaru-cli -- debug trace path/to/module.js
```

### Optional shallow git dependencies

The formatter depends on pinned OXC crates from git. On a cold Cargo cache, fetches can be slow. With nightly Cargo:

```bash
cargo +nightly fetch -Zgit=shallow-deps
```

Stable Cargo works without this step.

## Workspace layout

| Crate | Path | Role |
|---|---|---|
| `wakaru-core` | `crates/core/` | Decompile pipeline, ~60 transformation rules, unpackers, public API |
| `wakaru-cli` | `crates/cli/` | `wakaru` binary (`clap`) |
| `wakaru-wasm` | `crates/wasm/` | WASM bindings for the browser playground |

Within `wakaru-core`:

```text
crates/core/
├── src/rules/          # one file per rule; pipeline.rs holds RuleDescriptor order
├── src/unpacker/       # bundle format detection and module extraction
├── src/driver.rs       # decompile and unpack orchestration
└── tests/              # per-rule tests, pipeline integration tests, snapshot fixtures
```

## Required checks

Run these before opening a PR. CI enforces the same gates (with nextest instead of plain `cargo test`).

```bash
cargo fmt --check
cargo clippy -- -D warnings
cargo test
```

For day-to-day core development, prefer nextest and scoped clippy:

```bash
cargo nextest run -p wakaru-core
cargo nextest run --workspace
cargo clippy -p wakaru-core --all-targets -- -D warnings
```

When touching `wakaru-cli`, `wakaru-wasm`, or shared workspace code:

```bash
cargo clippy --workspace --all-targets -- -D warnings
```

### Snapshot workflow

`.cargo/config.toml` sets `INSTA_UPDATE=new`: a changed snapshot **fails** the test and writes a `.snap.new` file. CI sets `INSTA_UPDATE=no`.

1. Review the `.snap.new` diff.
2. Accept intentional changes: `cargo insta accept` (or `INSTA_UPDATE=always cargo test` for a one-off bulk accept).
3. Confirm `git status --short` shows no stale `.snap.new` files.

<Warning>
Snapshot drift that is merely different — not semantically better — is a regression. Inspect every snapshot change before accepting.
</Warning>

### Definition of done for rule changes

1. Focused rule tests for the touched behavior.
2. Pipeline integration tests:

```bash
cargo test -p wakaru-core --test noop_pipeline
cargo test -p wakaru-core --test webpack4_unpack
cargo test -p wakaru-core --test webpack4_unpack_raw
cargo test -p wakaru-core --test bundle_unpack
cargo test -p wakaru-core --test esbuild_unpack
```

3. `cargo fmt --check` and scoped or workspace `cargo clippy`.
4. Snapshot diffs reviewed; no unrelated changes in `git status --short`.

## Invariants for code changes

These constraints apply to every rule or rename contribution:

| Rule | Requirement |
|---|---|
| Unit tests | No code change ships without a corresponding unit test |
| Identifier matching | Rules matching identifiers by name must take `unresolved_mark: Mark` and skip identifiers where `id.ctxt.outer() != self.unresolved_mark` |
| Renames | Use `rename_utils::BindingRenamer` (`rename_bindings_in_module` / `rename_bindings`); never rename by `sym` alone |
| Pipeline placement | New rules register in `crates/core/src/rules/pipeline.rs` at a position that satisfies upstream dependencies (e.g. after `UnBracketNotation` for `["default"]` → `.default`; before `UnEsm` when `require()` must still exist; before the second `UnIife` pass when creating IIFEs) |
| Formatting | Run `cargo fmt --check`; limit `cargo fmt` to files you intentionally changed |
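
The pipeline-placement invariant amounts to checking "must run before" constraints against an ordered list of rule names. A minimal sketch, assuming the order is available as a slice of names: `check_order` and the `MyRule` placeholder are hypothetical, and the real registry in `pipeline.rs` uses `RuleDescriptor` entries rather than plain strings.

```rust
/// Checks that every `(earlier, later)` pair holds in `order`.
/// Returns a message for the first violated or unresolvable constraint.
fn check_order(order: &[&str], requires: &[(&str, &str)]) -> Result<(), String> {
    let position = |name: &str| order.iter().position(|rule| *rule == name);
    for &(earlier, later) in requires {
        match (position(earlier), position(later)) {
            (Some(a), Some(b)) if a < b => {}
            (Some(_), Some(_)) => {
                return Err(format!("{earlier} must run before {later}"));
            }
            _ => {
                return Err(format!("unknown rule in constraint: {earlier} -> {later}"));
            }
        }
    }
    Ok(())
}
```

For example, a rule that needs `.default` access and still-present `require()` calls would be checked with the pairs `("UnBracketNotation", "MyRule")` and `("MyRule", "UnEsm")`.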

## High-value contribution areas

| Area | What helps |
|---|---|
| Real-world bundles | Samples Wakaru fails to unpack or decompile (webpack 4/5, esbuild, Bun, Browserify, SystemJS, AMD) |
| Helper detection | Missing transpiler helpers (Babel, TypeScript/tslib, SWC) or false-positive matches |
| Correctness | Semantic bugs, TDZ issues, incorrect rewrite output at any `RewriteLevel` |
| New rules | `VisitMut` rules that recover idiomatic ESNext from minified patterns |
| Unpackers | New or improved bundle-format detection in `src/unpacker/` |

The most common code contribution is adding or extending a transformation rule. That workflow is documented on the develop-rules page; pipeline ordering dependencies are listed in `docs/rule-dependency-inventory.md`.

## Bug reports

Use the GitHub **Decompiler Bug Report** issue template. Include:

| Field | Content |
|---|---|
| Input code | Minimal reproduction (playground link, gist, or inline sample) |
| Command | Full `wakaru` invocation (add `--diagnostics` when warnings help) |
| Version | Output of `wakaru --version` or npm package version |
| Expected vs actual | What output you expected and what Wakaru produced |

The README lists the minimum needed for quick triage: the input code, the command you ran, the current output, and the expected output.

## Commit message format

Wakaru follows [Conventional Commits](https://www.conventionalcommits.org/):

```text
feat: add UnNullishCoalescing rule
fix: handle nested ternary in UnConditionals
test: add edge case for arrow function with rest params
refactor: extract shared helper into babel_helper_utils
docs: update architecture diagram for two-phase pipeline
```

Mention the issue number in the commit message or PR description.
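
A commit subject in the format above can be checked mechanically. The function below is a minimal sketch, not part of Wakaru's tooling: `is_conventional_subject` is a hypothetical name, and it only validates the `type(scope)!: description` shape without restricting the set of allowed types.

```rust
/// Returns true when `subject` has the shape `type: description`,
/// optionally with a `(scope)` and a trailing `!` before the colon.
/// Hypothetical helper; it does not restrict which types are allowed.
fn is_conventional_subject(subject: &str) -> bool {
    // The type/scope head and the description are separated by ": ".
    let (head, description) = match subject.split_once(": ") {
        Some(parts) => parts,
        None => return false,
    };
    if description.trim().is_empty() {
        return false;
    }
    // A trailing `!` marks a breaking change and is optional.
    let head = head.strip_suffix('!').unwrap_or(head);
    let (kind, scope) = match head.split_once('(') {
        Some((kind, rest)) => match rest.strip_suffix(')') {
            Some(scope) => (kind, Some(scope)),
            None => return false,
        },
        None => (head, None),
    };
    let kind_ok = !kind.is_empty() && kind.chars().all(|c| c.is_ascii_lowercase());
    let scope_ok = scope.map_or(true, |s| {
        !s.is_empty() && !s.chars().any(|c| c == '(' || c == ')' || c == ' ')
    });
    kind_ok && scope_ok
}
```

The check accepts every example subject listed above and rejects free-form subjects such as `added stuff`.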

## Architecture and testing references

Read these before changing pipeline behavior:

| Document | Topic |
|---|---|
| `docs/architecture.md` | Pipeline flow, components, `unresolved_mark` scope gating |
| `docs/testing.md` | Test helpers (`render`, `render_rule`, `render_pipeline_until`), verification matrix |
| `docs/helper-detection.md` | Transpiler helper matching by AST body shape |
| `docs/debugging.md` | Rule trace CLI, snapshot layers, fixture workflow |

Repository agents and maintainers also reference `AGENTS.md` for the canonical definition-of-done checklist and code-review self-check prompts.

## Related pages

<CardGroup>
<Card title="Develop transformation rules" href="/develop-rules">
Test-first workflow for adding `VisitMut` rules, pipeline placement, and the verification checklist.
</Card>
<Card title="Testing and snapshots" href="/testing-and-snapshots">
`cargo nextest` vs `cargo test`, insta workflow, pipeline test binaries, and pre-commit matrix.
</Card>
<Card title="Helper detection" href="/helper-detection">
How Babel, TypeScript, and SWC helpers are matched across imported, inlined, hoisted, and minified forms.
</Card>
<Card title="Trace the rule pipeline" href="/trace-rule-pipeline">
`debug trace` for per-rule diffs, `--from`/`--until` ranges, and regression bisection.
</Card>
<Card title="Debug regressions" href="/debugging-regressions">
Snapshot drift investigation, early-rule cascades, and `unresolved_mark` symptom mapping.
</Card>
</CardGroup>

---

## 24. Testing and snapshots

> cargo nextest vs cargo test, insta snapshot workflow, required pipeline test binaries, rule-level test patterns, Test262 round-trip coverage, and pre-commit verification matrix.

- Page Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/pages/24-testing-and-snapshots.md
- Generated: 2026-06-28T01:17:23.566Z

### Source Files

- `docs/testing.md`
- `.cargo/config.toml`
- `.config/nextest.toml`
- `docs/test262-roundtrip.md`
- `scripts/correctness/test262-roundtrip.mjs`
- `.github/workflows/rust-ci.yml`

---
title: Testing and snapshots
description: cargo nextest vs cargo test, insta snapshot workflow, required pipeline test binaries, rule-level test patterns, Test262 round-trip coverage, and pre-commit verification matrix.
---

Wakaru's test suite lives almost entirely under `crates/core/tests/` — roughly 100 integration-test binaries plus shared helpers in `common/mod.rs`. Tests fall into three layers: **rule-level unit tests** that pin one transformation, **pipeline snapshot tests** that pin end-to-end unpack/decompile output, and **semantic round-trip checks** against the Test262 corpus. Every rule change needs a focused regression test; snapshot updates alone are not sufficient coverage.

## Running tests

Wakaru uses two Rust test runners. Pick based on whether you need speed or surgical focus.

| Runner | When to use | Notes |
|---|---|---|
| `cargo nextest` | Day-to-day development, CI-equivalent full runs | One global parallel pool; much faster than `cargo test` across the roughly 100 test binaries |
| `cargo test` | Single binary or single test name | Sequential per-binary execution; best for tight iteration |

<CodeGroup>
```bash title="Full core suite (recommended)"
cargo nextest run -p wakaru-core
```

```bash title="Entire workspace"
cargo nextest run --workspace
```

```bash title="Single rule test file"
cargo test -p wakaru-core --test un_double_negation_rule
```

```bash title="Single named test"
cargo test -p wakaru-core --test smart_inline_rule -- inline_single_use
```
</CodeGroup>

Install nextest once:

```bash
cargo install cargo-nextest --locked
```

CI runs `cargo nextest run --workspace --profile ci` using the `ci` profile in `.config/nextest.toml`, which sets `fail-fast = false` (report every failure) and a 60-second slow-timeout period, with a test forcibly terminated after three such periods. nextest does **not** run doctests; CI keeps a `cargo test --workspace --doc` guard even though there are no doctests today.

## Insta snapshot workflow

Pipeline and some rule tests use [insta](https://insta.rs/) to pin emitted JavaScript. Snapshots are committed as `.snap` files under `crates/core/tests/snapshots/`.

### Local vs CI behavior

| Environment | `INSTA_UPDATE` | Behavior on drift |
|---|---|---|
| Local (`.cargo/config.toml`) | `new` | Test **fails**; writes a `.snap.new` for review |
| CI (`.github/workflows/rust-ci.yml`) | `no` | Test **fails**; no silent acceptance |

The `new` setting prevents regressions from landing green just because nobody inspected the diff. The earlier `always` setting rewrote snapshots on every run, which masked real regressions.
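
The policy in the table can be summarized as a small decision function. This is a sketch that models only the three `INSTA_UPDATE` values mentioned on this page; `UpdateMode`, `Outcome`, and `on_assert` are hypothetical names, and the code is not how insta itself is implemented.

```rust
/// The three `INSTA_UPDATE` values this page mentions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum UpdateMode {
    New,
    No,
    Always,
}

/// What happens to a single snapshot assertion.
#[derive(Debug, PartialEq, Eq)]
struct Outcome {
    test_fails: bool,
    writes_snap_new: bool,
    overwrites_snap: bool,
}

/// Models one assertion: `drifted` is true when the emitted output
/// differs from the committed `.snap` file.
fn on_assert(mode: UpdateMode, drifted: bool) -> Outcome {
    if !drifted {
        // Matching output passes in every mode and writes nothing.
        return Outcome { test_fails: false, writes_snap_new: false, overwrites_snap: false };
    }
    match mode {
        // Local default: fail and leave a `.snap.new` for review.
        UpdateMode::New => Outcome { test_fails: true, writes_snap_new: true, overwrites_snap: false },
        // CI: fail without touching the filesystem.
        UpdateMode::No => Outcome { test_fails: true, writes_snap_new: false, overwrites_snap: false },
        // Bulk accept: the committed snapshot is rewritten and the test passes.
        UpdateMode::Always => Outcome { test_fails: false, writes_snap_new: false, overwrites_snap: true },
    }
}
```

The `Always` arm is the only one where drift does not fail the test, which is why it is reserved for deliberate one-off accepts.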

<Steps>
<Step title="Run tests and observe drift">

Run the affected test binary. A changed snapshot fails the test and leaves a `.snap.new` alongside the committed `.snap`.

```bash
cargo test -p wakaru-core --test webpack4_unpack
```

</Step>

<Step title="Review the diff">

Inspect each `.snap.new` against its committed counterpart. Accept only when output is **semantically better** or the fixture expectation intentionally changed — "different" without "better" is a regression.

</Step>

<Step title="Accept intentional changes">

```bash
cargo insta review    # interactive accept/reject per snapshot
cargo insta accept    # accept all pending .snap.new files
```

For a one-off bulk accept during development:

```bash
INSTA_UPDATE=always cargo test -p wakaru-core --test webpack4_unpack
```

</Step>

<Step title="Verify cleanliness before commit">

```bash
git status --short
```

No `.snap.new` files should remain. Committed `.snap` files should reflect reviewed changes only.

</Step>
</Steps>
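
The final cleanliness step can also be automated. The following sketch walks a directory tree with the standard library and lists leftover `.snap.new` files; `pending_snapshots` is a hypothetical helper, not something shipped in the repository.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects files whose name ends in `.snap.new` under `root`.
/// Hypothetical helper for illustration; results are sorted for stable output.
fn pending_snapshots(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut pending = Vec::new();
    let mut stack = vec![root.to_path_buf()];
    while let Some(dir) = stack.pop() {
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            if path.is_dir() {
                // Descend into subdirectories iteratively.
                stack.push(path);
            } else if path
                .file_name()
                .and_then(|name| name.to_str())
                .map_or(false, |name| name.ends_with(".snap.new"))
            {
                pending.push(path);
            }
        }
    }
    pending.sort();
    Ok(pending)
}
```

Calling `pending_snapshots(Path::new("crates/core/tests/snapshots"))` from the repository root before a commit should return an empty list.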

### Snapshot test patterns

**Per-module pipeline snapshots** — webpack4 final output pins every extracted module:

```rust
for (filename, code) in &pairs {
    let snap_name = filename.trim_end_matches(".js");
    insta::assert_snapshot!(snap_name, code);
}
```

**Raw extraction layer** — `webpack4_unpack_raw.rs` pins pre-rule code (after webpack normalization, before decompile rules). Useful for bisecting whether drift comes from unpack extraction or downstream rules.

**Named snapshots** — some tests pass an explicit name: `insta::assert_snapshot!("rollup_es_decompile", output)`.

When snapshots change unexpectedly, use rule tracing and the raw-vs-final layer split described on the debugging regressions page.

## Test organization

All test files live under `crates/core/tests/`. **Default: add tests to the existing file for the rule you are changing.** Create a new file only when adding a new rule.

:::files
crates/core/tests/
├── common/mod.rs              # Shared helpers (render, render_rule, normalize, …)
├── *_rule.rs                  # Per-rule unit tests (one file per rule)
├── noop_pipeline.rs           # Stability: inputs that pass through unchanged
├── webpack4_unpack.rs         # Final webpack4 decompile snapshots
├── webpack4_unpack_raw.rs     # webpack4 raw extraction snapshots
├── bundle_unpack.rs           # webpack5 + browserify pipeline tests
├── esbuild_unpack.rs          # esbuild/Bun scope-hoisted detection and unpack
├── systemjs_unpack.rs         # SystemJS generated fixtures
├── webpack5_chunk_unpack.rs   # webpack5 chunk splitting
├── multi_file_unpack.rs       # Entry + chunk multi-input workflows
├── facts_rule.rs              # Cross-module fact extraction
├── pipeline_helpers_rule.rs   # Transpiler helper detection + restoration
├── decompile_options_rule.rs  # DecompileOptions configuration
└── snapshots/                 # Insta snapshot files (auto-generated, committed)
:::

## Rule-level test patterns

There are two primary patterns: **full-pipeline** and **isolated rule**.

### Full-pipeline test

Use `render(input)` when the rule depends on earlier normalization (helper detection, `SimplifySequence`, bracket notation, etc.):

```rust
mod common;
use common::{assert_eq_normalized, render};

#[test]
fn my_feature_test() {
    let input = r#"void 0"#;
    let expected = r#"undefined"#;
    assert_eq_normalized(&render(input), expected);
}
```

### Isolated rule test

Use `render_rule(input, builder)` for most per-rule tests. This runs resolver + one rule + fixer:

```rust
mod common;
use common::{assert_eq_normalized, render_rule};
use wakaru_core::rules::UnDoubleNegation;

fn apply(input: &str) -> String {
    render_rule(input, |_| UnDoubleNegation)
}

#[test]
fn strips_double_bang_in_if() {
    assert_eq_normalized(&apply("if (!!x) { a(); }"), "if (x) { a(); }");
}
```

For rules that match identifiers by name, pass `unresolved_mark`:

```rust
render_rule(input, |unresolved_mark| MyRule::new(unresolved_mark))
```

### Test helpers

| Helper | Purpose |
|---|---|
| `render(source)` | Full decompile pipeline |
| `render_rule(source, builder)` | Single rule in isolation |
| `render_rule_with_filename(source, filename, builder)` | Same, with custom filename (`.ts`/`.tsx` parsing) |
| `render_pipeline_until(source, stop_after)` | Pipeline up to a named rule (inclusive) |
| `render_pipeline_between(source, start, stop)` | Pipeline from `start` through `stop` |
| `trace_pipeline(source, options)` | Collect `RuleTraceEvent`s |
| `changed_rules(source)` | List rule names that changed output |
| `normalize(input)` | Parse + re-emit to normalize whitespace |
| `assert_eq_normalized(actual, expected)` | Compare after normalizing both sides |

Rule names in `render_pipeline_until` match struct names (e.g. `SmartInline`, `UnEsm`). Second passes use suffixed names: `UnWebpackInterop2`, `UnIife2`.

### Choosing `render` vs `render_rule`

Most rule tests use `render_rule`. Switch to `render` or `render_pipeline_until` when:

- The rule depends on **helper detection** (`LocalHelperContext::collect` scanning function bodies).
- Your test input contains forms normalized in Stage 1 (`void 0`, bracket notation, indirect calls, comma expressions).

If `render_rule` produces unchanged output but `render` works, the rule depends on earlier normalization.

### Common pitfalls

- Do not use bare literal expression statements as inputs (`65536;`) — `SimplifySequence` drops them as dead code. Use `const x = 65536;` instead.
- Full-pipeline tests may transform input before your rule runs. Isolate with `render_rule` or stop early with `render_pipeline_until`.
- A snapshot update for a pipeline test does **not** replace a focused rule regression test. Both are required for rule changes.

## Required pipeline test binaries

Before committing code changes, run these five pipeline binaries at minimum. They cover noop stability, webpack4 final and raw layers, and webpack5/browserify/esbuild unpack paths.

```bash
cargo test -p wakaru-core --test noop_pipeline
cargo test -p wakaru-core --test webpack4_unpack
cargo test -p wakaru-core --test webpack4_unpack_raw
cargo test -p wakaru-core --test bundle_unpack
cargo test -p wakaru-core --test esbuild_unpack
```

| Binary | What it guards |
|---|---|
| `noop_pipeline` | Decompile stability on bundled fixtures; noop output snapshot |
| `webpack4_unpack` | Module extraction counts + per-module decompile snapshots |
| `webpack4_unpack_raw` | Raw extraction before decompile rules |
| `bundle_unpack` | webpack5 and browserify unpack + interop recovery |
| `esbuild_unpack` | esbuild/Bun scope-hoisted detection, strict vs heuristic split |

Additional binaries exist for SystemJS, webpack5 chunks, multi-file unpack, cross-module facts, helper pipelines, and `DecompileOptions` — run them when your change touches those areas.
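
For contributors who prefer a single entry point, the five commands above can be driven from a small Rust program. This is a sketch only: `run_required_pipeline_tests` and `cargo_test_args` are hypothetical helpers, and the code assumes `cargo` is on `PATH` and the working directory is the Wakaru checkout.

```rust
use std::process::Command;

/// The five required pipeline test binaries named on this page.
const REQUIRED_PIPELINE_TESTS: [&str; 5] = [
    "noop_pipeline",
    "webpack4_unpack",
    "webpack4_unpack_raw",
    "bundle_unpack",
    "esbuild_unpack",
];

/// Builds the `cargo test` argument list for one test binary.
fn cargo_test_args(binary: &str) -> Vec<String> {
    ["test", "-p", "wakaru-core", "--test", binary]
        .iter()
        .map(|arg| arg.to_string())
        .collect()
}

/// Runs each required binary in order and returns the names that failed.
/// Hypothetical helper: it keeps going after a failure so that every
/// failing binary is reported in one pass.
fn run_required_pipeline_tests() -> std::io::Result<Vec<&'static str>> {
    let mut failed = Vec::new();
    for &binary in REQUIRED_PIPELINE_TESTS.iter() {
        let status = Command::new("cargo")
            .args(cargo_test_args(binary))
            .status()?;
        if !status.success() {
            failed.push(binary);
        }
    }
    Ok(failed)
}
```

An empty result from `run_required_pipeline_tests()` means all five binaries passed.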

## Test262 round-trip coverage

Beyond Rust unit and snapshot tests, `scripts/correctness/test262-roundtrip.mjs` checks **semantic preservation**: run a Test262 file in Node's `vm`, transform it with a selected producer pipeline, decompile with Wakaru, then run the decompiled result through the same Test262 harness. The Test262 harness remains the oracle — original, transformed, and decompiled code must all pass.

### How a case runs

```mermaid
flowchart LR
  A[Test262 source] --> B[Baseline VM run]
  B --> C[Transform via pipeline]
  C --> D[Transformed VM run]
  D --> E[Wakaru decompile]
  E --> F[Decompiled VM run]
  F --> G{All pass?}
  G -->|yes| H[passed]
  G -->|no| I[failed]
```

The runner operates in three phases: classify tests and run pre-decompile checks, batch-decompile pending cases in parallel (bounded by `cpus - 2`), then verify decompiled output against the harness.
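
The `cpus - 2` bound mentioned above is simple arithmetic, but it needs care on small machines. The sketch below expresses it in Rust for illustration (the actual runner is a Node script); the clamp to a minimum of one worker is an assumption added here, and both function names are hypothetical.

```rust
use std::thread;

/// Worker bound of `cpus - 2`, clamped so at least one worker remains.
/// The clamp to 1 is an assumption for machines with one or two cores.
fn worker_bound(cpus: usize) -> usize {
    cpus.saturating_sub(2).max(1)
}

/// Reads the CPU count from the standard library, defaulting to 1
/// when the platform cannot report it.
fn detected_worker_bound() -> usize {
    let cpus = thread::available_parallelism()
        .map(|count| count.get())
        .unwrap_or(1);
    worker_bound(cpus)
}
```

Leaving two cores free keeps the machine responsive while the decompile batch runs.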

### Key options

<ParamField body="--preset" type="string">
Named Test262 path set: `default`, `classes`, `destructuring`, `modules`, `operators`, and others. Prefer presets over running the entire Test262 tree.
</ParamField>

<ParamField body="--pipeline" type="string" default="terser-light">
Producer before Wakaru: `none`, `terser-light`, `terser-full`, `babel-env-terser`, `swc-minify`, `esbuild-minify`.
</ParamField>

<ParamField body="--level" type="string" default="minimal">
Wakaru rewrite level passed to the CLI: `minimal`, `standard`, `aggressive`.
</ParamField>

<ParamField body="--limit" type="number | all" default="25">
Maximum runnable tests. Use `all` for full slice runs.
</ParamField>

<ParamField body="--case-timeout-ms" type="number" default="5000">
Per-test timeout. Timeouts record as `rejected` with reason `case-timeout`.
</ParamField>

<ParamField body="--summary" type="path">
Write a deterministic Markdown report (no timestamps) suitable for git diff review.
</ParamField>

<ParamField body="--json" type="path">
Write full JSON report. Updated after each processed test; interrupted runs leave `complete: false`.
</ParamField>

<RequestExample>

```bash
# Quick smoke (25 cases, default paths)
node scripts/correctness/test262-roundtrip.mjs --limit 25

# Full default slice with reviewable summary
node scripts/correctness/test262-roundtrip.mjs --limit all --summary target/test262-default.md

# Classes slice through SWC minifier
node scripts/correctness/test262-roundtrip.mjs --preset classes --pipeline swc-minify --limit 100 --summary target/test262-classes-swc.md

# ESM module graphs (not bundle testing)
node scripts/correctness/test262-roundtrip.mjs --preset modules --pipeline swc-minify --limit all --case-timeout-ms 2000 --summary target/test262-modules-graph-swc.md

# Rerun failures from a previous report
node scripts/correctness/test262-roundtrip.mjs --rerun-from target/test262-default.json --rerun-status failed --json target/test262-default-rerun.json
```

</RequestExample>

### Status buckets

| Status | Meaning |
|---|---|
| `passed` | Original, transformed, and decompiled code all pass |
| `unsupported` | Local Node/vm/SWC setup cannot run this input |
| `rejected` | Transform/minifier or known non-Wakaru blocker stops the case before semantic evaluation |
| `failed` | Current Wakaru correctness candidate |

Known non-Wakaru classifications live in `scripts/correctness/test262-known-blockers.json`. Keep entries narrow; do not mask real Wakaru semantic failures.
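
One way to read the status table is as an ordered classification. The sketch below is an interpretation, not the runner's actual logic: `CaseOutcome`, its fields, and the order of the checks are assumptions made for illustration.

```rust
/// The four status buckets from the table above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Status {
    Passed,
    Unsupported,
    Rejected,
    Failed,
}

/// Hypothetical summary of one Test262 case after all phases.
struct CaseOutcome {
    /// The original file ran under the local Node/vm setup.
    baseline_ran: bool,
    /// The producer pipeline succeeded and its output passed the harness.
    transform_ok: bool,
    /// The case matches an entry in the known-blockers list.
    known_blocker: bool,
    /// The decompiled output passed the harness.
    decompiled_passed: bool,
}

fn classify(outcome: &CaseOutcome) -> Status {
    // Environment problems come first: nothing can be concluded about Wakaru.
    if !outcome.baseline_ran {
        return Status::Unsupported;
    }
    // Producer failures and known blockers stop the case before
    // semantic evaluation of the decompiled output.
    if !outcome.transform_ok || outcome.known_blocker {
        return Status::Rejected;
    }
    // Only now does the result say something about Wakaru itself.
    if outcome.decompiled_passed {
        Status::Passed
    } else {
        Status::Failed
    }
}
```

Under this reading, a case can only land in `failed` after every non-Wakaru explanation has been ruled out, which is why the known-blockers list should stay narrow.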

### Baselines and stats

Tracked baseline summaries live in `docs/test262-baselines/`, grouped by producer pipeline and Test262 slice. Regenerate the full normal matrix:

```bash
node scripts/correctness/test262-baseline-matrix.mjs
```

Refresh a subset:

```bash
node scripts/correctness/test262-baseline-matrix.mjs --producer swc-minify --slice operators
```

Cached totals for quick reads: `scripts/correctness/test262-stats.json`, updated via `node scripts/correctness/test262-collect-stats.mjs`.

The roundtrip runner requires a built `wakaru` binary (`cargo build -p wakaru-cli`) or a `WAKARU` environment variable pointing to one. It does not fall back to `cargo run`.

## Pre-commit verification matrix

Run checks from the **worktree that contains your changes** — the directory with this repo's `Cargo.toml`. Do not reuse binaries from another checkout.

<Steps>
<Step title="Focused regression test">

```bash
cargo test -p wakaru-core --test <rule>_rule
```

Every code change needs a corresponding unit test. Snapshot-only updates are insufficient for rule changes.

</Step>

<Step title="Pipeline tests">

Run the five required pipeline binaries listed above.

</Step>

<Step title="Formatting and linting">

```bash
cargo fmt --check
cargo clippy -p wakaru-core --all-targets -- -D warnings
```

Use `cargo clippy --workspace --all-targets -- -D warnings` when touching non-core crates or shared workspace code.

</Step>

<Step title="Optional: release CLI binary">

Only when you need a standalone binary for reproduction matrices or CLI validation:

```bash
cargo build --profile dev-release -p wakaru-cli
export WAKARU="$PWD/target/dev-release/wakaru"
```

The `dev-release` profile inherits `release` with thinner LTO for faster local builds. Fixture runners auto-detect and build the checkout they are launched from.

</Step>

<Step title="Optional: external fixtures">

When the sibling `wakaru-fixtures` repository is checked out and your change affects decompile output, unpacking, bundler behavior, rule ordering, helper detection, or CLI behavior:

```bash
../wakaru-fixtures/run.sh --check
```

Accept deliberate output improvements with `../wakaru-fixtures/run.sh --update` and commit the `outputs/` change in the fixtures repo.

</Step>

<Step title="Final cleanliness">

```bash
git diff --check
git status --short
```

Confirm no stale `.snap.new` files or unrelated changes remain.

</Step>
</Steps>

### CI parity summary

| Check | Local command | CI job |
|---|---|---|
| Format | `cargo fmt --check` | `fmt` |
| Lint | `cargo clippy -- -D warnings` | `check` (all OS matrix) |
| Tests | `cargo nextest run --workspace --profile ci` | `check` |
| Doctests | `cargo test --workspace --doc` | `check` |
| Snapshot policy | `INSTA_UPDATE=new` (fail + `.snap.new`) | `INSTA_UPDATE=no` (fail, no write) |

For faster local iteration, `cargo nextest run -p wakaru-core` covers the bulk of the suite. Reserve the full workspace + `ci` profile run before pushing.

## Reproduction matrices

Under `scripts/repro/`, matrices test recovery across transpiler versions, modes, and minification levels. Current rates are cached in `scripts/repro/stats.json`.

```bash
node scripts/repro/collect-stats.mjs          # regenerate stats
node scripts/repro/collect-stats.mjs --check  # verify freshness
node scripts/repro/array-spread-rest-matrix/matrix.mjs --details
```

Every new matrix should spread `...mangleValidator()` from `lib/compare.mjs` into its `runMatrix()` config for structural comparison of mangled shapes.

## Related pages

<CardGroup>
<Card title="Develop transformation rules" href="/develop-rules">
Test-first workflow, pipeline placement, unresolved_mark guards, and the definition-of-done checklist for new rules.
</Card>

<Card title="Debug regressions" href="/debugging-regressions">
Investigate snapshot drift, raw vs final webpack4 layers, and rule trace bisection.
</Card>

<Card title="Trace the rule pipeline" href="/trace-rule-pipeline">
Per-rule diffs and --from/--until ranges for narrowing regressions.
</Card>

<Card title="Contributing" href="/contributing">
Fork-and-branch workflow, required checks, and conventional commits.
</Card>

<Card title="Webpack bundle recipe" href="/webpack-bundle-recipe">
Build fixtures and compare unpack output against reference dist files.
</Card>

<Card title="Esbuild and Browserify recipe" href="/esbuild-browserify-recipe">
Verify esbuild and browserify unpack with testcase commands.
</Card>
</CardGroup>

---