# Decompile pipeline

> Single-file flow: parse, resolver marks, staged rule application, optional source-map rename, fixer, emit; parallel execution during unpack; unresolved_mark scope gating.

- Repository: pionxzh/wakaru
- GitHub: https://github.com/pionxzh/wakaru
- Human docs: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b
- Complete Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/llms-full.txt

## Source Files

- `docs/architecture.md`
- `crates/core/src/driver.rs`
- `crates/core/src/rules/pipeline.rs`
- `crates/core/src/rules/mod.rs`
- `docs/rule-dependency-inventory.md`

---


Wakaru's decompile pipeline turns a single JavaScript module into readable ESNext by parsing it, running SWC's resolver, and then applying a fixed-order rule registry to the AST. The public entry point is `decompile()` in `crates/core/src/driver/single_file.rs`; bundle unpacking reuses the same rule machinery per module inside a two-phase, rayon-parallel driver in `crates/core/src/driver/unpack/phases.rs`.

## Pipeline overview

```mermaid
flowchart TB
    subgraph single["Single-file decompile()"]
        IN[input string] --> PARSE[parse_js_with_recovery]
        PARSE --> RES[resolver marks]
        RES --> RULES[apply_rules — full registry]
        RULES --> SM{sourcemap bytes?}
        SM -->|yes| RENAME[ImportDedup → apply_sourcemap_renames → UnImportRename]
        SM -->|no| FIX
        RENAME --> FIX[apply_fixer]
        FIX --> EMIT[print_js / print_js_with_srcmap]
        EMIT --> OUT[DecompileOutput]
    end

    subgraph unpack["Unpack multi-module"]
        SPLIT[unpacker extract] --> P1[Phase 1: par_iter until UnEsm + facts]
        P1 --> BARRIER[ModuleFactsMap barrier]
        BARRIER --> P2[Phase 2: par_iter late pass + UnObjectSpread2..UnReturn]
        P2 --> OUT2[UnpackOutput modules]
    end
```

| Mode | Entry function | Rule scope per module |
|------|----------------|----------------------|
| Single file | `decompile(source, DecompileOptions)` | Full registry (`start_from` / `stop_after` optional via trace) |
| Unpack Phase 1 | `unpack_multi_module_with_plan` | `RulePipelineOptions::until("UnEsm")` + fact extraction |
| Unpack Phase 2 | same | `between("UnObjectSpread2", "UnReturn")` + cross-module late pass |
| Rule trace | `trace_rules` | Configurable range; single-file only |

<Note>
Rule tracing rejects bundle inputs. Use normal `decompile` or `unpack` for bundles; see [Trace the rule pipeline](/trace-rule-pipeline).
</Note>

## Single-file stages

`decompile()` runs entirely inside `GLOBALS.set` so SWC `SyntaxContext` values stay consistent for one module.

<Steps>
<Step title="Parse">
`parse_js_with_recovery` in `crates/core/src/driver/io.rs` builds an SWC `Module` from the input string. Syntax is chosen from the filename extension (`.ts`, `.tsx`, `.jsx`, or default ES+JSX). Recoverable parse errors are collected for optional diagnostics; unrecoverable parse failures abort.

<ParamField body="filename" type="string">
Used for syntax detection, diagnostic messages, and output source-map `file` field.
</ParamField>
</Step>

<Step title="Resolver marks">
SWC's `resolver(unresolved_mark, top_level_mark, false)` assigns `SyntaxContext` to every identifier. Free variables (globals like `Object`, `require`) receive `unresolved_mark` as their outer context; bound locals and parameters get module-scoped contexts.

This mark is threaded through every rule runner and is the primary scope gate for identifier matching.
</Step>

<Step title="Rule application">
`apply_rules` walks the ordered `RULE_DESCRIPTORS` registry in `crates/core/src/rules/pipeline.rs`. Each descriptor has an `id`, `RuleStage`, optional `requires` metadata, an `enabled` gate, and a `run` function.

<ParamField body="dce_mode" type="DceMode">
Controls late `DeadDecls` / `DeadImports` passes. API default is `Off`; the CLI defaults to `TransformOnly` and uses `Full` with `--dce`.
</ParamField>

<ParamField body="level" type="RewriteLevel">
`minimal`, `standard` (default), or `aggressive`. Gates risky subpatterns inside rules and whole rules such as `UnJsx`, `ArrowFunction`, and `UnDestructuring`.
</ParamField>
</Step>

<Step title="Optional source-map rename">
When `DecompileOptions.sourcemap` is set, three passes run **after** the main rule pipeline and **before** the fixer:

1. `ImportDedup` — merge repeated imports from the same specifier
2. `apply_sourcemap_renames` — recover original names via position lookup in `sourcesContent`
3. `UnImportRename` — clean import aliases after rename

Renaming runs late because rules detect patterns by minified helper names (`require`, `__generator`, `__esModule`), and `ImportDedup` needs `UnEsm` to have converted `require()` to `import` first.
</Step>

<Step title="Fixer and emit">
`apply_fixer` normalizes the AST for printing. Output is produced by `print_js` or, when `emit_source_map` is true, `print_js_with_srcmap` plus `build_output_sourcemap` (v3 JSON mapping decompiled output back to input positions).

Optional `diagnostics` adds TDZ checks, duplicate-declaration scans, and output parse verification as `UnpackWarning` entries in `DecompileOutput.warnings`.
</Step>
</Steps>
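
The stage order above can be summarized in a few lines. This is a minimal sketch, not wakaru's code: `single_file_stages` is a hypothetical helper that only lists stage names from this page, to make the placement of the optional rename trio explicit.

```rust
/// Hypothetical helper: returns the stage names described on this page in the
/// order they run. It does not call into wakaru or SWC.
pub fn single_file_stages(has_sourcemap: bool, emit_source_map: bool) -> Vec<&'static str> {
    let mut stages = vec!["parse_js_with_recovery", "resolver", "apply_rules"];

    if has_sourcemap {
        // The rename trio runs after the main rule pipeline and before the fixer.
        stages.extend(["ImportDedup", "apply_sourcemap_renames", "UnImportRename"]);
    }

    stages.push("apply_fixer");

    // The printer depends on whether an output source map was requested.
    stages.push(if emit_source_map {
        "print_js_with_srcmap"
    } else {
        "print_js"
    });

    stages
}
```

Calling `single_file_stages(false, false)` yields the five-stage path without renaming; passing `true` for the first argument inserts the three rename passes between `apply_rules` and `apply_fixer`.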

### Request example

```bash
cargo run -p wakaru-cli -- minified.js -o readable.js
```

### Response example

`DecompileOutput` shape:

| Field | Type | Contents / default |
|-------|------|---------|
| `code` | `String` | emitted JavaScript |
| `warnings` | `Vec<UnpackWarning>` | empty unless `diagnostics: true` |
| `source_map` | `Option<String>` | `None` unless `emit_source_map: true` |

## Rule registry and stages

Roughly 60 `VisitMut` rules are registered via `define_rule_registry!`. Order is fixed; `RuleDescriptor::requires` documents ordering constraints, which `docs/rule-dependency-inventory.md` expands on.

| `RuleStage` | Role | Examples |
|-------------|------|----------|
| `Syntax` | Minified syntax normalization | `SimplifySequence`, `UnBracketNotation`, `FlipComparisons` |
| `Helpers` | Transpiler helper unwrapping, module reconstruction | `UnInteropRequireDefault`, `UnEsm`, `UnWebpackInterop` |
| `Structural` | Pattern restoration | `UnTemplateLiteral`, `UnOptionalChaining`, `ObjectAssignSpread` |
| `Complex` | Higher-level recovery | `UnIife`, `UnEs6Class`, `UnRegenerator`, `UnAsyncAwait` |
| `Modernization` | ESNext upgrades | `VarDeclToLetConst`, `ArrowFunction`, `UnForOf` |
| `Cleanup` | Renaming, inlining, DCE | `SmartInline`, `SmartRename`, `DeadDecls`, `UnReturn` |

Several rules run multiple times under suffixed ids (`UnWebpackInterop2`, `UnIife2`, `UnParameters3`, etc.) because earlier passes expose shapes later passes must handle.
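
To make the descriptor shape concrete, here is a minimal, std-only sketch of a fixed-order registry. The field names follow the description on this page, but everything else is an assumption for illustration: the "module" is just a log of applied rule ids, the `requires` entries are invented, and the level threshold used for `ArrowFunction` is hypothetical.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum RewriteLevel { Minimal, Standard, Aggressive }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum RuleStage { Syntax, Helpers, Structural, Complex, Modernization, Cleanup }

/// Toy descriptor: `run` appends to a log instead of visiting an AST.
pub struct RuleDescriptor {
    pub id: &'static str,
    pub stage: RuleStage,
    pub requires: &'static [&'static str],
    pub enabled: fn(RewriteLevel) -> bool,
    pub run: fn(&mut Vec<&'static str>),
}

fn always(_: RewriteLevel) -> bool { true }
// Assumption: the real threshold for level-gated rules may differ.
fn standard_or_higher(level: RewriteLevel) -> bool { level >= RewriteLevel::Standard }

fn run_simplify_sequence(applied: &mut Vec<&'static str>) { applied.push("SimplifySequence"); }
fn run_un_esm(applied: &mut Vec<&'static str>) { applied.push("UnEsm"); }
fn run_arrow_function(applied: &mut Vec<&'static str>) { applied.push("ArrowFunction"); }

/// Abbreviated three-rule registry; the `requires` edges are hypothetical.
pub static RULE_DESCRIPTORS: &[RuleDescriptor] = &[
    RuleDescriptor { id: "SimplifySequence", stage: RuleStage::Syntax, requires: &[],
                     enabled: always, run: run_simplify_sequence },
    RuleDescriptor { id: "UnEsm", stage: RuleStage::Helpers, requires: &["SimplifySequence"],
                     enabled: always, run: run_un_esm },
    RuleDescriptor { id: "ArrowFunction", stage: RuleStage::Modernization, requires: &["UnEsm"],
                     enabled: standard_or_higher, run: run_arrow_function },
];

/// Walks the registry in order and runs every rule whose gate passes.
pub fn apply_rules(applied: &mut Vec<&'static str>, level: RewriteLevel) {
    for rule in RULE_DESCRIPTORS {
        if (rule.enabled)(level) {
            (rule.run)(applied);
        }
    }
}

/// Checks that every `requires` entry names a rule registered earlier.
pub fn requires_are_ordered(registry: &[RuleDescriptor]) -> bool {
    registry.iter().enumerate().all(|(i, rule)| {
        rule.requires
            .iter()
            .all(|req| registry[..i].iter().any(|earlier| earlier.id == *req))
    })
}
```

In this sketch `requires` is metadata only: `apply_rules` never reorders anything, and a separate check such as `requires_are_ordered` is what would catch a misplaced rule.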

`RulePipelineOptions` controls partial execution:

<ParamField body="start_from" type="Option<&str>">
First rule id to run (inclusive). Used by `trace_rules` and tests.
</ParamField>

<ParamField body="stop_after" type="Option<&str>">
Last rule id to run (inclusive). Helpers: `until("UnEsm")`, `between("UnObjectSpread2", "UnReturn")`.
</ParamField>

<ParamField body="module_facts" type="Option<&ModuleFactsMap>">
Injected during unpack Phase 2 so fact-aware rules (`UnTemplateLiteral`, `UnForOf`, `UnRegenerator`, helper rules) can read cross-module import/export data.
</ParamField>
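
The inclusive `start_from` / `stop_after` semantics can be sketched as a filter over the ordered id list. This is a simplified stand-in rather than the real type: the struct below omits `module_facts`, `select_rules` is a hypothetical helper, and any registry passed to it in an example is abbreviated.

```rust
/// Simplified stand-in for the options type; only the range fields are modeled.
#[derive(Clone, Copy, Default)]
pub struct RulePipelineOptions<'a> {
    pub start_from: Option<&'a str>,
    pub stop_after: Option<&'a str>,
}

impl<'a> RulePipelineOptions<'a> {
    /// Run from the first rule through `id`, inclusive.
    pub fn until(id: &'a str) -> Self {
        Self { start_from: None, stop_after: Some(id) }
    }

    /// Run from `start` through `stop`, both inclusive.
    pub fn between(start: &'a str, stop: &'a str) -> Self {
        Self { start_from: Some(start), stop_after: Some(stop) }
    }
}

/// Hypothetical helper: returns the ids that would run for the given options.
pub fn select_rules<'r>(registry: &[&'r str], opts: RulePipelineOptions<'_>) -> Vec<&'r str> {
    // With no `start_from`, selection begins at the first rule.
    let mut started = opts.start_from.is_none();
    let mut selected = Vec::new();

    for &id in registry {
        if !started && Some(id) == opts.start_from {
            started = true;
        }
        if started {
            selected.push(id);
            if Some(id) == opts.stop_after {
                break; // `stop_after` is inclusive, so stop once it has been pushed
            }
        }
    }
    selected
}
```

Given an abbreviated registry `["SimplifySequence", "UnEsm", "UnObjectSpread2", "SmartInline", "UnReturn"]` (an assumed ordering), `until("UnEsm")` selects the first two ids and `between("UnObjectSpread2", "UnReturn")` selects the last three, which mirrors how the two unpack phases split the chain.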

<Info>
Full rule order and dependency edges live in [Rule pipeline reference](/rule-pipeline-reference). Semantic assumptions per rewrite level are in [Rewrite levels and assumptions](/rewrite-levels-and-assumptions).
</Info>

## `unresolved_mark` scope gating

After `resolver()` runs, rules that match identifiers **by name** must distinguish free globals from bound locals. A webpack factory parameter `e` and an inner function parameter `e` share a symbol but not a `SyntaxContext`.

Every new visitor that matches by name should take `unresolved_mark: Mark` and gate matches:

```rust
if id.ctxt.outer() != self.unresolved_mark {
    return; // bound local — do not transform
}
```

Rules that use this pattern include `SimplifySequence`, `FlipComparisons`, `UnWebpackInterop`, `UnArgumentSpread`, `UnNullishCoalescing`, `UnJsx`, `SmartInline`, and `UnWebpackDefineGetters`. Renames must go through `rename_utils::BindingRenamer` (`rename_bindings_in_module`), never a custom rename by `sym` alone — that would hit unrelated inner-scope bindings.

<Warning>
Skipping the `unresolved_mark` guard causes cross-scope renames: a rule matching webpack param `e` would also rename `e` inside `function inner(e) { ... }`.
</Warning>
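
The failure mode in the warning is easier to see in a toy model. The sketch below does not use SWC: a plain integer stands in for `SyntaxContext`, `UNRESOLVED` stands in for the context produced by `unresolved_mark`, and `Ident`, `is_free_global`, and `rename_binding` are hypothetical names.

```rust
/// Toy identifier: `ctxt` is an integer stand-in for SWC's `SyntaxContext`.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Ident {
    pub sym: String,
    pub ctxt: u32,
}

/// Stand-in for the context that free (unresolved) identifiers receive.
pub const UNRESOLVED: u32 = 0;

/// Name match gated on scope: only a free global with this name qualifies.
pub fn is_free_global(id: &Ident, name: &str) -> bool {
    id.sym == name && id.ctxt == UNRESOLVED
}

/// Renames one binding, identified by symbol and context together.
/// Returns how many identifier occurrences were rewritten.
pub fn rename_binding(idents: &mut [Ident], sym: &str, ctxt: u32, new_name: &str) -> usize {
    let mut renamed = 0;
    for id in idents.iter_mut() {
        // Matching on `sym` alone would also hit same-named inner bindings.
        if id.sym == sym && id.ctxt == ctxt {
            id.sym = new_name.to_string();
            renamed += 1;
        }
    }
    renamed
}
```

With a factory parameter `e` in context 1 and an inner `function inner(e)` parameter in context 2, `rename_binding(&mut idents, "e", 1, "module")` rewrites only the outer occurrences. Likewise a local variable that happens to be named `require` fails `is_free_global` because its context is not `UNRESOLVED`.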

## Unpack: parallel two-phase execution

When a bundle is unpacked, the driver does **not** call `decompile()` directly on each raw string. Instead `unpack_multi_module_with_plan` in `phases.rs` runs:

**Phase 1** (`modules.par_iter()`):
- Parse each extracted module
- `resolver` + optional numeric webpack ID rewrites
- `apply_rules` with `until("UnEsm")`
- ESM recovery on a clone (or in-place when AST won't be reused) for fact extraction
- `collect_module_facts` → `ModuleFactsMap`
- Optionally retain the through-`UnEsm` AST for Phase 2 reuse

**Cross-module barrier** (sequential):
- Merge per-module facts into `ModuleFactsMap`
- Build filename rename map from provenance markers (`standard` level and above only)

**Phase 2** (`phase2_inputs.into_par_iter()`):
- Resume from Phase 1 AST when no input source map and no `emit_source_map`; otherwise reparse and rerun through `UnEsm`
- Cross-module late pass: `run_reexport_consolidation`, `run_namespace_decomposition`
- `apply_rules` with `between("UnObjectSpread2", "UnReturn")` + `with_module_facts`
- Targeted cleanup (`SimplifySequence`, `UnAssignmentMerging`, factory-IIFE ESM recovery, export pruning)
- Same source-map rename trio as single-file when `sourcemap` is provided
- `apply_fixer` → emit

Both phases use rayon; on targets without threading, rayon falls back to sequential execution.
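
The phase structure can be sketched with the standard library alone. In this illustration `std::thread::scope` stands in for rayon's parallel iterators, the "fact" collected per module is just a list of names following `export `, and `Phase1Out`, `phase1`, `phase2`, and `unpack` are hypothetical names rather than wakaru's functions.

```rust
use std::collections::HashMap;
use std::thread;

/// Toy Phase 1 result: the retained "AST" (here, the source) plus local facts.
pub struct Phase1Out {
    pub id: usize,
    pub ast: String,
    pub exports: Vec<String>,
}

/// Phase 1 needs only the module itself, so modules can run independently.
fn phase1(id: usize, source: &str) -> Phase1Out {
    let exports = source
        .lines()
        .filter_map(|line| line.strip_prefix("export "))
        .map(|name| name.trim().to_string())
        .collect();
    Phase1Out { id, ast: source.to_string(), exports }
}

/// Phase 2 reads facts about *other* modules, so it must wait for the barrier.
fn phase2(module: &Phase1Out, facts: &HashMap<usize, Vec<String>>) -> String {
    let foreign: usize = facts
        .iter()
        .filter(|(id, _)| **id != module.id)
        .map(|(_, exports)| exports.len())
        .sum();
    format!("// module {} sees {} foreign exports\n{}", module.id, foreign, module.ast)
}

pub fn unpack(sources: &[&str]) -> Vec<String> {
    // Phase 1: one worker per module; joining every handle ends the phase.
    let phase1_out: Vec<Phase1Out> = thread::scope(|s| {
        let handles: Vec<_> = sources
            .iter()
            .enumerate()
            .map(|(id, src)| s.spawn(move || phase1(id, src)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("phase 1 worker panicked"))
            .collect()
    });

    // Barrier (sequential): merge per-module facts into one shared map.
    let facts: HashMap<usize, Vec<String>> =
        phase1_out.iter().map(|m| (m.id, m.exports.clone())).collect();
    let facts_ref = &facts;

    // Phase 2: parallel again, now with read-only access to all facts.
    thread::scope(|s| {
        let handles: Vec<_> = phase1_out
            .iter()
            .map(|m| s.spawn(move || phase2(m, facts_ref)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("phase 2 worker panicked"))
            .collect()
    })
}
```

The point of the shape is that the merged map is built only after every Phase 1 worker has finished and is shared immutably afterwards, so Phase 2 workers need no locking.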

<Note>
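When Phase 2 has to reparse (an input source map is present or `emit_source_map` is set), the through-`UnEsm` range runs twice for that module, because `SyntaxContext` must stay continuous within the pipeline that produces the emitted output. Mixing a retained Phase 1 AST with a fresh parse would break ctxt-sensitive rename rules.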
</Note>

Phase 1 fact collection uses `RewriteLevel::Standard` for ESM recovery regardless of output level, so facts stay stable; Phase 2 applies the caller's `options.level`.

Individual module failures are best-effort: parse or decompile errors preserve raw extracted code and append `UnpackWarning` entries (`FactCollectionParseFailed`, `DecompileFailed`) rather than aborting the whole unpack.
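
A best-effort loop of this kind might look like the sketch below. The warning variants are named after the ones mentioned above, but their fields, the `decompile_best_effort` function, and the closure-based `decompile` parameter are assumptions made for illustration.

```rust
/// Toy warning type; the real `UnpackWarning` may carry different data.
#[derive(Debug, PartialEq, Eq)]
pub enum UnpackWarning {
    FactCollectionParseFailed { module: String },
    DecompileFailed { module: String },
}

/// Decompiles each `(filename, source)` pair. On failure the raw source is
/// kept as that module's output and a warning is recorded; the loop goes on.
pub fn decompile_best_effort(
    modules: &[(&str, &str)],
    decompile: impl Fn(&str) -> Result<String, String>,
) -> (Vec<String>, Vec<UnpackWarning>) {
    let mut outputs = Vec::with_capacity(modules.len());
    let mut warnings = Vec::new();

    for (filename, source) in modules {
        match decompile(source) {
            Ok(code) => outputs.push(code),
            Err(_) => {
                // Preserve the extracted code instead of aborting the unpack.
                outputs.push(source.to_string());
                warnings.push(UnpackWarning::DecompileFailed {
                    module: filename.to_string(),
                });
            }
        }
    }

    (outputs, warnings)
}
```

The output vector always has one entry per input module, so callers can still write every file and surface the warnings separately.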

## Dead-code elimination in the pipeline

Late cleanup rules `DeadDecls` and `DeadImports` are gated by `dce_mode.is_enabled()`:

| `DceMode` | Behavior |
|-----------|----------|
| `Off` | No dead-code passes (API `decompile` default) |
| `TransformOnly` | Remove only dead code introduced by transforms; spans that were already dead are snapshotted at pipeline start so they are preserved |
| `Full` | Full reachability sweep |

CLI default is `TransformOnly`; pass `--dce` for `Full`. Tests often use `DceMode::Off` to isolate structural restoration from cleanup.
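
The defaults above can be captured in a small sketch. The enum mirrors the table, but the body of `is_enabled` is an assumption (anything other than `Off` counts as enabled), and `cli_dce_mode` is a hypothetical helper showing how a `--dce` flag could map to a mode.

```rust
/// Mirrors the table above; `Off` is the API default.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub enum DceMode {
    #[default]
    Off,
    TransformOnly,
    Full,
}

impl DceMode {
    /// Assumed semantics: every mode except `Off` turns on the late
    /// `DeadDecls` / `DeadImports` passes.
    pub fn is_enabled(self) -> bool {
        self != DceMode::Off
    }
}

/// Hypothetical CLI mapping: `TransformOnly` unless `--dce` was passed.
pub fn cli_dce_mode(dce_flag: bool) -> DceMode {
    if dce_flag {
        DceMode::Full
    } else {
        DceMode::TransformOnly
    }
}
```

The split means library callers opt in to cleanup explicitly, while CLI users get transform-induced cleanup without asking for it.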

## Debugging and partial runs

`trace_rules` replays the single-file pipeline with `apply_rules_with_observer`, capturing per-rule before/after snapshots as git-style diffs via `format_trace_events`. It shares the resolver and rule machinery with `decompile()` but skips source-map rename and fixer-adjacent source-map output, and it rejects detected bundles.

Use `RulePipelineOptions::between(start, stop)` in tests or trace mode to bisect regressions without running the full ~60-rule chain.
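
One way to automate that bisection is a binary search over stop points. This is a sketch under a stated assumption: once the output goes bad after some rule, it stays bad for every later stop point. `first_bad_rule` is a hypothetical helper, and the `is_good_after` callback is where a test would run the pipeline up to the given id and check the result.

```rust
/// Returns the first rule id after which the output is no longer good,
/// or `None` if the output is good after every rule.
///
/// Assumes monotonicity: good for a prefix of the chain, bad from then on.
pub fn first_bad_rule<'a>(
    rule_ids: &[&'a str],
    is_good_after: impl Fn(&str) -> bool,
) -> Option<&'a str> {
    // Invariant: every index below `lo` is known good, every index at or
    // above `hi` is known bad (or past the end of the chain).
    let (mut lo, mut hi) = (0usize, rule_ids.len());

    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        if is_good_after(rule_ids[mid]) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }

    rule_ids.get(lo).copied()
}
```

For a chain of about 60 rules this needs roughly six partial runs instead of 60. If the monotonicity assumption does not hold for a given regression, a linear scan over stop points is the safer fallback.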

## Key files

```text
crates/core/src/
  driver/
    single_file.rs      decompile() orchestration
    unpack/phases.rs    two-phase parallel module pipeline
    io.rs               parse, print, fixer bridge
    trace.rs            per-rule observer tracing
  rules/
    pipeline.rs         RULE_DESCRIPTORS registry, apply_rules
    mod.rs              RewriteLevel, rule exports
  sourcemap_rename.rs   late rename after rules
```

## Related pages

<CardGroup>
<Card title="Overview" href="/overview">
Workspace layout, unpack vs decompile, and primary entry points.
</Card>
<Card title="Rule pipeline reference" href="/rule-pipeline-reference">
Ordered `RuleDescriptor` list, stage groupings, and dependency inventory.
</Card>
<Card title="Cross-module facts" href="/cross-module-facts">
Phase 1 fact collection, `ModuleFactsMap`, and Phase 2 fact-aware rules.
</Card>
<Card title="Use source maps" href="/use-source-maps">
Input `--source-map` rename constraints and `--emit-source-map` output.
</Card>
<Card title="Trace the rule pipeline" href="/trace-rule-pipeline">
Per-rule diffs, `--from`/`--until` ranges, and bisection workflow.
</Card>
<Card title="Develop transformation rules" href="/develop-rules">
Adding rules: `unresolved_mark` guards, pipeline placement, test-first workflow.
</Card>
</CardGroup>
