# Testing and snapshots

> cargo nextest vs cargo test, insta snapshot workflow, required pipeline test binaries, rule-level test patterns, Test262 round-trip coverage, and pre-commit verification matrix.

- Repository: pionxzh/wakaru
- GitHub: https://github.com/pionxzh/wakaru
- Human docs: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b
- Complete Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/llms-full.txt

## Source Files

- `docs/testing.md`
- `.cargo/config.toml`
- `.config/nextest.toml`
- `docs/test262-roundtrip.md`
- `scripts/correctness/test262-roundtrip.mjs`
- `.github/workflows/rust-ci.yml`

---


Wakaru's test suite lives almost entirely under `crates/core/tests/` — roughly 100 integration-test binaries plus shared helpers in `common/mod.rs`. Tests fall into three layers: **rule-level unit tests** that pin one transformation, **pipeline snapshot tests** that pin end-to-end unpack/decompile output, and **semantic round-trip checks** against the Test262 corpus. Every rule change needs a focused regression test; snapshot updates alone are not sufficient coverage.

## Running tests

Wakaru uses two Rust test runners. Pick based on whether you need speed or surgical focus.

| Runner | When to use | Notes |
|---|---|---|
| `cargo nextest` | Day-to-day development, CI-equivalent full runs | One global parallel pool; much faster than `cargo test` across 90+ binaries |
| `cargo test` | Single binary or single test name | Sequential per-binary execution; best for tight iteration |

<CodeGroup>
```bash title="Full core suite (recommended)"
cargo nextest run -p wakaru-core
```

```bash title="Entire workspace"
cargo nextest run --workspace
```

```bash title="Single rule test file"
cargo test -p wakaru-core --test un_double_negation_rule
```

```bash title="Single named test"
cargo test -p wakaru-core --test smart_inline_rule -- inline_single_use
```
</CodeGroup>

Install nextest once:

```bash
cargo install cargo-nextest --locked
```

CI runs `cargo nextest run --workspace --profile ci` using the `ci` profile in `.config/nextest.toml`, which sets `fail-fast = false` (so every failure is reported) and a 60-second slow-timeout that force-terminates a test still running after three such periods. nextest does **not** run doctests, so CI keeps a `cargo test --workspace --doc` guard even though there are no doctests today.

## Insta snapshot workflow

Pipeline and some rule tests use [insta](https://insta.rs/) to pin emitted JavaScript. Snapshots are committed as `.snap` files under `crates/core/tests/snapshots/`.

### Local vs CI behavior

| Environment | `INSTA_UPDATE` | Behavior on drift |
|---|---|---|
| Local (`.cargo/config.toml`) | `new` | Test **fails**; writes a `.snap.new` for review |
| CI (`.github/workflows/rust-ci.yml`) | `no` | Test **fails**; no silent acceptance |

The `new` setting prevents regressions from landing green just because nobody inspected the diff. Previously, `always` would rewrite snapshots on every run and mask real regressions.
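
The policy in the table can be read as a small decision function. The sketch below is a simplified model of what this page describes, not insta's actual implementation; the type and function names (`UpdateMode`, `DriftOutcome`, `on_drift`) are hypothetical.

```rust
/// Values of the `INSTA_UPDATE` environment variable discussed on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UpdateMode {
    New,
    No,
    Always,
}

/// What happens when an assertion sees output that differs from the `.snap`.
#[derive(Debug, PartialEq, Eq)]
struct DriftOutcome {
    test_fails: bool,
    /// Suffix of the file written on drift, if any.
    written: Option<&'static str>,
}

/// Simplified model of the policy table above (hypothetical helper).
fn on_drift(mode: UpdateMode) -> DriftOutcome {
    match mode {
        // Local default: fail and leave a pending file for review.
        UpdateMode::New => DriftOutcome { test_fails: true, written: Some(".snap.new") },
        // CI: fail and write nothing.
        UpdateMode::No => DriftOutcome { test_fails: true, written: None },
        // Bulk accept: overwrite the committed snapshot, so drift goes unnoticed.
        UpdateMode::Always => DriftOutcome { test_fails: false, written: Some(".snap") },
    }
}

fn main() {
    for mode in [UpdateMode::New, UpdateMode::No, UpdateMode::Always] {
        println!("{mode:?}: {:?}", on_drift(mode));
    }
}
```

Only `Always` lets a drifted test pass, which is why it is kept for deliberate one-off use rather than as a default.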

<Steps>
<Step title="Run tests and observe drift">

Run the affected test binary. A changed snapshot fails the test and leaves a `.snap.new` alongside the committed `.snap`.

```bash
cargo test -p wakaru-core --test webpack4_unpack
```

</Step>

<Step title="Review the diff">

Inspect each `.snap.new` against its committed counterpart. Accept only when output is **semantically better** or the fixture expectation intentionally changed — "different" without "better" is a regression.

</Step>

<Step title="Accept intentional changes">

```bash
cargo insta review    # interactive accept/reject per snapshot
cargo insta accept    # accept all pending .snap.new files
```

For a one-off bulk accept during development:

```bash
INSTA_UPDATE=always cargo test -p wakaru-core --test webpack4_unpack
```

</Step>

<Step title="Verify cleanliness before commit">

```bash
git status --short
```

No `.snap.new` files should remain. Committed `.snap` files should reflect reviewed changes only.

</Step>
</Steps>
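
If you prefer a scripted check over reading `git status` output, the following is a minimal sketch that walks a directory and lists leftover `.snap.new` files. It is not part of the repository; the function name `find_pending_snapshots` is hypothetical, and the snapshot path is the one given on this page.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect files whose name ends in `.snap.new` under `dir`.
fn find_pending_snapshots(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut pending = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            pending.extend(find_pending_snapshots(&path)?);
        } else if path
            .file_name()
            .and_then(|name| name.to_str())
            .map_or(false, |name| name.ends_with(".snap.new"))
        {
            pending.push(path);
        }
    }
    // Sort for stable, reviewable output.
    pending.sort();
    Ok(pending)
}

fn main() -> io::Result<()> {
    // Run from the repository root.
    let pending = find_pending_snapshots(Path::new("crates/core/tests/snapshots"))?;
    for path in &pending {
        println!("pending: {}", path.display());
    }
    if !pending.is_empty() {
        // Non-zero exit so the check can gate a commit hook.
        std::process::exit(1);
    }
    Ok(())
}
```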

### Snapshot test patterns

**Per-module pipeline snapshots** — webpack4 final output pins every extracted module:

```rust
for (filename, code) in &pairs {
    let snap_name = filename.trim_end_matches(".js");
    insta::assert_snapshot!(snap_name, code);
}
```
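
One detail of the naming step is worth knowing: `str::trim_end_matches` removes every trailing repetition of the pattern, while `str::strip_suffix` removes at most one. The sketch below isolates that step; the filenames and the helper names `snap_name` and `snap_name_once` are hypothetical.

```rust
/// Snapshot name as derived in the loop above: strip the `.js` extension.
fn snap_name(filename: &str) -> &str {
    filename.trim_end_matches(".js")
}

/// Alternative that removes at most one `.js` suffix.
fn snap_name_once(filename: &str) -> &str {
    filename.strip_suffix(".js").unwrap_or(filename)
}

fn main() {
    // Hypothetical module filenames, for illustration only.
    for filename in ["entry.js", "module-1.js", "vendor.js.js", "README"] {
        println!(
            "{filename} -> {} / {}",
            snap_name(filename),
            snap_name_once(filename)
        );
    }
}
```

For ordinary module filenames the two agree; they differ only when a name ends in a repeated `.js.js`.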

**Raw extraction layer** — `webpack4_unpack_raw.rs` pins pre-rule code (after webpack normalization, before decompile rules). Useful for bisecting whether drift comes from unpack extraction or downstream rules.

**Named snapshots** — some tests pass an explicit name: `insta::assert_snapshot!("rollup_es_decompile", output)`.

When snapshots change unexpectedly, use rule tracing and the raw-vs-final layer split described on the debugging regressions page.
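
The raw-vs-final split amounts to a two-input triage. The sketch below is a hypothetical helper (`triage` and `Suspect` are not part of the repository) and only encodes where to look first.

```rust
/// Which layer to inspect first when webpack4 snapshots drift.
#[derive(Debug, PartialEq, Eq)]
enum Suspect {
    Nothing,
    UnpackExtraction,
    DecompileRules,
}

/// `raw_drifted`: `webpack4_unpack_raw` snapshots changed.
/// `final_drifted`: `webpack4_unpack` snapshots changed.
fn triage(raw_drifted: bool, final_drifted: bool) -> Suspect {
    match (raw_drifted, final_drifted) {
        // The pre-rule layer already differs, so start with extraction.
        (true, _) => Suspect::UnpackExtraction,
        // Extraction is stable; a downstream rule changed the output.
        (false, true) => Suspect::DecompileRules,
        (false, false) => Suspect::Nothing,
    }
}

fn main() {
    println!("{:?}", triage(false, true));
}
```

When both layers drift, a rule change may still be involved; the helper only suggests inspecting extraction first because later output depends on it.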

## Test organization

All test files live under `crates/core/tests/`. **Default: add tests to the existing file for the rule you are changing.** Create a new file only when adding a new rule.

:::files
crates/core/tests/
├── common/mod.rs              # Shared helpers (render, render_rule, normalize, …)
├── *_rule.rs                  # Per-rule unit tests (one file per rule)
├── noop_pipeline.rs           # Stability: inputs that pass through unchanged
├── webpack4_unpack.rs         # Final webpack4 decompile snapshots
├── webpack4_unpack_raw.rs     # webpack4 raw extraction snapshots
├── bundle_unpack.rs           # webpack5 + browserify pipeline tests
├── esbuild_unpack.rs          # esbuild/Bun scope-hoisted detection and unpack
├── systemjs_unpack.rs         # SystemJS generated fixtures
├── webpack5_chunk_unpack.rs   # webpack5 chunk splitting
├── multi_file_unpack.rs       # Entry + chunk multi-input workflows
├── facts_rule.rs              # Cross-module fact extraction
├── pipeline_helpers_rule.rs   # Transpiler helper detection + restoration
├── decompile_options_rule.rs  # DecompileOptions configuration
└── snapshots/                 # Insta snapshot files (auto-generated, committed)
:::

## Rule-level test patterns

There are two primary patterns: **full-pipeline** and **isolated rule**.

### Full-pipeline test

Use `render(input)` when the rule depends on earlier normalization (helper detection, `SimplifySequence`, bracket notation, etc.):

```rust
mod common;
use common::{assert_eq_normalized, render};

#[test]
fn my_feature_test() {
    let input = r#"void 0"#;
    let expected = r#"undefined"#;
    assert_eq_normalized(&render(input), expected);
}
```

### Isolated rule test

Use `render_rule(input, builder)` for most per-rule tests. This runs the resolver, one rule, and the fixer:

```rust
mod common;
use common::{assert_eq_normalized, render_rule};
use wakaru_core::rules::UnDoubleNegation;

fn apply(input: &str) -> String {
    render_rule(input, |_| UnDoubleNegation)
}

#[test]
fn strips_double_bang_in_if() {
    assert_eq_normalized(&apply("if (!!x) { a(); }"), "if (x) { a(); }");
}
```

For rules that match identifiers by name, pass `unresolved_mark`:

```rust
render_rule(input, |unresolved_mark| MyRule::new(unresolved_mark))
```

### Test helpers

| Helper | Purpose |
|---|---|
| `render(source)` | Full decompile pipeline |
| `render_rule(source, builder)` | Single rule in isolation |
| `render_rule_with_filename(source, filename, builder)` | Same, with custom filename (`.ts`/`.tsx` parsing) |
| `render_pipeline_until(source, stop_after)` | Pipeline up to a named rule (inclusive) |
| `render_pipeline_between(source, start, stop)` | Pipeline from `start` through `stop` |
| `trace_pipeline(source, options)` | Collect `RuleTraceEvent`s |
| `changed_rules(source)` | List rule names that changed output |
| `normalize(input)` | Parse + re-emit to normalize whitespace |
| `assert_eq_normalized(actual, expected)` | Compare after normalizing both sides |

Rule names in `render_pipeline_until` match struct names (e.g. `SmartInline`, `UnEsm`). Second passes use suffixed names: `UnWebpackInterop2`, `UnIife2`.
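
To make the "inclusive" and "from `start` through `stop`" semantics concrete, here is a toy model of a named rule pipeline. It is a sketch under stated assumptions: the real helpers operate on an AST inside `common/mod.rs`, whereas this version uses plain string replacement, and the rule names (`ToyVoid`, `ToyDoubleBang`, `ToyBool`) and the functions `run_until` and `run_between` are hypothetical.

```rust
/// A toy rule: a name plus a string-to-string transform.
struct Rule {
    name: &'static str,
    apply: fn(&str) -> String,
}

fn toy_void(src: &str) -> String {
    src.replace("void 0", "undefined")
}

fn toy_double_bang(src: &str) -> String {
    src.replace("!!", "")
}

fn toy_bool(src: &str) -> String {
    src.replace("!0", "true")
}

/// Hypothetical three-rule pipeline; names and order are illustrative only.
fn toy_pipeline() -> Vec<Rule> {
    vec![
        Rule { name: "ToyVoid", apply: toy_void },
        Rule { name: "ToyDoubleBang", apply: toy_double_bang },
        Rule { name: "ToyBool", apply: toy_bool },
    ]
}

/// Run rules in order and stop after the named rule has run (inclusive).
fn run_until(pipeline: &[Rule], src: &str, stop_after: &str) -> String {
    let mut out = src.to_string();
    for rule in pipeline {
        out = (rule.apply)(&out);
        if rule.name == stop_after {
            break;
        }
    }
    out
}

/// Run only the rules from `start` through `stop`, both inclusive.
fn run_between(pipeline: &[Rule], src: &str, start: &str, stop: &str) -> String {
    let mut out = src.to_string();
    let mut active = false;
    for rule in pipeline {
        if rule.name == start {
            active = true;
        }
        if active {
            out = (rule.apply)(&out);
        }
        if rule.name == stop {
            break;
        }
    }
    out
}

fn main() {
    let pipeline = toy_pipeline();
    let input = "if (!!x) a = void 0, b = !0;";
    println!("{}", run_until(&pipeline, input, "ToyDoubleBang"));
    println!("{}", run_between(&pipeline, input, "ToyDoubleBang", "ToyBool"));
}
```

In this toy version an unknown `stop_after` name simply runs the whole pipeline; how the real helpers treat unknown names is not specified here.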

### Choosing `render` vs `render_rule`

Most rule tests use `render_rule`. Switch to `render` or `render_pipeline_until` when:

- The rule depends on **helper detection** (`LocalHelperContext::collect` scanning function bodies).
- Your test input contains forms normalized in Stage 1 (`void 0`, bracket notation, indirect calls, comma expressions).

If `render_rule` produces unchanged output but `render` works, the rule depends on earlier normalization.

### Common pitfalls

- Do not use bare literal expression statements as inputs (`65536;`) — `SimplifySequence` drops them as dead code. Use `const x = 65536;` instead.
- Full-pipeline tests may transform input before your rule runs. Isolate with `render_rule` or stop early with `render_pipeline_until`.
- A snapshot update for a pipeline test does **not** replace a focused rule regression test. Both are required for rule changes.

## Required pipeline test binaries

Before committing code changes, run these five pipeline binaries at minimum. They cover noop stability, webpack4 final and raw layers, and webpack5/browserify/esbuild unpack paths.

```bash
cargo test -p wakaru-core --test noop_pipeline
cargo test -p wakaru-core --test webpack4_unpack
cargo test -p wakaru-core --test webpack4_unpack_raw
cargo test -p wakaru-core --test bundle_unpack
cargo test -p wakaru-core --test esbuild_unpack
```

| Binary | What it guards |
|---|---|
| `noop_pipeline` | Decompile stability on bundled fixtures; noop output snapshot |
| `webpack4_unpack` | Module extraction counts + per-module decompile snapshots |
| `webpack4_unpack_raw` | Raw extraction before decompile rules |
| `bundle_unpack` | webpack5 and browserify unpack + interop recovery |
| `esbuild_unpack` | esbuild/Bun scope-hoisted detection, strict vs heuristic split |

Additional binaries exist for SystemJS, webpack5 chunks, multi-file unpack, cross-module facts, helper pipelines, and `DecompileOptions` — run them when your change touches those areas.

## Test262 round-trip coverage

Beyond Rust unit and snapshot tests, `scripts/correctness/test262-roundtrip.mjs` checks **semantic preservation**: run a Test262 file in Node's `vm`, transform it with a selected producer pipeline, decompile with Wakaru, then run the decompiled result through the same Test262 harness. The Test262 harness remains the oracle — original, transformed, and decompiled code must all pass.

### How a case runs

```mermaid
flowchart LR
  A[Test262 source] --> B[Baseline VM run]
  B --> C[Transform via pipeline]
  C --> D[Transformed VM run]
  D --> E[Wakaru decompile]
  E --> F[Decompiled VM run]
  F --> G{All pass?}
  G -->|yes| H[passed]
  G -->|no| I[failed]
```

The runner operates in three phases: classify tests and run pre-decompile checks, batch-decompile pending cases in parallel (bounded by `cpus - 2`), then verify decompiled output against the harness.
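
The parallelism bound is simple arithmetic. The runner itself is a Node script, so the Rust sketch below only illustrates the `cpus - 2` rule; clamping to at least one worker is an assumption of this sketch, and `decompile_workers` is a hypothetical name.

```rust
use std::thread;

/// Worker count for the batch-decompile phase: logical CPUs minus two.
/// The floor of one worker is an assumption, not confirmed by the runner.
fn decompile_workers(cpus: usize) -> usize {
    cpus.saturating_sub(2).max(1)
}

fn main() {
    // Fall back to a single CPU if the platform cannot report parallelism.
    let cpus = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    println!("{cpus} CPUs -> {} decompile workers", decompile_workers(cpus));
}
```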

### Key options

<ParamField body="--preset" type="string">
Named Test262 path set: `default`, `classes`, `destructuring`, `modules`, `operators`, and others. Prefer presets over running the entire Test262 tree.
</ParamField>

<ParamField body="--pipeline" type="string" default="terser-light">
Producer before Wakaru: `none`, `terser-light`, `terser-full`, `babel-env-terser`, `swc-minify`, `esbuild-minify`.
</ParamField>

<ParamField body="--level" type="string" default="minimal">
Wakaru rewrite level passed to the CLI: `minimal`, `standard`, `aggressive`.
</ParamField>

<ParamField body="--limit" type="number | all" default="25">
Maximum runnable tests. Use `all` for full slice runs.
</ParamField>

<ParamField body="--case-timeout-ms" type="number" default="5000">
Per-test timeout in milliseconds. Timeouts are recorded as `rejected` with reason `case-timeout`.
</ParamField>

<ParamField body="--summary" type="path">
Write a deterministic Markdown report (no timestamps) suitable for git diff review.
</ParamField>

<ParamField body="--json" type="path">
Write the full JSON report. It is updated after each processed test, so an interrupted run leaves `complete: false`.
</ParamField>

<RequestExample>

```bash
# Quick smoke (25 cases, default paths)
node scripts/correctness/test262-roundtrip.mjs --limit 25

# Full default slice with reviewable summary
node scripts/correctness/test262-roundtrip.mjs --limit all --summary target/test262-default.md

# Classes slice through SWC minifier
node scripts/correctness/test262-roundtrip.mjs --preset classes --pipeline swc-minify --limit 100 --summary target/test262-classes-swc.md

# ESM module graphs (not bundle testing)
node scripts/correctness/test262-roundtrip.mjs --preset modules --pipeline swc-minify --limit all --case-timeout-ms 2000 --summary target/test262-modules-graph-swc.md

# Rerun failures from a previous report
node scripts/correctness/test262-roundtrip.mjs --rerun-from target/test262-default.json --rerun-status failed --json target/test262-default-rerun.json
```

</RequestExample>

### Status buckets

| Status | Meaning |
|---|---|
| `passed` | Original, transformed, and decompiled code all pass |
| `unsupported` | Local Node/vm/SWC setup cannot run this input |
| `rejected` | Transform/minifier or known non-Wakaru blocker stops the case before semantic evaluation |
| `failed` | Current Wakaru correctness candidate |

Known non-Wakaru classifications live in `scripts/correctness/test262-known-blockers.json`. Keep entries narrow; do not mask real Wakaru semantic failures.
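
The bucket table can be read as an ordered classification. The sketch below is a simplified model, not the runner's code: the field names and the order of the checks are assumptions chosen to match the table's wording.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Status {
    Passed,
    Unsupported,
    Rejected,
    Failed,
}

/// Observations for one Test262 case. Field names are hypothetical.
struct CaseObservation {
    /// The local Node/vm/SWC setup can run this input at all.
    runnable_locally: bool,
    /// The transform/minifier failed, or a known non-Wakaru blocker matched.
    blocked_before_wakaru: bool,
    /// The decompiled output passes the Test262 harness.
    decompiled_passes: bool,
}

/// Assumed check order: environment first, then non-Wakaru blockers,
/// and only then the decompiled result.
fn classify(case: &CaseObservation) -> Status {
    if !case.runnable_locally {
        return Status::Unsupported;
    }
    if case.blocked_before_wakaru {
        return Status::Rejected;
    }
    if case.decompiled_passes {
        Status::Passed
    } else {
        Status::Failed
    }
}

fn main() {
    let case = CaseObservation {
        runnable_locally: true,
        blocked_before_wakaru: false,
        decompiled_passes: false,
    };
    println!("{:?}", classify(&case));
}
```

Under this ordering, only cases that reach Wakaru and then break can land in `failed`, which is what makes that bucket a list of correctness candidates.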

### Baselines and stats

Tracked baseline summaries live in `docs/test262-baselines/`, grouped by producer pipeline and Test262 slice. Regenerate the full normal matrix:

```bash
node scripts/correctness/test262-baseline-matrix.mjs
```

Refresh a subset:

```bash
node scripts/correctness/test262-baseline-matrix.mjs --producer swc-minify --slice operators
```

Cached totals for quick reads: `scripts/correctness/test262-stats.json`, updated via `node scripts/correctness/test262-collect-stats.mjs`.

The roundtrip runner requires a built `wakaru` binary (`cargo build -p wakaru-cli`) or a `WAKARU` environment variable pointing to one. It does not fall back to `cargo run`.

## Pre-commit verification matrix

Run checks from the **worktree that contains your changes** — the directory with this repo's `Cargo.toml`. Do not reuse binaries from another checkout.

<Steps>
<Step title="Focused regression test">

```bash
cargo test -p wakaru-core --test <rule>_rule
```

Every code change needs a corresponding unit test. Snapshot-only updates are insufficient for rule changes.

</Step>

<Step title="Pipeline tests">

Run the five required pipeline binaries listed above.

</Step>

<Step title="Formatting and linting">

```bash
cargo fmt --check
cargo clippy -p wakaru-core --all-targets -- -D warnings
```

Use `cargo clippy --workspace --all-targets -- -D warnings` when touching non-core crates or shared workspace code.

</Step>

<Step title="Optional: release CLI binary">

Only when you need a standalone binary for reproduction matrices or CLI validation:

```bash
cargo build --profile dev-release -p wakaru-cli
export WAKARU="$PWD/target/dev-release/wakaru"
```

The `dev-release` profile inherits `release` with thinner LTO for faster local builds. Fixture runners auto-detect and build the checkout they are launched from.

</Step>

<Step title="Optional: external fixtures">

When the sibling `wakaru-fixtures` repository is checked out and your change affects decompile output, unpacking, bundler behavior, rule ordering, helper detection, or CLI behavior:

```bash
../wakaru-fixtures/run.sh --check
```

Accept deliberate output improvements with `../wakaru-fixtures/run.sh --update` and commit the `outputs/` change in the fixtures repo.

</Step>

<Step title="Final cleanliness">

```bash
git diff --check
git status --short
```

Confirm no stale `.snap.new` files or unrelated changes remain.

</Step>
</Steps>

### CI parity summary

| Check | Local command | CI job |
|---|---|---|
| Format | `cargo fmt --check` | `fmt` |
| Lint | `cargo clippy -- -D warnings` | `check` (all OS matrix) |
| Tests | `cargo nextest run --workspace --profile ci` | `check` |
| Doctests | `cargo test --workspace --doc` | `check` |
| Snapshot policy | `INSTA_UPDATE=new` (fail + `.snap.new`) | `INSTA_UPDATE=no` (fail, no write) |

For faster local iteration, `cargo nextest run -p wakaru-core` covers the bulk of the suite. Reserve the full workspace run with the `ci` profile for the final check before pushing.

## Reproduction matrices

Under `scripts/repro/`, matrices test recovery across transpiler versions, modes, and minification levels. Current rates are cached in `scripts/repro/stats.json`.

```bash
node scripts/repro/collect-stats.mjs          # regenerate stats
node scripts/repro/collect-stats.mjs --check  # verify freshness
node scripts/repro/array-spread-rest-matrix/matrix.mjs --details
```

Every new matrix should spread `...mangleValidator()` from `lib/compare.mjs` into its `runMatrix()` config for structural comparison of mangled shapes.

## Related pages

<CardGroup>
<Card title="Develop transformation rules" href="/develop-rules">
Test-first workflow, pipeline placement, unresolved_mark guards, and the definition-of-done checklist for new rules.
</Card>

<Card title="Debug regressions" href="/debugging-regressions">
Investigate snapshot drift, raw vs final webpack4 layers, and rule trace bisection.
</Card>

<Card title="Trace the rule pipeline" href="/trace-rule-pipeline">
Per-rule diffs and --from/--until ranges for narrowing regressions.
</Card>

<Card title="Contributing" href="/contributing">
Fork-and-branch workflow, required checks, and conventional commits.
</Card>

<Card title="Webpack bundle recipe" href="/webpack-bundle-recipe">
Build fixtures and compare unpack output against reference dist files.
</Card>

<Card title="Esbuild and Browserify recipe" href="/esbuild-browserify-recipe">
Verify esbuild and browserify unpack with testcase commands.
</Card>
</CardGroup>