# Pipeline caveats

> Ordering rules, eval safety in unpack, sourcemap precedence, polish tier differences, Prettier gitignore traps, wakaru caveats, vendor-leaf entry misidentification, and other documented failure modes with recovery steps.

- Repository: JimLiu/decode-codex
- GitHub: https://github.com/JimLiu/decode-codex
- Human docs: https://grok-wiki.com/public/docs/jimliu-decode-codex-1a3a0c425b33
- Complete Markdown: https://grok-wiki.com/public/docs/jimliu-decode-codex-1a3a0c425b33/llms-full.txt

## Source Files

- `.agents/skills/deobfuscate-javascript/reference/caveats.md`
- `.agents/skills/deobfuscate-javascript/reference/examples.md`
- `.agents/skills/deobfuscate-javascript/scripts/check-entry.ts`
- `.agents/skills/deobfuscate-javascript/scripts/control-flow-report.ts`
- `.agents/skills/deobfuscate-javascript/workflows/webpack-bundle.md`
- `.agents/skills/deobfuscate-javascript/workflows/huge-single-file.md`

---


The `deobfuscate-javascript` skill runs a fixed multi-stage pipeline: `sourcemap-check`, optional Stage 1 deobfuscation, wakaru normalization, extract/apply rename, polish, and format. Each step assumes the byte shape and symbol offsets produced by the previous one. Violating the ordering, skipping precedence checks, or misidentifying the whole-tree entry point produces silent no-ops, stale `renames.json` ids, or a restore that looks complete but covers only a vendor leaf.

## Stage 1 ordering

Each Stage 1 pass has an input shape the prior pass must have produced. Running steps out of order silently no-ops or mangles the AST.

| Step | Must run after | Why |
|------|----------------|-----|
| `unpack` | — (first on packed input) | Packed input is one giant `CallExpression`; AST passes have nothing to chew on until unpacked. Layered packers re-detect after each pass. |
| `string-array` | `unpack` | The array's literals are already readable, so decoding the whole array up front wastes work. Inlining array references first lets `decode-strings` walk only the literals that are actually used. |
| `decode-strings` | `string-array` | Depends on resolved array references. |
| `simplify` | `string-array` + `decode-strings` | Constant folding needs resolved string references. Running `simplify` first splits rotation IIFEs into statements `string-array` will not match. |
| `control-flow-report` | `simplify` | Opaque predicates fed to `simplify` get folded out; what remains is the real CFG worth reporting. |
| Stage 2 (`extract`) | Stage 1 complete | `extract.ts` byte offsets are stable within a source but invalid across Stage 1 rewrites. Always re-extract after Stage 1. |

<Warning>
Stage 1 invalidates sourcemaps. If a usable `.map` exists, sourcemap recovery is strictly better than Stage 1 + Stage 2. Run `sourcemap-check.ts` before committing to deobfuscation.
</Warning>

The `deobfuscate.ts` orchestrator runs these steps in order and continues on individual step failure by design—read the `--report` JSON for errors rather than assuming a clean pass.
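
The ordering table can be expressed as a small dependency check. The sketch below is illustrative only: the step names come from the table, but `MUST_RUN_AFTER` and `findOrderViolations` are hypothetical helpers, not part of the skill's scripts.

```typescript
// Hypothetical helper: step names are from the ordering table, the checker is not
// part of the skill's scripts.
type Stage1Step =
  | "unpack"
  | "string-array"
  | "decode-strings"
  | "simplify"
  | "control-flow-report";

// For each step, the steps that must already have run (per the table).
const MUST_RUN_AFTER: Record<Stage1Step, Stage1Step[]> = {
  "unpack": [],
  "string-array": ["unpack"],
  "decode-strings": ["string-array"],
  "simplify": ["string-array", "decode-strings"],
  "control-flow-report": ["simplify"],
};

// Returns one message per ordering violation; an empty array means the plan is valid.
function findOrderViolations(plan: Stage1Step[]): string[] {
  const seen = new Set<Stage1Step>();
  const violations: string[] = [];
  for (const step of plan) {
    for (const prerequisite of MUST_RUN_AFTER[step]) {
      // A prerequisite is violated only if it is in the plan but scheduled later.
      // Steps left out entirely (e.g. `unpack` on unpacked input) are not flagged.
      if (plan.indexOf(prerequisite) !== -1 && !seen.has(prerequisite)) {
        violations.push(`${step} ran before ${prerequisite}`);
      }
    }
    seen.add(step);
  }
  return violations;
}

const goodPlan: Stage1Step[] = [
  "unpack", "string-array", "decode-strings", "simplify", "control-flow-report",
];
const badPlan: Stage1Step[] = ["unpack", "simplify", "string-array", "decode-strings"];

console.log(findOrderViolations(goodPlan)); // []
console.log(findOrderViolations(badPlan));
// ["simplify ran before string-array", "simplify ran before decode-strings"]
```

A check like this only catches misordered manual invocations; it says nothing about whether an individual step succeeded, which is what the `--report` JSON is for.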

## Sourcemap precedence

`sourcemap-check.ts` is the first gate on every restore path. When a recoverable map exists, abandon the rename pipeline.

<Steps>
<Step title="Run sourcemap-check">

```bash
bun scripts/sourcemap-check.ts <input.js> [--out report.json]
```

</Step>
<Step title="Interpret the recommendation">

| Signal | Action |
|--------|--------|
| Adjacent `.map` or inline data-URL with parseable sources | Recover originals via `source-map-explorer` or `sourcesContent`—lossless, preserves names and comments. |
| `sourceMappingURL` points to missing file | Fetch the `.map` manually, or proceed with extract/apply knowing fidelity is lower. |
| No comment and no adjacent `.map` | Proceed with rename pipeline. |

</Step>
<Step title="Avoid downstream rewrites">

Do not run Stage 1, wakaru, or extract on code you intend to recover from a map—the rewrites shift the bytes the map indexes.

</Step>
</Steps>

`extract.ts` emits a warning when it sees a `sourceMappingURL` comment but step 0a (`sourcemap-check.ts`) was skipped. Treat that as a hard stop and re-run `sourcemap-check.ts`.

<Note>
A source map is valid only for the build that produced it. A `.map` from a different commit may not match the bundle you are restoring, so sanity-check recovered sources against the bundle you have.
</Note>
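
To make the signals in the table concrete, here is a minimal sketch of classifying a bundle's `sourceMappingURL` comment. It is an assumption about how such a check could look, not the logic of `sourcemap-check.ts`; `classifySourceMapComment` and `MapSignal` are hypothetical names. It also does not look for an adjacent `.map` file, which the real check treats as a separate signal.

```typescript
// Illustrative only: the real sourcemap-check.ts may detect and report maps differently.
type MapSignal =
  | { kind: "inline"; dataUrl: string }   // map embedded as a data URL
  | { kind: "external"; url: string }     // map lives in a separate file
  | { kind: "none" };                     // no sourceMappingURL comment

function classifySourceMapComment(code: string): MapSignal {
  // Accept the current `//#` form and the legacy `//@` form.
  const pattern = /\/\/[#@]\s*sourceMappingURL=(\S+)/g;
  let lastUrl: string | null = null;
  let match: RegExpExecArray | null;
  // Keep the last occurrence: the effective comment sits at the end of the file.
  while ((match = pattern.exec(code)) !== null) {
    lastUrl = match[1];
  }
  if (lastUrl === null) return { kind: "none" };
  if (lastUrl.indexOf("data:") === 0) return { kind: "inline", dataUrl: lastUrl };
  return { kind: "external", url: lastUrl };
}

const externalSample = "console.log(1);\n//# sourceMappingURL=app.js.map\n";
const inlineSample = "console.log(1);\n//# sourceMappingURL=data:application/json;base64,e30=";

console.log(classifySourceMapComment(externalSample).kind); // "external"
console.log(classifySourceMapComment(inlineSample).kind);   // "inline"
console.log(classifySourceMapComment("console.log(1);").kind); // "none"
```

An `external` result still needs a file-existence check before it maps to a row in the table: a present map means recovery, a missing one means fetching it or accepting lower fidelity.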

## Eval safety in unpack

`unpack.ts` unwraps Dean Edwards Packer, AAEncode, and URLEncode layers. Packer arg parsing and AAEncode decoding use `new Function(...)`, which executes the input (sandboxed, but still eval).

<ParamField body="--no-eval" type="boolean">
Refuses all eval paths. Exits 0 with input unchanged and `evalRefused: true` in the result. URLEncode (`decodeURIComponent`) still runs—no eval needed.
</ParamField>

Stderr warns before each eval. For untrusted input (malware samples, scraped blobs), probe with `--no-eval` first to confirm a wrapper exists without executing it:

```bash
bun scripts/unpack.ts "$WS/original.js" --no-eval
bun scripts/deobfuscate.ts "$WS/original.js" --no-eval --report "$WS/report.json"
```

Pass `--no-eval` through `deobfuscate.ts` when running the full Stage 1 orchestrator on untrusted sources.
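
The idea of confirming a wrapper without executing it can be sketched as a purely textual check. The regex below is an assumption written for illustration; it is not the detector used by `unpack.ts`, and `looksPacked` is a hypothetical name. The sample strings are only pattern-matched, never evaluated.

```typescript
// Assumption: an illustrative pattern for the Dean Edwards Packer wrapper,
// not the detector implemented in unpack.ts.
const PACKER_SIGNATURE =
  /eval\(\s*function\s*\(\s*p\s*,\s*a\s*,\s*c\s*,\s*k\s*,\s*e\s*,\s*[dr]\s*\)/;

// Static probe: inspects the text only, so no untrusted code runs.
function looksPacked(code: string): boolean {
  return PACKER_SIGNATURE.test(code);
}

const packedSample =
  "eval(function(p,a,c,k,e,d){return p}('0 1',2,2,'a|b'.split('|'),0,{}))";
const renamedSample =
  "eval(function(s,t,u,v,w,x){return s}('0 1',2,2,'a|b'.split('|'),0,{}))";

console.log(looksPacked(packedSample));  // true
console.log(looksPacked(renamedSample)); // false: renamed parameters do not match
```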

## Wakaru caveats

`wakaru-normalize.ts` wraps `@wakaru/cli@1.5.0` as a pre-rename pass (Stage 2 Step 0b.5). It is default-on in the readable tier.

| Caveat | Detail | Recovery |
|--------|--------|----------|
| Byte offsets shift | wakaru rewrites the AST; `name@offset` ids captured before it are stale. | Extract and rename from `$WS/normalized.js`, never `original.js`. Order: `sourcemap-check → detect → (Stage 1) → wakaru → extract`. |
| Sourcemap first | Running wakaru rewrites bytes a `.map` indexes. | Skip wakaru when a usable map exists. |
| Not a deobfuscator | Recovers transpiler/minifier output, not Obfuscator.IO / Packer / control-flow flattening. | Run Stage 1 first on obfuscated input. |
| Not a semantic renamer | `smart_rename` is the same deterministic heuristic as `smart-rename.ts`. | Skill rename remains the hard bar. |
| `--unpack` forks restore root | `--unpack` re-derives module boundaries and filenames that match nothing in `manifest.json` / `ledger.json` / `CHUNK_NAME_REGISTRY`. | Use `--unpack` only on a single scope-hoisted bundle. On an already-split chunk tree (e.g. `ref/webview/assets`), normalize chunk bodies only; rebuild the graph from real chunk files. |
| Fidelity at `aggressive` / `--dce` | Default is `--level standard`. `--dce` can drop side-effecting code. | Use `--level minimal` for fidelity-critical code; `aggressive` only with behavioral sanity check. |
| Availability | When binary is unreachable, wrapper passes input through unchanged (`skipped`, exit 0). Genuine parse errors exit 2 with passthrough output. | Offline/CI runs degrade gracefully; check stderr for `skipped`. |

<Warning>
Do not let wakaru's `un_esm` substitute for `resolve-npm-imports.ts` in full-restoration. The import graph is built by `build-import-graph.ts` from real chunk files.
</Warning>

## Polish tier differences

The readable tier runs `polish.ts --fast` (reading-aid subset). Deep mode drops `--fast` to run the import-resolution tail.

| Pass | Readable (`--fast`) | Deep (no `--fast`) |
|------|---------------------|---------------------|
| `strip-react-compiler` | ✓ | ✓ |
| `simplify` | ✓ | ✓ |
| `jsx-runtime` | ✓ | ✓ |
| `inline-defaults` | ✓ | ✓ |
| `normalize-exports` | ✓ | ✓ |
| `react-shim-elim` | skipped | ✓ |
| `resolve-npm-imports` | skipped | ✓ |
| `npm-cjs-shim-elim` | skipped | ✓ |
| `dead-shim-elim` | skipped | ✓ |

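Import-resolution passes make output resolve against `node_modules` (compilability); they do not improve readability. The transforms in `polish.ts` are idempotent, so it is safe to re-run with different `--prefer` / `--skip` / `--stop-after` values. The one exception is the opt-in provenance header: each run with `--source` / `--description` prepends another copy.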
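
The tier split can be read as two pass lists plus a skip filter. The sketch below models only that selection; the pass names are from the table, while `selectPasses` and the list order are assumptions and may not reflect how `polish.ts` schedules passes internally.

```typescript
// Pass names are from the table above; the grouping and order here are illustrative.
const READING_AID_PASSES: string[] = [
  "strip-react-compiler",
  "simplify",
  "jsx-runtime",
  "inline-defaults",
  "normalize-exports",
];

const IMPORT_RESOLUTION_PASSES: string[] = [
  "react-shim-elim",
  "resolve-npm-imports",
  "npm-cjs-shim-elim",
  "dead-shim-elim",
];

// Hypothetical model of `--fast` and `--skip`: fast keeps only the reading-aid subset.
function selectPasses(fast: boolean, skip: string[] = []): string[] {
  const candidates = fast
    ? READING_AID_PASSES
    : READING_AID_PASSES.concat(IMPORT_RESOLUTION_PASSES);
  return candidates.filter((pass) => skip.indexOf(pass) === -1);
}

console.log(selectPasses(true).length);  // 5
console.log(selectPasses(false).length); // 9
console.log(selectPasses(true, ["simplify"]).indexOf("simplify")); // -1
```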

### Polish false positives and flags

| Transform | Risk | Mitigation |
|-----------|------|------------|
| `strip-react-compiler` | Any `let X = expr.c(N)` with numeric literal arg is treated as React Compiler cache. | `--skip strip-react-compiler` |
| `simplify` `(0, fn)(args)` → `fn(args)` | The rewrite can change `this`: Rollup emits the comma form specifically to call a function without a `this` binding. | `--skip simplify` or wakaru `--level minimal` |
| `jsx-runtime` | Rewrites any `.jsx` / `.jsxs` / `.jsxDEV` / `.Fragment` by name. | Audit after polish if you have user-defined `.jsx` methods |
| `inline-defaults` | Treats `x ?? d` as a destructuring default, but a default applies only to `undefined` while `??` also applies to `null`. | `--skip inline-defaults` for `null`-tolerant props |
| `--prefer` default | `polish.ts` defaults to `--prefer local`; standalone `normalize-exports.ts` defaults to `--prefer exported`. | Pass explicit `--prefer` when export alias matters |
| `--source` / `--description` | Provenance header is opt-in; re-running duplicates headers. | Run polish once per chunk with `--source ref/.../chunk.js` |
| Dead stubs | `var jsxRuntime = requireJsxRuntime();` may survive as unreferenced. | Delete manually after confirming no side effects |
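
Two of the risks in the table come down to plain JavaScript semantics, which the following self-contained sketch demonstrates. The object and function names are made up for illustration; only the language behavior is the point.

```typescript
// 1. `(0, obj.method)(args)` versus `obj.method(args)`: the comma form drops `this`.
const counter = {
  label: "counter",
  describe(this: unknown): string {
    // Read `label` defensively so the call also works without a receiver.
    const self = this as { label?: string } | undefined;
    return self && self.label ? self.label : "no this";
  },
};

const directCall = counter.describe();        // receiver is `counter`
const indirectCall = (0, counter.describe)(); // receiver is lost

console.log(directCall);   // "counter"
console.log(indirectCall); // "no this"

// 2. `??` versus a destructuring default: they disagree on `null`.
function withNullish(props: { size?: number | null }): number {
  return props.size ?? 10; // falls back on null AND undefined
}

function withDestructuredDefault({ size = 10 }: { size?: number | null }): number | null {
  return size; // falls back on undefined only
}

console.log(withNullish({ size: null }));             // 10
console.log(withDestructuredDefault({ size: null })); // null
console.log(withDestructuredDefault({}));             // 10
```

If a restored component passes `null` for an optional prop, the second difference is observable, which is the case the `--skip inline-defaults` mitigation covers.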

## Prettier gitignore trap

Prettier 3 defaults `--ignore-path` to `[".gitignore", ".prettierignore"]`. Restore deliverables often live under gitignored trees (`restored/`, `ref/` in this repo). A plain `prettier --write restored/` silently skips every file and reports success.

**Symptom:** 400-character lines and un-parenthesized multi-line JSX that prettier insists is already clean. Copy the file outside the gitignored tree and `prettier --check` flags it.

`scripts/format.ts` pins `--ignore-path .prettierignore` so `.gitignore` is bypassed. When invoking prettier directly:

```bash
prettier --write --ignore-path .prettierignore restored/path/to/file.tsx
```

`promote-organized.ts` formats each deliverable via `format.ts` as it lands. `quality-gate.ts --check-format` uses the same bypass and soft-skips when prettier is unreachable.

## Vendor-leaf entry misidentification

`check-entry.ts` prevents pointing `build-import-graph.ts` at a transitive vendor-leaf chunk (e.g. `main-BDm-p1LA.js` imported by dozens of siblings) and declaring the app restored.

A real app entry has large local fan-out and is imported by almost no other chunk. A vendor leaf is the inverse.

| Signal | Threshold | Meaning |
|--------|-----------|---------|
| `isRoot` | Referenced by `index.html` `<script>` or `modulepreload` | Likely real entry |
| `inDegree` | ≥ 5 siblings import this chunk | High in-degree suggests dependency, not entry |
| `localOutDegree` | ≤ 8 local sibling imports | Low out-degree suggests leaf |
| `looksVendored` | `CHUNK_NAME_REGISTRY` hit, basename pattern, or content fingerprint | Vendor package chunk |
| `suspicious` | NOT root AND in-degree ≥ 5 AND out-degree ≤ 8 | Transitive dependency, not app entry |

Exit codes: `0` ok · `3` suspicious · `1` I/O or none discovered · `64` usage.
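
The `suspicious` rule from the table is a simple conjunction. The sketch below restates it with the documented thresholds; the `ChunkSignals` shape and `isSuspicious` are hypothetical and may not match the fields `check-entry.ts` actually emits.

```typescript
// Hypothetical shape: field names follow the signal names in the table.
interface ChunkSignals {
  isRoot: boolean;         // referenced by index.html <script> or modulepreload
  inDegree: number;        // number of sibling chunks importing this one
  localOutDegree: number;  // number of local sibling chunks this one imports
  looksVendored: boolean;  // registry hit, basename pattern, or content fingerprint
}

// Thresholds taken from the table: NOT root AND in-degree >= 5 AND out-degree <= 8.
function isSuspicious(signals: ChunkSignals): boolean {
  return !signals.isRoot && signals.inDegree >= 5 && signals.localOutDegree <= 8;
}

const vendorLeaf: ChunkSignals = { isRoot: false, inDegree: 40, localOutDegree: 2, looksVendored: true };
const subtreeEntry: ChunkSignals = { isRoot: false, inDegree: 0, localOutDegree: 305, looksVendored: false };
const htmlRoot: ChunkSignals = { isRoot: true, inDegree: 6, localOutDegree: 3, looksVendored: false };

console.log(isSuspicious(vendorLeaf));   // true
console.log(isSuspicious(subtreeEntry)); // false
console.log(isSuspicious(htmlRoot));     // false
```

Note that `looksVendored` is carried as a signal but, per the table, does not take part in the `suspicious` condition itself.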

<Steps>
<Step title="Discover the real entry">

```bash
bun scripts/check-entry.ts --discover --root ref/webview/assets [--index ref/webview/index.html]
```

Prefers non-suspicious `<script>` roots over `modulepreload` (preload roots are dependencies, not entries).

</Step>
<Step title="Verify a manual candidate">

```bash
bun scripts/check-entry.ts ref/webview/assets/app-main-XXXX.js --root ref/webview/assets
```

A high local out-degree (e.g. 305) keeps a subtree entry like `app-main-*` from being flagged even when it is not referenced in `index.html`.

</Step>
<Step title="Recover from misidentification">

Re-run `build-import-graph.ts` from the correct `index.html` script root or highest-fan-out non-vendored chunk. Inspect `manifest.json` closure size—vendor-leaf restores cover only a tiny subgraph.

</Step>
</Steps>

## Stage 2 rename caveats

| Issue | Cause | Recovery |
|-------|-------|------------|
| Program-scope-only rename | Pass 1 ran with `--scope-kind Program`; function bodies untouched | Re-extract from Pass-1 output with `--only-cryptic --min-refs 2` (no scope filter); apply Pass 2 |
| Apply renames 0 symbols | `renames.json` ids stale after Stage 1 or wakaru | Re-run `extract.ts`; rebuild `renames.json` from fresh `name@offset` ids |
| `_foo` / `__foo` chains | Target names collide within the same scope (shadowing) | Pick more specific names or accept prefixes |
| Stale names in strings | `scope.rename` does not update `eval`, `Function(...)`, or `window['a']` | Fix manually |
| 50 MB `symbols.json` | Unfiltered extract on huge bundle | `--only-cryptic --min-refs 3 --top 200 --max-same-scope 5 --context-size 300` |
| Properties not renamed | `obj.foo` and class methods are properties, not bindings | Expected—callers may depend on names |

<Note>
Do not re-rename an already-renamed file with the same `renames.json`—ids shift after generator reformatting. Pass 2 of the multi-pass workflow builds a new `renames.json` from Pass-1 output ids.
</Note>
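
A quick staleness check can explain an apply step that renames 0 symbols. The sketch below assumes that `renames.json` maps `name@offset` ids to new names, that the offset points at the first character of the identifier, and that the source is ASCII so byte and character offsets coincide. None of these are confirmed by the skill docs, and `findStaleIds` is a hypothetical helper.

```typescript
// Assumptions (not confirmed by the skill): ids look like `name@offset`, the offset
// marks the start of the identifier, and the source is ASCII-only.
function findStaleIds(source: string, renames: Record<string, string>): string[] {
  const stale: string[] = [];
  for (const id of Object.keys(renames)) {
    const at = id.lastIndexOf("@");
    const identifier = id.slice(0, at);
    const offset = Number(id.slice(at + 1));
    // An id is fresh only if the identifier text really sits at that offset.
    const fresh =
      at > 0 && Number.isInteger(offset) && source.startsWith(identifier, offset);
    if (!fresh) stale.push(id);
  }
  return stale;
}

const beforeRewrite = "var a=1;function b(){return a}";
const afterRewrite = "var a = 1;\nfunction b() {\n  return a;\n}";
const sampleRenames: Record<string, string> = { "a@4": "count", "b@17": "getCount" };

console.log(findStaleIds(beforeRewrite, sampleRenames)); // []
console.log(findStaleIds(afterRewrite, sampleRenames));  // ["b@17"]
```

In the second call `a@4` still happens to line up after reformatting, which shows why a textual match is only a necessary condition. Re-running `extract.ts` on the current file remains the reliable fix.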

## Decoder indirection and control-flow flattening

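**String-array 0 replacements:** stderr reports `decoderIndirection: true` when arrays are accessed through a wrapper function. In that case, run `simplify.ts` first (it inlines small constant functions), then re-run `string-array`. This is a targeted exception to the Stage 1 ordering table, which otherwise places `simplify` after `string-array`.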

**Control-flow flattening:** `control-flow-report.ts` detects `while-switch` flatteners, split-string dispatches, and opaque predicates. It reports `rewriteHint` per item but does not mutate source. Automatic CFG reconstruction is unreliable—trace the dispatch graph manually (see Example 6 in the skill reference) and re-run `simplify.ts` on the rewritten code.
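
For readers unfamiliar with the pattern, here is a toy example of a `while-switch` flattener driven by a split-string dispatch, next to its manually recovered form. Both functions are invented for illustration and are not taken from the skill reference or from any real bundle.

```typescript
// Flattened form: a split-string dispatch decides which case runs next.
function flattened(input: number): number {
  const order = "2|0|1".split("|");
  let index = 0;
  let value = input;
  while (true) {
    switch (order[index++]) {
      case "0":
        value = value * 2;
        continue;
      case "1":
        return value - 1;
      case "2":
        value = value + 3;
        continue;
    }
    break; // reached only if the dispatch string runs out of cases
  }
  return value;
}

// Manually recovered form: replay the cases in dispatch order 2, 0, 1.
function recovered(input: number): number {
  let value = input + 3; // case "2"
  value = value * 2;     // case "0"
  return value - 1;      // case "1"
}

console.log(flattened(4)); // 13
console.log(recovered(4)); // 13
```

Tracing the dispatch string gives the execution order of the case bodies; once they are written out linearly, the loop and switch disappear and `simplify.ts` has ordinary straight-line code to fold.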

## Troubleshooting quick reference

<AccordionGroup>
<Accordion title="unpack says no packer detected but eval wrapper is visible">

The detector requires the exact `(p, a, c, k, e, d|r)` parameter signature, so custom wrappers or renamed parameters won't match. Rename the parameters to the standard signature, or paste the inner `function(p,a,c,k,e,d){...}('...', N, ...)` body into a JS REPL and execute it manually. Manual execution runs the payload, so do this only with input you trust or inside an isolated environment.

</Accordion>
<Accordion title="deobfuscate orchestrator says step X errored but kept going">

This is by design: one failing AST pass does not abort the rest. Read the `--report` JSON, fix the input (often a syntax error left by earlier mangling), or `--skip` that step.

</Accordion>
<Accordion title="Output still has jsx-runtime calls or backtick literals after rename">

Polish was skipped. Run `bun scripts/polish.ts <renamed.js> --out <polished.js>` (add `--fast` for the readable tier).

</Accordion>
<Accordion title="Output still has cache[N] or react.c(N) scaffolding">

`strip-react-compiler` was skipped. Re-run polish without `--skip strip-react-compiler`.

</Accordion>
<Accordion title="webcrack left file unsplit">

webcrack handles webpack-shaped bundles best. Rollup/esbuild/Vite output is flat ESM: feed the original into extract/apply, or use wakaru `--unpack` on a single scope-hoisted bundle only (never on an already-split tree).

</Accordion>
<Accordion title="Local import path renamed and imports broke">

Do not rewrite local import paths—even hash suffixes are real filenames on disk. Only resolve to bare npm specifiers when certain the binding is a vendored copy of that package.

</Accordion>
<Accordion title="Two Restored-from headers">

`polish.ts --source` was run twice. Polish always prepends the header, so delete the duplicate or run polish only once per chunk.

</Accordion>
</AccordionGroup>

## Related pages

<CardGroup>
<Card title="Restoration pipeline" href="/restoration-pipeline">
Stage 1, Stage 2, and Stage 3 contracts; readable vs deep depth; whole-tree vs single-file scope.
</Card>
<Card title="Handle obfuscated input" href="/obfuscated-input">
Stage 1 detect/unpack/string-array/decode-strings/simplify ordering and orchestrator flags.
</Card>
<Card title="Stage 2 scripts reference" href="/stage-2-scripts">
Polish flags, wakaru-normalize, extract filters, and rename pass controls.
</Card>
<Card title="Full tree restoration" href="/full-tree-restoration">
Entry discovery, import-graph orchestration, and quality-gate over the whole target.
</Card>
<Card title="Quality bar and anti-patterns" href="/quality-bar-and-anti-patterns">
Readable and deep-tier completion criteria, naming anti-patterns, and quality-gate failure modes.
</Card>
<Card title="External tools and dependencies" href="/external-tools-and-dependencies">
Bun, Prettier, wakaru, webcrack, source-map-explorer availability and graceful degradation.
</Card>
</CardGroup>
