# Restoration pipeline

> Stage 1 deobfuscation, Stage 2 rename and polish, and Stage 3 semantic finalize; readable vs deep depth; whole-tree vs single-file scope; and the restoration contract that defines done.

- Repository: JimLiu/decode-codex
- GitHub: https://github.com/JimLiu/decode-codex
- Human docs: https://grok-wiki.com/public/docs/jimliu-decode-codex-1a3a0c425b33
- Complete Markdown: https://grok-wiki.com/public/docs/jimliu-decode-codex-1a3a0c425b33/llms-full.txt

## Source Files

- `.agents/skills/deobfuscate-javascript/SKILL.md`
- `.agents/skills/deobfuscate-javascript/stages/stage-1-deobfuscate.md`
- `.agents/skills/deobfuscate-javascript/stages/stage-2-restore.md`
- `.agents/skills/deobfuscate-javascript/stages/stage-3-finalize.md`
- `.agents/skills/deobfuscate-javascript/scripts/deobfuscate.ts`
- `.agents/skills/deobfuscate-javascript/scripts/polish.ts`

---

---
title: "Restoration pipeline"
description: "Stage 1 deobfuscation, Stage 2 rename and polish, and Stage 3 semantic finalize; readable vs deep depth; whole-tree vs single-file scope; and the restoration contract that defines done."
---

The `deobfuscate-javascript` skill turns minified or obfuscated JavaScript into human-maintainable source through three ordered stages, two independent axes (scope and depth), and a staging → organize → promote discipline that keeps `restored/` deliverable-only. Stage 1 and Stage 2 produce the readable restore; Stage 3 is the deep-tier add-on for typed `.tsx` output and acceptance review.

## Two axes: scope and depth

Scope and depth are chosen independently. Scope follows input shape; depth follows user intent.

| Axis | Option | Trigger | Pipeline shape |
|------|--------|---------|----------------|
| **Scope** | Whole tree | `index.html` + sibling asset tree | Auto-discover entry → BFS every reachable project-local chunk via `manifest.json` / `ledger.json` |
| **Scope** | Single file | Lone snippet, no `index.html`/tree | Per-chunk `$WS/` workspace only; no import-graph orchestration |
| **Depth** | Deep (whole-tree default) | "deep", "full", "typed", "production", "完整", "深度", or restore-the-whole-tree | Stage 3 typed rewrite + acceptance review + full npm-import resolution + drain every chunk to `promoted` |
| **Depth** | Readable (quick opt-out) | "quick", "readable", "快速", lone pasted snippet | Meaningful names + reading-aid polish (`polish.ts --fast`); untyped output acceptable |

<Note>
Deep is the completion bar for whole-tree restores, not an upsell. A whole-tree restore is incomplete while any reachable project-local chunk lacks `stages.promoted` or `quality-gate.ts <target>` fails.
</Note>
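
The two axes can be sketched as two independent functions. This is a minimal illustration, not the skill's implementation: `InputShape`, `chooseScope`, and `chooseDepth` are hypothetical names, and the rule that an explicit quick/readable opt-out beats a deep keyword is an assumption.

```typescript
type Scope = "whole-tree" | "single-file";
type Depth = "deep" | "readable";

// Hypothetical description of what the user handed over.
interface InputShape {
  hasIndexHtml: boolean;
  hasAssetTree: boolean;
}

// Trigger words copied from the table above.
const DEEP_WORDS = ["deep", "full", "typed", "production", "完整", "深度"];
const READABLE_WORDS = ["quick", "readable", "快速"];

// Scope follows input shape.
function chooseScope(input: InputShape): Scope {
  return input.hasIndexHtml && input.hasAssetTree ? "whole-tree" : "single-file";
}

// Depth follows user intent. With no explicit signal, a whole tree defaults
// to deep and a lone snippet defaults to readable.
function chooseDepth(request: string, scope: Scope): Depth {
  const text = request.toLowerCase();
  // Assumption: an explicit opt-out wins over a deep keyword.
  if (READABLE_WORDS.some((w) => text.includes(w))) return "readable";
  if (DEEP_WORDS.some((w) => text.includes(w))) return "deep";
  return scope === "whole-tree" ? "deep" : "readable";
}
```

Keeping the two functions separate mirrors the rule that scope and depth are chosen independently: a single file can still be restored at deep depth if the request asks for it.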

## Pipeline overview

```mermaid
flowchart TB
  subgraph preflight["Preflight"]
    SM[sourcemap-check]
    CE[check-entry / discover]
  end

  subgraph stage1["Stage 1 — Deobfuscation (obfuscated only)"]
    D1[detect]
    D2[unpack]
    D3[string-array]
    D4[decode-strings]
    D5[simplify]
    D6[control-flow-report]
    DO[deobfuscate.ts orchestrator]
  end

  subgraph stage2["Stage 2 — Restore to readable"]
    WN[wakaru-normalize]
    RN[Rename: extract → smart-rename → apply]
    PO[Polish: strip-react-compiler → simplify → jsx-runtime → …]
    FM[format.ts]
  end

  subgraph stage3["Stage 3 — Finalize (deep only)"]
    SR[Semantic rewrite D0–D7]
    PO2[plan-organize → promote-organized]
    AR[Acceptance review E1–E4]
    QG[quality-gate.ts target audit]
  end

  SM --> CE
  CE -->|obfuscated| stage1
  CE -->|minified| WN
  stage1 --> WN
  WN --> RN --> PO --> FM
  FM -->|readable| PROMOTE[organize → promote → restored/]
  FM -->|deep| SR --> PO2 --> AR --> QG
  PROMOTE -->|deep continues| SR
```

### Critical ordering rules

Byte-rewriting passes invalidate `extract.ts` offsets and `renames.json` keys. The enforced order is:

1. **sourcemap-check first**: a usable `.map` beats any rename pipeline.
2. **Stage 1 before Stage 2** on obfuscated input (skip Stage 1 on purely minified input).
3. **wakaru-normalize after Stage 1**, before any `extract` / `apply`: rename from `normalized.js`, never `original.js`.

<Warning>
Running `simplify` before `string-array` splits rotation IIFEs and breaks string-array matching. Running `extract` before Stage 1 or wakaru produces stale symbol IDs.
</Warning>
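
A planned step list can be checked against these rules before anything runs. The sketch below is illustrative only: `ORDER_RULES` and `findOrderingViolations` are hypothetical names, and the rule table encodes just the constraints stated in this section.

```typescript
// If both steps appear in a plan, `before` must run earlier than `after`.
interface OrderRule {
  before: string;
  after: string;
  reason: string;
}

const ORDER_RULES: OrderRule[] = [
  { before: "string-array", after: "simplify", reason: "simplify splits rotation IIFEs" },
  { before: "wakaru-normalize", after: "extract", reason: "extract offsets go stale" },
  { before: "wakaru-normalize", after: "apply", reason: "renames.json keys go stale" },
];

function findOrderingViolations(plan: string[]): string[] {
  const problems: string[] = [];

  // sourcemap-check, when planned, has to be the very first step.
  if (plan.indexOf("sourcemap-check") > 0) {
    problems.push("sourcemap-check must run first");
  }

  for (const rule of ORDER_RULES) {
    const beforeAt = plan.indexOf(rule.before);
    const afterAt = plan.indexOf(rule.after);
    if (beforeAt !== -1 && afterAt !== -1 && beforeAt > afterAt) {
      problems.push(`${rule.before} must run before ${rule.after}: ${rule.reason}`);
    }
  }
  return problems;
}
```

For example, `findOrderingViolations(["simplify", "string-array"])` reports one violation, while a plan that lists `string-array` before `simplify` reports none.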

## Stage 1 — Deobfuscation

Stage 1 unwinds obfuscator transforms with pure Babel passes — no LLM dependency. It handles Packer/AAEncode/URLEncode packing, Obfuscator.IO string arrays, hex/unicode/base64 escapes, dead code, and opaque predicates.

| Step | Script | Purpose |
|------|--------|---------|
| detect | `detect.ts` | Classify techniques; emit `recommendation` |
| unpack | `unpack.ts` | Unwrap layered packers (uses `new Function`; `--no-eval` to refuse) |
| string-array | `string-array.ts` | Inline `_0x` array lookups; remove rotation IIFE |
| decode-strings | `decode-strings.ts` | Resolve `String.fromCharCode`, `atob`, `\xNN`, `\uNNNN` |
| simplify | `simplify.ts` | Constant folding, dead-code removal, literal inlining |
| control-flow-report | `control-flow-report.ts` | Report-only: `while(true){switch}` flatteners |

The `deobfuscate.ts` orchestrator runs these in order, re-running `detect` after `unpack`, with per-step try/catch so one failure does not abort the rest.

<ParamField body="--skip" type="string">
Comma-separated step names to skip. Valid: `detect`, `unpack`, `string-array`, `decode-strings`, `simplify`, `control-flow-report`.
</ParamField>

<ParamField body="--stop-after" type="string">
Run through the named step, then stop.
</ParamField>

<ParamField body="--no-eval" type="boolean">
Refuse eval-based unpack (exits 0, input unchanged, `evalRefused: true`).
</ParamField>
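
The orchestration behaviour described above (fixed order, per-step try/catch, `--skip`, `--stop-after`) can be sketched as a small loop. This is not the code in `deobfuscate.ts`: `StepFn`, `runStage1`, and the result shape are hypothetical, and the re-run of `detect` after `unpack` is left out for brevity.

```typescript
type StepName =
  | "detect"
  | "unpack"
  | "string-array"
  | "decode-strings"
  | "simplify"
  | "control-flow-report";

const STEP_ORDER: StepName[] = [
  "detect", "unpack", "string-array", "decode-strings", "simplify", "control-flow-report",
];

// Hypothetical step signature: source text in, transformed text out.
type StepFn = (code: string) => string;

interface StepResult {
  step: StepName;
  ok: boolean;
  error?: string;
}

function runStage1(
  code: string,
  steps: Record<StepName, StepFn>,
  opts: { skip?: StepName[]; stopAfter?: StepName } = {},
): { code: string; results: StepResult[] } {
  const results: StepResult[] = [];
  let current = code;
  for (const step of STEP_ORDER) {
    if (!opts.skip?.includes(step)) {
      try {
        current = steps[step](current);
        results.push({ step, ok: true });
      } catch (err) {
        // A failing step leaves the code untouched and the run continues.
        results.push({ step, ok: false, error: String(err) });
      }
    }
    if (opts.stopAfter === step) break;
  }
  return { code: current, results };
}
```

Recording each step's outcome instead of throwing is what lets one failure coexist with useful output from the remaining steps.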

Control-flow flattening detected by `control-flow-report` requires agent-driven manual rewrite — automatic CFG reconstruction is unreliable.
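
To make the `decode-strings` row in the step table concrete, here is a toy, text-level decoder for `\xNN` and `\uNNNN` escapes. The real script is described as a Babel pass, so this regex version is only an approximation: `decodeEscapes` and `isSafeToInline` are hypothetical names, and the safe-character set is an assumption chosen so that decoding cannot break a quoted literal.

```typescript
// Only inline characters that cannot terminate or alter a string literal.
function isSafeToInline(ch: string): boolean {
  return /^[A-Za-z0-9 _.,:;\/\-]$/.test(ch);
}

function decodeEscapes(source: string): string {
  // The first alternative consumes an escaped backslash so `\\x41` is left alone.
  const pattern = /\\\\|\\x([0-9a-fA-F]{2})|\\u([0-9a-fA-F]{4})/g;
  return source.replace(pattern, (match: string, hex2?: string, hex4?: string) => {
    const hex = hex2 ?? hex4;
    if (hex === undefined) return match; // escaped backslash: keep as-is
    const ch = String.fromCharCode(parseInt(hex, 16));
    return isSafeToInline(ch) ? ch : match;
  });
}
```

Escapes that decode to quotes, backslashes, or control characters are kept in escaped form, which keeps the rewritten source parseable.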

## Stage 2 — Restore to readable

Stage 2 has two phases: **rename** (where readability is won) and **polish** (undo bundler/compiler transforms).

### Phase A — Rename

<Steps>
<Step title="Check sourcemap">
Run `sourcemap-check.ts`. If a map exists, recover via `source-map-explorer` instead of renaming.
</Step>
<Step title="Archive and normalize">
Create the per-chunk workspace `$WS=<target>/.deobfuscate-javascript/<basename>/` and copy the input into it as `original.js`. Then run `wakaru-normalize.ts` (default-on in the readable tier; it auto-skips when `@wakaru/cli` is absent).
</Step>
<Step title="Extract and name">
`extract.ts` emits symbols sorted largest-scope-first. Run `smart-rename.ts` first for mechanical cases (~80%); hand-name the residue. Apply with `apply.ts`.
</Step>
<Step title="Verify density">
Run lexical and binding sweeps. Any single-letter reference count > 50 means Pass 2 (function bodies) was skipped. `quality-gate.ts --allow-flat` catches unfinished renames.
</Step>
</Steps>
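
A lexical density sweep can be approximated with a regex count of single-letter identifiers. This is a rough sketch, not the skill's sweep: `countSingleLetterRefs` and `overThreshold` are hypothetical names, the scan also counts letters inside strings and comments, and applying the threshold of 50 per identifier (rather than in total) is an assumption.

```typescript
// Count occurrences of each one-character identifier (a-z, A-Z, _, $).
function countSingleLetterRefs(code: string): Record<string, number> {
  const counts: Record<string, number> = {};
  // Group 1 is the non-identifier character before the name (or start of input).
  const pattern = /(^|[^\w$])([A-Za-z_$])(?![\w$])/g;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(code)) !== null) {
    const name = m[2];
    counts[name] = (counts[name] ?? 0) + 1;
  }
  return counts;
}

// Names whose reference count exceeds the limit.
function overThreshold(code: string, limit = 50): string[] {
  const counts = countSingleLetterRefs(code);
  return Object.keys(counts).filter((name) => counts[name] > limit);
}
```

A non-empty result from `overThreshold` is a hint that function bodies were not renamed; it does not replace the binding sweep or `quality-gate.ts`.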

The default one-shot combines mechanical rename + reading-aid polish:

```bash
bun scripts/polish.ts "$WS/normalized.js" \
  --rename --fast \
  --source ref/webview/assets/button-bq66r8jD.js \
  --out "$WS/draft.tsx" \
  --format
```

<Warning>
Program-scope-only rename is the top failure mode: top-level exports look fine while function bodies still read `let k = useIntl(), [A,M] = useState(false)`. Keep renaming inward, into function bodies, until single-letter density is low.
</Warning>

### Phase B — Polish

`polish.ts` runs a chain of idempotent transforms. Two profiles exist:

| Profile | Flag | Steps included | Purpose |
|---------|------|----------------|---------|
| Reading-aid subset | `--fast` (readable default) | `strip-react-compiler`, `simplify`, `jsx-runtime`, `inline-defaults`, `normalize-exports` | How code *reads* |
| Import-resolution tail | no `--fast` (deep only) | + `react-shim-elim`, `resolve-npm-imports`, `npm-cjs-shim-elim`, `dead-shim-elim` | Imports resolve against `node_modules` |

When `--rename` is set, `polish.ts` runs `smart-rename` + `apply` before the polish chain. `--source` prepends `// Restored from <path>`; `--description` adds a second summary line.

```ts title=".agents/skills/deobfuscate-javascript/scripts/polish.ts (lines 47–61)"
/**
 * The `--fast` (default readable-tier) profile skips the import-resolution /
 * shim-elimination tail. Those passes make the output resolve against
 * `node_modules` (a compilability concern); they do not improve how the code
 * *reads*. Skipping them keeps the readability passes — strip-react-compiler,
 * simplify, jsx-runtime, inline-defaults, normalize-exports — and is the right
 * default when "readable" matters more than "compiles". Drop `--fast` (deep
 * mode) to get resolvable npm imports.
 */
export const FAST_SKIP_STEPS: PolishStep[] = [
  "react-shim-elim",
  "resolve-npm-imports",
  "npm-cjs-shim-elim",
  "dead-shim-elim",
];
```
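
One way a skip list like this could select the steps for a run is a simple filter. The sketch below is an illustration, not the logic in `polish.ts`: `FULL_CHAIN` and `stepsFor` are hypothetical, and the order of the full chain is assumed from the profile table.

```typescript
type PolishStep =
  | "strip-react-compiler"
  | "simplify"
  | "jsx-runtime"
  | "inline-defaults"
  | "normalize-exports"
  | "react-shim-elim"
  | "resolve-npm-imports"
  | "npm-cjs-shim-elim"
  | "dead-shim-elim";

// Assumed order: reading-aid passes first, import-resolution tail last.
const FULL_CHAIN: PolishStep[] = [
  "strip-react-compiler", "simplify", "jsx-runtime", "inline-defaults", "normalize-exports",
  "react-shim-elim", "resolve-npm-imports", "npm-cjs-shim-elim", "dead-shim-elim",
];

const FAST_SKIP_STEPS: PolishStep[] = [
  "react-shim-elim", "resolve-npm-imports", "npm-cjs-shim-elim", "dead-shim-elim",
];

// With `fast` set, drop the import-resolution tail; otherwise run everything.
function stepsFor(fast: boolean): PolishStep[] {
  return fast ? FULL_CHAIN.filter((s) => !FAST_SKIP_STEPS.includes(s)) : [...FULL_CHAIN];
}
```

Under these assumptions `stepsFor(true)` yields the five reading-aid passes and `stepsFor(false)` yields all nine.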

A well-named Stage 2 file with semantic filename and provenance header is a valid **readable-tier deliverable**. Types and import resolution are Stage 3 work.

## Stage 3 — Finalize (deep mode only)

Stage 3 turns mechanical checkpoints into idiomatic typed TypeScript. It is not required for readable-tier completion.

### Phase A — Semantic rewrite (D0–D7)

| Step | Action |
|------|--------|
| D0 | `quality-gate.ts` pre-filter — cryptic density, fallback names, bundler residue, flat multi-export files |
| D0.5 | Semantic kebab-case public filenames (component identifiers stay PascalCase) |
| D1 | Two-line provenance header |
| D2 | Import paths: bare npm specifiers; semantic paths between finalized siblings |
| D3–D4 | Delete dead runtime stubs; strip dangling sourcemap comments |
| D5 | TypeScript types via hand-edit or `semantic-finalize.ts --recipe icon` (or `--recipe button`) |
| D6 | `format.ts` directory pass after splits |
| D7 | `quality-gate.ts` exit 0 before acceptance review |

For whole-tree restores, drive Phase A through `plan-organize.ts` → `promote-organized.ts` rather than hand-walking hundreds of checkpoints.

### Phase B — Acceptance review (E1–E4)

After script gates pass, the host agent reads every delivered file end-to-end:

| Category | Readable tier | Deep tier |
|----------|---------------|-----------|
| E1 Naming | Hard bar (both tiers) | Hard bar |
| E2 Readability (Props, forwardRef, IconProps) | Optional | Required |
| E3 Formatting | Optional | Required |
| E4 Structure/imports | Optional | Required |

<Check>
No TODO-header fallback exists in deep mode. Every public file must pass E1–E4 before completion is declared.
</Check>

The review loop: pre-filter with `quality-gate.ts` → read against checklist → repair `NEEDS_FIX` files → re-read changed files only → repeat until clean.
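
The loop can be sketched as follows. The `review` and `repair` callbacks stand in for the host agent's reading and fixing, all names are hypothetical, the `quality-gate.ts` pre-filter is omitted, and the `maxRounds` cap is an added safety assumption rather than part of the documented process.

```typescript
type Verdict = "PASS" | "NEEDS_FIX";

interface ReviewHooks {
  review: (file: string) => Verdict; // read the file against the E1–E4 checklist
  repair: (file: string) => void;    // fix a file that was marked NEEDS_FIX
}

function acceptanceLoop(
  files: string[],
  hooks: ReviewHooks,
  maxRounds = 10,
): { rounds: number; clean: boolean } {
  let toRead = [...files];
  let rounds = 0;
  while (toRead.length > 0 && rounds < maxRounds) {
    rounds++;
    const needsFix = toRead.filter((f) => hooks.review(f) === "NEEDS_FIX");
    needsFix.forEach((f) => hooks.repair(f));
    // Only files that changed in this round are re-read in the next one.
    toRead = needsFix;
  }
  return { rounds, clean: toRead.length === 0 };
}
```

A repaired file is always read again, so the loop only ends clean once every file has passed a read in its final form.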

## Scope: whole tree vs single file

### Whole-tree restore (default scope)

When input is an app (`index.html` + asset tree):

1. Auto-discover the entry: run `check-entry.ts --discover --root <assets-dir>`, or omit the positional argument to `build-import-graph.ts`.
2. Build `manifest.json` + `ledger.json` under `restored/.deobfuscate-javascript/_full/`.
3. Iterate `ledger.ts frontier` — extract → rename → polish per chunk with `O_EXCL` locks.
4. Batch checkpoint via `auto-restore-full.ts` (writes to `_full/checkpoints/`, never `restored/`).
5. Organize → promote: `plan-organize.ts` → `promote-organized.ts`.
6. Stage 3 acceptance + `quality-gate.ts <target-dir>`.

Terminal chunk kinds stop BFS but are recorded in the manifest:

| Kind | Treatment |
|------|-----------|
| `npm-leaf` | Consumer imports rewritten to bare specifier; chunk not restored |
| `external` | Same as npm-leaf |
| `oversized-local` | Only with explicit `--max-lines N` (quick mode, not deep) |
| `faced-boundary` | `make-facade.ts` scaffold; open boundary until deep-restored |

Project/feature chunks (`app-shell-*`, pages, panels, hooks) never get a facade: they must be restored recursively.
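
The traversal rule (terminal kinds are recorded but not expanded) can be sketched as a plain BFS. This is an illustration only: `walkImportGraph`, `ManifestEntry`, and the `"local"` kind are hypothetical, and the callbacks stand in for whatever `build-import-graph.ts` actually reads.

```typescript
type ChunkKind = "local" | "npm-leaf" | "external" | "oversized-local" | "faced-boundary";

interface ManifestEntry {
  id: string;
  kind: ChunkKind;
}

const TERMINAL_KINDS: ChunkKind[] = ["npm-leaf", "external", "oversized-local", "faced-boundary"];

function walkImportGraph(
  entry: string,
  importsOf: (id: string) => string[],
  kindOf: (id: string) => ChunkKind,
): ManifestEntry[] {
  const manifest: ManifestEntry[] = [];
  const seen = new Set<string>([entry]);
  const queue: string[] = [entry];
  while (queue.length > 0) {
    const id = queue.shift() as string;
    const kind = kindOf(id);
    manifest.push({ id, kind });
    // Terminal chunks are recorded in the manifest but their imports are not followed.
    if (TERMINAL_KINDS.includes(kind)) continue;
    for (const dep of importsOf(id)) {
      if (!seen.has(dep)) {
        seen.add(dep);
        queue.push(dep);
      }
    }
  }
  return manifest;
}
```

Because the walk stops at terminal chunks, anything imported only by an `npm-leaf` never enters the manifest.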

### Single-file restore (fallback scope)

When input is a lone snippet or isolated chunk:

- Same per-file pipeline: Stage 1 (if obfuscated) → wakaru → rename → `polish.ts --rename --fast` → format.
- No `manifest.json` / `ledger.json` orchestration.
- Promote from `$WS/` into `restored/` after organizing the draft.

## Workspace and staging discipline

All intermediates live under the target's hidden workspace; `restored/` is deliverable-only.

:::files
restored/
├── button.tsx                          # promoted deliverable (semantic kebab name)
├── app-shell/                          # semantic-domain subfolder
├── IMPORT_MAP.json                     # one shared map for the whole restore
└── .deobfuscate-javascript/
    ├── button-bq66r8jD/                # per-chunk $WS
    │   ├── original.js
    │   ├── normalized.js
    │   ├── renames.json
    │   └── draft.tsx
    └── _full/                          # whole-tree coordination
        ├── manifest.json
        ├── ledger.json
        ├── checkpoints/                # mechanical batch output (not deliverables)
        └── files/<basename>/           # same layout as single-chunk $WS
:::

<Warning>
Batch/script output (`auto-restore-full.ts`, hash-basename `.tsx`, `--write-target-checkpoints`) is a mechanical checkpoint, never a deliverable. Promote only after the promotion bar is met.
</Warning>

Promotion bar (every tier): semantic names throughout, no fallback names (`buttonValue3`, `contextParam14`), kebab-case filenames, semantic-domain directories, prettier-formatted output. Deep tier adds complete types (`Props` interfaces on exported components).
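
Two of these checks are easy to approximate mechanically. The sketch below is a guess at the shape of such checks, not what `quality-gate.ts` does: the fallback-name pattern is inferred from the two examples above, and `findFallbackNames` and `isKebabFilename` are hypothetical helpers.

```typescript
// Assumption: fallback names are a camelCase stem + "Value" or "Param" + digits.
const FALLBACK_NAME = /\b[a-z][A-Za-z]*(?:Value|Param)\d+\b/g;

// Lowercase words and digits joined by hyphens, with a .ts or .tsx extension.
const KEBAB_FILE = /^[a-z0-9]+(?:-[a-z0-9]+)*\.tsx?$/;

// Distinct fallback-looking identifiers found in the code.
function findFallbackNames(code: string): string[] {
  return Array.from(new Set(code.match(FALLBACK_NAME) ?? []));
}

function isKebabFilename(name: string): boolean {
  return KEBAB_FILE.test(name);
}
```

Under this pattern a hash basename such as `button-bq66r8jD.tsx` fails the filename check because of its uppercase character, while `download-icon.tsx` passes.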

## Restoration contract

### Readable tier — done when

1. Entry discovered (whole tree) or single file scoped.
2. `sourcemap-check` run; Stage 1 only if obfuscated.
3. wakaru-normalize → rename (`smart-rename` + hand-name residue) → `polish.ts --rename --fast` → `format`.
4. Draft organized and promoted into `restored/` with provenance header and `IMPORT_MAP.json` update.
5. Naming quality bar met: meaningful identifiers, no generated fallback names.

Types, npm-import resolution, and the E2–E4 acceptance loop are optional. An optional naming-only self-review against E1 is available.

### Deep tier — done when

Everything in the readable contract, plus:

1. `manifest.json` + `ledger.json` built for whole-tree scope.
2. Mechanical checkpoint produced (`auto-restore-full.ts` or per-chunk Stage 2).
3. Host-agent semantic rewrite: typed `.tsx`, semantic paths, resolved npm imports.
4. `quality-gate.ts` pre-filter passes (this proves the output is not catastrophically broken; it does not prove semantic completeness).
5. Acceptance review loop: every delivered file passes E1–E4.
6. Full-target audit: `quality-gate.ts <target-dir>` exits 0.

**Whole-tree completion proof** — all three must hold:

```
quality-gate.ts <target-dir>  →  exit 0
every reachable local chunk   →  stages.promoted (+ stages.finalized in deep)
ledger.ts frontier --stage promote  →  empty
```

<Info>
Do not infer completion from `IMPORT_MAP.status === "done"`, a `boundaries/` grep, or checkpoint counts alone. The target-level `quality-gate.ts` audit is the proof.
</Info>
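
The three-part proof can be written as a single predicate. The data shapes here are hypothetical (the real `ledger.json` layout is not shown on this page); only the three conditions themselves come from the contract above.

```typescript
interface ChunkState {
  id: string;
  reachable: boolean;
  local: boolean; // project-local, as opposed to npm-leaf / external
  stages: { promoted?: boolean; finalized?: boolean };
}

interface Evidence {
  qualityGateExitCode: number; // exit code of `quality-gate.ts <target-dir>`
  chunks: ChunkState[];
  promoteFrontier: string[];   // output of `ledger.ts frontier --stage promote`
  depth: "deep" | "readable";
}

function isWholeTreeComplete(e: Evidence): boolean {
  const everyChunkDone = e.chunks
    .filter((c) => c.reachable && c.local)
    .every(
      (c) =>
        c.stages.promoted === true &&
        (e.depth !== "deep" || c.stages.finalized === true),
    );
  return e.qualityGateExitCode === 0 && everyChunkDone && e.promoteFrontier.length === 0;
}
```

All three conditions are combined with a logical AND, so a passing gate with a non-empty promote frontier still counts as incomplete.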

### Delta restore

When `IMPORT_MAP.json` and `_full/manifest.json` already exist, prefer delta restore over rebuilding the whole graph: reuse manifest/ledger/map, restore only the scoped chunk, replace the mapped boundary/public file, validate changed paths only.

## Routing decision table

| Input signal | Route |
|--------------|-------|
| Usable `.map` | Recover from sourcemap; skip rename pipeline |
| Obfuscated (`_0x`, Packer, hex walls) | Stage 1 first, then Stage 2 |
| `index.html` + asset tree | Whole-tree restore at deep depth by default |
| Lone snippet / no tree | Single-file readable workflow |
| Existing restore + scoped chunk | Delta/boundary replacement |

## Default one-shot commands

<CodeGroup>

```bash title="Stage 1 orchestrator"
bun scripts/deobfuscate.ts input.js --out deobfuscated.js --report report.json
```

```bash title="Readable-tier Stage 2 one-shot"
bun scripts/polish.ts "$WS/normalized.js" \
  --rename --fast \
  --source ref/webview/assets/chunk-HASH.js \
  --out "$WS/draft.tsx" \
  --format
```

```bash title="Deep-tier polish (full chain)"
bun scripts/polish.ts "$WS/renamed.js" \
  --source ref/webview/assets/chunk-HASH.js \
  --out "$WS/polished.tsx" \
  --format
```

```bash title="Icon recipe (Stage 3)"
bun scripts/semantic-finalize.ts "$WS/polished.tsx" \
  --recipe icon \
  --source "$INPUT" \
  --out restored/icons/download-icon.tsx
```

</CodeGroup>

## Related pages

<CardGroup>
<Card title="Handle obfuscated input" href="/obfuscated-input">
Stage 1 detect → unpack → string-array ordering, eval safety, and decoder-indirection recovery.
</Card>
<Card title="Deobfuscate a single file" href="/deobfuscate-single-file">
Readable-tier workflow for isolated chunks without import-graph orchestration.
</Card>
<Card title="Full tree restoration" href="/full-tree-restoration">
Deep default workflow: entry discovery, manifest/ledger loop, organize → promote.
</Card>
<Card title="Workspace and output conventions" href="/workspace-and-output">
Per-chunk staging, promotion bar, kebab filenames, and `IMPORT_MAP.json`.
</Card>
<Card title="Quality bar and anti-patterns" href="/quality-bar-and-anti-patterns">
Naming anti-patterns, E1–E4 acceptance categories, and quality-gate failure modes.
</Card>
<Card title="Pipeline caveats" href="/pipeline-caveats">
Ordering traps, sourcemap precedence, wakaru guards, and recovery steps.
</Card>
</CardGroup>
