# Stage 1 scripts reference

> CLI signatures and behavior for detect, unpack, string-array, decode-strings, simplify, control-flow-report, and the deobfuscate.ts orchestrator with --skip and --stop-after controls.

- Repository: JimLiu/decode-codex
- GitHub: https://github.com/JimLiu/decode-codex
- Human docs: https://grok-wiki.com/public/docs/jimliu-decode-codex-1a3a0c425b33
- Complete Markdown: https://grok-wiki.com/public/docs/jimliu-decode-codex-1a3a0c425b33/llms-full.txt

## Source Files

- `.agents/skills/deobfuscate-javascript/stages/stage-1-deobfuscate.md`
- `.agents/skills/deobfuscate-javascript/scripts/detect.ts`
- `.agents/skills/deobfuscate-javascript/scripts/unpack.ts`
- `.agents/skills/deobfuscate-javascript/scripts/string-array.ts`
- `.agents/skills/deobfuscate-javascript/scripts/decode-strings.ts`
- `.agents/skills/deobfuscate-javascript/scripts/simplify.ts`
- `.agents/skills/deobfuscate-javascript/scripts/deobfuscate.ts`

---


Stage 1 deobfuscation lives under `.agents/skills/deobfuscate-javascript/scripts/`. Seven Bun entry points unwind packers, Obfuscator.IO string arrays, encoded literals, dead code, and opaque predicates using regex heuristics, string-level unpacking, and Babel AST transforms, with no LLM dependency. Run them on obfuscated input before Stage 2; on purely minified (not obfuscated) input, skip Stage 1 entirely.

<Info>
All Stage 1 scripts accept `-` as the input path to read from stdin. Transformed code or JSON reports go to stdout unless `--out` is set. Progress summaries and warnings go to stderr. A successful run that makes no changes still exits `0`.
</Info>

## Prerequisites

Run from the skill directory after `bun install`:

```bash
cd .agents/skills/deobfuscate-javascript
```

Every script is invoked as `bun scripts/<name>.ts`.

## Pipeline order

Stage 1 steps are order-sensitive. Each pass expects the code shape produced by the previous one.

```mermaid
sequenceDiagram
  participant CLI as deobfuscate.ts
  participant D as detect
  participant U as unpack
  participant SA as string-array
  participant DS as decode-strings
  participant S as simplify
  participant CF as control-flow-report

  CLI->>D: detect (regex, read-only)
  CLI->>U: unpack (packer → aaencode → urlencoded)
  Note over CLI,D: detect-after-unpack if unpack changed code
  CLI->>SA: inline _0x... arrays
  CLI->>DS: decode literals
  CLI->>S: fold / dead-code / inline
  CLI->>CF: CFG report (read-only)
```

| Step | Mutates source | Mechanism |
|------|----------------|-----------|
| `detect` | No | Regex heuristics |
| `unpack` | Yes | String replacement + optional `new Function` |
| `string-array` | Yes | Babel AST |
| `decode-strings` | Yes | Babel AST |
| `simplify` | Yes | Babel AST (multi-pass) |
| `control-flow-report` | No | Babel AST analysis |

<Warning>
Run Stage 1 before Stage 2. `extract.ts` byte offsets and pre-deobfuscation `renames.json` ids become stale after Stage 1 rewrites.
</Warning>

## Shared exit codes

| Code | Meaning |
|------|---------|
| `0` | Success (including no-op runs) |
| `1` | I/O failure (read/write) |
| `2` | Parse error (AST-based scripts) |
| `64` | Usage error (missing input, invalid flags, unknown step names) |
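
A wrapper that shells out to these scripts may want to turn the exit code into a readable message. The lookup below is a minimal sketch built only from the table above; the names `STAGE1_EXIT_CODES` and `describeExit` are hypothetical and do not come from the repository.

```typescript
// Exit codes copied from the table above. The map and helper names are
// illustrative assumptions, not exports of the Stage 1 scripts.
const STAGE1_EXIT_CODES = new Map<number, string>([
  [0, "success (including no-op runs)"],
  [1, "I/O failure (read/write)"],
  [2, "parse error (AST-based scripts)"],
  [64, "usage error (missing input, invalid flags, unknown step names)"],
]);

// Returns the documented meaning, or a fallback for codes the table does not list.
function describeExit(code: number): string {
  return STAGE1_EXIT_CODES.get(code) ?? `unexpected exit code ${code}`;
}

console.log(describeExit(64));
```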

## `detect.ts`

Scans source with regex heuristics and emits a JSON report. Does not parse or mutate code.

```bash
bun scripts/detect.ts <input.js|-> [--out report.json]
```

<ParamField body="input" type="positional string" required>
Path to a `.js` file, or `-` for stdin.
</ParamField>

<ParamField body="--out / -o" type="string">
Write JSON report to this path instead of stdout.
</ParamField>

<ResponseField name="techniques" type="array">
Each entry: `{ name, confidence, evidence }`. Sorted by `confidence` descending.
</ResponseField>

<ResponseField name="recommendation" type="string">
Ordered script list to run next, or guidance to skip Stage 1 and go to Stage 2 when no obfuscation is found.
</ResponseField>

**Detected technique names:** `packer`, `aaencode`, `urlencoded`, `string-array`, `string-array-rotation`, `hex-encoding`, `unicode-encoding`, `from-char-code`, `base64-decoding`, `control-flow-flattening`, `opaque-predicates`, `dead-code-injection`, `obfuscator-io`, `webpack`, `single-line-uglify`.

<RequestExample>
```bash
bun scripts/detect.ts fixtures/composite.min.js
```
</RequestExample>

<ResponseExample>
```json
{
  "input": "fixtures/composite.min.js",
  "size": 842,
  "techniques": [
    { "name": "string-array", "confidence": 0.9, "evidence": "_0x... array declaration with 12 indexed access(es)" }
  ],
  "recommendation": "Run, in order: scripts/string-array.ts → scripts/decode-strings.ts → scripts/simplify.ts. Or run scripts/deobfuscate.ts to do all at once."
}
```
</ResponseExample>

Stderr prints a one-line summary, e.g. `detect: 4 technique(s) — string-array(90%), from-char-code(85%), …`.
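
When consuming the JSON report from another tool, it can help to type it and filter by confidence. The sketch below assumes the report has exactly the fields shown in the response example (`input`, `size`, `techniques`, `recommendation`); the interface names and the `techniquesAbove` helper are hypothetical, and the second sample entry is invented for illustration.

```typescript
// Shapes inferred from the response example above; treat them as assumptions.
interface Technique {
  name: string;
  confidence: number; // 0..1 in the example
  evidence: string;
}

interface DetectReport {
  input: string;
  size: number;
  techniques: Technique[];
  recommendation: string;
}

// Hypothetical helper: names of techniques at or above a confidence threshold.
function techniquesAbove(report: DetectReport, minConfidence: number): string[] {
  return report.techniques
    .filter((t) => t.confidence >= minConfidence)
    .map((t) => t.name);
}

const sampleReport: DetectReport = {
  input: "fixtures/composite.min.js",
  size: 842,
  techniques: [
    { name: "string-array", confidence: 0.9, evidence: "_0x... array declaration with 12 indexed access(es)" },
    // Illustrative entry only; not taken from a real run.
    { name: "webpack", confidence: 0.4, evidence: "hypothetical low-confidence match" },
  ],
  recommendation: "Run, in order: scripts/string-array.ts → scripts/decode-strings.ts → scripts/simplify.ts.",
};

console.log(techniquesAbove(sampleReport, 0.8)); // [ 'string-array' ]
```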

## `unpack.ts`

Iteratively unpacks layered encodings. Per iteration, tries **Packer → AAEncode → URLEncode** until no layer matches or `--max-iterations` is reached.

```bash
bun scripts/unpack.ts <input.js|-> [--out output.js] [--max-iterations 5] [--no-eval]
```

<ParamField body="--max-iterations" type="integer" default="5">
Maximum unpack loop iterations. Must be a positive integer.
</ParamField>

<ParamField body="--no-eval" type="boolean" default="false">
Refuse Dean Edwards Packer and AAEncode unpacking. Input is left unchanged; `evalRefused` is set. URLEncode (`decodeURIComponent`) is unaffected.
</ParamField>

**Eval safety:** Packer arg-list parsing and AAEncode decoding use `new Function(...)`. The sandbox cannot read your filesystem, but it does execute code taken from the input. `--no-eval` refuses these paths and logs a warning to stderr.

<RequestExample>
```bash
bun scripts/unpack.ts packed.js --out unpacked.js --no-eval
```
</RequestExample>

Stderr examples:

- `unpack: 1 step(s) — packer(412→89) → unpacked.js`
- `unpack: no packer/aaencode/urlencode detected — unchanged`
- `unpack: no changes (eval refused)`
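
The iterate-until-no-layer-matches loop described for `unpack.ts` can be sketched as follows. This is not the script's implementation: only a URL-encoding layer is modelled (Packer and AAEncode are left out because they need to evaluate input code), and the `Layer` type, the matcher regex, and `unpackLayers` are assumptions made for the example.

```typescript
// A layer knows how to recognise itself and how to peel itself off.
type Layer = {
  name: string;
  matches: (src: string) => boolean;
  unpack: (src: string) => string;
};

// Simplified URL-encoding detector: the whole input consists of %XX escapes
// and unreserved characters, with at least one escape present.
const urlencodedLayer: Layer = {
  name: "urlencoded",
  matches: (src) => src.includes("%") && /^(?:%[0-9a-fA-F]{2}|[\w.~-])+$/.test(src),
  unpack: (src) => decodeURIComponent(src),
};

// Try layers in priority order each iteration; stop when none match
// or when the iteration budget is spent.
function unpackLayers(source: string, layers: Layer[], maxIterations = 5) {
  const steps: string[] = [];
  let code = source;
  for (let i = 0; i < maxIterations; i++) {
    const layer = layers.find((l) => l.matches(code));
    if (!layer) break;
    const next = layer.unpack(code);
    steps.push(`${layer.name}(${code.length}→${next.length})`);
    code = next;
  }
  return { code, steps };
}

// Two nested URL-encoding layers around a tiny program.
const doublyEncoded = encodeURIComponent(encodeURIComponent("var a = 1;"));
console.log(unpackLayers(doublyEncoded, [urlencodedLayer]));
```

Passing `maxIterations = 1` to the same call peels only the outer layer, which mirrors how `--max-iterations` bounds the loop.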

## `string-array.ts`

Inlines Obfuscator.IO `var _0x… = ['a','b',…]` arrays: applies rotation IIFEs, replaces indexed member access with string literals, removes dead array declarations.

```bash
bun scripts/string-array.ts <input.js|-> [--out output.js]
```

<Warning>
If string access is wrapped behind a decoder function (e.g. `_0xabc('0x1')`), the script collects arrays but replaces zero references (`decoderIndirection: true`). Run `simplify.ts` first to inline small decoder functions, then re-run `string-array.ts`.
</Warning>

Stderr reports counts: arrays collected, refs replaced, arrays removed, rotations applied. On decoder indirection, stderr includes a `WARNING` line.
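
To make the rotation and inlining steps concrete, here is a text-level sketch. The real script works on the Babel AST; this version uses a regex purely for illustration, and `rotate`, `inlineIndexedAccess`, and the array name `_0x1a2b` are hypothetical. It assumes the array name contains no regex metacharacters and that a rotation of `n` means moving the first element to the end `n` times.

```typescript
// Rotate left by `count`: equivalent to repeating array.push(array.shift()).
function rotate<T>(items: readonly T[], count: number): T[] {
  const out = [...items];
  for (let i = 0; i < count; i++) {
    const head = out.shift();
    if (head !== undefined) out.push(head);
  }
  return out;
}

// Replace `name[<index>]` with the string literal at that index.
// Out-of-range indexes are left untouched.
function inlineIndexedAccess(
  code: string,
  arrayName: string,
  strings: readonly string[],
): { code: string; replaced: number } {
  let replaced = 0;
  const pattern = new RegExp(`\\b${arrayName}\\[(0x[0-9a-fA-F]+|\\d+)\\]`, "g");
  const out = code.replace(pattern, (match: string, index: string) => {
    const value = strings[Number(index)];
    if (value === undefined) return match;
    replaced++;
    return JSON.stringify(value);
  });
  return { code: out, replaced };
}

const declared = ["log", "hello", "world"];
const rotated = rotate(declared, 1); // ["hello", "world", "log"]
const obfuscated = 'console[_0x1a2b[2]](_0x1a2b[0] + " " + _0x1a2b[0x1]);';
console.log(inlineIndexedAccess(obfuscated, "_0x1a2b", rotated).code);
// console["log"]("hello" + " " + "world");
```

A call such as `_0xabc('0x1')` is not an indexed member access, so a matcher like this replaces nothing there, which is the decoder-indirection case the warning above describes.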

## `decode-strings.ts`

Literal-only decoder pass on the AST.

```bash
bun scripts/decode-strings.ts <input.js|-> [--out output.js]
```

| Transform | Constraint |
|-----------|------------|
| `String.fromCharCode(72, 101, …)` | All arguments must be integer numeric literals in `0…0xffff` |
| `atob("…")` | Single string-literal argument; invalid base64 is skipped |
| `\xNN` / `\uNNNN` escapes | Normalized in string literals via Babel regeneration |

Variables and non-literal arguments are left untouched.

Stderr: `decode-strings: 2 fromCharCode, 1 atob, escapes normalized` or `decode-strings: nothing to decode`.
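
The literal-only constraint on `String.fromCharCode` can be illustrated with a small text-level sketch. It is an approximation, not the script's method: the real pass inspects AST nodes, whereas this regex would also match inside comments or string literals. The `foldFromCharCode` name is hypothetical.

```typescript
// Fold String.fromCharCode(...) calls whose arguments are all integer
// literals in 0..0xffff. Anything else (identifiers, expressions,
// out-of-range values) is left exactly as written.
function foldFromCharCode(code: string): { code: string; decoded: number } {
  let decoded = 0;
  const num = "(?:0x[0-9a-fA-F]+|\\d+)";
  const call = new RegExp(
    `String\\.fromCharCode\\(\\s*(${num}(?:\\s*,\\s*${num})*)\\s*\\)`,
    "g",
  );
  const out = code.replace(call, (match: string, args: string) => {
    const codes = args.split(",").map((a) => Number(a.trim()));
    const allValid = codes.every((c) => Number.isInteger(c) && c >= 0 && c <= 0xffff);
    if (!allValid) return match;
    decoded++;
    return JSON.stringify(String.fromCharCode(...codes));
  });
  return { code: out, decoded };
}

console.log(foldFromCharCode("var s = String.fromCharCode(72, 105);").code);
// var s = "Hi";
console.log(foldFromCharCode("var t = String.fromCharCode(x, 105);").code);
// var t = String.fromCharCode(x, 105);
```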

## `simplify.ts`

Multi-pass Babel simplifier that loops to a fixed point (up to `--max-passes`).

```bash
bun scripts/simplify.ts <input.js|-> [--out output.js] [--max-passes 10] [--no-inline]
```

<ParamField body="--max-passes" type="integer" default="10">
Maximum simplification passes. Must be a positive integer.
</ParamField>

<ParamField body="--no-inline" type="boolean" default="false">
Skip scope-aware constant inlining (`var k = 5` → inline `5` at references). Use when preserving short names for Stage 2 rename.
</ParamField>

**Per-pass transforms:** constant folding (binary, comparison, unary), dead `if`/`?:` removal, logical short-circuit, identity ops (`x+0`→`x`), sequence-in-statement expansion, computed-to-dot (`obj["foo"]`→`obj.foo`), template-literal collapse, and (unless `--no-inline`) provably-constant variable inlining.

Stderr prints pass count and per-category totals: folded, dead-if, inlined, computed→dot, seq, identity, etc.
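
Why a fixed-point loop is needed is easiest to see on a toy expression tree. The sketch below is not `simplify.ts`: it uses a hypothetical three-node `Expr` type instead of Babel nodes, implements only constant folding and the `x+0` identity, and assumes every variable is numeric so that dropping `+ 0` is safe. Each pass applies rules to a node before its children are rewritten, so one rewrite can expose another on the next pass.

```typescript
type Expr =
  | { kind: "num"; value: number }
  | { kind: "var"; name: string }
  | { kind: "add"; left: Expr; right: Expr };

// One pass: test the rules against the node's current children,
// otherwise recurse into the children.
function simplifyOnce(e: Expr, stats: { changes: number }): Expr {
  if (e.kind !== "add") return e;
  const { left, right } = e;
  if (left.kind === "num" && right.kind === "num") {
    stats.changes++; // constant folding
    return { kind: "num", value: left.value + right.value };
  }
  if (right.kind === "num" && right.value === 0) {
    stats.changes++; // identity: x + 0 -> x (numeric x assumed)
    return left;
  }
  return {
    kind: "add",
    left: simplifyOnce(left, stats),
    right: simplifyOnce(right, stats),
  };
}

// Repeat until a pass changes nothing or the pass budget is spent.
function simplifyToFixedPoint(e: Expr, maxPasses = 10): { expr: Expr; passes: number } {
  let current = e;
  let passes = 0;
  while (passes < maxPasses) {
    const stats = { changes: 0 };
    current = simplifyOnce(current, stats);
    passes++;
    if (stats.changes === 0) break;
  }
  return { expr: current, passes };
}

// x + (2 + -2): pass 1 folds the inner sum to 0, pass 2 drops "+ 0",
// pass 3 finds nothing left to do.
const toyInput: Expr = {
  kind: "add",
  left: { kind: "var", name: "x" },
  right: { kind: "add", left: { kind: "num", value: 2 }, right: { kind: "num", value: -2 } },
};
console.log(simplifyToFixedPoint(toyInput));
```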

## `control-flow-report.ts`

Read-only analysis. **Does not mutate source.** Emits JSON describing patterns that survived `simplify.ts` and require manual rewrite.

```bash
bun scripts/control-flow-report.ts <input.js|-> [--out report.json]
```

<ResponseField name="flatteners" type="array">
`while(true){switch(…)}` patterns with `dispatchVariable`, `caseCount`, `caseLabels`, `containingFunction`, and `rewriteHint`.
</ResponseField>

<ResponseField name="splitDispatches" type="array">
`"0|1|2".split("|")` dispatch arrays with `states`, `splitOn`, and `rewriteHint`.
</ResponseField>

<ResponseField name="opaquePredicates" type="array">
Residual opaque tests (`literal-vs-literal`, `not-not-array`, `not-not-object`, `void-zero`) with `alwaysTruthy` and `rewriteHint`.
</ResponseField>

Automatic CFG reconstruction is intentionally not attempted. Trace `dispatchVariable` transitions and inline cases in execution order by hand.
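
For the simplest split-dispatch shape, the manual trace amounts to reading the case bodies in the order given by the dispatch string. The helper below is a sketch of that bookkeeping only; it assumes a straight-line flattener where every case runs once and falls through to the next state, with no branching on the dispatch variable. The function name and the sample case bodies are hypothetical.

```typescript
// Reorder case bodies according to a "2|0|1"-style dispatch string.
function linearizeSplitDispatch(
  order: string,
  splitOn: string,
  cases: Record<string, string>,
): string[] {
  return order.split(splitOn).map((state) => {
    const body = cases[state];
    if (body === undefined) throw new Error(`no case for state ${state}`);
    return body;
  });
}

// Case bodies as they appear, in source order, inside the switch.
const caseBodies: Record<string, string> = {
  "0": "b = a + 1;",
  "1": "return b;",
  "2": "a = 5;",
};

console.log(linearizeSplitDispatch("2|0|1", "|", caseBodies).join("\n"));
// a = 5;
// b = a + 1;
// return b;
```

Flatteners that assign the dispatch variable conditionally inside a case need a real trace of each transition; this helper would not be enough for them.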

## `deobfuscate.ts` orchestrator

Runs the full Stage 1 pipeline in one invocation.

```bash
bun scripts/deobfuscate.ts <input.js|-> [--out output.js] [--report report.json] \
  [--skip step1,step2] [--stop-after step] [--no-eval] [--no-inline] \
  [--max-iterations 5] [--max-passes 10]
```

### Step names

Valid values for `--skip` and `--stop-after`:

`detect`, `unpack`, `string-array`, `decode-strings`, `simplify`, `control-flow-report`

<ParamField body="--skip" type="comma-separated string">
Comma-separated step names to omit. Unknown names exit `64`.
</ParamField>

<ParamField body="--stop-after" type="string">
Run through this step (inclusive), then halt. Unknown names exit `64`.
</ParamField>

<ParamField body="--report / -r" type="string">
Write a JSON run report with `input`, `originalSize`, `finalSize`, `reduction` (percentage string), and `steps` array.
</ParamField>

<ParamField body="--no-eval" type="boolean" default="false">
Forwarded to `unpack`.
</ParamField>

<ParamField body="--no-inline" type="boolean" default="false">
Forwarded to `simplify`.
</ParamField>

<ParamField body="--max-iterations" type="integer" default="5">
Forwarded to `unpack`.
</ParamField>

<ParamField body="--max-passes" type="integer" default="10">
Forwarded to `simplify`.
</ParamField>

### Orchestrator behavior

- Steps run in fixed order: detect → unpack → string-array → decode-strings → simplify → control-flow-report.
- After `unpack` changes the source, an extra `detect-after-unpack` step re-scans the result (unless `detect` is in `--skip`).
- Each step is wrapped in try/catch; one failing step records `{ step, error }` in the report and the pipeline continues.
- `detect` is regex-based and succeeds even on syntactically invalid JS; later AST steps may record parse errors.
- Re-running on already-deobfuscated output is idempotent (output unchanged).
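
The interaction of fixed order, `--skip`, `--stop-after`, and per-step error capture can be sketched as below. This is an illustration under stated assumptions, not the orchestrator's code: `planSteps` and `runPlan` are hypothetical names, the `detect-after-unpack` re-scan is omitted, unknown names throw here where the CLI is documented to exit `64`, and the behavior when the `--stop-after` step is also skipped is a guess.

```typescript
const STEPS = [
  "detect",
  "unpack",
  "string-array",
  "decode-strings",
  "simplify",
  "control-flow-report",
] as const;
type Step = (typeof STEPS)[number];

function isStep(name: string): name is Step {
  return (STEPS as readonly string[]).includes(name);
}

// Walk the fixed order, dropping skipped steps and halting after stopAfter.
function planSteps(skip: string[], stopAfter?: string): Step[] {
  const named = stopAfter === undefined ? skip : [...skip, stopAfter];
  for (const name of named) {
    if (!isStep(name)) throw new Error(`unknown step: ${name}`);
  }
  const plan: Step[] = [];
  for (const step of STEPS) {
    if (!skip.includes(step)) plan.push(step);
    if (step === stopAfter) break; // inclusive stop
  }
  return plan;
}

interface StepResult {
  step: Step;
  error?: string;
}

// Run each planned step; a throwing step is recorded and the code from the
// last successful step is handed to the next one.
function runPlan(
  plan: Step[],
  run: (step: Step, code: string) => string,
  source: string,
): { code: string; steps: StepResult[] } {
  let code = source;
  const steps: StepResult[] = [];
  for (const step of plan) {
    try {
      code = run(step, code);
      steps.push({ step });
    } catch (err) {
      steps.push({ step, error: err instanceof Error ? err.message : String(err) });
    }
  }
  return { code, steps };
}

console.log(planSteps(["unpack"], "simplify"));
// [ 'detect', 'string-array', 'decode-strings', 'simplify' ]
```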

<RequestExample>
```bash
bun scripts/deobfuscate.ts obfuscated.js \
  --out clean.js \
  --report stage1-report.json \
  --stop-after simplify
```
</RequestExample>

<ResponseExample>
```text
deobfuscate: detect: 4 technique(s)
deobfuscate: unpack: 0 iter
deobfuscate: string-array: 1 arrays collected, 8 refs replaced, 1 removed
deobfuscate: decode-strings: 2 fromCharCode
deobfuscate: simplify: 3 pass(es), 12 folded, 2 dead, 4 inlined
deobfuscate: done — 842 → 312 bytes
```
</ResponseExample>

### `--skip` and `--stop-after` examples

<CodeGroup>
```bash title="Skip unpack for pre-unpacked input"
bun scripts/deobfuscate.ts already-unpacked.js --skip unpack
```

```bash title="Diagnostic: detect only"
bun scripts/deobfuscate.ts suspect.js --stop-after detect
```

```bash title="Partial pipeline through string-array"
bun scripts/deobfuscate.ts obfuscated.js --stop-after string-array --out partial.js
```

```bash title="Skip simplify to preserve names for Stage 2"
bun scripts/deobfuscate.ts obfuscated.js --skip simplify --out literals-only.js
```
</CodeGroup>

## Ordering rules

<Steps>
<Step title="Unpack first">
Packed input is one giant `CallExpression` wrapping the real program as a string; AST passes have nothing useful to transform until `unpack.ts` expands it.
</Step>
<Step title="String-array before decode-strings">
Inlining arrays first limits `decode-strings` to literals that are actually referenced.
</Step>
<Step title="Simplify after string-array and decode-strings">
Running `simplify` first can split rotation IIFEs so `string-array`'s matcher no longer recognizes them.
</Step>
<Step title="Control-flow-report last">
`simplify` folds most opaque predicates; the report surfaces what remains for manual CFG rewrite.
</Step>
<Step title="Stage 1 before Stage 2">
Full order: **Stage 1 → wakaru-normalize → extract**. Wakaru is also byte-rewriting and must finish before any `extract`/`apply`.
</Step>
</Steps>

## Troubleshooting

| Symptom | Likely cause | Recovery |
|---------|--------------|----------|
| `string-array: WARNING — … decoder indirection` | Decoder function wraps array access | Run `simplify.ts`, then re-run `string-array.ts` |
| `unpack: no changes (eval refused)` | `--no-eval` blocked Packer/AAEncode | Remove `--no-eval` if input is trusted, or unpack manually |
| `parse error` (exit `2`) | Syntax too broken for Babel | Fix syntax upstream; `detect` still works for heuristics |
| `control-flow-report` lists flatteners | CFG obfuscation survived simplify | Follow per-item `rewriteHint`; rewrite by hand |
| Pipeline step shows `ERROR` in `--report` | That step threw; later steps still ran | Inspect `steps[].error` in the report JSON |

<AccordionGroup>
<Accordion title="When to skip Stage 1 entirely">
If `detect` returns an empty `techniques` array and the file is only minified (short identifiers, no packers or `_0x` arrays), skip Stage 1 and proceed directly to Stage 2 (`wakaru-normalize` → `extract`).
</Accordion>

<Accordion title="Library imports">
Each script exports its core function (`detectReport`, `unpack`, `transformStringArrays`, `decodeStrings`, `simplify`, `analyzeControlFlow`, `deobfuscate`) for programmatic use and for the tests under `scripts/*.test.ts`.
</Accordion>
</AccordionGroup>

## Related pages

<CardGroup>
<Card title="Handle obfuscated input" href="/obfuscated-input">
Workflow guide for packed, encoded, and Obfuscator.IO input with ordering rationale.
</Card>
<Card title="Restoration pipeline" href="/restoration-pipeline">
Stage 1, Stage 2, and Stage 3 depth tiers and the restoration contract.
</Card>
<Card title="Pipeline caveats" href="/pipeline-caveats">
Eval safety, decoder indirection, sourcemap invalidation, and other failure modes.
</Card>
<Card title="External tools and dependencies" href="/external-tools-and-dependencies">
Bun, Babel packages, BSD exit codes, and subprocess tool requirements.
</Card>
<Card title="Stage 2 scripts reference" href="/stage-2-scripts">
Rename and polish scripts that follow Stage 1.
</Card>
</CardGroup>
