# Cross-module facts

> Two-phase unpack barrier: Phase 1 fact collection after UnEsm, ModuleFactsMap shape, and Phase 2 rules (namespace_decomposition, cross-module helper refs) that read other modules' import/export facts.

- Repository: pionxzh/wakaru
- GitHub: https://github.com/pionxzh/wakaru
- Human docs: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b
- Complete Markdown: https://grok-wiki.com/public/docs/pionxzh-wakaru-77a438a6cc6b/llms-full.txt

## Source Files

- `docs/fact-system.md`
- `crates/core/src/facts.rs`
- `crates/core/src/driver/unpack.rs`
- `crates/core/src/namespace_decomposition.rs`
- `crates/core/src/rules/cross_module_helper_refs.rs`

---

Multi-module `unpack()` runs a two-phase pipeline with a cross-module barrier: Phase 1 normalizes each extracted module through `UnEsm`, extracts import/export facts, and assembles a shared `ModuleFactsMap`; Phase 2 re-runs the through-`UnEsm` range, applies barrier passes that read other modules' facts, then continues with fact-aware helper rules. Single-file `decompile()` does not use this system.

<Info>
Facts are read-only snapshots derived from the post-Stage-2 AST. Rules mutate only the current module's AST — they never write back to `ModuleFactsMap`.
</Info>

## When facts apply

| Mode | Uses `ModuleFactsMap` |
|------|----------------------|
| `unpack()` / `unpack_files()` | Yes — all modules share one map at the barrier |
| `decompile()` (single file) | No — `RulePipelineOptions::module_facts` stays `None` |
| `unpack_raw()` | No — raw extraction skips the rule pipeline |

Facts exist because transpiler helpers and namespace imports are often split into separate bundle modules. Rewriting a consumer module (for example one that contains `import h from "./helpers"; h.default(...)`) requires proof that the target module exports a known helper shape, and that proof lives in the target module, not the consumer.

## Two-phase barrier

`unpack_multi_module_with_plan` in `crates/core/src/driver/unpack/phases.rs` orchestrates both phases. Each phase processes modules in parallel via Rayon.

```mermaid
flowchart TB
    subgraph phase1 ["Phase 1 — per module, parallel"]
        P1A[parse + resolver]
        P1B[rules until UnEsm]
        P1C[recover_late_esm_from_factory_iifes]
        P1D["collect_module_facts(module)"]
        P1A --> P1B --> P1C --> P1D
    end

    subgraph barrier ["Cross-module barrier"]
        MAP["ModuleFactsMap::insert(filename, facts)"]
    end

    subgraph phase2 ["Phase 2 — per module, parallel"]
        P2A[resume or re-parse + rules until UnEsm]
        P2B[run_reexport_consolidation]
        P2C[run_namespace_decomposition]
        P2D["rules UnObjectSpread2 → UnReturn<br/>with_module_facts(map)"]
        P2E[late cleanup + emit]
        P2A --> P2B --> P2C --> P2D --> P2E
    end

    phase1 --> barrier --> phase2
```
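The barrier ordering in the diagram can be illustrated with a minimal, dependency-free sketch. It assumes a toy text format (`export NAME` and `use FILE NAME` lines) instead of a real AST, and it uses `std::thread::scope` where the real driver uses Rayon. Every name here (`ToyFacts`, `phase1_collect`, `phase2_rewrite`, `run_two_phase`) is hypothetical and not part of Wakaru.

```rust
use std::collections::HashMap;
use std::thread;

/// Toy stand-in for per-module facts: just the exported names.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ToyFacts {
    pub exports: Vec<String>,
}

/// Phase 1 (toy): derive facts from one module's source text.
/// An "export" is any line of the form `export NAME`.
pub fn phase1_collect(source: &str) -> ToyFacts {
    let exports = source
        .lines()
        .filter_map(|line| line.trim().strip_prefix("export "))
        .map(|name| name.trim().to_string())
        .collect();
    ToyFacts { exports }
}

/// Phase 2 (toy): a module reads other modules' facts but never mutates the map.
/// Lines of the form `use FILE NAME` are checked against FILE's exports.
pub fn phase2_rewrite(source: &str, facts: &HashMap<String, ToyFacts>) -> Vec<String> {
    source
        .lines()
        .filter_map(|line| {
            let mut parts = line.trim().strip_prefix("use ")?.split_whitespace();
            let (file, name) = (parts.next()?, parts.next()?);
            let proven = facts
                .get(file)
                .map_or(false, |f| f.exports.iter().any(|e| e == name));
            let verdict = if proven { "proven" } else { "unproven" };
            Some(format!("{file}:{name}:{verdict}"))
        })
        .collect()
}

/// Runs both phases. Joining every Phase 1 worker is the barrier: the map is
/// fully assembled before any Phase 2 worker starts.
pub fn run_two_phase(modules: &[(&str, &str)]) -> HashMap<String, Vec<String>> {
    // Phase 1: one scoped thread per module.
    let facts: HashMap<String, ToyFacts> = thread::scope(|s| {
        let handles: Vec<_> = modules
            .iter()
            .map(|&(name, src)| s.spawn(move || (name.to_string(), phase1_collect(src))))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });

    // Phase 2: the completed map is shared immutably across workers.
    let facts_ref = &facts;
    thread::scope(|s| {
        let handles: Vec<_> = modules
            .iter()
            .map(|&(name, src)| {
                s.spawn(move || (name.to_string(), phase2_rewrite(src, facts_ref)))
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Because Phase 2 only ever holds a shared reference to the map, the "rules never write back" property is enforced by the borrow checker in this sketch rather than by convention.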

### Why the through-`UnEsm` range runs twice

SWC's `SyntaxContext` values must stay consistent across the whole pipeline that produces a module's emitted output. Phase 1 may discard its AST: source-map mode always re-parses, while the no-sourcemap path can reuse the Phase 1 AST when `can_reuse_phase1_ast` is true. Either way, Phase 2 needs a fresh or carefully resumed AST so ctxt-sensitive rules (`UnImportRename`, `BindingRenamer`, and others) see consistent binding identities.

<Note>
Phase 1 clones the module before `recover_late_esm_from_factory_iifes` when the AST will be reused, so fact extraction sees recovered ESM shapes while Phase 2 resumes from the pre-recovery through-`UnEsm` state and runs its own recovery at `options.level`.
</Note>

### Phase 1 failure semantics

If a module fails to parse during fact collection, Wakaru records an `UnpackWarning` with kind `FactCollectionParseFailed` and inserts empty facts for that module. The unpack continues best-effort rather than aborting the entire bundle.
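The best-effort behavior can be sketched as follows. This is an illustration only: `toy_parse` is a hypothetical parser that rejects unbalanced braces, and `Warning`, `WarningKind`, and `collect_all` are simplified stand-ins for the real `UnpackWarning` machinery, whose exact fields are not shown in this page.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Default, PartialEq)]
pub struct ToyFacts {
    pub exports: Vec<String>,
}

#[derive(Debug, PartialEq)]
pub enum WarningKind {
    FactCollectionParseFailed,
}

#[derive(Debug, PartialEq)]
pub struct Warning {
    pub kind: WarningKind,
    pub filename: String,
}

/// Hypothetical parser: fails when braces are unbalanced, otherwise
/// collects `export NAME` lines as facts.
fn toy_parse(source: &str) -> Result<ToyFacts, String> {
    let opens = source.matches('{').count();
    let closes = source.matches('}').count();
    if opens != closes {
        return Err(format!("unbalanced braces: {opens} open, {closes} close"));
    }
    let exports = source
        .lines()
        .filter_map(|line| line.trim().strip_prefix("export "))
        .map(|name| name.trim().to_string())
        .collect();
    Ok(ToyFacts { exports })
}

/// Collects facts for every module. A parse failure records a warning and
/// inserts empty facts, so later lookups for that module still succeed.
pub fn collect_all(modules: &[(&str, &str)]) -> (HashMap<String, ToyFacts>, Vec<Warning>) {
    let mut map = HashMap::new();
    let mut warnings = Vec::new();
    for &(filename, source) in modules {
        let facts = match toy_parse(source) {
            Ok(facts) => facts,
            Err(_) => {
                warnings.push(Warning {
                    kind: WarningKind::FactCollectionParseFailed,
                    filename: filename.to_string(),
                });
                ToyFacts::default()
            }
        };
        map.insert(filename.to_string(), facts);
    }
    (map, warnings)
}
```

Inserting empty facts rather than skipping the entry means consumers of the broken module simply find nothing provable about it and leave their own code untouched.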

## `ModuleFacts` shape

`collect_module_facts` in `crates/core/src/facts.rs` is a pure function over the post-Stage-2 AST. Call it immediately after `UnEsm` and ESM recovery, before later rules mutate import/export structure.

<ResponseField name="imports" type="Vec<ImportFact>">
Each `ImportFact` records `local`, `source`, and `kind` (`Default`, `Namespace`, or `Named(imported)`).
</ResponseField>

<ResponseField name="exports" type="Vec<ExportFact>">
Each `ExportFact` records `exported`, optional `local`, and `kind` (`Default` or `Named`). The exported name `"default"` marks default exports.
</ResponseField>

<ResponseField name="helper_exports" type="Vec<HelperExportFact>">
Transpiler helper identity proven from exported binding body shape. Kinds include `Extends`, `ObjectSpread`, `AsyncToGenerator`, `InteropRequireDefault`, and others defined by `HelperKind`.
</ResponseField>

<ResponseField name="default_object_helper_exports" type="Vec<HelperExportFact>">
Helpers exported as properties on a default-export object literal (`export default { extends: _extends, ... }`).
</ResponseField>

<ResponseField name="ts_helper_exports" type="Vec<TypeScriptHelperExportFact>">
Raw TypeScript/tslib helper identity (`__awaiter`, `__generator`, `__spreadArray`, etc.) proven from export shape or tslib registrar patterns.
</ResponseField>

<ResponseField name="passthrough_target" type="Option<Atom>">
Set when the module is a pure passthrough: body is exactly `export default require("./X.js")` with no other statements. Importers can be redirected to `./X.js`.
</ResponseField>

<ResponseField name="is_helper_module" type="bool">
True when the module exports any recognized transpiler helper, including kinds with no rewrite mapping (for example `_defineProperty`). Used by dead helper-module elimination.
</ResponseField>

Helper export facts are conservative: they record identity only when the exported local binding matches a known helper body or runtime export shape. They do not speculate from consumer-side usage patterns.
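As a rough picture of the import/export portion of this shape, the sketch below mirrors the field descriptions above. It is an approximation, not the definitions in `facts.rs`: `String` stands in for SWC's `Atom`, the helper-related fields are omitted, and `ModuleFactsSketch`, `has_named_export`, and `example_facts` are hypothetical names.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum ImportKind {
    Default,
    Namespace,
    Named(String), // the imported (remote) name
}

#[derive(Debug, Clone, PartialEq)]
pub struct ImportFact {
    pub local: String,
    pub source: String,
    pub kind: ImportKind,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ExportKind {
    Default,
    Named,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ExportFact {
    pub exported: String,
    pub local: Option<String>,
    pub kind: ExportKind,
}

/// Simplified facts record (helper export fields left out).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ModuleFactsSketch {
    pub imports: Vec<ImportFact>,
    pub exports: Vec<ExportFact>,
    pub passthrough_target: Option<String>,
    pub is_helper_module: bool,
}

impl ModuleFactsSketch {
    /// True when `name` is exported as a named (non-default) export.
    pub fn has_named_export(&self, name: &str) -> bool {
        self.exports
            .iter()
            .any(|e| e.kind == ExportKind::Named && e.exported == name)
    }
}

/// Facts one would expect for a module like:
///   import * as ns from "./dep.js";
///   export function foo() {}
///   export default foo;
pub fn example_facts() -> ModuleFactsSketch {
    ModuleFactsSketch {
        imports: vec![ImportFact {
            local: "ns".to_string(),
            source: "./dep.js".to_string(),
            kind: ImportKind::Namespace,
        }],
        exports: vec![
            ExportFact {
                exported: "foo".to_string(),
                local: Some("foo".to_string()),
                kind: ExportKind::Named,
            },
            ExportFact {
                exported: "default".to_string(),
                local: Some("foo".to_string()),
                kind: ExportKind::Default,
            },
        ],
        ..ModuleFactsSketch::default()
    }
}
```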

## `ModuleFactsMap` lookup

`ModuleFactsMap` stores facts keyed by **canonical module filename** (the unpacked output path, not the import specifier string alone).

| Operation | Behavior |
|-----------|----------|
| `insert(key, facts)` | Normalizes `key` by stripping leading `./` |
| `get(specifier)` | Tries canonical form, then common extension variants (`.js`, `.jsx`), then extension-stripped forms |

This handles specifier variants like `./lib/foo.js`, `lib/foo.js`, and `lib/foo` resolving to the same module.
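The lookup order in the table can be sketched with a plain `HashMap`. This is a minimal model under assumptions: `ToyFactsMap` and `canonical` are hypothetical, the extension list is taken from the table above, and the real `get` may try variants in a different order or handle more cases.

```rust
use std::collections::HashMap;

/// Extensions tried during lookup (from the table above).
const EXTENSIONS: [&str; 2] = [".js", ".jsx"];

/// Strips one leading `./` so `./lib/foo.js` and `lib/foo.js` share a key.
fn canonical(key: &str) -> &str {
    key.strip_prefix("./").unwrap_or(key)
}

/// Toy facts map, generic over the stored facts type.
pub struct ToyFactsMap<V> {
    inner: HashMap<String, V>,
}

impl<V> ToyFactsMap<V> {
    pub fn new() -> Self {
        Self { inner: HashMap::new() }
    }

    /// Stores facts under the canonical form of `key`.
    pub fn insert(&mut self, key: &str, facts: V) {
        self.inner.insert(canonical(key).to_string(), facts);
    }

    /// Resolves a specifier: canonical form first, then with an extension
    /// appended, then with a known extension stripped.
    pub fn get(&self, specifier: &str) -> Option<&V> {
        let base = canonical(specifier);
        if let Some(found) = self.inner.get(base) {
            return Some(found);
        }
        for ext in EXTENSIONS {
            if let Some(found) = self.inner.get(&format!("{base}{ext}")) {
                return Some(found);
            }
        }
        for ext in EXTENSIONS {
            if let Some(stem) = base.strip_suffix(ext) {
                if let Some(found) = self.inner.get(stem) {
                    return Some(found);
                }
            }
        }
        None
    }
}
```

Normalizing on insert keeps the stored keys uniform, so only `get` has to reason about the variants a consumer might have written.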

<Warning>
Filename recovery (`build_rename_map`) runs at the barrier but is kept separate from the fact map. Fact-driven passes operate on provisional filenames; only the final emit step rewrites import sources to recovered names.
</Warning>

## Phase 2 barrier passes

These free functions take `(&mut Module, &ModuleFactsMap)` and run sequentially before the fact-aware rule range.

### `run_reexport_consolidation`

Redirects imports from passthrough modules to their actual target. When a default import from a passthrough is used only via member access, the declaration is rewritten to a namespace import of the target:

```
import x from "./passthrough.js"  →  import * as x from "./target.js"
```

`resolve_passthrough` follows chains of `passthrough_target` facts transitively, detecting cycles.
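Transitive resolution with cycle detection can be sketched as below. This is an assumption-laden model: the map goes straight from a passthrough module to its `passthrough_target`, the function name `resolve_passthrough_chain` is hypothetical, and returning `None` on a cycle is one plausible policy rather than a statement about what the real `resolve_passthrough` returns.

```rust
use std::collections::{HashMap, HashSet};

/// Follows passthrough links from `start` until reaching a module that is not
/// itself a passthrough. Returns `None` when `start` is not a passthrough or
/// when the chain loops back on itself.
pub fn resolve_passthrough_chain<'a>(
    start: &'a str,
    targets: &'a HashMap<String, String>,
) -> Option<&'a str> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut current = start;
    seen.insert(current);

    while let Some(next) = targets.get(current) {
        // `insert` returns false if the module was already visited: a cycle.
        if !seen.insert(next.as_str()) {
            return None;
        }
        current = next.as_str();
    }

    if current == start {
        None // no link was followed, so there is nothing to redirect
    } else {
        Some(current)
    }
}
```

For a chain `a.js` to `b.js` to `c.js`, an importer of `a.js` would be redirected to `c.js` in one step instead of being rewritten twice.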

### `run_namespace_decomposition`

Rewrites namespace-like imports into named imports when usage is property-access only and the target module exports those names:

```
import r from "./x"; r.foo()  →  import { foo } from "./x"; foo()
```

Safety checks include inner-scope shadowing, mixed default+named imports on the same declaration, JSX intrinsic vs component distinction, and readability backoff when too many collision aliases would be needed. Reused pre-existing named specifiers propagate their real `SyntaxContext` so downstream `(sym, ctxt)` passes match rewritten usages.
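The core eligibility test (property-access only, every accessed name exported by the target) can be sketched without an AST. The `Usage` enum and `plan_decomposition` function are hypothetical simplifications. The shadowing, mixed-import, JSX, and alias-collision checks listed above are deliberately left out.

```rust
use std::collections::HashSet;

/// How a consumer module touches the imported binding `r`.
#[derive(Debug, Clone, PartialEq)]
pub enum Usage {
    Member(String),       // r.foo
    MemberAssign(String), // r.foo = ...
    Computed,             // r[expr]
    Bare,                 // r passed around or referenced on its own
}

/// Returns the named specifiers to import, in first-use order, or `None`
/// when any usage makes the rewrite unsafe or unprovable.
pub fn plan_decomposition(
    usages: &[Usage],
    target_named_exports: &HashSet<String>,
) -> Option<Vec<String>> {
    let mut names: Vec<String> = Vec::new();
    for usage in usages {
        match usage {
            // Only plain member reads of proven exports are acceptable.
            Usage::Member(prop) if target_named_exports.contains(prop) => {
                if !names.contains(prop) {
                    names.push(prop.clone());
                }
            }
            _ => return None,
        }
    }
    if names.is_empty() {
        None
    } else {
        Some(names)
    }
}
```

A single disqualifying usage aborts the whole rewrite, which matches the conservative stance of the fact system: no partial decomposition of a binding.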

## Fact-aware pipeline rules

After the barrier passes, Phase 2 runs:

```rust
RulePipelineOptions::between("UnObjectSpread2", "UnReturn")
    .with_module_facts(facts_ref)
```

`RulePipelineOptions::with_module_facts` threads the map into `RuleRunContext`. Rules that accept facts use optional `new_with_facts` constructors; single-file `decompile()` keeps the normal constructors with `module_facts: None`.
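The optional-facts threading pattern can be shown in miniature. The types below (`ToyPipelineOptions`, `ToyRule`, `build_rule`) are hypothetical and only imitate the builder shape described above. The real `RulePipelineOptions` and `RuleRunContext` have other fields and may differ in signatures.

```rust
use std::collections::HashMap;

/// Toy facts map: module filename -> exported helper names.
pub type ToyFactsMap = HashMap<String, Vec<String>>;

#[derive(Debug, Clone)]
pub struct ToyPipelineOptions<'a> {
    pub from: &'a str,
    pub until: &'a str,
    pub module_facts: Option<&'a ToyFactsMap>,
}

impl<'a> ToyPipelineOptions<'a> {
    /// Selects a rule range. Facts default to `None` (single-file mode).
    pub fn between(from: &'a str, until: &'a str) -> Self {
        Self { from, until, module_facts: None }
    }

    /// Attaches a shared, read-only facts map (multi-module mode).
    pub fn with_module_facts(mut self, facts: &'a ToyFactsMap) -> Self {
        self.module_facts = Some(facts);
        self
    }
}

pub struct ToyRule<'a> {
    facts: Option<&'a ToyFactsMap>,
}

impl<'a> ToyRule<'a> {
    pub fn new() -> Self {
        Self { facts: None }
    }

    pub fn new_with_facts(facts: Option<&'a ToyFactsMap>) -> Self {
        Self { facts }
    }

    /// Without facts the rule can prove nothing about other modules.
    pub fn recognizes(&self, source: &str, name: &str) -> bool {
        self.facts
            .and_then(|map| map.get(source))
            .map_or(false, |exports| exports.iter().any(|e| e.as_str() == name))
    }
}

/// The runner picks the constructor based on whether facts were threaded in.
pub fn build_rule<'a>(options: &ToyPipelineOptions<'a>) -> ToyRule<'a> {
    match options.module_facts {
        Some(facts) => ToyRule::new_with_facts(Some(facts)),
        None => ToyRule::new(),
    }
}
```

Keeping the facts as an `Option` lets the same rule type serve both entry points, with the single-file path degrading to purely local recognition.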

### `collect_cross_module_helper_refs`

`crates/core/src/rules/cross_module_helper_refs.rs` bridges consumer import specifiers to producer `helper_exports` / `default_object_helper_exports` / `ts_helper_exports`:

| Import shape | Lookup |
|--------------|--------|
| Default | `helper_exports` where `exported == "default"`, plus `default_object_helper_exports` as a namespace map |
| Named | `helper_exports` matched by exported name |
| Namespace | All helper exports chained into a member-access namespace map |

Returns `CrossModuleHelperRefs { direct, namespaces }` keyed by `(sym, ctxt)` binding keys.
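The per-shape lookup in the table can be approximated as follows. Several simplifications are assumed: bindings are keyed by local name only instead of `(sym, ctxt)`, producers are looked up by exact source string, `ts_helper_exports` is ignored, and all type and function names (`ProducerFacts`, `HelperRefs`, `collect_refs`, `lookup`) are hypothetical.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HelperKind {
    Extends,
    ObjectSpread,
    AsyncToGenerator,
}

#[derive(Debug, Clone, PartialEq)]
pub enum ImportKind {
    Default,
    Namespace,
    Named(String),
}

pub struct ImportFact {
    pub local: String,
    pub source: String,
    pub kind: ImportKind,
}

/// Producer-side facts as (exported name, helper kind) pairs.
#[derive(Default)]
pub struct ProducerFacts {
    pub helper_exports: Vec<(String, HelperKind)>,
    pub default_object_helper_exports: Vec<(String, HelperKind)>,
}

#[derive(Debug, Default, PartialEq)]
pub struct HelperRefs {
    /// local binding -> helper it refers to
    pub direct: HashMap<String, HelperKind>,
    /// local binding -> member name -> helper
    pub namespaces: HashMap<String, HashMap<String, HelperKind>>,
}

fn lookup(exports: &[(String, HelperKind)], name: &str) -> Option<HelperKind> {
    exports
        .iter()
        .find(|(exported, _)| exported.as_str() == name)
        .map(|(_, kind)| *kind)
}

pub fn collect_refs(
    imports: &[ImportFact],
    producers: &HashMap<String, ProducerFacts>,
) -> HelperRefs {
    let mut refs = HelperRefs::default();
    for import in imports {
        let producer = match producers.get(&import.source) {
            Some(producer) => producer,
            None => continue, // no facts for this source: nothing is provable
        };
        let local = import.local.clone();
        match &import.kind {
            ImportKind::Default => {
                if let Some(kind) = lookup(&producer.helper_exports, "default") {
                    refs.direct.insert(local.clone(), kind);
                }
                if !producer.default_object_helper_exports.is_empty() {
                    let members = producer.default_object_helper_exports.iter().cloned().collect();
                    refs.namespaces.insert(local, members);
                }
            }
            ImportKind::Named(imported) => {
                if let Some(kind) = lookup(&producer.helper_exports, imported) {
                    refs.direct.insert(local, kind);
                }
            }
            ImportKind::Namespace => {
                if !producer.helper_exports.is_empty() {
                    let members = producer.helper_exports.iter().cloned().collect();
                    refs.namespaces.insert(local, members);
                }
            }
        }
    }
    refs
}
```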

### Rules that consume cross-module facts

| Rule | Cross-module behavior |
|------|----------------------|
| `UnObjectSpread` / `UnObjectSpread2` | Recognizes `extends` / `objectSpread` helpers imported from a separate helper module |
| `UnObjectRest` / `UnObjectRest2` | Same pattern for object-rest helpers |
| `UnSlicedToArray2` | Cross-module `slicedToArray` / tslib `__read` refs |
| `UnTemplateLiteral` | Cross-module `taggedTemplateLiteral` refs |
| `UnRegenerator` | Proves `AsyncToGenerator` default export on imported helper modules (including interop `require()` aliases) |

<Note>
Cross-module helper recognition does not remove the consumer import when helper identity alone cannot prove the helper module is side-effect-free. `DeadImports` may downgrade binding imports to side-effect imports; dead helper-module elimination (when DCE is enabled at standard+ rewrite level) can then drop unused helper modules.
</Note>

## Adding a fact-reading pass

<Steps>
<Step title="Implement a barrier function">

Add a free function in `crates/core/src/` with signature `fn run_my_pass(module: &mut Module, module_facts: &ModuleFactsMap)`. Derive all conclusions locally from facts you read — do not write back to the map.

</Step>

<Step title="Wire into Phase 2">

Call it from `run_phase2_tail` in `phases.rs` between `apply_rules(..., until("UnEsm"))` and the `UnObjectSpread2`–`UnReturn` range, alongside `run_reexport_consolidation` and `run_namespace_decomposition`.

</Step>

<Step title="Or extend an existing pipeline rule">

For rules that must stay at their current pipeline position, add an optional `new_with_facts` constructor and read `ctx.module_facts` in the pipeline runner. Thread facts only through the multi-module rule runner.

</Step>

<Step title="Add unit tests">

Follow `crates/core/tests/namespace_decomposition_rule.rs`: use `facts_for(source)` to synthesize a target module's facts, build a `ModuleFactsMap`, and assert the rewrite output.

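```rust
fn facts_for(source: &str) -> ModuleFacts {
    // parse → resolver → collect_module_facts(&module)
    // Body elided here; the test file named above has the real helper.
    todo!()
}

// `module` is the already-parsed consumer module under test.
let mut map = ModuleFactsMap::new();
map.insert("target.js", facts_for(r#"export function foo() {}"#));
run_namespace_decomposition(&mut module, &map);
```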

</Step>
</Steps>

## Implementation constraints

### Identifier and span gotchas

| Constraint | Reason |
|------------|--------|
| Use `DUMMY_SP` for new import specifiers and rewritten usage idents | `apply_sourcemap_renames()` skips idents only when `span.is_dummy()` |
| Propagate `SyntaxContext` when reusing an existing binding | Downstream `(sym, ctxt)` passes (for example `UnImportRename`) must see rewrites as the same binding |
| Use `SyntaxContext::empty()` for freshly created import specifiers | New bindings match each other without re-running the resolver |

### Non-goals

- No shared mutable state between rules in the same phase
- No multi-round fact merging
- No speculative facts — a fact holds only if the post-Stage-2 AST proves it
- Rules derive heavier semantic conclusions (for example namespace-projection equivalence) internally; they do not emit observations back into the map

## Debugging fact-driven rewrites

Fact collection happens at a fixed pipeline point. To bisect a single-file regression, use `--trace-rules` with `--from` / `--until` ranges. Bundle unpack debugging cannot trace per-module fact assembly directly. Instead, compare Phase 1 fact output via `ModuleFactsMap` display formatting, or add targeted unit tests with synthetic `facts_for` maps.

<AccordionGroup>
<Accordion title="Namespace decomposition skipped for a valid-looking import">

Check that `module_facts.get(source)` resolves the target specifier, that all accessed properties appear in `target_facts.exports` with `ExportKind::Named`, and that usage is property-access only (no bare binding reference, no computed access, no assignment to members).

</Accordion>

<Accordion title="Cross-module helper not recognized">

Verify the helper module's `helper_exports` or `ts_helper_exports` contain the expected kind after Phase 1. Helper identity must be proven from the producer's export AST, not inferred from consumer call shape alone.

</Accordion>

<Accordion title="Empty facts for a module">

Look for `FactCollectionParseFailed` in unpack warnings. The module may contain invalid standalone JS after extraction; other modules still decompile.

</Accordion>
</AccordionGroup>

## Related pages

<CardGroup>
<Card title="Bundle formats and unpacking" href="/bundle-formats-and-unpacking">
Detection order, raw vs full unpack, and when multi-module fact collection activates.
</Card>
<Card title="Decompile pipeline" href="/decompile-pipeline">
Single-file pipeline flow, `unresolved_mark` gating, and how unpack parallelizes per module.
</Card>
<Card title="Helper detection" href="/helper-detection">
How helper body shapes are matched before they become `helper_exports` facts.
</Card>
<Card title="Rule pipeline reference" href="/rule-pipeline-reference">
Ordered `RuleDescriptor` registry, Stage 2 boundary at `UnEsm`, and `with_module_facts` threading.
</Card>
<Card title="Develop transformation rules" href="/develop-rules">
Test-first workflow, pipeline placement, and `BindingRenamer` conventions for new rules.
</Card>
</CardGroup>
