Agent-readable wiki

RLM Code — Explain Like I'm 5 Wiki

RLM Code is a Python tool that runs AI agents in a looping read-execute-judge cycle, benchmarks them across environments, and lets you compare results, all from a terminal UI. It implements the idea from the Recursive Language Models paper: instead of stuffing a giant document into the AI's memory all at once, the AI reads a little piece, writes code to analyze it, and repeats until it has an answer.
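To make the "read a little piece, analyze it, repeat" idea concrete, here is a minimal sketch in plain Python. It is not RLM Code's actual API: `iter_chunks`, `run_loop`, and `find_secret` are hypothetical names, and the hand-written `find_secret` function stands in for the analysis code a model would write at each step.

```python
from typing import Callable, Iterator, Optional


def iter_chunks(text: str, size: int) -> Iterator[str]:
    """Yield consecutive slices of `text`, each at most `size` characters."""
    for start in range(0, len(text), size):
        yield text[start:start + size]


def run_loop(
    document: str,
    analyze: Callable[[str], Optional[str]],
    chunk_size: int = 200,
    max_steps: int = 50,
) -> Optional[str]:
    """Read one chunk, run the analysis on it, judge the result, repeat."""
    for step, chunk in enumerate(iter_chunks(document, chunk_size)):
        if step >= max_steps:
            break  # hard cap so the loop always terminates
        result = analyze(chunk)  # "execute": stand-in for model-written code
        if result is not None:   # "judge": stop as soon as there is an answer
            return result
    return None


def find_secret(chunk: str) -> Optional[str]:
    """Hypothetical analysis step: pull the value that follows 'SECRET='."""
    marker = "SECRET="
    idx = chunk.find(marker)
    if idx == -1:
        return None
    parts = chunk[idx + len(marker):].split()
    return parts[0] if parts else None


# The document is never handed over whole; it is scanned 200 characters at a time.
# Simplification: a marker that straddles a chunk boundary would be missed.
document = "filler " * 100 + "SECRET=42 " + "filler " * 100
answer = run_loop(document, find_secret)
print(answer)
```

The step cap and the early return are the two stop conditions in this sketch; a real runner would likely also track rewards and memory between rounds.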

Pages

Explain It Simply — What Is RLM Code?
RLM Code in plain language: what problem it solves, the one core idea to keep, and what you will find when you open the repo for the first time.

The Loop — How an Agent Actually Works Here
Step by step: how RLMRunner turns one user prompt into many rounds of context → action proposal → sandbox execution → observation → reward → memory update, and when it stops.

Environments & Sandboxes — Where Code Actually Runs
The four built-in environments (DSPy coding, Generic, TraceAnalysis, PureRLM), what each one does, and the sandbox runtimes (Docker, Monty, mock) that execute untrusted code safely.

Framework Adapters — Plug In Your Favourite AI Stack
How rlm_code/rlm/frameworks/ lets DSPy, Google ADK, Pydantic-AI, and DeepAgents all plug into the same RLM loop through a shared base class and a framework registry, without changing the core runner.

Benchmarks, Leaderboard & Observability — Did It Work?
How RLMBenchmarkCase definitions drive automated runs, how scores flow into the leaderboard, how trajectory replay lets you re-watch any session, and how observability sinks (OTel-shaped JSONL, trace analysis) record what happened.

The One Map to Keep — Core Idea, Key Files, What to Read Next
A plain-English recap of the whole system: the single analogy that holds, the five files that matter most, the two constraints every newcomer hits, and where to go from here.

Complete Markdown

The complete agent-readable Markdown files are published separately from this HTML page.