# Technical Orientation

> Purpose of agent-sandbox, its core/extensions split, controller-manager entry point, and a map for navigating the rest of this developer reference.

- Repository: kubernetes-sigs/agent-sandbox
- GitHub: https://github.com/kubernetes-sigs/agent-sandbox
- Human wiki: https://grok-wiki.com/public/wiki/kubernetes-sigs-agent-sandbox-c3f2597a654a
- Complete Markdown: https://grok-wiki.com/public/wiki/kubernetes-sigs-agent-sandbox-c3f2597a654a/llms-full.txt

## Source Files

- `README.md`
- `AGENTS.md`
- `cmd/agent-sandbox-controller/main.go`
- `go.mod`
- `roadmap.md`

---

<details>
<summary>Relevant source files</summary>

The following files were used as context for generating this wiki page:

- [README.md](README.md)
- [AGENTS.md](AGENTS.md)
- [cmd/agent-sandbox-controller/main.go](cmd/agent-sandbox-controller/main.go)
- [go.mod](go.mod)
- [roadmap.md](roadmap.md)
- [controllers/sandbox_controller.go](controllers/sandbox_controller.go)
- [api/v1beta1/groupversion_info.go](api/v1beta1/groupversion_info.go)
- [extensions/api/v1beta1/groupversion_info.go](extensions/api/v1beta1/groupversion_info.go)
- [Dockerfile](Dockerfile)
- [codegen.go](codegen.go)

</details>


`agent-sandbox` is a Kubernetes SIG Apps subproject that introduces a `Sandbox` Custom Resource Definition (CRD) and an associated controller for managing **stateful, singleton, pod-backed workloads with a stable identity**. The project is targeted at AI agent runtimes, development environments, notebooks, and similar workloads that do not fit the stateless replicated model of `Deployment` or the numbered-set model of `StatefulSet`. The Go module is `sigs.k8s.io/agent-sandbox` (Go 1.26.2), built on `controller-runtime` v0.23.x and Kubernetes API libraries v0.35.x.

This page is the entry point for the developer reference. It explains the **core / extensions split**, walks through the **controller-manager entry point** in `cmd/agent-sandbox-controller`, and maps the rest of the repository so other pages can be navigated with confidence.

Sources: [README.md:16-18](), [AGENTS.md:5-12](), [go.mod:1-24]()

## Why agent-sandbox exists

Kubernetes excels at stateless replicated workloads (`Deployment`) and stable numbered sets (`StatefulSet`), but use cases such as per-user dev environments, isolated runtimes for LLM-generated code, Jupyter-style sessions, and single-pod stateful services are awkward to express through those primitives. The README enumerates the gap and the desired characteristics (strong runtime isolation such as gVisor or Kata, deep hibernation, automatic resume, efficient persistence, programmatic API), and the roadmap shows the project is iterating toward Beta/GA while expanding runtime support and SDK surface.

Sources: [README.md:134-159](), [roadmap.md:1-29]()

## Core / Extensions split

The project is intentionally partitioned into a small, stable **core** and an opt-in **extensions** module. The core ships the primitive `Sandbox`; extensions build higher-level lifecycle and pool semantics on top of it without forcing them on every operator.

| Layer | Go package | API group / version | Kinds | Controller package |
| --- | --- | --- | --- | --- |
| Core | `sigs.k8s.io/agent-sandbox/api/v1beta1` | `agents.x-k8s.io / v1beta1` | `Sandbox` | `controllers/` |
| Extensions | `sigs.k8s.io/agent-sandbox/extensions/api/v1beta1` | `extensions.agents.x-k8s.io / v1beta1` | `SandboxClaim`, `SandboxTemplate`, `SandboxWarmPool` | `extensions/controllers/` |

Older docs refer to both groups as `v1alpha1`, but the code registers them as `v1beta1`: see the `+groupName` markers in `api/v1beta1/groupversion_info.go:17` and `extensions/api/v1beta1/groupversion_info.go:17`, and the import `extensionsv1beta1 "sigs.k8s.io/agent-sandbox/extensions/api/v1beta1"` in `cmd/agent-sandbox-controller/main.go:34`.
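
To make the group/version pairs concrete, the following minimal sketch maps each kind from the table to the `apiVersion` string a manifest would carry. The group and version strings come from the table; the `kindGroups` map and `apiVersionFor` helper are hypothetical names that do not exist in the repository.

```go
package main

import "fmt"

// Group and version strings are taken from the table above; the helper
// and map names are hypothetical and do not exist in the repository.
const (
	coreGroup       = "agents.x-k8s.io"
	extensionsGroup = "extensions.agents.x-k8s.io"
	version         = "v1beta1"
)

// kindGroups maps each kind to the API group that serves it.
var kindGroups = map[string]string{
	"Sandbox":         coreGroup,
	"SandboxClaim":    extensionsGroup,
	"SandboxTemplate": extensionsGroup,
	"SandboxWarmPool": extensionsGroup,
}

// apiVersionFor returns the apiVersion string a manifest of the given
// kind would carry, or false when the kind is not one of the four above.
func apiVersionFor(kind string) (string, bool) {
	group, ok := kindGroups[kind]
	if !ok {
		return "", false
	}
	return group + "/" + version, true
}

func main() {
	for _, kind := range []string{"Sandbox", "SandboxClaim"} {
		v, _ := apiVersionFor(kind)
		fmt.Printf("%s -> %s\n", kind, v)
	}
}
```

A manifest for a core `Sandbox` would therefore use `agents.x-k8s.io/v1beta1`, while the three extension kinds share `extensions.agents.x-k8s.io/v1beta1`.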

The extensions form a small pattern stack on top of `Sandbox`:

- **`SandboxTemplate`** — reusable spec a `Sandbox` can be stamped from.
- **`SandboxClaim`** — user-facing request that resolves to a `Sandbox`, optionally by adopting one from a warm pool.
- **`SandboxWarmPool`** — pre-creates Sandboxes so claims can hand back a ready instance immediately.
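
The interaction between a warm pool and a claim can be pictured with a small in-memory model. This is only a sketch of the idea described in the list above: the `warmQueue` type, its methods, and `resolveClaim` are hypothetical, and the real `queue` package under `extensions/controllers/` may expose a different API and different semantics.

```go
package main

import (
	"fmt"
	"sync"
)

// warmQueue is a hypothetical, simplified stand-in for the queue shared by
// the warm-pool and claim reconcilers. The real queue package may differ.
type warmQueue struct {
	mu    sync.Mutex
	ready []string // names of pre-created sandboxes, oldest first
}

// Add records a pre-created sandbox as available for adoption.
func (q *warmQueue) Add(name string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.ready = append(q.ready, name)
}

// Pop hands out the oldest ready sandbox, or false when the pool is empty.
func (q *warmQueue) Pop() (string, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.ready) == 0 {
		return "", false
	}
	name := q.ready[0]
	q.ready = q.ready[1:]
	return name, true
}

// resolveClaim adopts a warm sandbox when one is ready and otherwise falls
// back to naming a sandbox that would be created on demand.
func resolveClaim(q *warmQueue, claim string) (sandbox string, adopted bool) {
	if name, ok := q.Pop(); ok {
		return name, true
	}
	return claim + "-sandbox", false
}

func main() {
	q := &warmQueue{}
	q.Add("warm-0")
	for _, claim := range []string{"alice", "bob"} {
		name, adopted := resolveClaim(q, claim)
		fmt.Printf("claim %s -> %s (adopted=%t)\n", claim, name, adopted)
	}
}
```

In this model the first claim adopts the pre-created sandbox and the second falls back to on-demand creation, which is the latency trade-off a warm pool is meant to address.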

Sources: [README.md:22-83](), [AGENTS.md:9-23](), [api/v1beta1/groupversion_info.go:14-36](), [extensions/api/v1beta1/groupversion_info.go:14-36]()

### Ownership map

```text
agent-sandbox repo
├── api/v1beta1/                ← Sandbox types (CRD source of truth)
├── controllers/                ← SandboxReconciler + envtest
├── extensions/
│   ├── api/v1beta1/            ← SandboxClaim / Template / WarmPool types
│   └── controllers/            ← three reconcilers + shared queue/
├── cmd/agent-sandbox-controller/
│   └── main.go                 ← single manager binary
├── internal/                   ← lifecycle, metrics, version (not importable externally)
├── k8s/                        ← generated CRDs, RBAC, controller manifests
├── helm/                       ← parallel Helm packaging
├── clients/
│   ├── k8s/                    ← generated typed clientset/listers/informers
│   ├── go/                     ← hand-written high-level Go SDK
│   └── python/agentic-sandbox-client/  ← Python SDK (PyPI: k8s-agent-sandbox)
├── docs/                       ← development.md, testing.md, configuration.md, keps/
├── examples/, extensions/examples/
└── test/e2e/, test/benchmarks/
```

Sources: [AGENTS.md:13-34](), [codegen.go:14-28]()

## Controller-manager entry point

A single binary (`bin/manager`, container entrypoint `/agent-sandbox-controller`) hosts every reconciler. The `--extensions` flag decides whether the extension API types and their three reconcilers are also registered on the manager; the core path always runs.

### Startup architecture

```mermaid
flowchart LR
    subgraph CLI[cmd/agent-sandbox-controller/main.go]
      Flags[flag.Parse]
      Ver[version.Print]
      ZapLog[zap.New logger]
      Sched[ctrl.SetupSignalHandler]
      OTel[asmetrics.SetupOTel\nif --enable-tracing]
      MetricsOpts[metricsserver.Options\n+pprof handlers]
      RC[ctrl.GetConfigOrDie\n+QPS/Burst]
      MGR[ctrl.NewManager\nLeaderElectionID=a3317529...]
    end

    subgraph CoreScheme[controllers.Scheme]
      coreInit[init: clientgo + sandboxv1beta1\nAddToScheme]
    end

    subgraph Reconcilers
      SR[SandboxReconciler]
      SCR[SandboxClaimReconciler]
      STR[SandboxTemplateReconciler]
      SWP[SandboxWarmPoolReconciler]
      Q[queue.NewSimpleSandboxQueue]
    end

    subgraph Probes
      Health[/healthz/]
      Ready[/readyz/]
      Metrics[/metrics + /debug/pprof]
    end

    Flags --> RC --> MGR
    ZapLog --> MGR
    Sched --> MGR
    OTel -. instrumenter .-> SR
    OTel -. instrumenter .-> SCR
    OTel -. instrumenter .-> STR
    CoreScheme --> MGR
    MGR -->|register| SR
    MGR -->|register if --extensions| SCR
    MGR -->|register if --extensions| STR
    MGR -->|register if --extensions| SWP
    SCR <-->|warm queue| Q
    MetricsOpts --> MGR
    MGR --> Health
    MGR --> Ready
    MGR --> Metrics
    MGR -->|mgr.Start ctx| Run((reconcile loops))
```

Sources: [cmd/agent-sandbox-controller/main.go:50-295](), [controllers/sandbox_controller.go:112-128]()

### Flags and defaults

The flags exposed by `main` are the contract operators tune against. Defaults below are taken directly from the `flag.*Var` calls in `cmd/agent-sandbox-controller/main.go`.

| Flag | Default | Purpose |
| --- | --- | --- |
| `--version` | `false` | Print build info from `internal/version` and exit. |
| `--cluster-domain` | `cluster.local` | Used when the Sandbox controller composes service FQDNs. |
| `--metrics-bind-address` | `:8080` | controller-runtime metrics + optional pprof. |
| `--health-probe-bind-address` | `:8081` | `/healthz` and `/readyz` (both wired to `healthz.Ping`). |
| `--leader-elect` | `true` | Single-active manager via Lease `a3317529.agent-sandbox.x-k8s.io`. |
| `--leader-election-namespace` | `""` | Empty → controller-runtime auto-detection. |
| `--extensions` | `false` | Register `SandboxClaim`, `SandboxTemplate`, `SandboxWarmPool` reconcilers. |
| `--enable-tracing` | `false` | Initialize OTLP tracing via `asmetrics.SetupOTel` (10s init timeout). |
| `--enable-pprof` / `--enable-pprof-debug` | `false` | Mount pprof handlers; debug variant exposes heap/goroutine/block/mutex/fgprof. |
| `--pprof-block-profile-rate` | `1000000` | ns sampling rate for `/debug/pprof/block` when debug is on. |
| `--pprof-mutex-profile-fraction` | `10` | 1/N sampling for `/debug/pprof/mutex` when debug is on. |
| `--kube-api-qps` | `-1.0` | Client-side QPS limit; `-1` means unlimited. |
| `--kube-api-burst` | `10` | Client-side burst. |
| `--sandbox-concurrent-workers` | `1` | Reconciles in flight for `SandboxReconciler`. |
| `--sandbox-claim-concurrent-workers` | `1` | Reconciles in flight for `SandboxClaimReconciler`. |
| `--sandbox-warm-pool-concurrent-workers` | `1` | Reconciles in flight for `SandboxWarmPoolReconciler`. |
| `--sandbox-template-concurrent-workers` | `1` | Reconciles in flight for `SandboxTemplateReconciler`. |
| `--sandbox-warm-pool-max-batch-size` | `300` | Cap on parallel sandbox create/delete per warm-pool reconcile. |

Validation in `main` enforces that worker counts and `--kube-api-burst` are positive, warns when total workers exceed 1000 or exceed `--kube-api-burst` with a positive QPS, and clamps negative pprof sampling values to 0.
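
The validation rules above can be sketched in plain Go. This is a minimal sketch rather than the code in `main.go`: the `settings` struct and `validate` function are hypothetical, the error and warning texts are invented, and it assumes the positive-QPS condition applies only to the burst comparison.

```go
package main

import (
	"errors"
	"fmt"
)

// settings holds a subset of the flag values. The struct and function names
// are hypothetical; only the flag names in the comments come from the table.
type settings struct {
	Workers       []int   // the four --sandbox-*-concurrent-workers values
	QPS           float64 // --kube-api-qps (-1 means unlimited)
	Burst         int     // --kube-api-burst
	BlockRate     int     // --pprof-block-profile-rate
	MutexFraction int     // --pprof-mutex-profile-fraction
}

// validate returns an error for non-positive worker counts or burst, collects
// warnings for risky combinations, and clamps negative pprof values to 0.
func validate(s *settings) (warnings []string, err error) {
	total := 0
	for _, w := range s.Workers {
		if w <= 0 {
			return nil, errors.New("concurrent worker counts must be positive")
		}
		total += w
	}
	if s.Burst <= 0 {
		return nil, errors.New("--kube-api-burst must be positive")
	}
	if total > 1000 {
		warnings = append(warnings, "total workers exceed 1000")
	}
	if s.QPS > 0 && total > s.Burst {
		warnings = append(warnings, "total workers exceed --kube-api-burst")
	}
	if s.BlockRate < 0 {
		s.BlockRate = 0
	}
	if s.MutexFraction < 0 {
		s.MutexFraction = 0
	}
	return warnings, nil
}

func main() {
	s := &settings{Workers: []int{8, 4, 2, 1}, QPS: 50, Burst: 10, BlockRate: -5, MutexFraction: 10}
	warnings, err := validate(s)
	fmt.Println(warnings, err, s.BlockRate)
}
```

Separating hard errors from warnings keeps misconfiguration fatal while leaving aggressive but legal tuning to the operator.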

Sources: [cmd/agent-sandbox-controller/main.go:70-145](), [cmd/agent-sandbox-controller/main.go:185-227]()
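
As an illustration of what `--cluster-domain` feeds into, the sketch below composes a Service FQDN using the standard Kubernetes DNS layout `<service>.<namespace>.svc.<cluster-domain>`. The `serviceFQDN` helper and the example names are hypothetical; how the Sandbox controller builds the name internally is not shown on this page.

```go
package main

import "fmt"

// serviceFQDN builds the standard Kubernetes DNS name for a Service:
// <service>.<namespace>.svc.<cluster-domain>. The function name is
// hypothetical; the controller's own helper may be shaped differently.
func serviceFQDN(service, namespace, clusterDomain string) string {
	return fmt.Sprintf("%s.%s.svc.%s", service, namespace, clusterDomain)
}

func main() {
	// "cluster.local" is the default value of --cluster-domain.
	fmt.Println(serviceFQDN("my-sandbox", "default", "cluster.local"))
	// prints "my-sandbox.default.svc.cluster.local"
}
```

Clusters configured with a non-default DNS domain would pass that domain through the flag so the composed names resolve.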

### Startup sequence

The order of operations in `main` matters because some side effects (pprof, scheme registration, leader election ID) need to be in place before `mgr.Start` blocks the main goroutine.

1. Parse flags, optionally short-circuit on `--version`.
2. Install the `zap` logger via `ctrl.SetLogger`.
3. Validate worker / QPS settings and log a "Concurrency settings" summary.
4. Install a signal-handled context (`ctrl.SetupSignalHandler`).
5. If `--enable-tracing` is set, build the OTLP instrumenter; otherwise use `asmetrics.NewNoOp()` so reconcilers can always call `r.Tracer.StartSpan` unconditionally.
6. Reset `http.DefaultServeMux` so importing `net/http/pprof` does **not** accidentally expose pprof on the default mux.
7. Build the scheme: always register the core (`controllers.Scheme` already includes `clientgo` + `sandboxv1beta1`); add `extensionsv1beta1` only when `--extensions` is true.
8. Configure `metricsserver.Options` with optional pprof handlers and call `runtime.SetBlockProfileRate` / `runtime.SetMutexProfileFraction` for the debug variant.
9. Take the REST config, apply QPS/Burst, build the manager with leader election ID `a3317529.agent-sandbox.x-k8s.io`.
10. Register `asmetrics.RegisterSandboxCollector` against the manager client.
11. Register `SandboxReconciler` (always) and, when `--extensions` is set, `SandboxClaimReconciler` (sharing a `queue.NewSimpleSandboxQueue` with the warm-pool reconciler), `SandboxTemplateReconciler`, and `SandboxWarmPoolReconciler`.
12. Wire `/healthz` and `/readyz` to `healthz.Ping`, then `mgr.Start(ctx)`.
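
Step 5 is an instance of the no-op object pattern: callers always hold a usable instrumenter, so they never branch on whether tracing is enabled. The sketch below shows the shape of that pattern under stated assumptions. The `Instrumenter` interface, its `StartSpan` signature, and the `recording` type are hypothetical and much simpler than whatever `internal/metrics` actually defines.

```go
package main

import "fmt"

// Instrumenter is a hypothetical, trimmed-down interface; the real type in
// internal/metrics may expose different methods and signatures.
type Instrumenter interface {
	// StartSpan begins a span and returns a function that ends it.
	StartSpan(name string) (end func())
}

// noOp satisfies Instrumenter without recording anything.
type noOp struct{}

func (noOp) StartSpan(string) func() { return func() {} }

// recording stands in for a real tracing backend.
type recording struct{ spans []string }

func (r *recording) StartSpan(name string) func() {
	r.spans = append(r.spans, name)
	return func() {}
}

// newInstrumenter mirrors the choice made at startup: a real instrumenter
// when tracing is enabled, a no-op otherwise.
func newInstrumenter(enableTracing bool) Instrumenter {
	if enableTracing {
		return &recording{}
	}
	return noOp{}
}

// reconcile never checks whether tracing is on; it always starts a span.
func reconcile(tracer Instrumenter) string {
	end := tracer.StartSpan("Reconcile")
	defer end()
	return "reconciled"
}

func main() {
	fmt.Println(reconcile(newInstrumenter(false)))
	fmt.Println(reconcile(newInstrumenter(true)))
}
```

The benefit is that reconciler code stays free of nil checks and feature-flag conditionals around every span.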

Sources: [cmd/agent-sandbox-controller/main.go:151-294]()
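
Step 6 relies on a standard-library detail: importing `net/http/pprof` registers its handlers on `http.DefaultServeMux` from an `init()` function. The sketch below uses only the standard library to show the effect of replacing the default mux and then mounting pprof explicitly on a separate one. The helper names are hypothetical, and the real binary wires its handlers through `metricsserver.Options` rather than a bare `ServeMux`.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof" // its init() registers handlers on http.DefaultServeMux
)

// statusFor serves a GET for path on the given handler and returns the status code.
func statusFor(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

// resetDefaultMux replaces the default mux, dropping the handlers that the
// pprof import registered implicitly.
func resetDefaultMux() {
	http.DefaultServeMux = http.NewServeMux()
}

// defaultMuxStatus reports how the current default mux answers a GET for path.
func defaultMuxStatus(path string) int {
	return statusFor(http.DefaultServeMux, path)
}

// newPprofMux mounts the pprof index on a dedicated mux as an explicit opt-in.
func newPprofMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	return mux
}

func main() {
	fmt.Println("default mux before reset:", defaultMuxStatus("/debug/pprof/"))
	resetDefaultMux()
	fmt.Println("default mux after reset:", defaultMuxStatus("/debug/pprof/"))
	fmt.Println("explicit mux:", statusFor(newPprofMux(), "/debug/pprof/"))
}
```

After the reset, anything that happens to serve the default mux no longer exposes profiling data, while the explicitly built mux still does.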

### Scheme registration in core

`controllers.Scheme` is a package-level `runtime.Scheme` populated in an `init()` so any importer of `controllers` gets the core types registered. `cmd/agent-sandbox-controller/main.go` then layers extensions on top of the same scheme rather than constructing a parallel one.

```go
// controllers/sandbox_controller.go
var (
    Scheme = runtime.NewScheme()
)

func init() {
    utilruntime.Must(clientgoscheme.AddToScheme(Scheme))
    utilruntime.Must(sandboxv1beta1.AddToScheme(Scheme))
}
```

Sources: [controllers/sandbox_controller.go:112-120](), [cmd/agent-sandbox-controller/main.go:174-177]()
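
The layering described in this section can be modeled without any Kubernetes dependencies. In the sketch below, a plain map stands in for `runtime.Scheme` and only tracks kind names; `registry`, `addExtensions`, and `kinds` are hypothetical and exist purely to show one shared registry being extended instead of a second one being built.

```go
package main

import (
	"fmt"
	"sort"
)

// registry is a simplified stand-in for runtime.Scheme: it only tracks which
// kinds are known. All names in this sketch are hypothetical.
type registry map[string]bool

// Scheme is shared package state, populated once at import time.
var Scheme = registry{}

func init() {
	// Core types are always registered, mirroring the init() shown above.
	Scheme["Sandbox"] = true
}

// addExtensions layers the extension kinds onto the same registry instead of
// building a second one.
func addExtensions(r registry) {
	for _, kind := range []string{"SandboxClaim", "SandboxTemplate", "SandboxWarmPool"} {
		r[kind] = true
	}
}

// kinds returns the registered kinds in sorted order.
func kinds(r registry) []string {
	out := make([]string, 0, len(r))
	for k := range r {
		out = append(out, k)
	}
	sort.Strings(out)
	return out
}

func main() {
	extensions := true // stands in for the --extensions flag
	if extensions {
		addExtensions(Scheme)
	}
	fmt.Println(kinds(Scheme))
}
```

Because every consumer reads the same registry, a client built from it can decode both core and extension objects once the extensions are layered in.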

## Map of the rest of the developer reference

This section is the navigational hub for the other pages in the wiki. Each row points to the directory or file that owns the code, so readers can go straight to the source rather than rely on prose.

| Topic | Where to look | Notes |
| --- | --- | --- |
| Core `Sandbox` types and kubebuilder markers | [api/v1beta1/sandbox_types.go](api/v1beta1/sandbox_types.go) | Spec, Status, validation/defaulting markers. |
| Core reconciler | [controllers/sandbox_controller.go](controllers/sandbox_controller.go) | Ownership, finalizers, pod/service/PVC management. |
| Extension types | [extensions/api/v1beta1/](extensions/api/v1beta1/) | `sandboxclaim_types.go`, `sandboxtemplate_types.go`, `sandboxwarmpool_types.go`. |
| Extension reconcilers | [extensions/controllers/](extensions/controllers/) | Claim/template/warm-pool reconcilers, shared `queue/` package, exclusivity tests. |
| Shared internals | [internal/lifecycle/](internal/lifecycle/), [internal/metrics/](internal/metrics/), [internal/version/](internal/version/) | Expiry helpers, custom Sandbox metric collector, OTel setup, build-info injected via `-ldflags`. |
| Generated manifests | [k8s/](k8s/), [helm/](helm/) | CRDs in `k8s/crds/` and `helm/crds/`; RBAC in `*.generated.yaml`. Regenerate via `make all` per `codegen.go`. |
| Generated typed clientset | [clients/k8s/](clients/k8s/) | Output of `dev/tools/client-gen-go.sh`. Do not hand-edit. |
| Hand-written Go SDK | [clients/go/](clients/go/) | High-level `SandboxClaim` lifecycle, Gateway / port-forward / direct connectivity. |
| Hand-written Python SDK | [clients/python/agentic-sandbox-client/](clients/python/agentic-sandbox-client/) | Directory `agentic-sandbox-client`, package `k8s_agent_sandbox`, PyPI `k8s-agent-sandbox`; sync/async parity required. |
| Examples | [examples/](examples/), [extensions/examples/](extensions/examples/) | Many `README.md` files double as docs-site pages via mounts in `site/hugo.yaml`. |
| Tests | [test/e2e/](test/e2e/), [test/benchmarks/](test/benchmarks/), `*_test.go` next to code | E2E expects `bin/KUBECONFIG` from `make deploy-kind`. |
| Developer docs and KEPs | [docs/development.md](docs/development.md), [docs/testing.md](docs/testing.md), [docs/configuration.md](docs/configuration.md), [docs/keps/](docs/keps/) | Source of truth for contribution workflow and design proposals. |
| Tooling and CI | [dev/tools/](dev/tools/), [dev/ci/](dev/ci/) | Lint, generate, deploy-kind, release scripts, Prow presubmits. |
| Public site | [site/](site/) | Hugo + Docsy; many pages are includes of repo files — edits to mounted files change the public docs. |

Sources: [AGENTS.md:13-62](), [codegen.go:14-28]()

## Build, packaging, and operational shape

The container image is built from a multi-stage `Dockerfile` that compiles a static binary with version metadata injected via `-ldflags` into `internal/version`, then copies it into `gcr.io/distroless/static-debian13:nonroot`. The image entrypoint is the controller binary, so all configuration flows through the flags documented above.

```dockerfile
RUN CGO_ENABLED=0 GOOS=linux GOARCH=${TARGETARCH} go build \
    -ldflags="-s -w -X sigs.k8s.io/agent-sandbox/internal/version.gitVersion=${GIT_VERSION} ..." \
    -o /agent-sandbox-controller ./cmd/agent-sandbox-controller
FROM gcr.io/distroless/static-debian13:nonroot
COPY --from=builder /agent-sandbox-controller /agent-sandbox-controller
ENTRYPOINT ["/agent-sandbox-controller"]
```
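
The `-X` linker flag in the excerpt overwrites a package-level string variable at link time. The sketch below shows the mechanism in a single `main` package. The variable name `gitVersion` matches the Dockerfile excerpt, while the default value `"unknown"`, the example version in the comment, and the `versionString` helper are assumptions for illustration; the real variable lives in `internal/version`.

```go
package main

import "fmt"

// gitVersion is overwritten at link time, for example:
//
//	go build -ldflags "-X main.gitVersion=v0.1.0" .
//
// The variable name matches the Dockerfile excerpt; the default value and
// the helper below are assumptions for illustration.
var gitVersion = "unknown"

// versionString formats the kind of build info a --version flag might print.
func versionString() string {
	return "agent-sandbox-controller " + gitVersion
}

func main() {
	fmt.Println(versionString())
}
```

Built without `-ldflags`, the program reports the default; with the flag set, the linker substitutes the supplied string without any source change.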

For day-to-day development, every common task is fronted by a `make` target (`make all`, `make build`, `make test-unit`, `make test-e2e`, `make lint-go`, `make lint-api`, `make deploy-kind`); the `deploy-kind` target additionally accepts `EXTENSIONS=true` and `CONTROLLER_ARGS="..."` to mirror the runtime split described above. Code generation (CRDs, RBAC, deepcopy, typed clients) is driven by the `//go:generate` directives in `codegen.go` and runs through `make fix-go-generate`.

Sources: [Dockerfile:1-37](), [AGENTS.md:36-49](), [codegen.go:14-28]()

## How to navigate from here

After reading this page, a reader should be able to:

- Deploy the controller to a local kind cluster via `make deploy-kind` (optionally `EXTENSIONS=true`) and know which CRDs land in the cluster.
- Open `cmd/agent-sandbox-controller/main.go` and identify which flag controls which reconciler and which side effect (tracing, pprof, leader election, QPS shaping).
- Choose between `controllers/` and `extensions/controllers/` based on whether a change touches the core `Sandbox` primitive or a higher-level claim/template/warm-pool semantic.
- Find the right SDK surface for client work — `clients/go/` for Go, `clients/python/agentic-sandbox-client/` for Python (published as `k8s-agent-sandbox`), and `clients/k8s/` only when typed Kubernetes-style clients are needed.

Subsequent pages drill into each of these areas: the `Sandbox` reconciler's ownership model, the `SandboxClaim`/`SandboxTemplate`/`SandboxWarmPool` workflow, the shared queue and exclusivity model, the metrics/tracing internals, and the CRD/RBAC generation pipeline.

Sources: [AGENTS.md:36-62](), [cmd/agent-sandbox-controller/main.go:236-277]()
