# SandboxWarmPool CRD

> Specification of pre-warmed sandbox pools: template binding, replica counts, and adoption semantics consumed by SandboxClaim.

- Repository: kubernetes-sigs/agent-sandbox
- GitHub: https://github.com/kubernetes-sigs/agent-sandbox
- Human wiki: https://grok-wiki.com/public/wiki/kubernetes-sigs-agent-sandbox-c3f2597a654a
- Complete Markdown: https://grok-wiki.com/public/wiki/kubernetes-sigs-agent-sandbox-c3f2597a654a/llms-full.txt

## Source Files

- `extensions/api/v1beta1/sandboxwarmpool_types.go`
- `k8s/crds/extensions.agents.x-k8s.io_sandboxwarmpools.yaml`
- `extensions/examples/sandboxwarmpool.yaml`

---

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:
- [extensions/api/v1beta1/sandboxwarmpool_types.go](extensions/api/v1beta1/sandboxwarmpool_types.go)
- [k8s/crds/extensions.agents.x-k8s.io_sandboxwarmpools.yaml](k8s/crds/extensions.agents.x-k8s.io_sandboxwarmpools.yaml)
- [extensions/examples/sandboxwarmpool.yaml](extensions/examples/sandboxwarmpool.yaml)
- [extensions/controllers/sandboxwarmpool_controller.go](extensions/controllers/sandboxwarmpool_controller.go)
- [extensions/api/v1beta1/sandboxclaim_types.go](extensions/api/v1beta1/sandboxclaim_types.go)
- [extensions/controllers/sandboxclaim_controller.go](extensions/controllers/sandboxclaim_controller.go)
</details>


`SandboxWarmPool` is a namespaced Custom Resource in API group `extensions.agents.x-k8s.io/v1beta1` that maintains a population of pre-allocated, ready-to-use `Sandbox` objects derived from a `SandboxTemplate`. It is the supply side of the warm-pool pattern: the `SandboxWarmPool` controller continually drives the live replica count toward `spec.replicas`, while the `SandboxClaim` controller consumes those pre-warmed sandboxes through an adoption protocol that flips ownership from the pool to the claim. The CRD is short, but it ties together a template binding, a `HorizontalPodAutoscaler`-friendly `scale` subresource, an `updateStrategy` for handling template drift, and a label/ownership convention that adoption depends on.

This page documents the schema, controller behavior, the labels and owner references involved in pool membership, the two update strategies, and the handoff contract that lets `SandboxClaim` adopt pool members.

## Resource Identity and Scope

The CRD is registered as a namespaced kind with short name `swp` and exposes a `scale` subresource whose `specReplicasPath`, `statusReplicasPath`, and `labelSelectorPath` point at the warm pool's own fields. This makes the resource directly compatible with `HorizontalPodAutoscaler`, and the Go type's comment explicitly anticipates that case.

| Aspect | Value |
| --- | --- |
| API group / version | `extensions.agents.x-k8s.io/v1beta1` |
| Kind / list kind | `SandboxWarmPool` / `SandboxWarmPoolList` |
| Plural / singular | `sandboxwarmpools` / `sandboxwarmpool` |
| Short name | `swp` |
| Scope | Namespaced |
| Subresources | `status`, `scale` |
| Printer columns | `Ready` (`.status.readyReplicas`), `Age` |

Sources: [k8s/crds/extensions.agents.x-k8s.io_sandboxwarmpools.yaml:1-84](k8s/crds/extensions.agents.x-k8s.io_sandboxwarmpools.yaml), [extensions/api/v1beta1/sandboxwarmpool_types.go:86-120](extensions/api/v1beta1/sandboxwarmpool_types.go)

## Spec Schema

```go
// extensions/api/v1beta1/sandboxwarmpool_types.go
type SandboxWarmPoolSpec struct {
    Replicas       int32                          `json:"replicas"`
    TemplateRef    SandboxTemplateRef             `json:"sandboxTemplateRef,omitempty"`
    UpdateStrategy *SandboxWarmPoolUpdateStrategy `json:"updateStrategy,omitempty"`
}
```

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| `spec.replicas` | `int32`, `minimum: 0` | yes | Desired number of pre-allocated sandboxes. Exposed through the `scale` subresource for HPAs. |
| `spec.sandboxTemplateRef.name` | `string` | yes | Name of the `SandboxTemplate` (same namespace) whose `podTemplate`, `service`, and `volumeClaimTemplates` are used to build pool sandboxes. Indexed by `.spec.sandboxTemplateRef.name` via `TemplateRefField`. |
| `spec.updateStrategy.type` | `Recreate` \| `OnReplenish` | no | Defaults to `OnReplenish`. Governs how stale sandboxes are reconciled when the underlying template drifts. |

The `TemplateRefField` constant is wired into a manager field indexer in `SetupWithManager` and used by `findWarmPoolsForTemplate` so that template events fan out to all pools referencing that template name. The code comment in the spec is explicit that the JSON tag `sandboxTemplateRef` and the indexer constant must stay in sync.

Sources: [extensions/api/v1beta1/sandboxwarmpool_types.go:24-69](extensions/api/v1beta1/sandboxwarmpool_types.go), [extensions/controllers/sandboxwarmpool_controller.go:534-583](extensions/controllers/sandboxwarmpool_controller.go)
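
To make the fan-out concrete, the following is a minimal, standard-library-only sketch of a name-keyed index. The `pool` type and the `indexByTemplate` and `poolsForTemplate` helpers are hypothetical stand-ins invented for illustration; the controller described above registers an indexer with its manager rather than building a map by hand.

```go
package main

import (
	"fmt"
	"sort"
)

// pool is a hypothetical, trimmed-down stand-in for SandboxWarmPool.
type pool struct {
	Namespace   string
	Name        string
	TemplateRef string // mirrors spec.sandboxTemplateRef.name
}

// indexByTemplate groups pool names by "namespace/templateName", which is
// roughly the lookup a field index on .spec.sandboxTemplateRef.name provides.
func indexByTemplate(pools []pool) map[string][]string {
	idx := make(map[string][]string)
	for _, p := range pools {
		if p.TemplateRef == "" {
			continue // nothing to index for an unset reference
		}
		key := p.Namespace + "/" + p.TemplateRef
		idx[key] = append(idx[key], p.Name)
	}
	for k := range idx {
		sort.Strings(idx[k]) // deterministic order for the example
	}
	return idx
}

// poolsForTemplate returns the pools to re-reconcile when a template changes.
func poolsForTemplate(idx map[string][]string, namespace, template string) []string {
	return idx[namespace+"/"+template]
}

func main() {
	idx := indexByTemplate([]pool{
		{"team-a", "pool-1", "python"},
		{"team-a", "pool-2", "python"},
		{"team-b", "pool-3", "python"},
	})
	// Only pools in the template's own namespace are selected.
	fmt.Println(poolsForTemplate(idx, "team-a", "python"))
}
```

Keying on the namespace as well as the name reflects the same-namespace constraint on `sandboxTemplateRef`: a template event in one namespace should not wake pools in another.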

## Status Schema

```go
type SandboxWarmPoolStatus struct {
    Replicas      int32  `json:"replicas,omitempty"`
    ReadyReplicas int32  `json:"readyReplicas,omitempty"`
    Selector      string `json:"selector,omitempty"`
}
```

`status.replicas` is the count of active (non-deleting, owned-or-adopted, non-stale) sandboxes the controller currently observes. `status.readyReplicas` counts those whose `Sandbox` `Ready` condition is `True`. `status.selector` is the stringified label selector (`agents.x-k8s.io/warm-pool-sandbox=<NameHash(poolName)>`) used internally and exported so the `scale` subresource can attach an HPA via `labelSelectorPath: .status.selector`.

Status writes go through Server-Side Apply with field owner `warmpool-controller` and `ForceOwnership`, and are skipped when semantically unchanged from the prior status snapshot.

Sources: [extensions/api/v1beta1/sandboxwarmpool_types.go:71-84](extensions/api/v1beta1/sandboxwarmpool_types.go), [extensions/controllers/sandboxwarmpool_controller.go:159-170](extensions/controllers/sandboxwarmpool_controller.go), [extensions/controllers/sandboxwarmpool_controller.go:416-443](extensions/controllers/sandboxwarmpool_controller.go)
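
As a rough illustration of the two status behaviors described above, the sketch below renders the selector string and skips a write when nothing changed. The `poolStatus` type and both helper names are hypothetical, and plain struct comparison is used here as a simplification of the semantic equality check the controller performs.

```go
package main

import "fmt"

const warmPoolLabel = "agents.x-k8s.io/warm-pool-sandbox"

// poolStatus is a hypothetical mirror of SandboxWarmPoolStatus.
type poolStatus struct {
	Replicas      int32
	ReadyReplicas int32
	Selector      string
}

// selectorFor renders the equality-based selector exported in status.selector.
func selectorFor(poolNameHash string) string {
	return warmPoolLabel + "=" + poolNameHash
}

// needsStatusWrite reports whether the freshly computed status differs from
// the snapshot taken at the start of the reconcile.
func needsStatusWrite(prev, next poolStatus) bool {
	return prev != next
}

func main() {
	prev := poolStatus{Replicas: 3, ReadyReplicas: 2, Selector: selectorFor("0a1b2c3d")}
	next := prev
	fmt.Println(needsStatusWrite(prev, next)) // unchanged, so no write
	next.ReadyReplicas = 3
	fmt.Println(needsStatusWrite(prev, next)) // changed, so the status is applied
}
```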

## Example

```yaml
# extensions/examples/sandboxwarmpool.yaml
apiVersion: extensions.agents.x-k8s.io/v1alpha1
kind: SandboxWarmPool
metadata:
  name: sandboxwarmpool-example
spec:
  updateStrategy:
    type: Recreate
  replicas: 1
  sandboxTemplateRef:
    name: secure-datascience-template
```

Note that the bundled example file uses `v1alpha1` in its `apiVersion`, while the generated CRD (`extensions.agents.x-k8s.io_sandboxwarmpools.yaml`) only serves and stores `v1beta1`. Before applying the example to a cluster, change its `apiVersion` to `extensions.agents.x-k8s.io/v1beta1`.

Sources: [extensions/examples/sandboxwarmpool.yaml:1-11](extensions/examples/sandboxwarmpool.yaml), [k8s/crds/extensions.agents.x-k8s.io_sandboxwarmpools.yaml:18-27](k8s/crds/extensions.agents.x-k8s.io_sandboxwarmpools.yaml)

## Controller Architecture

```mermaid
flowchart LR
    subgraph User["User / HPA"]
        SWP["SandboxWarmPool (spec.replicas, sandboxTemplateRef, updateStrategy)"]
    end

    subgraph TemplateNS["Same namespace"]
        ST["SandboxTemplate (referenced by name)"]
    end

    subgraph Ctrl["SandboxWarmPoolReconciler"]
        REC["Reconcile()"]
        POOL["reconcilePool()"]
        FILTER["filterActiveSandboxes() — adopt orphans, drop stale"]
        STALE["isSandboxStale() — pod template hash + DeepEqual"]
        BUILD["buildSandboxCR() — apply secure defaults, set ownerRef"]
        SLOW["slowStartBatch() — create/delete in parallel"]
    end

    subgraph Cluster["Pool members"]
        SB1["Sandbox (label warm-pool-sandbox=H(poolName))"]
        SB2["Sandbox ..."]
    end

    subgraph Claim["Consumer side"]
        SC["SandboxClaim"]
        SCR["SandboxClaimReconciler.adoptSandboxFromCandidates()"]
        Q["WarmSandboxQueue (templateRefHash → sandbox keys)"]
    end

    SWP --> REC --> POOL --> FILTER --> STALE
    POOL -->|need more| BUILD --> SLOW --> SB1
    POOL -->|too many| SLOW -.->|delete| SB2
    ST -. watched .-> REC
    SB1 -. ownedBy .-> SWP
    SB1 -. enqueued by hash .-> Q --> SCR --> SC
    SCR -->|completeAdoption: strip labels, reset ownerRef| SB1
```

`SandboxWarmPoolReconciler` owns the supply side. It watches `SandboxWarmPool` objects, owns `Sandbox` objects via `Owns(&sandboxv1beta1.Sandbox{})`, and also watches `SandboxTemplate` so that a template change triggers reconciliation of every warm pool that references it. Concurrency is controlled by `MaxConcurrentReconciles` passed at setup, and `MaxBatchSize` (defaulting to `sandboxCreateDeleteMaxBatchSize = 300`) caps the number of creates/deletes per reconcile pass.

Sources: [extensions/controllers/sandboxwarmpool_controller.go:48-66](extensions/controllers/sandboxwarmpool_controller.go), [extensions/controllers/sandboxwarmpool_controller.go:534-583](extensions/controllers/sandboxwarmpool_controller.go)

## Reconciliation Loop

`Reconcile` fetches the pool, returns early on deletion, snapshots the status, calls `reconcilePool`, and then patches the status via SSA. The interesting work happens in `reconcilePool` and its helpers.

1. **Hash the pool name.** `poolNameHash := sandboxcontrollers.NameHash(warmPool.Name)` produces an 8-character hash used as the value of the membership label `agents.x-k8s.io/warm-pool-sandbox`.
2. **List candidate sandboxes.** Sandboxes carrying that label are listed within the pool's namespace.
3. **Resolve the template and compute its pod-template hash.** `fetchTemplateAndHash` retrieves the `SandboxTemplate` and computes `computePodTemplateHash(template)` from `Spec.PodTemplate` (JSON-marshaled, then hashed). Failures other than `NotFound` are joined into the returned error so reconciliation can still proceed for create/delete decisions when the template is missing.
4. **Filter, adopt, drop stale.** `filterActiveSandboxes` walks each candidate and decides between *ignore*, *adopt*, *delete stale*, or *keep active* (see [Adoption and Ownership](#adoption-and-ownership) and [Update Strategies](#update-strategies)).
5. **Sweep stuck sandboxes.** Any active sandbox that is not `Ready` and is older than the constant `warmPoolReadinessGracePeriod = 5 * time.Minute` is deleted. This bounds how long a wedged pre-warm can occupy a slot.
6. **Compute deltas.** `currentReplicas` is the count of healthy active sandboxes; the controller creates or deletes to converge toward `spec.replicas`. Both operations use `slowStartBatch`, which starts with one parallel call and doubles the batch on every success up to the remaining count, capped per reconcile by `MaxBatchSize`.
7. **Delete ordering.** When over-provisioned, sandboxes are sorted for deletion *unready first, then newest first*, which preserves the older Ready members that are the most valuable adoption candidates.
8. **Status update.** `status.replicas`, `status.readyReplicas`, and `status.selector` are written and applied with SSA only if changed.

Sources: [extensions/controllers/sandboxwarmpool_controller.go:67-229](extensions/controllers/sandboxwarmpool_controller.go), [extensions/controllers/sandboxwarmpool_controller.go:303-325](extensions/controllers/sandboxwarmpool_controller.go), [extensions/controllers/sandboxwarmpool_controller.go:585-621](extensions/controllers/sandboxwarmpool_controller.go)

## Labels, Annotations, and Ownership

`buildSandboxCR` constructs each pool member. It sets a controller `OwnerReference` from the new `Sandbox` to its `SandboxWarmPool` (so deletions cascade), copies `template.Spec.PodTemplate.ObjectMeta.Labels` and `Annotations` into the pod template, and overlays the following membership/identity labels and annotations:

| Key | Where | Value | Purpose |
| --- | --- | --- | --- |
| `agents.x-k8s.io/warm-pool-sandbox` | Sandbox + pod template labels | `NameHash(poolName)` | Pool membership; used as the list selector and in `status.selector`. Also used by `SandboxClaim` to pin adoption to a specific pool. |
| `agents.x-k8s.io/sandbox-template-ref-hash` | Sandbox + pod template labels | `SandboxTemplateRefHash(templateRefName)` | Allows `SandboxClaim` to find pool members for a given template by hash. |
| `SandboxPodTemplateHashLabel` (from `sandboxv1beta1`) | Sandbox + pod template labels | `computePodTemplateHash(template)` | Identifies the exact pod-template revision a member was built from; consumed by `isSandboxStale`. |
| `SandboxTemplateRefAnnotation` | Sandbox annotation | `warmPool.Spec.TemplateRef.Name` | Plain-text record of which template produced the sandbox. |

The pod spec is normalized with `ApplySandboxSecureDefaults(template, &sandbox.Spec.PodTemplate.Spec)` before creation, and the same normalization is reapplied when computing the expected spec in `comparePodSpecs`. This is what lets the staleness check use `equality.Semantic.DeepEqual` without false positives from controller-defaulted fields.

Sources: [extensions/controllers/sandboxwarmpool_controller.go:48-52](extensions/controllers/sandboxwarmpool_controller.go), [extensions/controllers/sandboxwarmpool_controller.go:327-390](extensions/controllers/sandboxwarmpool_controller.go), [extensions/controllers/sandboxwarmpool_controller.go:521-531](extensions/controllers/sandboxwarmpool_controller.go)
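
The label overlay in the table can be pictured with the sketch below. Several pieces are assumptions made for illustration: `shortHash` is a hypothetical stand-in for `NameHash` and `SandboxTemplateRefHash` (the real hash functions may differ), `podTemplateHashLabel` uses a placeholder key because the value of `SandboxPodTemplateHashLabel` is not shown on this page, and `memberLabels` is an invented helper name.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const (
	warmPoolLabel        = "agents.x-k8s.io/warm-pool-sandbox"
	templateRefHashLabel = "agents.x-k8s.io/sandbox-template-ref-hash"
	// Placeholder key: the real value of SandboxPodTemplateHashLabel is an assumption here.
	podTemplateHashLabel = "example.invalid/pod-template-hash"
)

// shortHash is a hypothetical 8-character hash (FNV-1a, hex encoded).
func shortHash(s string) string {
	h := fnv.New32a()
	h.Write([]byte(s))
	return fmt.Sprintf("%08x", h.Sum32())
}

// memberLabels copies the template's labels and then overlays the membership
// and identity labels, so the controller-managed keys always win.
func memberLabels(templateLabels map[string]string, poolName, templateName, podTemplateHash string) map[string]string {
	out := make(map[string]string, len(templateLabels)+3)
	for k, v := range templateLabels {
		out[k] = v
	}
	out[warmPoolLabel] = shortHash(poolName)
	out[templateRefHashLabel] = shortHash(templateName)
	out[podTemplateHashLabel] = podTemplateHash
	return out
}

func main() {
	labels := memberLabels(map[string]string{"app": "datascience"},
		"sandboxwarmpool-example", "secure-datascience-template", "rev-1")
	fmt.Println(labels["app"], len(labels[warmPoolLabel]), labels[podTemplateHashLabel])
}
```

Hashing the pool and template names keeps label values short and within label-value syntax limits regardless of how long the original names are.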

## Adoption and Ownership

`filterActiveSandboxes` distinguishes three states for each labeled sandbox:

```text
┌──────────────────────────────┐
│   Sandbox carries pool label │
└────────────┬─────────────────┘
             ▼
   controllerRef? ──── nil (orphan)
        │                 │
        │                 ├──> If stale → Delete
        │                 └──> Else      → adoptSandbox()  ───┐
        │                                                     │
        ├── ref.UID == warmPool.UID  (owned)                  │
        │   ├──> If Recreate strategy && stale → Delete       │
        │   └──> Else → keep                                  │
        │                                                     │
        └── ref.UID != warmPool.UID  (foreign)                │
            └──> Ignore (log only)                            │
                                                              │
adoptSandbox: SetControllerReference(warmPool, sb)  <─────────┘
              + r.Update(ctx, sb)
```

Adoption is the mechanism that lets an orphaned, label-matching sandbox rejoin a pool — for example, after the previous owner pool was deleted or after the `SandboxClaim` controller's owner-reference flip raced with the pool's lister. Conversely, sandboxes whose `controllerRef.UID` does not match the pool are explicitly ignored: once a `SandboxClaim` takes ownership during adoption, that sandbox is no longer counted toward `status.replicas`.

Sources: [extensions/controllers/sandboxwarmpool_controller.go:231-301](extensions/controllers/sandboxwarmpool_controller.go)
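
The decision tree above reduces to a small pure function. The sketch below uses hypothetical names (`classify`, the `action` values) and plain strings for UIDs; an empty `controllerUID` stands for a sandbox with no controller reference.

```go
package main

import "fmt"

type action string

const (
	actionIgnore action = "ignore"
	actionAdopt  action = "adopt"
	actionDelete action = "delete"
	actionKeep   action = "keep"
)

// classify decides what the pool does with one label-matching sandbox.
func classify(controllerUID, poolUID string, stale, recreate bool) action {
	switch {
	case controllerUID == "": // orphan
		if stale {
			return actionDelete
		}
		return actionAdopt
	case controllerUID == poolUID: // owned by this pool
		if recreate && stale {
			return actionDelete
		}
		return actionKeep
	default: // owned by something else, e.g. a SandboxClaim
		return actionIgnore
	}
}

func main() {
	fmt.Println(classify("", "pool-uid", false, false))          // orphan, fresh
	fmt.Println(classify("pool-uid", "pool-uid", true, false))   // owned, stale, OnReplenish
	fmt.Println(classify("pool-uid", "pool-uid", true, true))    // owned, stale, Recreate
	fmt.Println(classify("claim-uid", "pool-uid", false, false)) // foreign owner
}
```

Note the asymmetry the diagram encodes: a stale orphan is deleted under either strategy, while a stale owned member is deleted only under `Recreate`.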

## Update Strategies

```go
const (
    RecreateSandboxWarmPoolUpdateStrategyType    SandboxWarmPoolUpdateStrategyType = "Recreate"
    OnReplenishSandboxWarmPoolUpdateStrategyType SandboxWarmPoolUpdateStrategyType = "OnReplenish"
)
```

| Strategy | Behavior when template drifts | When it triggers replacement |
| --- | --- | --- |
| `OnReplenish` (default) | Stale members keep running until they are manually deleted or adopted out by a `SandboxClaim`. Fresh members built from the new template fill the gap on the next reconcile. | Replacement happens lazily as members are removed for other reasons. |
| `Recreate` | Stale members are eagerly deleted in the reconcile pass. The pool then refills under `slowStartBatch`. Per the type comment, only `PodTemplate` spec changes trigger recreate; pure label/annotation edits on the template do not. | Replacement happens as soon as the controller observes drift. |

`isSandboxStale` evaluates drift in three layers, short-circuiting where possible to avoid per-sandbox `DeepEqual` work:

1. If the sandbox's `agents.x-k8s.io/sandbox-template-ref-hash` label does not match `SandboxTemplateRefHash(template.Name)`, it is stale (template binding changed at the name level).
2. For owned members, if the recorded `SandboxPodTemplateHashLabel` matches `currentPodTemplateHash`, it is fresh — no further work.
3. Otherwise (mismatched hashes, or orphan), `comparePodSpecs` runs `ApplySandboxSecureDefaults` against the template and uses `equality.Semantic.DeepEqual` to compare. Results are memoized in `vettedHashes` per reconcile so identical legacy revisions are only compared once.

If `currentPodTemplateHash` could not be computed (marshal error), the staleness check is skipped to avoid a mass deletion on a transient hashing failure.

Note that the `updateStrategy.type` field has `+kubebuilder:default=OnReplenish`. The Go path additionally falls back to `OnReplenish` if `spec.updateStrategy` is unset or contains an unknown value.

Sources: [extensions/api/v1beta1/sandboxwarmpool_types.go:49-69](extensions/api/v1beta1/sandboxwarmpool_types.go), [extensions/controllers/sandboxwarmpool_controller.go:240-301](extensions/controllers/sandboxwarmpool_controller.go), [extensions/controllers/sandboxwarmpool_controller.go:462-531](extensions/controllers/sandboxwarmpool_controller.go)
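
The layered check can be sketched as below. This is an illustration under stated assumptions, not the controller's code: `podSpec`, `sandbox`, and `staleChecker` are hypothetical types, `reflect.DeepEqual` stands in for `equality.Semantic.DeepEqual`, the secure-defaults normalization is assumed to have been applied already, and the placement of the empty-hash guard before the other layers is a guess.

```go
package main

import (
	"fmt"
	"reflect"
)

type podSpec struct {
	Image string
	Args  []string
}

type sandbox struct {
	TemplateRefHash string // value of the sandbox-template-ref-hash label
	PodTemplateHash string // value of the pod-template hash label
	Owned           bool
	Spec            podSpec
}

type staleChecker struct {
	wantTemplateRefHash    string
	currentPodTemplateHash string
	expectedSpec           podSpec
	vetted                 map[string]bool // recorded hash -> stale? (per reconcile)
	compares               int             // number of deep comparisons performed
}

func (c *staleChecker) isStale(sb sandbox) bool {
	if c.currentPodTemplateHash == "" {
		return false // hashing failed: skip the check rather than mass-delete
	}
	if sb.TemplateRefHash != c.wantTemplateRefHash {
		return true // layer 1: bound to a different template name
	}
	if sb.Owned && sb.PodTemplateHash == c.currentPodTemplateHash {
		return false // layer 2: hash fast path
	}
	if stale, ok := c.vetted[sb.PodTemplateHash]; ok && sb.PodTemplateHash != "" {
		return stale // memoized result for this legacy revision
	}
	c.compares++
	stale := !reflect.DeepEqual(sb.Spec, c.expectedSpec) // layer 3: deep compare
	if sb.PodTemplateHash != "" {
		c.vetted[sb.PodTemplateHash] = stale
	}
	return stale
}

func main() {
	c := &staleChecker{
		wantTemplateRefHash:    "t1",
		currentPodTemplateHash: "rev-2",
		expectedSpec:           podSpec{Image: "img:2"},
		vetted:                 map[string]bool{},
	}
	old := sandbox{TemplateRefHash: "t1", PodTemplateHash: "rev-1", Owned: true, Spec: podSpec{Image: "img:1"}}
	fmt.Println(c.isStale(old), c.isStale(old), c.compares)
}
```

In the example, the second call for the same legacy hash is answered from the cache, so only one deep comparison runs for any number of members sharing that revision.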

## Consumption by SandboxClaim

A `SandboxClaim` chooses how to consume warm pools through its `spec.warmpool` field (`WarmPoolPolicy`):

| Policy value | Meaning |
| --- | --- |
| `"default"` (default) | Adopt any pool member whose `sandbox-template-ref-hash` matches the claim's template. |
| `"none"` | Skip warm pools entirely and always create a cold sandbox. |
| `<pool name>` | Only adopt members carrying `agents.x-k8s.io/warm-pool-sandbox = NameHash(<pool name>)`. |

The `SandboxClaim` controller keeps a `WarmSandboxQueue` keyed by `SandboxTemplateRefHash(templateRefName)`. When picking a candidate, `getCandidate` enforces the `IsSpecificPool` check by comparing the candidate's pool-membership label against `NameHash(string(policy))`, returning unmatched candidates to the queue.
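
A minimal sketch of that policy check follows. The `candidate` type, `matchesPolicy`, and `pickCandidate` are hypothetical names; `shortHash` is an invented stand-in for `NameHash`; treating an empty policy like `"default"` is an assumption; and a plain slice stands in for `WarmSandboxQueue`.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const warmPoolLabel = "agents.x-k8s.io/warm-pool-sandbox"

// shortHash is a hypothetical stand-in for NameHash.
func shortHash(s string) string {
	h := fnv.New32a()
	h.Write([]byte(s))
	return fmt.Sprintf("%08x", h.Sum32())
}

type candidate struct {
	Name   string
	Labels map[string]string
}

// matchesPolicy reports whether a queued candidate may be adopted under the
// given warm-pool policy value.
func matchesPolicy(policy string, c candidate) bool {
	switch policy {
	case "none":
		return false
	case "", "default": // empty treated as default (assumption)
		return true
	default: // a specific pool name
		return c.Labels[warmPoolLabel] == shortHash(policy)
	}
}

// pickCandidate returns the first matching candidate and the remaining queue,
// leaving unmatched candidates in place for other claims.
func pickCandidate(queue []candidate, policy string) (*candidate, []candidate) {
	for i := range queue {
		if matchesPolicy(policy, queue[i]) {
			picked := queue[i]
			rest := append(append([]candidate{}, queue[:i]...), queue[i+1:]...)
			return &picked, rest
		}
	}
	return nil, queue
}

func main() {
	queue := []candidate{
		{"sb-1", map[string]string{warmPoolLabel: shortHash("pool-a")}},
		{"sb-2", map[string]string{warmPoolLabel: shortHash("pool-b")}},
	}
	picked, rest := pickCandidate(queue, "pool-b")
	fmt.Println(picked.Name, len(rest))
}
```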

Adoption completes in `completeAdoption`, which is what makes the handoff irreversible from the pool's perspective:

```go
// extensions/controllers/sandboxclaim_controller.go
delete(adopted.Labels, warmPoolSandboxLabel)
delete(adopted.Labels, sandboxTemplateRefHash)
delete(adopted.Labels, v1beta1.SandboxPodTemplateHashLabel)

// Transfer ownership from SandboxWarmPool to SandboxClaim
adopted.OwnerReferences = nil
if err := controllerutil.SetControllerReference(claim, adopted, r.Scheme); err != nil {
    return fmt.Errorf("failed to set controller reference on adopted sandbox: %w", err)
}
```

After this point, the pool controller's list (filtered by the membership label) no longer sees the sandbox, so its `status.replicas` drops by one and the next reconcile schedules a fresh replacement. The handoff also explains the asymmetry of update strategies: `OnReplenish` deliberately leans on this drain-and-refill path to roll the pool forward.

One constraint propagates from the warm-pool design to claims: when `WarmPool` policy is anything other than `none`, `SandboxClaim` rejects `spec.env`, because injecting environment variables into an already-running pool sandbox is not supported.
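
The constraint can be expressed as a simple validation rule. The sketch below is hypothetical: `validateClaim` and the error text are invented, and this page does not say whether the real check runs at admission time or during reconciliation.

```go
package main

import (
	"errors"
	"fmt"
)

var errEnvWithWarmPool = errors.New("spec.env is not supported when a warm pool may be used")

// validateClaim rejects environment overrides unless warm pools are disabled
// for the claim. An empty policy is treated like "default" (assumption).
func validateClaim(policy string, env map[string]string) error {
	if policy != "none" && len(env) > 0 {
		return errEnvWithWarmPool
	}
	return nil
}

func main() {
	env := map[string]string{"MODE": "debug"}
	fmt.Println(validateClaim("default", env)) // rejected
	fmt.Println(validateClaim("none", env))    // accepted: cold sandbox is built with env
}
```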

Sources: [extensions/api/v1beta1/sandboxclaim_types.go:33-55](extensions/api/v1beta1/sandboxclaim_types.go), [extensions/api/v1beta1/sandboxclaim_types.go:124-151](extensions/api/v1beta1/sandboxclaim_types.go), [extensions/controllers/sandboxclaim_controller.go:591-645](extensions/controllers/sandboxclaim_controller.go), [extensions/controllers/sandboxclaim_controller.go:728-741](extensions/controllers/sandboxclaim_controller.go), [extensions/controllers/sandboxclaim_controller.go:1155-1178](extensions/controllers/sandboxclaim_controller.go)

## Operational Notes

- **HPA integration.** Because the `scale` subresource maps `spec.replicas` and `status.replicas` plus a label selector (`status.selector`), a standard `HorizontalPodAutoscaler` can target a `SandboxWarmPool` directly. The Go comment on `Spec.Replicas` calls this out explicitly.
- **Throughput caps.** `MaxBatchSize` (default 300) bounds creates and deletes per reconcile; within a pass, the `slowStartBatch` helper starts at a parallelism of 1 and doubles it after each successful batch, up to that cap, which keeps the cluster from being hit with a burst of simultaneous Sandbox creates on startup.
- **Readiness watchdog.** Sandboxes that fail to reach `Ready` within `warmPoolReadinessGracePeriod` (5 minutes) are deleted on the next reconcile, ensuring the pool does not accumulate wedged slots.
- **Cross-resource indexing.** The `TemplateRefField = ".spec.sandboxTemplateRef.name"` constant is shared with `SandboxClaim` for the same purpose; both controllers register a field indexer so that watching `SandboxTemplate` produces correct fan-out via `findWarmPoolsForTemplate`.
- **Deletion.** When a `SandboxWarmPool` is being deleted (`DeletionTimestamp` non-zero), `Reconcile` returns immediately; cascading delete of the owned `Sandbox` objects is left to Kubernetes garbage collection driven by the controller-owner reference set in `buildSandboxCR`.

Sources: [extensions/api/v1beta1/sandboxwarmpool_types.go:24-47](extensions/api/v1beta1/sandboxwarmpool_types.go), [extensions/controllers/sandboxwarmpool_controller.go:48-52](extensions/controllers/sandboxwarmpool_controller.go), [extensions/controllers/sandboxwarmpool_controller.go:80-148](extensions/controllers/sandboxwarmpool_controller.go), [extensions/controllers/sandboxwarmpool_controller.go:534-583](extensions/controllers/sandboxwarmpool_controller.go)

## Summary

`SandboxWarmPool` is a small CRD with three knobs (`replicas`, `sandboxTemplateRef`, and `updateStrategy`) that drive a controller responsible for keeping a population of pre-warmed `Sandbox` objects aligned with a referenced `SandboxTemplate`. Membership is encoded in a hashed label on each pool member, ownership is held by the pool until a `SandboxClaim` adopts a member, and template drift is handled either lazily (`OnReplenish`) or eagerly (`Recreate`). The `scale` subresource and `status.selector` make the pool first-class for HPAs, while the slow-start batching, the 5-minute readiness watchdog, and the per-reconcile staleness cache (`vettedHashes`) shape the controller's runtime behavior under churn.
