# SandboxClaim CRD

> Claim resource that resolves to a Sandbox: template references, warm-pool policy, env injection, additional pod metadata, and shutdown policies (Delete, DeleteForeground, Retain).

- Repository: kubernetes-sigs/agent-sandbox
- GitHub: https://github.com/kubernetes-sigs/agent-sandbox
- Human wiki: https://grok-wiki.com/public/wiki/kubernetes-sigs-agent-sandbox-c3f2597a654a
- Complete Markdown: https://grok-wiki.com/public/wiki/kubernetes-sigs-agent-sandbox-c3f2597a654a/llms-full.txt

## Source Files

- `extensions/api/v1beta1/sandboxclaim_types.go`
- `k8s/crds/extensions.agents.x-k8s.io_sandboxclaims.yaml`
- `extensions/examples/sandboxclaim.yaml`
- `extensions/examples/sandbox-claim.yaml`

---

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:
- [extensions/api/v1beta1/sandboxclaim_types.go](extensions/api/v1beta1/sandboxclaim_types.go)
- [k8s/crds/extensions.agents.x-k8s.io_sandboxclaims.yaml](k8s/crds/extensions.agents.x-k8s.io_sandboxclaims.yaml)
- [extensions/examples/sandboxclaim.yaml](extensions/examples/sandboxclaim.yaml)
- [extensions/examples/sandbox-claim.yaml](extensions/examples/sandbox-claim.yaml)
- [extensions/controllers/sandboxclaim_controller.go](extensions/controllers/sandboxclaim_controller.go)
- [extensions/api/v1beta1/sandboxtemplate_types.go](extensions/api/v1beta1/sandboxtemplate_types.go)
- [api/v1beta1/sandbox_types.go](api/v1beta1/sandbox_types.go)
</details>

`SandboxClaim` is the namespaced "user-intent" resource in the `extensions.agents.x-k8s.io` API group. A claim names a `SandboxTemplate`, optionally narrows warm-pool selection, layers extra pod labels and annotations on top of the template, injects environment variables, and declares when and how the resulting `Sandbox` should be torn down. The `SandboxClaimReconciler` resolves the claim into exactly one owned `Sandbox` in the same namespace, either by adopting a pre-provisioned `Sandbox` from a warm pool or by creating a fresh one from the template.

This page documents the schema, the spec/status surfaces, the warm-pool resolution flow, metadata and environment-variable injection rules, the lifecycle and shutdown policies (`Delete`, `DeleteForeground`, `Retain`), and the status conditions emitted by the controller.

## API identification and scoping

The custom resource is registered as `SandboxClaim` (singular/short name `sandboxclaim`, plural `sandboxclaims`) in group `extensions.agents.x-k8s.io`, namespaced, with `v1beta1` served and stored. The `status` subresource is enabled so spec/status updates are decoupled.

```yaml
group: extensions.agents.x-k8s.io
names: { kind: SandboxClaim, plural: sandboxclaims, singular: sandboxclaim, shortNames: [sandboxclaim] }
scope: Namespaced
versions:
- name: v1beta1
  served: true
  storage: true
  subresources: { status: {} }
```

Note that the example manifests under `extensions/examples/` declare `apiVersion: extensions.agents.x-k8s.io/v1alpha1`, while the CRD published in `k8s/crds/` only serves `v1beta1`. Use `v1beta1` to match the installed CRD.

Sources: [k8s/crds/extensions.agents.x-k8s.io_sandboxclaims.yaml:1-141](), [extensions/api/v1beta1/sandboxclaim_types.go:175-202](), [extensions/examples/sandboxclaim.yaml:1-20]()

## Spec surface

`SandboxClaimSpec` is intentionally small: the heavy lifting belongs in the referenced `SandboxTemplate`, and the claim only carries fields that vary per consumer.

| Field | Type | Required | Default | Purpose |
| --- | --- | --- | --- | --- |
| `sandboxTemplateRef.name` | string | yes | — | Name of the `SandboxTemplate` in the same namespace. |
| `warmpool` | string | no | `default` | One of `none`, `default`, or a specific warm-pool name. Controls adoption. |
| `additionalPodMetadata` | object | no | `{}` | Extra `labels`/`annotations` merged onto the pod template. |
| `env` | `[]EnvVar` | no | `[]` | Environment variables injected into one or more containers. |
| `lifecycle.shutdownTime` | RFC3339 timestamp | no | — | Absolute expiration time for the claim. |
| `lifecycle.ttlSecondsAfterFinished` | int32 ≥ 0 | no | — | Retention window after the mirrored `Finished` condition transitions. |
| `lifecycle.shutdownPolicy` | enum | no | `Retain` | One of `Delete`, `DeleteForeground`, `Retain`. |

`sandboxTemplateRef` only carries `name`; cross-namespace references are not modeled, so the template must live in the same namespace as the claim. The OpenAPI schema enforces both the `sandboxTemplateRef` requirement and the `shutdownPolicy` enum.

Sources: [extensions/api/v1beta1/sandboxclaim_types.go:101-151](), [k8s/crds/extensions.agents.x-k8s.io_sandboxclaims.yaml:29-86]()

### Template reference

```go
type SandboxTemplateRef struct {
    Name string `json:"name,omitempty"`
}
```

The reconciler resolves the template only when needed: it first tries to find or adopt an existing `Sandbox`, and fetches the template only if it has to create one from scratch or if metadata needs to be merged after adoption. When the template is missing (`ErrTemplateNotFound`), the reconcile is requeued rather than returned as an error, which avoids log spam.

Sources: [extensions/api/v1beta1/sandboxclaim_types.go:101-106](), [extensions/controllers/sandboxclaim_controller.go:62-62](), [extensions/controllers/sandboxclaim_controller.go:263-272](), [extensions/controllers/sandboxclaim_controller.go:1182-1197]()

### Warm-pool policy

`WarmPoolPolicy` is a free-form string with two sentinel values:

| Value | Meaning |
| --- | --- |
| `none` | Never adopt from a warm pool; always cold-start from the template. |
| `default` | Adopt from any matching warm pool (default). |
| any other string | Adopt only from the named pool. `IsSpecificPool()` returns true. |

Specific-pool matching is enforced in the adoption loop by comparing `Labels[warmPoolSandboxLabel]` against `NameHash(policy)`; non-matching candidates are pushed back onto the queue. Two important interactions:

- If `warmpool != none` **and** the claim sets `spec.env`, the controller refuses adoption and returns an error, because env injection mutates the pod spec at create time and cannot be applied to an already-running warm sandbox.
- If `warmpool == none`, the controller skips the warm-pool queue entirely and falls through to `createSandbox`.
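
As a rough sketch of how the three policy classes and the env interaction fit together, assuming the constant names, the `allowsAdoption` and `checkEnv` helpers, and the treatment of an empty string as "not a specific pool" (none of which are confirmed by the sources listed here):

```go
package main

import (
	"errors"
	"fmt"
)

// WarmPoolPolicy mirrors the free-form string described above.
type WarmPoolPolicy string

// Constant names are illustrative assumptions; only the values "none" and
// "default" come from the text.
const (
	WarmPoolPolicyNone    WarmPoolPolicy = "none"
	WarmPoolPolicyDefault WarmPoolPolicy = "default"
)

// IsSpecificPool reports whether the policy names one particular pool.
// Treating the empty string as non-specific is an assumption of this sketch.
func (p WarmPoolPolicy) IsSpecificPool() bool {
	return p != "" && p != WarmPoolPolicyNone && p != WarmPoolPolicyDefault
}

// allowsAdoption (hypothetical) is true for every value except "none".
func (p WarmPoolPolicy) allowsAdoption() bool {
	return p != WarmPoolPolicyNone
}

// checkEnv (hypothetical) models the first interaction above: env injection
// and warm-pool adoption cannot be combined.
func checkEnv(p WarmPoolPolicy, envCount int) error {
	if p.allowsAdoption() && envCount > 0 {
		return errors.New("spec.env cannot be combined with warm-pool adoption")
	}
	return nil
}

func main() {
	for _, p := range []WarmPoolPolicy{"none", "default", "fast-pool"} {
		fmt.Printf("%-10s adopt=%v specific=%v envOK=%v\n",
			p, p.allowsAdoption(), p.IsSpecificPool(), checkEnv(p, 1) == nil)
	}
}
```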

Sources: [extensions/api/v1beta1/sandboxclaim_types.go:33-55](), [extensions/controllers/sandboxclaim_controller.go:74-80](), [extensions/controllers/sandboxclaim_controller.go:591-646](), [extensions/controllers/sandboxclaim_controller.go:1155-1180]()

### Additional pod metadata

`additionalPodMetadata` reuses `sandboxv1beta1.PodMetadata` (labels + annotations) and is merged onto the pod template that ends up in `Sandbox.spec.podTemplate.metadata`. Two rules are enforced server-side by the controller:

1. **No restricted-domain keys.** Keys whose domain prefix is `kubernetes.io`, `k8s.io`, or `agents.x-k8s.io` (or any subdomain) are rejected via `ErrInvalidMetadata`. Label values must additionally pass `validation.IsValidLabelValue` (max 63 chars, standard pattern).
2. **No silent overrides.** If the template already defines the same label or annotation key with a different value, `mergePodMetadata` returns a metadata-override conflict error. Identical values are allowed; new keys are appended.

The merged metadata also receives controller-injected identity labels (`agents.x-k8s.io/claim-uid` from the claim UID and `agents.x-k8s.io/sandbox-template-ref-hash`). On every reconcile, the controller diffs the recomputed `mergedMeta` against the live `Sandbox.spec.podTemplate.ObjectMeta` and pushes an update when they drift, so changes to `additionalPodMetadata` propagate even after creation.
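
The two merge rules can be sketched in plain Go. This is a simplified illustration, not the controller's `mergePodMetadata`: the helper names `isRestrictedKey` and `mergeNoOverride` are hypothetical, the error strings are invented, and label-value validation is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// Domains named in rule 1 above.
var restrictedDomains = []string{"kubernetes.io", "k8s.io", "agents.x-k8s.io"}

// isRestrictedKey (hypothetical) reports whether the key's domain prefix is a
// restricted domain or one of its subdomains. Keys without a "/" have no
// prefix and are never restricted.
func isRestrictedKey(key string) bool {
	prefix, _, hasPrefix := strings.Cut(key, "/")
	if !hasPrefix {
		return false
	}
	for _, d := range restrictedDomains {
		if prefix == d || strings.HasSuffix(prefix, "."+d) {
			return true
		}
	}
	return false
}

// mergeNoOverride (hypothetical) copies base, then adds extra. It fails on a
// restricted key or on a key that exists in base with a different value.
func mergeNoOverride(base, extra map[string]string) (map[string]string, error) {
	out := make(map[string]string, len(base)+len(extra))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range extra {
		if isRestrictedKey(k) {
			return nil, fmt.Errorf("key %q uses a restricted domain", k)
		}
		if existing, ok := out[k]; ok && existing != v {
			return nil, fmt.Errorf("key %q would override template value %q", k, existing)
		}
		out[k] = v
	}
	return out, nil
}

func main() {
	template := map[string]string{"team": "ml"}
	merged, err := mergeNoOverride(template, map[string]string{"team": "ml", "tier": "gpu"})
	fmt.Println(merged, err)
	_, err = mergeNoOverride(template, map[string]string{"team": "ops"})
	fmt.Println(err)
}
```

The same function would be applied once to labels and once to annotations, since both follow the same conflict rule.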

Sources: [extensions/api/v1beta1/sandboxclaim_types.go:142-145](), [api/v1beta1/sandbox_types.go:68-82](), [extensions/controllers/sandboxclaim_controller.go:64-72](), [extensions/controllers/sandboxclaim_controller.go:324-374](), [extensions/controllers/sandboxclaim_controller.go:806-894]()

### Env injection

Each entry in `spec.env` is `{name, value, containerName?}`. Both `name` and `value` are required by the schema; `containerName` is optional.

```go
type EnvVar struct {
    Name          string `json:"name"`
    Value         string `json:"value"`
    ContainerName string `json:"containerName,omitempty"`
}
```

Resolution rules:

- The template's `envVarsInjectionPolicy` gates whether injection happens at all: `Allowed` permits new variables, `Overrides` additionally permits replacing existing values, and any other value (including the explicit `Disallowed`) causes the create to fail with "environment variable injection is not allowed by the template policy."
- Env entries are grouped by `containerName`. Entries without a `containerName` (the "default" bucket) are appended only to the **first** main container in the template (`Spec.Containers[0]`).
- Entries with a `containerName` target that exact container, scanning both `InitContainers` and `Containers`. If the named container is not present in the resolved pod template, the reconcile fails with `target container %q not found in template for environment variable %q`.
- For each var, if a same-name entry already exists on the container, the injection is treated as an override and is rejected unless the template's policy is `Overrides`; otherwise it is appended.

Combined with the warm-pool rule above, env injection is only allowed when the claim does not adopt — that is, when a fresh sandbox is created from the template by `createSandbox`.
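
The resolution rules can be sketched as follows. The `Container` type is a trimmed-down stand-in for a pod container, and `injectEnv`, `findTarget`, and `setEnv` are hypothetical names; the sketch mutates containers in place and stops at the first error, which may differ from how the controller stages its changes.

```go
package main

import (
	"errors"
	"fmt"
)

type EnvVar struct {
	Name, Value, ContainerName string
}

// Container is a simplified stand-in for a pod container.
type Container struct {
	Name string
	Env  []EnvVar
}

// injectEnv (hypothetical) applies claim env vars following the rules above.
// policy is the template's envVarsInjectionPolicy value.
func injectEnv(policy string, initContainers, containers []Container, envs []EnvVar) error {
	if policy != "Allowed" && policy != "Overrides" {
		return errors.New("environment variable injection is not allowed by the template policy")
	}
	for _, e := range envs {
		target, err := findTarget(initContainers, containers, e)
		if err != nil {
			return err
		}
		if err := setEnv(target, e, policy == "Overrides"); err != nil {
			return err
		}
	}
	return nil
}

// findTarget picks Containers[0] for entries without a containerName and
// otherwise searches init containers, then main containers, by name.
func findTarget(initContainers, containers []Container, e EnvVar) (*Container, error) {
	if e.ContainerName == "" {
		if len(containers) == 0 {
			return nil, errors.New("template has no containers")
		}
		return &containers[0], nil
	}
	for i := range initContainers {
		if initContainers[i].Name == e.ContainerName {
			return &initContainers[i], nil
		}
	}
	for i := range containers {
		if containers[i].Name == e.ContainerName {
			return &containers[i], nil
		}
	}
	return nil, fmt.Errorf("target container %q not found in template for environment variable %q",
		e.ContainerName, e.Name)
}

// setEnv appends a new variable, or replaces an existing one only when
// overrides are permitted.
func setEnv(c *Container, e EnvVar, allowOverride bool) error {
	for i := range c.Env {
		if c.Env[i].Name == e.Name {
			if !allowOverride {
				return fmt.Errorf("overriding %q is not permitted by the template policy", e.Name)
			}
			c.Env[i].Value = e.Value
			return nil
		}
	}
	c.Env = append(c.Env, EnvVar{Name: e.Name, Value: e.Value})
	return nil
}

func main() {
	pod := []Container{{Name: "notebook"}, {Name: "trainer"}}
	err := injectEnv("Allowed", nil, pod, []EnvVar{
		{Name: "NOTEBOOK_DIR", Value: "/workspace"},
		{Name: "GPU_FLAGS", Value: "--mig", ContainerName: "trainer"},
	})
	fmt.Println(pod, err)
}
```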

Sources: [extensions/api/v1beta1/sandboxclaim_types.go:108-122](), [extensions/api/v1beta1/sandboxtemplate_types.go:30-55](), [extensions/controllers/sandboxclaim_controller.go:896-921](), [extensions/controllers/sandboxclaim_controller.go:972-1038](), [extensions/controllers/sandboxclaim_controller.go:1157-1162]()

## Lifecycle and shutdown policies

`Lifecycle` separates *when* the claim expires from *what to do* when it does. Three policies are defined in `ShutdownPolicy`:

```go
ShutdownPolicyDelete           = "Delete"
ShutdownPolicyDeleteForeground = "DeleteForeground"
ShutdownPolicyRetain           = "Retain"  // default
```

| Policy | API behavior on expiration | Resource cleanup | When to use |
| --- | --- | --- | --- |
| `Delete` | The `SandboxClaim` object is deleted (default propagation), which cascades to the owned `Sandbox`. | Everything is garbage-collected. | You don't need to observe shutdown completion. |
| `DeleteForeground` | The claim is deleted with `metav1.DeletePropagationForeground`. The claim stays in the API with a `deletionTimestamp` until its `Sandbox`/Pod terminate. | Same as Delete, but observable. | External systems poll for the claim's disappearance as a "fully torn down" signal. |
| `Retain` | The claim object is preserved; only the owned `Sandbox` (and its Pod, Service, etc.) are deleted. Status reflects `ClaimExpired`. | Sandbox resources are reclaimed; the claim record stays. | Historical/audit retention, or driving downstream cleanup off the persisted claim. |

The reconciler computes expiration via `lifecycle.TimeLeft(now, shutdownTime, ttlSecondsAfterFinished, finishedCondition)`. If `claim.Spec.Lifecycle` is nil, the claim never expires. When expired:

- `Delete` / `DeleteForeground` paths short-circuit the reconcile after emitting a `ClaimExpired` event and issuing `r.Delete` with the appropriate propagation option. The reconcile returns immediately because subsequent status updates against a deleted object would fail.
- `Retain` falls through to `reconcileExpired`, which fetches the owned sandbox, verifies controller-ownership, and issues a delete on the sandbox while leaving the claim in place. If the sandbox is not controlled by this claim, the call fails with `ErrSandboxNotOwned` (suppressed in the requeue path to avoid a crash loop).

The `ttlSecondsAfterFinished` countdown is anchored on the `Finished` condition mirrored from the underlying `Sandbox`. The claim does **not** propagate `shutdownTime` down to the `Sandbox`; expiration is enforced entirely by the claim controller.

```text
                      lifecycle.TimeLeft(now, shutdownTime, ttlSecondsAfterFinished, finishedCond)
                                          │
                  ┌───────────────────────┼───────────────────────┐
       claim not expired             expired + Delete*       expired + Retain
                  │                       │                       │
                  ▼                       ▼                       ▼
          reconcileActive          r.Delete(claim, ...)     reconcileExpired
       (adopt or create)          [Foreground? add opt]    (delete owned Sandbox,
                                                            keep the claim)
```

Sources: [extensions/api/v1beta1/sandboxclaim_types.go:57-99](), [extensions/controllers/sandboxclaim_controller.go:165-282](), [extensions/controllers/sandboxclaim_controller.go:309-317](), [extensions/controllers/sandboxclaim_controller.go:388-420]()

## Resolution flow

A single `Reconcile` pass walks the claim through expiration check, fast-path adoption, optional cold creation, and status update.

```mermaid
flowchart TD
    subgraph API["extensions.agents.x-k8s.io / v1beta1"]
      Claim["SandboxClaim<br/>spec.sandboxTemplateRef<br/>spec.warmpool<br/>spec.env<br/>spec.additionalPodMetadata<br/>spec.lifecycle"]
    end

    subgraph Reconciler["SandboxClaimReconciler"]
      Reconcile["Reconcile()"]
      Expire["checkExpiration()"]
      Active["reconcileActive()"]
      Expired["reconcileExpired()"]
      GetOrCreate["getOrCreateSandbox()"]
      Adopt["adoptSandboxFromCandidates()"]
      Create["createSandbox()"]
      Merge["mergePodMetadata()"]
      InjectEnv["injectEnvs()"]
      Status["computeAndSetStatus() / updateStatus()"]
    end

    subgraph WarmPool["Warm pool side"]
      Queue["WarmSandboxQueue<br/>(templateHash → SandboxKey)"]
      Template["SandboxTemplate"]
    end

    subgraph Core["agents.x-k8s.io / Sandbox"]
      Sandbox["Sandbox (owned)"]
    end

    Claim --> Reconcile
    Reconcile --> Expire
    Expire -->|not expired| Active
    Expire -->|expired + Delete/DeleteForeground| Reconcile -.->|r.Delete claim| Claim
    Expire -->|expired + Retain| Expired --> Sandbox
    Active --> GetOrCreate
    GetOrCreate -->|status/label hit or name lookup| Sandbox
    GetOrCreate -->|warmpool != none| Adopt --> Queue
    Adopt -->|adopt success| Sandbox
    GetOrCreate -->|no candidate or warmpool=none| Create
    Create --> Template
    Create --> Merge --> Sandbox
    Create --> InjectEnv --> Sandbox
    Active --> Status
    Expired --> Status
    Status --> Claim
```

Key invariants in `reconcileActive` / `getOrCreateSandbox`:

1. The status pointer (`claim.Status.SandboxStatus.Name`) is the primary discovery hint; the `agents.x-k8s.io/sandbox-name` label on the claim is the secondary one. Both are validated with `metav1.IsControlledBy` before being trusted.
2. Name-based lookup uses `claim.Name` as the sandbox name when creating from scratch (`createSandbox` writes `Name: claim.Name`), so re-running a reconcile is idempotent.
3. Warm-pool adoption is a two-phase update: first the claim is patched with the `agents.x-k8s.io/sandbox-name` label under optimistic locking, then `completeAdoption` strips warm-pool labels (`warmPoolSandboxLabel`, `sandboxTemplateRefHash`, `agents.x-k8s.io/sandbox-pod-template-hash`), re-parents the sandbox via `SetControllerReference(claim, ...)`, and forces `Spec.PodTemplate.ObjectMeta` to the merged metadata.
4. Cross-namespace adoption is rejected (`ErrCrossNamespaceAdoption`); ineligible candidates are pushed back onto the queue rather than dropped.
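
The candidate-filtering part of adoption can be sketched with an in-memory queue. The `Candidate` type and `pickCandidate` function are hypothetical; the pool hash is treated as an opaque, precomputed string rather than reimplementing `NameHash`, and an empty wanted hash stands for the `default` policy (any pool).

```go
package main

import "fmt"

// Candidate is a simplified stand-in for a warm sandbox waiting in the queue.
type Candidate struct {
	Name, Namespace, PoolHash string
}

// pickCandidate (hypothetical) scans the queue once and returns the first
// eligible candidate plus the candidates to push back. A candidate is
// eligible when it lives in the claim's namespace and, if a specific pool is
// wanted, carries that pool's hash. Nothing is dropped.
func pickCandidate(queue []Candidate, claimNamespace, wantPoolHash string) (Candidate, bool, []Candidate) {
	var chosen Candidate
	found := false
	requeue := make([]Candidate, 0, len(queue))
	for _, c := range queue {
		eligible := c.Namespace == claimNamespace &&
			(wantPoolHash == "" || c.PoolHash == wantPoolHash)
		if eligible && !found {
			chosen, found = c, true
			continue
		}
		requeue = append(requeue, c)
	}
	return chosen, found, requeue
}

func main() {
	queue := []Candidate{
		{Name: "warm-1", Namespace: "other", PoolHash: "h1"},
		{Name: "warm-2", Namespace: "ml-team", PoolHash: "h2"},
		{Name: "warm-3", Namespace: "ml-team", PoolHash: "h1"},
	}
	chosen, ok, rest := pickCandidate(queue, "ml-team", "h1")
	fmt.Println(chosen.Name, ok, len(rest))
}
```

Scanning a bounded snapshot of the queue, rather than popping and re-pushing in a loop, is one simple way to avoid spinning on candidates that will never match.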

Sources: [extensions/controllers/sandboxclaim_controller.go:140-282](), [extensions/controllers/sandboxclaim_controller.go:319-386](), [extensions/controllers/sandboxclaim_controller.go:591-794](), [extensions/controllers/sandboxclaim_controller.go:923-1068](), [extensions/controllers/sandboxclaim_controller.go:1070-1180]()

## Status

`SandboxClaimStatus` exposes two fields:

```go
type SandboxClaimStatus struct {
    Conditions    []metav1.Condition `json:"conditions,omitempty"`
    SandboxStatus SandboxStatus      `json:"sandbox,omitempty"`
}

type SandboxStatus struct {
    Name   string   `json:"name,omitempty"`   // resolved Sandbox name
    PodIPs []string `json:"podIPs,omitempty"` // mirrored from the Sandbox
}
```

The reconciler maintains two condition types on the claim:

| Condition | Source | Representative `Reason` values |
| --- | --- | --- |
| `Ready` | Computed by `computeReadyCondition`. Mirrors `Ready` from the owned `Sandbox` while the claim is healthy. | `TemplateNotFound`, `InvalidMetadata`, `SandboxMissing`, `SandboxNotReady`, `Expired`, `ClaimExpired`, `ReconcilerError`. |
| `Finished` | Mirrored from `Sandbox.status.conditions[Finished]` via `syncFinishedCondition`. Removed when the sandbox is gone and the claim is not yet expired. | Carried through from the core controller. |

When the underlying `Sandbox` reports `SandboxReasonExpired`, the claim's `Ready` condition surfaces the same reason so callers can distinguish "sandbox expired on its own" from "claim expired and reaped the sandbox" (the latter uses `Reason=ClaimExpired`). The controller emits Kubernetes events with reasons `ClaimExpired`, `SandboxAdopted`, and `SandboxProvisioned` corresponding to the major lifecycle transitions.
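
A client reading the claim's status might branch on the `Ready` condition roughly as follows. The `Condition` struct is a minimal stand-in for `metav1.Condition`, `describeClaim` is a hypothetical helper, and the sketch assumes that `SandboxReasonExpired` has the string value `Expired`, as the reason list above suggests.

```go
package main

import "fmt"

// Condition is a minimal stand-in for metav1.Condition.
type Condition struct {
	Type, Status, Reason string
}

// findCondition returns the first condition of the given type, if any.
func findCondition(conds []Condition, condType string) (Condition, bool) {
	for _, c := range conds {
		if c.Type == condType {
			return c, true
		}
	}
	return Condition{}, false
}

// describeClaim (hypothetical) summarizes the claim from its Ready condition,
// separating "claim expired" from "sandbox expired on its own".
func describeClaim(conds []Condition) string {
	ready, ok := findCondition(conds, "Ready")
	switch {
	case !ok:
		return "pending: no Ready condition yet"
	case ready.Status == "True":
		return "ready"
	case ready.Reason == "ClaimExpired":
		return "claim expired and reaped its sandbox"
	case ready.Reason == "Expired": // assumed value of SandboxReasonExpired
		return "sandbox expired on its own"
	default:
		return "not ready: " + ready.Reason
	}
}

func main() {
	conds := []Condition{
		{Type: "Finished", Status: "True", Reason: "Completed"},
		{Type: "Ready", Status: "False", Reason: "ClaimExpired"},
	}
	fmt.Println(describeClaim(conds))
}
```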

Sources: [extensions/api/v1beta1/sandboxclaim_types.go:153-173](), [extensions/controllers/sandboxclaim_controller.go:459-576](), [extensions/controllers/sandboxclaim_controller.go:185-188](), [extensions/controllers/sandboxclaim_controller.go:700-712]()

## Example manifests

Minimal claim with only a template reference. `warmpool` defaults to `default`, and because no `lifecycle` is set the claim never expires, so the implicit `Retain` policy never comes into play:

```yaml
# extensions/examples/sandboxclaim.yaml
apiVersion: extensions.agents.x-k8s.io/v1beta1   # use v1beta1 (CRD-served version)
kind: SandboxClaim
metadata:
  name: my-secure-sandbox
  namespace: default
spec:
  sandboxTemplateRef:
    name: secure-datascience-template
  # warmpool: "default"        # implicit default
```

Claim with an explicit expiration and clean-up policy:

```yaml
# extensions/examples/sandbox-claim.yaml
apiVersion: extensions.agents.x-k8s.io/v1beta1
kind: SandboxClaim
metadata:
  name: my-secure-sandbox
  namespace: default
spec:
  sandboxTemplateRef:
    name: secure-datascience-template
  lifecycle:
    shutdownPolicy: Delete                  # Delete | DeleteForeground | Retain
    shutdownTime: "2025-12-31T23:59:59Z"
```

A richer claim that exercises every field:

```yaml
apiVersion: extensions.agents.x-k8s.io/v1beta1
kind: SandboxClaim
metadata:
  name: data-prep
  namespace: ml-team
spec:
  sandboxTemplateRef:
    name: jupyter-template
  warmpool: none                            # cold start; a pool name here would be rejected together with spec.env
  additionalPodMetadata:
    labels:
      team: ml
    annotations:
      cost-center: "1234"
  env:                                       # requires template policy Allowed/Overrides
    - name: NOTEBOOK_DIR
      value: /workspace
    - name: GPU_FLAGS
      value: "--mig"
      containerName: trainer                 # target a specific container
  lifecycle:
    shutdownTime: "2026-01-01T00:00:00Z"
    ttlSecondsAfterFinished: 600
    shutdownPolicy: DeleteForeground
```

Sources: [extensions/examples/sandboxclaim.yaml:1-20](), [extensions/examples/sandbox-claim.yaml:1-20]()

## Summary

The `SandboxClaim` CRD is a thin, intent-bearing wrapper around a `SandboxTemplate`: it declares which template to use, whether warm-pool adoption is allowed, what extra metadata and env vars to apply, and when and how the resulting sandbox should be torn down. The controller enforces a strict separation between adopt (no env injection, no template overrides) and create-from-template (full merge with no key conflicts on `additionalPodMetadata`). It also pins the expiration semantics to the claim, with three distinct shutdown behaviors: `Delete`, `DeleteForeground`, and `Retain`. Callers can therefore pick between fire-and-forget cleanup, observable teardown, and audit-style retention without changing the underlying `Sandbox` API.
