# Conditions, Reasons & Status Surfaces

> Catalogue of condition types (Ready, Suspended, Finished), reason strings, and the annotation/label keys (pod-name, template-ref, propagated-labels) that controllers use to coordinate state.

- Repository: kubernetes-sigs/agent-sandbox
- GitHub: https://github.com/kubernetes-sigs/agent-sandbox
- Human wiki: https://grok-wiki.com/public/wiki/kubernetes-sigs-agent-sandbox-c3f2597a654a
- Complete Markdown: https://grok-wiki.com/public/wiki/kubernetes-sigs-agent-sandbox-c3f2597a654a/llms-full.txt

## Source Files

- `api/v1beta1/sandbox_types.go`
- `extensions/api/v1beta1/sandboxclaim_types.go`
- `extensions/api/v1beta1/sandboxwarmpool_types.go`
- `docs/api.md`

---

<details>
<summary>Relevant source files</summary>
The following files were used as context for generating this wiki page:
- [api/v1beta1/sandbox_types.go](api/v1beta1/sandbox_types.go)
- [extensions/api/v1beta1/sandboxclaim_types.go](extensions/api/v1beta1/sandboxclaim_types.go)
- [extensions/api/v1beta1/sandboxwarmpool_types.go](extensions/api/v1beta1/sandboxwarmpool_types.go)
- [extensions/api/v1beta1/sandboxtemplate_types.go](extensions/api/v1beta1/sandboxtemplate_types.go)
- [controllers/sandbox_controller.go](controllers/sandbox_controller.go)
- [extensions/controllers/sandboxclaim_controller.go](extensions/controllers/sandboxclaim_controller.go)
- [extensions/controllers/sandboxwarmpool_controller.go](extensions/controllers/sandboxwarmpool_controller.go)
- [internal/lifecycle/expiry.go](internal/lifecycle/expiry.go)
</details>

This page catalogues the **status conditions** (`Ready`, `Suspended`, `Finished`), their machine-readable **reason strings**, and the **annotation/label keys** the `agent-sandbox` controllers use to coordinate state between `Sandbox`, `SandboxClaim`, and `SandboxWarmPool`. These surfaces are the wire format by which the core `Sandbox` controller publishes pod state, the `SandboxClaim` controller mirrors and translates that state for callers, and the `SandboxWarmPool` controller tracks which pods are still considered fresh for adoption.

Every condition follows the standard `metav1.Condition` shape (`Type`, `Status`, `Reason`, `Message`, `ObservedGeneration`, `LastTransitionTime`), so consumers can rely on `meta.FindStatusCondition` semantics. The annotation/label vocabulary is intentionally small and namespaced under `agents.x-k8s.io/` so callers can write selectors and tooling against a stable set of keys.

## Condition Types

The three condition types reported on `Sandbox.status.conditions` are declared as `ConditionType` constants in `api/v1beta1/sandbox_types.go`. The `SandboxClaim` re-uses the same condition type strings for `Ready` and `Finished` so a caller can read either resource with the same parsing logic.

| Type | Declared at | Reported on | Meaning |
|------|-------------|-------------|---------|
| `Ready` | `SandboxConditionReady` | `Sandbox.status`, `SandboxClaim.status` | The sandbox's pod (and optional Service) are observed Ready; when `False`, the reason captures why not. |
| `Suspended` | `SandboxConditionSuspended` | `Sandbox.status` only | Emitted when `spec.replicas == 0`; tracks whether the underlying pod has actually been torn down yet. |
| `Finished` | `SandboxConditionFinished` | `Sandbox.status`, mirrored onto `SandboxClaim.status` | The backing pod has reached a terminal phase (`PodSucceeded`/`PodFailed`). Only present while the pod still exists. |

Sources: [api/v1beta1/sandbox_types.go:27-55](api/v1beta1/sandbox_types.go), [controllers/sandbox_controller.go:273-417](controllers/sandbox_controller.go)

### Ready

`computeReadyCondition` is the single funnel for `Sandbox`'s `Ready` value. It starts pessimistic (`False`, `Reason=DependenciesNotReady`) and only transitions to `True` once the pod has `PodReady=True`, has at least one `PodIP`, and the optional headless `Service` is present when required.

```go
// controllers/sandbox_controller.go (excerpt)
readyCondition := metav1.Condition{
    Type:               string(sandboxv1beta1.SandboxConditionReady),
    ObservedGeneration: sandbox.Generation,
    Status:             metav1.ConditionFalse,
    Reason:             sandboxv1beta1.SandboxReasonDependenciesNotReady,
}
// ... if pod Ready AND service satisfied ...
readyCondition.Status = metav1.ConditionTrue
readyCondition.Reason = sandboxv1beta1.SandboxReasonDependenciesReady
```

Sources: [controllers/sandbox_controller.go:313-392](controllers/sandbox_controller.go)
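
The pessimistic-start rule described above can be sketched with the standard library alone. This is a minimal illustration, not the controller's code: the `observed` and `readyState` types and the `computeReady` function are hypothetical stand-ins, and only the status and reason strings come from this page.

```go
package main

import "fmt"

// observed is a hypothetical, flattened view of what the reconciler sees;
// the real controller reads these facts from the Pod and Service objects.
type observed struct {
	PodReady        bool     // the Pod's Ready condition is True
	PodIPs          []string // IPs reported in the Pod status
	ServiceRequired bool     // the Sandbox asks for a headless Service
	ServicePresent  bool     // that Service was found
}

// readyState carries only the two fields this sketch cares about.
type readyState struct {
	Status string
	Reason string
}

// computeReady starts pessimistic and flips to True only when every
// dependency listed in the paragraph above is satisfied.
func computeReady(o observed) readyState {
	state := readyState{Status: "False", Reason: "DependenciesNotReady"}
	serviceOK := !o.ServiceRequired || o.ServicePresent
	if o.PodReady && len(o.PodIPs) > 0 && serviceOK {
		state = readyState{Status: "True", Reason: "DependenciesReady"}
	}
	return state
}

func main() {
	pending := observed{PodReady: true, ServiceRequired: true}
	running := observed{PodReady: true, PodIPs: []string{"10.0.0.7"}}
	fmt.Println(computeReady(pending), computeReady(running))
}
```

Starting from `False` means any code path that returns early leaves the condition in the safe state.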

### Suspended

`computeSuspendedCondition` **only emits a `Suspended` condition when `spec.replicas == 0`**. The `Status` depends on whether the pod has actually been deleted: `True/PodTerminated` once the pod is gone, `False/PodNotTerminated` while it is still draining. When `replicas != 0` the controller does not emit a `Suspended` condition at all, so any existing entry keeps whatever value `meta.SetStatusCondition` last wrote.

Sources: [controllers/sandbox_controller.go:289-311](controllers/sandbox_controller.go), [api/v1beta1/sandbox_types.go:28-33](api/v1beta1/sandbox_types.go)
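
A small sketch of that three-way outcome, assuming a simplified signature: the `condition` type and the `computeSuspended` function below are illustrative stand-ins, and a `nil` result models "no condition emitted".

```go
package main

import "fmt"

// condition is a local stand-in for metav1.Condition (fields trimmed).
type condition struct {
	Type, Status, Reason string
}

// computeSuspended returns nil when the sandbox is not scaled to zero,
// otherwise a Suspended condition whose status reflects pod teardown.
func computeSuspended(replicas int32, podExists bool) *condition {
	if replicas != 0 {
		return nil // nothing emitted; a previously written entry is left alone
	}
	if podExists {
		return &condition{Type: "Suspended", Status: "False", Reason: "PodNotTerminated"}
	}
	return &condition{Type: "Suspended", Status: "True", Reason: "PodTerminated"}
}

func main() {
	fmt.Println(computeSuspended(1, true) == nil)
	fmt.Println(*computeSuspended(0, true))
	fmt.Println(*computeSuspended(0, false))
}
```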

### Finished

`computeFinishedCondition` returns `nil` unless the pod exists **and** its `Status.Phase` is `PodSucceeded` or `PodFailed`. The reconcile loop strips the `Finished` condition whenever the pod is missing or non-terminal, so its presence is the authoritative "this run is over" signal:

```go
// controllers/sandbox_controller.go (excerpt)
if !hasFinished {
    meta.RemoveStatusCondition(&sandbox.Status.Conditions,
        string(sandboxv1beta1.SandboxConditionFinished))
}
```

`SandboxClaim` reflects the same condition into its own status array via `syncFinishedCondition`, and the `Lifecycle.ttlSecondsAfterFinished` countdown is anchored on `FinishedCondition.LastTransitionTime`. The `lifecycle.FinishedCondition` helper requires `Status == True`; transient or False entries are ignored for TTL purposes.

Sources: [controllers/sandbox_controller.go:256-268](controllers/sandbox_controller.go), [controllers/sandbox_controller.go:394-417](controllers/sandbox_controller.go), [extensions/controllers/sandboxclaim_controller.go:562-576](extensions/controllers/sandboxclaim_controller.go), [internal/lifecycle/expiry.go:24-82](internal/lifecycle/expiry.go)
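
To make the TTL anchoring concrete, here is a hedged sketch of both halves: deriving `Finished` from the pod phase, and computing a deadline from `LastTransitionTime`. The `condition` type, `computeFinished`, and `ttlDeadline` are hypothetical names with simplified signatures; the real helpers live in the files cited above and may differ in detail.

```go
package main

import (
	"fmt"
	"time"
)

// condition is a local stand-in for metav1.Condition (fields trimmed).
type condition struct {
	Type               string
	Status             string
	Reason             string
	LastTransitionTime time.Time
}

// computeFinished returns nil unless the pod exists and its phase is terminal.
func computeFinished(podExists bool, phase string, now time.Time) *condition {
	if !podExists {
		return nil
	}
	switch phase {
	case "Succeeded":
		return &condition{Type: "Finished", Status: "True", Reason: "PodSucceeded", LastTransitionTime: now}
	case "Failed":
		return &condition{Type: "Finished", Status: "True", Reason: "PodFailed", LastTransitionTime: now}
	}
	return nil
}

// ttlDeadline anchors the countdown on the Finished condition's transition
// time. It reports false when no Finished=True condition is present.
func ttlDeadline(conds []condition, ttlSeconds int32) (time.Time, bool) {
	for _, c := range conds {
		if c.Type == "Finished" && c.Status == "True" {
			return c.LastTransitionTime.Add(time.Duration(ttlSeconds) * time.Second), true
		}
	}
	return time.Time{}, false
}

func main() {
	finishedAt := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
	var conds []condition
	if c := computeFinished(true, "Succeeded", finishedAt); c != nil {
		conds = append(conds, *c)
	}
	deadline, ok := ttlDeadline(conds, 300)
	fmt.Println(deadline, ok)
}
```

Anchoring on the transition time rather than "now" keeps the deadline stable across repeated reconciles.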

## Reason Strings

Reasons are machine-readable codes intended for selectors, alerts, and metric labels. They are declared as untyped string constants alongside the condition types.

### Sandbox-emitted reasons

| Condition | Reason | Status | Source |
|-----------|--------|--------|--------|
| `Ready` | `DependenciesReady` | True | `SandboxReasonDependenciesReady` |
| `Ready` | `DependenciesNotReady` | False | `SandboxReasonDependenciesNotReady` |
| `Ready` | `SandboxSuspended` | False | `SandboxReasonSuspended` (set when `replicas==0`) |
| `Ready` | `SandboxExpired` | False | `SandboxReasonExpired` (set when shutdownTime passes) |
| `Ready` | `ReconcilerError` | False | Free-form, set when reconcile returns a non-nil error |
| `Suspended` | `PodTerminated` | True | `SandboxReasonSuspendedPodTerminated` |
| `Suspended` | `PodNotTerminated` | False | `SandboxReasonSuspendedPodNotTerminated` |
| `Finished` | `PodSucceeded` | True | `SandboxReasonPodSucceeded` |
| `Finished` | `PodFailed` | True | `SandboxReasonPodFailed` |

Sources: [api/v1beta1/sandbox_types.go:27-55](api/v1beta1/sandbox_types.go), [controllers/sandbox_controller.go:289-417](controllers/sandbox_controller.go), [controllers/sandbox_controller.go:1080-1127](controllers/sandbox_controller.go)

### SandboxClaim-emitted reasons

The claim controller adds its own reasons for `Ready` while still mirroring the underlying sandbox state when neither expiry nor an error applies.

| Reason | When emitted |
|--------|--------------|
| `ClaimExpired` (`extensionsv1beta1.ClaimExpiredReason`) | `Lifecycle.shutdownTime` or `ttlSecondsAfterFinished` has elapsed; also used as the event reason passed to `Eventf` when the controller records the deletion. |
| `TemplateNotFound` | `getTemplate` returned `ErrTemplateNotFound`. |
| `InvalidMetadata` | `validateAdditionalPodMetadata` rejected the claim's `additionalPodMetadata`. |
| `SandboxMissing` | Reconcile succeeded with no error but the owned `Sandbox` does not exist (and the claim is not expired). |
| `SandboxNotReady` | Underlying `Sandbox` exists but no `Ready` condition was found to forward. |
| `SandboxExpired` | Forwarded from the underlying Sandbox via `hasSandboxExpiredCondition`. |
| `ReconcilerError` | Generic fallback for unrecognized reconcile errors. |

`hasClaimExpiredCondition` and `hasSandboxExpiredCondition` both query `Ready` by reason. The reason is the load-bearing signal that distinguishes "claim TTL fired" from "sandbox TTL fired", even though both surface as `Ready=False`.

Sources: [extensions/api/v1beta1/sandboxclaim_types.go:25-31](extensions/api/v1beta1/sandboxclaim_types.go), [extensions/controllers/sandboxclaim_controller.go:459-546](extensions/controllers/sandboxclaim_controller.go), [extensions/controllers/sandboxclaim_controller.go:1436-1444](extensions/controllers/sandboxclaim_controller.go)
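
A reason-based lookup of this kind might look like the sketch below. The `condition` type and `hasReadyReason` are hypothetical, and whether the real helpers also check `Status` is not stated on this page, so this version matches on reason only.

```go
package main

import "fmt"

// condition is a local stand-in for metav1.Condition (fields trimmed).
type condition struct {
	Type, Status, Reason string
}

const (
	reasonClaimExpired   = "ClaimExpired"
	reasonSandboxExpired = "SandboxExpired"
)

// hasReadyReason reports whether the Ready condition carries the given reason.
// Only the first Ready entry is inspected; condition types are unique per list.
func hasReadyReason(conds []condition, reason string) bool {
	for _, c := range conds {
		if c.Type == "Ready" {
			return c.Reason == reason
		}
	}
	return false
}

func main() {
	claim := []condition{{Type: "Ready", Status: "False", Reason: reasonClaimExpired}}
	fmt.Println(hasReadyReason(claim, reasonClaimExpired))
	fmt.Println(hasReadyReason(claim, reasonSandboxExpired))
}
```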

## State Coordination Across Controllers

The conditions and reasons above are wired together so that a single read of the claim's status is sufficient to decide whether to keep waiting, drop the connection, or retry. The diagram captures who writes what, and how the SandboxClaim reconciler folds the underlying Sandbox status back into its own:

```mermaid
flowchart LR
  subgraph CoreCtrl["controllers/sandbox_controller.go"]
    direction TB
    SR[SandboxReconciler]
    SR --> SCond[Sandbox.status.conditions<br/>Ready / Suspended / Finished]
  end

  subgraph ExtCtrl["extensions/controllers/sandboxclaim_controller.go"]
    direction TB
    SCR[SandboxClaimReconciler]
    SCR --> CCond[SandboxClaim.status.conditions<br/>Ready + mirrored Finished]
  end

  subgraph PoolCtrl["extensions/controllers/sandboxwarmpool_controller.go"]
    direction TB
    WPR[SandboxWarmPoolReconciler]
    WPR -.writes labels.-> SandboxObj
  end

  Pod[(corev1.Pod)] -- phase + Ready --> SR
  SandboxObj[(Sandbox CR)] -- mirrored Finished --> SCR
  SCond --> SandboxObj
  SandboxObj -- read in computeReadyCondition --> SCR
  CCond --> ClaimObj[(SandboxClaim CR)]
  SCR -- Eventf<br/>ClaimExpired --> Events[(events.k8s.io)]
```

The mirroring path lives in the claim controller's `computeReadyCondition` and `syncFinishedCondition`. If the underlying `Sandbox` has a `Ready` condition, the claim forwards it, preserving `Reason`, `Message`, and `LastTransitionTime`. If the sandbox carries a terminal `Finished` condition, it is copied onto the claim and removed again when it no longer applies. Expiry is a special case: the claim emits `Ready=False/ClaimExpired` regardless of the sandbox's own condition, so callers can distinguish a deliberate caller-initiated shutdown from an internal failure.

Sources: [extensions/controllers/sandboxclaim_controller.go:459-576](extensions/controllers/sandboxclaim_controller.go), [controllers/sandbox_controller.go:240-271](controllers/sandbox_controller.go)
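
The forwarding rules can be sketched as two small functions. Everything here is a simplified assumption: the `condition` and `sandbox` types, the `claimReady` and `syncFinished` names, and the precedence order (expiry, then missing sandbox, then forwarding) are illustrative, and the error-driven reasons (`TemplateNotFound`, `InvalidMetadata`, `ReconcilerError`) are left out.

```go
package main

import (
	"fmt"
	"time"
)

// condition is a local stand-in for metav1.Condition (fields trimmed).
type condition struct {
	Type, Status, Reason, Message string
	LastTransitionTime            time.Time
}

// sandbox is a hypothetical stand-in holding only the status conditions.
type sandbox struct{ Conditions []condition }

func find(conds []condition, condType string) *condition {
	for i := range conds {
		if conds[i].Type == condType {
			return &conds[i]
		}
	}
	return nil
}

// claimReady derives the claim's Ready condition (assumed precedence).
func claimReady(expired bool, sb *sandbox) condition {
	switch {
	case expired:
		return condition{Type: "Ready", Status: "False", Reason: "ClaimExpired"}
	case sb == nil:
		return condition{Type: "Ready", Status: "False", Reason: "SandboxMissing"}
	}
	if ready := find(sb.Conditions, "Ready"); ready != nil {
		return *ready // forwarded as-is: Reason, Message, LastTransitionTime kept
	}
	return condition{Type: "Ready", Status: "False", Reason: "SandboxNotReady"}
}

// syncFinished drops any Finished entry on the claim, then copies the
// sandbox's Finished=True condition when one exists.
func syncFinished(claimConds []condition, sb *sandbox) []condition {
	var out []condition
	for _, c := range claimConds {
		if c.Type != "Finished" {
			out = append(out, c)
		}
	}
	if sb != nil {
		if fin := find(sb.Conditions, "Finished"); fin != nil && fin.Status == "True" {
			out = append(out, *fin)
		}
	}
	return out
}

func main() {
	sb := &sandbox{Conditions: []condition{
		{Type: "Ready", Status: "True", Reason: "DependenciesReady", LastTransitionTime: time.Now()},
	}}
	fmt.Println(claimReady(false, sb).Reason)
	fmt.Println(claimReady(true, sb).Reason)
	fmt.Println(claimReady(false, nil).Reason)
}
```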

## Lifecycle State Machine

`Sandbox.status.conditions` evolves through a small state machine driven by pod phase and `spec.replicas`. `Finished` is orthogonal: it can appear alongside `Ready=False` for as long as the pod still exists in a terminal phase.

```mermaid
stateDiagram-v2
  [*] --> NotReady: pod missing / pending
  NotReady --> Ready: pod Ready=True\nReason=DependenciesReady
  Ready --> NotReady: pod loses readiness\nReason=DependenciesNotReady
  Ready --> Suspending: replicas=0\nSuspended=False/PodNotTerminated
  Suspending --> Suspended: pod deleted\nSuspended=True/PodTerminated
  Suspended --> NotReady: replicas back to 1
  Ready --> Finished: pod Succeeded/Failed\nFinished=True
  NotReady --> Finished: pod Succeeded/Failed
  Ready --> Expired: shutdownTime elapsed\nReady=False/SandboxExpired
  Finished --> Expired: ttlSecondsAfterFinished elapsed\n(claim-side)
  Expired --> [*]: ShutdownPolicy=Delete\nresource removed
```

Sources: [controllers/sandbox_controller.go:273-417](controllers/sandbox_controller.go), [controllers/sandbox_controller.go:1065-1127](controllers/sandbox_controller.go), [internal/lifecycle/expiry.go:47-82](internal/lifecycle/expiry.go)
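
For tooling that wants a single label per sandbox, the diagram can be collapsed into a function over the condition list. This is one possible reading, not project code: the `condition` type, `find`, and `phase` are hypothetical, and the precedence (Expired, Finished, Suspended, Ready) is an assumption. The suspended branch keys off the `Ready` reason because, as noted under Suspended, an old `Suspended` entry may linger after `replicas` returns to a non-zero value.

```go
package main

import "fmt"

// condition is a local stand-in for metav1.Condition (fields trimmed).
type condition struct {
	Type, Status, Reason string
}

func find(conds []condition, condType string) *condition {
	for i := range conds {
		if conds[i].Type == condType {
			return &conds[i]
		}
	}
	return nil
}

// phase maps a Sandbox's conditions onto one of the diagram's state names.
func phase(conds []condition) string {
	ready := find(conds, "Ready")
	if ready != nil && ready.Status == "False" && ready.Reason == "SandboxExpired" {
		return "Expired"
	}
	if fin := find(conds, "Finished"); fin != nil && fin.Status == "True" {
		return "Finished"
	}
	// Trust the Ready reason, not the mere presence of a Suspended entry.
	if ready != nil && ready.Reason == "SandboxSuspended" {
		if sus := find(conds, "Suspended"); sus != nil && sus.Status == "True" {
			return "Suspended"
		}
		return "Suspending"
	}
	if ready != nil && ready.Status == "True" {
		return "Ready"
	}
	return "NotReady"
}

func main() {
	fmt.Println(phase(nil))
	fmt.Println(phase([]condition{
		{Type: "Ready", Status: "False", Reason: "SandboxSuspended"},
		{Type: "Suspended", Status: "False", Reason: "PodNotTerminated"},
	}))
}
```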

## Annotation and Label Keys

All cross-controller coordination keys are namespaced under `agents.x-k8s.io/`. Every key is declared once as a constant so test fixtures, the warm-pool controller, and the claim controller all share the same vocabulary.

### Annotations

| Key | Declared at | Written by | Read by | Purpose |
|-----|-------------|------------|---------|---------|
| `agents.x-k8s.io/pod-name` | `SandboxPodNameAnnotation` | `SandboxClaim` controller (`completeAdoption`); pod-create path | core Sandbox controller (`resolvePodName`), `getLaunchType` | Records the pod name an adopted warm-pool sandbox is bound to. Differs from `sandbox.Name` only during/after adoption. Also used as the "this was a warm start" signal for metrics. |
| `agents.x-k8s.io/sandbox-template-ref` | `SandboxTemplateRefAnnotation` | warm-pool controller; claim's `createSandbox` | metrics collector | Stores the originating `SandboxTemplate.name` for cardinality-bounded metric labels and audit. |
| `agents.x-k8s.io/propagated-labels` | `SandboxPropagatedLabelsAnnotation` | core Sandbox controller (`reconcilePod`, `updatePodMetadata`) | core Sandbox controller (next reconcile) | Sorted, comma-joined list of label keys the controller copied from `Sandbox.spec.podTemplate.metadata.labels` onto the Pod. Lets the controller detect which labels it owns so it can delete ones that are removed from spec. |
| `agents.x-k8s.io/propagated-annotations` | `SandboxPropagatedAnnotationsAnnotation` | core Sandbox controller | core Sandbox controller | Same idea as above, but for annotations. |
| `opentelemetry.io/trace-context` | `internal/metrics.TraceContextAnnotation` | webhook / claim / sandbox controllers | tracing helpers | Propagates a W3C trace context across CRD boundaries (not in the `agents.x-k8s.io/` namespace, but participates in the same propagation chain). |

Sources: [api/v1beta1/sandbox_types.go:56-66](api/v1beta1/sandbox_types.go), [controllers/sandbox_controller.go:82-90](controllers/sandbox_controller.go), [controllers/sandbox_controller.go:785-943](controllers/sandbox_controller.go), [extensions/controllers/sandboxclaim_controller.go:748-755](extensions/controllers/sandboxclaim_controller.go), [extensions/controllers/sandboxwarmpool_controller.go:335-338](extensions/controllers/sandboxwarmpool_controller.go)
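
As an illustration of how the `pod-name` annotation is consumed, the sketch below resolves the pod name with a fallback to the sandbox's own name. The function body is an assumption about what a helper like `resolvePodName` does; only the annotation key is taken from the table above.

```go
package main

import "fmt"

// podNameAnnotation mirrors the key listed in the table above.
const podNameAnnotation = "agents.x-k8s.io/pod-name"

// resolvePodName prefers the annotated pod name (set for adopted warm-pool
// sandboxes) and otherwise falls back to the sandbox's own name.
func resolvePodName(sandboxName string, annotations map[string]string) string {
	if name, ok := annotations[podNameAnnotation]; ok && name != "" {
		return name
	}
	return sandboxName
}

func main() {
	cold := resolvePodName("sb-1", nil)
	warm := resolvePodName("sb-1", map[string]string{podNameAnnotation: "warm-abc"})
	fmt.Println(cold, warm)
}
```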

### Labels

| Key | Declared at | Owner | Purpose |
|-----|-------------|-------|---------|
| `agents.x-k8s.io/sandbox-name-hash` | `sandboxLabel` (package-private) | core Sandbox controller | Selector that ties a Pod and headless Service to its owning `Sandbox`. The hash form keeps the value inside Kubernetes' 63-character label limit. |
| `agents.x-k8s.io/sandbox-pod-template-hash` | `SandboxPodTemplateHashLabel` | warm-pool controller | Tags warm-pool sandboxes with the template-content hash; the pool reconciler uses it (via `isSandboxStale`) to decide which pods are still fresh. Stripped on adoption so adopted sandboxes don't accidentally re-enter the pool. |
| `agents.x-k8s.io/sandbox-name` | `AssignedSandboxNameLabel` | `SandboxClaim` controller | Written on the `SandboxClaim` itself once a sandbox is bound; lets callers map a claim to its sandbox with a label selector, without reading status. |
| `agents.x-k8s.io/claim-uid` | `SandboxIDLabel` | `SandboxClaim` controller | Written on `Sandbox.metadata.labels` and `Sandbox.spec.podTemplate.metadata.labels` so NetworkPolicies and external informers can resolve a Pod or Sandbox back to its owning claim by UID. |
| `agents.x-k8s.io/warm-pool-sandbox` | `warmPoolSandboxLabel` (package-private) | warm-pool controller | Hashed pool-name marker that lets the warm-pool informer enumerate its own sandboxes; stripped during claim adoption. |
| `agents.x-k8s.io/sandbox-template-ref-hash` | `sandboxTemplateRefHash` (package-private) | warm-pool + claim controllers | Hashed template-name marker on the pod template; used to bucket warm pods by template for the adoption queue. |

Sources: [controllers/sandbox_controller.go:49-53](controllers/sandbox_controller.go), [api/v1beta1/sandbox_types.go:60-61](api/v1beta1/sandbox_types.go), [extensions/api/v1beta1/sandboxclaim_types.go:25-31](extensions/api/v1beta1/sandboxclaim_types.go), [extensions/api/v1beta1/sandboxtemplate_types.go:33-37](extensions/api/v1beta1/sandboxtemplate_types.go), [extensions/controllers/sandboxwarmpool_controller.go:47-50](extensions/controllers/sandboxwarmpool_controller.go), [extensions/controllers/sandboxclaim_controller.go:578-589](extensions/controllers/sandboxclaim_controller.go), [extensions/controllers/sandboxclaim_controller.go:728-794](extensions/controllers/sandboxclaim_controller.go)
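
The hash-valued labels follow a common pattern: hash an arbitrary-length input into a short fixed-width string, then compare label values. The sketch below shows that pattern only. The `shortHash` helper and its use of FNV-1a are assumptions (this page does not say which hash the project uses), and `isStale` is a simplified guess at the comparison a helper like `isSandboxStale` performs.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// templateHashLabel mirrors the key listed in the table above.
const templateHashLabel = "agents.x-k8s.io/sandbox-pod-template-hash"

// shortHash returns an 8-character hex digest, well inside the 63-character
// label value limit. FNV-1a is an illustrative choice, not the project's.
func shortHash(s string) string {
	h := fnv.New32a()
	h.Write([]byte(s))
	return fmt.Sprintf("%08x", h.Sum32())
}

// isStale reports whether a warm sandbox's recorded template hash differs
// from the hash of the current template content. A missing label counts
// as stale in this sketch.
func isStale(labels map[string]string, currentTemplate string) bool {
	return labels[templateHashLabel] != shortHash(currentTemplate)
}

func main() {
	labels := map[string]string{templateHashLabel: shortHash("image: demo:v1")}
	fmt.Println(isStale(labels, "image: demo:v1"))
	fmt.Println(isStale(labels, "image: demo:v2"))
}
```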

### Propagation Algorithm

The `propagated-labels` / `propagated-annotations` annotations exist because Kubernetes has no "owner record" for fields a controller stamps onto someone else's object. Without a tracking annotation, removing a key from `Sandbox.spec.podTemplate.metadata.labels` would leave the previously-written value orphaned on the Pod. The Sandbox controller solves this by writing the *sorted, comma-joined set of keys it currently manages* back onto the Pod and consulting it on the next reconcile:

```text
   spec.podTemplate.metadata.labels       Pod.metadata.labels
   ┌──────────────────────────┐           ┌──────────────────────────┐
   │ app=demo                 │ ────────► │ app=demo                 │
   │ tier=web                 │           │ tier=web                 │
   └──────────────────────────┘           │ agents.x-k8s.io/         │
                ▲                         │   sandbox-name-hash=...  │
                │                         └──────────────────────────┘
                │                                       │
        next reconcile: diff                            │ annotation
        managedKeys vs.                                 ▼
        Pod.annotations["propagated-labels"]    propagated-labels=app,tier
```

If `tier` is later removed from spec, the next reconcile sees `tier` in the tracked set but absent from spec, and deletes it from `Pod.labels`. Annotations follow the identical procedure.

Sources: [controllers/sandbox_controller.go:785-943](controllers/sandbox_controller.go)
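
The tracking scheme above can be sketched in a few lines. This is a simplified model rather than the controller's implementation: `parseTracked` and `syncLabels` are hypothetical names, and the maps stand in for `Pod.metadata.labels` and `Pod.metadata.annotations`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// trackedLabelsAnnotation mirrors the key listed in the Annotations table.
const trackedLabelsAnnotation = "agents.x-k8s.io/propagated-labels"

// parseTracked splits the comma-joined annotation, skipping empty entries.
func parseTracked(value string) []string {
	var keys []string
	for _, k := range strings.Split(value, ",") {
		if k != "" {
			keys = append(keys, k)
		}
	}
	return keys
}

// syncLabels removes previously propagated keys that left the spec, copies
// the current spec labels onto the pod, and rewrites the tracking annotation.
// Labels the controller never propagated are left untouched.
func syncLabels(specLabels, podLabels, podAnnotations map[string]string) {
	for _, k := range parseTracked(podAnnotations[trackedLabelsAnnotation]) {
		if _, wanted := specLabels[k]; !wanted {
			delete(podLabels, k)
		}
	}
	managed := make([]string, 0, len(specLabels))
	for k, v := range specLabels {
		podLabels[k] = v
		managed = append(managed, k)
	}
	sort.Strings(managed)
	podAnnotations[trackedLabelsAnnotation] = strings.Join(managed, ",")
}

func main() {
	spec := map[string]string{"app": "demo"} // "tier" was removed from spec
	podLabels := map[string]string{
		"app":  "demo",
		"tier": "web",
		"agents.x-k8s.io/sandbox-name-hash": "abc123",
	}
	podAnnotations := map[string]string{trackedLabelsAnnotation: "app,tier"}
	syncLabels(spec, podLabels, podAnnotations)
	fmt.Println(podLabels, podAnnotations[trackedLabelsAnnotation])
}
```

Sorting the keys before joining keeps the annotation value stable, so an unchanged spec does not produce a spurious Pod update.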

## Worked Example: A Claim's `status` Across Phases

The table maps user-observable lifecycle phases onto the conditions and metadata the `SandboxClaim` controller actually writes. Reading top-to-bottom corresponds to the normal flow of a successful claim that later expires.

| Phase | `Ready` | `Finished` | Notable labels/annotations on `Sandbox` |
|-------|---------|------------|-----------------------------------------|
| Claim accepted, template missing | `False / TemplateNotFound` | — | (no sandbox yet) |
| Cold start, pod pending | `False / DependenciesNotReady` (forwarded) | — | `claim-uid`, `sandbox-template-ref-hash` |
| Warm-pool adoption in progress | `False / SandboxNotReady` | — | `pod-name`, `claim-uid` added; `warm-pool-sandbox`, `sandbox-pod-template-hash` removed |
| Running | `True / DependenciesReady` (forwarded) | — | `claim-uid`, `pod-name` (if warm) |
| `replicas=0` requested | `False / SandboxSuspended` | — | `Suspended` condition set on Sandbox |
| Pod exited 0 | `True / DependenciesReady` then transitions | `True / PodSucceeded` | `Finished` mirrored from Sandbox |
| `ttlSecondsAfterFinished` elapses | `False / ClaimExpired` | preserved | `ClaimExpired` event emitted |
| `shutdownTime` elapses with `ShutdownPolicy=Retain` | `False / ClaimExpired` | (cleared if sandbox deleted) | Sandbox deleted; claim object retained |

Sources: [extensions/controllers/sandboxclaim_controller.go:160-249](extensions/controllers/sandboxclaim_controller.go), [extensions/controllers/sandboxclaim_controller.go:459-576](extensions/controllers/sandboxclaim_controller.go), [controllers/sandbox_controller.go:289-417](controllers/sandbox_controller.go)
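
A caller reading the claim's status might turn the rows above into a small decision function. The policy below is purely an assumption for illustration (which reasons a client treats as retryable is a client choice, not something the controllers define), and the `condition` type, `action` values, and `decide` function are hypothetical.

```go
package main

import "fmt"

// condition is a local stand-in for metav1.Condition (fields trimmed).
type condition struct {
	Type, Status, Reason string
}

type action string

const (
	proceed action = "proceed" // sandbox is usable
	wait    action = "wait"    // keep polling or watching
	retry   action = "retry"   // transient controller error
	drop    action = "drop"    // run is over or the claim expired
)

// decide maps a claim's conditions onto a client-side action.
func decide(conds []condition) action {
	var ready *condition
	for i := range conds {
		switch conds[i].Type {
		case "Finished":
			if conds[i].Status == "True" {
				return drop
			}
		case "Ready":
			ready = &conds[i]
		}
	}
	if ready == nil {
		return wait
	}
	if ready.Status == "True" {
		return proceed
	}
	switch ready.Reason {
	case "ClaimExpired", "SandboxExpired":
		return drop
	case "ReconcilerError":
		return retry
	}
	return wait
}

func main() {
	fmt.Println(decide([]condition{{Type: "Ready", Status: "False", Reason: "DependenciesNotReady"}}))
	fmt.Println(decide([]condition{{Type: "Ready", Status: "False", Reason: "ClaimExpired"}}))
}
```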

## Summary

The status surface of an agent sandbox is intentionally minimal. It consists of three condition types (`Ready`, `Suspended`, `Finished`), a closed set of reason strings declared as constants in `api/v1beta1/sandbox_types.go` and `extensions/api/v1beta1/sandboxclaim_types.go`, and a handful of `agents.x-k8s.io/`-namespaced annotations and labels that let controllers reconstruct ownership and propagation history across reconciles. Together they form the contract that lets the `SandboxClaim` mirror the core `Sandbox`'s status verbatim while still expressing claim-only concerns like `ClaimExpired` and `ttlSecondsAfterFinished`, and that lets the `SandboxWarmPool` decide which pre-warmed sandboxes are still safe to hand out.
