# Capability directories

> Pipeline path grammar (.by, .filter, .order, .first, .last, .export), chaining limits, SQL pushdown, and PathType resolution in path.go.

- Repository: timescale/tigerfs
- GitHub: https://github.com/timescale/tigerfs
- Human docs: https://grok-wiki.com/public/docs/timescale-tigerfs-60719456a5c3
- Complete Markdown: https://grok-wiki.com/public/docs/timescale-tigerfs-60719456a5c3/llms-full.txt

## Source Files

- `internal/tigerfs/fs/path.go`
- `internal/tigerfs/fs/constants.go`
- `internal/tigerfs/db/pipeline.go`
- `docs/data-first.md`
- `docs/spec.md`
- `test/integration/pipeline_test.go`

---


Capability directories are dot-prefixed path segments under a table (for example `orders/.filter/status/pending/.last/10/`) that accumulate query state in an immutable `FSContext`. `ParsePath` in `internal/tigerfs/fs/path.go` performs syntactic parsing only; `Operations.parsePath` resolves schema, applies file-first policy gates, and hands the resulting `ParsedPath` to `readDirWithParsed` / `readFileWithParsed`. When `FSContext.HasPipelineOperations()` is true, row listings and `.export/` reads call `db.QueryRowsPipeline` or `db.QueryRowsWithDataPipeline`, which compile the full chain into one parameterized SQL statement.

## Pipeline capabilities

Navigation and bulk-query capabilities are defined as constants in `internal/tigerfs/fs/constants.go`:

| Directory | Segments | Effect on `FSContext` |
|-----------|----------|------------------------|
| `.by/` | `.by/`, `.by/<col>/`, `.by/<col>/<val>/` | Equality filter; `Indexed: true` on the filter |
| `.filter/` | `.filter/`, `.filter/<col>/`, `.filter/<col>/<val>/` | Equality filter; `Indexed: false` |
| `.order/` | `.order/`, `.order/<col>/`, `.order/<col>.desc/` | Sets `OrderBy` / `OrderDesc`; blocks further filters |
| `.columns/` | `.columns/`, `.columns/col1,col2/` | Column projection; only `.export/` allowed next |
| `.first/N/` | Positive integer `N` | `LimitFirst` |
| `.last/N/` | Positive integer `N` | `LimitLast` |
| `.sample/N/` | Positive integer `N` | `LimitSample` |
| `.export/` | `.export/<fmt>`, `.export/.with-headers/<fmt>` | Terminal; sets `IsTerminal` → `PathExport` |
| `.all/` | `.all/` | No-op limit (equivalent to table root; hidden from `ls`) |

Supported export format names: `csv`, `tsv`, `json`, `yaml` (with or without a file extension such as `data.csv`).
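The "with or without a file extension" rule can be illustrated with a small sketch. The helper name `parseExportFormat` is hypothetical and is not taken from `path.go`; only the four format names come from this page.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// exportFormats holds the format names listed on this page.
var exportFormats = map[string]bool{"csv": true, "tsv": true, "json": true, "yaml": true}

// parseExportFormat is a hypothetical helper. It accepts a bare format name
// ("csv") or a filename carrying that extension ("data.csv").
func parseExportFormat(segment string) (string, bool) {
	name := segment
	if ext := filepath.Ext(segment); ext != "" {
		name = strings.TrimPrefix(ext, ".")
	}
	if !exportFormats[name] {
		return "", false
	}
	return name, true
}

func main() {
	for _, seg := range []string{"json", "data.csv", "notes.txt"} {
		format, ok := parseExportFormat(seg)
		fmt.Printf("%-10s -> format=%q ok=%v\n", seg, format, ok)
	}
}
```

Under this sketch, `json` and `data.csv` resolve to supported formats, while `notes.txt` is rejected.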

<Note>
`.import/`, `.info/`, DDL (`.create/`, `.modify/`, `.delete/`), `.indexes/`, and synth/history paths (`.build/`, `.format/`, `.history/`, `.log/`, `.savepoint/`, `.undo/`) are separate capability families. This page focuses on the data-first **pipeline** set: `.by`, `.filter`, `.order`, `.columns`, `.first`, `.last`, `.sample`, `.export`.
</Note>

### Example paths

```bash
# Export pending orders for customer 123, ordered by created_at and limited to the last 10, as JSON
cat /mnt/db/orders/.by/customer_id/123/.filter/status/pending/.order/created_at/.last/10/.export/json

# Project columns then export
cat /mnt/db/orders/.filter/status/shipped/.columns/id,total,created_at/.export/csv

# Nested pagination: last 50 rows among the first 100 by PK
ls /mnt/db/events/.first/100/.last/50/
```

## Path grammar and `PathType` resolution

`ParsePath` splits the mount path into segments and routes through `parseTablePath` → `processSegments`. Known capabilities are recognized by `isKnownCapability`; unknown dot-segments (for example `.git`) are treated as ordinary filenames via scan-ahead logic.

```mermaid
flowchart TD
  subgraph parse ["fs/path.go — ParsePath"]
    PP[ParsePath]
    PT[parseTablePath]
    PS[processSegments]
    PC[processCapability]
    PR[processRowOrColumn]
  end
  subgraph types ["ParsedPath.Type"]
    PCap[PathCapability — incomplete cap]
    PTbl[PathTable — filters/limits/order applied]
    PExp[PathExport — terminal export]
    PRow[PathRow / PathColumn]
  end
  subgraph runtime ["fs/operations.go"]
    OP[Operations.parsePath]
    RD[readDirWithParsed / readFileWithParsed]
    QP[FSContext.ToQueryParams]
    DB[db.QueryRowsPipeline]
  end
  PP --> PT --> PS
  PS -->|known cap| PC
  PS -->|no cap ahead| PR
  PC -->|listing only| PCap
  PC -->|filter/limit/order/columns| PTbl
  PC -->|.export/| PExp
  PR --> PRow
  OP --> PP
  PTbl --> RD
  PExp --> RD
  PTbl --> QP --> DB
```
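The routing decision in the diagram can be approximated as follows. This is a simplified sketch, not the code in `path.go`: `knownCaps`, `splitSegments`, `hasCapabilityAhead`, and `route` are illustrative names, and the capability set is limited to the pipeline directories covered here.

```go
package main

import (
	"fmt"
	"strings"
)

// knownCaps is an illustrative subset of capability names (assumption:
// the real isKnownCapability covers more families than this).
var knownCaps = map[string]bool{
	".by": true, ".filter": true, ".order": true, ".columns": true,
	".first": true, ".last": true, ".sample": true, ".export": true,
}

// splitSegments breaks a mount-relative path into non-empty segments.
func splitSegments(p string) []string {
	var out []string
	for _, s := range strings.Split(p, "/") {
		if s != "" {
			out = append(out, s)
		}
	}
	return out
}

// hasCapabilityAhead scans the remaining segments for a known capability.
func hasCapabilityAhead(segs []string) bool {
	for _, s := range segs {
		if knownCaps[s] {
			return true
		}
	}
	return false
}

// route names the handler that the segments after the table name would reach.
func route(segs []string) string {
	switch {
	case len(segs) == 0:
		return "table"
	case hasCapabilityAhead(segs):
		return "capability"
	default:
		return "row-or-column"
	}
}

func main() {
	for _, p := range []string{"/users", "/users/.by/email/foo@bar.com", "/users/123/name", "/users/.git"} {
		segs := splitSegments(p)
		fmt.Printf("%-32s -> %s\n", p, route(segs[1:]))
	}
}
```

Because `.git` is not in the known set, the scan finds no capability ahead and the segment falls through to row/column handling, which matches the behavior described for unknown dot-segments.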

### How `PathType` is chosen

| Path shape | `PathType` | Notes |
|------------|------------|-------|
| `/users` | `PathTable` | Empty `FSContext`; schema filled at runtime |
| `/users/.by/email` | `PathCapability` | `CapabilityDir=.by`, `CapabilityArg=email` |
| `/users/.by/email/foo@bar.com` | `PathTable` | Filter applied; type collapses from `PathCapability` |
| `/users/.first/10` | `PathTable` | Limit applied |
| `/users/.export/json` | `PathExport` | `IsTerminal=true`; no further caps |
| `/users/123/name` | `PathColumn` | Terminal row/column segment |
| `/users/.filter/active/true/.order/date/.last/10` | `PathTable` | Full pipeline in `Context` |

Incomplete capabilities (directory listing forms) keep `PathCapability` and set `CapabilityDir` / `CapabilityArg`. Once a filter value, limit, order, or column list is applied, `processBy` / `processFilter` / `processLimit` / `processOrder` / `processColumns` reset the type to `PathTable` while mutating `Context`.
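As a rough illustration of that collapse for a single `.by/` step, the sketch below maps the number of arguments to a type. `resolveBy` and the struct layout are assumptions made for this example; the field names mirror those mentioned above, but the real `ParsedPath` carries more state.

```go
package main

import "fmt"

type PathType string

const (
	PathTable      PathType = "PathTable"
	PathCapability PathType = "PathCapability"
)

type Filter struct {
	Column, Value string
	Indexed       bool
}

// ParsedPath is a reduced stand-in for the real struct (assumption).
type ParsedPath struct {
	Type          PathType
	CapabilityDir string
	CapabilityArg string
	Filters       []Filter
}

// resolveBy is a hypothetical helper handling the arguments of one .by/ step.
func resolveBy(args []string) (ParsedPath, error) {
	switch len(args) {
	case 0: // ".by/" lists columns
		return ParsedPath{Type: PathCapability, CapabilityDir: ".by"}, nil
	case 1: // ".by/<col>/" lists values
		return ParsedPath{Type: PathCapability, CapabilityDir: ".by", CapabilityArg: args[0]}, nil
	case 2: // ".by/<col>/<val>/" applies the filter and collapses to a table view
		f := Filter{Column: args[0], Value: args[1], Indexed: true}
		return ParsedPath{Type: PathTable, Filters: []Filter{f}}, nil
	default:
		return ParsedPath{}, fmt.Errorf("sketch handles a single .by step, got %d args", len(args))
	}
}

func main() {
	for _, args := range [][]string{{}, {"email"}, {"email", "foo@bar.com"}} {
		p, err := resolveBy(args)
		fmt.Println(args, p.Type, p.CapabilityArg, len(p.Filters), err)
	}
}
```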

`processRowOrColumn` runs only when no known capability appears ahead in the remaining segments. Hierarchical synth paths use scan-ahead: segments before `.history/` become `PrimaryKey` / `RawSubPath`, then capability processing continues.

### Reserved names

`IsCapabilityDirectory` in `constants.go` prevents names like `.filter` from being interpreted as column values during row/column parsing. The reserved set includes pipeline dirs plus `.info`, `.log`, `.savepoint`, and `.undo`.

## `FSContext` and chaining rules

`FSContext` (`internal/tigerfs/fs/context.go`) is immutable: each pipeline step returns a clone via `WithFilter`, `WithOrder`, `WithLimit`, `WithColumns`, or `WithTerminal`. `PipelineDepth` increments on each mutating step and drives `max_pipeline_depth` listing suppression.

Enforcement methods:

| Method | Rule |
|--------|------|
| `CanAddFilter()` | False after `.order/` or `.export/` |
| `CanAddOrder()` | False if already ordered (second `.order/` rejected at parse time) |
| `CanAddLimit(type)` | No `.first` after `.first`, no `.last` after `.last`, no limits after `.sample/` |
| `CanAddColumns()` | False after `.columns/` or `.export/` |
| `CanExport()` | False when `IsTerminal` |

`AvailableCapabilities()` drives which capability directories appear in `ls` after pipeline steps. After `.columns/`, only `.export/` is advertised.
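A minimal sketch of the clone-on-write pattern follows, assuming a reduced context with only filters and ordering. The type `pipelineCtx` is hypothetical; the method names echo those listed above, but their real signatures in `context.go` may differ.

```go
package main

import "fmt"

type filter struct {
	Column, Value string
	Indexed       bool
}

// pipelineCtx is a reduced, hypothetical stand-in for FSContext.
type pipelineCtx struct {
	Filters       []filter
	OrderBy       string
	OrderDesc     bool
	IsTerminal    bool
	PipelineDepth int
}

// clone copies the slice so the parent context is never mutated.
func (c pipelineCtx) clone() pipelineCtx {
	next := c
	next.Filters = append([]filter(nil), c.Filters...)
	return next
}

// CanAddFilter mirrors the table: false after .order/ or .export/.
func (c pipelineCtx) CanAddFilter() bool { return c.OrderBy == "" && !c.IsTerminal }

// CanAddOrder mirrors only the stated rule: false if already ordered.
func (c pipelineCtx) CanAddOrder() bool { return c.OrderBy == "" }

func (c pipelineCtx) WithFilter(col, val string, indexed bool) pipelineCtx {
	next := c.clone()
	next.Filters = append(next.Filters, filter{col, val, indexed})
	next.PipelineDepth++
	return next
}

func (c pipelineCtx) WithOrder(col string, desc bool) pipelineCtx {
	next := c.clone()
	next.OrderBy, next.OrderDesc = col, desc
	next.PipelineDepth++
	return next
}

func main() {
	root := pipelineCtx{}
	filtered := root.WithFilter("status", "pending", false)
	ordered := filtered.WithOrder("created_at", true)
	// root is untouched; depth grows per step; filters are blocked after order.
	fmt.Println(len(root.Filters), filtered.PipelineDepth, ordered.CanAddFilter())
}
```

Returning a fresh value from each step means two sibling paths that share a prefix can never observe each other's filters.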

### Allowed chaining (parent → children)

| Parent state | Next capabilities |
|--------------|-------------------|
| Table root | `.by`, `.filter`, `.order`, `.columns`, `.first`, `.last`, `.sample`, `.export` |
| After `.by/<col>/<val>/` or `.filter/<col>/<val>/` | Same as root (filters AND-combined) |
| After `.order/<col>/` | `.columns`, `.first`, `.last`, `.sample`, `.export` (no more filters) |
| After `.columns/col1,col2/` | `.export` only |
| After `.first/N/` or `.last/N/` | Filters, order, columns, other limit types, `.export` |
| After `.sample/N/` | Filters, order, columns, `.export` (no further limits) |
| After `.export/<fmt>` | None (terminal) |

### Disallowed combinations (parse-time errors)

| Pattern | Replacement / behavior |
|---------|------------------------|
| `.first/N/.first/M/` | Use single `.first/M/` |
| `.last/N/.last/M/` | Use single `.last/M/` |
| `.sample/N/.first/M/` or `.sample/N/.last/M/` | Use `.sample/M/` |
| `.order/a/.order/b/` | Rejected; only one order per path |
| `.columns/a,b/.columns/c,d/` | Rejected; merge column lists |
| `.by/` or `.filter/` after `.order/` | Rejected with hint: filters before order |
| `.export/.../.first/` | Export is terminal |

Post-limit filters are allowed: `.first/100/.filter/status/active/` applies the filter to the subquery result set.
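The limit rules from the tables above reduce to a small predicate. The sketch below is an interpretation of those rules; `canAddLimit` and the `LimitType` constants are assumed names, not the signature of the real `CanAddLimit`.

```go
package main

import "fmt"

type LimitType int

const (
	LimitNone LimitType = iota
	LimitFirst
	LimitLast
	LimitSample
)

// canAddLimit reports whether a limit of kind next may follow the current one.
func canAddLimit(current, next LimitType) bool {
	switch current {
	case LimitNone:
		return true // nothing applied yet
	case LimitSample:
		return false // no limits after .sample/
	default:
		return next != current // no .first after .first, no .last after .last
	}
}

func main() {
	fmt.Println(canAddLimit(LimitFirst, LimitLast))   // nested pagination
	fmt.Println(canAddLimit(LimitFirst, LimitFirst))  // duplicate limit
	fmt.Println(canAddLimit(LimitSample, LimitFirst)) // limit after sample
}
```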

### Nested pagination

When a second limit is added, `WithLimit` moves the current limit to `PreviousLimit` / `PreviousLimitType`. `FSContext.NeedsSubquery()` becomes true, and `db.buildNestedPipelineSQL` wraps the inner limit in a subquery before applying outer filters, order, and limit.

| Path | Meaning |
|------|---------|
| `.first/100/.last/50/` | Last 50 rows of the first 100 by PK (rows 51–100 of PK-ordered set) |
| `.last/100/.first/50/` | First 50 of the last 100 |
| `.first/1000/.sample/50/` | Random 50 drawn from the first 1000 |

Integration tests in `test/integration/pipeline_test.go` (`TestPipeline_NestedPagination`) verify `.first/10/.last/3/` and `.last/10/.first/3/` mount listings.
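The state transition behind nested pagination might look like the sketch below. It covers only the limit fields and assumes that `NeedsSubquery` depends solely on a previous limit being present; `limitCtx` is a hypothetical type, and the real method may consider other conditions.

```go
package main

import "fmt"

type LimitType int

const (
	LimitNone LimitType = iota
	LimitFirst
	LimitLast
	LimitSample
)

// limitCtx is a hypothetical slice of FSContext holding only limit state.
type limitCtx struct {
	Limit             int
	LimitType         LimitType
	PreviousLimit     int
	PreviousLimitType LimitType
}

// WithLimit returns a copy; an existing limit is moved to the Previous* fields.
func (c limitCtx) WithLimit(n int, t LimitType) limitCtx {
	next := c
	if c.LimitType != LimitNone {
		next.PreviousLimit, next.PreviousLimitType = c.Limit, c.LimitType
	}
	next.Limit, next.LimitType = n, t
	return next
}

// NeedsSubquery is true once an inner limit has been recorded (assumption).
func (c limitCtx) NeedsSubquery() bool { return c.PreviousLimitType != LimitNone }

func main() {
	// Models the path .first/100/.last/50/
	c := limitCtx{}.WithLimit(100, LimitFirst).WithLimit(50, LimitLast)
	fmt.Println(c.PreviousLimit, c.Limit, c.NeedsSubquery())
}
```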

## SQL pushdown

At runtime, `readDirTable` checks `fsCtx.HasPipelineOperations()`. When true, it builds `params := fsCtx.ToQueryParams()`, fills `PKColumns` from metadata, applies `DirListingLimit` if no explicit limit was set, and calls `db.QueryRowsPipeline`. Export reads use `QueryRowsWithDataPipeline` with `selectPKOnly=false` so column projection and filters apply to the full row payload.

`db.QueryParams` (`internal/tigerfs/db/pipeline.go`) carries:

- `Filters` — AND-combined equalities (`column = $n`)
- `OrderBy` / `OrderDesc` — custom sort with PK tie-breakers
- `Limit` / `LimitType` — `LIMIT` plus default `ORDER BY` (PK ASC for `.first/`, PK DESC for `.last/`, `RANDOM()` for `.sample/`)
- `PreviousLimit` / `PreviousLimitType` — nested subquery when needed
- `Columns` — quoted column list instead of `SELECT *`

`FilterCondition.Indexed` (from `.by/` vs `.filter/`) does not change SQL semantics; both emit equality predicates. The flag is reserved for planning hints.

Simple (non-nested) pipeline SQL shape:

```sql
SELECT <cols or pk> FROM "schema"."table"
WHERE col1 = $1 AND col2 = $2
ORDER BY <order_col> ASC NULLS LAST, <pk cols>
LIMIT $n
```
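To make the simple shape concrete, here is a sketch of a builder that produces it from a reduced parameter struct. `Params`, `quoteIdent`, and `buildSimpleSQL` are hypothetical names, the builder always selects `*`, and applying `NULLS LAST` to descending order is an assumption carried over from the ascending shape shown above.

```go
package main

import (
	"fmt"
	"strings"
)

type Filter struct{ Column, Value string }

// Params is a reduced, hypothetical version of db.QueryParams.
type Params struct {
	Schema, Table string
	Filters       []Filter
	OrderBy       string
	OrderDesc     bool
	PKColumns     []string
	Limit         int
}

// quoteIdent applies standard SQL identifier quoting.
func quoteIdent(s string) string {
	return `"` + strings.ReplaceAll(s, `"`, `""`) + `"`
}

// buildSimpleSQL returns the statement text and its positional arguments.
func buildSimpleSQL(p Params) (string, []interface{}) {
	var sb strings.Builder
	var args []interface{}
	sb.WriteString("SELECT * FROM " + quoteIdent(p.Schema) + "." + quoteIdent(p.Table))
	for i, f := range p.Filters {
		if i == 0 {
			sb.WriteString(" WHERE ")
		} else {
			sb.WriteString(" AND ")
		}
		args = append(args, f.Value) // values travel as parameters, never inline
		fmt.Fprintf(&sb, "%s = $%d", quoteIdent(f.Column), len(args))
	}
	var order []string
	if p.OrderBy != "" {
		dir := "ASC"
		if p.OrderDesc {
			dir = "DESC"
		}
		order = append(order, quoteIdent(p.OrderBy)+" "+dir+" NULLS LAST")
	}
	for _, pk := range p.PKColumns { // PK columns act as tie-breakers
		order = append(order, quoteIdent(pk))
	}
	if len(order) > 0 {
		sb.WriteString(" ORDER BY " + strings.Join(order, ", "))
	}
	if p.Limit > 0 {
		args = append(args, p.Limit)
		fmt.Fprintf(&sb, " LIMIT $%d", len(args))
	}
	return sb.String(), args
}

func main() {
	sql, args := buildSimpleSQL(Params{
		Schema: "public", Table: "orders",
		Filters:   []Filter{{"status", "pending"}, {"customer_id", "123"}},
		OrderBy:   "created_at",
		PKColumns: []string{"id"},
		Limit:     10,
	})
	fmt.Println(sql)
	fmt.Println(args...)
}
```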

Nested `.first/100/.last/50/` compiles to an inner `SELECT * ... ORDER BY pk ASC LIMIT 100` wrapped by an outer query with `ORDER BY pk DESC LIMIT 50`.
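The wrapping can be sketched as plain string assembly. This is illustrative only: `buildNestedSQL` here is a hypothetical helper rather than `db.buildNestedPipelineSQL`, it handles `.first`/`.last` but not `.sample`, it assumes a single-column primary key, and it inlines the integer limits where the real code is described as parameterized.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent applies standard SQL identifier quoting.
func quoteIdent(s string) string {
	return `"` + strings.ReplaceAll(s, `"`, `""`) + `"`
}

// pkOrder maps a limit kind to the default PK direction described above.
func pkOrder(kind string) string {
	if kind == "last" {
		return "DESC"
	}
	return "ASC"
}

// buildNestedSQL wraps the inner limit in a subquery and applies the outer one.
func buildNestedSQL(table, pk, innerKind string, innerN int, outerKind string, outerN int) string {
	inner := fmt.Sprintf("SELECT * FROM %s ORDER BY %s %s LIMIT %d",
		quoteIdent(table), quoteIdent(pk), pkOrder(innerKind), innerN)
	return fmt.Sprintf("SELECT * FROM (%s) AS sub ORDER BY %s %s LIMIT %d",
		inner, quoteIdent(pk), pkOrder(outerKind), outerN)
}

func main() {
	// Models /events/.first/100/.last/50/
	fmt.Println(buildNestedSQL("events", "id", "first", 100, "last", 50))
}
```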

<Warning>
`.sample/` uses `ORDER BY RANDOM()` and can be expensive on large tables. Prefer indexed `.by/` paths and explicit `.first/` / `.last/` windows for predictable cost.
</Warning>

## `.by/` vs `.filter/`

Both add equality filters to `FSContext`. The difference is navigation and listing behavior, not the final filter SQL.

| Aspect | `.by/` | `.filter/` |
|--------|--------|------------|
| Column listing (`ls`) | Indexed columns only (`GetSingleColumnIndexes`) | All table columns |
| Value listing (`ls .by/col/`) | `DirListingLimit` (default 1000) | `DirFilterLimit` (default 100000); may surface `.table-too-large` |
| Filter flag | `Indexed: true` | `Indexed: false` |
| Typical use | Index-backed lookups | Ad-hoc filters on non-indexed columns |

Direct access without listing still works when value listing is blocked:

```bash
cat /mnt/db/large_events/.filter/type/click/.first/100/.export/json
```

On `tigerfs.<workspace>_log` tables, `.by/filename/` and `.filter/filename/` are rejected at parse time (`blockLogFilenameQuery`): log `filename` values contain `/`, which cannot appear in a single path segment. Use `.log/.by/file_id/<uuid>/` instead.

## Directory listings vs reads

| Operation | Pipeline behavior |
|-----------|-------------------|
| `ls` on `PathTable` with pipeline context | `QueryRowsPipeline` for row PKs; capability subdirs from `AvailableCapabilities()` |
| `ls` on `PathCapability` | Column/value/order listing helpers (`.by/`, `.filter/`, `.columns/`, `.order/`) |
| `cat` on `PathExport` | `QueryRowsWithDataPipeline` + format encoder |
| `cat` on row files under a pipeline path | Row read respects accumulated filters (via parsed context) |

Pagination directories (`.first/`, `.last/`, `.sample/`) appear in listings but list as empty until you navigate to a numeric child (for example `.first/50/`). This keeps recursive tools (`find`, `rm -rf`, agents) from walking infinite placeholder trees.

### `max_pipeline_depth`

<ParamField body="max_pipeline_depth" type="int" default="10">
Maximum chained pipeline operations before capability directories are hidden from `ls`. `0` means unlimited depth for listings. Parsing and explicit paths still work when depth is exceeded; only directory advertisements are suppressed. Default comes from `config.Config` / `~/.config/tigerfs/config.yaml`.
</ParamField>

When `fsCtx.PipelineDepth >= max_pipeline_depth`, `readDirTable` lists only rows and `.info/`; it no longer advertises `.by/`, `.filter/`, and the other pipeline directories. `.log/` and `.savepoint/` under file-first workspaces set `HideCapabilities` for the same recursion protection; pipeline segments remain reachable by full path.
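The depth check itself is small. The sketch below encodes the two rules stated here, with `capabilitiesVisible` as a hypothetical helper name.

```go
package main

import "fmt"

// capabilitiesVisible reports whether ls should still advertise capability
// directories at the given pipeline depth. A maxDepth of 0 means unlimited.
func capabilitiesVisible(depth, maxDepth int) bool {
	if maxDepth == 0 {
		return true
	}
	return depth < maxDepth
}

func main() {
	fmt.Println(capabilitiesVisible(9, 10))  // below the default limit
	fmt.Println(capabilitiesVisible(10, 10)) // at the limit: hidden
	fmt.Println(capabilitiesVisible(50, 0))  // 0 disables the limit
}
```

The check only affects what a listing advertises; it says nothing about whether a deeper explicit path parses.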

## File-first workspace boundary

`Operations.rejectDataFirstCapOnSynthWorkspace` blocks `.by/`, `.filter/`, `.order/`, `.columns/`, `.first/`, `.last/`, `.sample/`, `.export/`, and related DDL on synth workspace mount paths (`/<view>/...`). Data-first pipeline access to the backing table is available through `/.tables/<view>/...` (schema `tigerfs`). Allowed workspace controls include `.info/`, `.history/`, `.log/`, `.savepoint/`, `.undo/`, and `.format/`.

## Configuration and errors

| Setting | Default | Role |
|---------|---------|------|
| `max_pipeline_depth` | `10` | Hide capability dirs in deep chains |
| `dir_listing_limit` | `1000` | Default row cap for pipeline `ls` when no `.first/` / `.last/` / `.sample/` |
| `dir_filter_limit` | `100000` | Threshold for `.filter/<col>/` value listing |

Invalid chains return `ErrInvalidPath` from `ParsePath` with a `Hint` field (surfaced via structured logging and POSIX `EINVAL` on FUSE/NFS). Examples: `"cannot add .filter/ after .order/"`, `"cannot add .first after .first"`.
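One way to model an error that carries a hint and still maps to `EINVAL` is a wrapper type around a sentinel. Everything in this sketch is an assumption about shape: `PathError`, its fields, and `toErrno` are hypothetical, the fallback to `EIO` is illustrative, and the real `ErrInvalidPath` may be structured differently.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// ErrInvalidPath stands in for the sentinel named on this page.
var ErrInvalidPath = errors.New("invalid path")

// PathError is a hypothetical wrapper that attaches a human-readable hint.
type PathError struct {
	Err  error
	Hint string
}

func (e *PathError) Error() string { return e.Err.Error() + ": " + e.Hint }
func (e *PathError) Unwrap() error { return e.Err }

// toErrno maps parse failures to the errno a FUSE/NFS layer would return.
func toErrno(err error) syscall.Errno {
	switch {
	case err == nil:
		return 0
	case errors.Is(err, ErrInvalidPath):
		return syscall.EINVAL
	default:
		return syscall.EIO // assumption: generic fallback for other errors
	}
}

func main() {
	err := &PathError{Err: ErrInvalidPath, Hint: "cannot add .filter/ after .order/"}
	fmt.Println(err)
	fmt.Println(toErrno(err) == syscall.EINVAL)
}
```

Keeping the hint on the error value lets the logging layer print it while the filesystem layer only sees the errno.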

## Implementation map

| Layer | Responsibility |
|-------|----------------|
| `fs/path.go` | `ParsePath`, `processCapability`, `PathType`, capability-specific parsers |
| `fs/context.go` | `FSContext`, chaining rules, `ToQueryParams()` |
| `fs/constants.go` | Directory name constants, `IsCapabilityDirectory` |
| `fs/operations.go` | `parsePath`, `readDirTable`, capability listing, synth/file-first gates |
| `db/pipeline.go` | SQL generation, `QueryRowsPipeline`, nested subqueries |

Production code should call `Operations.parsePath`, not bare `ParsePath`, so schema resolution and workspace policy apply before any database access.

## Related pages

<CardGroup>
  <Card title="Filesystem as API" href="/filesystem-as-api">
    Mount hierarchy, dot-directory control surface, and how operations map paths to SQL.
  </Card>
  <Card title="Data-first exploration" href="/data-first-exploration">
    Row/column reads, index navigation, pagination, PATCH writes, and export/import workflows.
  </Card>
  <Card title="Consistency and caching" href="/consistency-and-caching">
    No content cache on reads; stat/path cache keys must include full pipeline paths.
  </Card>
  <Card title="Configuration reference" href="/configuration-reference">
    `max_pipeline_depth`, `dir_listing_limit`, `dir_filter_limit`, and mount flags.
  </Card>
</CardGroup>
