# Filesystem as API

> Mount hierarchy, schema/table/row/column path model, dot-directory control surface, and how fs.Operations maps paths to SQL.

- Repository: timescale/tigerfs
- GitHub: https://github.com/timescale/tigerfs
- Human docs: https://grok-wiki.com/public/docs/timescale-tigerfs-60719456a5c3
- Complete Markdown: https://grok-wiki.com/public/docs/timescale-tigerfs-60719456a5c3/llms-full.txt

## Source Files

- `docs/spec.md`
- `internal/tigerfs/fs/path.go`
- `internal/tigerfs/fs/operations.go`
- `internal/tigerfs/fs/context.go`
- `README.md`

---


TigerFS exposes PostgreSQL through a mount point. `fs.Operations` parses every path into a `ParsedPath` and an `FSContext`, then issues SQL through `internal/tigerfs/db`. FUSE (Linux) and in-process NFS (macOS) are thin adapters that delegate `ReadDir`, `Stat`, and `ReadFile` to that core without reimplementing query logic.

## Request flow

```mermaid
flowchart LR
  subgraph clients [Clients]
    Unix[Unix tools ls cat grep]
  end
  subgraph adapters [Platform adapters]
    FUSE[fuse.OpsFS + FSAdapter]
    NFS[nfs.OpsFilesystem]
  end
  subgraph core [Shared core internal/tigerfs/fs]
    Ops[Operations]
    Parse[parsePath → ParsePath]
    Ctx[FSContext]
    DB[db.Client pipeline + row queries]
  end
  PG[(PostgreSQL)]
  Unix --> FUSE
  Unix --> NFS
  FUSE --> Ops
  NFS --> Ops
  Ops --> Parse
  Parse --> Ctx
  Ops --> DB
  DB --> PG
```

<Note>
`ReadFile` always queries PostgreSQL for row content. `Stat` and directory listings may use short-TTL metadata caches; see the consistency page for invalidation rules.
</Note>

## Mount hierarchy

After `tigerfs mount <connection> <mountpoint>`, the mount root lists tables and views from the connection’s current schema (typically `public`), plus global control directories.

| Path | Role |
|------|------|
| `/` | Default-schema tables/views plus `.build`, `.create`, `.info`, `.schemas`, `.tables`, `.views` |
| `/<table>/` | One PostgreSQL table (or synth workspace view) as a directory |
| `/.schemas/<schema>/` | Explicit schema; non-default schemas also appear as `/<schema>/<table>/` |
| `/.tables/<name>/` | Backing table in the `tigerfs` schema (data-first escape hatch for synth apps) |

**Schema flattening:** Tables in the default schema appear at the mount root (`/users/`). Other schemas use a prefix (`/analytics/reports/`) or `/.schemas/analytics/reports/`. Root-level `/.build/` creates synthesized apps; `/.build/` under `/.schemas/<schema>/` is rejected to avoid `tigerfs` schema collisions.

**Root listing** (`readDirRoot`) adds control dirs first, then cached table names, then views (synth views are writable `0755`; plain views `0555`).
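
The flattening rules can be illustrated with a small resolver. This is a minimal sketch, not TigerFS code: `resolveTable` and the `knownSchemas` set are hypothetical names, and it assumes a schema prefix wins over a same-named table in the default schema, which the real implementation may decide differently.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveTable maps a mount-relative path to a (schema, table) pair.
// knownSchemas is a hypothetical set of non-default schema names.
func resolveTable(p, defaultSchema string, knownSchemas map[string]bool) (schema, table string, ok bool) {
	segs := strings.FieldsFunc(p, func(r rune) bool { return r == '/' })
	switch {
	case len(segs) >= 3 && segs[0] == ".schemas":
		return segs[1], segs[2], true // explicit schema route
	case len(segs) >= 2 && segs[0] == ".tables":
		return "tigerfs", segs[1], true // backing-table route
	case len(segs) >= 2 && knownSchemas[segs[0]]:
		return segs[0], segs[1], true // prefixed non-default schema
	case len(segs) >= 1 && !strings.HasPrefix(segs[0], ".") && !knownSchemas[segs[0]]:
		return defaultSchema, segs[0], true // flattened default schema
	}
	return "", "", false // root, control dirs, or a bare schema directory
}

func main() {
	schemas := map[string]bool{"analytics": true}
	paths := []string{"/users/", "/analytics/reports/", "/.schemas/analytics/reports/", "/.tables/notes/"}
	for _, p := range paths {
		s, t, ok := resolveTable(p, "public", schemas)
		fmt.Println(p, "->", s, t, ok)
	}
}
```

Both `/analytics/reports/` and `/.schemas/analytics/reports/` resolve to the same pair here, which is the point of offering the two routes.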

## Path model: schema → table → row → column

Paths are absolute, start with `/`, and are parsed syntactically by `ParsePath` then enriched by `(*Operations).parsePath` (schema resolution and file-first policy gates).

### PathType dispatch

`ParsedPath.Type` drives all operations. Common values:

| PathType | Example path | Meaning |
|----------|--------------|---------|
| `PathRoot` | `/` | Mount root |
| `PathTable` | `/orders` or `/orders/.filter/status/shipped/` | Table dir; pipeline state in `FSContext` |
| `PathRow` | `/orders/123` or `/orders/123.json` | Row directory or format file |
| `PathColumn` | `/orders/123/email` | Single column |
| `PathCapability` | `/orders/.by/customer_id/` | Mid-pipeline capability listing |
| `PathExport` | `/orders/.export/json` | Terminal bulk export |
| `PathInfo` | `/orders/.info/count` | Table metadata file |
| `PathBuild` | `/.build/blog` | Create synth workspace (write format name) |
| `PathHistory` | `/blog/.history/post.md/` | Version snapshots (file-first) |
| `PathDDL` | `/.create/mytable/sql` | DDL staging |

Hierarchical synth paths (e.g. `/blog/tutorials/intro.md/.history/`) use **scan-ahead**: non-dot segments before a known capability become `PrimaryKey` / `RawSubPath`, then capability parsing continues.

### Dual row representation

Every row is reachable two ways:

1. **Row directory** — `/<table>/<pk>/` appears in `ls`; contains per-column files and hidden format files (`.json`, `.csv`, `.tsv`, `.yaml`).
2. **Row file** — `/<table>/<pk>.json` (etc.) is readable but **not** listed, keeping `ls` output small.

Column files use type extensions (`.txt`, `.json`, `.bin`) per table metadata; full-row writes use PATCH semantics for `.json`/`.yaml`/`.tsv`/`.csv`.

### Format detection

Extensions `.json`, `.csv`, `.tsv`, `.yaml` set `ParsedPath.Format` for serialization in `readRowFile` via `internal/tigerfs/format`.
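
A minimal sketch of extension-based detection follows. `splitRowName` is a hypothetical helper, not the project's function; it only shows how a final path segment could be separated into a primary key and a format using the four extensions listed above.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// rowFormats lists the row-file extensions named in the text.
var rowFormats = map[string]bool{"json": true, "csv": true, "tsv": true, "yaml": true}

// splitRowName separates a final path segment into primary key and format.
// An unrecognized or missing extension leaves the name intact with format "".
func splitRowName(name string) (pk, format string) {
	ext := strings.TrimPrefix(path.Ext(name), ".")
	if rowFormats[ext] {
		return strings.TrimSuffix(name, "."+ext), ext
	}
	return name, ""
}

func main() {
	for _, name := range []string{"123.json", "123", "intro.md", "a.b.yaml"} {
		pk, format := splitRowName(name)
		fmt.Printf("%q -> pk=%q format=%q\n", name, pk, format)
	}
}
```

Only the last dot counts, so a key that itself contains dots (`a.b.yaml`) keeps everything before the recognized extension.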

## Dot-directory control surface

Dot-prefixed names are reserved **capability directories** (hidden from plain `ls`, visible with `ls -a`). User dotfiles (`.gitignore`, `.env`) are allowed in file-first workspaces when not in the reserved set.

Capabilities are centralized in `internal/tigerfs/fs/constants.go`:

| Group | Directories | Purpose |
|-------|-------------|---------|
| Navigation | `.by`, `.filter`, `.order`, `.first`, `.last`, `.sample`, `.all` | Filters, sort, pagination (pipeline) |
| Projection / I/O | `.columns`, `.export`, `.import` | Column select, bulk export/import |
| Metadata | `.info` | `count`, `ddl`, `schema`, `columns`, `indexes` |
| Schema / DDL | `.schemas`, `.create`, `.modify`, `.delete`, `.indexes`, `.views` | DDL staging and schema browsing |
| Synth / workspace | `.build`, `.format`, `.history`, `.tables` | Create/configure workspaces, history, backing tables |
| Undo / audit | `.log`, `.savepoint`, `.undo` | Operation log, bookmarks, preview/apply undo |
| Mount | `/.info/user` | Mount-level identity for log entries |

<Warning>
Creating files or directories whose names match reserved capability names returns `EACCES`. Hard links or renames into those names must enforce the same rule.
</Warning>

### File-first vs data-first surfaces

On **synth workspace** paths (markdown/plaintext views), `rejectDataFirstCapOnSynthWorkspace` blocks data-first pipeline capabilities (`.by`, `.filter`, `.export`, table DDL, etc.) at the workspace root. Allowed workspace controls include `.info`, `.history`, `.log`, `.savepoint`, `.undo`, `.format`.

Data-first access to the backing table uses **`/.tables/<view>/...`**, which parses with `Schema=tigerfs` and bypasses the gate.

## FSContext: path state → SQL parameters

`FSContext` is **immutable**: each capability segment returns a cloned context with updated fields:

- `Filters`: AND-combined from `.by/<col>/<val>/` (indexed) and `.filter/<col>/<val>/`
- `OrderBy` / `OrderDesc`: set by `.order/<col>/`; no further filters may follow an order stage
- `Limit` / `LimitType`: set by `.first`, `.last`, `.sample`; nested limits set `PreviousLimit*` and `NeedsSubquery()`
- `Columns`: set by `.columns/col1,col2/`; only `.export/` may follow
- `IsTerminal`: set after `.export/`
- `PipelineDepth`: incremented per stage and compared to `max_pipeline_depth` (default **10**, `0` = unlimited)

`AvailableCapabilities()` determines which capability subdirectories appear in `readDirTable`. When the depth limit is exceeded or `HideCapabilities` is set (`.log`/`.savepoint` in workspaces), listings omit further pipeline directories so that recursive scanners (`find`, `rm -rf`, agents) do not descend without bound.
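
The clone-per-segment and depth rules can be sketched with a reduced context type. `pipelineCtx`, its methods, and the `filter` type are hypothetical stand-ins for `FSContext`, and the sketch assumes that reaching `max_pipeline_depth` counts as exceeding it; the real comparison may differ.

```go
package main

import "fmt"

// filter and pipelineCtx are hypothetical, reduced stand-ins for FSContext.
type filter struct{ Column, Value string }

type pipelineCtx struct {
	Filters []filter
	Limit   int
	Depth   int
}

// withFilter returns a new context; the receiver is never mutated.
func (c pipelineCtx) withFilter(col, val string) pipelineCtx {
	next := c
	// Copy the slice so appending never writes into the parent's backing array.
	next.Filters = append(append([]filter(nil), c.Filters...), filter{col, val})
	next.Depth++
	return next
}

// withLimit returns a new context with a row limit applied.
func (c pipelineCtx) withLimit(n int) pipelineCtx {
	next := c
	next.Filters = append([]filter(nil), c.Filters...)
	next.Limit = n
	next.Depth++
	return next
}

// showCapabilities reports whether further pipeline directories should be
// listed. maxDepth == 0 means unlimited.
func (c pipelineCtx) showCapabilities(maxDepth int) bool {
	return maxDepth == 0 || c.Depth < maxDepth
}

func main() {
	base := pipelineCtx{}
	byCustomer := base.withFilter("customer_id", "123")
	limited := byCustomer.withLimit(10)

	fmt.Println(len(base.Filters), len(byCustomer.Filters), limited.Limit, limited.Depth) // 0 1 10 2
	fmt.Println(limited.showCapabilities(2), limited.showCapabilities(0))                // false true
}
```

Because each stage returns a copy, two paths that share a prefix (for example `.by/customer_id/123/.first/5/` and `.by/customer_id/123/.last/5/`) cannot interfere with each other's state.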

`ToQueryParams()` copies context into `db.QueryParams` for `QueryRowsPipeline` / `QueryRowsWithDataPipeline`.

### Pipeline → SQL

`internal/tigerfs/db/pipeline.go` builds one SQL statement from `QueryParams`:

- Simple: `WHERE` + `ORDER BY` + `LIMIT`
- Nested pagination: subquery wrapper (e.g. `.first/100/.last/50/`)
- Identifiers quoted via `QuoteIdent` / `QuoteTable`

Example path and effective query shape:

```bash
cat /mnt/db/orders/.by/customer_id/123/.order/created_at/.last/10/.export/json
```

```sql
SELECT * FROM "public"."orders"
WHERE "customer_id" = $1
ORDER BY "created_at" DESC
LIMIT 10
```

(Exact SQL depends on PK columns, projection, and nested limits.)
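
To make the simple case concrete, here is a minimal sketch of turning query parameters into one parameterized statement. It is not the code in `pipeline.go`: `queryParams`, `buildSelect`, and `quoteIdent` are hypothetical names, nested pagination and projection are left out, and the quoting follows PostgreSQL's rule of doubling embedded double quotes.

```go
package main

import (
	"fmt"
	"strings"
)

type filter struct{ Column, Value string }

// queryParams is a hypothetical, reduced version of db.QueryParams.
type queryParams struct {
	Schema, Table string
	Filters       []filter
	OrderBy       string
	OrderDesc     bool
	Limit         int
}

// quoteIdent wraps an identifier in double quotes and doubles embedded quotes.
func quoteIdent(s string) string {
	return `"` + strings.ReplaceAll(s, `"`, `""`) + `"`
}

// buildSelect returns the SQL text and its positional arguments.
// Values are always passed as parameters, never interpolated.
func buildSelect(p queryParams) (string, []any) {
	var sb strings.Builder
	var args []any
	sb.WriteString("SELECT * FROM " + quoteIdent(p.Schema) + "." + quoteIdent(p.Table))
	for i, f := range p.Filters {
		if i == 0 {
			sb.WriteString(" WHERE ")
		} else {
			sb.WriteString(" AND ")
		}
		args = append(args, f.Value)
		fmt.Fprintf(&sb, "%s = $%d", quoteIdent(f.Column), len(args))
	}
	if p.OrderBy != "" {
		sb.WriteString(" ORDER BY " + quoteIdent(p.OrderBy))
		if p.OrderDesc {
			sb.WriteString(" DESC")
		}
	}
	if p.Limit > 0 {
		fmt.Fprintf(&sb, " LIMIT %d", p.Limit)
	}
	return sb.String(), args
}

func main() {
	sql, args := buildSelect(queryParams{
		Schema: "public", Table: "orders",
		Filters: []filter{{"customer_id", "123"}},
		OrderBy: "created_at", OrderDesc: true, Limit: 10,
	})
	fmt.Println(sql)
	fmt.Println(args)
}
```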

## How Operations maps operations to SQL

| Operation | Entry | SQL path |
|-----------|-------|----------|
| List table / pipeline dir | `ReadDir` → `readDirTable` | `ListRows` or `QueryRowsPipeline(ToQueryParams())` |
| Read export | `ReadFile` → `readExportFile` | `QueryRowsWithDataPipeline` or `GetAllRows` |
| Read row | `readRowFile` | `GetRow` + format encode; synth → synthesized markdown/plaintext |
| Read column | `readColumnFile` | Column value query |
| Read `.info/*` | `readInfoFile` | Metadata queries (`count`, DDL, etc.) |
| Stat | `statWithParsed` | Metadata cache + size heuristics; export sizes from pipeline |

Default row listing cap: `dir_listing_limit` (default **1000**). Pipeline listings without an explicit limit inherit the same cap. A bare `ls` on the root of a large table may be refused; use `.first/`, `.sample/`, or index paths instead (see the large-table section of the spec).

`parsePath` also:

1. Fills empty `Context.Schema` from `GetCurrentSchema()` once per mount (`sync.Once`).
2. Applies synth workspace policy before any listing/read.

## Configuration knobs

<ParamField body="dir_listing_limit" type="int">
Max rows returned when listing a table without an explicit pipeline limit (default 1000).
</ParamField>

<ParamField body="max_pipeline_depth" type="int">
Hide further capability directories after this many pipeline stages (default 10; 0 = unlimited).
</ParamField>

<ParamField body="dir_filter_limit" type="int">
Threshold for `.filter/` distinct-value listing behavior on large tables (default 100000).
</ParamField>

CLI flags on `tigerfs mount` (`--max-ls-rows`, `--max-pipeline-depth`, etc.) populate the same `config.Config` struct that is passed to `NewOperations`.

## Mental model (compact)

```text
/mount/
├── .build/ .create/ .schemas/ .tables/ .views/ .info/   # mount control
├── <table>/                    # public (or default) table
│   ├── .by/ .filter/ .order/ … .export/   # pipeline (data-first)
│   ├── .info/ .indexes/ .modify/ .delete/ # metadata + DDL
│   ├── <pk>/                   # row dir
│   │   └── col.txt  age  .json …
│   └── <pk>.json               # row file (not in ls)
└── <schema>/<table>/           # non-default schema
```

A synth workspace such as `notes/` looks like a directory of files (`hello.md`), but its rows live in `tigerfs` backing tables; `/.tables/notes/` exposes the relational shape.

## Verification

<Steps>
<Step title="Mount and inspect root">
```bash
tigerfs mount postgres://localhost/mydb /mnt/db
ls /mnt/db
ls -a /mnt/db/notes/    # expect .history, .log, etc. on history-enabled workspaces
```
</Step>
<Step title="Trace a pipeline path">
```bash
ls /mnt/db/orders/.by/customer_id/
ls /mnt/db/orders/.by/customer_id/123/.last/10/
cat /mnt/db/orders/.by/customer_id/123/.last/10/.export/json | head
```
</Step>
<Step title="Confirm backing-table route">
```bash
ls /mnt/db/.tables/notes/    # data-first columns/rows for synth app "notes"
```
</Step>
</Steps>

Run with `--debug` to see structured zap logs when paths return `EINVAL`/`ENOENT` (hints often include the `/.tables/...` alternative for blocked synth paths).

## Related pages

<CardGroup>
<Card title="Overview" href="/overview">
What TigerFS exposes at mount time, runtime assumptions, and where to start reading.
</Card>
<Card title="Capability directories" href="/capability-directories">
Pipeline grammar, chaining rules, and PathType resolution in depth.
</Card>
<Card title="File-first and data-first" href="/file-first-and-data-first">
Mode detection, `.build`/`.format`, and `.tables/` backing schema.
</Card>
<Card title="Data formats reference" href="/data-formats-reference">
TSV, CSV, JSON, YAML encoding and PATCH write semantics.
</Card>
<Card title="Platform backends" href="/platform-backends">
FUSE vs NFS adapters and delegation to `fs.Operations`.
</Card>
<Card title="Consistency and caching" href="/consistency-and-caching">
Fresh reads vs metadata cache TTLs and write invalidation.
</Card>
</CardGroup>
