Storage Layout

The on-disk layout is deliberately simple. Anything that knows how to read a directory and a markdown file can read Cortex.

~/.cortex/
├── config.yaml
├── cortex.db
├── index.md
├── inbox/
│   ├── pending/
│   ├── processed/
│   └── failed/
├── projects/
├── areas/
├── resources/
├── archive/
├── meetings/
├── daily/
└── templates/

config.yaml is the single configuration file. It contains:

  • The owner identifier, used for provenance.
  • Retrieval settings (activation enable/disable, decay, weight).
  • Normalize defaults (auto-commit, dedup behaviour).
  • Hook-related toggles.

Configuration is loaded lazily by each script; there is no central daemon that reads it once at boot.

cortex.db is the SQLite sidecar. It indexes the filesystem but does not own the data.

index.md is a top-level, human-readable index generated by the rebuild script. It lists active projects, open items, and recently active scopes. The Cortex MCP server returns a structured version of it for status calls.

The inbox/ directory has three subdirectories:

  • pending/ — events awaiting normalization. The capture scripts only ever touch this directory.
  • processed/ — events that successfully normalized. Kept for audit and replay.
  • failed/ — events that errored during normalize. Kept for inspection.

Events here are full markdown files with frontmatter, not opaque blobs.
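
The lifecycle above (pending, then processed or failed) can be sketched roughly as follows. This is a minimal illustration, not the actual normalizer: `route_event` and the `normalize` callable are hypothetical names, and the real normalize step may do more than move the file.

```python
import os
from pathlib import Path
from typing import Callable


def route_event(event_path: Path, normalize: Callable[[str], None]) -> Path:
    """Move one pending event to processed/ or failed/ (hypothetical helper).

    `normalize` stands in for whatever the real normalizer does with the
    event text; it is assumed to raise an exception on failure.
    """
    inbox = event_path.parent.parent  # .../inbox/pending/evt-*.md -> .../inbox
    try:
        normalize(event_path.read_text(encoding="utf-8"))
        dest_dir = inbox / "processed"
    except Exception:
        # Keep the event around for inspection instead of dropping it.
        dest_dir = inbox / "failed"
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / event_path.name
    os.replace(event_path, dest)  # same filesystem, so the move is a rename
    return dest
```

Because the event keeps its filename when it moves, an event ID seen in a record's provenance can be looked up in processed/ later for audit or replay.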

The four PARA directories — projects/, areas/, resources/, archive/ — hold normalized records. The split is the standard PARA distinction: active work, ongoing responsibilities, reference material, inactive.

A record can move between PARA directories over its lifetime — for example, from projects/ to archive/ when a project closes. The move is just a git mv; the record’s ID stays stable.

  • meetings/ — meeting records, plus a per-meetings _index.md.
  • daily/ — daily journal entries, one file per day.
  • templates/ — record templates the normalizer uses for common patterns (action items, decisions, etc.).

These are not a separate system from PARA; they are sibling directories in the same tree. They exist because their access patterns differ enough to warrant their own top-level directories.

Event files are named:

~/.cortex/inbox/pending/evt-<timestamp>-<session_short>-<seq>.md
  • <timestamp> is YYYYMMDDTHHMMSS in the local timezone.
  • <session_short> is the first 8 characters of the source session ID, or manual for manually triggered captures.
  • <seq> is a 3-digit sequence number for collision avoidance within the same second.

Record files are named:

~/.cortex/<para_bucket>/rec-<scope>-<slug>-<seq>.md
  • <para_bucket> is one of projects/, areas/, resources/, archive/, meetings/, or daily/.
  • <scope> is a short identifier. For projects, it is the project hint. For meetings, it is meeting. For action items, action-item. For decisions, decision.
  • <slug> is a slugified summary of the record content, capped at 20 characters.
  • <seq> is a 3-digit sequence number, scoped to the bucket.
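
The two naming conventions above can be sketched as a pair of small helpers. The function names and the exact slug rules (lowercase, non-alphanumerics collapsed to hyphens) are assumptions made for illustration; only the overall patterns, the 8-character session prefix, the `manual` fallback, the 20-character slug cap, and the 3-digit sequence come from the text.

```python
import re
from datetime import datetime
from typing import Optional


def event_filename(session_id: Optional[str], seq: int,
                   now: Optional[datetime] = None) -> str:
    """Build evt-<timestamp>-<session_short>-<seq>.md (hypothetical helper)."""
    if not 0 <= seq <= 999:
        raise ValueError("seq must fit in 3 digits")
    now = now or datetime.now()  # naive datetime, i.e. local time
    timestamp = now.strftime("%Y%m%dT%H%M%S")
    # No session ID means a manually triggered capture.
    session_short = session_id[:8] if session_id else "manual"
    return f"evt-{timestamp}-{session_short}-{seq:03d}.md"


def slugify(text: str, max_len: int = 20) -> str:
    """Lowercase, hyphen-separated slug capped at max_len (assumed rules)."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return slug[:max_len].rstrip("-")


def record_filename(scope: str, summary: str, seq: int) -> str:
    """Build rec-<scope>-<slug>-<seq>.md (hypothetical helper)."""
    if not 0 <= seq <= 999:
        raise ValueError("seq must fit in 3 digits")
    return f"rec-{scope}-{slugify(summary)}-{seq:03d}.md"
```

For example, `event_filename(None, 0, datetime(2024, 1, 1))` gives `evt-20240101T000000-manual-000.md`, and `record_filename("decision", "Switch to SQLite WAL mode for indexing", 3)` gives `rec-decision-switch-to-sqlite-wal-003.md`.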

Every record starts with YAML frontmatter, at minimum:

---
record_id: rec-<scope>-<slug>-<seq>
type: <project|feedback|reference|meeting|action-item|decision|context>
status: <open|closed|superseded>
created: <ISO timestamp>
updated: <ISO timestamp>
provenance:
  event_ids: [evt-...]
  source_session_id: <id>
related_to: [rec-..., rec-...]
---

Body is plain markdown. Sections, lists, code, links — whatever fits the record.
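
Since records are plain files, a reader needs very little to separate frontmatter from body. The sketch below splits on the `---` delimiters and checks that the top-level keys listed above are present. It is illustrative only: `split_frontmatter`, `missing_keys`, and `REQUIRED_KEYS` are hypothetical names, and a real reader would likely hand the frontmatter text to a proper YAML parser rather than scan lines.

```python
from typing import Set, Tuple

# Top-level keys from the minimum frontmatter shown above (assumed required).
REQUIRED_KEYS = {"record_id", "type", "status", "created", "updated", "provenance"}


def split_frontmatter(text: str) -> Tuple[str, str]:
    """Return (frontmatter, body) for a record file's text."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("record does not start with frontmatter")
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    raise ValueError("unterminated frontmatter")


def missing_keys(frontmatter: str) -> Set[str]:
    """Return required top-level keys that do not appear in the frontmatter."""
    present = {
        line.split(":", 1)[0]
        for line in frontmatter.splitlines()
        # Indented lines are nested values, not top-level keys.
        if line and not line[0].isspace() and ":" in line
    }
    return REQUIRED_KEYS - present
```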

Both capture and normalize use the same pattern: write to a temp file in the target directory, then os.replace() to the final name. This is atomic on local POSIX filesystems and avoids partial files on crash.
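
A minimal sketch of that write pattern, assuming Python (which the `os.replace()` call suggests). The `atomic_write` name, the temp-file prefix, and the `fsync` call are additions for illustration, not details confirmed by the source.

```python
import os
import tempfile
from pathlib import Path


def atomic_write(path: Path, text: str) -> None:
    """Write text to path via a temp file in the same directory."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live in the target directory so that the final
    # rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=".tmp-", suffix=".md")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())  # assumption: flush to disk before renaming
        os.replace(tmp_name, path)  # readers see the old file or the new one
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

A crash before `os.replace()` leaves at most a stray temp file; the final name never points at a half-written record.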

The lockfile for normalize sits at ~/.cortex/.normalize.lock and is held for the duration of a single normalize pass.
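
The source does not say how the lock is taken. One common approach, shown here purely as an assumption, is exclusive file creation: the pass that manages to create the file holds the lock, and any other pass fails fast. `normalize_lock` is a hypothetical name.

```python
import os
from contextlib import contextmanager
from pathlib import Path

LOCK_PATH = Path.home() / ".cortex" / ".normalize.lock"


@contextmanager
def normalize_lock(lock_path: Path = LOCK_PATH):
    """Hold the lockfile for the duration of the with-block."""
    # O_CREAT | O_EXCL raises FileExistsError if the file already exists,
    # which here means another normalize pass holds the lock.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        os.write(fd, str(os.getpid()).encode())  # record the holder for debugging
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)
```

Usage would look like `with normalize_lock(): run_pass()`, where `run_pass` is a placeholder. One trade-off of this scheme is that a hard crash leaves a stale lockfile behind; an advisory lock such as `fcntl.flock` on POSIX is released automatically when the process exits.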