Architecture Overview
Cortex is built from a small number of components, each with a narrow responsibility. The boundaries between them are deliberate — they let different parts run on different schedules, fail independently, and be replaced without rewiring the rest.
Components
Capture scripts
Small CLI utilities that write events to the inbox. The reference
implementation has scripts for session-stop captures, pre-compact emergency
saves, manual captures, and meeting ingest. Each writes the same shape of
file: YAML frontmatter plus a body, into ~/.cortex/inbox/pending/.
A capture script does not touch the database, does not call the LLM, and does not enforce any schema beyond the frontmatter. Its only job is to write a file atomically and emit the new event ID.
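A minimal sketch of the atomic write described above, assuming a write-to-temp-then-rename approach. The function name `capture`, the `evt-` ID scheme, and the frontmatter fields (`id`, `kind`, `created`) are illustrative assumptions, not the reference implementation's actual format.

```python
import os
import tempfile
import uuid
from datetime import datetime, timezone
from pathlib import Path


def capture(body: str, kind: str, inbox: Path) -> str:
    """Write one event file atomically and return its ID."""
    pending = inbox / "pending"
    pending.mkdir(parents=True, exist_ok=True)

    event_id = f"evt-{uuid.uuid4().hex[:12]}"  # hypothetical ID scheme
    created = datetime.now(timezone.utc).isoformat()
    # Hypothetical frontmatter fields; the real schema may differ.
    text = (
        "---\n"
        f"id: {event_id}\n"
        f"kind: {kind}\n"
        f"created: {created}\n"
        "---\n"
        f"{body}\n"
    )

    # Write to a temp file in the same directory, then rename it into place.
    # The ".partial" suffix keeps a half-written file from matching "*.md".
    fd, tmp_name = tempfile.mkstemp(
        dir=str(pending), prefix=".tmp-", suffix=".partial"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, pending / f"{event_id}.md")
    except BaseException:
        # Clean up the temp file if the write or rename failed.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return event_id
```

Because the rename happens within one directory, a reader of `pending/` sees either no file or the complete file, never a partial one.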
Normalizer
The single-writer process that turns events into records. Runs on demand or on a schedule. Holds an exclusive lockfile while running so concurrent normalize calls do not race.
For each event the normalizer:
- Parses the frontmatter.
- Walks the body and decides whether the event yields one record or many.
- Generates record IDs and writes record files into PARA directories.
- Indexes the new records in the SQLite sidecar.
- Moves the event from pending/ to processed/ (or failed/).
A successful normalize is idempotent at the event level: re-processing a
moved event is a no-op because it is no longer in pending/.
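The locking and move steps above can be sketched roughly as follows. This is an illustration under assumptions: the lockfile name `.normalize.lock`, the `handle_event` callback (standing in for record writing and indexing), and the flat `key: value` frontmatter parser are all hypothetical, and a real normalizer would also need a policy for stale lockfiles left by a crash.

```python
import os
import shutil
from pathlib import Path


def parse_event(text: str) -> tuple[dict[str, str], str]:
    """Split an event into flat key: value frontmatter and a body."""
    if not text.startswith("---\n"):
        raise ValueError("missing frontmatter")
    header, sep, body = text[3:].partition("\n---\n")
    if not sep:
        raise ValueError("unterminated frontmatter")
    meta = {}
    for line in header.splitlines():
        key, colon, value = line.partition(":")
        if colon:
            meta[key.strip()] = value.strip()
    return meta, body


def normalize(inbox: Path, handle_event) -> int:
    """Process every pending event once; return how many succeeded."""
    for name in ("pending", "processed", "failed"):
        (inbox / name).mkdir(parents=True, exist_ok=True)
    lock_path = inbox / ".normalize.lock"  # hypothetical lockfile name
    # O_EXCL makes creation fail with FileExistsError if the lock is held.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    done = 0
    try:
        for event in sorted((inbox / "pending").glob("*.md")):
            try:
                meta, body = parse_event(event.read_text(encoding="utf-8"))
                handle_event(meta, body)  # write records, update the index
                target = inbox / "processed"
                done += 1
            except Exception:
                target = inbox / "failed"
            # Either way the event leaves pending/, so a rerun skips it.
            shutil.move(str(event), str(target / event.name))
    finally:
        os.close(fd)
        os.unlink(lock_path)
    return done
```

Running `normalize` a second time over the same inbox returns 0 in this sketch, which is the event-level idempotence described above.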
Filesystem store
A directory tree at ~/.cortex/:
    ~/.cortex/
    ├── config.yaml
    ├── cortex.db           SQLite sidecar
    ├── index.md            human-readable top-level index
    ├── inbox/
    │   ├── pending/        events awaiting normalization
    │   ├── processed/      normalized events (kept for audit)
    │   └── failed/         events that errored out
    ├── projects/           PARA: active project work
    ├── areas/              PARA: ongoing responsibilities
    ├── resources/          PARA: reference material
    ├── archive/            PARA: closed/completed
    ├── meetings/           meeting records and per-meeting indexes
    ├── daily/              daily journal entries
    └── templates/          record templates

The PARA directories contain rec-*.md files. Sub-paths are allowed but
not required; a flat layout is fine.
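Since sub-paths are optional, anything that enumerates records has to walk each PARA directory recursively. A small sketch of that walk, where the helper name `iter_records` and the `PARA_DIRS` constant are assumptions for illustration:

```python
from pathlib import Path

# The four PARA directories from the tree above.
PARA_DIRS = ("projects", "areas", "resources", "archive")


def iter_records(root: Path):
    """Yield (scope, path) for every rec-*.md file, flat or nested."""
    for scope in PARA_DIRS:
        base = root / scope
        if base.is_dir():
            # rglob matches files directly in base and in any sub-path.
            for path in sorted(base.rglob("rec-*.md")):
                yield scope, path
```

Files that do not match `rec-*.md`, such as loose notes, are skipped in this sketch.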
SQLite sidecar
A single database file (cortex.db) that indexes the filesystem. Tables
include:
- A full-text index over record bodies for recall queries.
- A records table with stable IDs, scopes, slugs, and frontmatter fields.
- An access events table for activation tracking.
- A view that aggregates access counts and recency by record.
- Tables for meeting-specific dedup, related-to relationships, and provenance back to source events.
The sidecar is rebuildable from the filesystem alone. Losing it means losing indexes, not data.
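To make the activation-tracking part concrete, here is one way the records table, access events table, and aggregating view could fit together. The table, column, and view names are assumptions rather than the actual cortex.db schema, and the full-text index and meeting tables are left out.

```python
import sqlite3

# Hypothetical schema: names and columns are illustrative only.
SCHEMA = """
CREATE TABLE records (
    id    TEXT PRIMARY KEY,
    scope TEXT NOT NULL,
    slug  TEXT NOT NULL
);
CREATE TABLE access_events (
    record_id   TEXT NOT NULL REFERENCES records(id),
    accessed_at TEXT NOT NULL  -- ISO 8601 UTC timestamp
);
CREATE VIEW record_activation AS
SELECT r.id               AS record_id,
       COUNT(a.record_id) AS access_count,
       MAX(a.accessed_at) AS last_accessed
FROM records AS r
LEFT JOIN access_events AS a ON a.record_id = r.id
GROUP BY r.id;
"""


def open_sidecar(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create the sketch schema in it."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

The `LEFT JOIN` keeps records that have never been accessed in the view, with a count of zero and no recency value. Because the view is derived, only the two tables need to be repopulated after a rebuild.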
MCP server
A long-running process that exposes the canonical operations to AI clients
over MCP: capture, recall, reflect, status, supersede. The server
is the only thing that AI clients talk to directly — they never touch
the filesystem or the SQLite sidecar themselves.
Lifecycle hooks
Small scripts wired into the AI client’s hook system:
- SessionStart: calls recall and injects relevant records.
- PreCompact: emergency-captures the current session.
- Stop: captures the session for later normalization.
Hooks are thin: they shell out to capture scripts or the MCP server rather than implementing logic themselves.
Distillation loop
A periodic process that walks high-volume scopes and produces rolled-up summary records. Originals are not deleted; summaries are added as new records and can be recalled directly.
How the components compose
    AI client ──MCP──▶ MCP server ──▶ recall query ──▶ SQLite + filesystem
        │
        └──hooks──▶ capture scripts ──▶ inbox/pending
                                            │
                                            ▼
                                       normalizer ──▶ records + SQLite
                                            │
                                            ▼
                                       distillation (periodic)

The arrow shapes are doing real work here:
- MCP is the synchronous request/response surface for AI clients.
- Hooks fan out to small scripts that do not need to wait for anything.
- The inbox is the asynchronous boundary between capture and normalize.
Capture is fast because it has nothing to do but write a file. Recall is fast because it reads an indexed sidecar. Normalize is allowed to take its time because nothing waits on it.