Recall

recall is how AI clients ask Cortex for relevant memory. It is the read-side counterpart to capture and the operation most session lifecycle hooks ultimately call into.

Recall takes structured arguments:

| Argument | Description |
|---|---|
| scope | A scope identifier, e.g. meetings, meeting:<id>, a project name, or an area name |
| query | An optional free-text query, used for FTS ranking |
| relation | An optional record ID; recall returns records related to it |
| limit | Maximum number of records to return (a sensible default applies when omitted) |
| since | An optional time bound, e.g. 7d, 1m |

It returns a ranked list of record summaries. Each summary carries the record ID, type, scope, summary line, and last access timestamp.
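The request and response shapes described above can be sketched as plain data classes. This is a minimal illustration, not Cortex's actual types: the class names, the field types, the default `limit` of 20, and the ISO 8601 timestamp format are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RecallRequest:
    """Hypothetical container for the recall arguments."""
    scope: str                      # e.g. "meetings", "meeting:<id>", a project or area name
    query: Optional[str] = None     # free text, used for FTS ranking when present
    relation: Optional[str] = None  # record ID; recall returns records related to it
    limit: int = 20                 # assumed default; the real default is not documented here
    since: Optional[str] = None     # time bound such as "7d"


@dataclass(frozen=True)
class RecordSummary:
    """Hypothetical container for one entry in the result list."""
    id: str
    type: str
    scope: str
    summary: str
    last_accessed: str              # timestamp; ISO 8601 is an assumption


# Only scope is required; every other argument is optional.
request = RecallRequest(scope="meetings", query="budget review", since="7d")
```

Keeping the request immutable makes it safe to log or cache alongside the results it produced.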

Three signals combine into the rank:

  • FTS relevance. SQLite FTS5 score against the query, when a query is provided.
  • Recency. Records updated more recently rank higher.
  • Activation. Records accessed more recently and more frequently rank higher. Activation is computed from the access events table; the formula uses a configurable decay over time.

The three signals are blended with configurable weights. Activation has a moderate weight by default — high enough to surface frequently-used records, low enough that brand-new records can compete.

When no query is provided, FTS drops out and the rank reduces to a recency-plus-activation blend.
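One way to picture the blend is a weighted sum over three normalised scores. The sketch below is an assumption-laden illustration, not Cortex's formula: the exponential half-life decay, the squashing of activation into the range 0 to 1, the weight values, and all function names are hypothetical. It also assumes the FTS score has already been normalised so that higher is better; raw FTS5 `bm25()` values are lower-is-better and would need converting first.

```python
from typing import Iterable, Optional

DAY = 86400.0  # seconds per day


def recency_score(updated_at: float, now: float, half_life_days: float = 30.0) -> float:
    """1.0 for a record updated right now, halving every half_life_days."""
    age_days = max(0.0, now - updated_at) / DAY
    return 0.5 ** (age_days / half_life_days)


def activation_score(access_times: Iterable[float], now: float,
                     half_life_days: float = 7.0) -> float:
    """Sum of decayed access events, squashed into [0, 1)."""
    total = sum(0.5 ** (max(0.0, now - t) / DAY / half_life_days) for t in access_times)
    return total / (1.0 + total)


def blended_rank(fts: Optional[float], recency: float, activation: float,
                 weights=(0.5, 0.3, 0.2)) -> float:
    """Weighted blend; weights are (fts, recency, activation) and are assumed values."""
    w_fts, w_rec, w_act = weights
    if fts is None:
        # No query: FTS drops out, so renormalise over the remaining two signals.
        return (w_rec * recency + w_act * activation) / (w_rec + w_act)
    return w_fts * fts + w_rec * recency + w_act * activation
```

With these assumed weights, a brand-new record with no access history still receives full recency credit, which is the property the moderate activation weight is meant to preserve.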

The scope argument is the primary filter. Cortex understands several scope shapes:

  • Project name — records routed to projects/ whose scope field matches.
  • Area name — records routed to areas/.
  • meetings — all meeting records.
  • meeting:<id> — records derived from a specific meeting, including action items and decisions.
  • recent — records updated in the last N days (configurable default).
  • open — records with status: open, regardless of scope.

The set of recognised scopes is small but extensible. New scopes are added by registering a query template against the records table.
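Registering a scope as a query template might look like the following. This is a sketch under stated assumptions: the registry functions, the regular-expression matching, and the column names (`type`, `meeting_id`, `status`, `scope`) are hypothetical and are not taken from the Cortex schema.

```python
import re
from typing import Callable, List, Optional, Tuple

# A resolver maps a scope string to (WHERE fragment, parameters), or None if it does not apply.
Resolver = Callable[[str], Optional[Tuple[str, list]]]
_resolvers: List[Resolver] = []


def register_scope(pattern: str, where: str) -> None:
    """Register a WHERE-clause template for scopes that fully match `pattern`."""
    regex = re.compile(pattern)

    def resolver(scope: str) -> Optional[Tuple[str, list]]:
        match = regex.fullmatch(scope)
        if match is None:
            return None
        # Captured groups become bound parameters, never interpolated SQL.
        return where, list(match.groups())

    _resolvers.append(resolver)


def resolve_scope(scope: str) -> Tuple[str, list]:
    """Return the first matching template, falling back to a project or area name."""
    for resolver in _resolvers:
        hit = resolver(scope)
        if hit is not None:
            return hit
    return "scope = ?", [scope]


# Hypothetical registrations for three of the scope shapes listed above.
register_scope(r"meetings", "type = 'meeting'")
register_scope(r"meeting:(.+)", "meeting_id = ?")
register_scope(r"open", "status = 'open'")
```

Binding the captured ID as a parameter keeps client-supplied scope strings out of the SQL text itself.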

A successful recall has one side effect: it inserts access events for the returned records into the access events table. This feeds activation for subsequent recalls.

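The side effect is idempotent only in a loose sense. Running the same recall twice inserts two sets of access events, but each event is timestamped and the activation calculation treats repeated accesses as ordinary input, so a repeated recall is absorbed as one more access rather than an error or a duplicate.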
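The write path for access events can be sketched with Python's built-in `sqlite3` module. The table name `access_events`, its two columns, and the `record_access` helper are assumptions made for illustration; the real schema may differ.

```python
import sqlite3
import time


def record_access(conn: sqlite3.Connection, record_ids, now=None) -> None:
    """Insert one timestamped access event per returned record, in a single transaction."""
    ts = time.time() if now is None else now
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO access_events (record_id, accessed_at) VALUES (?, ?)",
            [(rid, ts) for rid in record_ids],
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE access_events (record_id TEXT NOT NULL, accessed_at REAL NOT NULL)"
)

# Two recalls: the first returns two records, the second returns one of them again.
record_access(conn, ["rec-1", "rec-2"], now=1000.0)
record_access(conn, ["rec-1"], now=2000.0)
```

Because events are appended rather than upserted, the table keeps the full access history that a decayed activation score needs.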

Recall is the most frequent Cortex operation and the most latency-sensitive. The MCP server keeps it cheap by:

  • Holding the SQLite connection open across requests (within a server lifetime).
  • Using prepared statements for the hot ranking query.
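  • Defining the activation aggregate as a SQL VIEW so the ranking query joins against a single, fixed definition. (A plain SQLite view is not materialized, so this keeps the query shape stable rather than caching rows.)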
  • Returning record summaries from the records table rather than re-reading the underlying markdown files. The body is only fetched when the client explicitly asks for it.
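A rough sketch of the first, second, and last points, using Python's `sqlite3` module: the connection is opened once, the hot query is a constant SQL string (the module keeps a per-connection cache of compiled statements, so reusing the same string avoids re-preparing it), and the markdown body is read only on demand. The `RecallStore` class, the `records` columns, and the `path` column are hypothetical.

```python
import sqlite3
from pathlib import Path
from typing import Optional

# Constant SQL text so the connection's statement cache can reuse the compiled statement.
SUMMARY_SQL = (
    "SELECT id, type, scope, summary FROM records "
    "WHERE scope = ? ORDER BY updated_at DESC LIMIT ?"
)


class RecallStore:
    def __init__(self, db_path: str) -> None:
        # Opened once and held for the lifetime of the server process.
        self.conn = sqlite3.connect(db_path)

    def summaries(self, scope: str, limit: int = 20) -> list:
        """Hot path: summary columns only, no file I/O."""
        return self.conn.execute(SUMMARY_SQL, (scope, limit)).fetchall()

    def body(self, record_id: str) -> Optional[str]:
        """Cold path: read the markdown file only when the client asks for it."""
        row = self.conn.execute(
            "SELECT path FROM records WHERE id = ?", (record_id,)
        ).fetchone()
        return None if row is None else Path(row[0]).read_text(encoding="utf-8")


store = RecallStore(":memory:")
store.conn.execute(
    "CREATE TABLE records (id TEXT PRIMARY KEY, type TEXT, scope TEXT, "
    "summary TEXT, updated_at REAL, path TEXT)"
)
store.conn.execute(
    "INSERT INTO records VALUES "
    "('rec-1', 'decision', 'cortex', 'Use FTS5', 10.0, 'projects/cortex/rec-1.md')"
)
rows = store.summaries("cortex")
```

Splitting summaries from bodies means a typical recall touches only the database, never the filesystem.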

Typical recall latency on a modest store is in the low milliseconds.

Recall is deliberately narrow, and it helps to be clear about what it is not:

  • It is not full-text search of the entire filesystem. It searches the FTS index, which is populated by the normalizer. Records that have not been normalized are not visible.
  • It is not a vector store. There are no embeddings. Retrieval is FTS plus metadata, not semantic similarity.
  • It is not a graph traversal. The relation argument is a single hop. To follow chains, the client makes multiple calls.
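Because the relation argument covers a single hop, following a chain is the client's job. The sketch below shows one way a client might do that with repeated calls. The `follow_chain` helper and the shape of the `recall` callable (record ID in, list of related record IDs out) are assumptions; the in-memory `links` dictionary stands in for a real Cortex server.

```python
from typing import Callable, List


def follow_chain(recall: Callable[[str], List[str]], start_id: str,
                 max_hops: int = 3) -> List[str]:
    """Breadth-first walk over single-hop recall calls, bounded by max_hops."""
    seen = {start_id}
    frontier = [start_id]
    found: List[str] = []
    for _ in range(max_hops):
        next_frontier: List[str] = []
        for record_id in frontier:
            # One recall call per record: each call is a single hop.
            for related in recall(record_id):
                if related not in seen:
                    seen.add(related)
                    found.append(related)
                    next_frontier.append(related)
        if not next_frontier:
            break  # nothing new was discovered, so the chain has ended
        frontier = next_frontier
    return found


# Stand-in for the server: a -> b -> c, with a back-reference from b to a.
links = {"a": ["b"], "b": ["c", "a"], "c": []}


def fake_recall(record_id: str) -> List[str]:
    return links.get(record_id, [])


chain = follow_chain(fake_recall, "a")
```

The `seen` set guards against cycles, and `max_hops` caps how many round trips a single lookup can trigger.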

The simplicity is the point. FTS plus activation plus a small number of scopes covers most retrieval patterns without an embedding pipeline to maintain.