User Messages Are the Signal — Agent Session Recall Without LLM Extraction
directly comparable to joelclaw observe pipeline — raw message preservation vs LLM extraction, worth benchmarking retrieval quality against Typesense semantic search
Alex Hillman built Kuato to solve the amnesia problem that hits anyone running Claude Code seriously: the model forgets everything between sessions. His answer is to mine the JSONL session transcripts Claude Code already writes to `~/.claude/projects`, extracting the parts that actually matter. The key insight: user messages are the signal. You don’t need the full 50k-token transcript. The user’s requests, decisions, and corrections — combined with files_touched and tools_used — reconstruct what happened cleanly and cheaply.
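A minimal sketch of that extraction step, assuming a transcript schema where each JSONL line has a `type` field and a `message` object with string `content` — the field names here are my guess at the general shape of Claude Code transcripts, not Kuato's actual parsing code:

```typescript
// Pull user messages out of a Claude Code-style JSONL session transcript.
// Field names ("type", "message", "content") are assumptions for illustration.
interface TranscriptLine {
  type?: string;
  message?: { role?: string; content?: unknown };
}

export function extractUserMessages(jsonl: string): string[] {
  const messages: string[] = [];
  for (const line of jsonl.split("\n")) {
    if (!line.trim()) continue;
    let entry: TranscriptLine;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // skip malformed lines rather than failing the whole file
    }
    if (entry.type !== "user") continue;
    const content = entry.message?.content;
    if (typeof content === "string") messages.push(content);
  }
  return messages;
}
```

The point of the design: the extractor ignores assistant output entirely, so the stored record stays small while keeping every request, decision, and correction verbatim.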
The name comes from the mutant in Total Recall who helps Quaid remember who he is. The quote from the README earns its place: “You are what you do. A man is defined by his actions, not his memory.” In session terms: what you asked for is more durable than what the model generated. “Let’s build an email filtering system” → “yes, use that approach” → “actually make it async” → “commit this” — that’s the whole session in four user messages.
Two modes ship: file-based (zero dependencies, Bun only, substring search on JSONL, works in minutes) and PostgreSQL (tsvector full-text with weighted ranking — user messages weighted B, tools weighted C, working directory weighted D, GIN index for sub-100ms lookup). The tiered approach is smart. Start with zero setup, upgrade to Postgres when session volume makes scanning slow. File paths get tokenized by splitting on `/`, `-`, `_`, and `.`, so `septa-holiday-bus/App.jsx` becomes searchable tokens — search “septa” and find sessions that touched that directory.
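The tokenization rule is simple enough to sketch in a few lines. This is my reconstruction of the behavior described above, not Kuato's source; the function name is hypothetical:

```typescript
// Split a file path on "/", "-", "_", and "." so directory and file
// names become individual search tokens (lowercased, empties dropped).
export function tokenizePath(path: string): string[] {
  return path
    .split(/[/\-_.]/)
    .map((token) => token.toLowerCase())
    .filter((token) => token.length > 0);
}

// tokenizePath("septa-holiday-bus/App.jsx")
// → ["septa", "holiday", "bus", "app", "jsx"]
```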
This sits directly adjacent to the joelclaw recall and observe pipeline but makes a different bet. The joelclaw observe pipeline uses LLM extraction to distill facts from sessions into semantic memory — structured compression, clean assertions, searchable via Typesense. Kuato preserves raw user messages as-is: cheaper (no LLM call at ingest), preserves original phrasing, no extraction loss. The tradeoff is real: LLM extraction compresses but loses nuance; raw messages preserve intent but are noisier at retrieval time. Worth running both and comparing what you actually find when you search.
Key Ideas
- User messages are the semantic core of a session — requests, decisions, corrections — not model responses. The model’s output is derivable from the user’s input; the reverse isn’t true
- Two-tier architecture: file-based (zero-dep, instant) → PostgreSQL (tsvector, GIN index, weighted ranking) — start simple, graduate when volume demands it
- tsvector weighting maps to signal strength: user messages (B) > tools used (C) > working directory (D) — higher weight for higher signal
- File path tokenization: paths split on `/`, `-`, `_`, `.` make directory names like `septa-holiday-bus` searchable without special handling
- Named after the Total Recall mutant — “you are what you do, not what you remember” — the session identity is in the actions
- Tradeoff vs LLM extraction: raw messages = zero cost, preserves intent, noisier; LLM extraction = compressed, structured, lossy — explicit design choice, not an oversight
- No instrumentation required — works directly on the `~/.claude/projects` JSONL files Claude Code already writes
- Token tracking baked into the PostgreSQL version, including per-model breakdowns — useful for understanding session cost distribution
- The skill template in `shared/claude-skill.md` shows how to teach Claude to query its own history — recursive session awareness
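The weighted-ranking idea maps directly onto standard PostgreSQL full-text primitives (`setweight`, `to_tsvector`, `ts_rank`, GIN). A sketch of what that schema could look like, expressed as SQL strings — the table and column names are assumptions for illustration, not Kuato's actual schema:

```typescript
// Hypothetical schema: a generated tsvector column combining the three
// signal tiers at weights B/C/D, plus a GIN index over it.
export const createIndexSql = `
  ALTER TABLE sessions ADD COLUMN search_vector tsvector
    GENERATED ALWAYS AS (
      setweight(to_tsvector('english', coalesce(user_messages, '')), 'B') ||
      setweight(to_tsvector('english', coalesce(tools_used, '')), 'C') ||
      setweight(to_tsvector('english', coalesce(working_directory, '')), 'D')
    ) STORED;
  CREATE INDEX sessions_search_idx ON sessions USING GIN (search_vector);
`;

// Ranked lookup: ts_rank folds the per-tier weights into the score, so a
// hit in a user message outranks the same term in a working directory.
export const querySql = `
  SELECT id, ts_rank(search_vector, query) AS rank
  FROM sessions, websearch_to_tsquery('english', $1) AS query
  WHERE search_vector @@ query
  ORDER BY rank DESC
  LIMIT 10;
`;
```

With the GIN index in place, the `@@` match is an index scan rather than a sequential one, which is what makes the sub-100ms claim plausible at volume.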
Links
- alexknowshtml/kuato — the repo
- Alex Hillman — builder, co-founder of Stacking the Bricks
- PostgreSQL Full-Text Search — tsvector/tsquery
- GIN Indexes — what makes the Postgres mode fast
- Total Recall (1990) — the naming source
- Claude Code — the tool whose session files Kuato parses