Agent Sessions Die Twice: Context Bloat Then Compaction Amnesia

repo, ai, mcp, agent-loops, context-window, sqlite, claude, infrastructure, session-continuity

Long coding loops run via `joelclaw loop` are exactly the workload this targets: heavy file reads, repeated tool calls, and state lost to compaction mid-story are real failure modes in multi-hour runs.

context-mode by mksglu frames it exactly right: there are two ways a long agent session falls apart. The first is raw data accumulation — a Playwright snapshot costs 56 KB, twenty GitHub issues cost 59 KB, one access log costs 45 KB. After 30 minutes of real work, 40% of the context window is gone. The second failure is worse: the model compacts to recover space and forgets which files it was editing, what tasks are in progress, and what you last asked for. That’s not a data problem. That’s working memory loss.

The tool attacks both. A set of MCP sandbox tools (`ctx_batch_execute`, `ctx_execute`, `ctx_fetch_and_index`, etc.) keeps raw output out of the context window entirely: 315 KB becomes 5.4 KB, a 98% reduction. But the more interesting half is the session continuity layer. Every file edit, git operation, task, and user decision gets tracked in SQLite with FTS5 full-text indexing. When compaction hits, context-mode doesn’t dump the full history back in. It runs a BM25 search against the index and injects only what’s relevant, so the agent picks up where it left off.
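
To make the recovery step concrete, here is a minimal sketch of the idea using Python's standard `sqlite3` module. It is not context-mode's actual schema or code: the `events` table, its columns, and the `record`/`recall` helpers are hypothetical, and it assumes the local SQLite build includes FTS5.

```python
import sqlite3


def open_index(path: str = ":memory:") -> sqlite3.Connection:
    """Open a session index. The schema here is hypothetical."""
    conn = sqlite3.connect(path)
    # One FTS5 row per tracked event (file edit, git op, task, decision).
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS events USING fts5(kind, detail)"
    )
    return conn


def record(conn: sqlite3.Connection, kind: str, detail: str) -> None:
    """Store one session event in the full-text index."""
    conn.execute(
        "INSERT INTO events (kind, detail) VALUES (?, ?)", (kind, detail)
    )


def recall(conn: sqlite3.Connection, query: str, limit: int = 3) -> list:
    """Return the events most relevant to `query`, best match first."""
    # FTS5's bm25() yields smaller values for better matches,
    # so ascending order puts the most relevant rows first.
    rows = conn.execute(
        "SELECT kind, detail FROM events "
        "WHERE events MATCH ? ORDER BY bm25(events) LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


conn = open_index()
record(conn, "file_edit", "edited src/auth/session.ts to fix token refresh")
record(conn, "git", "committed retry logic for the upload queue")
record(conn, "task", "in progress: add tests for token refresh")
record(conn, "decision", "user chose SQLite over Postgres for the cache")

# After compaction, only events matching the current topic are re-injected.
for kind, detail in recall(conn, "token refresh"):
    print(kind, "|", detail)
```

The point of the design is that recall cost tracks the query, not the session length: an unrelated commit from two hours ago stays in the index and out of the context window.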

The enforcement mechanism matters. The tool installs PreToolUse hooks that intercept Bash, Read, WebFetch, Grep, and Task calls before they execute — not as suggestions but as actual interception. Without hooks, routing instructions get ~60% compliance and one unrouted Playwright snapshot wipes out everything saved. With hooks you’re at ~98%. It also auto-writes a CLAUDE.md in the project root with routing instructions so the model knows to prefer sandbox tools from session start. The install for Claude Code is a two-line plugin command — it handles the MCP server, hooks, and CLAUDE.md automatically.
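
A rough sketch of what interception can look like, assuming the Claude Code hook contract in which a PreToolUse hook receives a JSON event on stdin (including a `tool_name` field) and exit code 2 blocks the call while passing stderr back to the model. The routing table and the `decide`/`main` helpers are illustrative assumptions, not context-mode's real logic, which is presumably more selective than blocking every call.

```python
import json
import sys

# Assumed routing table: which sandbox tool to suggest for an intercepted
# built-in tool. The tool names come from the article; the pairing is a guess.
ROUTES = {
    "Bash": "ctx_execute",
    "WebFetch": "ctx_fetch_and_index",
}


def decide(event: dict) -> tuple[int, str]:
    """Return (exit_code, message) for one PreToolUse event."""
    tool = event.get("tool_name", "")
    target = ROUTES.get(tool)
    if target is None:
        return 0, ""  # not a routed tool: let the call proceed
    # Exit code 2 is assumed to block the call and surface the message.
    return 2, f"{tool} is routed through the sandbox; call {target} instead."


def main(raw: str) -> int:
    """Parse the raw JSON event and report the decision on stderr."""
    code, message = decide(json.loads(raw))
    if message:
        print(message, file=sys.stderr)
    return code
```

A real hook script would end with something like `sys.exit(main(sys.stdin.read()))` and be registered for the PreToolUse event. Because the decision happens outside the model, compliance no longer depends on the model remembering its instructions.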

The --continue flag is a nice design decision: if you don’t pass it, previous session data is deleted immediately. Fresh session, clean slate. No accidental state bleed between unrelated tasks. That kind of intentional scoping is rare in tools that are primarily trying to save you from yourself.
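
The scoping rule is simple enough to sketch. This is an assumption-laden illustration, not the tool's implementation: the `events` table and the `start_session` helper are hypothetical, and `continue_previous` stands in for whether `--continue` was passed.

```python
import sqlite3


def start_session(conn: sqlite3.Connection, continue_previous: bool) -> int:
    """Prepare session storage and return how many prior events survive."""
    # Hypothetical table holding tracked session events.
    conn.execute("CREATE TABLE IF NOT EXISTS events (kind TEXT, detail TEXT)")
    if not continue_previous:
        # No flag: wipe earlier session data so nothing bleeds across tasks.
        conn.execute("DELETE FROM events")
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Making deletion the default and continuity the opt-in means stale state can only appear when someone explicitly asked for it.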

Key Ideas

  • Two distinct failure modes in long sessions: context bloat from raw tool output, and working-memory loss when the model compacts. Most tools only address the first.
  • PreToolUse hooks as enforcement: without them, routing instructions are advisory (~60% compliance). Hooks make them mandatory at the process level.
  • SQLite FTS5 + BM25 for session recovery — not a full context dump on restart, only relevant events retrieved by search. Scales better as sessions get longer.
  • 98% context reduction (315 KB → 5.4 KB) via sandbox tool routing. The 6 MCP tools cover batch execution, single execution, file execution, indexing, search, and fetch+index.
  • --continue flag for intentional continuity — no flag means clean slate. Prevents context bleed between unrelated sessions.
  • Supports Claude Code, Gemini CLI, and VS Code Copilot. It is not Claude-specific; the hook architecture generalizes.
  • Directly relevant to joelclaw agent loops — multi-story coding runs accumulate exactly this kind of tool output noise across many steps.
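
The sandbox routing idea from the list above can be sketched in a few lines: run the command out of band and hand the model a bounded summary instead of the raw output. The `run_summarized` helper and its summary format are hypothetical, not one of context-mode's MCP tools.

```python
import subprocess
import sys


def run_summarized(cmd: list[str], max_lines: int = 5) -> str:
    """Run `cmd` and return a short summary instead of its full stdout."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    lines = proc.stdout.splitlines()
    # First line carries the metadata the agent usually needs.
    summary = [
        f"exit={proc.returncode} lines={len(lines)} chars={len(proc.stdout)}"
    ]
    summary += lines[:max_lines]
    if len(lines) > max_lines:
        hidden = len(lines) - max_lines
        summary.append(f"... {hidden} more lines kept out of context")
    return "\n".join(summary)


# A noisy command: 1000 lines of output collapse to a handful.
noisy = [sys.executable, "-c", "for i in range(1000): print(i)"]
print(run_summarized(noisy))
```

The full output would still need to live somewhere searchable for this to be useful, which is where the indexing layer comes in.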