ADR-0192 (accepted)

Recall Rewrite Reliability Contract


Status: proposed
Date: 2026-03-02
Deciders: Joel, Panda
Relates to: ADR-0096, ADR-0140, ADR-0190, ADR-0191

Context

The joelclaw recall query rewrite is intended to improve retrieval quality, but recent traces show it frequently falling back (inference_rewrite_empty), paying the rewrite latency each time for little semantic gain.

Rewrite must be treated as an optimization layer, not a mandatory step.

Decision

1) Make rewrite strictly optional at runtime

Recall retrieval must always succeed without rewrite.

The rewrite path may run only when budget policy and circuit state allow.
If rewrite is skipped or fails, retrieval proceeds immediately with the normalized raw query.
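
A minimal sketch of the "rewrite is optional" rule, written as a pure function so the fallback is easy to see. Every name here (RewriteAttempt, selectQuery, normalizeQuery, the "failure.empty" code) is hypothetical and not taken from the joelclaw codebase, and the normalization shown is only an assumption.

```typescript
// All names here are hypothetical; none are taken from the joelclaw codebase.
type RewriteAttempt =
  | { kind: "not_attempted"; reason: string } // budget policy or circuit said no
  | { kind: "failed"; reason: string }        // error, timeout, or similar
  | { kind: "ok"; rewritten: string };

interface ResolvedQuery {
  query: string;         // the query retrieval will actually run
  rewriteUsed: boolean;  // true only when the rewrite changed the query
  rewriteReason: string;
}

// Assumed normalization: trim, collapse whitespace, lowercase.
function normalizeQuery(raw: string): string {
  return raw.trim().replace(/\s+/g, " ").toLowerCase();
}

// Always returns a runnable query: the rewrite when it is usable,
// otherwise the normalized raw query.
function selectQuery(raw: string, attempt: RewriteAttempt): ResolvedQuery {
  const normalized = normalizeQuery(raw);
  if (attempt.kind !== "ok") {
    return { query: normalized, rewriteUsed: false, rewriteReason: attempt.reason };
  }
  const rewritten = normalizeQuery(attempt.rewritten);
  if (rewritten.length === 0) {
    // An empty rewrite is treated as a failure, never as a query.
    return { query: normalized, rewriteUsed: false, rewriteReason: "failure.empty" };
  }
  return { query: rewritten, rewriteUsed: rewritten !== normalized, rewriteReason: "success" };
}
```

Under these assumptions no branch returns without a query, so retrieval never waits on, or depends on, the rewrite result.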

2) Add rewrite skip heuristics

Skip rewrite before any LLM call when query is already simple/high-signal:

short exact-keyword queries,
quoted literal queries,
low-entropy command-like queries,
direct IDs/slugs/paths.

Emit reason codes (e.g. skip.short_query, skip.literal_query).

3) Bound rewrite failure loops

Apply ADR-0191 circuit behavior to component=recall-cli, action=recall.rewrite:

repeated inference_rewrite_empty opens circuit,
open circuit bypasses rewrite and logs skip,
half-open probes test recovery after cooldown.

4) Cache successful rewrite outputs

For identical normalized input query + context fingerprint:

cache successful rewrite result for a short TTL,
reuse cached rewrite to avoid repeated LLM calls,
invalidate cache on context fingerprint change.

5) Promote rewrite outcome observability

Every recall execution must record:

rewriteStrategy: disabled|skipped|haiku|openai|fallback
rewriteReason: success|skip.*|failure.*|circuit_open
rewriteDurationMs
rewriteUsed: boolean (rewritten query materially differs)
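
The four fields above can be sketched as a TypeScript shape with a small runtime check. The field names and allowed values come from the list above. The helper names (isRewriteReason, buildTelemetry) are hypothetical, and the check accepts only the reason values listed in this section.

```typescript
type RewriteStrategy = "disabled" | "skipped" | "haiku" | "openai" | "fallback";
type RewriteReason = "success" | "circuit_open" | `skip.${string}` | `failure.${string}`;

interface RewriteTelemetry {
  rewriteStrategy: RewriteStrategy;
  rewriteReason: RewriteReason;
  rewriteDurationMs: number;
  rewriteUsed: boolean; // rewritten query materially differs from the input
}

// Runtime mirror of the RewriteReason type: skip.* and failure.* need a suffix.
const REASON_PATTERN = /^(?:success|circuit_open|skip\..+|failure\..+)$/;

function isRewriteReason(value: string): value is RewriteReason {
  return REASON_PATTERN.test(value);
}

// Hypothetical builder that refuses reason codes outside the contract.
function buildTelemetry(
  strategy: RewriteStrategy,
  reason: string,
  durationMs: number,
  used: boolean,
): RewriteTelemetry {
  if (!isRewriteReason(reason)) {
    throw new Error(`unknown rewriteReason: ${reason}`);
  }
  return {
    rewriteStrategy: strategy,
    rewriteReason: reason,
    rewriteDurationMs: durationMs,
    rewriteUsed: used,
  };
}
```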

6) Set reliability SLO

Target contract for steady state:

fallback rate ≤ 20% over rolling window,
rewrite empty-output rate ≤ 5%,
median rewrite latency ≤ 1.2s,
rewrite path disabled automatically when outside SLO.
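
One way the SLO gate could be evaluated over a rolling window is sketched below. The thresholds are the ones listed above. The stats shape, the function names, the choice of rewrite attempts as the denominator, and the "empty window passes" rule are all assumptions made for illustration.

```typescript
// Hypothetical rolling-window summary; the denominator is rewrite attempts.
interface RewriteWindowStats {
  attempts: number;
  fallbacks: number;
  emptyOutputs: number;
  latenciesMs: number[];
}

const SLO = { maxFallbackRate: 0.2, maxEmptyRate: 0.05, maxMedianLatencyMs: 1200 };

function median(values: number[]): number {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Returns the names of the SLO terms the window violates (empty means healthy).
function sloViolations(stats: RewriteWindowStats): string[] {
  if (stats.attempts === 0) return []; // assumption: no data, no violation
  const violations: string[] = [];
  if (stats.fallbacks / stats.attempts > SLO.maxFallbackRate) violations.push("fallback_rate");
  if (stats.emptyOutputs / stats.attempts > SLO.maxEmptyRate) violations.push("empty_output_rate");
  if (median(stats.latenciesMs) > SLO.maxMedianLatencyMs) violations.push("median_latency");
  return violations;
}

// The rewrite path stays enabled only while every SLO term holds.
function rewriteEnabled(stats: RewriteWindowStats): boolean {
  return sloViolations(stats).length === 0;
}
```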

Consequences

Good

recall becomes robust even when rewrite is unstable,
eliminates repeated no-op rewrite spend,
preserves semantic gains where rewrite actually works.

Tradeoffs

added cache/state complexity in CLI recall adapter,
rewrite quality variability may reduce semantic expansion in the short term.

Required Skills (Preflight)

recall — retrieval semantics, trust pass, budget profiles
langfuse — rewrite strategy/error visibility and trend validation
system-architecture — interaction with system-bus inference and telemetry
joelclaw — CLI behavior validation and operational checks

Implementation Plan (vector clock)

V1: add skip-heuristic classifier in packages/cli/src/capabilities/adapters/typesense-recall.ts.
V2: add rewrite circuit checks and state transitions (ADR-0191 contract) for recall.rewrite.
V3: add rewrite-result cache keyed by normalized query + context fingerprint.
V4: enrich Langfuse/OTEL payloads with strategy/reason/used flags.
V5: add regression tests for skip, fallback, circuit-open, and cache reuse paths.

V1 (skip heuristics)

detectRewriteSkipReason() already skips rewrite for short queries (≤24 chars, ≤3 tokens), quoted literals, direct identifiers (paths/slugs), and command-like queries (show|find|list|get|open). It emits rewriteReason: skip.*.
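
For illustration, a classifier along these lines could look like the sketch below. It is not the actual detectRewriteSkipReason() implementation. The function name classifySkip, the regular expressions, the check order, and the skip.identifier and skip.command_query codes are assumptions; only skip.short_query and skip.literal_query are named in this ADR. The short-query rule is read here as "at most 24 characters and at most 3 tokens".

```typescript
type SkipReason =
  | "skip.short_query"
  | "skip.literal_query"
  | "skip.identifier"      // hypothetical code
  | "skip.command_query";  // hypothetical code

// Returns a skip reason, or null when the query should go to rewrite.
function classifySkip(rawQuery: string): SkipReason | null {
  const query = rawQuery.trim();
  if (query.length === 0) return null;

  // Quoted literal: the whole query is wrapped in matching quotes.
  if (/^"[^"]+"$/.test(query) || /^'[^']+'$/.test(query)) {
    return "skip.literal_query";
  }

  const tokens = query.split(/\s+/);

  // Direct identifier: one token that looks like a path or a slug.
  const looksLikePath = /[\/\\]/.test(query);
  const looksLikeSlug = /^[a-z0-9]+(?:[-_][a-z0-9]+)+$/i.test(query);
  if (tokens.length === 1 && (looksLikePath || looksLikeSlug)) {
    return "skip.identifier";
  }

  // Command-like: starts with one of the listed verbs.
  if (/^(?:show|find|list|get|open)\b/i.test(query)) {
    return "skip.command_query";
  }

  // Short query: both limits must hold (an assumed reading of the rule).
  if (query.length <= 24 && tokens.length <= 3) {
    return "skip.short_query";
  }
  return null;
}
```

The checks run from most to least specific, so a short quoted string is reported as a literal rather than as a short query.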

V2 (circuit breaker) — shipped 2026-03-04

File-persisted circuit breaker at ~/.joelclaw/state/recall-rewrite-circuit.json. State survives across CLI invocations (each joelclaw run is a fresh process).

Threshold: 3 consecutive failures → open circuit
Cooldown: 5 minutes → half-open probe
States: closed → open (after N failures) → half-open (after cooldown) → closed (on success)
OTEL: rewriteReason: circuit_open
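
The transitions above can be sketched as pure functions over a JSON-serializable state object, which suits a file-persisted breaker because each CLI run can load the state, decide, and write it back. The threshold and cooldown are the values listed above. The type and function names are hypothetical, and re-opening the circuit when a half-open probe fails is an assumption this ADR does not spell out.

```typescript
type CircuitState = "closed" | "open" | "half-open";

// Plain data so it can round-trip through JSON between CLI invocations.
interface RewriteCircuit {
  state: CircuitState;
  consecutiveFailures: number;
  openedAtMs: number | null;
}

const FAILURE_THRESHOLD = 3;
const COOLDOWN_MS = 5 * 60 * 1000;

function newCircuit(): RewriteCircuit {
  return { state: "closed", consecutiveFailures: 0, openedAtMs: null };
}

// Decide whether a rewrite call may run now; may move open -> half-open.
function admit(circuit: RewriteCircuit, nowMs: number): { circuit: RewriteCircuit; allow: boolean } {
  if (circuit.state === "open") {
    const elapsed = nowMs - (circuit.openedAtMs ?? nowMs);
    if (elapsed >= COOLDOWN_MS) {
      return { circuit: { ...circuit, state: "half-open" }, allow: true };
    }
    return { circuit, allow: false }; // caller logs rewriteReason: circuit_open
  }
  return { circuit, allow: true };
}

// Fold the outcome of a rewrite call back into the circuit state.
function record(circuit: RewriteCircuit, success: boolean, nowMs: number): RewriteCircuit {
  if (success) return newCircuit();
  const failures = circuit.consecutiveFailures + 1;
  // Assumption: a failed half-open probe re-opens the circuit immediately.
  if (circuit.state === "half-open" || failures >= FAILURE_THRESHOLD) {
    return { state: "open", consecutiveFailures: failures, openedAtMs: nowMs };
  }
  return { state: "closed", consecutiveFailures: failures, openedAtMs: null };
}
```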

V3 (rewrite cache) — shipped 2026-03-04

File-persisted rewrite cache at ~/.joelclaw/state/recall-rewrite-cache.json. Keyed on normalized query text.

TTL: 3 minutes
Max entries: 50 (LRU eviction)
Performance: 6.2s cold → 0.4s cache hit (15x speedup)
OTEL: rewriteReason: cache_hit
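
A minimal in-memory sketch of a TTL plus LRU cache with these limits follows. The class and method names are hypothetical, file persistence is left out, and the key is assumed to be the normalized query text as described above. It relies on Map preserving insertion order, so the first key is always the least recently used.

```typescript
interface CacheEntry {
  rewritten: string;
  storedAtMs: number;
}

class RewriteCache {
  private entries = new Map<string, CacheEntry>();
  private ttlMs: number;
  private maxEntries: number;

  // Defaults mirror the limits above: 3 minute TTL, 50 entries.
  constructor(ttlMs: number = 3 * 60 * 1000, maxEntries: number = 50) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
  }

  get(key: string, nowMs: number): string | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (nowMs - entry.storedAtMs >= this.ttlMs) {
      this.entries.delete(key); // expired
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.rewritten;
  }

  set(key: string, rewritten: string, nowMs: number): void {
    this.entries.delete(key);
    this.entries.set(key, { rewritten, storedAtMs: nowMs });
    // Evict least recently used entries until within the size limit.
    while (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Passing the clock in as nowMs keeps expiry deterministic in tests. A file-backed variant would serialize the entries in order to keep the LRU ranking across invocations.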

Timeout fix — shipped 2026-03-04

Default REWRITE_TIMEOUT_MS bumped from 2s to 6s. Pi cold-start on M4 Pro is ~3-4s, so the 2s timeout guaranteed failure on every first invocation — this was the root cause of the 41% empty recall rate seen in ADR-0190 scorecard. Configurable via JOELCLAW_RECALL_REWRITE_TIMEOUT env var.
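
A hedged sketch of resolving the timeout from the environment with a safe fallback. The function name is hypothetical, and reading the JOELCLAW_RECALL_REWRITE_TIMEOUT value as milliseconds is an assumption, since the unit is not stated here. The environment is passed in as a plain object so the logic stays testable; a caller would typically pass process.env.

```typescript
const DEFAULT_REWRITE_TIMEOUT_MS = 6000;

// Returns the configured timeout, or the default when the variable is
// unset, empty, non-numeric, or not positive.
// Assumption: the env value is expressed in milliseconds.
function resolveRewriteTimeoutMs(env: Record<string, string | undefined>): number {
  const raw = env["JOELCLAW_RECALL_REWRITE_TIMEOUT"];
  const parsed = raw === undefined ? NaN : Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_REWRITE_TIMEOUT_MS;
}
```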

V4 (OTEL enrichment) — pre-existing

Every recall already emits: rewriteStrategy, rewriteReason, rewriteDurationMs, rewriteModel, rewriteProvider, rewriteUsage, budgetApplied, budgetReason.

Verification Checklist

recall succeeds with equivalent output quality when rewrite is disabled/circuit-open
repeated rewrite-empty failures open rewrite circuit and stop repeated rewrite calls
successful rewrites are cached and reused for identical inputs
strategy/reason telemetry is present on every recall run
fallback and empty-output rates trend down after rollout (monitor via joelclaw memory scorecard)
V5: regression tests for skip, fallback, circuit-open, and cache reuse paths