ADR-0092 (superseded)

Unified pi-infer Abstraction with Model Fallback

2026-02-22T00:00:00.000Z

Context

The joelclaw codebase shells out to pi (via Bun.spawnSync) for LLM inference at 15+ call sites across 13+ files: recall rewrite, o11y triage, email cleanup, meeting analysis, memory observe/promote/review, task triage, media processing, diagnostics, and log summarization.

Each call site independently handles:

Model selection (hardcoded strings: "haiku", "sonnet", TRIAGE_MODEL, etc.)
Timeout configuration (varying from 3s to 20s)
stdout/stderr parsing
Error handling and fallback behavior
Environment setup (TERM: "dumb", etc.)

The recall rewrite fix (ADR-0077, commit 6f2fab7) introduced an escalating fallback pattern (Haiku 10s → Haiku 15s → codex-spark 20s) that proved effective. This pattern should be the default for all pi inference rather than being copy-pasted per call site.

Decision

Create a shared piInfer() utility that encapsulates the pi subprocess pattern with model fallback.

API

// packages/shared/src/pi-infer.ts
 
interface PiInferOptions {
  prompt: string
  systemPrompt?: string
  models?: PiModelAttempt[]       // override default fallback chain
  mode?: "text" | "json"          // --mode flag
  extraArgs?: string[]            // additional pi flags
  spawn?: SpawnHook               // test injection
}
 
interface PiModelAttempt {
  model: string                   // e.g. "anthropic/claude-haiku"
  timeout: number                 // ms
}
 
interface PiInferResult {
  text: string
  model: string                   // which model succeeded
  strategy: string                // e.g. "haiku" | "openai" | "fallback"
  attempts: number                // how many tries before success
  error?: string                  // last error, set when every attempt failed
}
 
const DEFAULT_MODELS: PiModelAttempt[] = [
  { model: "anthropic/claude-haiku", timeout: 10_000 },
  { model: "anthropic/claude-haiku", timeout: 15_000 },
  { model: "openai/gpt-5.3-codex-spark", timeout: 20_000 },
]
 
function piInfer(options: PiInferOptions): PiInferResult
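
As an illustration of how a call site might consume this API, the sketch below rewrites a recall query and keeps the raw query when inference falls through. The function name `rewriteQuery`, the prompt wording, and the `Infer` type alias are hypothetical; the inference function is passed in so the example runs without pi installed.

```typescript
// Types restated from the API sketch, reduced to the fields this example uses.
interface PiInferOptions {
  prompt: string
  mode?: "text" | "json"
}

interface PiInferResult {
  text: string
  model: string
  strategy: string
  attempts: number
  error?: string
}

// Hypothetical alias so a real piInfer or a test double can be passed in.
type Infer = (options: PiInferOptions) => PiInferResult

// Hypothetical call site: use the rewritten query on success,
// otherwise keep the caller's raw query unchanged.
function rewriteQuery(
  infer: Infer,
  rawQuery: string,
): { query: string; rewritten: boolean } {
  const result = infer({
    prompt: `Rewrite this search query for recall: ${rawQuery}`,
    mode: "text",
  })
  const text = result.text.trim()
  if (result.strategy === "fallback" || text === "") {
    return { query: rawQuery, rewritten: false }
  }
  return { query: text, rewritten: true }
}
```

Checking `strategy === "fallback"` rather than catching an exception matches the result shape above, where exhaustion is reported in the return value.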

Behavior

Iterate through model attempts in order
Each attempt: Bun.spawnSync(["pi", "--no-tools", "--no-session", "--no-extensions", "--print", "--mode", mode, "--model", model, ...extraArgs, prompt]), bounded by that attempt's timeout
On success (exit 0 + non-empty stdout): return result with model info
On failure: capture error, try next attempt
After all attempts are exhausted: return with strategy: "fallback" and the last error
The spawn hook is preserved so unit tests can inject a fake subprocess
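
The steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the final implementation: the `SpawnHook` and `SpawnResult` shapes, the `bunSpawn` adapter, and the rule for deriving `strategy` from the provider prefix are all hypothetical, and the adapter assumes a Bun version whose `spawnSync` accepts a `timeout` option. `systemPrompt` is omitted because the ADR does not say how it is passed to pi.

```typescript
interface PiModelAttempt { model: string; timeout: number }

// Assumed hook shape: the ADR names SpawnHook but does not define it.
interface SpawnResult { exitCode: number; stdout: string; stderr: string }
type SpawnHook = (
  cmd: string[],
  opts: { timeout: number; env: Record<string, string> },
) => SpawnResult

interface PiInferOptions {
  prompt: string
  models?: PiModelAttempt[]
  mode?: "text" | "json"
  extraArgs?: string[]
  spawn?: SpawnHook
}

interface PiInferResult {
  text: string
  model: string
  strategy: string
  attempts: number
  error?: string
}

const DEFAULT_MODELS: PiModelAttempt[] = [
  { model: "anthropic/claude-haiku", timeout: 10_000 },
  { model: "anthropic/claude-haiku", timeout: 15_000 },
  { model: "openai/gpt-5.3-codex-spark", timeout: 20_000 },
]

// Default hook: delegates to Bun.spawnSync when running under Bun.
const bunSpawn: SpawnHook = (cmd, opts) => {
  const bun = (globalThis as any).Bun
  if (!bun) throw new Error("Bun runtime not available; pass a spawn hook")
  const proc = bun.spawnSync(cmd, {
    timeout: opts.timeout, // assumption: supported by the installed Bun
    env: { ...bun.env, ...opts.env },
    stdout: "pipe",
    stderr: "pipe",
  })
  return {
    exitCode: proc.exitCode ?? -1,
    stdout: proc.stdout.toString(),
    stderr: proc.stderr.toString(),
  }
}

function piInfer(options: PiInferOptions): PiInferResult {
  const models = options.models ?? DEFAULT_MODELS
  const spawn = options.spawn ?? bunSpawn
  const mode = options.mode ?? "text"
  let lastError = "no model attempts configured"
  let attempts = 0

  for (const { model, timeout } of models) {
    attempts += 1
    const cmd = [
      "pi", "--no-tools", "--no-session", "--no-extensions", "--print",
      "--mode", mode, "--model", model, ...(options.extraArgs ?? []), options.prompt,
    ]
    try {
      const res = spawn(cmd, { timeout, env: { TERM: "dumb" } })
      const text = res.stdout.trim()
      if (res.exitCode === 0 && text.length > 0) {
        // Assumed labelling rule: use the provider prefix as the strategy.
        const slash = model.indexOf("/")
        const strategy = slash > 0 ? model.slice(0, slash) : model
        return { text, model, strategy, attempts }
      }
      lastError = res.stderr.trim() || `exit ${res.exitCode} with empty stdout`
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err)
    }
  }
  // Every attempt failed: report the last error instead of throwing.
  return { text: "", model: "", strategy: "fallback", attempts, error: lastError }
}
```

Returning a `"fallback"` result instead of throwing leaves it to each call site to decide what to do when inference is unavailable.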

Call sites to migrate

| File | Current model | Notes |
| --- | --- | --- |
| `cli/commands/recall.ts` | haiku → codex-spark | Already has escalation; extract |
| `cli/commands/logs.ts` | hardcoded | 2 call sites |
| `cli/commands/diagnose.ts` | hardcoded | 1 call site |
| `system-bus/observability/triage.ts` | TRIAGE_MODEL | 1 call site |
| `system-bus/functions/check-email.ts` | hardcoded | 1 call site |
| `system-bus/functions/email-cleanup.ts` | hardcoded | 2 call sites |
| `system-bus/functions/observe.ts` | hardcoded | 1 call site |
| `system-bus/functions/reflect.ts` | hardcoded | 1 call site |
| `system-bus/functions/promote.ts` | hardcoded | 1 call site |
| `system-bus/functions/task-triage.ts` | hardcoded | 1 call site |
| `system-bus/functions/meeting-analyze.ts` | hardcoded | 1 call site |
| `system-bus/functions/media-process.ts` | hardcoded | 1 call site |
| `system-bus/functions/memory/batch-review.ts` | hardcoded | 1 call site |
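
For call sites that currently pin a specific model (for example TRIAGE_MODEL in triage.ts), one possible migration keeps that model as the primary and reuses the 10s → 15s → 20s escalation shape through the `models` override. The helper name `escalationChain` and the example model ID below are hypothetical and only illustrate the idea.

```typescript
interface PiModelAttempt {
  model: string
  timeout: number // ms
}

// Hypothetical helper (not part of the ADR): two tries on the call site's
// preferred model, then one try on a fallback model, mirroring DEFAULT_MODELS.
function escalationChain(
  primary: string,
  fallback: string = "openai/gpt-5.3-codex-spark",
): PiModelAttempt[] {
  return [
    { model: primary, timeout: 10_000 },
    { model: primary, timeout: 15_000 },
    { model: fallback, timeout: 20_000 },
  ]
}

// Example: a triage call site keeping its own model choice.
// "anthropic/claude-sonnet" is an illustrative ID, not taken from the codebase.
const triageModels = escalationChain("anthropic/claude-sonnet")
```

The resulting array would be passed as `models` in `PiInferOptions`, so the call site changes its model preference without reimplementing the retry loop.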

Where to put it

packages/shared/src/pi-infer.ts: either a new shared package, or an addition to the existing shared utils if such a package exists. Both packages/cli and packages/system-bus would depend on it.

Consequences

Single place to update model preferences, timeout strategy, and env setup
Automatic resilience: every pi call gets fallback for free
Testable: the spawn hook means unit tests don’t need a real pi binary
Observable: an OTEL emit can be added inside piInfer to cover all inference calls
Risk: migration touches 13 files; migrate incrementally and verify each file with its existing tests
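
To illustrate the observability point, the sketch below records one event per inference call. It is written as a wrapper around an inference function rather than as code inside piInfer, purely so it stays self-contained. The `InferEvent` shape, the `withTelemetry` name, and the `emit` callback are hypothetical stand-ins for whatever OTEL emitter the codebase uses.

```typescript
// Types restated from the API sketch, reduced to the fields this example uses.
interface PiInferOptions { prompt: string }

interface PiInferResult {
  text: string
  model: string
  strategy: string
  attempts: number
  error?: string
}

type Infer = (options: PiInferOptions) => PiInferResult

// Hypothetical event shape; a real implementation would map this onto
// the project's existing OTEL emitter.
interface InferEvent {
  model: string
  strategy: string
  attempts: number
  durationMs: number
  ok: boolean
}

// Wraps an inference function so every call emits exactly one event.
// The clock is injectable so tests can control the measured duration.
function withTelemetry(
  infer: Infer,
  emit: (event: InferEvent) => void,
  now: () => number = Date.now,
): Infer {
  return (options) => {
    const start = now()
    const result = infer(options)
    emit({
      model: result.model,
      strategy: result.strategy,
      attempts: result.attempts,
      durationMs: now() - start,
      ok: result.strategy !== "fallback",
    })
    return result
  }
}
```

Because both the inference function and the emitter are injected, the same pattern doubles as a unit-test seam: a test can pass a fake `infer` and collect events in an array.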

Status

Proposed. Pending implementation priority decision.