ADR-0180 · shipped

Configurable Sub-Agent Roster

2026-02-28T00:00:00.000Z

Context

joelclaw delegates work to sub-agents (codex for coding, Inngest infer() for LLM calls in functions). But:

Codex only supports OpenAI models — design tasks need Opus/Sonnet (taste, not speed)
No roster — agent selection is ad-hoc, hardcoded per call site
Role.md exists but isn’t actionable — we have role definitions but no mechanism to spawn a sub-agent with a specific role, model, and tool set

Pi’s native subagent tool (from nicobailon/pi-subagents) already provides:

Agent definitions (YAML frontmatter + markdown body) with discovery by scope (builtin/user/project)
Model + thinking level per agent, tool/extension sandboxing
Chain execution with {previous}, {task}, {chain_dir} template vars
Parallel dispatch, skill injection
6 builtin agents: context-builder, planner, researcher, reviewer, scout, worker

Key insight: Pi-subagents handles agent definitions and spawning. joelclaw needs to make infer() resolve pi agent definitions and wrap execution in Inngest steps for durability. Best of both worlds, minimal new code.

Decision

Adopt pi-subagents as the definition and discovery layer, with Inngest as the execution backbone for durable agent dispatch from system-bus functions.

Gaps identified during review (addressed by this revision)

The original ADR did not define how chain/parallel semantics map to Inngest step return values ({previous} was underspecified).
Agent definition parsing/validation rules were implicit; invalid frontmatter, unknown models, and unsafe paths were not explicitly rejected.
infer() compatibility risk was unaddressed: the agent option currently maps to inference-router profiles (classifier, triage, reflector) in packages/system-bus/src/lib/inference.ts.
Extension sandboxing policy was not translated from pi-subagents to joelclaw’s production worker threat model.

Agent Definition Format

Markdown files in ~/.joelclaw/agents/{name}.md (user scope) and .joelclaw/agents/{name}.md (project scope):

---
name: designer
description: Frontend design with taste — UI components, layouts, visual polish
model: claude-opus-4-6
thinking: high
tools: read, bash, edit, write
skill: frontend-design, ui-animation, emilkowal-animations
---
 
You are a design-focused agent. You create distinctive, production-grade
frontend interfaces. Use the StatusPulseDot and StatusLed components from
@repo/ui/status-badge for activity indicators...
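A minimal sketch of how a loader might separate the frontmatter from the prompt body in a file like the one above. The helper name splitAgentFile is hypothetical and is not taken from pi-subagents or joelclaw. It only locates the two `---` fences; the frontmatter string is meant to be handed to a real YAML parser afterwards.

```typescript
// Hypothetical helper (an assumption, not the actual loader): separates the
// frontmatter block from the markdown body. It does not interpret the YAML.
type SplitAgentFile = { frontmatter: string; body: string };

function splitAgentFile(source: string): SplitAgentFile {
  const lines = source.replace(/\r\n/g, "\n").split("\n");
  if ((lines[0] ?? "").trim() !== "---") {
    throw new Error("agent file must start with a '---' frontmatter fence");
  }
  // Find the closing fence, skipping the opening one at index 0.
  let end = -1;
  for (let i = 1; i < lines.length; i++) {
    if ((lines[i] ?? "").trim() === "---") {
      end = i;
      break;
    }
  }
  if (end === -1) {
    throw new Error("unterminated frontmatter: missing closing '---'");
  }
  return {
    frontmatter: lines.slice(1, end).join("\n"),
    body: lines.slice(end + 1).join("\n").trim(),
  };
}
```

The body becomes the agent's system prompt; a file without a closing fence is rejected rather than silently treated as all-prompt.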

Roster Configuration

.joelclaw/config.toml:

[agents]
# Default agent for unspecified tasks
default = "coder"
 
# Route by task type
[agents.routing]
design = "designer"      # $frontend-design tagged tasks
code = "coder"            # Default coding tasks  
research = "researcher"   # Web research, repo autopsy
review = "reviewer"       # Code review, PR review
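As an illustration of how the routing table could be consulted, here is a small sketch. The AgentsConfig shape and resolveAgentForTask are assumptions made for this example; TOML parsing is out of scope and the object below simply mirrors the config above.

```typescript
// Assumed in-memory shape of the [agents] section after TOML parsing.
type AgentsConfig = {
  default: string;
  routing: Record<string, string>;
};

const agentsConfig: AgentsConfig = {
  default: "coder",
  routing: {
    design: "designer",
    code: "coder",
    research: "researcher",
    review: "reviewer",
  },
};

// Hypothetical resolver: a known task type wins, anything else falls back
// to the configured default agent.
function resolveAgentForTask(config: AgentsConfig, taskType?: string): string {
  if (
    taskType !== undefined &&
    Object.prototype.hasOwnProperty.call(config.routing, taskType)
  ) {
    return config.routing[taskType] ?? config.default;
  }
  return config.default;
}
```

The own-property check keeps inherited keys such as "toString" from being treated as task types.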

Execution Backbone: Inngest

All sub-agent execution runs through Inngest — not raw subprocess spawning. This gives durability, retries, observability, and session streaming for free.

// Single agent dispatch
const result = await step.run("designer", async () => {
  return infer(task, { agent: "designer" });
});
 
// Chain execution — each step memoized
const recon = await step.run("scout-recon", () => infer(task, { agent: "scout" }));
const plan = await step.run("planner", () => infer(recon, { agent: "planner" }));
const impl = await step.run("worker", () => infer(plan, { agent: "worker" }));
 
// Parallel execution
const [a, b] = await Promise.all([
  step.run("task-a", () => infer(taskA, { agent: "designer" })),
  step.run("task-b", () => infer(taskB, { agent: "coder" })),
]);

Why Inngest over raw subprocess:

Step memoization — crash mid-chain, resume from last completed step
Retries — model 503, timeout → automatic retry with backoff
Observability — every step in Inngest dashboard + OTEL
Concurrency control — throttle parallel agents, prevent resource contention
Timeouts — timeouts.finish kills runaway agent calls
Cancellation — cancel running chains by event
Session streaming — results stream back into gateway/interactive sessions via Redis event bridge

infer() extension:

// infer() resolves agent definition → pi flags
await infer("redesign this component", { agent: "designer" });
// Resolves to: pi -p --no-session --models claude-opus-4-6:high --tools read,bash,edit,write
// With agent's system prompt injected via --append-system-prompt
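The flag derivation in the comment above can be sketched as a pure function. RosterAgent and buildPiArgs are hypothetical names for this example; only the flags themselves (-p, --no-session, --models, --tools, --append-system-prompt) come from this ADR, and the real infer() may assemble them differently.

```typescript
// Assumed minimal view of a loaded agent definition.
type RosterAgent = {
  model: string;
  thinking?: string;
  tools?: string[];
  systemPrompt: string;
};

// Hypothetical sketch: turns an agent definition into pi CLI arguments.
function buildPiArgs(agent: RosterAgent): string[] {
  const args: string[] = ["-p", "--no-session"];
  // Model and thinking level travel together as MODEL:THINKING.
  const modelSpec = agent.thinking
    ? `${agent.model}:${agent.thinking}`
    : agent.model;
  args.push("--models", modelSpec);
  if (agent.tools && agent.tools.length > 0) {
    args.push("--tools", agent.tools.join(","));
  }
  args.push("--append-system-prompt", agent.systemPrompt);
  return args;
}
```

Returning an argument array rather than a single string avoids shell quoting problems when the system prompt contains spaces or quotes.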

Inngest execution model parity (mapped from pi-subagents internals)

pi-subagents currently implements:

~/.repo-autopsy/nicobailon/pi-subagents/execution.ts: runSync() derives pi flags from agent definitions (--models, tool splitting, extension policy, skill injection, MCP_DIRECT_TOOLS).
~/.repo-autopsy/nicobailon/pi-subagents/chain-execution.ts + settings.ts: sequential/parallel chains with {task}, {previous}, {chain_dir} substitution and per-step behavior overrides.
~/.repo-autopsy/nicobailon/pi-subagents/async-execution.ts + subagent-runner.ts: detached background runner with status.json, events.jsonl, and filesystem polling.

In joelclaw, we replace subprocess orchestration with Inngest primitives:

Single agent: one step.run("agent:{name}") wrapping infer(...).
Chain: one Inngest function with deterministic step IDs (chain:{index}:{agent}) for memoized replay/resume.
Parallel step: parallel step.run calls with explicit concurrency and optional failFast behavior.
Async/background: no detached worker needed; step.sendEvent + Inngest run state replaces FS watcher polling.
Progress streaming: reuse the existing gateway.progress()/gateway.notify() pattern inside step.run blocks (same replay-safe pattern used in packages/system-bus/src/inngest/functions/story-pipeline.ts).

`{previous}` template semantics in Inngest chains

{previous} must map to structured step outputs, not only plain text. Define chain step returns as:

type AgentStepResult = {
  agent: string;
  text: string;
  model?: string;
  provider?: string;
  usage?: LlmUsage;
  artifacts?: Record<string, string>;
  exitCode: number;
};

Template context per step:

{task}: original top-level request
{previous}: previous step text (or aggregated parallel text)
{previous_json}: JSON-serialized previous AgentStepResult (or array for parallel)
{chain_dir}: durable artifact directory path

For parallel steps, aggregate outputs with stable headers (pi-subagents style: === Parallel Task N (agent) ===) so downstream prompts remain parseable and deterministic.
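The template context and the parallel aggregation described above could look roughly like the following. renderTemplate, aggregateParallel, and ChainContext are hypothetical names, AgentStepResult is trimmed to the fields used here, and the 1-based task numbering in the header is an assumption.

```typescript
// Trimmed version of the AgentStepResult type defined above.
type AgentStepResult = { agent: string; text: string; exitCode: number };

type ChainContext = {
  task: string;
  chainDir: string;
  previous?: AgentStepResult | AgentStepResult[];
};

// Joins parallel outputs under stable, parseable headers.
function aggregateParallel(results: AgentStepResult[]): string {
  return results
    .map((r, i) => `=== Parallel Task ${i + 1} (${r.agent}) ===\n${r.text}`)
    .join("\n\n");
}

function renderTemplate(template: string, ctx: ChainContext): string {
  const prev = ctx.previous;
  let previousText = "";
  if (prev !== undefined) {
    previousText = Array.isArray(prev) ? aggregateParallel(prev) : prev.text;
  }
  const values: Record<string, string> = {
    task: ctx.task,
    previous: previousText,
    previous_json: prev === undefined ? "null" : JSON.stringify(prev),
    chain_dir: ctx.chainDir,
  };
  // Single pass: text substituted for {previous} is never expanded again,
  // so agent output containing "{task}" cannot inject template variables.
  return template.replace(
    /\{(task|previous|previous_json|chain_dir)\}/g,
    (_match, key: string) => values[key] ?? "",
  );
}
```

Keeping the substitution to one pass is a deliberate choice in this sketch: agent output is untrusted text and should not be able to pull other context values into the next prompt.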

Integration Points

Gateway interactive — $frontend-design tag in user message dispatches agent/task.run Inngest event → result streams back via Redis
Inngest functions — infer() gains agent option that resolves from roster
CLI — joelclaw agent list, joelclaw agent run <name> <prompt> (fires Inngest event, streams result)
Codex delegation — unchanged for OpenAI tasks, designer agent for Anthropic tasks
Session feedback — Inngest step results emit agent/task.complete events, gateway picks up via Redis subscription and streams into active session

Existing joelclaw plumbing to reuse:

packages/system-bus/src/inngest/middleware/gateway.ts already carries originSession and exposes gateway.progress()/notify()/alert() helpers.
packages/system-bus/src/inngest/functions/agent-loop/utils.ts#pushGatewayEvent() already fans events to gateway + originSession targets.
packages/gateway/src/channels/redis.ts already prefers originSession routing and packages/gateway/src/daemon.ts already routes responses by active source.

Event contracts to standardize:

agent/task.run: { taskId, agent, task, originSession?, cwd?, timeoutMs?, metadata? }
agent/task.progress: { taskId, step, message, originSession? } (optional for long chains)
agent/task.complete: { taskId, agent, status, output, usage?, artifacts?, originSession? }
agent/task.failed: { taskId, agent, error, retryable, attempt, originSession? }
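One way to make the agent/task.run contract enforceable at the function boundary is a runtime guard. The field names below come from the contract above; the field types (strings, a positive number for timeoutMs) and the guard name isAgentTaskRun are assumptions for illustration.

```typescript
// Field names from the contract above; the types are assumed.
type AgentTaskRun = {
  taskId: string;
  agent: string;
  task: string;
  originSession?: string;
  cwd?: string;
  timeoutMs?: number;
  metadata?: Record<string, unknown>;
};

// Hypothetical guard: validates an untyped event payload before dispatch.
function isAgentTaskRun(data: unknown): data is AgentTaskRun {
  if (typeof data !== "object" || data === null) return false;
  const d = data as Record<string, unknown>;
  const requiredOk = ["taskId", "agent", "task"].every(
    (key) => typeof d[key] === "string" && d[key] !== "",
  );
  const optionalOk = ["originSession", "cwd"].every(
    (key) => d[key] === undefined || typeof d[key] === "string",
  );
  const timeout = d["timeoutMs"];
  const timeoutOk =
    timeout === undefined || (typeof timeout === "number" && timeout > 0);
  return requiredOk && optionalOk && timeoutOk;
}
```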

Discovery Priority

Project: .joelclaw/agents/ (highest)
User: ~/.joelclaw/agents/ (medium)
Builtin: joelclaw/agents/ in repo (lowest, git-tracked)
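The priority order can be sketched as a merge where higher-priority scopes overwrite lower ones. DiscoveredAgent, SCOPE_ORDER, and mergeByPriority are hypothetical names; scanning the directories is left out.

```typescript
type AgentScope = "project" | "user" | "builtin";

type DiscoveredAgent = {
  agentName: string;
  scope: AgentScope;
  filePath: string;
};

// Lowest priority first, so later scopes overwrite earlier ones.
const SCOPE_ORDER: AgentScope[] = ["builtin", "user", "project"];

// Hypothetical merge: returns one definition per agent name, taken from
// the highest-priority scope that defines it.
function mergeByPriority(
  found: DiscoveredAgent[],
): Record<string, DiscoveredAgent> {
  const roster: Record<string, DiscoveredAgent> = {};
  for (const scope of SCOPE_ORDER) {
    for (const agent of found) {
      if (agent.scope === scope) roster[agent.agentName] = agent;
    }
  }
  return roster;
}
```

Because the merge iterates scopes in a fixed order, the result does not depend on the order in which files were discovered on disk.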

Patterns Adopted from pi-subagents

Agent definition format — YAML frontmatter + markdown body. Fields: name, description, model, thinking (off/minimal/low/medium/high/xhigh), tools, skill, extensions, output, defaultReads, defaultProgress, interactive.

Extension sandboxing — extensions: field controls which pi extensions load in the sub-agent:

Absent → all extensions load (default)
Empty → --no-extensions
List → --no-extensions --extension a --extension b
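The three cases map to flags as in the following sketch. The function name extensionFlags is hypothetical; the flag shapes are the ones listed above.

```typescript
// Hypothetical sketch of the pi-subagents mapping from the extensions
// field to CLI flags.
function extensionFlags(extensions: string[] | undefined): string[] {
  // Absent: pass nothing, so every extension loads.
  if (extensions === undefined) return [];
  // Empty or explicit list: start from nothing...
  const flags: string[] = ["--no-extensions"];
  // ...then re-enable only the listed extensions, if any.
  for (const ext of extensions) {
    flags.push("--extension", ext);
  }
  return flags;
}
```

Distinguishing undefined from an empty array matters here: collapsing the two would turn "load nothing" into "load everything".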

Three execution modes — Single (one agent, one task), Chain (sequential with {previous} template var + shared {chain_dir}), Parallel (concurrent with max concurrency).

Spawn mechanism — pi -p --mode json --no-session with --models, --tools, --extension, --append-system-prompt flags derived from agent definition. Captures stdout as JSONL, tracks usage/tokens/duration.

Async mode — Background execution via worker process. FSWatcher on results directory detects completion. Widget polls progress.

Skill injection — skill: field resolves skill files, injects content into system prompt before spawn.

Key Differences from pi-subagents

Chain execution via Inngest steps — durable, retryable, observable vs raw subprocess chains
Inngest-native — long-running agent tasks dispatched as Inngest steps with memoization
Role.md integration — agent definitions can reference roles/*.md for shared context
No TUI clarify step — joelclaw agents are headless; confirmation happens via gateway/CLI
Discovery adds repo scope — builtin agents live in joelclaw/agents/ (git-tracked, lowest priority)

Agent Definition Validation (required)

Use strict runtime validation at load-time before any dispatch:

Frontmatter parsing: use a real YAML parser (not regex-only key/value parsing) so arrays/booleans are typed reliably.
- Do not rely on permissive tool schemas alone (Type.Any usage in pi-subagents/schemas.ts); enforce strict server-side validation in joelclaw.
Identity: name + description required; name must match file basename (designer.md → name: designer).
Model: must resolve in @joelclaw/inference-router catalog (support bare IDs and provider/model IDs).
Thinking level: enum off|minimal|low|medium|high|xhigh.
Tools: each entry must be either an allowed builtin tool or an approved extension path.
Skills: each skill in skill/skills must resolve from canonical skill loading paths; missing skills are validation errors (not warnings) in strict mode.
Path safety: output, defaultReads, and chain file paths must reject traversal (..) outside the configured workspace unless the target is an absolute path on an explicit allowlist.
Extensions policy: absent vs empty vs explicit list semantics must be preserved:
- absent: inherit platform default policy
- empty list: disable extensions (--no-extensions)
- explicit list: allowlist only
Role composition: an optional role: roles/<name>.md must resolve to an existing repo role file before the run.
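A few of these rules (identity, thinking level, path safety) can be sketched as a load-time check that collects errors instead of throwing on the first one. RawAgent, escapesWorkspace, and validateAgent are hypothetical names. Model catalog lookup, skill resolution, and the absolute-path allowlist are omitted, so this sketch simply rejects absolute paths.

```typescript
const THINKING_LEVELS: string[] = ["off", "minimal", "low", "medium", "high", "xhigh"];

// Assumed shape of parsed frontmatter before validation.
type RawAgent = {
  name?: unknown;
  description?: unknown;
  thinking?: unknown;
  output?: unknown;
};

// True when a relative path climbs above its starting directory.
function escapesWorkspace(relPath: string): boolean {
  let depth = 0;
  for (const segment of relPath.split(/[\\/]+/)) {
    if (segment === "" || segment === ".") continue;
    depth += segment === ".." ? -1 : 1;
    if (depth < 0) return true;
  }
  return false;
}

function validateAgent(raw: RawAgent, filePath: string): string[] {
  const errors: string[] = [];
  const base = (filePath.split(/[\\/]/).pop() ?? "").replace(/\.md$/, "");
  if (typeof raw.name !== "string" || raw.name === "") {
    errors.push("name is required");
  } else if (raw.name !== base) {
    errors.push(`name "${raw.name}" must match file basename "${base}"`);
  }
  if (typeof raw.description !== "string" || raw.description === "") {
    errors.push("description is required");
  }
  if (
    raw.thinking !== undefined &&
    (typeof raw.thinking !== "string" || THINKING_LEVELS.indexOf(raw.thinking) === -1)
  ) {
    errors.push(`thinking must be one of ${THINKING_LEVELS.join("|")}`);
  }
  if (
    typeof raw.output === "string" &&
    (raw.output.charAt(0) === "/" || escapesWorkspace(raw.output))
  ) {
    errors.push(`output path "${raw.output}" leaves the workspace`);
  }
  return errors;
}
```

Returning every error at once makes a malformed definition fixable in one edit, which matters when definitions are hot-reloaded.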

`infer()` compatibility contract (critical)

packages/system-bus/src/lib/inference.ts currently maps the agent option to inference-router profiles via resolveProfile() (packages/inference-router/src/profiles.ts), and existing production callers rely on this (reflect.ts, task-triage.ts, email-cleanup.ts).

Adoption rules:

Resolution order: roster agent definition → legacy inference profile → explicit options.
Preserve legacy behavior for classifier, reflector, triage until migrated.
Add an explicit profile option long-term; keep agent backward compatible during migration window.
runPiAttempt() currently hardcodes --no-extensions and --model; roster mode must support full flag derivation (--models, tools, extensions, appended prompt) while keeping locked-down defaults for non-roster calls.
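The lookup order can be sketched as below. This follows the roster, then profile, then throw behavior, with calls that pass no agent treated as direct. ResolvedAgent and resolveAgentSource are hypothetical names, and the roster is modeled as a plain name-to-path map rather than the real loader.

```typescript
type ResolvedAgent =
  | { agentSource: "roster"; agentName: string; agentDefinitionPath: string }
  | { agentSource: "profile"; agentName: string }
  | { agentSource: "direct" };

// Hypothetical sketch of the lookup order. `roster` maps agent name to
// definition path; `legacyProfiles` lists inference-router profile names.
function resolveAgentSource(
  agent: string | undefined,
  roster: Record<string, string>,
  legacyProfiles: string[],
): ResolvedAgent {
  if (agent === undefined) return { agentSource: "direct" };
  if (Object.prototype.hasOwnProperty.call(roster, agent)) {
    return {
      agentSource: "roster",
      agentName: agent,
      agentDefinitionPath: roster[agent] ?? "",
    };
  }
  if (legacyProfiles.indexOf(agent) !== -1) {
    return { agentSource: "profile", agentName: agent };
  }
  throw new Error(`Unknown agent roster entry: ${agent}`);
}
```

Note the consequence of this order: a roster file named triage.md would shadow the legacy triage profile, which is the naming collision this section is guarding against.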

Extension sandboxing in joelclaw

pi-subagents allows extension paths from agent definitions. In the joelclaw worker context this is a security boundary, so the policy must remain deny-by-default:

system-bus runs load no extensions by default; an extension loads only if it is explicitly allowlisted.
Add agents.extension_allowlist in .joelclaw/config.toml; reject non-allowlisted extension paths at validation time.
Record effective extension set in OTEL metadata for every run.
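A deny-by-default check against agents.extension_allowlist might look like this sketch. effectiveExtensions is a hypothetical name, and treating an absent extensions field as "load nothing" is this sketch's reading of the platform default for system-bus.

```typescript
// Hypothetical sketch: returns the extension set that would actually load,
// which is also the value to record in OTEL metadata for the run.
function effectiveExtensions(
  requested: string[] | undefined,
  allowlist: string[],
): string[] {
  // Absent or empty: deny by default, nothing loads.
  if (requested === undefined || requested.length === 0) return [];
  const rejected = requested.filter((path) => allowlist.indexOf(path) === -1);
  if (rejected.length > 0) {
    throw new Error(
      `extensions not in agents.extension_allowlist: ${rejected.join(", ")}`,
    );
  }
  return requested.slice();
}
```

Failing the whole definition on one non-allowlisted path, instead of silently dropping it, keeps the recorded extension set identical to what the author asked for.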

Consequences

Positive

Design tasks route to Opus automatically
Agent selection is explicit and configurable
New agent types added without code changes
Model/tool/skill combos are named and reusable

Negative

Another config surface to maintain
Agent definitions can drift from actual capabilities
pi-subagents is a third-party dependency (unless we only borrow its patterns)

Risks

Over-engineering if we only need 2-3 agents
Chain execution complexity if adopted later
Agent/profile naming collision (the agent option currently means an inference profile in infer())
Path/extension injection risk from unvalidated agent markdown
Parallel {previous} aggregation ambiguity can create non-deterministic downstream prompts
Gateway replay duplicates if progress/notify emits happen outside step.run

Resolved Questions

Chain artifact directories → Durable per-session workspace with retention policy. Must survive crashes for Inngest replay.
Chain topology → Full DAG support from day one. Not just linear + parallel groups.
JSON output contracts → Per-agent configurable. Some agents (coder, reviewer) require strict output schemas; others (designer, researcher) are freeform text. Add outputSchema field to agent definition — when present, output is validated; when absent, plain text passthrough.
Agent definition storage → Filesystem as source of truth (git-tracked), mirrored to Typesense for search + version pinning. Enables hot reload without git pull and a searchable agent catalog.

Phase Plan

Phase 0+1: Loader + `infer()` Integration ✅ SHIPPED (2026-02-28)

packages/system-bus/src/lib/agent-roster.ts: loads pi-subagent-format .md files from project (.pi/agents/) and user (~/.pi/agent/agents/) scopes with module-level cache
infer() resolution order: roster → profile → throw (backward compatible with classifier/triage/reflector)
Roster agents derive full pi flags: --models MODEL:THINKING, --tools, --append-system-prompt, conditional --no-extensions
OTEL metadata includes agentSource (roster/profile/direct), agentName, agentDefinitionPath
3 project-scoped agents committed: agents/{designer,coder,ops}.md (symlinked to .pi/agents/)
6/6 unit tests: project load, user load, project-overrides-user, cache hit, missing agent, malformed frontmatter
Commit: a709622
Deferred: strict schema validation (model catalog check, skill resolution, path safety), role composition, extension allowlist. These become Phase 2 prerequisites.

Phase 2: Inngest Functions + CLI + Gateway Routing ✅ SHIPPED (2026-02-28)

agent/task.run, agent/task.complete, agent/task.progress event types added to Inngest client
agent-task-run Inngest function: validate → execute via infer() → emit complete/failed
Concurrency: 3 per agent type, 2 retries, 5m timeout
Gateway progress notification before execution, OTEL on start/complete/fail
originSession carried through all events (gateway middleware passthrough)
joelclaw agent list — discover agents from all scopes
joelclaw agent show <name> — display full definition + system prompt
joelclaw agent run <name> <task> — fire agent/task.run event, return taskId
HATEOAS JSON responses with next_actions throughout
Commits: f922842 (CLI), 5348b55 (Inngest function + events)
Deferred: Gateway $frontend-design tag routing (gateway pi session can already dispatch via joelclaw agent run or inngest_send)

Phase 3: Chain Execution ✅ SHIPPED (2026-02-28)

agent/chain.run, agent/chain.complete event types added to Inngest client
agent-chain-run Inngest function: sequential steps with {task}/{previous} template substitution
Parallel groups via Promise.allSettled with === Parallel Task N (agent) === aggregation headers
failFast option (default false — continue on step failure, collect partial results)
Concurrency: 2 chains, 1 retry, 15m timeout
OTEL per step + chain completion/failure; gateway progress per step (replay-safe)
CLI: joelclaw agent chain scout,planner+reviewer,coder --task "..." (+ = parallel, , = sequential)
5 unit tests: template substitution, parallel aggregation, sequential passing, error handling
Commit: ab1b885
Deferred: output artifact validation (warning-first, fail-on-strict mode), DAG topology beyond linear+parallel
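The CLI chain syntax (, for sequential stages, + for parallel agents within a stage) can be parsed as in this sketch, which also derives step IDs in the chain:{index}:{agent} form. parseChainSpec and chainStepIds are hypothetical names, and the zero-based stage index is an assumption.

```typescript
// Hypothetical parser: "scout,planner+reviewer,coder" becomes a list of
// stages, each stage holding the agents that run in parallel.
function parseChainSpec(spec: string): string[][] {
  const stages = spec
    .split(",")
    .map((stage) => stage.split("+").map((agent) => agent.trim()));
  for (const stage of stages) {
    for (const agent of stage) {
      if (agent === "") {
        throw new Error(`empty agent name in chain spec "${spec}"`);
      }
    }
  }
  return stages;
}

// Deterministic IDs: the same spec always yields the same step IDs, which
// is what lets a replayed run reuse memoized step results.
function chainStepIds(stages: string[][]): string[] {
  const ids: string[] = [];
  stages.forEach((stage, index) => {
    for (const agent of stage) ids.push(`chain:${index}:${agent}`);
  });
  return ids;
}
```

One limitation of this ID scheme: the same agent listed twice in one parallel stage would produce duplicate IDs, so a real implementation would need a tiebreaker.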

Runtime proof + recovery timeline (2026-02-28)

Attempt 1 — blocked by local runtime reachability

joelclaw agent list and joelclaw agent show coder succeeded.
joelclaw agent run ... failed while local Inngest API was unreachable (localhost:8288).
No reliable event→run trace could be captured in that attempt.

Attempt 2 — ingress restored, roster drift surfaced

Event send path recovered.
agent/task.run reached Agent Task Run, but failed with Unknown agent roster entry: coder.
This proved ingress was healthy while worker runtime resolution was stale.

Remediation applied

Patched roster resolution to search ancestor directories for builtin agents/ when worker CWD is nested:
- commit a3e013a
- file: packages/system-bus/src/lib/agent-roster.ts
- tests: packages/system-bus/src/lib/__tests__/agent-roster.test.ts
Published system-bus-worker image with this fix and rolled k8s deployment.
Restarted host worker process (the active executor for Agent Task Run in this environment).
Recovered local control plane after transient outage (Colima/Talos restart + taint cleanup + pod recycle).

Final runtime proof — PASS

bun run packages/cli/src/cli.ts agent run coder "reply with OK" --timeout 20
- event ID: 01KJK9JJ1C5P54ZH4F200XYWBD
bun run packages/cli/src/cli.ts event 01KJK9JJ1C5P54ZH4F200XYWBD
- run ID: 01KJK9JJEX3A6NW55WQSZXKWNY
- function: Agent Task Run
- status: COMPLETED
- output includes {"status":"completed", ... "text":"OK"}

Conclusion: ADR-0180 runtime contract is now validated end-to-end (list/show/run/chain/watch paths + truthful event navigation + durable execution).

Validation smoke test — ts=1772321467 ✅ DEEP PROOF

Second full end-to-end proof with production binary against live k8s worker, capturing full OTEL metadata:

Roster

joelclaw agent list → ok: true, total: 3 (coder/designer/ops, all source: builtin)
joelclaw agent show coder → filePath, systemPrompt, model, tools, skills all present

Dispatch

joelclaw agent run coder "ADR-0180 smoke test ts=1772321467 — echo the string 'SMOKE_OK' and exit"
Event 01KJK9MT0X00WXREWX3KZW6F2X accepted · taskId at-1772321662985-gv2oj9

Run

Run 01KJK9MT3H0N5AGJH8F1PYJ6Z2 · COMPLETED · 3,759 ms
Output: { status: "completed", text: "SMOKE_OK", model: "anthropic/claude-sonnet-4-6", provider: "anthropic" }

Step trace (7 steps, all COMPLETED): emit-started-otel → validate → agent-task-progress-execute → execute (2,404 ms) → agent-task-complete → emit-completed-otel → Finalization

OTEL (5 events)

agent.task.started — taskId, agent, originSession, cwd, timeoutMs
model_router.request — agentSource: "roster", agentName: "coder", agentDefinitionPath, resolvedModel
model_router.route — policy version, resolved model
model_router.result — 2,140 ms, fallbackUsed, usage
agent.task.completed — model, provider, durationMs

agentSource: "roster" in OTEL confirms builtin scope resolution is healthy end-to-end. Historical OTEL also shows the pre-deploy failure arc: 5 agent.task.failed events with "Unknown agent roster entry: coder" (22:50–23:22 UTC), followed by clean completions post-deploy — observable failure→fix→recovery captured in Typesense.

Phase 4: Live streaming + async UX ✅ SHIPPED (2026-02-28)

joelclaw agent watch <taskId|chainId> — NDJSON streaming watcher
Redis pub/sub subscription to joelclaw:notify:gateway for real-time progress events
Inngest API polling fallback when Redis is degraded or task completed before watch started
Auto-detects task (at-*) vs chain (ac-*) IDs, adjusts timeout (300s vs 900s)
Graceful degradation documented in-code: Redis down → polling only, pre-completed → immediate result
--timeout option, SIGINT/SIGTERM cleanup, HATEOAS next_actions in terminal events
Commit: 9ab8c6d
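The ID auto-detection and default timeouts could be sketched as follows. WatchTarget and detectWatchTarget are hypothetical names; the prefixes and the 300s/900s defaults come from the list above, while throwing on an unrecognized prefix is an assumption about behavior the ADR does not specify.

```typescript
type WatchTarget = { kind: "task" | "chain"; timeoutSeconds: number };

// Hypothetical sketch: classifies an ID by prefix and picks the default
// timeout unless the caller supplies one (as --timeout would).
function detectWatchTarget(id: string, timeoutOverride?: number): WatchTarget {
  let kind: WatchTarget["kind"];
  if (id.indexOf("at-") === 0) {
    kind = "task";
  } else if (id.indexOf("ac-") === 0) {
    kind = "chain";
  } else {
    throw new Error(`unrecognized id "${id}": expected an at-* or ac-* prefix`);
  }
  const defaultTimeout = kind === "task" ? 300 : 900;
  return { kind, timeoutSeconds: timeoutOverride ?? defaultTimeout };
}
```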

References

nicobailon/pi-subagents — pi extension for subagent delegation
- execution.ts, chain-execution.ts, async-execution.ts, agents.ts, skills.ts, types.ts, schemas.ts, agents/*.md
packages/system-bus/src/lib/inference.ts — current infer implementation and agent profile resolution path
packages/inference-router/src/profiles.ts — legacy classifier/triage/reflector profiles
packages/system-bus/src/inngest/functions/story-pipeline.ts — replay-safe gateway signaling + contract-first stage execution
packages/system-bus/src/inngest/middleware/gateway.ts — originSession routing helpers
packages/system-bus/src/inngest/functions/agent-loop/utils.ts (pushGatewayEvent)
packages/gateway/src/channels/redis.ts + packages/gateway/src/daemon.ts — source-aware response routing and Redis event bridge
ADR-0170: Agent Role System
ADR-0163: Adaptive Prompt Architecture