Utah and joelclaw: Convergent Architecture

Mar 3, 2026

ai, architecture, inngest, utah, convergent-evolution, article

Dan Farrelly open-sourced Utah (Universally Triggered Agent Harness), and it’s basically a clean-room implementation of the same architecture joelclaw landed on independently. I tore it apart to understand the design decisions and found the convergence is almost eerie.

What Utah Is

Utah is a personal AI agent that connects to Telegram and Slack, runs an LLM think/act/observe loop, and uses Inngest for every durable operation. It builds on pi-ai (Mario Zechner’s unified LLM interface) and pi-coding-agent’s battle-tested tools (read, edit, write, bash, grep, find, ls).

The entire codebase is 51 files. No monorepo. No framework layer. Just TypeScript, Inngest, and pi.
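To make the "loop where every iteration is a durable step" idea concrete, here is a minimal sketch. It is not Utah's code: the `Step` interface is a hypothetical stand-in that only mimics the shape of Inngest's `step.run(id, fn)`, and `memoStep`, `think`, `act`, and the types are names I made up for illustration.

```typescript
// Hypothetical stand-in shaped like Inngest's step.run(id, fn); not the real SDK.
interface Step {
  run<T>(id: string, fn: () => Promise<T>): Promise<T>;
}

type ToolCall = { name: string; args: string };
type Thought =
  | { kind: "reply"; text: string }
  | { kind: "tool"; call: ToolCall };
type Turn = { role: "user" | "assistant" | "tool"; content: string };

// In-memory runner that memoizes results by step id, so a replay of the
// same run skips work that already completed.
function memoStep(cache: Map<string, unknown>): Step {
  return {
    async run<T>(id: string, fn: () => Promise<T>): Promise<T> {
      if (cache.has(id)) return cache.get(id) as T;
      const result = await fn();
      cache.set(id, result);
      return result;
    },
  };
}

async function agentLoop(
  step: Step,
  userMessage: string,
  think: (history: Turn[]) => Promise<Thought>,
  act: (call: ToolCall) => Promise<string>,
  maxIterations = 10,
): Promise<string> {
  const history: Turn[] = [{ role: "user", content: userMessage }];
  for (let i = 0; i < maxIterations; i++) {
    // Think: one durable step per LLM call.
    const thought = await step.run(`think-${i}`, () => think(history));
    if (thought.kind === "reply") return thought.text;

    // Act: one durable step per tool call.
    const call = thought.call;
    const observation = await step.run(`act-${i}`, () => act(call));

    // Observe: history is rebuilt from step results, so a replay sees the same turns.
    history.push({ role: "assistant", content: `${call.name}(${call.args})` });
    history.push({ role: "tool", content: observation });
  }
  return "Stopped: iteration limit reached.";
}
```

The point of the step ids is that a crash mid-loop does not re-run the LLM calls and tool calls that already finished; only the unfinished iteration executes again.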

The Architecture Map

Here’s where it gets interesting. Almost every Utah component maps 1:1 to a joelclaw subsystem:

Concept	Utah	joelclaw
Agent loop	`agent-loop.ts`, while loop where each iteration is `step.run`	Gateway daemon pi session + system-bus Inngest functions
Channels	`channels/`, Telegram and Slack with `ChannelHandler` interface	`packages/gateway/src/channels/types.ts`, same interface pattern
Message handling	`agent.message.received` → singleton per chat	`gateway/message.received` → priority queue per channel
Reply dispatch	`agent.reply.ready` → channel-agnostic send	`gateway/reply.ready` → channel formatter → delivery
Acknowledgment	Separate function, typing indicator, best effort	Separate step, same pattern with typing + reactions
Memory	`MEMORY.md` + daily logs + 30-min heartbeat distillation	Observation pipeline + vector store + extension enforcement
Session persistence	JSONL files in `workspace/sessions/`	pi’s append-only JSONL with tree structure
Compaction	LLM summarization at 80% of 150K token budget	pi’s built-in compaction + branch summaries
Context pruning	Two-tier: soft trim (head+tail) + hard clear at 50K	Similar, pi handles this in prompt composition
System prompt	`SOUL.md` + `USER.md` + memory injection	`SOUL.md` + `IDENTITY.md` + `ROLE.md` + `USER.md` + skill injection
Tools	pi-coding-agent tools + `remember` + `web_fetch` + `delegate_task`	52 skills + pi tools + MCP + custom tools via extension API
Sub-agents	`delegate_task` → `step.invoke()` → isolated loop, no recursion	Codex delegation → separate process, sandboxed
Failure handling	Global handler on `inngest/function.failed` → user notification	Same pattern, function.failed → gateway notification
Heartbeat	Cron every 30min, check if distillation needed and skip if not	Cron every 15min, health checks that only act if needed
Worker	`connect()`, WebSocket to Inngest Cloud with no server needed	k8s worker serves HTTP to self-hosted Inngest

Where They Diverge

The convergence tells you what’s necessary for a personal agent. The divergence tells you what’s optional, or at least what reflects different constraints.

Scale of ambition. Utah is 51 files; joelclaw is 110+ Inngest functions across a monorepo. Utah is a reference implementation showing the pattern. joelclaw is a lived-in system that has accumulated capability fast over a couple of weeks: CLI, ADR tracking, video pipeline, content publishing, email processing, contact enrichment, and calendar integration.

Memory architecture. Utah’s memory is file-based: MEMORY.md (curated by LLM distillation) + daily logs (append-only). Simple, effective, and great for a reference implementation. joelclaw’s is multi-layered: observation pipeline with write gates, vector embeddings in PGlite, semantic recall, and a mandatory participation contract enforced by pi extension. The complexity is earned because the system already needs structured recall.

Identity model. Utah has SOUL.md (4 lines) and USER.md. joelclaw splits identity across four files: SOUL, IDENTITY, ROLE, and USER. Different agent surfaces (gateway daemon, codex workers, interactive sessions) need different role definitions with the same core identity. The organism metaphor matters when you have multiple appendages.

Hosting. Utah uses connect(), a WebSocket connection to Inngest Cloud. No server, no public endpoint, no ngrok. Brilliant for getting started. joelclaw self-hosts Inngest on a k8s cluster because I’m a control freak who wants to own every byte. Both are valid. Utah’s approach is objectively easier.

Framework relationship. Utah uses pi-ai (the LLM library) and pi-coding-agent (the tool implementations), but not pi-the-framework. joelclaw uses pi as a full framework: sessions, extensions, skills, themes. This is exactly the distinction drawn in The Harness Is a Framework: Utah is genuinely harness-shaped, while joelclaw embraced the framework and built on top of it.

The Patterns That Converged

When two systems built independently arrive at the same design, pay attention. These patterns are likely necessary, not incidental:

Event-driven message flow. Both normalize inbound messages into typed events and dispatch via Inngest. Neither does synchronous request/response. The agent is async by nature, and fighting that creates fragile systems.
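A rough sketch of what "normalize into typed events" means in practice. The event name `agent.message.received` is Utah's; everything else here (the `data` fields, the trimmed-down Telegram and Slack payload types, the function names) is my assumption for illustration, not either system's actual schema.

```typescript
type Channel = "telegram" | "slack";

// One typed event for every channel. The field names in `data` are assumptions.
interface MessageReceived {
  name: "agent.message.received";
  data: { channel: Channel; chatId: string; text: string; receivedAt: number };
}

// Trimmed-down inbound payloads: only the fields this sketch reads.
interface TelegramUpdate {
  message: { chat: { id: number }; text: string; date: number }; // date: Unix seconds
}
interface SlackMessageEvent {
  channel: string;
  text: string;
  ts: string; // "seconds.microseconds"
}

function fromTelegram(update: TelegramUpdate): MessageReceived {
  const { chat, text, date } = update.message;
  return {
    name: "agent.message.received",
    data: { channel: "telegram", chatId: String(chat.id), text, receivedAt: date * 1000 },
  };
}

function fromSlack(ev: SlackMessageEvent): MessageReceived {
  return {
    name: "agent.message.received",
    data: {
      channel: "slack",
      chatId: ev.channel,
      text: ev.text,
      receivedAt: Math.floor(parseFloat(ev.ts) * 1000),
    },
  };
}
```

Everything downstream of these two functions sees the same shape, which is what lets the agent loop ignore where a message came from.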

Channel abstraction. Both define a ChannelHandler interface with sendReply and acknowledge. Both keep the agent loop channel-agnostic. The shape of this interface is almost identical.
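The interface is small enough to sketch. The method names `sendReply` and `acknowledge` are the ones both systems use; the signatures, the `ReplyReady` shape, and the `RecordingChannel` test double are assumptions of mine.

```typescript
// Method names come from both codebases; the signatures are assumptions.
interface ChannelHandler {
  sendReply(chatId: string, text: string): Promise<void>;
  acknowledge(chatId: string): Promise<void>;
}

type ReplyReady = { channel: string; chatId: string; text: string };

// Hypothetical in-memory handler, standing in for a Telegram or Slack client.
class RecordingChannel implements ChannelHandler {
  readonly sent: string[] = [];
  readonly acks: string[] = [];

  async sendReply(chatId: string, text: string): Promise<void> {
    this.sent.push(`${chatId}: ${text}`);
  }

  async acknowledge(chatId: string): Promise<void> {
    this.acks.push(chatId);
  }
}

// The dispatcher only knows the interface, so adding a channel is one registry entry.
async function dispatchReply(
  handlers: Map<string, ChannelHandler>,
  reply: ReplyReady,
): Promise<void> {
  const handler = handlers.get(reply.channel);
  if (!handler) {
    throw new Error(`No handler registered for channel "${reply.channel}"`);
  }
  await handler.sendReply(reply.chatId, reply.text);
}
```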

Singleton concurrency. One conversation at a time per chat. Utah uses Inngest’s singleton config. joelclaw uses Redis-backed priority queues. Same goal: prevent race conditions when the human sends faster than the agent thinks.
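Neither Inngest's singleton config nor a Redis-backed queue fits in a snippet, so here is the same guarantee as an in-process sketch: tasks for one chat run strictly in order, while different chats proceed independently. `PerChatSerializer` is a hypothetical name, and this is an illustration of the goal, not how either system implements it.

```typescript
// In-process illustration only: one task at a time per chat, chats independent.
class PerChatSerializer {
  private tails = new Map<string, Promise<void>>();

  run<T>(chatId: string, task: () => Promise<T>): Promise<T> {
    // Chain the new task behind whatever is already queued for this chat.
    const prev = this.tails.get(chatId) ?? Promise.resolve();
    const result = prev.then(task);

    // The stored tail swallows errors so one failed task does not poison the queue.
    const tail = result.then(
      () => undefined,
      () => undefined,
    );
    this.tails.set(chatId, tail);

    // Drop the entry once the queue for this chat drains.
    tail.then(() => {
      if (this.tails.get(chatId) === tail) this.tails.delete(chatId);
    });
    return result;
  }
}
```

If the human fires off three messages while the agent is still thinking about the first, the second and third wait their turn instead of racing each other over the same session state.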

Memory as system prompt injection. Both load memory into the system prompt, not as tool results. Memory is context, not conversation.
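A minimal sketch of the distinction, assuming file contents are already loaded as strings. The file names (`SOUL.md`, `USER.md`, `MEMORY.md`) are Utah's; the section headings, `buildSystemPrompt`, and `buildRequest` are hypothetical.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Contents of SOUL.md, USER.md, and MEMORY.md, passed in as strings to keep
// this sketch free of file I/O.
type PromptFiles = { soul: string; user: string; memory: string };

function buildSystemPrompt(files: PromptFiles): string {
  return [
    files.soul.trim(),
    "## About the user\n" + files.user.trim(),
    "## Memory\n" + files.memory.trim(),
  ].join("\n\n");
}

// Memory lives in the system message; the conversation turns stay untouched.
function buildRequest(
  files: PromptFiles,
  history: ChatMessage[],
  userText: string,
): ChatMessage[] {
  return [
    { role: "system", content: buildSystemPrompt(files) },
    ...history,
    { role: "user", content: userText },
  ];
}
```

Because the system message is rebuilt on every request, updated memory shows up on the next turn without rewriting anything in the session history.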

Separate acknowledgment. Both fire a separate function to show “typing” immediately, independent of the agent loop. The user needs to know they were heard before the agent thinks.
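In both systems the acknowledgment and the agent loop are separate units of work triggered by the same event. The sketch below shows only the property that matters, in plain promises rather than Inngest functions: both start immediately, and a failed acknowledgment never takes the loop down with it. All names here are hypothetical.

```typescript
type Received = { chatId: string; text: string };
type Outcome = "ok" | "failed";

// Convert a promise into an outcome so one failure cannot reject the whole fan-out.
function settle(task: Promise<void>): Promise<Outcome> {
  return task.then(
    (): Outcome => "ok",
    (): Outcome => "failed",
  );
}

async function onMessageReceived(
  event: Received,
  acknowledge: (chatId: string) => Promise<void>,
  runAgentLoop: (event: Received) => Promise<void>,
): Promise<{ ack: Outcome; loop: Outcome }> {
  // Start both right away; neither waits on the other.
  const ack = settle(acknowledge(event.chatId));
  const loop = settle(runAgentLoop(event));
  return { ack: await ack, loop: await loop };
}
```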

Failure notification. Both catch inngest/function.failed and notify the user. Silent failures in agent systems are deadly. The user thinks the agent is ignoring them.
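Here is a sketch of the mapping from a failure event to a user-facing reply. The event names are the ones from the table above. The payload fields I read from `inngest/function.failed` (`function_id`, `error.message`, and the original triggering `event`) are my assumption about the shape, so check the Inngest docs before relying on them; `toUserNotification` is a hypothetical helper.

```typescript
// Assumed subset of the failure payload; verify against the Inngest docs.
interface FunctionFailed {
  name: "inngest/function.failed";
  data: {
    function_id: string;
    error: { message: string };
    // The event that triggered the failed run.
    event: { data: { channel?: string; chatId?: string } };
  };
}

interface ReplyReady {
  name: "agent.reply.ready";
  data: { channel: string; chatId: string; text: string };
}

function toUserNotification(failed: FunctionFailed): ReplyReady | null {
  const { channel, chatId } = failed.data.event.data;
  // Cron jobs and internal events have no chat, so there is nobody to notify.
  if (!channel || !chatId) return null;
  return {
    name: "agent.reply.ready",
    data: {
      channel,
      chatId,
      text: `Something went wrong while handling your message (${failed.data.function_id}): ${failed.data.error.message}`,
    },
  };
}
```

Routing the notification through the normal reply event means the failure message gets the same channel-agnostic delivery as any other reply.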

Context management. Both prune old tool results and compact long conversations. Both estimate tokens with chars/4. Context management isn’t a nice-to-have. Without it, agents degrade over hours.
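The pieces are simple enough to sketch. The chars/4 estimate and the 80% of 150K compaction trigger come from the comparison above. The pruning limits below are placeholders measured in characters (I am not asserting which unit Utah's 50K threshold uses), and the function names are hypothetical.

```typescript
type Msg = { role: "user" | "assistant" | "tool"; content: string };

// The chars/4 heuristic: crude, but good enough to decide when to act.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

const TOKEN_BUDGET = 150_000;
const COMPACT_RATIO = 0.8;

// True once the estimated conversation size reaches 80% of the budget.
function needsCompaction(messages: Msg[]): boolean {
  const total = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  return total >= TOKEN_BUDGET * COMPACT_RATIO;
}

// Two-tier pruning of a single tool result. Limits are in characters and are assumptions.
function pruneToolResult(
  content: string,
  softLimit = 4_000,
  hardLimit = 50_000,
  keep = 1_500,
): string {
  // Hard tier: the result is too large to be worth keeping at all.
  if (content.length > hardLimit) {
    return `[tool result cleared: ${content.length} chars]`;
  }
  // Soft tier: keep the head and the tail, drop the middle.
  if (content.length > softLimit) {
    const omitted = content.length - 2 * keep;
    return `${content.slice(0, keep)}\n[... ${omitted} chars trimmed ...]\n${content.slice(-keep)}`;
  }
  return content;
}

// Prune old tool results but leave the most recent `keepRecent` messages intact.
function pruneHistory(messages: Msg[], keepRecent = 4): Msg[] {
  const cutoff = messages.length - keepRecent;
  return messages.map((m, i) =>
    i < cutoff && m.role === "tool" ? { ...m, content: pruneToolResult(m.content) } : m,
  );
}
```

Pruning is cheap and runs on every turn; compaction costs an LLM call, so it only fires when `needsCompaction` says the budget is nearly spent.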

What This Means

Dan built Utah as a reference implementation for the ideas in his article about agents needing infrastructure over frameworks. The fact that it maps so closely to joelclaw, which was built organically over a couple of weeks of daily use, is about as strong a validation of those ideas as I could ask for.

The patterns aren’t opinions. They’re convergent evolution. Two systems solving the same problem arrived at the same shape because that’s the shape the problem demands.

If you’re building a personal agent, start with Utah. It’s clean, minimal, and gets the architecture right. When you outgrow it, when you need 52 skills instead of 3 tools, when you need a CLI, and when you need vector memory instead of file-based memory, you’ll know exactly which layers to add because Utah already taught you where the seams are.

Utah on GitHub