# Embed pi as a library in a joelclaw gateway daemon
## Context and Problem Statement
The current central gateway session (ADR-0036) runs pi inside tmux, managed by launchd. A gateway extension injects events via `sendUserMessage()`. This works for Redis-based notifications but has fundamental limitations:
- **No mobile access** — Joel currently SSH’s into the Mac Mini from Termius on his phone to interact with pi. This works but fights the medium (tiny keyboard, no streaming, terminal rendering issues).
- **No multi-channel routing** — Replies stay inside the pi TUI. There’s no way to route a response to Telegram, Slack, or a native app. The agent can receive events but can’t talk back through the channel that asked.
- **No streaming** — `sendUserMessage()` is fire-and-forget. The extension can’t stream LLM deltas to external clients.
- **tmux PTY hack** — pi is a TUI app that needs a terminal. The tmux wrapper adds complexity and an extra process layer. OpenClaw solved this by embedding pi as a library — no terminal needed.
- **Extension limitations** — The gateway extension can inject prompts and drain events, but it can’t control the session lifecycle, model selection, compaction, or routing.
## What We Want
Talk to the agent from anywhere:

- **Telegram** — Send a message from your phone, get a response
- **Native iOS/macOS app** — Purpose-built UI (future, on roadmap)
- **WebSocket** — Attach from any terminal (like `openclaw tui`)
- **Redis bridge** — Inngest events still flow in (existing infrastructure)
- **All inputs serialize through one session** — Same conversation, same memory, same context
## How OpenClaw Does It
OpenClaw’s gateway daemon (`src/macos/gateway-daemon.ts`) is a standalone Node.js process that:

- Embeds pi via `createAgentSession()` from `@mariozechner/pi-coding-agent`
- Serializes all inputs through a `CommandQueue` with lanes — TUI, heartbeat, Telegram, Discord, etc. all go through one queue into one pi session
- Routes replies back through channel-specific outbound adapters (Telegram HTML chunks, Discord markdown, WhatsApp formatting, etc.)
- Streams deltas to connected WebSocket clients (TUI, mobile app)
- Runs as a launchd daemon — `KeepAlive: true`, no terminal needed
- Manages channels via a plugin system — each channel implements `ChannelPlugin` (config, gateway lifecycle, outbound adapter, status probes)

The TUI (`openclaw tui`) is a WebSocket client that connects to the running daemon — it doesn’t run pi directly.
## Decision
Build a joelclaw gateway daemon that embeds pi as a library, replacing the current tmux + extension approach. Start with Telegram as the first external channel.
### Architecture

```
launchd (com.joel.gateway)
  → joelclaw-gateway daemon (Node.js)
      ├── createAgentSession() — owns the LLM conversation
      ├── CommandQueue — serializes all inputs
      ├── HeartbeatRunner — periodic checklist (setInterval)
      ├── Channels:
      │     ├── Redis — Inngest event bridge (existing)
      │     ├── Telegram — grammY bot (first external channel)
      │     ├── WebSocket — TUI attach + future native app
      │     └── (future: Discord, Slack, iMessage, web)
      ├── OutboundRouter — route replies to source channel
      └── Watchdog — heartbeat staleness detection (ADR-0037)
```

### Session Ownership
The daemon owns the pi session via `createAgentSession()`. This gives us:

- Full control over model, thinking level, compaction
- `session.prompt()` for synchronous prompt/response
- `session.subscribe()` for streaming deltas to channels
- `session.sendUserMessage()` with `followUp` for async injection
- Same extensions, skills, tools as interactive pi (auto-discovered from `~/.pi/agent/`)
- Persistent session file (conversation survives restart)
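A minimal bootstrap sketch of the session-owning side of `daemon.ts`. Only the function names (`createAgentSession`, `getModel`, `session.subscribe`) come from the sections above; the option names, event shape, and environment variables are illustrative assumptions, not the pi SDK's documented API.

```typescript
// daemon.ts (sketch): option names and event shapes below are assumptions;
// check @mariozechner/pi-coding-agent for the real createAgentSession() signature.
import { createAgentSession } from "@mariozechner/pi-coding-agent";
import { getModel } from "@mariozechner/pi-ai";

export async function startSession(onDelta: (text: string) => void) {
  const session = await createAgentSession({
    model: getModel(/* provider + model id per pi-ai docs */),
    sessionFile: process.env.GATEWAY_SESSION_FILE, // hypothetical: persistent conversation file
    agentDir: `${process.env.HOME}/.pi/agent`,     // hypothetical: extension/skill auto-discovery
  });

  // Stream deltas to whichever channel asked; wiring to the OutboundRouter comes later.
  session.subscribe((event: { type: string; text?: string }) => {
    if (event.type === "delta" && event.text) onDelta(event.text);
  });

  return session;
}
```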
### Command Queue

All inputs serialize through one queue (adapted from OpenClaw’s `CommandLane`):

```typescript
type QueueEntry = {
  source: ChannelId;                   // "telegram:12345", "redis", "ws:abc", "heartbeat"
  prompt: string;
  replyTo?: string;                    // Channel-specific reply target
  metadata?: Record<string, unknown>;
};
```

The queue drains sequentially — one prompt at a time. While the LLM is responding, new messages queue up (OpenClaw calls this the “main lane”).
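A sketch of that drain behaviour, reusing the `QueueEntry` type above. The `ChannelId` alias and the handler callback are stand-ins for however the daemon actually calls `session.prompt()` and routes the reply.

```typescript
// Sequential drain: one prompt at a time; new entries queue up while the LLM
// is still responding to the current one.
type ChannelId = string; // e.g. "telegram:12345", "redis", "ws:abc", "heartbeat"

class CommandQueue {
  private entries: QueueEntry[] = [];
  private draining = false;

  // handle() is whatever talks to the pi session and dispatches the reply.
  constructor(private handle: (entry: QueueEntry) => Promise<void>) {}

  push(entry: QueueEntry): void {
    this.entries.push(entry);
    void this.drain();
  }

  private async drain(): Promise<void> {
    if (this.draining) return; // a drain loop is already running
    this.draining = true;
    try {
      while (this.entries.length > 0) {
        const entry = this.entries.shift()!;
        await this.handle(entry);
      }
    } finally {
      this.draining = false;
    }
  }
}
```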
### Outbound Routing

When the LLM responds, the reply routes back to the channel that sent the prompt:

- **Telegram** → Format as Telegram HTML, send via grammY
- **Redis** → Push to `joelclaw:events:{sessionId}` (satellite notification)
- **WebSocket** → Stream deltas as JSON frames
- **Heartbeat** → Filter `HEARTBEAT_OK` (suppress), deliver non-OK to notification channel
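A sketch of the dispatch step, assuming adapters register under a channel prefix and the queue entry's `source` (e.g. `telegram:12345`) selects one. The `OutboundAdapter` interface is illustrative, not OpenClaw's `ChannelPlugin`.

```typescript
// Each channel registers an outbound adapter keyed by its prefix
// ("telegram", "redis", "ws", "heartbeat").
interface OutboundAdapter {
  send(reply: string, replyTo?: string): Promise<void>;
}

class OutboundRouter {
  private adapters = new Map<string, OutboundAdapter>();

  register(kind: string, adapter: OutboundAdapter): void {
    this.adapters.set(kind, adapter);
  }

  async dispatch(source: ChannelId, reply: string, replyTo?: string): Promise<void> {
    const kind = source.split(":")[0]; // "telegram:12345" → "telegram"

    // Suppress HEARTBEAT_OK; non-OK heartbeat replies go to whatever adapter
    // is registered under "heartbeat" (e.g. the notification channel).
    if (kind === "heartbeat" && reply.includes("HEARTBEAT_OK")) return;

    const adapter = this.adapters.get(kind);
    if (!adapter) {
      console.warn(`no outbound adapter registered for ${source}`);
      return;
    }
    await adapter.send(reply, replyTo);
  }
}
```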
### Telegram Channel (First Implementation)

```
Phone (Telegram) → Bot API → grammY handler → CommandQueue → pi session
                                                                 ↓
Phone (Telegram) ← Bot API ← Telegram outbound ← OutboundRouter ←─┘
```

- grammY bot with long polling (no webhook needed — runs on the tailnet)
- Allowlist: Joel’s Telegram user ID only
- Message types: text, photos (as image attachments), voice (future: Whisper transcription)
- Reply formatting: Markdown → Telegram HTML with chunk splitting (4000 char limit)
- Typing indicator while LLM is working
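A sketch of the inbound grammY handler, assuming the `CommandQueue` sketch above. The environment variable names are placeholders; grammY's `Bot`, filter queries, and `sendChatAction` are real API, but the wiring is illustrative.

```typescript
// channels/telegram.ts (sketch): allowlisted text messages go onto the queue.
import { Bot } from "grammy";

const ALLOWED_USER_ID = Number(process.env.TELEGRAM_ALLOWED_USER_ID); // placeholder env var

export function startTelegramChannel(queue: CommandQueue): Bot {
  const bot = new Bot(process.env.TELEGRAM_BOT_TOKEN!); // leased from agent-secrets at startup

  bot.on("message:text", async (ctx) => {
    if (ctx.from?.id !== ALLOWED_USER_ID) return; // allowlist: Joel only

    // Typing indicator while the LLM works (Telegram expires it after a few
    // seconds, so a real implementation re-sends it until the reply lands).
    await ctx.api.sendChatAction(ctx.chat.id, "typing");

    queue.push({
      source: `telegram:${ctx.chat.id}`,
      prompt: ctx.message.text,
      replyTo: String(ctx.chat.id),
    });
  });

  void bot.start(); // long polling; no webhook needed on the tailnet
  return bot;
}
```

The outbound adapter for this channel would convert Markdown to Telegram HTML, split replies into ≤4000-character chunks, and send each chunk with `bot.api.sendMessage(chatId, chunk, { parse_mode: "HTML" })`.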
### WebSocket Channel (TUI Attach)

```bash
# Attach to the running daemon from any terminal
joelclaw tui

# Or from Termius on the phone
ssh joel@mac-mini "joelclaw tui"
```

Protocol: JSON frames over WebSocket (simplified from OpenClaw’s protocol):

- `{type: "prompt", text: "..."}` — send a message
- `{type: "delta", text: "..."}` — streaming response chunk
- `{type: "done", fullText: "..."}` — response complete
- `{type: "status", ...}` — model, usage, session info
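A sketch of the server side of that protocol using `ws`, assuming the `CommandQueue` sketch above. The bind address and port are placeholders.

```typescript
// channels/websocket.ts (sketch): accepts prompt frames, fans deltas out to
// every attached client (TUI, future native app).
import { WebSocketServer, WebSocket } from "ws";

export function startWebSocketChannel(queue: CommandQueue) {
  const wss = new WebSocketServer({ host: "127.0.0.1", port: 8787 }); // placeholder port
  const clients = new Set<WebSocket>();

  wss.on("connection", (socket) => {
    clients.add(socket);
    socket.on("close", () => clients.delete(socket));

    socket.on("message", (raw) => {
      const frame = JSON.parse(raw.toString());
      if (frame.type === "prompt" && typeof frame.text === "string") {
        queue.push({ source: "ws", prompt: frame.text });
      }
    });
  });

  // Called by the outbound side: broadcast({ type: "delta", text: chunk }),
  // then { type: "done", fullText } when the response completes.
  function broadcast(frame: Record<string, unknown>): void {
    const data = JSON.stringify(frame);
    for (const client of clients) client.send(data);
  }

  return { wss, broadcast };
}
```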
## Build Plan

### Phase 1: Daemon + Redis (replace current extension) ✅

- Create `packages/gateway/` in monorepo
- `daemon.ts` — entry point, `createAgentSession()`, launchd lifecycle
- `command-queue.ts` — sequential input serialization
- `channels/redis.ts` — port existing Redis bridge from extension (sketched after this list)
- `heartbeat.ts` — `setInterval` runner, reads HEARTBEAT.md, watchdog (30 min threshold), tripwire file
- Update `com.joel.gateway` plist to run daemon directly (no tmux)
- Verify: Redis events flow through pi session, responses logged
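A sketch of the inbound Redis bridge with `ioredis`, assuming the `CommandQueue` sketch above. The channel name and event shape are placeholders; the real key layout comes from the existing extension being ported.

```typescript
// channels/redis.ts (sketch): subscribes to the Inngest event bridge and
// enqueues each event as a prompt for the pi session.
import Redis from "ioredis";

export function startRedisChannel(queue: CommandQueue): Redis {
  const sub = new Redis(); // defaults to localhost:6379; configure as needed

  void sub.subscribe("joelclaw:gateway:inbox"); // placeholder channel name
  sub.on("message", (_channel, payload) => {
    const event = JSON.parse(payload); // event shape is illustrative
    queue.push({
      source: "redis",
      prompt: event.prompt ?? payload,
      metadata: { event },
    });
  });

  return sub;
}
```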
### Phase 2: Telegram ✅

- `channels/telegram.ts` — grammY bot, user allowlist, text/photo/voice handlers
- Outbound: markdown → Telegram HTML conversion, 4000 char chunking, typing indicator
- Response routing via `session.subscribe()` delta collection → source channel dispatch
- Bot token in `agent-secrets` (leased at startup via `gateway-start.sh`)
- Created @JoelClawPandaBot via @BotFather
- Verified: full round-trip — phone → Telegram → pi session → Telegram → phone
### Phase 3: WebSocket + TUI

- `channels/websocket.ts` — WS server on localhost (Tailscale accessible)
- `joelclaw tui` CLI command — connects to daemon WS, renders in terminal
- Stream deltas to connected clients
- Auth: Tailscale identity or simple token
### Phase 4: Native App Foundation
- WebSocket protocol stabilized
- Session info endpoint (model, usage, messages)
- Consider React Native or SwiftUI for iOS
- Consider whether to port OpenClaw’s mobile node protocol
## Considered Options
### Option 1: Telegram bot on current extension (rejected as long-term)
Quick win (~1 hour) but doesn’t solve the fundamental limitations. The extension can’t control session lifecycle, can’t stream, can’t properly route replies. Would need to be rewritten anyway.
### Option 2: OpenClaw deployment (rejected — ADR-0003)
OpenClaw has everything we want, but it’s a different system with different opinions about configuration, channel management, and multi-agent orchestration. We’ve already diverged significantly (Inngest over job queues, Qdrant over SQLite, k8s over localhost). Embedding pi directly gives us the session management without the rest.
### Option 3: Embedded pi daemon (chosen)
Best of both worlds: OpenClaw’s proven architecture pattern (embedded pi, command queue, channel plugins) with joelclaw’s infrastructure (Inngest, Redis, k8s, Tailscale). We own the daemon code, control the channel implementations, and can evolve at our own pace.
## Consequences

### Positive
- Talk to the agent from Telegram (phone), WebSocket (any terminal), and future native app
- Streaming responses to all channels
- No tmux PTY hack — pure headless Node.js daemon
- Same session, skills, extensions, tools as interactive pi
- Foundation for native iOS/macOS app
- Outbound delivery: agent can proactively message Joel on Telegram (not just respond)
### Negative

- More code to maintain (daemon + channels vs. extension)
- `pi` command no longer used for central session (it’s embedded in the daemon)
- Need to build TUI attach for terminal access (or use Termius → `joelclaw tui`)
- Telegram bot token is a new secret to manage
- Channel-specific formatting (Telegram HTML, Discord markdown) is ongoing work
### Non-goals (for now)
- Multi-agent: one daemon = one pi session. Subagents are future work.
- Voice: Telegram voice messages → Whisper transcription is Phase 2+.
- Group chats: Bot responds only in DMs with Joel.
- End-to-end encryption: Tailscale provides transport security.
## Implementation
### Affected Paths
| Path | Change |
|---|---|
| `packages/gateway/` | New package — daemon, channels, outbound, heartbeat |
| `~/Library/LaunchAgents/com.joel.gateway.plist` | Updated: runs daemon directly, no tmux |
| `~/.joelclaw/scripts/gateway-start.sh` | Simplified: just exec the daemon |
| `~/.pi/agent/extensions/gateway/` | Deprecated: functionality moves into daemon |
| `packages/cli/src/commands/gateway.ts` | Add `tui` subcommand for WebSocket attach |
### Dependencies
| Package | Purpose |
|---|---|
| `@mariozechner/pi-coding-agent` | Pi SDK — `createAgentSession`, tools, extensions |
| `@mariozechner/pi-ai` | Model selection (`getModel`) |
| `grammy` | Telegram Bot API |
| `ioredis` | Redis pub/sub bridge |
| `ws` | WebSocket server |
## Verification

- `createAgentSession()` works headless (no TUI, no terminal)
- Extensions and skills auto-discovered from `~/.pi/agent/`
- AGENTS.md loaded as system prompt context
- Heartbeat fires every 15 min, `HEARTBEAT_OK` filtered
- Redis events from Inngest flow through to session
- Telegram message → LLM response → Telegram reply (round-trip)
- WebSocket streaming deltas to connected client
- launchd restart on crash (`KeepAlive`)
- Session file persists across daemon restarts
- Satellite pi sessions still get targeted notifications