ADR-0038 (implemented)

Embed pi as a library in a joelclaw gateway daemon

Context and Problem Statement

The current central gateway session (ADR-0036) runs pi inside tmux, managed by launchd. A gateway extension injects events via sendUserMessage(). This works for Redis-based notifications but has fundamental limitations:

  1. No mobile access: Joel currently SSHes into the Mac Mini from Termius on his phone to interact with pi. This works but fights the medium (tiny keyboard, no streaming, terminal rendering issues).

  2. No multi-channel routing: Replies stay inside the pi TUI. There’s no way to route a response to Telegram, Slack, or a native app. The agent can receive events but can’t talk back through the channel that asked.

  3. No streaming: sendUserMessage() is fire-and-forget. The extension can’t stream LLM deltas to external clients.

  4. tmux PTY hack: pi is a TUI app that needs a terminal. The tmux wrapper adds complexity and an extra process layer. OpenClaw solved this by embedding pi as a library, so no terminal is needed.

  5. Extension limitations: The gateway extension can inject prompts and drain events, but it can’t control the session lifecycle, model selection, compaction, or routing.

What We Want

Talk to the agent from anywhere:

  • Telegram — Send a message from your phone, get a response
  • Native iOS/macOS app — Purpose-built UI (future, on roadmap)
  • WebSocket — Attach from any terminal (like openclaw tui)
  • Redis bridge — Inngest events still flow in (existing infrastructure)
  • All inputs serialize through one session — Same conversation, same memory, same context

How OpenClaw Does It

OpenClaw’s gateway daemon (src/macos/gateway-daemon.ts) is a standalone Node.js process that:

  1. Embeds pi via createAgentSession() from @mariozechner/pi-coding-agent
  2. Serializes all inputs through a CommandQueue with lanes — TUI, heartbeat, Telegram, Discord, etc. all go through one queue into one pi session
  3. Routes replies back through channel-specific outbound adapters (Telegram HTML chunks, Discord markdown, WhatsApp formatting, etc.)
  4. Streams deltas to connected WebSocket clients (TUI, mobile app)
  5. Runs as a launchd daemon (KeepAlive: true), no terminal needed
  6. Manages channels via a plugin system — each channel implements ChannelPlugin (config, gateway lifecycle, outbound adapter, status probes)

The TUI (openclaw tui) is a WebSocket client that connects to the running daemon — it doesn’t run pi directly.

Decision

Build a joelclaw gateway daemon that embeds pi as a library, replacing the current tmux + extension approach. Start with Telegram as the first external channel.

Architecture

launchd (com.joel.gateway)
  → joelclaw-gateway daemon (Node.js)
    ├── createAgentSession() — owns the LLM conversation
    ├── CommandQueue — serializes all inputs
    ├── HeartbeatRunner — periodic checklist (setInterval)
    ├── Channels:
    │   ├── Redis — Inngest event bridge (existing)
    │   ├── Telegram — grammY bot (first external channel)
    │   ├── WebSocket — TUI attach + future native app
    │   └── (future: Discord, Slack, iMessage, web)
    ├── OutboundRouter — route replies to source channel
    └── Watchdog — heartbeat staleness detection (ADR-0037)

Session Ownership

The daemon owns the pi session via createAgentSession(). This gives us:

  • Full control over model, thinking level, compaction
  • session.prompt() for synchronous prompt/response
  • session.subscribe() for streaming deltas to channels
  • session.sendUserMessage() with followUp for async injection
  • Same extensions, skills, tools as interactive pi (auto-discovered from ~/.pi/agent/)
  • Persistent session file (conversation survives restart)
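As a rough illustration of how subscribe() and prompt() might combine to stream deltas while also collecting the full reply, here is a minimal sketch. SessionLike, SessionEvent, promptAndCollect, and FakeSession are hypothetical stand-ins invented for this example; the real types and event shapes in @mariozechner/pi-coding-agent may differ.

```typescript
// Hypothetical stand-in for the part of the pi session used here.
// The real SDK's event names and signatures are NOT confirmed by this ADR.
type SessionEvent =
  | { type: "delta"; text: string }
  | { type: "done" };

interface SessionLike {
  prompt(text: string): Promise<void>;
  // Assumed to return an unsubscribe function.
  subscribe(listener: (event: SessionEvent) => void): () => void;
}

// Send one prompt, forward each delta as it arrives, resolve with the full reply.
async function promptAndCollect(
  session: SessionLike,
  text: string,
  onDelta: (chunk: string) => void,
): Promise<string> {
  const parts: string[] = [];
  const unsubscribe = session.subscribe((event) => {
    if (event.type === "delta") {
      parts.push(event.text);
      onDelta(event.text);
    }
  });
  try {
    await session.prompt(text);
  } finally {
    unsubscribe(); // do not leak listeners from one prompt into the next
  }
  return parts.join("");
}

// In-memory fake that echoes the prompt word by word; only for exercising the sketch.
class FakeSession implements SessionLike {
  private listeners = new Set<(event: SessionEvent) => void>();

  subscribe(listener: (event: SessionEvent) => void): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  async prompt(text: string): Promise<void> {
    for (const word of text.split(" ")) {
      this.listeners.forEach((listener) => listener({ type: "delta", text: word + " " }));
    }
    this.listeners.forEach((listener) => listener({ type: "done" }));
  }
}
```

A channel would pass its own onDelta callback (for example, one that writes WebSocket frames) and use the resolved string for channels that only want the finished reply.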

Command Queue

All inputs serialize through one queue (adapted from OpenClaw’s CommandLane):

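type ChannelId = string; // assumed alias; the real type may be narrower

type QueueEntry = {
  source: ChannelId;     // origin of the input: "telegram:12345", "redis", "ws:abc", "heartbeat"
  prompt: string;        // text handed to the pi session
  replyTo?: string;      // channel-specific reply target
  metadata?: Record<string, unknown>; // optional channel-specific extras
};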

The queue drains sequentially — one prompt at a time. While the LLM is responding, new messages queue up (OpenClaw calls this the “main lane”).
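The sequential drain described above can be sketched with a promise chain. This is a simplified illustration, not the daemon's actual command-queue.ts: the CommandQueue class shape, the Handler type, and the depth getter are assumptions, and the QueueEntry type is repeated only so the snippet stands alone.

```typescript
type ChannelId = string; // assumed alias, e.g. "telegram:12345", "redis", "heartbeat"

type QueueEntry = {
  source: ChannelId;
  prompt: string;
  replyTo?: string;
  metadata?: Record<string, unknown>;
};

// Hypothetical handler: runs one prompt against the session and returns the reply.
type Handler = (entry: QueueEntry) => Promise<string>;

class CommandQueue {
  private tail: Promise<unknown> = Promise.resolve();
  private pending = 0;
  private readonly handler: Handler;

  constructor(handler: Handler) {
    this.handler = handler;
  }

  // Number of entries queued or currently running.
  get depth(): number {
    return this.pending;
  }

  // Entries run strictly one at a time, in arrival order.
  enqueue(entry: QueueEntry): Promise<string> {
    this.pending += 1;
    const run = this.tail.then(() => this.handler(entry));
    // Keep the chain alive even if this entry fails, so later entries still run.
    this.tail = run.then(
      () => {
        this.pending -= 1;
      },
      () => {
        this.pending -= 1;
      },
    );
    return run;
  }
}
```

Because each entry waits on the previous one, a Telegram message that arrives while a heartbeat prompt is still running simply sits in the chain until the session is free. The caller still sees a rejection from enqueue() if its own entry fails.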

Outbound Routing

When the LLM responds, the reply routes back to the channel that sent the prompt:

  • Telegram → Format as Telegram HTML, send via grammY
  • Redis → Push to joelclaw:events:{sessionId} (satellite notification)
  • WebSocket → Stream deltas as JSON frames
  • Heartbeat → Filter HEARTBEAT_OK (suppress), deliver non-OK to notification channel
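The routing rules above can be expressed as a pure function that maps a source ID and reply text to a delivery plan. This is a sketch under assumptions: the Delivery type, the routeReply name, the "notify" channel label, the exact-match treatment of HEARTBEAT_OK, and passing sessionId as a parameter are all hypothetical choices made for illustration.

```typescript
// Hypothetical delivery plan; a real router would hand this to a channel adapter.
type Delivery =
  | { channel: "telegram"; chatId: string; text: string }
  | { channel: "redis"; key: string; text: string }
  | { channel: "ws"; clientId: string; text: string }
  | { channel: "notify"; text: string };

const HEARTBEAT_OK = "HEARTBEAT_OK";

// Decide where a reply goes based on the source that sent the prompt.
// Returns null when the reply should be suppressed.
function routeReply(source: string, reply: string, sessionId: string): Delivery | null {
  // Source IDs look like "telegram:12345", "ws:abc", "redis", or "heartbeat".
  const separator = source.indexOf(":");
  const kind = separator === -1 ? source : source.slice(0, separator);
  const target = separator === -1 ? "" : source.slice(separator + 1);

  switch (kind) {
    case "telegram":
      return { channel: "telegram", chatId: target, text: reply };
    case "redis":
      return { channel: "redis", key: `joelclaw:events:${sessionId}`, text: reply };
    case "ws":
      return { channel: "ws", clientId: target, text: reply };
    case "heartbeat":
      // Assumption: a reply that is exactly HEARTBEAT_OK (ignoring whitespace) is noise.
      return reply.trim() === HEARTBEAT_OK ? null : { channel: "notify", text: reply };
    default:
      // Fail loudly rather than silently dropping a reply.
      throw new Error(`no outbound route for source "${source}"`);
  }
}
```

Keeping the decision separate from the actual send makes the routing table easy to test without a Telegram bot or Redis connection.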

Telegram Channel (First Implementation)

Phone (Telegram) → Bot API → grammY handler → CommandQueue → pi session
                                                                  │
Phone (Telegram) ← Bot API ← Telegram outbound ← OutboundRouter ←─┘

  • grammY bot with long polling (no webhook needed — runs on the tailnet)
  • Allowlist: Joel’s Telegram user ID only
  • Message types: text, photos (as image attachments), voice (future: Whisper transcription)
  • Reply formatting: Markdown → Telegram HTML with chunk splitting (4000 char limit)
  • Typing indicator while LLM is working
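For the chunk splitting mentioned above, one possible approach is to split the raw text first and escape each chunk for Telegram HTML afterwards, so an escaped entity such as &amp;amp; is never cut in half. The sketch below is an illustration only: splitIntoChunks and escapeTelegramHtml are hypothetical helper names, it assumes the 4000-character limit is counted on the text before escaping, and it does not attempt the Markdown conversion itself.

```typescript
const TELEGRAM_CHUNK_LIMIT = 4000; // limit taken from the text above

// Telegram's HTML parse mode requires these three characters to be escaped.
function escapeTelegramHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Split text into pieces of at most `limit` characters, preferring to break
// right after a newline so paragraphs and list items stay intact.
function splitIntoChunks(text: string, limit: number = TELEGRAM_CHUNK_LIMIT): string[] {
  if (limit <= 0) throw new RangeError("limit must be positive");
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    // Last newline that still fits inside the current window, if any.
    const newline = rest.lastIndexOf("\n", limit - 1);
    const cut = newline < 0 ? limit : newline + 1;
    chunks.push(rest.slice(0, cut));
    rest = rest.slice(cut);
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}

// Produce ready-to-send HTML chunks from plain reply text.
function toTelegramChunks(reply: string): string[] {
  return splitIntoChunks(reply).map(escapeTelegramHtml);
}
```

Concatenating the unescaped chunks reproduces the original text exactly, which makes the splitter easy to verify. A hard cut with no newline in range can still land in the middle of a word or a surrogate pair, so a production version may want a smarter boundary search.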

WebSocket Channel (TUI Attach)

# Attach to the running daemon from any terminal
joelclaw tui
 
# Or from Termius on the phone
ssh joel@mac-mini "joelclaw tui"

Protocol: JSON frames over WebSocket (simplified from OpenClaw’s protocol):

  • {type: "prompt", text: "..."} — send a message
  • {type: "delta", text: "..."} — streaming response chunk
  • {type: "done", fullText: "..."} — response complete
  • {type: "status", ...} — model, usage, session info
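The four frame types listed above map naturally onto a discriminated union. The following sketch shows one way to type and validate them; ClientFrame, ServerFrame, parseClientFrame, and framesForReply are hypothetical names, and the status frame is left open-ended because its fields are not spelled out here.

```typescript
// Frames a client may send to the daemon.
type ClientFrame = { type: "prompt"; text: string };

// Frames the daemon may send to a client.
type ServerFrame =
  | { type: "delta"; text: string }
  | { type: "done"; fullText: string }
  | { type: "status"; [field: string]: unknown }; // fields beyond `type` are unspecified

// Parse and validate an incoming WebSocket text message.
// Returns null for malformed JSON or any frame a client is not allowed to send.
function parseClientFrame(raw: string): ClientFrame | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;
  const frame = value as Record<string, unknown>;
  const text = frame["text"];
  if (frame["type"] === "prompt" && typeof text === "string") {
    return { type: "prompt", text };
  }
  return null;
}

// Build the frame sequence for one reply: a delta per chunk, then a final done frame.
function framesForReply(deltas: string[]): ServerFrame[] {
  const frames: ServerFrame[] = deltas.map(
    (text): ServerFrame => ({ type: "delta", text }),
  );
  frames.push({ type: "done", fullText: deltas.join("") });
  return frames;
}

function encodeServerFrame(frame: ServerFrame): string {
  return JSON.stringify(frame);
}
```

Validating at the boundary means the rest of the daemon can rely on the frame types instead of re-checking untrusted JSON.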

Build Plan

Phase 1: Daemon + Redis (replace current extension) ✅

  • Create packages/gateway/ in monorepo
  • daemon.ts — entry point, createAgentSession(), launchd lifecycle
  • command-queue.ts — sequential input serialization
  • channels/redis.ts — port existing Redis bridge from extension
  • heartbeat.ts: setInterval runner, reads HEARTBEAT.md, watchdog (30 min threshold), tripwire file
  • Update com.joel.gateway plist to run daemon directly (no tmux)
  • Verify: Redis events flow through pi session, responses logged
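The heartbeat runner and its staleness watchdog could look roughly like the sketch below. It is an assumption-laden illustration rather than the shipped heartbeat.ts: the HeartbeatRunner class shape, the injectable clock, and the beat callback are hypothetical, the 15-minute interval and 30-minute threshold are taken from elsewhere in this ADR, and reading HEARTBEAT.md and writing the tripwire file are left out.

```typescript
const HEARTBEAT_INTERVAL_MS = 15 * 60 * 1000; // heartbeat cadence stated under Verification
const STALE_THRESHOLD_MS = 30 * 60 * 1000;    // watchdog threshold stated above

// A heartbeat is stale once more than `thresholdMs` has passed since the last beat.
function isHeartbeatStale(
  lastBeatAtMs: number,
  nowMs: number,
  thresholdMs: number = STALE_THRESHOLD_MS,
): boolean {
  return nowMs - lastBeatAtMs > thresholdMs;
}

class HeartbeatRunner {
  private timer: ReturnType<typeof setInterval> | undefined;
  private lastBeatAtMs: number;
  private readonly beat: () => void;
  private readonly now: () => number;

  // `beat` would typically enqueue the heartbeat prompt; `now` is injectable for tests.
  constructor(beat: () => void, now: () => number = Date.now) {
    this.beat = beat;
    this.now = now;
    this.lastBeatAtMs = now();
  }

  start(intervalMs: number = HEARTBEAT_INTERVAL_MS): void {
    if (this.timer !== undefined) return; // already running
    this.timer = setInterval(() => this.tick(), intervalMs);
  }

  stop(): void {
    if (this.timer !== undefined) {
      clearInterval(this.timer);
      this.timer = undefined;
    }
  }

  // Run one beat. The timestamp only advances if `beat` did not throw,
  // so repeated failures eventually show up as staleness.
  tick(): void {
    this.beat();
    this.lastBeatAtMs = this.now();
  }

  isStale(): boolean {
    return isHeartbeatStale(this.lastBeatAtMs, this.now());
  }
}
```

An external watchdog could call isStale() on its own schedule and raise an alert when it returns true.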

Phase 2: Telegram ✅

  • channels/telegram.ts — grammY bot, user allowlist, text/photo/voice handlers
  • Outbound: markdown → Telegram HTML conversion, 4000 char chunking, typing indicator
  • Response routing via session.subscribe() delta collection → source channel dispatch
  • Bot token in agent-secrets (leased at startup via gateway-start.sh)
  • Created @JoelClawPandaBot via @BotFather
  • Verified: full round-trip — phone → Telegram → pi session → Telegram → phone

Phase 3: WebSocket + TUI

  • channels/websocket.ts — WS server on localhost (Tailscale accessible)
  • joelclaw tui CLI command — connects to daemon WS, renders in terminal
  • Stream deltas to connected clients
  • Auth: Tailscale identity or simple token

Phase 4: Native App Foundation

  • WebSocket protocol stabilized
  • Session info endpoint (model, usage, messages)
  • Consider React Native or SwiftUI for iOS
  • Consider whether to port OpenClaw’s mobile node protocol

Considered Options

Option 1: Telegram bot on current extension (rejected as long-term)

Quick win (~1 hour) but doesn’t solve the fundamental limitations. The extension can’t control session lifecycle, can’t stream, can’t properly route replies. Would need to be rewritten anyway.

Option 2: OpenClaw deployment (rejected — ADR-0003)

OpenClaw has everything we want, but it’s a different system with different opinions about configuration, channel management, and multi-agent orchestration. We’ve already diverged significantly (Inngest over job queues, Qdrant over SQLite, k8s over localhost). Embedding pi directly gives us the session management without the rest.

Option 3: Embedded pi daemon (chosen)

Best of both worlds: OpenClaw’s proven architecture pattern (embedded pi, command queue, channel plugins) with joelclaw’s infrastructure (Inngest, Redis, k8s, Tailscale). We own the daemon code, control the channel implementations, and can evolve at our own pace.

Consequences

Positive

  • Talk to the agent from Telegram (phone), WebSocket (any terminal), and future native app
  • Streaming responses to all channels
  • No tmux PTY hack — pure headless Node.js daemon
  • Same session, skills, extensions, tools as interactive pi
  • Foundation for native iOS/macOS app
  • Outbound delivery: agent can proactively message Joel on Telegram (not just respond)

Negative

  • More code to maintain (daemon + channels vs. extension)
  • pi command no longer used for central session (it’s embedded in the daemon)
  • Need to build TUI attach for terminal access (or use Termius → joelclaw tui)
  • Telegram bot token is a new secret to manage
  • Channel-specific formatting (Telegram HTML, Discord markdown) is ongoing work

Non-goals (for now)

  • Multi-agent: one daemon = one pi session. Subagents are future work.
  • Voice: Telegram voice messages → Whisper transcription is Phase 2+.
  • Group chats: Bot responds only in DMs with Joel.
  • End-to-end encryption: Tailscale provides transport security.

Implementation

Affected Paths

Path                                           Change
packages/gateway/                              New package: daemon, channels, outbound, heartbeat
~/Library/LaunchAgents/com.joel.gateway.plist  Updated: runs daemon directly, no tmux
~/.joelclaw/scripts/gateway-start.sh           Simplified: just exec the daemon
~/.pi/agent/extensions/gateway/                Deprecated: functionality moves into daemon
packages/cli/src/commands/gateway.ts           Add tui subcommand for WebSocket attach

Dependencies

Package                        Purpose
@mariozechner/pi-coding-agent  Pi SDK: createAgentSession, tools, extensions
@mariozechner/pi-ai            Model selection (getModel)
grammy                         Telegram Bot API
ioredis                        Redis pub/sub bridge
ws                             WebSocket server

Verification

  • createAgentSession() works headless (no TUI, no terminal)
  • Extensions and skills auto-discovered from ~/.pi/agent/
  • AGENTS.md loaded as system prompt context
  • Heartbeat fires every 15 min, HEARTBEAT_OK filtered
  • Redis events from Inngest flow through to session
  • Telegram message → LLM response → Telegram reply (round-trip)
  • WebSocket streaming deltas to connected client
  • launchd restart on crash (KeepAlive)
  • Session file persists across daemon restarts
  • Satellite pi sessions still get targeted notifications