ADR-0227 (proposed)

Clawnode — Mesh Client Daemon

Context

joelclaw’s services (Redis, Inngest, Typesense, gateway) run on Panda (M4 Pro Mac Mini). Other dev machines on the Tailscale mesh (e.g. triangle-2 for gremlin work) have no native access to these services. Working on another machine means SSH’ing into Panda or losing access to project memory, event submission, gateway messaging, and background jobs.

OpenClaw solves a similar problem with a hub-and-spoke model: a central Gateway with WebSocket-connected “nodes” that advertise capabilities, maintain presence, and use Bonjour/mDNS + Tailscale for discovery.

Decision

Build clawnode — a native daemon that runs on any Tailscale-connected machine and provides local access to joelclaw’s service mesh.

Architecture

any-machine
└── clawnode (single Rust binary, always-on via launchd/systemd)
    ├── Embedded PDS (Axum + SQLite)
    │   ├── Own DID (did:plc:<machine-name>)
    │   ├── XRPC endpoints (:2583)
    │   ├── dev.joelclaw.node.* records (presence, capabilities, services)
    │   └── Firehose WebSocket (subscribeRepos)
    ├── Mesh subscriber (watches other nodes' firehoses for state changes)
    ├── Service proxy
    │   ├── Redis → panda:6379 (gateway events, mail, pub/sub)
    │   ├── Typesense → panda via Tailscale Funnel (memory read/write)
    │   └── Inngest → panda:8288 (send events, start jobs)
    ├── Local Unix socket API (for agents/CLIs on this machine)
    └── CLI subcommands (clawnode status, send, recall, etc.)

Daemon Responsibilities

  1. Service proxy — maintain persistent connections to Panda’s services; expose them via local Unix socket so agents don’t need network config
  2. Presence — announce this machine’s identity and capabilities to the mesh
  3. Memory sync — read/write observations, recall queries, project memory
  4. Event submission — send Inngest events, start background jobs
  5. Gateway messaging — send/receive messages through the gateway’s Redis bridge
  6. Reconnect & health — handle network interruptions, report connectivity status
  7. Service discovery — resolve Panda’s services (static config initially, PDS-backed later)

Single Binary: clawnode

The daemon and CLI are one binary. Running clawnode starts the daemon; subcommands talk to it:

clawnode                   # start daemon (foreground, launchd manages lifecycle)
clawnode status            # daemon health + service connectivity
clawnode send <event>      # fire Inngest event
clawnode recall <query>    # semantic memory search
clawnode memory write "..."  # write observation
clawnode gateway send "..."  # message the gateway
clawnode jobs start|status # background work
clawnode mail send|inbox   # agent coordination

No vault commands, no deploy, no k8s, no worker management. Those are operator concerns that stay on Panda.
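
The subcommand surface above can be modeled as a small enum. This is a hand-rolled sketch; the real binary would likely use a crate like clap, and the variant shapes here are assumptions:

```rust
/// Hypothetical subcommand model for the single-binary CLI.
#[derive(Debug, PartialEq)]
enum Command {
    Daemon,                   // bare `clawnode`: run the daemon in the foreground
    Status,                   // daemon health + service connectivity
    Send { event: String },   // fire an Inngest event
    Recall { query: String }, // semantic memory search
}

fn parse(args: &[&str]) -> Option<Command> {
    match args {
        [] => Some(Command::Daemon),
        ["status"] => Some(Command::Status),
        ["send", event] => Some(Command::Send { event: event.to_string() }),
        ["recall", rest @ ..] if !rest.is_empty() => {
            Some(Command::Recall { query: rest.join(" ") })
        }
        _ => None, // unknown subcommand: print usage and exit nonzero
    }
}

fn main() {
    let argv: Vec<String> = std::env::args().skip(1).collect();
    let refs: Vec<&str> = argv.iter().map(String::as_str).collect();
    println!("{:?}", parse(&refs));
}
```

Every subcommand except the bare daemon invocation would be a thin client over the Unix socket.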

Language Choice

Rust daemon with embedded PDS + optional Swift menu bar UI. The daemon is long-running with async I/O (tokio), reconnect logic, and low memory footprint. Rust gives a single static binary with no runtime dependency.

Each clawnode instance IS a Personal Data Server — not a client to one. The PDS is embedded directly in the daemon binary using:

  • atrium crates (MIT, ★410) — code-generated AT Proto types from lexicons, XRPC primitives
  • Axum — HTTP server for XRPC endpoints
  • libSQL (Turso’s SQLite fork) — local record storage with native vector search (F32_BLOB columns, DiskANN indexing, vector_top_k() queries). Single .db file, no external DB. Enables local semantic memory/recall on every node.
  • tokio — async runtime for firehose WebSocket, reconnect, health
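
A sketch of how a recall query might use libSQL's vector_top_k() table-valued function. The table and column names (observations, embedding index) are illustrative; vector_top_k, vector32, and F32_BLOB are the libSQL features named above:

```rust
/// Builds a libSQL vector-search query: top-k nearest observations
/// by embedding distance, joined back to the source rows.
/// `index` is the name of a DiskANN index on an F32_BLOB column.
fn recall_sql(index: &str, k: usize) -> String {
    format!(
        "SELECT o.id, o.body \
         FROM vector_top_k('{index}', vector32(?1), {k}) AS t \
         JOIN observations o ON o.rowid = t.id"
    )
}

fn main() {
    println!("{}", recall_sql("observations_idx", 5));
}
```

The ?1 parameter would be bound to the query embedding (local or remote, per the modes below).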

Embeddings are configurable per-node:

  • Remote (default): call Panda’s embedding service (Typesense or dedicated endpoint). No local model needed.
  • Local (opt-in): run all-MiniLM-L12-v2 or similar small model locally. 33MB, pure CPU inference.
  • Hybrid: embed locally for writes, query remote for higher-quality results.

This means every node has local semantic memory — clawnode recall works even when Panda is offline.
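
The three modes could be captured in a per-node config enum. The variant fields and backend-selection rules below are a sketch of the hybrid behavior described above (embed locally on write, query remotely), not settled config keys:

```rust
/// Per-node embedding configuration, mirroring the three modes above.
/// Endpoint and model names are placeholders.
#[derive(Debug, Clone, PartialEq)]
enum EmbeddingMode {
    Remote { endpoint: String },                // default: Panda's embedding service
    Local { model: String },                    // opt-in: e.g. all-MiniLM-L12-v2, CPU-only
    Hybrid { endpoint: String, model: String }, // local writes, remote queries
}

impl EmbeddingMode {
    /// Which backend embeds new observations on write?
    fn write_backend(&self) -> &'static str {
        match self {
            EmbeddingMode::Remote { .. } => "remote",
            EmbeddingMode::Local { .. } | EmbeddingMode::Hybrid { .. } => "local",
        }
    }

    /// Which backend embeds recall queries?
    fn query_backend(&self) -> &'static str {
        match self {
            EmbeddingMode::Local { .. } => "local",
            EmbeddingMode::Remote { .. } | EmbeddingMode::Hybrid { .. } => "remote",
        }
    }
}

fn main() {
    let mode = EmbeddingMode::Hybrid {
        endpoint: "https://panda.tail7af24.ts.net/embed".into(), // placeholder path
        model: "all-MiniLM-L12-v2".into(),
    };
    println!("write={}, query={}", mode.write_backend(), mode.query_backend());
}
```

A node in Local mode keeps recall working offline, which is what makes the "works even when Panda is offline" claim hold.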

Reference implementations studied: DrChat/bluepds (Axum+Atrium+SQLite), blacksky-algorithms/rsky (full Rust AT Proto stack), tranquil.farm/tranquil-pds.

Required XRPC surface (minimal PDS):

  • com.atproto.repo.createRecord / getRecord / listRecords / deleteRecord
  • com.atproto.repo.describeRepo
  • com.atproto.sync.subscribeRepos (firehose WebSocket)
  • com.atproto.server.createSession / getSession
  • com.atproto.identity.resolveHandle
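
That surface is small enough to enumerate as a dispatch table. In the real daemon these would be Axum routes under /xrpc/; the handler names here are placeholders:

```rust
/// Maps an XRPC NSID to a handler name for the minimal PDS surface
/// listed above. Unknown methods get a 501 MethodNotImplemented.
fn route(nsid: &str) -> Option<&'static str> {
    match nsid {
        "com.atproto.repo.createRecord" => Some("create_record"),
        "com.atproto.repo.getRecord" => Some("get_record"),
        "com.atproto.repo.listRecords" => Some("list_records"),
        "com.atproto.repo.deleteRecord" => Some("delete_record"),
        "com.atproto.repo.describeRepo" => Some("describe_repo"),
        "com.atproto.sync.subscribeRepos" => Some("subscribe_repos"), // WebSocket upgrade
        "com.atproto.server.createSession" => Some("create_session"),
        "com.atproto.server.getSession" => Some("get_session"),
        "com.atproto.identity.resolveHandle" => Some("resolve_handle"),
        _ => None,
    }
}

fn main() {
    println!("{:?}", route("com.atproto.repo.getRecord"));
}
```

Keeping the surface this narrow is the argument for building on atrium directly rather than forking a full social-compat PDS (see Alternatives Considered).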

The Swift menu bar app (optional, later) talks to the daemon’s Unix socket and displays the claw SVG icon with service health.

Evolution Path

Phase 1 (now): Each clawnode IS a PDS from day one. Own DID, own SQLite repo, own firehose. Static config for Panda’s service endpoints (~/.clawnode/services.json). Nodes write dev.joelclaw.node.* records to their local PDS. Panda’s clawnode (or existing k8s PDS) subscribes to other nodes’ firehoses to build mesh state.
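
A sketch of what ~/.clawnode/services.json might contain, using the ports and Funnel URL from the Prerequisites section; the field names and layout are assumptions, not a settled schema:

```json
{
  "redis":     { "host": "panda", "port": 6379 },
  "inngest":   { "host": "panda", "port": 8288 },
  "typesense": { "url": "https://panda.tail7af24.ts.net/typesense" },
  "docs_api":  { "host": "panda", "port": 3838 }
}
```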

Phase 2 (next): Service discovery via PDS records. Instead of static config, nodes read dev.joelclaw.node.service records from peer PDS instances to discover available services. The ServiceRegistry trait abstracts discovery — static config and PDS-backed are both adapters behind the same interface. Nodes subscribe to each other’s firehoses for real-time mesh updates.
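
The ServiceRegistry trait might look like the following; the method shape is a sketch, and only the Phase 1 static adapter is shown:

```rust
use std::collections::HashMap;

/// Discovery abstraction: static config (Phase 1) and PDS-backed
/// lookup (Phase 2) are adapters behind one trait.
trait ServiceRegistry {
    /// Resolve a service name to an endpoint, e.g. "redis" -> "panda:6379".
    fn resolve(&self, service: &str) -> Option<String>;
}

/// Phase 1 adapter: endpoints loaded from ~/.clawnode/services.json.
struct StaticConfig(HashMap<String, String>);

impl ServiceRegistry for StaticConfig {
    fn resolve(&self, service: &str) -> Option<String> {
        self.0.get(service).cloned()
    }
}

// A Phase 2 `PdsRegistry` would implement the same trait by reading
// dev.joelclaw.node.service records from peer PDS instances.

fn main() {
    let mut endpoints = HashMap::new();
    endpoints.insert("redis".to_string(), "panda:6379".to_string());
    let registry = StaticConfig(endpoints);
    println!("{:?}", registry.resolve("redis"));
}
```

Swapping adapters should be invisible to the service proxy, which only ever calls resolve().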

Phase 3 (later): Full mesh autonomy. Nodes can operate independently if Panda goes down — they have their own PDS with local data. Cross-node record replication for shared state (memory observations, agent messages). Optional relay service on Panda aggregates all node firehoses into a single stream. Account/repo portability means a machine’s identity survives hardware replacement — export CAR file, import on new machine, update DID doc.

Prerequisites

  • Typesense — already exposed via Tailscale Funnel at https://panda.tail7af24.ts.net/typesense (used by Vercel production). Clawnode can use the same endpoint or hit Panda’s Tailscale IP directly. No k8s change needed.
  • Service port inventory — Redis :6379, Inngest :8288/:8289, Typesense via Funnel, docs-api :3838 are already reachable over Tailscale
  • Unix socket protocol — define the JSON-RPC or similar contract between daemon and CLI
  • PDS session auth — clawnode needs an AT Proto session to the joelclaw PDS for record read/write
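
One possible shape for the daemon-to-CLI contract is newline-delimited JSON-RPC 2.0 over the Unix socket. The framing and the method name below are assumptions (the prerequisite only says "JSON-RPC or similar"):

```rust
/// Frames a JSON-RPC 2.0 request as a single newline-terminated line.
/// `params_json` must already be a valid JSON value.
fn request(id: u64, method: &str, params_json: &str) -> String {
    format!(
        "{{\"jsonrpc\":\"2.0\",\"id\":{id},\"method\":\"{method}\",\"params\":{params_json}}}\n"
    )
}

fn main() {
    // What `clawnode recall "gremlin"` might write to the socket:
    print!("{}", request(1, "memory.recall", "{\"query\":\"gremlin\"}"));
}
```

Newline framing keeps the CLI side trivial: write one line, read lines until the matching id arrives.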

Consequences

  • Any Tailscale-connected machine becomes a first-class joelclaw peer
  • Agents running anywhere in the mesh get native access to memory, events, gateway
  • Project memory (gremlin work on triangle-2) stays globally synchronized
  • No need to ship the full joelclaw CLI (operator tool) to every machine
  • New binary to build and maintain (Rust daemon)
  • PDS integration creates a natural path toward decentralized agent mesh
  • OpenClaw’s node pattern validates this approach; we can learn from their presence/discovery model

Alternatives Considered

  1. Make joelclaw CLI network-aware — add host config so the existing CLI talks to remote services. Rejected: the CLI does too much (deploy, k8s, vault). Shipping it everywhere means shipping the kitchen sink.
  2. SSH tunnel only — just SSH into Panda when needed. Rejected: doesn’t give agents native access; breaks the “machine as peer” model.
  3. TypeScript/Bun daemon — reuse SDK directly. Considered but: daemon needs solid async reconnect, low memory, long-running stability. Rust fits better. SDK types inform the protocol design.
  4. PDS as separate Docker container, clawnode as client — run the reference Bluesky PDS image alongside the daemon. Rejected: adds Docker dependency, splits one concern into two processes, and means the daemon isn’t self-contained. Embedding the PDS makes clawnode a single binary that IS the node — install it, run it, you’re on the mesh.
  5. Fork bluepds — closest existing Rust PDS. Considered but: shaped for Bluesky social compat, more surface area than we need. Building on atrium crates directly gives us exactly the PDS surface we need without social features we don’t.

References

  • OpenClaw node architecture: ~/Code/openclaw/openclaw/docs/concepts/presence.md, docs/platforms/mac/remote.md
  • OpenClaw gateway WebSocket node pattern: nodes connect with role: node, advertise permissions, maintain presence
  • Tailscale MagicDNS for service resolution
  • AT Proto PDS for future service discovery (ADR pending on PDS lexicon design)