CLI Design for AI Agents
Every agent harness can run a shell command and read stdout. Pi, Claude Code, Codex — doesn’t matter. That’s the universal interface. If your tool returns something an agent can parse, the agent can use it. If it returns a pretty table with ANSI colors, the agent is flying blind.
npx skills add joelhooks/joelclaw --skill cli-design
My system runs 35 Inngest functions, an always-on gateway, video transcription, email triage, meeting analysis. The agent operates all of it through one CLI: joelclaw. Not a REST API. Not an SDK. A CLI that returns JSON.
Design CLIs for agents first, and humans get a perfectly usable tool for free — pipe through jq. Design for humans first, and agents get nothing.
Principle 1: JSON always
No plain text. No tables. No color codes. No --json flag to opt into structured output. JSON is the default and only format.
joelclaw status
{
  "ok": true,
  "command": "joelclaw status",
  "result": {
    "server": { "ok": true, "url": "http://localhost:8288" },
    "worker": { "ok": true, "functions": 35 }
  },
  "next_actions": [
    { "command": "joelclaw functions", "description": "View registered functions" },
    { "command": "joelclaw runs --count 5", "description": "Recent runs" }
  ]
}
Every command. Every time. The agent never has to guess what format it’s getting.
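As a minimal sketch of what "JSON always" might look like in code: every command builds the same envelope and serializes it, with no alternate renderer to opt into. The helper names `success` and `render` are hypothetical and not taken from joelclaw itself; only the envelope fields come from the example above.

```typescript
// Hypothetical helpers; only the envelope field names come from the article.
interface NextAction {
  command: string;
  description: string;
}

interface SuccessEnvelope<T> {
  ok: true;
  command: string;
  result: T;
  next_actions: NextAction[];
}

// Build the success envelope for a command.
function success<T>(
  command: string,
  result: T,
  next_actions: NextAction[] = [],
): SuccessEnvelope<T> {
  return { ok: true, command, result, next_actions };
}

// The only renderer: one JSON document, no colors, no tables, no format flag.
function render(envelope: SuccessEnvelope<unknown>): string {
  return JSON.stringify(envelope, null, 2);
}

const statusOutput = render(
  success(
    "joelclaw status",
    {
      server: { ok: true, url: "http://localhost:8288" },
      worker: { ok: true, functions: 35 },
    },
    [{ command: "joelclaw functions", description: "View registered functions" }],
  ),
);
console.log(statusOutput);
```

Because there is a single code path to stdout, a command cannot accidentally fall back to human-formatted text.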
Principle 2: HATEOAS — tell the agent what to do next
This is the one that changes everything.
Every response includes next_actions — command templates the agent can run next. Not literal examples to copy-paste — templates with typed placeholders the agent fills in.
Standard POSIX/docopt syntax: <required> for positional args, [--flag <value>] for optional flags. When a params object is present, the command is a template. When it’s absent, the command is literal. The agent doesn’t need to know your CLI’s flag syntax — the template tells it everything.
{
  "ok": false,
  "command": "joelclaw send pipeline/video.download",
  "error": {
    "message": "Inngest server not responding",
    "code": "SERVER_UNREACHABLE"
  },
  "fix": "Start the Inngest server pod: kubectl rollout restart statefulset/inngest -n joelclaw",
  "next_actions": [
    { "command": "joelclaw status", "description": "Re-check after fix" },
    {
      "command": "kubectl get pods [--namespace <ns>]",
      "description": "Check pod status",
      "params": { "ns": { "default": "joelclaw" } }
    }
  ]
}
The next_actions are contextual: they change based on what just happened. A failed command suggests different templates than a successful one. An error includes a fix field in plain language. The agent has everything it needs to self-recover.
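One way to keep error responses contextual is to key the recovery hints off the machine-readable error code. The sketch below assumes a hypothetical `RECOVERY` lookup table and `failure` helper; the fallback text for unknown codes is invented for illustration.

```typescript
interface NextAction {
  command: string;
  description: string;
}

interface ErrorEnvelope {
  ok: false;
  command: string;
  error: { message: string; code: string };
  fix: string;
  next_actions: NextAction[];
}

interface Recovery {
  fix: string;
  next_actions: NextAction[];
}

// Hypothetical table: error code -> plain-language fix plus follow-up commands.
const RECOVERY = new Map<string, Recovery>([
  [
    "SERVER_UNREACHABLE",
    {
      fix: "Start the Inngest server pod: kubectl rollout restart statefulset/inngest -n joelclaw",
      next_actions: [{ command: "joelclaw status", description: "Re-check after fix" }],
    },
  ],
]);

// Fallback for codes without a dedicated entry (wording is illustrative only).
const UNKNOWN: Recovery = {
  fix: "No known fix; check overall system state",
  next_actions: [{ command: "joelclaw status", description: "Check system health" }],
};

// Build the error envelope, choosing fix and next_actions from the error code.
function failure(command: string, code: string, message: string): ErrorEnvelope {
  const recovery = RECOVERY.get(code) ?? UNKNOWN;
  return {
    ok: false,
    command,
    error: { message, code },
    fix: recovery.fix,
    next_actions: recovery.next_actions,
  };
}

console.log(
  JSON.stringify(
    failure("joelclaw send pipeline/video.download", "SERVER_UNREACHABLE", "Inngest server not responding"),
    null,
    2,
  ),
);
```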
The params object carries metadata the agent uses to fill templates intelligently:
- value: pre-filled from the current response context (e.g., a run ID just returned)
- default: what happens if the agent omits this flag entirely
- enum: valid choices (the agent picks from a closed set instead of guessing)
- description: what this parameter means
{
  "ok": true,
  "command": "joelclaw send video/download",
  "result": { "event_id": "01KHF98SKZ7RE6HC2BH8PW2HB2", "status": "accepted" },
  "next_actions": [
    {
      "command": "joelclaw run <run-id>",
      "description": "Inspect the triggered run",
      "params": {
        "run-id": { "value": "01KHF98SKZ7RE6HC2BH8PW2HB2", "description": "Run ID (ULID)" }
      }
    },
    {
      "command": "joelclaw runs [--status <status>] [--count <count>]",
      "description": "List recent runs",
      "params": {
        "status": { "enum": ["COMPLETED", "FAILED", "RUNNING", "QUEUED", "CANCELLED"] },
        "count": { "default": 10 }
      }
    }
  ]
}
The agent sees params.run-id.value → it knows the exact ID to use. It sees params.status.enum → it picks from the list instead of hallucinating a filter name. It sees params.count.default → it can omit the flag or adjust it.
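To make the template mechanics concrete, here is a rough sketch of how a consumer could turn a template plus its params into a runnable command. `fillTemplate` and `resolve` are hypothetical names, and the sketch only handles the two shapes shown above: `<name>` positionals and `[--flag <name>]` optional flags. It does not shell-escape values, and it does not handle bare optional flags or optional positionals.

```typescript
interface ParamSpec {
  value?: string | number;
  default?: string | number;
  enum?: string[];
  description?: string;
}

type Params = Record<string, ParamSpec>;
type Overrides = Record<string, string | number>;

// Pick the agent's override first, then the pre-filled value; enforce enum if present.
function resolve(name: string, params: Params, overrides: Overrides): string | undefined {
  const spec: ParamSpec = params[name] ?? {};
  const chosen = overrides[name] ?? spec.value;
  if (chosen === undefined) return undefined;
  const text = String(chosen);
  if (spec.enum && !spec.enum.includes(text)) {
    throw new Error(`${name} must be one of: ${spec.enum.join(", ")}`);
  }
  return text;
}

function fillTemplate(template: string, params: Params = {}, overrides: Overrides = {}): string {
  // Optional groups: "[--flag <name>]" is dropped unless a value was chosen,
  // in which case the CLI's own default applies.
  const withOptionals = template.replace(
    /\s*\[(--?[\w-]+) <([\w-]+)>\]/g,
    (_match: string, flag: string, name: string) => {
      const value = resolve(name, params, overrides);
      return value === undefined ? "" : ` ${flag} ${value}`;
    },
  );
  // Required positionals: "<name>" must resolve, falling back to a declared default.
  return withOptionals.replace(/<([\w-]+)>/g, (_match: string, name: string) => {
    const value = resolve(name, params, overrides) ?? params[name]?.default;
    if (value === undefined) throw new Error(`missing required parameter: ${name}`);
    return String(value);
  });
}

const runsParams: Params = {
  status: { enum: ["COMPLETED", "FAILED", "RUNNING", "QUEUED", "CANCELLED"] },
  count: { default: 10 },
};
console.log(
  fillTemplate("joelclaw runs [--status <status>] [--count <count>]", runsParams, { status: "FAILED" }),
);
```

With only `status` overridden, the `--count` group disappears and the default of 10 is left to the CLI. An override outside the enum throws instead of producing a command that would fail.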
This is Roy Fielding’s HATEOAS constraint from REST, applied to CLIs. But where REST gives you links, this gives you forms — hypermedia controls with typed inputs. The application state is navigable and parameterizable from the response itself. No out-of-band knowledge required.
Principle 3: Self-documenting command tree
The root command (no arguments) returns the full command tree:
joelclaw
{
  "ok": true,
  "command": "joelclaw",
  "result": {
    "description": "joelclaw — Personal AI system CLI",
    "commands": [
      { "name": "send", "usage": "joelclaw send <event> [-d <json>] [--follow]" },
      { "name": "status", "usage": "joelclaw status" },
      { "name": "watch", "usage": "joelclaw watch [<loop-id>]" },
      { "name": "gateway stream", "usage": "joelclaw gateway stream" }
    ]
  },
  "next_actions": [...]
}
One call and the agent knows everything available. No --help parsing. No man pages. No guessing.
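A sketch of how the root response could be derived from a single command registry, so the advertised tree cannot drift from what is actually registered. The `COMMANDS` array and `rootResponse` function are hypothetical, and the rule used to pick `next_actions` (only usages with no placeholders or optional groups) is an assumption, since the article elides that field.

```typescript
interface CommandSpec {
  name: string;
  usage: string;
}

// Hypothetical registry; the entries mirror the example response above.
const COMMANDS: CommandSpec[] = [
  { name: "send", usage: "joelclaw send <event> [-d <json>] [--follow]" },
  { name: "status", usage: "joelclaw status" },
  { name: "watch", usage: "joelclaw watch [<loop-id>]" },
  { name: "gateway stream", usage: "joelclaw gateway stream" },
];

// Build the no-argument response straight from the registry.
function rootResponse(description: string, commands: CommandSpec[]) {
  // Assumption: only suggest commands that are runnable as-is.
  const literal = commands.filter((c) => !/[<\[]/.test(c.usage));
  return {
    ok: true,
    command: "joelclaw",
    result: { description, commands },
    next_actions: literal.map((c) => ({
      command: c.usage,
      description: `Run ${c.name}`,
    })),
  };
}

const root = rootResponse("Personal AI system CLI", COMMANDS);
console.log(JSON.stringify(root, null, 2));
```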
Principle 4: Protect context
Agents have finite context windows. A CLI that dumps 10,000 log lines into stdout just consumed half the agent’s working memory.
Rules:
- Truncate by default: show last 30 lines, not all of them
- When truncated, point to the full output: include a file path
- Auto-limit lists: cap at a reasonable default, offer --count to adjust
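The rules above can be sketched as one small function. `tailForAgent` is a hypothetical name, and the sketch assumes the caller has already written the full output to a file and passes in its path.

```typescript
interface TruncatedLines {
  showing: number;
  total: number;
  truncated: boolean;
  full_output?: string;
  lines: string[];
}

// Keep only the last `limit` lines and report how much was cut.
function tailForAgent(allLines: string[], limit = 30, fullOutputPath?: string): TruncatedLines {
  const truncated = allLines.length > limit;
  const lines = allLines.slice(Math.max(0, allLines.length - limit));
  return {
    showing: lines.length,
    total: allLines.length,
    truncated,
    // Only point at the full output when something was actually dropped.
    ...(truncated && fullOutputPath ? { full_output: fullOutputPath } : {}),
    lines,
  };
}

const manyLines = Array.from({ length: 4582 }, (_, i) => `line ${i + 1}`);
const tail = tailForAgent(manyLines, 30, "/tmp/joelclaw-logs-abc123.log");
console.log(tail.showing, tail.total, tail.truncated);
```

Reporting `showing` and `total` side by side lets the agent decide whether the missing lines are worth a second call.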
{
  "result": {
    "showing": 30,
    "total": 4582,
    "truncated": true,
    "full_output": "/tmp/joelclaw-logs-abc123.log",
    "lines": ["...last 30 lines..."]
  },
  "next_actions": [
    {
      "command": "joelclaw logs [--lines <count>]",
      "description": "Show more",
      "params": { "count": { "default": 30, "description": "Number of lines" } }
    }
  ]
}
The temporal gap
Those four principles cover spatial queries — what’s the state right now? But my system is temporal. Events fire. Pipelines run. Loops iterate through stories. The gateway routes messages. All of that happens over time.
With request-response only, the agent is stuck polling:
joelclaw send video/download -d '{"url":"..."}' → event sent
joelclaw runs --count 3 → still running
joelclaw runs --count 3 → still running
joelclaw runs --count 3 → still running
joelclaw run 01KHF98SKZ7RE6HC2BH8PW2HB2 → completed
Five tool calls to follow one pipeline. Each one burns context. Each one has up to 15 seconds of latency if you’re polling on an interval.
My watch command tried to solve this with a polling loop inside the CLI — but it had to break the “JSON always” principle to do it, outputting formatted text because the envelope format had no streaming semantics.
Principle 5: NDJSON for the temporal dimension
NDJSON (Newline-Delimited JSON) — one JSON object per line. The same pattern docker events --format '{{json .}}' and kubectl get pods -w -o json use. Pipe-native. Grep-able. jq-friendly.
The protocol: each line has a type discriminator. The last line is always the standard HATEOAS envelope. Tools that don’t understand streaming just read the last line.
joelclaw send video/download --follow -d '{"url":"..."}'
{"type":"start","command":"joelclaw send video/download --follow","ts":"..."}
{"type":"step","name":"download","status":"started","ts":"..."}
{"type":"progress","name":"download","percent":45,"ts":"..."}
{"type":"step","name":"download","status":"completed","duration_ms":3200,"ts":"..."}
{"type":"step","name":"transcribe","status":"started","ts":"..."}
{"type":"step","name":"transcribe","status":"completed","duration_ms":45000,"ts":"..."}
{"type":"result","ok":true,"command":"...","result":{...},"next_actions":[...]}
One command. The agent sees every step as it happens. No polling. No wasted calls. And because the stream terminates with the standard envelope, the agent knows exactly what to do next.
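On the consuming side, the protocol can be sketched with two small pieces: an incremental parser for harnesses that react mid-stream, and a last-line fallback for tools that only want the final envelope. `NdjsonParser`, `isTerminal`, and `finalEnvelope` are hypothetical names, not part of joelclaw.

```typescript
type StreamEvent = { type: string; [key: string]: unknown };

// The two event types that end a stream.
const TERMINAL_TYPES = new Set(["result", "error"]);

function isTerminal(event: StreamEvent): boolean {
  return TERMINAL_TYPES.has(event.type);
}

// Incremental parser: feed arbitrary stdout chunks, get back complete events.
class NdjsonParser {
  private buffer = "";

  push(chunk: string): StreamEvent[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last piece may be a partial line; keep it for the next chunk.
    this.buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as StreamEvent);
  }
}

// Non-streaming fallback: the last non-empty line is the standard envelope.
function finalEnvelope(output: string): StreamEvent {
  const lines = output.split("\n").filter((line) => line.trim() !== "");
  const last = lines[lines.length - 1];
  if (last === undefined) throw new Error("empty stream");
  return JSON.parse(last) as StreamEvent;
}

const demoParser = new NdjsonParser();
for (const event of demoParser.push('{"type":"start"}\n{"type":"result","ok":true}\n')) {
  console.log(event.type, isTerminal(event) ? "(terminal)" : "");
}
```

Buffering the trailing partial line matters because chunk boundaries from a pipe do not have to align with line boundaries.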
The event types:
| Type | Meaning | Terminal? |
|---|---|---|
| start | Stream begun | No |
| step | Pipeline step lifecycle | No |
| progress | Progress update | No |
| log | Diagnostic message | No |
| event | An event was emitted (fan-out visibility) | No |
| result | HATEOAS success envelope | Yes |
| error | HATEOAS error envelope | Yes |
What this unlocks
send --follow — send an event and watch the pipeline run. The agent can react mid-stream. If a step fails, it can cancel, retry, or escalate without waiting for the whole thing to finish.
watch as real-time push — subscribe to Redis pub/sub for loop state changes instead of polling every 15 seconds. Story completions arrive the instant they happen.
gateway stream — tap into the gateway event bridge from any terminal. See every event flowing through the system.
logs --follow — structured tail -f. Each line is typed JSON with a level field. The agent can filter for errors without regex.
Composable pipes:
# Only step completions
joelclaw watch | jq --unbuffered 'select(.type == "step" and .status == "completed")'
# Only errors
joelclaw send pipeline/run --follow | jq --unbuffered 'select(.type == "error" or .status == "failed")'
The response envelope
For reference — the exact shape every command uses.
Success
{
  ok: true,
  command: string, // the command that was run
  result: object, // command-specific payload
  next_actions: Array<{
    command: string, // template (POSIX syntax) or literal command
    description: string, // what it does
    params?: Record<string, { // presence = command is a template
      value?: string | number, // pre-filled from context
      default?: string | number, // value if omitted
      enum?: string[], // valid choices
      description?: string // what this param means
    }>
  }>
}
Error
{
  ok: false,
  command: string,
  error: {
    message: string, // what went wrong
    code: string // machine-readable error code
  },
  fix: string, // plain-language suggested fix
  next_actions: Array<{
    command: string,
    description: string,
    params?: Record<string, { ... }> // same schema as success
  }>
}
Stream event
type StreamEvent =
  | { type: "start"; command: string; ts: string }
  | { type: "step"; name: string; status: "started" | "completed" | "failed"; ... }
  | { type: "progress"; name: string; percent?: number; message?: string; ts: string }
  | { type: "log"; level: "info" | "warn" | "error"; message: string; ts: string }
  | { type: "event"; name: string; data: unknown; ts: string }
  | { type: "result"; ok: true; command: string; result: unknown; next_actions: NextAction[] }
  | { type: "error"; ok: false; command: string; error: {...}; fix: string; next_actions: NextAction[] }
Implementation notes
The joelclaw CLI uses Effect CLI (@effect/cli) with Bun. The streaming infrastructure subscribes to the same Redis pub/sub channels that the gateway extension uses — pushGatewayEvent() middleware in every Inngest function is the emission point, and the CLI is just another subscriber.
Inngest function step completes
→ pushGatewayEvent() writes to Redis pub/sub
→ gateway extension receives it (session injection)
→ CLI --follow receives it (NDJSON on stdout)
No new infrastructure. The event bridge was already there. Streaming just gave the CLI a way to tap into it.
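The fan-out can be illustrated without Redis at all. The sketch below uses a hypothetical in-memory `Channel` class as a stand-in for a pub/sub channel, with two subscribers playing the gateway extension and the CLI. It is not the actual pushGatewayEvent() implementation, and the timestamp is a placeholder.

```typescript
type GatewayEvent = { type: string; ts: string; [key: string]: unknown };
type Subscriber = (event: GatewayEvent) => void;

// In-memory stand-in for a pub/sub channel (an assumption for this sketch).
class Channel {
  private subscribers: Subscriber[] = [];

  subscribe(fn: Subscriber): void {
    this.subscribers.push(fn);
  }

  publish(event: GatewayEvent): void {
    for (const fn of this.subscribers) fn(event);
  }
}

const channel = new Channel();
const sessionInbox: GatewayEvent[] = []; // stands in for the gateway extension
const stdoutLines: string[] = []; // stands in for the CLI's stdout

// Existing subscriber: the gateway extension.
channel.subscribe((event) => {
  sessionInbox.push(event);
});

// New subscriber: the CLI serializes each event as one NDJSON line.
channel.subscribe((event) => {
  stdoutLines.push(JSON.stringify(event));
});

// What an emitter would publish after a step completes.
channel.publish({
  type: "step",
  name: "download",
  status: "completed",
  ts: "2025-01-01T00:00:00Z",
});

console.log(stdoutLines.join("\n"));
```

The emitter never learns that a second subscriber exists, which is why adding `--follow` needs no change on the publishing side.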
The anti-patterns
| Don’t | Do |
|---|---|
| Plain text output | JSON envelope |
| --json flag | JSON is the only format |
| Dump unbounded output | Truncate + file pointer |
| Static --help text | Self-documenting root command |
| Error: something went wrong | { ok: false, error: {...}, fix: "..." } |
| Hardcoded literal next_actions | Templates with params (<placeholder>, [--flag <value>]) |
| Poll for temporal data | Stream NDJSON |
| ANSI colors | JSON fields |
Try it
The cli-design skill contains the full pattern reference — envelope shape, streaming protocol, naming conventions, implementation checklist. Install it and your agent has the complete playbook:
npx skills add joelhooks/joelclaw --skill cli-design --yes --global
The ADR chain is ADR-0009 (CLI identity) through ADR-0058 (streaming protocol).