ADR-0219 (proposed)

Restate Agent Runtime with Flue-Inspired Proxy Policies

2026-03-06T00:00:00.000Z

Context

Our current agent dispatch model has gaps:

Codex workers get binary sandbox modes — workspace-write or danger-full-access. No granular API scoping. A worker doing code review has the same access as one deploying to production.
No typed result extraction — codex tasks return unstructured stdout. We parse it ad-hoc. No schema validation, no retry on malformed output.
Fragile session lifecycle — codex dispatch is fire-and-forget. If a task crashes mid-execution, we don’t recover. No durable journaling of progress.
Secrets reach workers directly — API keys are either in env vars or leased per-task. No proxy layer to keep credentials on the host.

Astro’s Flue framework solves these problems for CI workflows with three patterns worth adopting: proxy policies, typed result extraction, and structured event telemetry.

Decision

Build a Restate-based agent runtime service that combines Flue’s patterns with our existing infrastructure:

1. Proxy Policy System

Credential-injecting reverse proxies that run on the host. Workers route API calls through proxies; secrets never leave the host process.

// Policy definition (Flue-compatible)
const codeReviewPolicy = {
  github: github({ policy: 'allow-read' }),      // GraphQL queries + git clone, no mutations
  anthropic: anthropic(),                          // allow-all for model calls
};
 
const deployPolicy = {
  github: github({ policy: 'allow-all' }),         // Full write access
  vercel: vercel({ policy: 'allow-all' }),
  anthropic: anthropic(),
};

Policies compose: a base level (allow-read, allow-all, or deny-all) plus explicit allow/deny rules. Each rule matches on HTTP method and path glob, and may carry body validators and a per-rule rate limit.
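To make the composition concrete, here is a minimal sketch of policy evaluation. It is not Flue's implementation: the `Policy` and `Rule` shapes, the "first matching rule wins" precedence, and the treatment of allow-read as GET/HEAD/OPTIONS only are all assumptions for illustration (the real GitHub preset also has to treat GraphQL queries, which are POSTs, as reads). Body validators and rate limits are left out.

```typescript
// Hypothetical shapes; Flue's real types may differ.
type BaseLevel = "allow-read" | "allow-all" | "deny-all";

interface Rule {
  effect: "allow" | "deny";
  method: string; // HTTP method, or "*" for any method
  path: string;   // glob: "*" matches one path segment, "**" matches any depth
}

interface Policy {
  base: BaseLevel;
  rules: Rule[];
}

// Simplification: "read" means a safe HTTP method.
const READ_METHODS = new Set(["GET", "HEAD", "OPTIONS"]);

// Convert a path glob into an anchored regular expression.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  const pattern = escaped
    .replace(/\*\*/g, "\u0000") // protect "**" while "*" is rewritten
    .replace(/\*/g, "[^/]*")
    .replace(/\u0000/g, ".*");
  return new RegExp(`^${pattern}$`);
}

// Assumed precedence: the first explicit rule that matches decides;
// if no rule matches, the base level decides.
function isAllowed(policy: Policy, method: string, path: string): boolean {
  const m = method.toUpperCase();
  for (const rule of policy.rules) {
    const methodMatches = rule.method === "*" || rule.method.toUpperCase() === m;
    if (methodMatches && globToRegExp(rule.path).test(path)) {
      return rule.effect === "allow";
    }
  }
  switch (policy.base) {
    case "allow-all":
      return true;
    case "allow-read":
      return READ_METHODS.has(m);
    case "deny-all":
      return false;
  }
}

// Example: read-only base, plus one write exception and one explicit deny.
const reviewPolicy: Policy = {
  base: "allow-read",
  rules: [
    { effect: "allow", method: "POST", path: "/repos/*/*/pulls/*/comments" },
    { effect: "deny", method: "*", path: "/repos/*/*/actions/**" },
  ],
};
```

With this policy a code-review worker can read pull requests and post review comments, but cannot touch Actions endpoints or issue any other write.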

Implementation: host-side HTTP proxy processes, one per service per task session. Each session authenticates to its proxies with a per-session HMAC token. Proxy configs are stored in Redis with a TTL so they are cleaned up automatically.
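One way the per-session HMAC token could work is sketched below. The token format (`<sessionId>.<hex MAC>`), the function names, and the idea of a single host-held signing secret are assumptions for illustration, not a settled design. The sketch uses only Node's built-in crypto module.

```typescript
import { Buffer } from "node:buffer";
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical token format: "<sessionId>.<hex HMAC-SHA256 of sessionId>".
// The signing secret stays on the host; the worker only ever sees the token.
function signSessionToken(sessionId: string, hostSecret: string): string {
  const mac = createHmac("sha256", hostSecret).update(sessionId).digest("hex");
  return `${sessionId}.${mac}`;
}

// Returns the session id if the token is authentic, otherwise null.
function verifySessionToken(token: string, hostSecret: string): string | null {
  const dot = token.lastIndexOf(".");
  if (dot < 0) return null;
  const sessionId = token.slice(0, dot);
  const given = Buffer.from(token.slice(dot + 1), "hex");
  const expected = createHmac("sha256", hostSecret).update(sessionId).digest();
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (given.length !== expected.length) return null;
  return timingSafeEqual(given, expected) ? sessionId : null;
}
```

The proxy would call `verifySessionToken` on each incoming request and then look up that session's policy, so a token leaked from one task cannot be used against another session's proxies.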

2. Typed Result Extraction

Adopt Flue’s ---RESULT_START---/---RESULT_END--- delimiter pattern with Valibot schema validation:

import * as v from "valibot";

const result = await agentTask("review-pr", {
  prompt: "Review PR #42 for security issues",
  result: v.object({
    severity: v.picklist(["critical", "high", "medium", "low", "none"]),
    issues: v.array(v.object({
      file: v.string(),
      line: v.number(),
      description: v.string(),
    })),
    approved: v.boolean(),
  }),
  proxies: codeReviewPolicy,
});

If the agent forgets the result delimiters, send a follow-up prompt in the same session (Flue’s retry pattern). Validate with Valibot before returning.
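The extraction and follow-up loop can be sketched as follows. This is an illustration rather than Flue's code: it assumes the payload between the delimiters is JSON, that the last delimited block in the output is the one to use, and that at most two follow-ups are sent. The `send` callback is a hypothetical stand-in for prompting the same session, and `validate` stands in for schema validation (with Valibot it could be `(x) => v.parse(schema, x)`).

```typescript
const RESULT_START = "---RESULT_START---";
const RESULT_END = "---RESULT_END---";

type Extracted<T> =
  | { ok: true; value: T }
  | { ok: false; reason: "missing-delimiters" | "invalid-json" | "schema-mismatch" };

// Pull the delimited payload out of raw agent output, parse it, and validate it.
// `validate` must throw when the value does not match the expected shape.
function extractDelimitedResult<T>(
  output: string,
  validate: (value: unknown) => T,
): Extracted<T> {
  const start = output.lastIndexOf(RESULT_START);
  if (start < 0) return { ok: false, reason: "missing-delimiters" };
  const end = output.indexOf(RESULT_END, start + RESULT_START.length);
  if (end < 0) return { ok: false, reason: "missing-delimiters" };

  const raw = output.slice(start + RESULT_START.length, end).trim();
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ok: false, reason: "invalid-json" };
  }
  try {
    return { ok: true, value: validate(parsed) };
  } catch {
    return { ok: false, reason: "schema-mismatch" };
  }
}

// Send the prompt, then re-prompt in the same session until a valid result
// arrives or the follow-up budget is spent.
async function promptForResult<T>(
  send: (prompt: string) => Promise<string>, // hypothetical session call
  prompt: string,
  validate: (value: unknown) => T,
  maxFollowUps = 2,
): Promise<T> {
  let output = await send(prompt);
  for (let attempt = 0; ; attempt++) {
    const extracted = extractDelimitedResult(output, validate);
    if (extracted.ok) return extracted.value;
    if (attempt >= maxFollowUps) {
      throw new Error(`no valid result: ${extracted.reason}`);
    }
    output = await send(
      `Your last reply had no valid result (${extracted.reason}). ` +
        `Reply with JSON between ${RESULT_START} and ${RESULT_END}.`,
    );
  }
}
```

Returning a reason instead of throwing from the extractor lets the follow-up prompt tell the agent what went wrong, which is usually more effective than repeating the original request.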

3. Restate Workflow Orchestration

Replace polling with Restate’s durable execution:

import * as restate from "@restatedev/restate-sdk";

const agentTaskWorkflow = restate.workflow({
  name: "agent/task",
  handlers: {
    run: async (ctx: restate.WorkflowContext, args: AgentTaskInput) => {
      // Step 1: Start proxy servers (durable)
      const proxies = await ctx.run("start-proxies", () =>
        startProxiesForPolicy(args.proxies, ctx.key)
      );
 
      // Step 2: Create pi session (durable)
      const sessionId = await ctx.run("create-session", () =>
        piClient.createSession({ title: args.label })
      );
 
      // Step 3: Execute prompt (durable, retryable)
      const parts = await ctx.run("execute", () =>
        piClient.promptAndWait(sessionId, args.prompt)
      );
 
      // Step 4: Extract typed result (durable)
      const result = await ctx.run("extract", () =>
        extractResult(parts, args.schema)
      );
 
      // Step 5: Cleanup (durable)
      await ctx.run("cleanup", () => stopProxies(proxies));
 
      return result;
    },
  },
});

Restate provides journaled steps with automatic retries, a place to run compensation logic when a step fails terminally, durable timers (ctx.sleep()) and durable promises for human approval gates, and observable execution state.
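Compensation is something the workflow code has to express; the runtime only guarantees that the handler keeps running until it finishes. A minimal, SDK-free sketch of the pattern follows. The helper name `withCompensations` is hypothetical, and inside the real workflow each undo action would itself be wrapped in `ctx.run` so that it is journaled.

```typescript
type Compensation = () => Promise<void>;

// Run `body`, letting it register undo actions as it acquires resources.
// If `body` throws, run the registered actions in reverse order, then rethrow.
async function withCompensations<T>(
  body: (onFailure: (undo: Compensation) => void) => Promise<T>,
): Promise<T> {
  const undoStack: Compensation[] = [];
  try {
    return await body((undo) => {
      undoStack.push(undo);
    });
  } catch (err) {
    for (const undo of undoStack.reverse()) {
      await undo();
    }
    throw err;
  }
}
```

Applied to the workflow above, "start-proxies" would register a stop-proxies action right after it succeeds, so a failed "execute" or "extract" step does not leave proxy processes running until the Redis TTL expires.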

4. FlueEvent Telemetry

Normalize agent execution events into typed telemetry:

type AgentEvent =
  | { type: "tool.pending"; tool: string; input: string }
  | { type: "tool.complete"; tool: string; duration: number }
  | { type: "tool.error"; tool: string; error: string }
  | { type: "step.finish"; tokens: { input: number; output: number }; cost: number }
  | { type: "status"; status: "busy" | "idle" | "retry" };

Pipe into Typesense OTEL collection for unified observability.
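As an illustration of what the typed events enable, the sketch below folds a stream of events into a per-task summary that could be attached to the OTEL record. The `TaskSummary` shape and the `summarize` helper are hypothetical; the `AgentEvent` type is repeated from above only so the block is self-contained, and the unit of `duration` is left unspecified because the source does not state it.

```typescript
type AgentEvent =
  | { type: "tool.pending"; tool: string; input: string }
  | { type: "tool.complete"; tool: string; duration: number }
  | { type: "tool.error"; tool: string; error: string }
  | { type: "step.finish"; tokens: { input: number; output: number }; cost: number }
  | { type: "status"; status: "busy" | "idle" | "retry" };

// Hypothetical roll-up of one task's events.
interface TaskSummary {
  toolCalls: number;    // completed + failed tool calls
  toolErrors: number;
  toolDuration: number; // sum of `duration` over completed calls
  inputTokens: number;
  outputTokens: number;
  cost: number;
}

function summarize(events: AgentEvent[]): TaskSummary {
  const summary: TaskSummary = {
    toolCalls: 0,
    toolErrors: 0,
    toolDuration: 0,
    inputTokens: 0,
    outputTokens: 0,
    cost: 0,
  };
  for (const event of events) {
    switch (event.type) {
      case "tool.complete":
        summary.toolCalls += 1;
        summary.toolDuration += event.duration;
        break;
      case "tool.error":
        summary.toolCalls += 1;
        summary.toolErrors += 1;
        break;
      case "step.finish":
        summary.inputTokens += event.tokens.input;
        summary.outputTokens += event.tokens.output;
        summary.cost += event.cost;
        break;
      default:
        // "tool.pending" and "status" events carry no totals.
        break;
    }
  }
  return summary;
}
```

Because `AgentEvent` is a discriminated union, the compiler narrows each branch to the right payload, so adding a new event type later surfaces every place that needs to handle it.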

Consequences

Positive

Granular security — workers get exactly the API access their task requires
Secrets isolation — credentials never leave the host; workers see proxy URLs only
Durable execution — Restate journals every step; crash recovery is automatic
Typed outputs — schema-validated results from every agent task
Observable — structured events for every tool call, with token counts and cost
Composable — proxy policies are reusable across task types

Negative

Proxy overhead — each task session needs proxy processes (mitigated by connection pooling)
Complexity — more moving parts than fire-and-forget codex dispatch
Restate dependency — adds Restate to the critical path for agent execution

Risks

Proxy latency could impact LLM streaming (measure before committing)
Policy definitions need careful design — too restrictive breaks tasks, too permissive defeats purpose
Pi session API may not expose enough for programmatic prompt/poll (need to verify)

Implementation Phases

Phase 1: Proxy Policy Library

Port Flue’s proxy types and policy evaluation to @joelclaw/agent-proxy
Implement Anthropic + GitHub presets
Host-side proxy server with HMAC auth
Unit tests for policy evaluation

Phase 2: Result Extraction

Port Flue’s ---RESULT_START---/---RESULT_END--- pattern
Valibot schema integration
Follow-up prompt retry on missing delimiters

Phase 3: Restate Workflow

agent/task workflow with durable proxy lifecycle
Pi session management via SDK
Integration with existing dagOrchestrator

Phase 4: Telemetry

FlueEvent-compatible event types
SSE transform for pi session events
OTEL collection integration

References

withastro/flue — source of proxy policy and result extraction patterns
Flue proxy policy evaluation: packages/client/src/proxies/policy.ts
Flue result extraction: packages/client/src/result.ts
Flue event transform: packages/client/src/events.ts