ADR-0193accepted

Task Triage Output Contract

  • Status: proposed
  • Date: 2026-03-02
  • Deciders: Joel, Panda
  • Relates to: ADR-0045, ADR-0062, ADR-0140, ADR-0190, ADR-0191

Context

tasks.triage.classify currently executes expensive classification calls that can end with null output / zero useful completion while still being treated as completed workflow steps.

This violates the intent of heartbeat-driven triage: produce actionable prioritization, not expensive no-op runs.

Decision

1) Enforce strict classification output schema

A classification run is valid only if it produces parseable JSON with:

  • triage: array of per-task classifications,
  • each item containing id, category, reason, optional suggestedAction,
  • optional insights array.

null, empty, malformed, or schema-invalid output is a failure.

2) Treat invalid/null output as failed inference, not success

For triage classify step:

  • do not mark classification as complete,
  • do not set cooldown keys,
  • do not emit success notification,
  • record failure reason and trigger retry/fallback path.

3) Add bounded fallback chain

If primary classification output is invalid:

  1. retry with stricter parsing prompt,
  2. if still invalid, fallback via inference-router policy,
  3. if all attempts fail, emit degraded event and return explicit status: degraded.

No silent no-op success path.

4) Tie triage to no-op circuit control

Apply ADR-0191 circuit behavior to (component=task-triage, action=tasks.triage.classify).

When circuit is open:

  • skip expensive classify call,
  • emit concise operator signal,
  • perform deterministic minimal triage summary from metadata-only heuristics.

5) Separate “no actionable tasks” from “classification failed”

Valid empty actionability is allowed and must return:

  • status: noop,
  • explicit reason (no actionable items),
  • schema-valid classification payload.

Failure returns:

  • status: degraded,
  • explicit failure reason (null_output, schema_invalid, etc).

6) Add quality telemetry contract

For each run record:

  • classificationValid (bool),
  • triageItemsCount,
  • actionableCount,
  • outputFailureReason (if any),
  • fallbackUsed (bool),
  • circuitState.

Consequences

Good

  • triage only reports success when it produced valid structure,
  • no more cooldown masking of failed classifications,
  • operators can distinguish true noop from degraded failure.

Tradeoffs

  • stricter contract may initially increase visible triage failures,
  • additional retry/fallback logic in workflow code.

Required Skills (Preflight)

  • system-bus — Inngest function structure and workflow behavior
  • inngest-durable-functions — retries, idempotency, failure semantics
  • inngest-steps — step boundaries and status transitions
  • task-management — expected triage output semantics
  • langfuse — output validity and no-op trend instrumentation

Implementation Plan (vector clock)

  1. V1: define schema validator for triage output in task-triage.ts.
  2. V2: route invalid output into retry/fallback path, never success path.
  3. V3: ensure cooldown keys are set only for valid classifications.
  4. V4: add no-op circuit integration for tasks.triage.classify.
  5. V5: add tests covering valid-noop vs degraded-failure split and cooldown correctness.

Verification Checklist

  • null/malformed output no longer returns success
  • cooldown key is never set when classification output is invalid
  • valid no-actionable runs return status: noop with schema-valid payload
  • repeated invalid outputs trigger circuit-open behavior (via ADR-0191 circuit)
  • triage telemetry distinguishes noop from degraded clearly

Implementation Progress

V1 (schema validator) — pre-existing

parseTriageResult() enforces strict JSON schema: requires triage[] with id, category, reason per item, validates category enum, checks for duplicate IDs, verifies all expected task IDs are present.

V2 (retry/fallback chain) — pre-existing

Primary classification attempt → repair attempt with stricter prompt on failure. Returns classificationValid: false + status: degraded if both fail.

V3 (cooldown guards) — pre-existing + bug fix 2026-03-04

Cooldown key (TRIAGE_NOTIFIED_KEY) only set when valid classification produces a notification push. Bug fixed: Task hash was persisted to Redis BEFORE classification — failed classifications cached the hash, preventing retry on next heartbeat. Fix: hash now persisted ONLY after successful classification.

V4 (circuit integration) — shipped 2026-03-04

Wired checkCircuit/recordSuccess/recordFailure from inference-circuit.ts. Circuit check before LLM classification; open circuit returns degraded without burning tokens. circuitState in all OTEL metadata.

V5 (tests) — pre-existing + expanded 2026-03-04

parseTriageResult tests for null output, missing IDs, valid payload. Circuit integration tested via inference-circuit.test.ts.