ADR-0210

Channel Intelligence Pipeline

Status

Accepted

Context and Problem Statement

Joel’s inbound channels — email, Slack, Discord, iMessage, Telegram — are all signal sources with the same problem: important messages get buried, replies slip, relationships suffer. The system currently treats each channel in isolation with no shared context.

The previous email triage posture was “be aggressive about archiving” — optimizing for inbox zero at the cost of signal loss. This caused real damage: Alex Hillman (key AI Hero collaborator) was auto-archived for weeks. Slack messages are classified but the classification doesn’t surface actionably.

Every channel is a signal detector. A Slack thread from Alex about Kit access and an email from Alex about the same thing are the same relationship signal. The intelligence layer must treat them as one unified stream.

False positives (surfacing noise) are cheap. False negatives (burying signal) lose relationships.

Decision

Multi-Signal Classification Principle

> “Filtering emails on sender alone, you’re gonna have a bad time. I go the opposite way: my assistant uses a collection of signals, but is never allowed to filter on email address alone. E.g. newsletters require a minimum of 2 other signals.”
> — Alex Hillman (@alexhillman), March 5, 2026

This principle is now a hard rule in the pipeline:

No classification decision may be based on a single signal. Every triage action (archive, surface, escalate) must be justified by at least 2 independent signals. Examples:

| Action | Minimum signals required |
| --- | --- |
| Archive as noise | sender reputation LOW + unsubscribe link present + no prior conversation history |
| Surface as signal | known person OR VIP list + awaiting reply OR project-related |
| Newsletter classification | unsubscribe link + bulk send headers + no personalization + sender not in contacts |
| Auto-archive | NEVER on sender alone. Minimum 3 signals for autonomous archiving. |

This rule exists because the previous system auto-archived Alex Hillman — a key AI Hero collaborator — for weeks based on sender-domain heuristics alone. Single-signal filtering is a bug class, not a feature.
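As a sketch, the two-signal rule can be enforced as a pure guard before any triage action executes. The signal names, action minimums, and `isJustified` helper below are illustrative, not the pipeline's actual API:

```typescript
// Illustrative signal vocabulary; the real pipeline's signal set may differ.
type Signal =
  | "sender_reputation_low"
  | "unsubscribe_link"
  | "no_prior_history"
  | "known_person"
  | "vip"
  | "awaiting_reply"
  | "project_related";

// Minimum distinct signals per action; autonomous archiving demands three.
const MIN_SIGNALS: Record<string, number> = {
  archive: 2,
  surface: 2,
  auto_archive: 3,
};

function isJustified(action: keyof typeof MIN_SIGNALS, signals: Signal[]): boolean {
  // Count distinct signals: a repeated signal is still one signal,
  // so a sender-only match can never clear a threshold of 2+.
  return new Set(signals).size >= MIN_SIGNALS[action];
}
```

The guard is deliberately dumb: it counts evidence, it doesn't weigh it. Weighting belongs to the rubric score; this check only blocks single-signal decisions outright.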

Build a multi-channel intelligence pipeline that evaluates every inbound message — email, Slack, Discord, iMessage — against the full scope of joelclaw’s knowledge. The system should make Joel sparkle — surfacing the right messages at the right time with the right context, so replies are timely, informed, and relationship-building.

A Slack thread is the same unit as a Front conversation. Both are threaded conversations with a person about a topic. The intelligence layer treats them identically.

Architecture

Every channel implements a common interface. The pipeline is the same; the rubric varies per channel.

Channel Interface

```typescript
interface ChannelSource {
  id: string;                    // "slack", "front", "discord", "imessage"
  rubric: ClassificationRubric;  // per-channel scoring rules
  noiseRatio: "low" | "medium" | "high";  // expected noise level
  threadable: boolean;           // supports conversation threads
}

interface ClassificationRubric {
  knownPerson: boolean;     // is sender a known human?
  vip: boolean;             // is sender on VIP list?
  projectRelated: boolean;  // relates to active project?
  awaitingReply: boolean;   // is Joel expected to respond?
  // score: sum of weighted signals → HIGH / MEDIUM / LOW
}
```
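A minimal scoring sketch over these rubric fields. The weights and thresholds here are assumptions for illustration, not the pipeline's tuned values; the interface is restated so the block stands alone:

```typescript
interface ClassificationRubric {
  knownPerson: boolean;
  vip: boolean;
  projectRelated: boolean;
  awaitingReply: boolean;
}

type Score = "HIGH" | "MEDIUM" | "LOW";

function scoreMessage(r: ClassificationRubric): Score {
  // Sum weighted boolean signals; VIP and awaiting-reply weigh heaviest
  // (hypothetical weights).
  const total =
    (r.vip ? 3 : 0) +
    (r.awaitingReply ? 2 : 0) +
    (r.knownPerson ? 1 : 0) +
    (r.projectRelated ? 1 : 0);
  if (total >= 4) return "HIGH";
  if (total >= 2) return "MEDIUM";
  return "LOW";
}
```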

Channel Rubric Defaults

| Channel | Known person | Noise ratio | Notes |
| --- | --- | --- | --- |
| Slack | Always (workspace members) | Low | Thread context available, VIP list applies |
| Email (Front) | Sender reputation lookup | High | 80%+ noise, need aggressive filtering |
| Discord | Role/server context | High | Mix of bots, community, signal |
| iMessage | Always | Very low | Always a person, always signal |

Message Flow

```
Inbound message (any channel)
  ├─ Haiku + channel rubric → score
  │
  ├─ HIGH score → immediate enrichment + research
  │   ├─ Memory recall (Typesense semantic search)
  │   ├─ Active projects (Vault/Projects/ scan)
  │   ├─ Contact dossier (Vault/Resources/contacts/)
  │   ├─ Recent meetings (Granola transcript search)
  │   ├─ Cross-channel context (same person on other channels?)
  │   ├─ Thread history (Front conversation / Slack thread)
  │   ├─ Extract action items → Todoist inbox (verb-first, concrete next actions)
  │   ├─ Write observation to memory pipeline
  │   ├─ Write to Vault if durable
  │   └─ Surface via gateway (Telegram) with context + links
  │
  ├─ MEDIUM score → immediate, lighter enrichment
  │   ├─ Same enrichment path, may skip expensive lookups
  │   ├─ Extract action items → Todoist inbox (verb-first, concrete next actions)
  │   ├─ Write observation to memory pipeline
  │   └─ Surface via gateway
  │
  └─ LOW score → batch bucket
      ├─ Accumulates until batch threshold or time trigger
      ├─ Sonnet reviews the batch with full context
      └─ Outputs: promote to signal / archive / observe to memory
```

All three paths feed the memory observation pipeline. HIGH and MEDIUM immediately. LOW gets there through the batch → Sonnet review path.
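The three paths can be sketched as a pure routing function. The `route` helper and path names are hypothetical; the actual handlers live elsewhere in the pipeline:

```typescript
type Score = "HIGH" | "MEDIUM" | "LOW";

// Illustrative path names for the three flows described above.
type Path = "enrich_full" | "enrich_light" | "batch";

function route(score: Score): Path {
  switch (score) {
    case "HIGH":
      return "enrich_full";  // full enrichment + research, surfaced immediately
    case "MEDIUM":
      return "enrich_light"; // same path, may skip expensive lookups
    case "LOW":
      return "batch";        // accumulate for Sonnet batch review
  }
}
```

Keeping the router a pure score → path mapping means every message reaches memory the same way regardless of path; only the latency and depth of enrichment differ.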

The pipeline doesn’t just surface signal — it extracts what Joel needs to do and writes Todoist inbox tasks automatically. Every action item task includes source context in the description (Front/Slack link, sender, thread subject, and channel).

Tiered Inference

Three models, three jobs:

| Tier | Model | Job | When |
| --- | --- | --- | --- |
| Classify | Haiku | Score against rubric | Every message, <200ms |
| Batch review | Sonnet | Review LOW bucket, decide promote/archive | Batch trigger (count or time) |
| Deep enrichment | Opus | Full RAG analysis with relationship context | HIGH score only |

Haiku sorts. Sonnet sifts the batch. Opus thinks deeply about what matters.

HIGH score → Opus with full RAG context. This is where you get: “Alex emailed about email access — you discussed this in your meeting Tuesday, he’s a key AI Hero collaborator, you promised to set this up. It’s been 3 days.”

Roughly, 100 Haiku classifications ≈ 1 Sonnet batch review ≈ 1 Opus enrichment in cost. The pipeline ensures Opus only touches the 5-10% of messages that truly warrant deep reasoning.

Accountability Layer

```
4x/day gardening (every 6h)
  ├─ Email threads awaiting Joel's reply (with Front links)
  ├─ Slack threads awaiting Joel's reply (with Slack links)
  ├─ Sorted by age (oldest first = most urgent)
  ├─ Cross-channel callouts ("Alex emailed AND slacked about this")
  ├─ Todoist task hygiene (dedupe, stale, priority drift)
  └─ Cross-channel task consolidation

Aging escalation
  ├─ >24h unanswered → include in digest
  ├─ >48h unanswered → higher urgency formatting
  └─ >7d unanswered → explicit warning

Weekly relationship health report
  ├─ Response time trends per VIP
  ├─ Relationship drift alerts ("haven't heard from X in 30d")
  └─ Cross-channel activity summary
```
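The aging thresholds above reduce to a small pure function; the tier names here are illustrative, not the digest's actual labels:

```typescript
type EscalationTier = "none" | "digest" | "urgent" | "warning";

// Map hours-unanswered to the escalation tiers described above.
function escalationTier(hoursUnanswered: number): EscalationTier {
  if (hoursUnanswered > 7 * 24) return "warning"; // >7d: explicit warning
  if (hoursUnanswered > 48) return "urgent";      // >48h: higher urgency formatting
  if (hoursUnanswered > 24) return "digest";      // >24h: include in digest
  return "none";
}
```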

Gardening (6-hour cron)

Every 6 hours, an Inngest gardening cron reconciles actionable tasks across channels:

  • Dedupe: Same thread spawning multiple tasks → consolidate to one
  • Stale check: Tasks from signals that got resolved elsewhere (thread archived, reply sent) → mark complete or remove
  • Thread consolidation: 8 messages in a thread → one summary task replacing fragmented originals
  • Priority drift: Something sat at p3 for 3 days with no action → evaluate if still relevant, escalate or kill
  • Memory capture: Patterns worth remembering from the signal stream → write observations
  • Cross-channel merge: Same person/topic appearing in email AND Slack → merge into single actionable item
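One plausible dedupe key for the gardening pass is channel plus thread id. The `TaskRef` shape below is an assumption about how tasks carry their source context, not the actual Todoist task schema:

```typescript
interface TaskRef {
  channel: string;   // e.g. "front", "slack"
  threadId: string;  // Front conversation id or Slack thread ts
  taskId: string;    // Todoist task id
}

// Group tasks by source thread; any group larger than one is a
// candidate for consolidation into a single summary task.
function dedupeCandidates(tasks: TaskRef[]): TaskRef[][] {
  const byThread = new Map<string, TaskRef[]>();
  for (const t of tasks) {
    const key = `${t.channel}:${t.threadId}`;
    const group = byThread.get(key) ?? [];
    group.push(t);
    byThread.set(key, group);
  }
  return Array.from(byThread.values()).filter((g) => g.length > 1);
}
```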

Context Sources (available today)

| Source | Access | What it provides |
| --- | --- | --- |
| Memory observations | joelclaw recall (Typesense) | Past interactions, decisions, patterns with sender/topic |
| Active projects | Vault/Projects/ | Whether message relates to active work |
| Contact dossiers | Vault/Resources/contacts/ | Relationship history, VIP status |
| Granola meetings | Granola MCP | Recent meeting transcripts — “you discussed this Tuesday” |
| Cross-channel | Slack search + Front API | Same person active on multiple channels |
| Thread history | Front conversation / Slack thread | Full conversation context |
| GitHub | GitHub API | PRs, issues, repos related to sender/topic |
| OTEL events | Typesense | System activity related to sender |

Sender Reputation (future)

Track per-sender signal history in Typesense:

  • Per sender_id: signal_count, noise_count, last_signal_date, channels_active
  • Build confidence over time across channels
  • VIP override: some senders are always signal regardless of content or channel
  • Cross-channel identity: same person on Slack + email = one reputation
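A hypothetical shape for the per-sender reputation record, with a naive update applied on each classified message. Field names follow the bullets above; everything else (the function, the VIP flag placement) is assumed:

```typescript
interface SenderReputation {
  sender_id: string;
  signal_count: number;
  noise_count: number;
  last_signal_date: string;  // ISO date of most recent signal
  channels_active: string[]; // one reputation across channels
  vip: boolean;              // VIP override: always signal
}

// Fold one classification outcome into the reputation record.
function recordClassification(
  rep: SenderReputation,
  kind: "signal" | "noise",
  channel: string,
  date: string,
): SenderReputation {
  return {
    ...rep,
    signal_count: rep.signal_count + (kind === "signal" ? 1 : 0),
    noise_count: rep.noise_count + (kind === "noise" ? 1 : 0),
    last_signal_date: kind === "signal" ? date : rep.last_signal_date,
    channels_active: rep.channels_active.includes(channel)
      ? rep.channels_active
      : [...rep.channels_active, channel],
  };
}
```

The cross-channel identity requirement falls out of the shape: one record per person, with `channels_active` accumulating wherever they appear.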

Phases

Phase 0 — Foundation (shipped 2026-03-04)

  • Flip triage posture to signal-first (commit 6e3bb5b)
  • “interesting” triage category
  • Front deep links in escalations
  • 2x/day nag digest function (email-nag)
  • Remove auto-archive of real people

Phase 1 — Channel Interface + Haiku Router

  • Define ChannelSource and ClassificationRubric interfaces
  • Implement Haiku classification with per-channel rubric
  • Route HIGH → immediate enrichment, MEDIUM → immediate lighter, LOW → batch
  • Action item extraction from classified messages → Todoist inbox
  • Source context in task descriptions (Front link, Slack link, sender, thread subject)
  • Dedup against existing Todoist tasks before creating
  • Batch accumulator with time/count trigger for Sonnet review
  • Replace Sonnet-on-everything in check-email with Haiku pre-route

Phase 1.25 — Gardening Cron

  • 6-hourly Inngest cron function
  • Todoist task deduplication (same thread/sender/topic)
  • Stale task detection (source thread resolved)
  • Thread consolidation (replace fragmented tasks with summaries)
  • Priority drift evaluation
  • Memory observation capture from gardening insights

Phase 1.5 — Slack Thread Intelligence

  • Treat Slack threads as first-class conversation units (same as Front conversations)
  • Thread-level classification: who started it, is Joel expected to respond, how old
  • Include aging Slack threads in the nag digest alongside email
  • Cross-reference: same person emailing AND slacking = higher urgency

Phase 2 — Memory Pipeline Integration (all channels)

  • Feed signal-classified messages into memory observation pipeline
  • Memory recall injection per sender/topic for enrichment
  • Active project matching (scan Vault/Projects/)
  • Contact dossier lookup/creation
  • Granola meeting cross-reference (“you met Tuesday”)
  • Cross-channel context (“Alex messaged in Slack 3h before this email”)

Phase 3 — Relationship Intelligence

  • Sender reputation tracking (Typesense collection)
  • Response time tracking (“you usually reply to Alex in 2h”)
  • Relationship health scoring
  • Weekly relationship report to Telegram
  • Auto-create contact dossiers for repeat senders

Phase 4 — Proactive Intelligence

  • “You haven’t heard from X in 30 days” — relationship drift alerts
  • “This thread has been open 5 days” — escalating urgency
  • Suggested reply drafts with full context
  • Calendar-aware timing (“Alex emailed, you have a meeting Thursday”)

Consequences

Positive

  • Joel replies to important emails promptly — relationships improve
  • Full system context makes replies informed and relevant
  • No more lost signals — false negatives eliminated by design
  • Memory system compounds value — past interactions inform current triage
  • Sender reputation reduces classification cost over time

Negative

  • More Telegram notifications (by design — signal-first means more surfaces)
  • Haiku + Sonnet cost per email (mitigated by Haiku pre-router skipping noise)
  • RAG enrichment adds latency to the pipeline (mitigated by parallel fetches)
  • Sender reputation needs bootstrapping period

Risks

  • Over-notification fatigue — mitigated by digest batching and urgency tiers
  • Context hallucination — mitigated by structured RAG with source attribution
  • Granola/Slack rate limits — mitigated by timeouts and graceful degradation (already handled in VIP function)

Channel Taxonomy

Operator channels (Telegram, Discord DM): Joel ↔ system. Already wired as gateway sessions. Memory observations flow via session compaction. These are NOT part of this ADR — they’re the operator interface.

Multi-user channels (Slack workspace, Discord servers, email): Joel’s professional network. The system watches these for signal and feeds observations into the memory pipeline. This ADR covers these.

The same memory pipeline applies to both — observe → write-gate → store → retrieve — but the observation source differs:

  • Operator channels: session transcript compaction (existing)
  • Multi-user channels: signal-classified messages + thread context (this ADR)

Every signal-classified message from a multi-user channel becomes a memory observation. Over time, joelclaw builds a durable model of relationships, commitments, and communication patterns that compounds across all channels and enriches every future interaction.

Related ADRs

  • ADR-0021: Agent Memory System — the pipeline these observations feed into
  • ADR-0131: Slack Intelligence — provides workspace search and channel taxonomy
  • ADR-0199: Close the Loop — reflection patterns apply to channel learning