ADR-0224 (accepted)

Site-Aware Content Routing and Sync Contract

Context and Problem Statement

ADR-0223 made venue selection explicit at the operator layer.

That is necessary, but it is not sufficient.

The cantrip migration exposed the deeper bug: joelclaw’s content-sync pipeline still treated a public discovery note in the Vault as implicitly publishable on joelclaw.com.

Observed failure mode:

  1. a Wizardshit-bound cantrip cluster was removed from joelclaw Convex,
  2. the old /cool/* pages disappeared briefly,
  3. content-sync ran again,
  4. the same notes were re-upserted into joelclaw Convex,
  5. the deleted pages came back.

Root cause: the source content had no site-aware routing contract. The system knew the notes were public discoveries, but it did not know which public site was allowed to host them.

That makes the current pipeline too blunt:

  • Vault note exists
  • note is not private
  • therefore publish to joelclaw Convex

That logic is wrong in a multi-site system.

The fix is not “delete harder” and it is not “put Redis in charge.”

Redis may help as a fast derived routing cache, but canonical routing truth has to live with the content metadata and the durable content schema. Otherwise every replaying sync job can resurrect the wrong site projection.

Decision

1. Content routing metadata is canonical content state

Public content that can be projected to sites must carry explicit routing metadata in its canonical metadata/schema.

At minimum, the model must be able to answer:

  • which site is canonical for this content
  • which sites, if any, are allowed to project it
  • whether the content is public, private, archived, or migration-only
  • what path policy applies on each site

For discovery notes, this metadata starts in Vault frontmatter and must flow into Convex contentResources.
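
A minimal sketch of a model that can answer those four questions. The three field names come from the suggested contract later in this ADR; `pathPolicy`, the `SiteId` list, and the `mayProject` helper are hypothetical and not the actual contentResources schema. The sketch also assumes only `public` content is projectable; how `migration-only` should behave is left open here.

```typescript
// Hypothetical site identifiers; the ADR does not fix the final list.
type SiteId = "joelclaw" | "wizardshit";

// Values taken from the ADR's suggested routePolicy semantics.
type RoutePolicy = "public" | "private" | "archived" | "migration-only";

// Illustrative shape only; not the real contentResources schema.
interface ContentRouting {
  canonicalSite: SiteId | "shared"; // which site is canonical for this content
  publishTargets: SiteId[]; // which sites, if any, may project it
  routePolicy: RoutePolicy; // public, private, archived, or migration-only
  pathPolicy: Partial<Record<SiteId, string>>; // hypothetical per-site path prefix
}

// Answers "may this site project this content?" from metadata alone.
// Assumption: only `public` content is projectable in this sketch.
function mayProject(routing: ContentRouting, site: SiteId): boolean {
  return routing.routePolicy === "public" && routing.publishTargets.includes(site);
}

// A Wizardshit-bound note: public, but not allowed on joelclaw.
const cantripNote: ContentRouting = {
  canonicalSite: "wizardshit",
  publishTargets: ["wizardshit"],
  routePolicy: "public",
  pathPolicy: { wizardshit: "/example-path" }, // placeholder value
};
```

With this shape, `mayProject(cantripNote, "joelclaw")` is false even though the note is public, which is the distinction the old pipeline could not make.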

2. content-sync becomes site-aware

content-sync must stop treating “public discovery” as equivalent to “publish on joelclaw.com”.

Before upserting a resource into a site’s runtime content plane, sync must check routing metadata and decide whether that site is an allowed projection target.

If the site is not allowed:

  • do not upsert it
  • remove an existing projection for that site if one exists
  • revalidate the affected paths/tags for that site

This applies even when the content still exists in the Vault and is still public somewhere else.
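
The per-resource decision above can be sketched as a pure function. `decideSyncAction`, `pathsToRevalidate`, and the `RoutedResource` shape are hypothetical names for illustration; the real logic would live in content-sync and use its own types.

```typescript
type SyncAction = "upsert" | "remove" | "skip";

// Illustrative shape; only the routing field needed for the decision.
interface RoutedResource {
  resourceId: string;
  publishTargets: string[]; // sites allowed to project this resource
}

// Decide what content-sync should do for one resource on one site.
// `hasProjection` says whether the site already holds a copy.
function decideSyncAction(
  resource: RoutedResource,
  site: string,
  hasProjection: boolean,
): SyncAction {
  if (resource.publishTargets.includes(site)) return "upsert";
  // Not an allowed target: never upsert, and remove a stale copy if one exists.
  return hasProjection ? "remove" : "skip";
}

// Upserts and removals both change what the site serves, so both revalidate.
// A skip touches nothing and needs no revalidation.
function pathsToRevalidate(action: SyncAction, projectedPath: string): string[] {
  return action === "skip" ? [] : [projectedPath];
}
```

Keeping the decision pure makes it easy to test separately from the Convex writes and the revalidation calls it triggers.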

3. Venue ownership and projection routing are separate fields

The model needs both concepts:

  • canonical ownership — where this content belongs
  • projection permission — where this content may appear

That means a resource may have one canonical site and zero or more temporary/explicit projection targets, but the default is not broad mirroring.

Steady-state duplicates across sites are migration debt unless explicitly permitted.
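
A minimal sketch of why the two fields stay separate: with both present, non-canonical copies can be listed directly instead of inferred. `SiteRouting` and `nonCanonicalProjections` are hypothetical names.

```typescript
interface SiteRouting {
  canonicalSite: string; // where the content belongs
  publishTargets: string[]; // where the content may appear
}

// Sites that would carry a copy of the content without owning it.
// Per this ADR, these are migration debt unless explicitly permitted.
function nonCanonicalProjections(routing: SiteRouting): string[] {
  return routing.publishTargets.filter((site) => site !== routing.canonicalSite);
}
```

Note that in this sketch a `shared` canonical site makes every target non-canonical; whether `shared` content counts as debt is a policy question the sketch does not answer.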

4. Site routing metadata must be durable and inspectable

The routing contract must live in places that survive replay:

  • Vault frontmatter / authoring metadata
  • Convex contentResources fields or metadata
  • explicit sync logic

It must not depend on:

  • inferred intent from title/content alone
  • current repo location
  • the last tool that touched the content
  • an ephemeral cache entry with no durable backing

5. Redis is allowed only as a derived routing index

If routing lookup needs to be fast, Redis may hold a derived site-routing map.

But Redis is not the source of truth.

If Redis is used, it must be regenerated from canonical content metadata and safe to discard at any time.

Canonical truth remains:

  • Vault metadata on the write side
  • Convex content schema on the durable runtime side
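
A sketch of what "derived and safe to discard" means in practice. An in-memory `Map` stands in for Redis here so that no Redis client API is assumed; `CanonicalRecord` and `buildRoutingIndex` are hypothetical names.

```typescript
// Illustrative view of a canonical record; only routing-relevant fields.
interface CanonicalRecord {
  resourceId: string;
  publishTargets: string[];
}

// Build a site -> resourceIds index purely from canonical records.
// The Map stands in for Redis: it is written from canonical state and
// never read back as a source of truth.
function buildRoutingIndex(records: CanonicalRecord[]): Map<string, string[]> {
  const index = new Map<string, string[]>();
  for (const record of records) {
    for (const site of record.publishTargets) {
      const ids = index.get(site) ?? [];
      ids.push(record.resourceId);
      index.set(site, ids);
    }
  }
  return index;
}
```

Because the index is a pure function of the canonical records, dropping it and calling `buildRoutingIndex` again yields the same routing answers.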

6. Discovery notes need explicit routing fields

For discovery notes specifically, the minimum routing contract is:

  • canonicalSite
  • publishTargets
  • routePolicy

Suggested semantics:

  • canonicalSite: joelclaw | wizardshit | shared
  • publishTargets: explicit site list
  • routePolicy: public | private | archived | migration-only

The exact field names can still be refined during implementation, but the contract must represent these concepts explicitly.
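
A minimal sketch of reading these fields from already-parsed frontmatter. It assumes the YAML has been parsed into a plain object elsewhere, that the site list is just the two sites named in this ADR, and that a note with missing or invalid routing yields `null` rather than a default. `readRouting` and the helper names are hypothetical.

```typescript
// Assumed site and policy lists, taken from the suggested semantics above.
const SITES = ["joelclaw", "wizardshit"] as const;
const POLICIES = ["public", "private", "archived", "migration-only"] as const;

type Site = (typeof SITES)[number];
type RoutePolicy = (typeof POLICIES)[number];

interface DiscoveryRouting {
  canonicalSite: Site | "shared";
  publishTargets: Site[];
  routePolicy: RoutePolicy;
}

function isSite(value: unknown): value is Site {
  return typeof value === "string" && (SITES as readonly string[]).includes(value);
}

function isPolicy(value: unknown): value is RoutePolicy {
  return typeof value === "string" && (POLICIES as readonly string[]).includes(value);
}

// Returns null when any routing field is missing or invalid, so the caller
// has to decide explicitly what happens to unrouted notes (an assumption of
// this sketch, not a decision made by the ADR).
function readRouting(frontmatter: Record<string, unknown>): DiscoveryRouting | null {
  const { canonicalSite, publishTargets, routePolicy } = frontmatter;
  if (canonicalSite !== "shared" && !isSite(canonicalSite)) return null;
  if (!Array.isArray(publishTargets) || !publishTargets.every(isSite)) return null;
  if (!isPolicy(routePolicy)) return null;
  return { canonicalSite, publishTargets, routePolicy };
}
```

Returning `null` instead of guessing keeps old notes without routing metadata from being silently treated as joelclaw content.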

7. Deletion is part of sync correctness, not manual cleanup folklore

When content metadata changes so a site is no longer an allowed target, sync must remove the stale site projection automatically.

This prevents the current class of bug where an operator deletes the Convex record, but the next sync run recreates it from still-public source notes.
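
The cantrip failure can be replayed against a small planning function. This is a sketch under stated assumptions: `planSync`, `SourceNote`, and `SyncPlan` are hypothetical names, and projections with no matching source note are deliberately left untouched because this ADR does not say how orphans should be handled.

```typescript
interface SourceNote {
  resourceId: string;
  publishTargets: string[]; // sites allowed to project this note
}

interface SyncPlan {
  upsert: string[]; // resourceIds to write to this site
  remove: string[]; // resourceIds whose stale projection must go
}

// Plan one sync run for `site`. Notes routed to the site are upserted;
// existing projections whose source note excludes the site are removed.
function planSync(site: string, notes: SourceNote[], existingIds: string[]): SyncPlan {
  const allowed = notes.filter((n) => n.publishTargets.includes(site));
  const excluded = new Set(
    notes.filter((n) => !n.publishTargets.includes(site)).map((n) => n.resourceId),
  );
  return {
    upsert: allowed.map((n) => n.resourceId),
    remove: existingIds.filter((id) => excluded.has(id)),
  };
}
```

Running the plan twice for the same site is stable: once the stale projection is gone, the second run neither removes nor re-upserts it, so a replay can no longer bring the page back.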

Implementation Plan

Required skills before implementation

  • adr-skill — to keep follow-on decision records and supersession honest
  • joelclaw-web — to understand joelclaw.com runtime content readers, tags, and revalidation behavior
  • system-bus — to update the Inngest sync path without violating worker/runtime conventions
  • o11y-logging — to ensure site-aware sync removals/upserts cannot fail silently
  • content-publish — to keep publish/migration semantics aligned with operator-facing content workflows

Affected surfaces

  • ~/Vault/Resources/discoveries/*.md frontmatter contract
  • packages/system-bus/src/inngest/functions/content-sync.ts
  • packages/system-bus/src/lib/convex-content-sync.ts
  • Convex contentResources schema/metadata for site-routing fields
  • joelclaw web runtime readers for discoveries if they need to filter by site target
  • any CLI/admin surface used to verify sync and content state

Required follow-on slices

  1. Schema and frontmatter contract

    • add site-routing metadata to discovery frontmatter
    • extend Convex contentResources metadata/schema to preserve it durably

  2. Site-aware sync execution

    • teach content-sync which site it is syncing for
    • upsert only allowed targets
    • remove stale projections when routing metadata excludes the current site

  3. Observability

    • emit OTEL/log data for site-aware upsert/skip/remove decisions
    • make removals auditable by resourceId, site, and reason

  4. Verification surface

    • add a verification path that reports routing mismatches between Vault metadata and Convex projections
    • prove the cantrip class of bug cannot recur silently

  5. Optional derived cache

    • if runtime pressure justifies it, add Redis as a regenerated routing index
    • do not introduce Redis before the durable metadata contract exists
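
For the observability slice, one possible shape for an auditable decision record is sketched below. The event fields beyond resourceId, site, and reason, along with the function names, are assumptions; the sketch emits a plain JSON line and does not assume any particular OTEL or logging API.

```typescript
type Decision = "upsert" | "skip" | "remove";

// One record per sync decision, keyed by the fields the ADR asks for.
interface SyncDecisionEvent {
  resourceId: string;
  site: string;
  decision: Decision;
  reason: string; // human-readable cause, e.g. which routing rule applied
  at: string; // ISO-8601 timestamp (assumed field)
}

// `now` is injectable so tests and replays can use a fixed clock.
function buildDecisionEvent(
  resourceId: string,
  site: string,
  decision: Decision,
  reason: string,
  now: Date = new Date(),
): SyncDecisionEvent {
  return { resourceId, site, decision, reason, at: now.toISOString() };
}

// Serialise as a single JSON line that a log pipeline could ingest.
function toLogLine(entry: SyncDecisionEvent): string {
  return JSON.stringify(entry);
}
```

Emitting an event for skips as well as removals means a missing page can always be traced to a routing decision rather than to a silent failure.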

Verification

  • A discovery note marked for wizardshit only does not republish onto joelclaw.com
  • Changing routing metadata causes stale joelclaw projections to be removed on the next sync run
  • Sync logs/telemetry show why a resource was upserted, skipped, or removed for a given site
  • Deleting a Convex projection without changing source metadata is no longer the normal cleanup path; sync correctness comes from metadata
  • Redis, if introduced later, can be dropped and rebuilt without changing routing truth
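
A sketch of the mismatch report these checks could rely on: compare what Vault metadata says should exist with what is actually projected. `VaultRouting`, `Projection`, `Mismatch`, and `findRoutingMismatches` are hypothetical names; how the two inputs are loaded from the Vault and Convex is out of scope here.

```typescript
interface VaultRouting {
  resourceId: string;
  publishTargets: string[]; // sites the Vault metadata allows
}

interface Projection {
  resourceId: string;
  site: string; // site that currently holds a copy
}

interface Mismatch {
  resourceId: string;
  site: string;
  kind: "unexpected-projection" | "missing-projection";
}

// Report every (resource, site) pair where metadata and projections disagree.
function findRoutingMismatches(vault: VaultRouting[], projections: Projection[]): Mismatch[] {
  const key = (id: string, site: string): string => `${id}@${site}`;
  const expected = new Set<string>();
  for (const note of vault) {
    for (const site of note.publishTargets) expected.add(key(note.resourceId, site));
  }
  const actual = new Set(projections.map((p) => key(p.resourceId, p.site)));

  const mismatches: Mismatch[] = [];
  // Projected somewhere the metadata does not allow (the cantrip bug).
  for (const p of projections) {
    if (!expected.has(key(p.resourceId, p.site))) {
      mismatches.push({ resourceId: p.resourceId, site: p.site, kind: "unexpected-projection" });
    }
  }
  // Allowed by metadata but not yet projected.
  for (const note of vault) {
    for (const site of note.publishTargets) {
      if (!actual.has(key(note.resourceId, site))) {
        mismatches.push({ resourceId: note.resourceId, site, kind: "missing-projection" });
      }
    }
  }
  return mismatches;
}
```

An empty result after a sync run is the signal that routing metadata and site projections agree.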

Consequences

Positive

  • site ownership stops being inferred from content type alone
  • replayed sync jobs stop resurrecting content on the wrong site
  • migration cleanup becomes durable instead of whack-a-mole
  • routing policy becomes visible in both authoring and runtime state

Tradeoffs

  • content schema and authoring metadata become a bit heavier
  • sync logic gets stricter and more explicit
  • old discovery notes may need backfill/default routing metadata before the system is fully consistent

Neutral

  • this ADR does not decide every future site name forever
  • this ADR does not require Redis
  • this ADR does not replace ADR-0223; it operationalizes it in the sync layer

Alternatives Considered

Alternative 1: Keep deleting wrong-site Convex records manually

Description: let operators remove stale projections case by case when they appear.

Why rejected: the next sync replay just recreates them if source metadata still says they are public. Manual deletion is not a contract.

Alternative 2: Put routing truth in Redis

Description: maintain site-routing decisions in Redis and let sync/runtime consult that cache.

Why rejected: Redis is ephemeral. If its routing state is flushed or goes stale, a replaying sync job can resurrect the wrong site projection. It is fine as a derived index, but not as the canonical routing source for durable content.

Alternative 3: Infer site routing from content type or directory structure

Description: assume discoveries belong to joelclaw, tutorials belong elsewhere, articles belong wherever the current tool defaults.

Why rejected: this is exactly how the cantrip bug happened. Content type is not venue ownership.

References

  • ~/Vault/docs/decisions/0168-convex-canonical-content-lifecycle.md
  • ~/Vault/docs/decisions/0223-multi-venue-publishing-lifecycle.md
  • ~/Code/joelclaw/packages/system-bus/src/inngest/functions/content-sync.ts
  • ~/Code/joelclaw/packages/system-bus/src/lib/convex-content-sync.ts