The Agent Writing Loop


A writing system where AI drafts content under explicit constraints, an editor leaves inline feedback on published articles, a durable pipeline applies edits and verifies them, and patterns from feedback get encoded into the constraint files — making every future article better.

constraint files → AI writes → publish (no deploy) → editor feedback
     ↑                                                      ↓
     └──── human encodes patterns ← durable edit pipeline ←─┘

Components

0. Gather everything you’ve ever written

Gather everything you’ve ever written before writing any constraint files. Articles, blog posts, newsletters, emails, journal entries, notes, tweets, talks, Slack messages, Google Docs, Notion pages, that graph-based note-taking app you loved in 2021 but stopped using — anything with your actual voice in it. Published or not. Polished or not. The raw material matters more than the finish.

The more the better. 20 pieces is a minimum, 50+ is where patterns get reliable. Dump it all into a single directory. Strip navigation, headers, footers — just the prose.

This corpus is the foundation of everything. You’ll analyze it to build a “write like me” skill file — the constraint that keeps your first drafts from being generic AI slop. Without a real corpus derived from your real words, the voice file is fiction and the system produces garbage.

1. Constraint stack

Three files, each adding specificity. Version-controlled in your repo.

Voice file — how the AI speaks. Not “be helpful” — specific patterns:

  • Sentence length distribution (mostly 1-3 sentence paragraphs)
  • Profanity policy (when it’s texture vs. when it’s noise)
  • Banned phrases (“In this article I will”, “Let me be real”, “Key takeaways”)
  • Opening pattern (hooks, not thesis statements)
  • Ending pattern (abrupt, no forced wrap-up)

Guardrail file — what the AI cannot do:

  • Cannot fabricate the author’s opinions, experiences, or philosophy
  • Cannot generate worldview statements attributed to a real person
  • Cannot invent temporal claims without checking source data
  • Cannot invent anecdotes
  • Must flag and log uncertainty as [TODO: author's take on X], with structured telemetry for every unresolved claim
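
The last guardrail can be enforced mechanically before publish. A minimal sketch in TypeScript — the `findUnresolvedClaims` helper and its record shape are illustrative, not part of the original spec:

```typescript
// A structured telemetry record for one unresolved claim.
type UnresolvedClaim = { note: string; offset: number };

// Scan a draft for markers of the form [TODO: ...] and return
// them as structured records the pipeline can log and surface.
function findUnresolvedClaims(draft: string): UnresolvedClaim[] {
  const claims: UnresolvedClaim[] = [];
  const pattern = /\[TODO:\s*([^\]]+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(draft)) !== null) {
    claims.push({ note: match[1].trim(), offset: match.index });
  }
  return claims;
}
```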

Style reference — patterns derived from the author’s actual published work. The more specific, the better. Include anti-patterns (what the author never does) alongside positive examples.

These files are loaded as context when any agent writes content. They are the training data — not weights, not RLHF, just text files.
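
Loading the stack can be as simple as concatenation. A sketch, assuming the three files have already been read into strings — the section headers and ordering are illustrative choices, not requirements:

```typescript
// The three constraint files, read from the repo as plain markdown.
type ConstraintStack = {
  voice: string;      // how the AI speaks
  guardrails: string; // what the AI cannot do
  style: string;      // patterns from the author's published work
};

// Concatenate the stack into one system prompt for any writing agent.
// Guardrails go last so they are the final instruction before the task.
function buildSystemPrompt(stack: ConstraintStack): string {
  return [
    "## Voice\n" + stack.voice.trim(),
    "## Style reference\n" + stack.style.trim(),
    "## Guardrails (non-negotiable)\n" + stack.guardrails.trim(),
  ].join("\n\n");
}
```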

2. Content store (CMS-optional)

Articles live in a database, not the filesystem. Requirements:

  • Upsert by stable resourceId (e.g. article:the-writing-loop)
  • Store content as MDX string in a fields JSON column
  • Support soft-delete (set deletedAt, don’t destroy)
  • Real-time subscriptions for the feedback UI

Any database works. Convex, Supabase, PlanetScale, Turso. The key property is that content updates don’t require a deploy.
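
The upsert-plus-soft-delete contract can be sketched with an in-memory stand-in — any real database with upsert-by-key semantics behaves the same way; the names here are illustrative:

```typescript
type ArticleRecord = {
  resourceId: string;          // stable key, e.g. "article:the-writing-loop"
  fields: { content: string }; // MDX string stored in a JSON column
  deletedAt: number | null;    // soft delete: set the timestamp, don't destroy
};

// In-memory stand-in for the content store.
const articles = new Map<string, ArticleRecord>();

// Upsert: the same resourceId always maps to one record, so repeated
// pipeline runs update in place instead of duplicating articles.
function upsertArticle(resourceId: string, content: string): ArticleRecord {
  const existing = articles.get(resourceId);
  const record: ArticleRecord = {
    resourceId,
    fields: { content },
    deletedAt: existing?.deletedAt ?? null,
  };
  articles.set(resourceId, record);
  return record;
}

function softDeleteArticle(resourceId: string): void {
  const record = articles.get(resourceId);
  if (record) record.deletedAt = Date.now();
}
```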

3. Paragraph-addressable rendering

Every paragraph in rendered content gets a stable identifier:

// In your MDX component map: give every <p> a stable, content-derived ID.
// `hash` and `textContent` are small helpers you supply.
p: ({ children, ...props }) => {
  const id = hash(textContent(children))
  return <p data-paragraph-id={id} {...props}>{children}</p>
}

The hash must be deterministic — same content, same ID. This is what the feedback UI targets.
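
One workable pair of helpers — FNV-1a is an assumption here; the spec only requires that the same content always yields the same ID:

```typescript
// Deterministic 32-bit FNV-1a hash: same text always yields the
// same paragraph ID, so feedback keeps targeting the right paragraph.
function contentHash(text: string): string {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return (hash >>> 0).toString(16).padStart(8, "0");
}

// Flatten React-style children (strings, numbers, nested arrays)
// to plain text before hashing, so markup tweaks don't shift IDs.
function textContent(children: unknown): string {
  if (typeof children === "string") return children;
  if (typeof children === "number") return String(children);
  if (Array.isArray(children)) return children.map(textContent).join("");
  return "";
}
```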

4. Inline feedback UI

Two elements:

Comment editor — a portal that mounts directly after the target paragraph. Click a paragraph, editor appears below it. Submit stores a feedback record linked to the article’s resourceId.

Status indicator — a small badge (pulse dot, toast, whatever) that shows the feedback pipeline state: queued → processing → applied/failed. Real-time via database subscriptions. Auto-hides after resolution.

5. Feedback storage

A table with these fields:

feedbackItems:
  resourceId  string   # links to the article
  content     string   # the feedback text
  status      enum     # pending → processing → applied | failed
  createdAt   number
  resolvedAt  number?

Index on resourceId and status. The create mutation writes the record and fires an event to trigger the pipeline.
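
A sketch of that create mutation against an in-memory table, with the pipeline trigger injected as a callback — the event name is illustrative:

```typescript
type FeedbackStatus = "pending" | "processing" | "applied" | "failed";

type FeedbackItem = {
  resourceId: string;      // links to the article
  content: string;         // the feedback text
  status: FeedbackStatus;
  createdAt: number;
  resolvedAt: number | null;
};

// In-memory stand-in for the feedbackItems table.
const feedbackItems: FeedbackItem[] = [];

// Create mutation: write the record, then fire the event that wakes
// the durable pipeline. `emit` stands in for your queue or event bus.
function createFeedback(
  resourceId: string,
  content: string,
  emit: (event: { name: string; resourceId: string }) => void
): FeedbackItem {
  const item: FeedbackItem = {
    resourceId,
    content,
    status: "pending",
    createdAt: Date.now(),
    resolvedAt: null,
  };
  feedbackItems.push(item);
  emit({ name: "feedback.created", resourceId });
  return item;
}
```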

6. Durable edit pipeline

A multi-step function that survives crashes. Each step is memoized — retries pick up where they left off, not from the beginning.

Steps:

  1. Fetch — pull article content and all pending feedback from the database
  2. Rewrite — send content + feedback to an LLM with instructions: apply the feedback, preserve voice, return full rewritten content
  3. Verify — a separate LLM call reviews the original, the rewrite, and each feedback item. Returns a verdict per item: applied (with evidence) or missed. This catches the LLM quietly ignoring feedback.
  4. Retry missed — if any items were missed, rewrite again with only those items. Max 2 retries.
  5. Upsert — write the verified content back to the database
  6. Revalidate — bust the CDN/ISR cache so the page updates without a deploy
  7. Mark status — update each feedback item to applied or failed based on verification verdicts
  8. Notify — alert the author that edits landed (Telegram, email, Slack, whatever)

Concurrency: one pipeline execution per article at a time. Queue additional feedback until the current run completes.

The verification step is non-optional. Without it, the LLM will silently ignore ~15-20% of feedback items. A second LLM call checking the diff is cheaper than a human re-reading the whole article.
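
The memoization the steps rely on fits in a few lines. Durable engines (Inngest, Temporal, and the like) persist step results for you; this synchronous sketch just shows the replay-don't-rerun idea:

```typescript
// Completed step results keyed by step name. In a real engine this
// map is durable storage, so it survives a crash mid-pipeline.
const memo = new Map<string, unknown>();

// Run a step at most once: a retry after a crash replays finished
// steps from the memo and resumes at the first incomplete one.
function step<T>(name: string, fn: () => T): T {
  if (memo.has(name)) return memo.get(name) as T; // replay
  const result = fn();
  memo.set(name, result); // checkpoint before the next step
  return result;
}
```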

7. Cache revalidation endpoint

An API route that accepts tags and paths, validates a shared secret, and calls your framework’s revalidation API:

POST /api/revalidate
{ secret, tags: ["post:slug"], paths: ["/slug"] }

Feedback → edit → live in under a minute, no deploy.
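
A framework-agnostic sketch of that route, with the revalidation calls injected so it stays testable — in Next.js these would be `revalidateTag` and `revalidatePath`:

```typescript
type RevalidateRequest = {
  secret: string;
  tags?: string[];   // e.g. ["post:slug"]
  paths?: string[];  // e.g. ["/slug"]
};

// Validate the shared secret, then bust each tag and path.
// The two callbacks stand in for your framework's revalidation API.
function handleRevalidate(
  req: RevalidateRequest,
  expectedSecret: string,
  revalidateTag: (tag: string) => void,
  revalidatePath: (path: string) => void
): { status: number } {
  if (req.secret !== expectedSecret) return { status: 401 };
  for (const tag of req.tags ?? []) revalidateTag(tag);
  for (const path of req.paths ?? []) revalidatePath(path);
  return { status: 200 };
}
```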

8. Skill update loop (human-in-the-loop)

When feedback reveals a pattern — not a one-off typo, but a recurring failure mode — a human updates the constraint files:

Pattern observed: AI generates philosophical pontification in author's voice

Update voice file: "Pontification must come from the author"
Update guardrail file: "Never generate philosophical positions attributed to a real person"

This is deliberately manual. The human decides which patterns are worth encoding. The system handles everything else.

Every update to the constraint files applies to future articles written by every agent in the system.

Properties

The system has these properties when built correctly:

  • No-deploy publishing. Content changes are live in < 60 seconds via cache revalidation.
  • Durable edits. Pipeline steps are memoized. A crash mid-rewrite doesn’t corrupt the article.
  • Verified rewrites. A second LLM pass catches silently ignored feedback.
  • Compound improvement. Each feedback cycle can update the constraint files, improving all future output.
  • Author sovereignty. The AI cannot fabricate the author’s voice, opinions, or experiences. The constraint files are explicit about this.
  • Observable uncertainty. Unresolved claims are tagged and logged with structured telemetry.

What this doesn’t cover

  • Authentication for the feedback UI (you probably want it)
  • Multi-author support (extend resourceId scheme)
  • Feedback moderation (add a review queue before the pipeline)
  • Analytics on feedback patterns (useful, but build it after the core loop works)
  • The specific LLM or inference provider (any instruction-following model works)

Implementation notes

Use whatever tools match your stack. The architecture is:

Concern                     Options
Content store               Convex, Supabase, PlanetScale, Turso, Postgres
Durable pipeline            Inngest, Vercel Workflow, Temporal, AWS Step Functions, Trigger.dev
Rendering                   Next.js + MDX, Astro, Remix, any SSR framework with ISR
Real-time feedback status   Convex subscriptions, Supabase realtime, websockets
Cache revalidation          Next.js revalidateTag, Cloudflare purge API, Fastly surrogate keys
LLM                         Any instruction-following model via any provider

The constraint files are plain markdown. The feedback table is five columns. The pipeline is eight steps.