Three-layer drift-prevention playbook

When the operator catches the orchestrator drifting on a discipline — applying large-pr-ok as a feature-work shortcut, shipping behavior without updating its doc, re-asking the same question after the agent forgot — the natural impulse is “save a memory entry so it doesn’t happen again.” That’s necessary but not sufficient. Today (2026-05-18) two distinct drift classes each got a three-layer fix; the structure is identical, so the playbook is now extractable.

The pattern

Layer	Where it fires	What it catches	Cost to ship
1. Memory HARD RULE	Loaded at session start. Top of `MEMORY.md` as an index entry pointing to a detail file with the mechanical check tree.	Drift at session start. Necessary because the agent reads memory once per session.	~15 min — one memory file + one index line.
2. Rig-side enforcement	Fires at the trigger point of the drift (PR creation, commit, dispatch, etc.). Lives in the deterministic gate code — `pr-size-check.js`, the webhook handler, a watcher’s pre-check. Surfaces as a structured warning in the review body or as a hard block.	Drift in the middle of a long session, when the memory rule has been read but forgotten. Necessary because memory entries decay in salience the longer the session runs.	~30-60 min — detection helper + integration into existing gate + tests. Same shape as today’s `detectSuggestedSplit` and `detectDocMismatches`.
3. Durable artifact	Lives in `rig-docs/research/` as a dated decision/research doc with empirical evidence (the failure pattern, the A/B comparison if available, the decision tree). Linked from `BRAIN.md`’s `## Conventions` section.	Drift across agents and across time. Necessary because Layer 1 + Layer 2 are orchestrator-only — Dev-E, Review-E, iBuild-E, Planner need to discover the rule too.	~30 min — single research doc. BRAIN.md update is a tiny follow-up after any pending budget bump.

Optional Layer 4 (audit cron): when even Layer 2 might miss a case (false-negatives in detection heuristics, retroactive drift on already-merged PRs), a periodic audit job surfaces the drift week-over-week. Higher cost; defer until a third instance of the same drift class proves it’s needed.

Today’s two instances

Drift class A: `large-pr-ok` empirically bypasses Review-E scrutiny

Symptom: orchestrator applied `large-pr-ok` to 3 feature PRs (rc#1119 / rc#1125 / rc#1135) in the same session. Operator caught it on the 3rd and asked: “what is the correct rig procedure to follow in this scenario?”

Three-layer fix:

L1 memory: `feedback_pr_split_for_feature_work.md` (orchestrator-local), a 5-step decision tree run before applying the label.
L2 rig enforcement: rar#492. `detectSuggestedSplit` extends `pr-size-check.js` to suggest the specific PR A / PR B shape when a PR touches both `Core/` and `Api/`. It also adds a justification-line nudge for `large-pr-ok`.
L3 durable artifact: `research/2026-05-18-pr-size-and-large-pr-ok-semantics`, which records the A/B evidence: the same 554-LOC change was shipped labelled vs. split, and the split path caught 3 real bugs.
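The Layer-2 check for this drift class can be pictured with a minimal sketch. This is not the rar#492 code: the input shape (a flat list of repo-relative paths), the assumption that `Core/` and `Api/` are top-level directories, and the returned fields (`prA`, `prB`, `unassigned`, `warning`) are all hypothetical and only illustrate the "touches both areas, so suggest a split" rule.

```javascript
// Hypothetical sketch, not the rar#492 implementation.
// Assumption: changedFiles holds repo-relative paths, and Core/ and Api/
// are top-level directories.
function detectSuggestedSplit(changedFiles) {
  const core = changedFiles.filter((f) => f.startsWith('Core/'));
  const api = changedFiles.filter((f) => f.startsWith('Api/'));

  // Only suggest a split when the diff spans both areas.
  if (core.length === 0 || api.length === 0) return null;

  // Files outside both areas are reported but not assigned to either PR.
  const unassigned = changedFiles.filter(
    (f) => !f.startsWith('Core/') && !f.startsWith('Api/')
  );

  return {
    prA: core,
    prB: api,
    unassigned,
    warning:
      `Suggested split: PR A = ${core.length} file(s) under Core/, ` +
      `PR B = ${api.length} file(s) under Api/.`,
  };
}
```

Returning `null` when only one area is touched keeps the gate silent for PRs that are already well scoped; whether the real helper warns or blocks is not covered here.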

Self-validation in the same session: after shipping the 3 layers, the next disciplined-split PR (rc#1136) immediately caught a logic gap (LivePrStates missing changes_requested) that would have shipped silently under large-pr-ok.

Drift class B: behavior shipped without same-PR doc update

Symptom: orchestrator shipped rar#492 (size-gate behavior change) without updating `docs/pr-size-check.md`. The doc stayed at `updated: 2026-04-24` and still quoted the OLD message verbatim. Operator caught it later in the session during an “are docs and skills up to date?” audit, which also found a parallel gap: the pr-workflow skill was silent on the rule just shipped.

Three-layer fix:

L1 memory: `feedback_docs_in_same_pr.md` (orchestrator-local), a 5-point check tree run before pushing any behavior PR (same-name doc match, `AGENTS.md` accuracy, skill match, `CLAUDE.md`/`BRAIN.md` for new conventions, PR-body attestation).
L2 rig enforcement: rar#497. `detectDocMismatches` extends `pr-size-check.js` to flag `src/<X>.*` modified without `docs/<X>.md` in the diff, when the doc exists in the repo.
L3 durable artifact: this doc — pulls the meta-pattern out so it’s reusable.
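The doc-mismatch rule described for Layer 2 can be sketched as follows. This is an illustration under stated assumptions, not the rar#497 code: the two-argument signature, the use of plain path lists, and the choice to treat everything before the first dot as `<X>` (so `pr-size-check.test.js` maps to `docs/pr-size-check.md`) are all hypothetical.

```javascript
// Hypothetical sketch, not the rar#497 implementation.
// changedFiles: paths modified in the PR diff.
// repoFiles: every path that exists in the repo (assumed to be available).
function detectDocMismatches(changedFiles, repoFiles) {
  const changed = new Set(changedFiles);
  const repo = new Set(repoFiles);
  const missingDocs = new Set();

  for (const file of changedFiles) {
    // Assumption: only files directly under src/ are matched.
    const match = /^src\/([^/]+)$/.exec(file);
    if (!match) continue;

    // Treat the part before the first dot as <X>.
    const stem = match[1].split('.')[0];
    const doc = `docs/${stem}.md`;

    // Flag only when the doc exists in the repo but is absent from the diff.
    if (repo.has(doc) && !changed.has(doc)) missingDocs.add(doc);
  }

  return [...missingDocs];
}
```

The "doc exists in the repo" condition matters: without it, every source file that never had a doc would be flagged, and the warning would turn into noise.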

Self-validation: rar#497 dog-foods the rule by shipping src/pr-size-check.js + src/pr-size-check.test.js + docs/pr-size-check.md in the same PR.

When to apply the playbook

Not every operator nudge needs three layers. Apply when:

The drift has happened at least twice in the same session or week (one-off slips don’t justify infrastructure).
The drift is structurally observable — there’s a deterministic signal in the diff, the file list, or the event stream that a Layer-2 detector can hook into.
The drift has measurable cost — shipped bugs, stranded PRs, audit churn. Not just stylistic preferences.
The rule has a written form — if it doesn’t fit into a check tree or detection helper, it’s not yet ready to enforce; clarify first.

A single one-off is a memory entry. Recurring is a HARD RULE. Recurring AND structurally observable is a three-layer fix.
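One way to read the rule above as a decision function is sketched below. The field names and the two-occurrence threshold encoding are hypothetical, and folding the cost and written-form criteria into the three-layer branch is an interpretation of the list, not something the playbook spells out.

```javascript
// Hypothetical sketch of the escalation rule; all field names are assumptions.
function chooseResponse({
  occurrences,            // times the drift was seen in the same session or week
  structurallyObservable, // a deterministic signal exists for a Layer-2 detector
  measurableCost,         // shipped bugs, stranded PRs, audit churn
  hasWrittenForm,         // the rule fits a check tree or detection helper
}) {
  // One-off slips get a memory entry and nothing more.
  if (occurrences < 2) return 'memory entry';

  // Recurring drift that meets every criterion gets all three layers.
  if (structurallyObservable && measurableCost && hasWrittenForm) {
    return 'three-layer fix';
  }

  // Recurring drift that is not yet enforceable stays a memory HARD RULE.
  return 'hard rule';
}
```

For example, a drift seen three times with a clear diff signal, a shipped bug, and a written check tree would come out as a three-layer fix, while the same drift without a detectable signal would stay a HARD RULE.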

Anti-pattern: stop at Layer 1

Memory entries alone don’t survive long sessions. The memory file loads once at session start; by the time the agent is six hours in and pattern-matching on shortcuts that worked the first time, the rule has decayed in attention. Both of today’s drift classes (A and B) had EXISTING general best-practice notes in memory before the session, and those notes didn’t prevent the drift. Layer 2 (the gate at the trigger point) is what actually catches the failure under cognitive load.

Anti-pattern: skip Layer 1

The reverse: shipping Layer 2 without Layer 1 means future-session agents discover the gate by getting flagged by it, instead of knowing the rule in advance. Layer 1 is the agent’s a-priori knowledge; Layer 2 is the fallback.

Sequencing

Ship in order: L1 → L2 → L3. L1 is the cheapest and locks the agent’s a-priori knowledge for the rest of the session. L2 takes longer (test infrastructure) but lands the enforcement. L3 is the durable artifact that survives across agents.

For multi-PR rollouts of Layer 2 (when the enforcement is itself a >500-LOC change), apply the PR-split discipline — eat your own dog food.

When NOT to apply the playbook

One-off slips — first-time mistake, save a memory entry, move on.
Subjective preferences — “I prefer x over y.” Memory entry, no enforcement layer.
Configuration changes with no behavior surface — version bumps, secret rotations, etc. Treat as ordinary chore work.
Operator-only rules — rules that govern human actions, not orchestrator actions. Document but don’t try to enforce in rig code.

Tracking metric

A simple measure of whether the playbook is working: count the number of times the operator has to catch the same drift class in a single session AFTER all three layers ship. The goal is 0: if Layer 2 catches it, the agent fixes it before the operator notices. Today’s PR-split discipline already showed this: after rar#492 + memory rule + research doc, the very next PR (rc#1136) had Layer-2-caught issues fixed before merge without operator intervention.
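The metric can be computed from a session's event log along these lines. The event shape (`driftClass`, `caughtBy`, `at`) and the function name are hypothetical, and the sketch reads the metric as counting operator catches only, since a Layer-2 catch is the intended outcome rather than a failure.

```javascript
// Hypothetical sketch; the event shape is an assumption, not a rig schema.
// events: [{ driftClass: 'A', caughtBy: 'operator' | 'layer2', at: ISO-8601 string }]
// layersShippedAt: ISO-8601 timestamp of the last of the three layers to ship.
function countOperatorCatches(events, driftClass, layersShippedAt) {
  const shipped = Date.parse(layersShippedAt);
  return events.filter(
    (e) =>
      e.driftClass === driftClass &&
      e.caughtBy === 'operator' &&
      Date.parse(e.at) > shipped
  ).length;
}
```

Catches made before the layers shipped are excluded on purpose: they are the evidence that motivated the fix, not a measure of whether it works.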

See also

Drift class A: `large-pr-ok` empirically bypasses Review-E scrutiny