
Stage A — Compiled AGENTS.md with Schema Validation


One PR to dashecorp/rig-gitops. ~2 hours of agent work. Replaces hand-written AGENTS.md with a CI-validated, compiled-from-facts, size-budgeted version. Highest-leverage step from the full docs strategy — based on Vercel’s published eval showing agent success rate 53% → 100% when AGENTS.md carries an embedded compressed index under 8 KB. Defer the larger wiki migration (Stage B, ~11 more hours) until we have 5 real assignments’ worth of data showing Stage A moved the needle.

The rig must run on any coding agent — Claude Code is today’s default but the design accommodates GPT-5 CLI, Gemini CLI, Aider, Cursor, and successors. AGENTS.md is the multi-vendor standard (stewarded by the Agentic AI Foundation; joint Google/OpenAI/Factory/Sourcegraph/Cursor). CLAUDE.md in this proposal is strictly optional — only added when Claude Code-specific behavior matters. Equivalent vendor-specific files (.cursorrules, GEMINI.md, CODEX.md) follow the same overlay pattern when their agent is running. The facts/ layer, compiled AGENTS.md, schema validation, and CI enforcement are identical regardless of which agent is reading them.
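For illustration, the optional overlay might look like the sketch below. The `@AGENTS.md` line is Claude Code’s file-import syntax; the specific override bullets are invented placeholders, not decided content:

```markdown
# CLAUDE.md — optional Claude Code overlay (budget: ≤ 60 lines)
@AGENTS.md

## Claude-specific overrides (placeholder examples)
- Prefer built-in Read/Edit tools over shelling out to sed.
- Enter plan mode before multi-file refactors.
```

Vendor-specific files for other agents would carry the same shape: import (or restate) the compiled AGENTS.md, then add only agent-specific deltas.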

```mermaid
flowchart LR
    Y[facts/*.yaml] --> C[compile-agents-md.sh]
    S[facts/schema.json] --> C
    C --> A[AGENTS.md]
    A --> G[CI: docs-check.yml]
    G -->|--check| C
    G -.->|size > 8KB| X[fail]
    G -.->|schema invalid| X
    G -.->|ok| P[merge]
```
Deliverables

  • facts/stack.yaml — canonical tech stack (runtime, package manager, linter, test framework)
  • facts/conventions.yaml — commit format, branch naming, MCP servers in use
  • facts/pitfalls.yaml — numbered anti-patterns agents hit in this repo
  • facts/schema.json — JSON Schema that all three facts/*.yaml files validate against
  • scripts/compile-agents-md.sh — regenerates AGENTS.md from facts/; supports --check mode for CI
  • CLAUDE.md at repo root — ≤60 lines, @-imports AGENTS.md, adds Claude-specific overrides. Optional, Claude Code only — skipped entirely when pod runs a non-Claude agent.
  • AGENTS.md at repo root — hand-written → compiled. Size budget 8 KB enforced by CI.
  • .github/workflows/docs-check.yml — adds compile-agents-md.sh --check, adds size budget checks, removes queries: from frontmatter validation, adds audience: requirement
  • docs/documentation-standard.md — frontmatter spec changes (drop queries, add audience/supersedes/source_refs), new “Compiled AGENTS.md” section, size budgets
Out of scope (deferred to Stage B and later)

  • raw/ and wiki/ directory migration
  • Propagation to other repos
  • LLM-as-judge lint cron
  • File-back rule in character prompts
  • Memory MCP changes
Why this scope

  • Not full strategy: Vercel’s measured gain (53% → 100%) traces to the compiled 8 KB embedded index. Lint crons, file-back, and raw/ population are compounding bets. Do the measured win first.
  • Not Phase 0 (dangerous-command guard): that’s days; Stage A is 2h. Every assignment between now and Phase 0 benefits.
  • Not just fix frontmatter: frontmatter alone doesn’t move agent success rate per Vercel.
  • Not adopt llms.txt too: no production rig uses it. AGENTS.md won.
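To make the compile step concrete, here is a minimal sketch of scripts/compile-agents-md.sh under stated assumptions: facts are flat `key: value` pairs, awk stands in for yq so the sketch runs anywhere, and the sample fact values and AGENTS.md sections are invented, not the decided schema.

```shell
#!/usr/bin/env sh
# Sketch of scripts/compile-agents-md.sh — not the final script. The real
# script would read facts/*.yaml with yq and validate against facts/schema.json
# before compiling; this sketch seeds sample facts so it is self-contained.
set -eu

mkdir -p facts
cat > facts/stack.yaml <<'EOF'
runtime: node20
package_manager: pnpm
linter: eslint
test_framework: vitest
EOF

fact() {  # fact <key> — the real script would use: yq ".$1" facts/stack.yaml
  awk -F': ' -v k="$1" '$1 == k { print $2 }' facts/stack.yaml
}

compile() {
  cat <<EOF
<!-- AGENTS.md is compiled. Edit facts/*.yaml, then run scripts/compile-agents-md.sh -->
# Agent guide
## Stack
- Runtime: $(fact runtime)
- Package manager: $(fact package_manager)
- Linter: $(fact linter)
- Tests: $(fact test_framework)
EOF
}

if [ "${1:-}" = "--check" ]; then
  # CI drift check: fail with a diff if AGENTS.md was hand-edited or facts changed
  compile | diff -u AGENTS.md - >&2 || { echo "AGENTS.md is stale; re-run compile" >&2; exit 1; }
else
  compile > AGENTS.md
fi

# 8 KB size budget, enforced on every compile
[ "$(wc -c < AGENTS.md)" -le 8192 ] || { echo "AGENTS.md over 8 KB budget" >&2; exit 1; }
```

Running it with no arguments regenerates AGENTS.md; `--check` exits non-zero with a diff when the committed file and the facts disagree, which is what CI runs.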
Acceptance criteria

  1. ./scripts/compile-agents-md.sh produces valid AGENTS.md ≤ 8 KB.
  2. ./scripts/compile-agents-md.sh --check on fresh checkout exits 0.
  3. Editing facts/stack.yaml without re-running compile causes CI to fail with diff.
  4. Editing AGENTS.md directly causes CI to fail.
  5. Invalid enum in facts/stack.yaml fails schema validation.
  6. CLAUDE.md present at root, ≤ 60 lines, imports AGENTS.md via @.
  7. Every existing doc has valid audience: field post-migration.
  8. docs-check.yml passes on fresh PR.
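Criteria 2–5 and 8 could be enforced by a docs-check.yml along these lines — a sketch only; the action versions, step names, and the yq/ajv-cli invocations are assumptions, not the decided workflow:

```yaml
# .github/workflows/docs-check.yml — sketch; exact tool invocations are assumptions
name: docs-check
on: pull_request
jobs:
  docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate facts/*.yaml against facts/schema.json
        run: |
          for f in facts/*.yaml; do
            yq -o=json "$f" > /tmp/fact.json
            npx --yes --package ajv-cli ajv validate -s facts/schema.json -d /tmp/fact.json
          done
      - name: Fail on drift (facts edited without recompiling, or AGENTS.md hand-edited)
        run: ./scripts/compile-agents-md.sh --check
      - name: Enforce 8 KB size budget on AGENTS.md
        run: test "$(wc -c < AGENTS.md)" -le 8192
```

An invalid enum in facts/stack.yaml fails at the first step; a hand-edit to AGENTS.md fails at the `--check` step with a diff.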
Baseline metrics (captured at T+0)

  • Median turns per cli_completed on issue→PR assignments
  • Median cost per cli_completed
  • agent_stuck events per 100 assignments
  • First-attempt Review-E approval rate

After the five assignments, recompute the same metrics.

If at least 2 of these 4 thresholds are met:

  • Median turns drops ≥ 15%
  • Median cost drops ≥ 15%
  • agent_stuck rate drops ≥ 20%
  • First-attempt approval improves ≥ 15 percentage points

→ proceed with Stage B. Otherwise investigate why Stage A didn’t help, or pivot to Phase 0.

Risks and mitigations

  • Compile script bugs: tests in the same PR covering schema validation, size budget, and drift detection.
  • Frontmatter migration edge cases: idempotent migrate-frontmatter.sh script; dry-run first.
  • 8 KB budget too tight: current hand-written AGENTS.md is 3-5 KB; 8 KB gives ~60% headroom.
  • CLAUDE.md import graph fails on Claude Code: test on live Dev-E pod in staging first.
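The idempotent, dry-run-first migration script could be sketched as below. Assumptions: frontmatter is a flat `---`-delimited block of `key: value` lines, and `audience: agents` is a placeholder default, not the decided value.

```shell
#!/usr/bin/env sh
# Sketch of an idempotent migrate-frontmatter.sh with --dry-run.
# Seeds a sample doc so the sketch is self-contained.
set -eu

mkdir -p docs
cat > docs/sample.md <<'EOF'
---
title: Sample
queries: ["how do I deploy"]
---
Body text.
EOF

migrate() {
  awk '
    /^queries:/  { next }                 # drop the deprecated key
    /^audience:/ { seen = 1 }             # already migrated: do not add twice
    /^---$/ && ++fence == 2 && !seen {    # closing fence, audience still missing
      print "audience: agents"
    }
    { print }
  ' "$1"
}

for f in docs/*.md; do
  if [ "${1:-}" = "--dry-run" ]; then
    migrate "$f" | diff -u "$f" - || true   # preview the change, touch nothing
  else
    migrate "$f" > "$f.tmp" && mv "$f.tmp" "$f"
  fi
done
```

`--dry-run` prints the diff without modifying files; running the migration twice leaves the files unchanged, which is the idempotency property the risk item calls for.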

Rollback: revert the PR. No cluster changes, no data loss.

Timeline

  • T+0: File tracking issue with baseline metrics
  • T+0 to T+2h: Dev-E implements
  • T+2h: PR opened, Review-E reviews
  • T+3h: PR merges
  • T+3h to T+5d: 5 assignments process
  • T+5d: Recompute metrics, decide Stage B
Decisions

  1. facts/ YAML or TOML? → YAML (repo convention + yq present).
  2. Template engine for compile? → No, heredoc is readable.
  3. Propagate to other repos? → No, rig-gitops alone for first measurement.
  4. Exclude first 24h from baseline? → Yes; wait for image propagation to reach steady state.
Status

  • Draft — this state; awaiting human approval via PR merge
  • Approved — PR merged to main with status: approved; triggers create-impl-issues.sh
  • Implementing — GitHub issues created, Dev-E working
  • Done — all child issues merged, metrics recomputed, Stage B decision made