
Trust Model — Tiered Autonomy by Blast Radius

  • 🟢 event-store-marten — rig-conductor event store (Marten/Postgres). 29 event types defined (latest: GUARD_BLOCKED). Projections live. Public catalog at /events.md.
  • 🟢 post-api-events — POST /api/events endpoint. Production-active on rig-conductor.
  • 🟢 assignment-dispatch — Assignment dispatch (GET /api/assignments/next). Priority + FIFO only, no capacity check.
  • per-consumer-cursor — Per-consumer cursor projection. Phase 3. Replaces earlier per-pod capacity framing.
  • bounded-loop-sentinel — Bounded-loop sentinel (ReviewLoopExceeded). Phase 4. Caps Dev-E/Review-E ping-pong.
  • trust-t3-two-attestor — T3 two-attestor approval gate. Dispatch filter live; Kyverno admission pending. Structural limit: 1-person rig makes real enforcement difficult.
  • autonomy-tier-promotion — Autonomy tier promotion projection (T0→T1→T2). Pattern: 20 successful runs, zero rollbacks. Not yet implemented.

!!! abstract "TL;DR"
    Autonomy is earned per task class, not granted by default. Four tiers (T0–T3) based on blast radius. Enforcement is defense-in-depth: dispatch filter + Review-E gate + Kyverno admission. Promotion is measured (20 successful runs with zero rollbacks); demotion is immediate on any attributable rollback. T3 actions never auto-promote: humans co-sign irreversibility.

Core claim: an agent’s autonomy is a function of (a) the change’s blast radius, (b) the agent’s measured track record on that change class, and (c) the reversibility of the outcome. Not of the agent’s identity, the task’s urgency, or the human’s convenience.

| Tier | Name | Blast radius | Reversibility | Autonomy | Enforcing gates |
| --- | --- | --- | --- | --- | --- |
| T0 | Non-blast | Docs, tests, scaffolding, YAML linting | Trivial (git revert) | Full agent, no human | CI + Review-E |
| T1 | Contained | Single-service feature, test-covered refactor, bounded UI change | Fast (flag kill ~30s, rollback ~5m) | Full agent under canary | CI + Review-E + Flagger SLO gate + error-budget check |
| T2 | Cross-cutting | Multi-repo work, event schema change, new public API, architect-level interface | Slow (rollforward or multi-step rollback) | Agent plans, human approves interface, agent implements | CI + Review-E + human co-sign on interface + Kyverno two-attestor |
| T3 | Irreversible | Destructive DB migration, auth/authz code, payment logic, secret rotation, cluster-scope RBAC | None (data loss, security regression) | Human drives, agent assists | Human approval mandatory + Kyverno reject without human-OIDC attestation |

The tier is the dispatch ceiling. A task can be attempted at or below its tier. Running a T3 task through the T0 path is a policy violation blocked at admission.

sequenceDiagram
    participant U as User / GitHub Issue
    participant S as Spec-E
    participant CE as rig-conductor
    participant POL as Policy Engine
    participant D as Dispatcher

    U->>S: Issue created
    S->>S: Extract files, surfaces, effects
    S->>POL: Classify blast radius
    POL-->>S: tier T0 / T1 / T2 / T3
    S->>U: Post clarifying questions if needed
    U-->>S: Answers
    S->>CE: Commit TaskSpec<br/>(tier, acceptance criteria,<br/>scope, test strategy)
    CE->>D: Dispatch decision
    alt tier == T0
        D->>D: Any agent, any time
    else tier == T1
        D->>D: Agent + canary pipeline
    else tier == T2
        D->>D: Require human interface approval<br/>before dispatch
    else tier == T3
        D->>D: Require human co-sign + explicit<br/>"I drive" confirmation
    end

Classification rules are encoded as a policy file (policy/blast-radius.yaml) in rig-gitops, evaluated by a deterministic classifier backed by Spec-E when ambiguous. Concrete rules:

T0 (Non-blast):

  • No changes to code paths that execute in production
  • Or: changes to test files, docs, YAML linting rules, GitHub Actions formatting
  • No changes to dependencies
  • No changes to agent prompts or the rig’s own code
  • Branch protection does not require human review

T1 (Contained) — at least one of, and nothing from T2/T3:

  • Changes to a single service’s code
  • Dependency additions from an allowlisted registry with Socket.dev score >= threshold
  • Refactors with existing test coverage >= 80% of changed lines
  • UI changes not touching auth, payments, or user data
  • The service has a defined SLO, a Flagger Canary, and a kill-switch feature flag

T2 (Cross-cutting):

  • Changes spanning 2+ repositories
  • Changes to rig-conductor event type definitions or the subscription registry
  • New public HTTP API surface or CLI command
  • Changes to the rig’s own agent prompts or character files
  • Changes to a shared library consumed by 2+ services
  • Changes requiring new feature flags to be defined (not just flipped)

T3 (Irreversible):

  • Destructive DB DDL (DROP, TRUNCATE, non-backward-compatible ALTER)
  • Changes to authentication, authorization, or session handling
  • Changes to payment processing, billing, or money-handling paths
  • Changes to secret management or credential rotation logic
  • Cluster-scope Kubernetes RBAC changes
  • Changes to Kyverno policies themselves
  • Changes to the attestation chain (Sigstore config, SLSA workflows)
  • Production data migrations affecting >1M rows

Boundary cases route to Spec-E, which errs on the side of the higher tier.
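
The rule lists above can be read as a top-down check: test the T3 conditions first, then T2, then T1, and fall through to T0. The following is a minimal sketch of that ordering, not the contents of policy/blast-radius.yaml. The `Change` fields, `classify`, and `resolve_boundary` are hypothetical names chosen for illustration, and only a few rules per tier are modeled.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """Hypothetical summary of what a task touches (field names are assumptions)."""
    repos: set[str]
    production_code: bool = False
    dependency_change: bool = False
    new_public_api: bool = False
    touches_auth: bool = False
    destructive_ddl: bool = False


def classify(change: Change) -> int:
    """Return the highest tier whose rules match, checking from T3 down to T0."""
    if change.destructive_ddl or change.touches_auth:
        return 3  # irreversible
    if len(change.repos) >= 2 or change.new_public_api:
        return 2  # cross-cutting
    if change.production_code or change.dependency_change:
        return 1  # contained
    return 0  # non-blast


def resolve_boundary(candidate_tiers: list[int]) -> int:
    """Boundary cases err on the side of the higher tier."""
    return max(candidate_tiers)
```

Checking the most restrictive tier first means a change that matches both a T1 and a T3 rule is never under-classified.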

An agent’s autonomy tier for a task class is stored in rig-conductor as a projection:

record AgentAutonomy(
    string AgentId,
    string TaskClass,        // e.g., "docs-update", "ui-change", "service-refactor"
    int CeilingTier,         // 0..3, maximum tier this agent can attempt for this task class
    int SuccessfulRuns,      // rolling 90-day count
    int Failures,            // includes rollback, human-rework, budget-overrun
    DateTimeOffset LastReset
);

Default ceiling is T0 for every (agent, task-class) pair. Promotion rules:

  • T0 → T1: 20 consecutive successful T0 runs of that class, zero human-rework, zero rollbacks. Ceiling raises to T1 for that class.
  • T1 → T2: 20 consecutive successful T1 runs, zero canary aborts, zero SLO-budget depletions attributable to the change. Ceiling raises to T2 for that class.
  • T2 → T3: No automatic promotion. T3 tasks require human co-sign on every instance regardless of track record. The principle “humans at semantic boundaries” trumps accumulated trust.

Demotion rules:

  • Any rollback attributable to the agent’s work on that class: ceiling drops one tier immediately, followed by a 30-day cooldown before the agent is again eligible for promotion.
  • Model version change (e.g., Sonnet 4.6 → 4.7, or a cross-vendor swap via LiteLLM fallback_models; see provider-portability.md) or any behavior-drift signal >30% on the canary suite: all ceilings reset to T0, and the promotion track record is held in quarantine for human review.
  • New class of task: ceiling starts at T0 for that class regardless of the agent’s ceiling on other classes.

!!! danger "T3 never auto-promotes"
    T3 work (destructive DB, auth, payments, secret rotation, cluster RBAC, Kyverno policy changes) always requires human co-sign on every instance. Accumulated track record on lower tiers does not raise this ceiling. Irreversibility is a structural reason, not a trust metric.

Promotion/demotion events are stored in the event log. Audit query: “show me every autonomy change for Dev-E in the last 90 days” is a replay.
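
Because the ceiling is a projection over events, it can be recomputed by folding over the log. The sketch below illustrates that idea under simplifying assumptions: the event names (`run_succeeded`, `rollback`, `model_changed`) are hypothetical, every successful run is assumed to be at the agent's current ceiling, and the 30-day cooldown and quarantine review are left out.

```python
from dataclasses import dataclass

PROMOTION_STREAK = 20  # consecutive successful runs required per promotion
MAX_AUTO_TIER = 2      # T3 is never reached by automatic promotion


@dataclass
class AutonomyState:
    ceiling: int = 0  # default ceiling is T0
    streak: int = 0   # consecutive successes since the last reset


def replay(events: list[str]) -> AutonomyState:
    """Fold a stream of (hypothetical) event names into the current ceiling."""
    state = AutonomyState()
    for event in events:
        if event == "run_succeeded":
            state.streak += 1
            if state.streak >= PROMOTION_STREAK and state.ceiling < MAX_AUTO_TIER:
                state.ceiling += 1
                state.streak = 0
        elif event == "rollback":
            # Demotion is immediate: drop one tier and restart the streak.
            state.ceiling = max(0, state.ceiling - 1)
            state.streak = 0
        elif event == "model_changed":
            # A model swap resets the ceiling to T0.
            state.ceiling = 0
            state.streak = 0
    return state
```

For example, `replay(["run_succeeded"] * 100).ceiling` stays at 2: no number of successes lifts an agent into T3.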

Tier enforcement happens in three layers, defense-in-depth: the dispatch filter, the Review-E gate, and Kyverno admission.

rig-conductor’s assignment endpoint (GET /api/assignments/next?agentId=X) filters by the agent’s ceiling tier. An agent with T1 ceiling on ui-change cannot be assigned a T2-classified UI change. The filter is cheap: one JOIN against AgentAutonomy.
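
As a rough in-memory illustration of that filter (the real endpoint presumably does this in SQL against the projection), the sketch below keeps only tasks at or below the agent's ceiling for their class, then picks by priority and FIFO order. The `Task` fields, the `ceilings` mapping, and the convention that a lower `priority` number is more urgent are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    id: str
    task_class: str
    tier: int
    priority: int     # assumption: lower number means more urgent
    created_seq: int  # FIFO tiebreaker


def next_assignment(
    tasks: list[Task],
    ceilings: dict[tuple[str, str], int],
    agent_id: str,
) -> Optional[Task]:
    """Return the next task the agent may take, or None if nothing is eligible."""
    eligible = [
        t for t in tasks
        # Unknown (agent, task class) pairs default to a T0 ceiling.
        if t.tier <= ceilings.get((agent_id, t.task_class), 0)
    ]
    return min(eligible, key=lambda t: (t.priority, t.created_seq), default=None)
```

An agent with a T1 ceiling on ui-change is never offered a T2 ui-change task, even when that task has the highest priority in the queue.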

Review-E’s character prompt includes tier-specific review criteria. For a T2 change, Review-E explicitly looks for “interface approval attestation” in the PR metadata and blocks the review if absent. For T3, Review-E refuses to approve and routes to human.

Kyverno admission is the cluster-level final gate. A Kyverno ImageValidatingPolicy requires the following for any manifest targeting namespaces with a blast-radius: t3 label:

apiVersion: policies.kyverno.io/v1
kind: ImageValidatingPolicy
metadata: { name: t3-human-cosign }
spec:
  validationActions: [Deny]
  matchConstraints:
    namespaceSelector:
      matchLabels: { blast-radius: t3 }
  attestors:
    - name: agent-identity
      cosign:
        keyless:
          identities:
            - subject: "https://github.com/dashecorp/.+/.github/workflows/release\\.ya?ml@.+"
              issuer: "https://token.actions.githubusercontent.com"
    - name: human-approval
      cosign:
        keyless:
          identities:
            - subject: "repo:dashecorp/prod-approvals:environment:t3-approve"
              issuer: "https://token.actions.githubusercontent.com"
  validations:
    - expression: "images.containers.map(i, verifyAttestationSignatures(i, attestations.slsa, [attestors.'agent-identity', attestors.'human-approval'])).all(e, e > 0)"

Translation: to land in a T3 namespace, the image must carry two valid Sigstore signatures — one from the agent’s build workflow, one from a human-triggered approval workflow. No image, no signature, no admission. No human can forget the rule because the rule is enforced by the cluster, not the reviewer.

Every dispatched task carries a TaskSpec that encodes everything needed to determine tier, dispatch correctly, and evaluate outcome:

id: task-202604160042
repo: dashecorp/rig-conductor
issue: 76
tier: T1
blast_radius:
  reason: "Changes single-service feature with existing test coverage"
  surfaces: ["src/Api/EventsEndpoint.cs", "tests/Api/EventsTests.cs"]
  evaluated_by: spec-e
  evaluated_at: "2026-04-16T16:30:00Z"
acceptance_criteria:
  - "POST /api/events accepts new event type X"
  - "MartenProjections.cs updates IssueStatus for type X"
  - "Integration test covers happy path and bad input"
test_strategy:
  required_coverage_delta: 0
  property_tests: true
non_goals:
  - "Do not change existing event type definitions"
  - "Do not modify Discord routing"
expected_effort_tokens: 80000  # budget guardrail
assigned_agent: dev-e-dotnet
ceiling_tier: T1  # the assigned agent's current ceiling for this task class

Spec-E authors this. rig-conductor validates on submission (schema check + tier matches ceiling). Dispatch is gated on both.
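
A minimal sketch of what the two submission checks could look like, operating on an already-parsed TaskSpec dictionary. It reads "tier matches ceiling" as "the task tier does not exceed the assigned agent's ceiling", which is an interpretation. The required-field set and the `validate_task_spec` name are illustrative assumptions, not rig-conductor's actual schema.

```python
# Assumed minimum field set; the real schema may require more.
REQUIRED_FIELDS = {
    "id", "repo", "tier", "blast_radius",
    "acceptance_criteria", "assigned_agent", "ceiling_tier",
}
TIER_ORDER = {"T0": 0, "T1": 1, "T2": 2, "T3": 3}


def validate_task_spec(spec: dict) -> list[str]:
    """Return a list of validation errors; an empty list means dispatch may proceed."""
    # Schema check: every required field must be present.
    missing = sorted(REQUIRED_FIELDS - set(spec))
    if missing:
        return [f"missing field: {name}" for name in missing]

    # Both tier fields must be known tier labels.
    unknown = [spec[k] for k in ("tier", "ceiling_tier") if spec[k] not in TIER_ORDER]
    if unknown:
        return [f"unknown tier: {value}" for value in unknown]

    # Tier check: the task must not exceed the assigned agent's ceiling.
    if TIER_ORDER[spec["tier"]] > TIER_ORDER[spec["ceiling_tier"]]:
        return [f"task tier {spec['tier']} exceeds agent ceiling {spec['ceiling_tier']}"]
    return []
```

Returning errors as a list rather than raising lets a caller record every rejection reason as an event before refusing dispatch.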

When an agent encounters work it cannot complete within its tier:

graph LR
    A[Agent working T1 task] -->|discovers need<br/>for T2 change| E[Emit EscalationRequired]
    E --> C[rig-conductor]
    C -->|route to human| H[#admin Discord<br/>+ @mention tier-owner]
    H -->|approve scope expansion| AR[Record approval attestation]
    AR -->|re-dispatch at T2| A2[Agent with T2 ceiling<br/>or new interface spec]
    H -->|reject| R[Close task, open new tracking]

An agent cannot silently exceed its tier. Any discovery that a task is actually larger than its classification forces an escalation event, stored with the discovery context, routed per severity.
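
The rule can be sketched as a guard an agent runs whenever it re-estimates the scope of its work. The event shape and the `check_scope` helper are hypothetical; only the event name EscalationRequired comes from the diagram.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EscalationRequired:
    """Hypothetical payload for the escalation event."""
    task_id: str
    classified_tier: int
    discovered_tier: int
    context: str  # what the agent found that exceeds the classification


def check_scope(
    task_id: str, classified_tier: int, discovered_tier: int, context: str
) -> Optional[EscalationRequired]:
    """Return an escalation event if the discovered work exceeds the task's tier."""
    if discovered_tier > classified_tier:
        return EscalationRequired(task_id, classified_tier, discovered_tier, context)
    return None  # still within tier; the agent may continue
```

A T1 task that turns out to need an event schema change would yield an event with `discovered_tier=2`, which the agent emits and then stops work rather than proceeding.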

T2 “interface approval” and T3 “I drive” mechanics:

  • T2 interface approval: human reviews the proposed interface (event schema, API contract, CLI flags) in a dedicated interface-review issue. Approval records a GitHub Deployments API entry with the human’s identity. rig-conductor reads the Deployments API as the attestation source. No approval event → Dispatcher refuses T2 work.
  • T3 human-drives: human is the primary author (or explicit named sponsor) of the PR. Agent assists via commits on a sub-branch that the human merges into the main feature branch. Kyverno rejects any T3 image not carrying the human’s Sigstore co-sign. The human’s GitHub Actions approval workflow (.github/workflows/t3-approve.yaml) is the signing surface — it runs on workflow_dispatch with environment t3-approve and requires a protection rule “required reviewers = human”.

This is the same technical pattern Google and GitHub use for production deploys; we just wire it to our blast-radius labels.

Tiers are heuristics, not ground truth. Known failure modes and mitigations:

| Failure | Example | Mitigation |
| --- | --- | --- |
| Spec-E underclassifies | T2 work marked T1 because Spec-E didn’t realize it touched 2 repos | Daily reconciliation check: Spec-E re-evaluates all open tasks; disagreements with the original classification surface as events. |
| Agent discovers higher-tier work mid-task | T1 task turns out to need a schema change | StuckGuard detects the agent stalling on a T2-shaped problem; forced escalation. |
| Human mis-approves T3 under time pressure | Late-night hotfix signed off without review | Kyverno logs every T3 admission; daily post-hoc audit surfaces rushed approvals. |
| Novel task class with no track record | Agent has T2 ceiling on one service but is asked to work on a new one | Default back to T0 for the new class; human observer required for first 20 runs. |
| Classification rules themselves change | New product area added, new T3 criteria needed | Changes to policy/blast-radius.yaml are themselves T2 tasks: an agent can propose, a human approves. |

Some systems give all agents uniform capabilities and rely on auditing. This is the “admin user” anti-pattern:

  • Audit is reactive. By the time someone notices an agent deployed to prod without canary, it’s too late.
  • Uniform permissions mean a compromised agent (via prompt injection) inherits the full permission set.
  • Flat models create pressure to lower the permission floor for “simple” tasks, eroding the ceiling for risky ones.

Tiered autonomy is the structural answer. Blast-radius-scoped permissions mean even a fully-compromised T0 agent cannot deploy a T3 change.

Gastown’s GUPP (Gas Town Universal Propulsion Principle) takes the opposite stance: agents push direct to main, humans stay out of the loop, the rig’s job is to keep agents running. This works for specific domains (internal tool development where the cost of a bad commit is low) but breaks down for:

  • Customer-facing services where outages have business cost
  • Security-critical code where bugs have compliance cost
  • Multi-tenant systems where blast radius crosses tenants
  • Regulated environments where human attestation is required

Our rig is designed for hybrid human-agent teams shipping software that humans depend on. The trust model is the price of that target.

The trust model is itself tiered-policy-managed (meta-T2). Changes to this document or the policy/blast-radius.yaml rules go through human review. Changes to Kyverno enforcement policies go through meta-T3 (themselves a T3 change to the enforcement of T3).

This recursion stops at the human layer: at some point, the rig must trust humans, and humans must trust each other via the usual social and legal mechanisms.