User story: safety foundation — block the unrecoverable before higher-trust tiers
User story
As the rig operator
I want deterministic runtime guards between agent reasoning and tool execution — dangerous-command blocklist, per-task git worktrees, default-deny egress, short-lived GitHub tokens
So that a compromised or looping agent cannot do the unrecoverable thing (filesystem destruction, secret exfiltration, force-push to main, long-lived token replay) without a human in the loop, and the rig earns the right to advance autonomy tiers (T1 → T2 → T3).
Progress (as of 2026-04-22 evening)
| AC | Status | Ship |
|---|---|---|
| 1 · Dangerous-command guard | ✅ Shipped | rig-agent-runtime #97 + #98 |
| 2 · GuardBlocked events + dashboard panel | ✅ Shipped | rig-conductor #90, rig-agent-runtime #99, rig-conductor #99 |
| 3 · Git worktrees per task | ✅ Shipped | rig-agent-runtime #101 |
| 4 · GitHub App 1h tokens | ✅ Shipped | rig-agent-runtime #103 + rig-gitops #119 + #121 |
| 5 · Default-deny egress (Phase 1) | 🟡 Pod-scoped DNS live on review-e (burn-in) | rig-agent-runtime #115, rig-gitops #161 + #162 |
Score: 4 shipped + 1 partial across 2026-04-20 → 2026-04-22.

Morning (2026-04-22): the AC 5 Phase 1 ipBlock attempt was shipped and reverted, because api.anthropic.com is served from Cloudflare anycast, not the published Anthropic CIDR.

Afternoon: three spikes set the redesign direction. LiteLLM was ruled out (error wrapping plus OAuth incompatibility; now bundled with Priority 3). The Envoy SNI gateway was verified end-to-end and shipped to the cluster (rig-gitops #153). Two integration attempts were reverted the same day: HTTPS_PROXY (the SNI inspector doesn’t speak CONNECT, #154 → #155) and a cluster-wide CoreDNS rewrite (it caught Flux’s own github.com fetch, #156 → #158).

Evening: the pod-scoped DNS path shipped. Chart 1.1.0 adds `dnsPolicy` / `dnsConfig` pass-through (rig-agent-runtime #115). A dedicated CoreDNS in the egress-gw namespace rewrites each allowlisted public host to the in-cluster Envoy egress gateway; the target is resolved via real kube-dns, so no Envoy ClusterIP is baked into config (rig-gitops #161 + fix #162 for the forward-plugin zone-parse trap). review-e is wired with `dnsPolicy: None` plus a `dnsConfig.nameservers` list pointing at the dedicated CoreDNS (kube-dns kept as secondary). Live verification from a manually scaled review-e pod: Discord gateway WSS, Anthropic, GitHub App token mint, MCP servers, and the Valkey stream were all reached through the new path, and rig-conductor sees a fresh heartbeat.

Next: a 24h burn-in before the dev-e rollout and the default-deny NetworkPolicy that terminates Phase 1. The cluster-reality correction (rig-docs #95: k3s, not GKE) stands.
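The pod-scoped DNS path keeps the same host allowlist in two places: the dedicated CoreDNS `rewrite` rules and the Envoy SNI gateway’s `filter_chain_match.server_names` list. A minimal sketch of a consistency check between the two might look like the following. It assumes the Corefile uses the plain `rewrite name [exact] FROM TO` form and that the Envoy server names have already been extracted into a list; the function names, the sample hosts, and the `egress-gw.egress-gw.svc.cluster.local` target are hypothetical.

```python
import re
from typing import Iterable, Set, Tuple

# Matches "rewrite name [exact] FROM TO" on a single line (assumed Corefile form;
# regex/suffix/prefix rewrites are not handled by this sketch).
REWRITE_RE = re.compile(
    r"^[ \t]*rewrite[ \t]+name[ \t]+(?:exact[ \t]+)?(\S+)[ \t]+\S+",
    re.MULTILINE,
)


def normalise(host: str) -> str:
    """Lower-case a hostname and drop any trailing root dot."""
    return host.strip().rstrip(".").lower()


def corefile_hosts(corefile_text: str) -> Set[str]:
    """Collect the FROM hostnames of every matching rewrite rule."""
    return {normalise(h) for h in REWRITE_RE.findall(corefile_text)}


def allowlist_drift(
    corefile_text: str, envoy_server_names: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (hosts only in DNS, hosts only in Envoy)."""
    dns = corefile_hosts(corefile_text)
    envoy = {normalise(h) for h in envoy_server_names}
    return dns - envoy, envoy - dns


# Illustrative inputs only; not the rig's real allowlist.
SAMPLE_COREFILE = """
.:53 {
    rewrite name api.anthropic.com egress-gw.egress-gw.svc.cluster.local
    rewrite name exact github.com egress-gw.egress-gw.svc.cluster.local
    forward . 10.43.0.10
}
"""
SAMPLE_SERVER_NAMES = ["api.anthropic.com", "gateway.discord.gg"]

only_dns, only_envoy = allowlist_drift(SAMPLE_COREFILE, SAMPLE_SERVER_NAMES)
print(sorted(only_dns))    # ['github.com']
print(sorted(only_envoy))  # ['gateway.discord.gg']
```

A host that appears only on the DNS side is the failure mode to catch first: the pod resolves it to the gateway, and Envoy then resets the TLS connection. In CI, a non-empty result on either side would fail the build.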
Context
See whats-next whitepaper §Priority 1 and the source safety.md whitepaper (pillars 1–2).
Today the rig has zero runtime guards. Trust is prompt-level (“don’t do bad things”) plus branch protection after the fact. The implementation-status matrix lists 8 safety capabilities with 0 deployed. This is the highest-leverage first investment because every higher-trust tier depends on it — you cannot promote an agent to T2 (merge-with-approval) if T1 has no floor.
Acceptance criteria
- ✅ Dangerous-command guard: a PreToolUse hook reads tool-call JSON on stdin, matches `tool_input.command` against a blocklist, and exits 2 (block + reason) on match. Minimum blocklist: `sudo`, `rm -rf /` (not `rm -rf ./`), `git push --force` (without `--force-with-lease`), `git reset --hard`, `git clean -f`, `drop table`, `drop database`, `truncate table`, `kubectl delete namespace`, package installers, `chmod 777`, `chmod -R 000`, `curl … | sh`. No override flag. Escape hatch is the human running the command outside the agent loop. Shipped in rig-agent-runtime#97 (script + tests + CI) and #98 (activated by default via baked-in `~/.claude/settings.json`). 43 test cases pass.
- ✅ `GuardBlocked` event emission: every block emits a non-blocking event to rig-conductor; counts are visible via `GET /api/guard-blocked` (optional `agentId` filter) and on the rig-conductor dashboard Safety panel (header stat + per-agent table with top reason, last command, last-blocked time). Shipped in rig-conductor#90 (event + projection + endpoint, 46/46 tests), rig-agent-runtime#99 (hook payload shape fix), and rig-conductor#99 (dashboard Safety panel).
- ✅ Git worktrees per agent task: each dispatched task runs in its own worktree under `/workspace/tasks/<task-id>/<repo>/`, backed by a shared bare clone at `/workspace/_bare/<owner>/<repo>.git`. One task’s workspace cannot reach another’s. Cursor 2026 pattern. Shipped in rig-agent-runtime#101 (`task-workspace` helper + 17 tests + CI, wired into the agent task prompt).
- ✅ GitHub App installation tokens (1h TTL): replaces the classic PAT in agent pods. Tokens are minted per dispatch, expire in 60 minutes, and are never persisted to disk. Shipped across:
  - rig-agent-runtime#103: removes the PAT fallback when App-mint fails (fail loud, not silent).
  - rig-gitops#119: removes the `GITHUB_PERSONAL_ACCESS_TOKEN` env var from dev-e + review-e pods entirely.
  - rig-gitops#121: implementation-status matrix updated.

  The only remaining trace is the `github-token` key still present inside the SealedSecrets; pruning requires a re-seal and is deferred to the next rotation. Nothing in the running pods references it.
- 🟡 Default-deny egress NetworkPolicy (Phase 1): pod-scoped DNS path live on review-e; 24h burn-in, then dev-e + NetworkPolicy. Five-attempt rollout story:
  - Cluster reality correction (stands): the rig runs on k3s v1.34.6 on a GCE VM (not GKE, as BRAIN.md had drifted to claim). BRAIN + research corrected in rig-docs #95; the “GKE Dataplane V2 / FQDNNetworkPolicy” plan was always inapplicable.
  - Attempt 1, ipBlock allowlist (rig-gitops #137 → reverted #143 + #144, 2026-04-22 AM): plain k8s `NetworkPolicy` on `dev-e` + `review-e` allowing kube-dns, rig-conductor API (8080), valkey (6379), Anthropic `160.79.104.0/21`, and GitHub `/meta` CIDRs. Weekly refresh workflow (#139). Reverted because `api.anthropic.com` resolves to Cloudflare anycast (`162.159.x.x`), not Anthropic’s published origin CIDR. Side-gap: postgres 5432 was also blocked (the rig-conductor rule only permitted 8080 + 6379). The split `{ns}-github-egress` policy was also removed: once any NetworkPolicy selects a pod for egress, all egress that no rule allows is denied by default.
  - Afternoon spikes (2026-04-22 PM):
    - LiteLLM spike #1 + #2 rule out a quick LiteLLM drop-in: error responses are wrapped (breaks Claude Code retry), and the rig’s OAuth subscription tokens aren’t compatible with LiteLLM’s `x-api-key` forward path. LiteLLM route bundled with Priority 3 instead.
    - Envoy SNI egress gateway spike verified that hostname allowlisting via SNI works end-to-end on the k3s cluster.
  - Attempt 2, Envoy gateway standalone (rig-gitops #153, still live): pod healthy, idle. Gateway works in isolation; no agents routed through it yet.
  - Attempt 3, HTTPS_PROXY env var on review-e (rig-gitops #154 → reverted #155): the SNI-inspector listener doesn’t speak HTTP CONNECT; all HTTPS_PROXY requests failed at the CONNECT step.
  - Attempt 4, cluster-wide CoreDNS rewrite (rig-gitops #156 → reverted #158): rewriting `github.com` caught Flux’s own source-controller in the loop; emergency delete of the rewrite ConfigMap restored Flux.
  - Attempt 5, pod-scoped DNS (rig-agent-runtime #115 chart 1.1.0 + rig-gitops #161 + fix #162, 2026-04-22 evening; LIVE for review-e):
    - `dashecorp/rig-agent-runtime` chart `1.1.0` adds `.Values.dnsPolicy` + `.Values.dnsConfig` pass-through on every pod spec (single-mode StatefulSet, split-mode gateway/worker Deployments, CronJob). Defaults empty; existing releases render byte-identical.
    - Dedicated CoreDNS in the `egress-gw` namespace (2 replicas) rewrites each allowlisted public host to the in-cluster Envoy egress gateway. The target is resolved via real kube-dns, so no Envoy ClusterIP is baked into config (Service recreation stays survivable).
    - `forward` plugin parse trap: the multi-zone form `forward cluster.local in-addr.arpa ip6.arpa 10.43.0.10` is rejected silently; it must be split into one zone per directive. Fix in #162.
    - review-e wired with `dnsPolicy: None` + `dnsConfig.nameservers: [10.43.200.53, 10.43.0.10]`. kube-dns kept as secondary for availability fallback.
    - Live verification (manually scaled review-e pod with KEDA paused): `/etc/resolv.conf` correct; Discord gateway WSS connected; Anthropic reachable; GitHub App token minted; 3 MCP servers connected; Valkey stream consumer attached to rig-conductor. Fresh heartbeat visible in `/api/agents`. Pod scaled back to 0; KEDA unpaused.
    - Third trap: the egress-dns Corefile `rewrite` list and the Envoy SNI `filter_chain_match.server_names` list are two places holding the same allowlist. Drift → pod succeeds at DNS, Envoy resets the TLS connection. A CI check is the next safety follow-up.
  - Pending (gated on 24h review-e burn-in, ~2026-04-23 evening):
    - Apply the same `dnsPolicy` / `dnsConfig` to dev-e (node + dotnet + python): values-only PR, no chart work.
    - Default-deny egress NetworkPolicy: pod-selector based (not ipBlock), allowlisting kube-dns (10.43.0.10:53), egress-dns (10.43.200.53:53), Envoy (10.43.79.56:443 + :8443), and rig-conductor pods on 8080, 6379, AND 5432 (Postgres, the gap from Attempt 1).
  - Parallel prompt fix (shipped, stands): `stream-consumer.js:226` rewritten in rig-agent-runtime#110; no more dead `sudo apt-get` advice.
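The dangerous-command guard in the first criterion can be sketched as follows. This is a minimal illustration, not the shipped script: the regexes cover only part of the minimum blocklist, `apt-get install` stands in for the package-installer rule as an assumption, and the names `BLOCKLIST`, `check_command`, and `run_hook` are hypothetical.

```python
import json
import re
import sys
from typing import Optional

# Illustrative subset of the blocklist (pattern, reason). The shipped guard's
# patterns are not shown in this document and may differ.
BLOCKLIST = [
    (r"(?:^|[\s;&|])sudo\b", "sudo is not allowed"),
    (r"\brm\s+-(?:rf|fr)\s+/(?:\s|\*|$)", "recursive delete of the filesystem root"),
    (r"\bgit\s+push\b.*--force(?!-with-lease)\b", "force-push without --force-with-lease"),
    (r"\bgit\s+reset\s+--hard\b", "git reset --hard discards work"),
    (r"\bgit\s+clean\s+-\w*f", "git clean -f deletes untracked files"),
    (r"\b(?:drop\s+(?:table|database)|truncate\s+table)\b", "destructive SQL"),
    (r"\bkubectl\s+delete\s+namespace\b", "namespace deletion"),
    (r"\bchmod\s+(?:(?:-R\s+)?777|-R\s+000)\b", "unsafe chmod"),
    (r"\bcurl\b[^|]*\|\s*(?:ba)?sh\b", "piping curl into a shell"),
    # Assumption: one example of a "package installer" rule.
    (r"\bapt-get\s+install\b", "package installs are not allowed at runtime"),
]


def check_command(command: str) -> Optional[str]:
    """Return the reason for the first matching rule, or None if allowed."""
    for pattern, reason in BLOCKLIST:
        if re.search(pattern, command, re.IGNORECASE):
            return reason
    return None


def run_hook(raw_payload: str) -> int:
    """Evaluate one tool-call JSON payload and return the hook exit code."""
    payload = json.loads(raw_payload)
    command = (payload.get("tool_input") or {}).get("command") or ""
    reason = check_command(command)
    if reason is None:
        return 0  # allow the tool call
    print(f"Blocked by dangerous-command guard: {reason}", file=sys.stderr)
    return 2  # block, with the reason on stderr
```

Wired as a hook, the script would end with `sys.exit(run_hook(sys.stdin.read()))`, so that exit code 2 blocks the tool call. Keeping the matching in a pure function makes the blocklist straightforward to cover with table-driven tests, and there is deliberately no override parameter.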
What it unblocks
- T1 → T2 tier promotion. Per trust-model.md, T2 is “agent merges with approval; no prod deploy creds.” That policy is meaningless if the agent can `rm -rf` its way around approval. AC 1–3 are what make T2 real.
- Priority 2 observability can be wired to the `GuardBlocked` event stream as an early signal.
- Priority 3 cost ceiling: the egress policy (AC 5) is the chokepoint through which the LiteLLM proxy is made mandatory (if the only allowed LLM egress is the proxy, no agent can bypass it).
Out of scope
- Kyverno admission policies (Phase 4 per implementation-status)
- Sigstore + cosign + SLSA L3 attestation (Phase 4)
- CaMeL trust separation (Phase 6; only prompt-injection defense with a formal guarantee)
- Schema-validated tool use via Pydantic/Instructor (continuous, not phase-gated)
Priority
High. Prerequisite for Priorities 2–4. No higher-trust autonomy tier is honest without it.
Estimated effort
- AC 1 (dangerous-command guard): ~1 week. Pattern specified in `safety.md`; reference implementation Gastown’s `tap_guard_dangerous`.
- AC 2 (`GuardBlocked` events): ~1 day. New event type + projection + dashboard panel.
- AC 3 (worktrees per task): ~1 week. Well-trodden Cursor 2026 pattern.
- AC 4 (GitHub App 1h tokens): ~3 days. Replaces classic PAT; installation-token mint loop in the agent startup.
- AC 5 (default-deny egress): Phase 1 pod-scoped DNS live on review-e as of 2026-04-22 evening after four reverted approaches; 24h burn-in → dev-e rollout → default-deny NetworkPolicy terminates Phase 1. LiteLLM-based cost/model centralisation bundled with Priority 3.
Total: ~5 weeks of focused work, parallelisable across 2 engineers.
Adjacent ships (context)
Work that landed alongside the AC deliverables but isn’t a formal AC:
- rig-agent-runtime#110: rewrote the `## Runtime installs` block in `stream-consumer.js` to match guard reality. The old prompt advised `sudo` + `apt-get install`, both blocked by the AC 1 guard, so agents following their own guidance hit `GuardBlocked` and got stuck. Prep for AC 5 as well (primes agents for a future egress policy that denies arbitrary hosts). Surfaced by the agent runtime-install audit research.
- dashecorp/infra#112: declarative per-repo provisioning of `RIG_BOT_PAT` via Terraform (`needs_rig_bot_pat = true` in `github/dashecorp/variables.tf:repos`). Not part of this user story, but the 2026-04-20 CI resuscitation that unblocked rig-conductor’s `publish-image` → PR-based update-gitops flow depended on a manually created PAT secret; #112 makes that pattern reproducible so the next dashecorp repo that needs PR-on-main doesn’t rediscover the trap. See `dashecorp/infra/BOOTSTRAP.md`.