Agent runtime installs — what's baked in, what's needed, and why Phase 1 egress CIDR-only breaks agents
TL;DR. Audit prep work for AC 5 Phase 1 of the safety-foundation user story. Finds two issues: (1) the current agent task prompt in `stream-consumer.js:226` tells agents to install tools at runtime via `sudo` + `apt-get install`, both already blocked by the pretool-guard shipped in AC 1. Dead advice, worth fixing. (2) Phase 1 of the egress policy options research recommended a CIDR-only NetworkPolicy with npm blocked at the network layer. That breaks every agent task that runs on a repo with a `package.json` because `npm install` needs registry access. Revised plan: skip Phase 1 CIDR-only and go direct to GKE FQDNNetworkPolicy with an expanded allowlist (npm + pypi + nuget registries in addition to GitHub + Anthropic). Still Preview, still acceptable at 2 agents.
What’s baked into agent images today
| Layer | Packages installed at build time |
|---|---|
| `Dockerfile.base` (common) | ca-certificates, curl, jq, git, openssh-client, GitHub CLI `gh`, Node 22 (via parent image `node:22-slim`), `@anthropic-ai/claude-code`, `@openai/codex`, `@dashecorp/rig-memory-mcp` |
| `Dockerfile.node` (dev-e, review-e stacks) | adds typescript, tsx, jest, vitest, eslint, prettier (global) |
| `Dockerfile.python` (if used) | adds python3, python3-pip, python3-venv, pytest, black, ruff |
| `Dockerfile.dotnet` (rig-conductor agent) | adds .NET 10 SDK |
Total: ~20 common dev tools baked in. Developers working on the rig itself rarely need more.
What the task prompt tells agents (today)
From `src/stream-consumer.js:226`:

    ## Runtime installs
    If you need a tool that is not installed, install it yourself
    (npm install -g, pip install, apt-get install). You have sudo access
    for apt-get. Prefer global installs so they persist for the session.

This is doubly broken today, before any egress policy even lands:

| Command in prompt | pretool-guard behavior | Result |
|---|---|---|
| `sudo apt-get install …` | blocked (sudo + apt install regex) | agent stuck |
| `apt-get install …` | blocked (apt install regex) | agent stuck |
| `brew install …` | blocked (brew install regex) | agent stuck |
| `npm install -g …` | allowed | works today, would break under egress policy |
| `pip install …` | allowed | works today, would break under egress policy |

So the `sudo` and `apt-get install` advice in the prompt is already stopped by AC 1’s guard, and `brew install` would be too; only `npm install -g` and `pip install` still get through. The guard’s GuardBlocked dashboard panel (shipped yesterday) will surface these attempts.
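
The table above can be reproduced with a small classifier. This is only a sketch: the three regexes are hypothetical stand-ins for the blocklist in `hooks/pretool-guard.sh`, which may differ in detail, and `guardVerdict` is a name invented for illustration.

```javascript
// Hypothetical patterns; the real blocklist lives in hooks/pretool-guard.sh
// and may be stricter or looser than these.
const BLOCKED_PATTERNS = [
  { name: "sudo", regex: /(^|[\s;&|])sudo\s/ },
  { name: "apt install", regex: /\bapt(-get)?\s+install\b/ },
  { name: "brew install", regex: /\bbrew\s+install\b/ },
];

// Returns the name of every pattern the command trips; an empty array means allowed.
function guardVerdict(command) {
  return BLOCKED_PATTERNS.filter((p) => p.regex.test(command)).map((p) => p.name);
}

// The five commands from the table.
for (const cmd of [
  "sudo apt-get install ripgrep",
  "apt-get install ripgrep",
  "brew install ripgrep",
  "npm install -g tsx",
  "pip install ruff",
]) {
  const hits = guardVerdict(cmd);
  console.log(cmd, "->", hits.length ? `blocked (${hits.join(" + ")})` : "allowed");
}
```

Under these assumed patterns, the first command trips two rules at once, which is why removing only the `sudo` hint from the prompt would not unblock it.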
What agent tasks actually need at runtime
Agents work on customer repos, not just rig-agent-runtime itself. A typical task flow:

1. `task-workspace create <owner/repo> issue-<N>` → worktree with the customer repo
2. `cd <WORKDIR>` and read the project
3. Implement changes, which frequently requires installing the project’s declared deps:
   - Node repo: `npm install` (or `npm ci`) resolving `package.json`
   - Python repo: `pip install -r requirements.txt`
   - .NET repo: `dotnet restore`
4. Run tests: `npm test` / `pytest` / `dotnet test`
5. Open PR

Step 3 is the one that egress policy breaks. The project-level npm install is not optional — it’s how dependencies come down to run the project’s tests. Baking customer deps into the image is not feasible: every customer repo is different.
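
To make the dependency on registry access concrete, the sketch below maps a worktree's marker files to the install step and the registry hosts that step has to reach. The detection rules (`package.json`, `requirements.txt`, `*.csproj`) and the helper names are assumptions for illustration; the hostnames are the ones named elsewhere in this document.

```javascript
// Last path segment, e.g. "api/Api.csproj" -> "Api.csproj".
const baseName = (path) => path.split("/").pop();

// Hypothetical mapping from marker file to install step and required hosts.
const ECOSYSTEMS = [
  {
    detect: (file) => baseName(file) === "package.json",
    install: "npm ci",
    hosts: ["registry.npmjs.org"],
  },
  {
    detect: (file) => baseName(file) === "requirements.txt",
    install: "pip install -r requirements.txt",
    hosts: ["pypi.org", "files.pythonhosted.org"],
  },
  {
    detect: (file) => file.endsWith(".csproj"),
    install: "dotnet restore",
    hosts: ["api.nuget.org"],
  },
];

// Given the file paths in a worktree, list the install steps and the hosts they need.
function egressNeeds(repoFiles) {
  const matched = ECOSYSTEMS.filter((eco) => repoFiles.some(eco.detect));
  return {
    installs: matched.map((eco) => eco.install),
    hosts: [...new Set(matched.flatMap((eco) => eco.hosts))],
  };
}

console.log(egressNeeds(["package.json", "src/index.ts"]));
console.log(egressNeeds(["README.md"]));
```

Any repo for which `hosts` is non-empty cannot complete step 3 when those hosts are unreachable.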
Why Phase 1 (CIDR-only allowlist) doesn’t work
The egress policy options research recommended:

> Phase 1: default-deny `NetworkPolicy` with CIDR allowlist (Anthropic `160.79.104.0/21`, GitHub `/meta`, kube-dns, `rig-memory-mcp`). Block npm at the network layer; bake deps into agent images.

The bake-deps advice was reasonable for our own agent runtime (dep list is small, stable). But it collapses as soon as agents operate on customer repos. And the Cloudflare problem remains:
- `registry.npmjs.org` → Cloudflare (≈1500 prefixes that rotate)
- `nuget.org` → Azure CDN (frequently changing)
- `pypi.org` → Fastly (less-frequent rotation but still not stable enough for an `ipBlock`)

Putting these in an ipBlock NetworkPolicy produces a monthly stream of npm install failures as prefixes rotate.
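
The failure mode is easy to see with a plain CIDR membership check. In this sketch the Anthropic range is the one quoted above, while the CDN prefix and the post-rotation address are made-up values from the reserved documentation ranges, not real registry addresses.

```javascript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipv4ToInt(ip) {
  return ip.split(".").reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// True when `ip` falls inside `cidr` (for example "160.79.104.0/21").
function inCidr(ip, cidr) {
  const [base, bitsText] = cidr.split("/");
  const bits = Number(bitsText);
  // JavaScript shift counts wrap at 32, so /0 needs an explicit zero mask.
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
}

// Snapshot of an ipBlock-style allowlist. 203.0.113.0/24 is a placeholder
// documentation range standing in for one CDN prefix at snapshot time.
const allowlist = ["160.79.104.0/21", "203.0.113.0/24"];
const isAllowed = (ip) => allowlist.some((cidr) => inCidr(ip, cidr));

console.log(isAllowed("160.79.105.10")); // inside the pinned /21
console.log(isAllowed("203.0.113.40")); // CDN address at snapshot time
console.log(isAllowed("198.51.100.7")); // same hostname after a hypothetical prefix rotation
```

The last address is rejected even though the hostname behind it has not changed, which is the kind of `npm install` failure a static `ipBlock` list would produce.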
Revised phase plan
Skip Phase 1 as originally drafted and go direct to Phase 2, the GKE-native FQDNNetworkPolicy (Preview in 2026, acceptable for 2 agents):

1. Default-deny egress `NetworkPolicy` on the agent namespace.
2. Allow kube-dns (TCP/UDP 53).
3. Allow `rig-memory-mcp` service via `podSelector`.
4. FQDNNetworkPolicy allowing:
   - `api.github.com`, `github.com`, `codeload.github.com`, `objects.githubusercontent.com`
   - `api.anthropic.com` (or route this via the LiteLLM proxy when Priority 3 ships)
   - `registry.npmjs.org`, `registry.yarnpkg.com` (Yarn registry, alias to npm)
   - `pypi.org`, `files.pythonhosted.org`
   - `api.nuget.org`, `*.nuget.org`
5. Everything else denied.

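
As a rough sketch of the FQDN allowlist step, the hostnames can be turned into a manifest object and emitted as JSON, which `kubectl apply -f -` accepts. The `apiVersion`, the `matches` / `name` / `pattern` field names, the policy name, the `dev-e` namespace, the pod label, and the TCP 443 port are all assumptions based on a reading of the GKE Preview docs; check them against the CRD installed in the cluster before use.

```javascript
// Allowlist copied from the plan in this section.
const ALLOWED_FQDNS = [
  "api.github.com",
  "github.com",
  "codeload.github.com",
  "objects.githubusercontent.com",
  "api.anthropic.com",
  "registry.npmjs.org",
  "registry.yarnpkg.com",
  "pypi.org",
  "files.pythonhosted.org",
  "api.nuget.org",
  "*.nuget.org",
];

// Builds an FQDNNetworkPolicy-shaped object. Wildcard entries go under
// `pattern`, exact hostnames under `name` (assumed field names).
function buildFqdnPolicy({ name, namespace, podLabels }) {
  return {
    apiVersion: "networking.gke.io/v1alpha1", // assumed Preview API version
    kind: "FQDNNetworkPolicy",
    metadata: { name, namespace },
    spec: {
      podSelector: { matchLabels: podLabels },
      egress: [
        {
          matches: ALLOWED_FQDNS.map((fqdn) =>
            fqdn.includes("*") ? { pattern: fqdn } : { name: fqdn }
          ),
          ports: [{ protocol: "TCP", port: 443 }], // assumes HTTPS-only egress
        },
      ],
    },
  };
}

// Hypothetical names; substitute the real namespace and labels.
const manifest = buildFqdnPolicy({
  name: "agent-egress-allowlist",
  namespace: "dev-e",
  podLabels: { app: "dev-e" },
});
console.log(JSON.stringify(manifest, null, 2));
```

Keeping the hostnames in one array makes it straightforward to generate the same policy for both agent namespaces.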
FQDN Preview limits per GKE docs: 50 IPs per FQDN resolution, 100 IP/hostname quota, one-label-deep wildcards. `*.nuget.org` matches `api.nuget.org` but not `foo.bar.nuget.org`, which is acceptable for the registries above.
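
The one-label-deep rule can be sketched as a label-by-label comparison. This is an illustration of the matching behaviour described above, not GKE's implementation; in particular, treating the bare domain `nuget.org` as a non-match is an assumption of this sketch.

```javascript
// One-label-deep wildcard match: "*" stands for exactly one DNS label.
function matchesFqdnPattern(pattern, hostname) {
  const patternLabels = pattern.toLowerCase().split(".");
  // Drop a trailing root dot so "api.nuget.org." compares like "api.nuget.org".
  const hostLabels = hostname.toLowerCase().replace(/\.$/, "").split(".");
  if (patternLabels.length !== hostLabels.length) return false;
  return patternLabels.every(
    (label, i) => hostLabels[i].length > 0 && (label === "*" || label === hostLabels[i])
  );
}

console.log(matchesFqdnPattern("*.nuget.org", "api.nuget.org"));
console.log(matchesFqdnPattern("*.nuget.org", "foo.bar.nuget.org"));
```

Because the label counts must be equal, a deeper subdomain can never slip through a single `*`.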
Also fix: the prompt contradicts the guard
`stream-consumer.js:226` should be rewritten to match what the pretool-guard actually allows. Proposed:

    ## Runtime installs
    Most common tools (git, gh, jq, curl, claude, codex, typescript, jest,
    vitest, eslint, prettier, pytest, black, ruff, dotnet) are pre-installed.
    For project dependencies, use `npm install`, `pip install`, or
    `dotnet restore` inside your worktree. Do NOT use `sudo`, `apt-get`, or
    `brew`; those are blocked by the PreToolUse guard. If you need a tool
    that is genuinely missing from the image, open a separate PR against
    rig-agent-runtime/Dockerfile.* to add it; do not try to install it at
    runtime.

Two benefits: aligns with guard reality today, and primes agents correctly for a future FQDNNetworkPolicy where arbitrary-host network calls are denied.
Small implementation plan
Two PRs, in order:

1. rig-agent-runtime: rewrite the `## Runtime installs` block in `src/stream-consumer.js`. One-line test: prompt no longer recommends `sudo` / `apt-get` / `brew install`. ~1 hour.
2. rig-gitops: add the FQDNNetworkPolicy (and the default-deny base `NetworkPolicy`) under `apps/dev-e/` and `apps/review-e/`. Needs `kubectl apply --dry-run=server` against the cluster before merge. ~½ day plus cluster validation.

Neither PR needs Cilium CRDs; both rely only on GKE Dataplane V2 features available to Invotek today.
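
The one-line prompt test for the first PR could look roughly like the sketch below. The phrase list and the `findStaleAdvice` helper are hypothetical, and the two prompt strings are excerpts of the current and proposed text rather than whatever `stream-consumer.js` exports; the check looks for phrases that recommend a blocked installer, since the proposed prompt still names `sudo`, `apt-get`, and `brew` in order to forbid them.

```javascript
// Phrases that would mean the prompt is still recommending a blocked installer.
// Hypothetical list; adjust to the final prompt wording.
const STALE_ADVICE = [/you have sudo access/i, /\bapt-get install\b/i, /\bbrew install\b/i];

// Returns the source of every stale phrase found in the prompt text.
function findStaleAdvice(promptText) {
  return STALE_ADVICE.filter((re) => re.test(promptText)).map((re) => re.source);
}

// Excerpt of the current prompt text.
const oldPrompt = [
  "If you need a tool that is not installed, install it yourself",
  "(npm install -g, pip install, apt-get install). You have sudo access",
  "for apt-get.",
].join("\n");

// Excerpt of the proposed prompt text.
const newPrompt = [
  "For project dependencies, use `npm install`, `pip install`, or",
  "`dotnet restore` inside your worktree. Do NOT use `sudo`, `apt-get`, or",
  "`brew`.",
].join("\n");

console.log(findStaleAdvice(oldPrompt));
console.log(findStaleAdvice(newPrompt));
```

In a real test file the prompt string would be imported from the module that builds it, and the assertion would be that the returned array is empty.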
Supersession
This research supersedes the Phase 1 slice of research/2026-04-21-egress-policy-options. The seven-option comparison and the Phase 3 egress-gateway recommendation in that doc still stand; the “Phase 1 CIDR-only first” advice is superseded. The combined plan is now: skip to what that doc called Phase 2, and keep Phase 3 as the later scale-up milestone.
Sources
- `rig-agent-runtime/Dockerfile.base`: base image layers
- `rig-agent-runtime/Dockerfile.{node,python,dotnet}`: per-language layers
- `rig-agent-runtime/src/stream-consumer.js:226`: runtime-install prompt text
- `rig-agent-runtime/hooks/pretool-guard.sh`: blocklist (shipped in rig-agent-runtime#97)
- research/2026-04-21-egress-policy-options: superseded Phase-1 recommendation