Agent platform comparison: the rig vs. Devin, SWE-agent, Aider, Claude Code

Purpose

Understand how the rig’s design choices differ from the leading agent platforms. Inform decisions about where to build custom infrastructure vs. adopt upstream patterns.

Comparison matrix

Dimension	Devin (Cognition)	SWE-agent	Aider	Claude Code	The Rig
Hosting	Managed SaaS	Self-hosted	Local CLI	Local CLI	Self-hosted k3s
Dispatch	Web UI / API	CLI / CI	CLI	CLI	Event-sourced (rig-conductor)
Persistence	Session-scoped	None	None	CLAUDE.md + project files	Postgres + pgvector (memory MCP)
Review	Human only	Human only	Human only	Human only	Review-E (agent-to-agent)
Cost attribution	Per-session billing	None built-in	None built-in	None built-in	Per-issue, per-agent, daily totals
Multi-agent	Single agent	Single agent	Single agent	Single agent	6 specialized agents
GitOps integration	No	No	No	No	Flux-managed deployments
Branch protection	Respects but doesn’t enforce	No	No	No	Conductor merge gate
Observability	Cognition dashboard	None	None	Limited	OTel (partial), cost dashboard
Open source	Closed	MIT	Apache 2.0	Closed	Mixed (components open)

Devin (Cognition)

What it does well: Devin is the most capable general-purpose coding agent available as a product. It can navigate browser UIs, run tests, read documentation, and complete multi-session tasks. The Cognition team’s focus on long-horizon reliability is visible.

Where the rig diverges:

Cost model: Devin charges per-session with no per-issue attribution. The rig tracks cost by issue, agent, and repo — essential for budgeting at 30+ issues/day.
Operator control: Devin is a managed service. The rig is operator-hosted, which means full control of secrets, network policy, and model selection at the cost of operational burden.
Multi-agent: Devin is a single agent. The rig separates concerns: Dev-E writes, Review-E reviews, iBuild-E builds for Apple platforms. This separation addresses the acceptance-gating problem: because review is a separate agent, no agent can approve its own work.
Memory: Devin has session memory but not a persistent cross-session vector store. The rig’s memory MCP enables learning that survives pod restarts and persists across weeks.
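The role separation above reduces to one invariant: an approval counts only if it comes from an agent other than the author. The following is a minimal sketch of that rule, not the conductor's actual merge gate. The `Review` and `PullRequest` types and the `can_merge` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Review:
    reviewer: str   # agent that submitted the review, e.g. "Review-E"
    approved: bool


@dataclass
class PullRequest:
    author: str     # agent that wrote the change, e.g. "Dev-E"
    reviews: list[Review] = field(default_factory=list)


def can_merge(pr: PullRequest) -> bool:
    """Allow a merge only if some agent other than the author approved."""
    return any(r.approved and r.reviewer != pr.author for r in pr.reviews)


# Hypothetical pull requests: one approved by its own author, one by a peer.
self_approved = PullRequest("Dev-E", [Review("Dev-E", True)])
peer_approved = PullRequest("Dev-E", [Review("Review-E", True)])
print(can_merge(self_approved), can_merge(peer_approved))
```

A single-agent platform cannot apply this rule, because author and reviewer are always the same identity.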

Key insight: Devin is optimized for “do this task well once.” The rig is optimized for “do 30 tasks per day reliably, across 7 repos, with audit trails.”

SWE-agent

What it does well: SWE-agent is the research baseline for agent-on-GitHub-issues performance. It introduced the Agent-Computer Interface (ACI), a structured set of tools (search, view, edit, run) that outperforms raw shell access for code navigation.

Where the rig diverges:

ACI vs. Claude Code: SWE-agent built its own tool layer. The rig uses Claude Code CLI, which provides similar primitives (Read, Grep, Edit, Bash) with Anthropic’s prompt engineering baked in.
Persistence: SWE-agent is stateless between runs. The rig’s memory MCP means lessons from issue N inform issue N+1000.
Dispatch: SWE-agent runs from CLI. The rig’s conductor handles assignment exclusivity, cost tracking, and lifecycle events without manual intervention.
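Assignment exclusivity in an event-sourced dispatcher can be illustrated by replaying an append-only log to see who currently holds an issue. This is a sketch under stated assumptions, not rig-conductor's actual design: the `Event` type, the `"assigned"` and `"released"` event names, and both functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # "assigned" or "released" (hypothetical event names)
    issue: int
    agent: str


def current_assignments(log: list[Event]) -> dict[int, str]:
    """Replay the log to derive which agent currently holds each issue."""
    held: dict[int, str] = {}
    for e in log:
        if e.kind == "assigned":
            held[e.issue] = e.agent
        elif e.kind == "released" and held.get(e.issue) == e.agent:
            del held[e.issue]
    return held


def try_assign(log: list[Event], issue: int, agent: str) -> bool:
    """Append an 'assigned' event only if no agent currently holds the issue."""
    if issue in current_assignments(log):
        return False
    log.append(Event("assigned", issue, agent))
    return True


log: list[Event] = []
first = try_assign(log, 42, "Dev-E")        # issue 42 is free
second = try_assign(log, 42, "iBuild-E")    # Dev-E already holds issue 42
log.append(Event("released", 42, "Dev-E"))
third = try_assign(log, 42, "iBuild-E")     # issue 42 was released
print(first, second, third)
```

The sketch is single-process. A deployed dispatcher would also need the check and the append to be atomic, for example through a database constraint or transaction.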

Key insight: SWE-agent’s ACI insight is correct and the rig inherits it via Claude Code. The rig extends the model with production-grade dispatch, memory, and multi-agent separation.

Aider

What it does well: Aider is the most mature local coding assistant. It supports 100+ LLM models, has excellent git integration (auto-commit with conventional messages), and handles multi-file edits cleanly. The --architect mode uses a planning model to design changes before a coding model implements them — an early form of multi-agent.

Where the rig diverges:

Human in the loop: Aider requires a human to direct each session. The rig is autonomous from issue dispatch through PR merge.
Multi-model: Aider’s --architect mode anticipates the rig’s design by giving design and implementation separate roles. The rig adds Review-E as a third role for independent verification.
Deployment: Aider is a developer tool, not a deployed service. It doesn’t handle assignment, cost attribution, or failure recovery.

Key insight: Aider proves the value of specialization (architect vs. coder). The rig extends the pattern: designer (Architect-E, planned) → coder (Dev-E) → reviewer (Review-E).

Claude Code

What it does well: Claude Code is the rig’s actual execution engine. It provides the tool-use layer (Read, Grep, Edit, Bash, WebFetch, Agent), handles multi-turn sessions, and manages the agentic loop. It’s not a platform — it’s a powerful primitive.

Where the rig diverges (by adding):

Claude Code alone has no dispatch. The rig wraps it with rig-conductor for assignment exclusivity and lifecycle tracking.
Claude Code alone has no persistent memory. The rig adds the memory MCP for cross-session learning.
Claude Code alone has no multi-agent coordination. The rig adds Review-E as an independent gate.
Claude Code alone has no cost attribution. The rig adds TokenUsageProjection to correlate every API call to an issue.
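Cost attribution of the kind described above can be sketched as a fold over usage events into per-issue, per-agent, and daily totals. The schema of TokenUsageProjection is not given here, so the `UsageEvent` shape, the `project_costs` function, and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class UsageEvent:
    issue: int
    agent: str
    day: date
    cost_usd: float   # cost of one API call, assumed already priced upstream


def project_costs(events: list[UsageEvent]):
    """Fold usage events into per-issue, per-agent, and daily totals."""
    by_issue: dict[int, float] = defaultdict(float)
    by_agent: dict[str, float] = defaultdict(float)
    by_day: dict[date, float] = defaultdict(float)
    for e in events:
        by_issue[e.issue] += e.cost_usd
        by_agent[e.agent] += e.cost_usd
        by_day[e.day] += e.cost_usd
    return dict(by_issue), dict(by_agent), dict(by_day)


# Illustrative events only; issue numbers, dates, and costs are made up.
events = [
    UsageEvent(101, "Dev-E", date(2025, 1, 6), 0.25),
    UsageEvent(101, "Review-E", date(2025, 1, 6), 0.15),
    UsageEvent(102, "Dev-E", date(2025, 1, 7), 3.00),
]
by_issue, by_agent, by_day = project_costs(events)
```

This only works if every API call is tagged with an issue and an agent when it is recorded. Per-session billing does not carry that key.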

Key insight: Claude Code is the engine. The rig is the vehicle. Neither is sufficient alone.

Where the rig is genuinely ahead

Agent-to-agent review — No commercial platform ships independent review as a separate agent. The rig’s Review-E catches logic errors, missing docs, and pattern violations before human review.
Per-issue cost attribution — The rig knows which issues cost $0.40 and which cost $12. Useful for identifying runaway epics and calibrating complexity estimates.
Cross-session persistent memory — Memory persists across pod restarts, agent upgrades, and weeks of operation. The rig doesn’t re-learn the same gotchas.
Operator-defined governance — CODEOWNERS and branch protection together give operators surgical control over which paths require human approval. None of the platforms compared here offers this level of configurability.
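The cross-session memory item above depends on retrieving stored notes by vector similarity. The sketch below shows the idea in plain Python, assuming cosine similarity as the metric. The `recall` function, the note labels, and the toy three-dimensional embeddings are hypothetical; this is not the memory MCP's actual interface.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(query: list[float], memories: dict[str, list[float]], k: int = 1) -> list[str]:
    """Return the k stored notes whose embeddings are most similar to the query."""
    ranked = sorted(
        memories,
        key=lambda note: cosine_similarity(query, memories[note]),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings for illustration; real embeddings come from a model.
memories = {
    "note about deployments": [0.9, 0.1, 0.0],
    "note about migrations": [0.1, 0.9, 0.1],
}
top = recall([0.8, 0.2, 0.0], memories)
print(top)
```

In Postgres with pgvector, the same ranking can be done in the database with the `<=>` cosine-distance operator. Whether the rig uses that metric is not stated here.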

Where the rig lags

Single-provider — Anthropic-only today. Provider portability is a planned whitepaper topic but not implemented.
No browser/UI automation — Devin can navigate web UIs; the rig cannot. This limits it to code-and-config tasks.
Operational complexity — Running k3s, Flux, Postgres, and a custom conductor requires real infrastructure expertise. Devin’s SaaS model is dramatically simpler to start.