Agent platform comparison: the rig vs. Devin, SWE-agent, Aider, Claude Code
Purpose
Understand how the rig’s design choices differ from the leading agent platforms. Inform decisions about where to build custom infrastructure vs. adopt upstream patterns.
Comparison matrix
| Dimension | Devin (Cognition) | SWE-agent | Aider | Claude Code | The Rig |
|---|---|---|---|---|---|
| Hosting | Managed SaaS | Self-hosted | Local CLI | Local CLI | Self-hosted k3s |
| Dispatch | Web UI / API | CLI / CI | CLI | CLI | Event-sourced (rig-conductor) |
| Persistence | Session-scoped | None | None | CLAUDE.md + project files | Postgres + pgvector (memory MCP) |
| Review | Human only | Human only | Human only | Human only | Review-E (agent-to-agent) |
| Cost attribution | Per-session billing | None built-in | None built-in | None built-in | Per-issue, per-agent, daily totals |
| Multi-agent | Single agent | Single agent | Single agent | Single agent | 6 specialized agents |
| GitOps integration | No | No | No | No | Flux-managed deployments |
| Branch protection | Respects but doesn’t enforce | No | No | No | Conductor merge gate |
| Observability | Cognition dashboard | None | None | Limited | OTel (partial), cost dashboard |
| Open source | Closed | MIT | Apache 2.0 | Closed | Mixed (components open) |
Devin (Cognition)
What it does well: Devin is the most capable general-purpose coding agent available as a product. It can navigate browser UIs, run tests, read documentation, and complete multi-session tasks. The Cognition team’s focus on long-horizon reliability is visible.
Where the rig diverges:
- Cost model: Devin charges per-session with no per-issue attribution. The rig tracks cost by issue, agent, and repo — essential for budgeting at 30+ issues/day.
- Operator control: Devin is a managed service. The rig is operator-hosted, which means full control of secrets, network policy, and model selection at the cost of operational burden.
- Multi-agent: Devin is a single agent. The rig separates concerns: Dev-E writes, Review-E reviews, iBuild-E builds for Apple platforms. This separation prevents the acceptance-gating problem (an agent cannot approve its own work).
- Memory: Devin has session memory but not a persistent cross-session vector store. The rig’s memory MCP enables learning that survives pod restarts and persists across weeks.
Key insight: Devin is optimized for “do this task well once.” The rig is optimized for “do 30 tasks per day reliably, across 7 repos, with audit trails.”
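The cost-model point above can be illustrated with a small rollup. This is a minimal sketch, not the rig's implementation: the `UsageRecord` shape, its field names, and the `rollup` helper are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    """Hypothetical usage record; field names are illustrative only."""
    issue: str
    agent: str
    repo: str
    cost_usd: float


def rollup(records: list[UsageRecord], dimension: str) -> dict[str, float]:
    """Sum cost over one dimension: 'issue', 'agent', or 'repo'."""
    totals: dict[str, float] = defaultdict(float)
    for record in records:
        totals[getattr(record, dimension)] += record.cost_usd
    return dict(totals)


# Made-up sample data, chosen only to show the three views.
records = [
    UsageRecord("repo-a#1", "dev-e", "repo-a", 0.25),
    UsageRecord("repo-a#1", "review-e", "repo-a", 0.5),
    UsageRecord("repo-b#7", "dev-e", "repo-b", 1.5),
]

print(rollup(records, "issue"))
print(rollup(records, "agent"))
print(rollup(records, "repo"))
```

Because every record carries all three keys, the same data answers "what did this issue cost" and "what did this agent cost" without separate bookkeeping, which per-session billing cannot do.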
SWE-agent
What it does well: SWE-agent is the research baseline for agent-on-GitHub-issues performance. It introduced the Agent-Computer Interface (ACI) — a structured set of tools (search, view, edit, run) that outperforms raw shell access for code navigation.
Where the rig diverges:
- ACI vs. Claude Code: SWE-agent built its own tool layer. The rig uses Claude Code CLI, which provides similar primitives (Read, Grep, Edit, Bash) with Anthropic’s prompt engineering baked in.
- Persistence: SWE-agent is stateless between runs. The rig’s memory MCP means lessons from issue N inform issue N+1000.
- Dispatch: SWE-agent runs from CLI. The rig’s conductor handles assignment exclusivity, cost tracking, and lifecycle events without manual intervention.
Key insight: SWE-agent’s ACI insight is correct and the rig inherits it via Claude Code. The rig extends the model with production-grade dispatch, memory, and multi-agent separation.
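Assignment exclusivity, mentioned in the dispatch point above, means that at most one agent holds a given issue at a time. The sketch below shows the idea with an in-memory registry. The class, its methods, and the issue identifiers are hypothetical; the text does not describe how rig-conductor actually stores assignments.

```python
import threading


class AssignmentRegistry:
    """In-memory stand-in for an exclusive-assignment store (hypothetical)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._owner_by_issue: dict[str, str] = {}

    def claim(self, issue_id: str, agent: str) -> bool:
        """Return True if `agent` holds the issue after the call, False if another agent does."""
        with self._lock:
            # setdefault stores `agent` only when the issue is unclaimed.
            owner = self._owner_by_issue.setdefault(issue_id, agent)
            return owner == agent

    def release(self, issue_id: str, agent: str) -> None:
        """Release the issue, but only if `agent` is the current owner."""
        with self._lock:
            if self._owner_by_issue.get(issue_id) == agent:
                del self._owner_by_issue[issue_id]


registry = AssignmentRegistry()
print(registry.claim("repo-a#42", "dev-e"))    # first claim succeeds
print(registry.claim("repo-a#42", "dev-e-2"))  # second agent is refused
registry.release("repo-a#42", "dev-e")
print(registry.claim("repo-a#42", "dev-e-2"))  # free again after release
```

A deployed dispatcher would more plausibly enforce the same invariant in its datastore, for example with a uniqueness constraint, so that it survives restarts. The lock here only stands in for that guarantee.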
Aider
What it does well: Aider is the most mature local coding assistant. It supports 100+ LLMs, has excellent git integration (auto-commit with conventional messages), and handles multi-file edits cleanly. The --architect mode uses a planning model to design changes before a coding model implements them — an early form of multi-agent.
Where the rig diverges:
- Human in the loop: Aider requires a human to direct each session. The rig is autonomous from issue dispatch through PR merge.
- Multi-model: Aider’s --architect mode hints at the rig’s design: separate roles for design and implementation. The rig extends this with Review-E as a third role (independent verification).
- Deployment: Aider is a developer tool, not a deployed service. It doesn’t handle assignment, cost attribution, or failure recovery.
Key insight: Aider proves the value of specialization (architect vs. coder). The rig extends the pattern: designer (Architect-E, planned) → coder (Dev-E) → reviewer (Review-E).
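The independent-verification role comes down to one rule: an approval counts only if it comes from someone other than the author. The following is a minimal sketch of that rule. The `PullRequest` record and `can_merge` function are hypothetical names, not the conductor's actual merge gate.

```python
from dataclasses import dataclass, field


@dataclass
class PullRequest:
    """Hypothetical PR record; field names are illustrative, not the rig's schema."""
    number: int
    author: str
    approvals: set[str] = field(default_factory=set)


def can_merge(pr: PullRequest) -> bool:
    """Allow a merge only if at least one approval comes from a non-author."""
    independent_approvals = pr.approvals - {pr.author}
    return len(independent_approvals) > 0


pr = PullRequest(number=1, author="dev-e", approvals={"dev-e"})
print(can_merge(pr))  # self-approval alone does not pass the gate
pr.approvals.add("review-e")
print(can_merge(pr))  # an independent reviewer has approved
```

Keeping the check on identities rather than on roles means the same rule holds if more agents are added later.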
Claude Code
What it does well: Claude Code is the rig’s actual execution engine. It provides the tool-use layer (Read, Grep, Edit, Bash, WebFetch, Agent), handles multi-turn sessions, and manages the agentic loop. It’s not a platform — it’s a powerful primitive.
Where the rig diverges (by adding):
- Claude Code alone has no dispatch. The rig wraps it with rig-conductor for assignment exclusivity and lifecycle tracking.
- Claude Code alone has no persistent memory. The rig adds the memory MCP for cross-session learning.
- Claude Code alone has no multi-agent coordination. The rig adds Review-E as an independent gate.
- Claude Code alone has no cost attribution. The rig adds TokenUsageProjection to correlate every API call to an issue.
Key insight: Claude Code is the engine. The rig is the vehicle. Neither is sufficient alone.
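Cross-session memory backed by a vector store rests on nearest-neighbor lookup over embeddings. The sketch below shows that lookup in plain Python, as a stand-in for what a pgvector query would do inside Postgres. The function names, the three-dimensional vectors, and the memory texts are invented for illustration; the text does not specify the memory MCP's interface or its embedding model.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(query: list[float], memories: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the texts of the k stored memories most similar to the query vector."""
    ranked = sorted(memories, key=lambda m: cosine_similarity(query, m[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy memories with hand-written embeddings (real embeddings have many more dimensions).
MEMORIES = [
    ("lesson about deployments", [1.0, 0.0, 0.0]),
    ("lesson about test setup", [0.0, 1.0, 0.0]),
    ("related deployment lesson", [0.9, 0.1, 0.0]),
]

print(recall([1.0, 0.0, 0.0], MEMORIES))
```

Since the stored rows live outside the agent process, a restarted pod can run the same lookup and retrieve lessons written weeks earlier.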
Where the rig is genuinely ahead
- Agent-to-agent review — No commercial platform ships independent review as a separate agent. The rig’s Review-E catches logic errors, missing docs, and pattern violations before human review.
- Per-issue cost attribution — The rig knows which issues cost $0.40 and which cost $12. Useful for identifying runaway epics and calibrating complexity estimates.
- Cross-session persistent memory — Memory persists across pod restarts, agent upgrades, and weeks of operation. The rig doesn’t re-learn the same gotchas.
- Operator-defined governance — CODEOWNERS + branch protection give operators path-level control over which changes require human approval. None of the platforms compared here exposes that level of configurability.
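Path-level governance can be pictured as a check over the files a change touches. The sketch below is an approximation under stated assumptions: the pattern list is hypothetical, and `fnmatchcase` only approximates the gitignore-style matching that real CODEOWNERS files use (here `*` also matches `/`).

```python
from fnmatch import fnmatchcase

# Hypothetical protected-path patterns, invented for this example.
HUMAN_REVIEW_PATTERNS = ["infra/*", "*.tf", ".github/workflows/*"]


def requires_human_approval(changed_paths: list[str]) -> bool:
    """True if any changed path matches a pattern reserved for human review."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_paths
        for pattern in HUMAN_REVIEW_PATTERNS
    )


print(requires_human_approval(["src/app.py", "docs/guide.md"]))
print(requires_human_approval(["src/app.py", "infra/k8s/app.yaml"]))
```

In practice the enforcement would sit in the hosting platform's branch protection rather than in agent code. The function only shows the decision that such rules encode.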
Where the rig lags
- Single-provider — Anthropic-only today. Provider portability is a planned whitepaper topic but not implemented.
- No browser/UI automation — Devin can navigate web UIs; the rig cannot. This limits it to code-and-config tasks.
- Operational complexity — Running k3s, Flux, Postgres, and a custom conductor requires real infrastructure expertise. Devin’s SaaS model is dramatically simpler to start.