Agent platform comparison: the rig vs. Devin, SWE-agent, Aider, Claude Code

This page explains how the rig’s design choices differ from those of the leading agent platforms, to inform decisions about where to build custom infrastructure and where to adopt upstream patterns.


| Dimension | Devin (Cognition) | SWE-agent | Aider | Claude Code | The Rig |
|---|---|---|---|---|---|
| Hosting | Managed SaaS | Self-hosted | Local CLI | Local CLI | Self-hosted k3s |
| Dispatch | Web UI / API | CLI / CI | CLI | CLI | Event-sourced (rig-conductor) |
| Persistence | Session-scoped | None | None | CLAUDE.md + project files | Postgres + pgvector (memory MCP) |
| Review | Human only | Human only | Human only | Human only | Review-E (agent-to-agent) |
| Cost attribution | Per-session billing | None built-in | None built-in | None built-in | Per-issue, per-agent, daily totals |
| Multi-agent | Single agent | Single agent | Single agent | Single agent | 6 specialized agents |
| GitOps integration | No | No | No | No | Flux-managed deployments |
| Branch protection | Respects but doesn’t enforce | No | No | No | Conductor merge gate |
| Observability | Cognition dashboard | None | None | Limited | OTel (partial), cost dashboard |
| Open source | Closed | MIT | Apache 2.0 | Closed | Mixed (components open) |

Devin (Cognition)

What it does well: Devin is the most capable general-purpose coding agent available as a product. It can navigate browser UIs, run tests, read documentation, and complete multi-session tasks. The Cognition team’s focus on long-horizon reliability is visible.

Where the rig diverges:

  • Cost model: Devin charges per-session with no per-issue attribution. The rig tracks cost by issue, agent, and repo — essential for budgeting at 30+ issues/day.
  • Operator control: Devin is a managed service. The rig is operator-hosted, which means full control of secrets, network policy, and model selection at the cost of operational burden.
  • Multi-agent: Devin is a single agent. The rig separates concerns: Dev-E writes, Review-E reviews, iBuild-E builds for Apple platforms. Because no agent approves its own work, this separation avoids the acceptance-gating problem.
  • Memory: Devin has session memory but not a persistent cross-session vector store. The rig’s memory MCP enables learning that survives pod restarts and persists across weeks.

Key insight: Devin is optimized for “do this task well once.” The rig is optimized for “do 30 tasks per day reliably, across 7 repos, with audit trails.”
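
The memory bullet above can be pictured with a small sketch. The rig’s real store is Postgres + pgvector behind the memory MCP; the class below is a hypothetical in-process stand-in that only illustrates the recall pattern: store a lesson with an embedding, then retrieve the nearest lessons by cosine similarity. `MemoryStore`, `remember`, `recall`, the toy 3-dimensional embeddings, and the lesson texts are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class MemoryStore:
    """Hypothetical stand-in for a persistent vector store (not the rig's code)."""
    entries: list = field(default_factory=list)  # (embedding, lesson) pairs

    def remember(self, embedding, lesson):
        self.entries.append((list(embedding), lesson))

    def recall(self, query, k=1):
        """Return the k lessons whose embeddings are closest to the query."""
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine_similarity(entry[0], query),
            reverse=True,
        )
        return [lesson for _, lesson in ranked[:k]]


# Made-up lessons with toy embeddings; a real system would embed the text.
store = MemoryStore()
store.remember([1.0, 0.0, 0.0], "Run migrations before integration tests")
store.remember([0.0, 1.0, 0.0], "Pin the toolchain version for Apple builds")
closest = store.recall([0.9, 0.1, 0.0], k=1)
```

In a deployed setup the list would be replaced by a database table, which is what lets lessons survive pod restarts; the ranking idea stays the same.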


SWE-agent

What it does well: SWE-agent is the research baseline for agent-on-GitHub-issues performance. It introduced the Agent-Computer Interface (ACI): a structured set of tools (search, view, edit, run) that outperforms raw shell access for code navigation.

Where the rig diverges:

  • ACI vs. Claude Code: SWE-agent built its own tool layer. The rig uses Claude Code CLI, which provides similar primitives (Read, Grep, Edit, Bash) with Anthropic’s prompt engineering baked in.
  • Persistence: SWE-agent is stateless between runs. The rig’s memory MCP means lessons from issue N inform issue N+1000.
  • Dispatch: SWE-agent runs from CLI. The rig’s conductor handles assignment exclusivity, cost tracking, and lifecycle events without manual intervention.

Key insight: SWE-agent’s ACI insight is correct and the rig inherits it via Claude Code. The rig extends the model with production-grade dispatch, memory, and multi-agent separation.
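
The dispatch bullet above mentions assignment exclusivity in an event-sourced conductor. The following is a minimal sketch of that idea, not rig-conductor’s actual code: an append-only log from which the current owner of an issue is derived by replay, with a second assignment rejected while an owner exists. The class `AssignmentLog`, the event names `assigned` and `released`, and the issue and agent identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str   # "assigned" or "released" (hypothetical event names)
    issue: str
    agent: str


class AssignmentLog:
    """Append-only log; current ownership is derived by replaying events."""

    def __init__(self):
        self.events = []

    def current_owner(self, issue):
        owner = None
        for event in self.events:
            if event.issue != issue:
                continue
            if event.kind == "assigned":
                owner = event.agent
            elif event.kind == "released" and event.agent == owner:
                owner = None
        return owner

    def assign(self, issue, agent):
        """Append an 'assigned' event only if nobody currently owns the issue."""
        if self.current_owner(issue) is not None:
            return False
        self.events.append(Event("assigned", issue, agent))
        return True

    def release(self, issue, agent):
        """Append a 'released' event only if the agent is the current owner."""
        if self.current_owner(issue) != agent:
            return False
        self.events.append(Event("released", issue, agent))
        return True


log = AssignmentLog()
first = log.assign("issue-101", "dev-e")
second = log.assign("issue-101", "ibuild-e")  # rejected: already owned
log.release("issue-101", "dev-e")
third = log.assign("issue-101", "ibuild-e")   # accepted after release
```

This single-process version has no concurrency control; a real conductor would presumably need a transaction or a uniqueness constraint so that two simultaneous assignments cannot both succeed.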


Aider

What it does well: Aider is the most mature local coding assistant. It supports 100+ LLMs, has excellent git integration (auto-commit with conventional messages), and handles multi-file edits cleanly. The --architect mode uses a planning model to design changes before a coding model implements them, an early form of multi-agent design.

Where the rig diverges:

  • Human in the loop: Aider requires a human to direct each session. The rig is autonomous from issue dispatch through PR merge.
  • Multi-model: Aider’s --architect mode foreshadows the rig’s design: separate roles for design and implementation. The rig extends this with Review-E as a third role (independent verification).
  • Deployment: Aider is a developer tool, not a deployed service. It doesn’t handle assignment, cost attribution, or failure recovery.

Key insight: Aider proves the value of specialization (architect vs. coder). The rig extends the pattern: designer (Architect-E, planned) → coder (Dev-E) → reviewer (Review-E).
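
The coder → reviewer separation described above can be sketched as a merge gate in which the author of a change can never count as its approver. This is an illustration under assumptions, not the conductor’s implementation: `PullRequest`, `approve`, `can_merge`, and the lowercase agent identifiers are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class PullRequest:
    author: str
    approvals: set = field(default_factory=set)


def approve(pr, reviewer):
    """Record an approval unless the reviewer is the PR's own author."""
    if reviewer == pr.author:
        raise PermissionError(f"{reviewer} cannot approve its own work")
    pr.approvals.add(reviewer)


def can_merge(pr, required_reviewer="review-e"):
    """Merge gate: the designated reviewer role must have approved."""
    return required_reviewer in pr.approvals


pr = PullRequest(author="dev-e")

# The author's attempt to approve its own change is refused.
try:
    approve(pr, "dev-e")
    self_approval_blocked = False
except PermissionError:
    self_approval_blocked = True

blocked_before_review = not can_merge(pr)
approve(pr, "review-e")
mergeable = can_merge(pr)
```

Adding a designer role would extend the same pattern with one more stage ahead of the coder; the gate itself only needs to know who authored and who approved.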


Claude Code

What it does well: Claude Code is the rig’s actual execution engine. It provides the tool-use layer (Read, Grep, Edit, Bash, WebFetch, Agent), handles multi-turn sessions, and manages the agentic loop. It is not a platform; it is a powerful primitive.

Where the rig diverges (by adding):

  • Claude Code alone has no dispatch. The rig wraps it with rig-conductor for assignment exclusivity and lifecycle tracking.
  • Claude Code alone has no persistent memory. The rig adds the memory MCP for cross-session learning.
  • Claude Code alone has no multi-agent coordination. The rig adds Review-E as an independent gate.
  • Claude Code alone has no cost attribution. The rig adds TokenUsageProjection to correlate every API call to an issue.

Key insight: Claude Code is the engine. The rig is the vehicle. Neither is sufficient alone.
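
To make the cost-attribution bullet concrete, here is a hedged sketch of what a projection over token-usage events could look like. It is not the rig’s TokenUsageProjection: the event fields, the function names `call_cost` and `project_usage`, the sample events, and the per-token prices are all assumptions chosen for illustration.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical prices in USD per million tokens; real rates depend on the model.
PRICE_PER_MTOK = {"input": Decimal("3"), "output": Decimal("15")}


def call_cost(event):
    """Cost in USD of one API call, computed from its token counts."""
    raw = (
        Decimal(event["input_tokens"]) * PRICE_PER_MTOK["input"]
        + Decimal(event["output_tokens"]) * PRICE_PER_MTOK["output"]
    )
    return raw / Decimal(1_000_000)


def project_usage(events):
    """Fold usage events into per-issue, per-agent, and per-day totals."""
    totals = {
        "issue": defaultdict(Decimal),
        "agent": defaultdict(Decimal),
        "day": defaultdict(Decimal),
    }
    for event in events:
        cost = call_cost(event)
        for dimension in totals:
            totals[dimension][event[dimension]] += cost
    return totals


# Made-up sample events: each API call carries the issue it was made for.
events = [
    {"issue": "issue-101", "agent": "dev-e", "day": "2025-01-06",
     "input_tokens": 100_000, "output_tokens": 20_000},
    {"issue": "issue-101", "agent": "review-e", "day": "2025-01-06",
     "input_tokens": 50_000, "output_tokens": 10_000},
    {"issue": "issue-102", "agent": "dev-e", "day": "2025-01-07",
     "input_tokens": 200_000, "output_tokens": 40_000},
]
totals = project_usage(events)
```

The key design point is that every usage event carries an issue identifier at the time of the call, so totals by issue, agent, or day are just different folds over the same event stream.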


What the rig does that the others do not:

  1. Agent-to-agent review: None of the platforms compared here ships independent review as a separate agent. The rig’s Review-E catches logic errors, missing docs, and pattern violations before human review.
  2. Per-issue cost attribution: The rig knows which issues cost $0.40 and which cost $12. This is useful for identifying runaway epics and calibrating complexity estimates.
  3. Cross-session persistent memory: Memory persists across pod restarts, agent upgrades, and weeks of operation. The rig doesn’t re-learn the same gotchas.
  4. Operator-defined governance: CODEOWNERS + branch protection gives operators surgical control over which paths require human approval. None of the platforms compared here offers this level of configurability.
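
The governance item above can be illustrated with a small path-matching sketch. It assumes a simplified rule list rather than a real CODEOWNERS parser: `RULES`, `HUMAN_OWNERS`, `owners_for`, and `needs_human_approval` are hypothetical, and `fnmatchcase` only approximates the gitignore-style patterns that CODEOWNERS actually uses. The one behavior taken from GitHub is that the last matching rule wins.

```python
from fnmatch import fnmatchcase

# Hypothetical rules in file order: (pattern, owners). As in GitHub's
# CODEOWNERS, the last matching rule takes precedence. fnmatchcase is a
# simplification of the real gitignore-style pattern syntax.
RULES = [
    ("*", ["review-e"]),
    ("deploy/*", ["@platform-team"]),
    ("docs/*", ["review-e"]),
]

# Hypothetical set of owners that are humans rather than agents.
HUMAN_OWNERS = {"@platform-team"}


def owners_for(path, rules=RULES):
    """Return the owners from the last rule whose pattern matches the path."""
    owners = []
    for pattern, rule_owners in rules:
        if fnmatchcase(path, pattern):
            owners = rule_owners
    return owners


def needs_human_approval(changed_paths):
    """True if any changed path is owned by at least one human owner."""
    return any(
        HUMAN_OWNERS.intersection(owners_for(path)) for path in changed_paths
    )


code_only = needs_human_approval(["src/main.py"])
touches_deploy = needs_human_approval(["src/main.py", "deploy/flux/app.yaml"])
```

With rules like these, a change confined to application code could be approved agent-to-agent, while anything touching deployment manifests would wait for a human.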

Where the rig falls short:

  1. Single-provider: Anthropic-only today. Provider portability is a planned whitepaper topic but is not implemented.
  2. No browser/UI automation: Devin can navigate web UIs; the rig cannot. This limits it to code-and-config tasks.
  3. Operational complexity: Running k3s, Flux, Postgres, and a custom conductor requires real infrastructure expertise. Devin’s SaaS model is dramatically simpler to start.