Event-sourced agent state: Marten + Postgres in rig-conductor
Why event sourcing for an agent system?
An agent system has three properties that make event sourcing a natural fit:
- The “what happened” question matters as much as the “what is” question. When an issue stalls, you want to know: was it assigned? Did CI fail? Was Review-E called? A snapshot-only store answers “what is the state now” but not “what happened between creation and now.”
- Multiple consumers need the same facts. The cost projection, the stuck-detection service, the Discord notifier, and the merge gate all need to react to events. An event stream is a natural fan-out.
- Replay is essential for debugging. When the rig behaves unexpectedly, you can replay the event stream up to any point in time and inspect the state. No log-diving required.
Marten as the implementation
Marten is a .NET library that turns Postgres into a document and event store. Rig-conductor uses its event store features:
```csharp
// Appending an event (atomic, versioned)
session.Events.Append(
    issueStreamId,  // stream per issue
    current + 1,    // optimistic concurrency: the version the stream must have after this append
    new IssueAssignedEvent
    {
        IssueNumber = issue.Number,
        AgentId = agentId,
        AssignedAt = DateTimeOffset.UtcNow
    });
await session.SaveChangesAsync(); // commits atomically
```

If two callers race to claim the same issue, one succeeds (version N → N+1) and the other gets a `ConcurrencyException` when it saves and must retry from the latest state. This is the exclusivity guarantee.
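The race described above can be modeled without a database. The following is a minimal in-memory sketch, not Marten's API: `InMemoryStream`, `VersionConflictException`, `IssueAssigned`, and `Claims.TryClaim` are hypothetical names introduced for illustration. Unlike the Marten call, this model checks the version the caller last read, before the append.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for one event stream; not Marten's API.
public sealed class VersionConflictException : Exception
{
    public VersionConflictException(long expected, long actual)
        : base($"Expected stream version {expected}, but it is {actual}.") { }
}

public sealed class InMemoryStream
{
    private readonly List<object> _events = new();
    private readonly object _gate = new();

    // The version is simply the number of events appended so far.
    public long Version { get { lock (_gate) return _events.Count; } }

    // Appends only if the caller saw the latest version; otherwise rejects.
    public long Append(long versionSeen, object @event)
    {
        lock (_gate)
        {
            if (_events.Count != versionSeen)
                throw new VersionConflictException(versionSeen, _events.Count);
            _events.Add(@event);
            return _events.Count;
        }
    }
}

// Hypothetical event shape for the sketch.
public sealed record IssueAssigned(int IssueNumber, string AgentId);

public static class Claims
{
    // Returns true if this agent won the claim, false if another agent got there first.
    public static bool TryClaim(InMemoryStream stream, long versionSeen, int issue, string agentId)
    {
        try
        {
            stream.Append(versionSeen, new IssueAssigned(issue, agentId));
            return true;
        }
        catch (VersionConflictException)
        {
            return false; // the caller should reload the stream and decide again
        }
    }
}
```

If two agents both read version 0 and then call `TryClaim`, the first call moves the stream to version 1 and the second is rejected. The loser has to reload the stream, where it will see the assignment and can back off.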
Stream-per-issue design
Each GitHub issue gets its own Marten event stream, keyed by `{owner}/{repo}#{number}`. The stream accumulates all lifecycle events:
```
Stream: dashecorp/rig-agent-runtime#88
  [0]  ISSUE_APPROVED    2026-04-21T09:00:00Z
  [1]  ISSUE_ASSIGNED    2026-04-21T09:00:05Z  agent=dev-e-node
  [2]  HEARTBEAT         2026-04-21T09:10:00Z
  [3]  HEARTBEAT         2026-04-21T09:20:00Z
  [4]  AGENT_STUCK       2026-04-21T09:47:00Z  elapsed=47m
  [5]  ISSUE_UNASSIGNED  2026-04-21T09:47:01Z  (re-queued by StaleHeartbeatService)
  [6]  ISSUE_ASSIGNED    2026-04-21T09:50:00Z  agent=dev-e-node (new pod)
  [7]  BRANCH_CREATED    2026-04-21T09:51:30Z
  [8]  PR_CREATED        2026-04-21T10:15:00Z
  [9]  CI_PASSED         2026-04-21T10:22:00Z
  [10] REVIEW_ASSIGNED   2026-04-21T10:23:00Z
  [11] REVIEW_PASSED     2026-04-21T10:31:00Z
  [12] MERGED            2026-04-21T10:32:00Z
  [13] ISSUE_DONE        2026-04-21T10:32:05Z
```

This stream is the complete history of the issue. From it you can answer questions such as: How long did it take? How many times did it stall? What was the cost?
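As a rough illustration of how such questions reduce to simple passes over the stream, here is a small sketch. `RecordedEvent` and `StreamQuestions` are hypothetical names, and the event shape (a type name plus a timestamp) is an assumption for illustration, not rig-conductor's actual event classes. The cost question would additionally need token-usage events, which the listing above does not show.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical event shape for illustration: a type name plus a timestamp.
public sealed record RecordedEvent(string Type, DateTimeOffset At);

public static class StreamQuestions
{
    // "How long did it take?" Time from the first event in the stream to the last.
    public static TimeSpan Elapsed(IReadOnlyList<RecordedEvent> events) =>
        events.Count == 0 ? TimeSpan.Zero : events[events.Count - 1].At - events[0].At;

    // "How many times did it stall?" Number of AGENT_STUCK events.
    public static int StallCount(IEnumerable<RecordedEvent> events) =>
        events.Count(e => e.Type == "AGENT_STUCK");

    // Time between the first occurrences of two event types, or null if either is missing.
    public static TimeSpan? Between(IReadOnlyList<RecordedEvent> events, string from, string to)
    {
        var start = events.FirstOrDefault(e => e.Type == from);
        var end = events.FirstOrDefault(e => e.Type == to);
        if (start is null || end is null) return null;
        return end.At - start.At;
    }
}
```

Applied to the stream above, `Elapsed` gives 1 hour, 32 minutes, 5 seconds (09:00:00 to 10:32:05), `StallCount` gives 1, and `Between(events, "PR_CREATED", "CI_PASSED")` gives 7 minutes.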
Projections for read models
Marten’s projection system builds read models from the event stream. Rig-conductor has several:
| Projection | Read model | Purpose |
|---|---|---|
| `IssueProjection` | `IssueState` document | Current state per issue (assigned/stuck/done) |
| `TokenUsageProjection` | `TokenUsage` document | Token counts per issue per agent |
| `CostProjection` | `DailyCost` document | Aggregated spend by date, agent, repo |
| `AgentStatusProjection` | `AgentStatus` document | Last heartbeat per agent instance |
Most projections run inline: they are updated synchronously, in the same transaction as the event append, so reads are low-latency and never lag behind the stream. The cost projections instead run asynchronously in Marten’s async daemon, because they aggregate across many streams.
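Conceptually, a single-stream projection is a fold: start from an initial read model and apply each event in order. The sketch below shows that idea in plain C#, without Marten. The event records, the `Status` strings other than assigned/stuck/done, and the `Apply`/`Project` helpers are assumptions for illustration, not rig-conductor's actual `IssueProjection`.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical event types; names mirror the lifecycle events in the stream listing.
public abstract record IssueEvent;
public sealed record IssueAssigned(string AgentId) : IssueEvent;
public sealed record AgentStuck : IssueEvent;
public sealed record IssueUnassigned : IssueEvent;
public sealed record IssueDone : IssueEvent;

// Hypothetical read model: the current status and the owning agent, if any.
public sealed record IssueState(string Status, string? AgentId)
{
    public static readonly IssueState Initial = new("approved", null);
}

public static class IssueProjection
{
    // One step of the fold: previous state plus one event gives the next state.
    public static IssueState Apply(IssueState state, IssueEvent e) => e switch
    {
        IssueAssigned a => state with { Status = "assigned", AgentId = a.AgentId },
        AgentStuck      => state with { Status = "stuck" },
        IssueUnassigned => state with { Status = "queued", AgentId = null },
        IssueDone       => state with { Status = "done" },
        _               => state // unknown events leave the read model unchanged
    };

    // Folds a whole stream into the current read model.
    public static IssueState Project(IEnumerable<IssueEvent> events) =>
        events.Aggregate(IssueState.Initial, Apply);
}
```

The inline versus async choice does not change this logic, only when it runs: inline applies the step as part of the append's transaction, while the async daemon applies it shortly afterwards in the background. In Marten that choice is expressed when the projection is registered, through `ProjectionLifecycle.Inline` or `ProjectionLifecycle.Async`.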
What you get for free
- Full audit trail: every state change is permanent and timestamped
- Time-travel debugging: replay a stream to any version to inspect intermediate state
- Parallel read models: add a new projection without changing the event append path
- Atomic exclusivity: optimistic concurrency on append prevents double-assignment
- Postgres reliability: one backing store, standard ops tooling (pg_dump, replicas)
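Time-travel debugging amounts to folding only a prefix of the stream. The helper below is a generic in-memory sketch; `TimeTravel.ReplayTo` is a hypothetical name, not a Marten call. Marten offers a comparable capability through the optional `version` argument of `AggregateStreamAsync<T>`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TimeTravel
{
    // Rebuilds state from only the first `version` events of a stream.
    public static TState ReplayTo<TState, TEvent>(
        IEnumerable<TEvent> events,
        int version,
        TState initial,
        Func<TState, TEvent, TState> apply)
    {
        if (version < 0) throw new ArgumentOutOfRangeException(nameof(version));
        return events.Take(version).Aggregate(initial, apply);
    }
}
```

For example, replaying the first five events of the stream shown earlier with an `apply` function that just keeps the latest event type ends on `AGENT_STUCK`, which is the state the issue was in right before it was re-queued.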
What you pay
- Schema migration complexity: changing an event’s shape requires handling old versions in deserializers (upcasting). Marten supports this, but it requires discipline.
- Append-only cost: you cannot “edit” history. Corrections require a compensating event.
- Query patterns: “give me all issues assigned to dev-e-node in the last hour” requires a projection or a cross-stream query. Pure event sourcing doesn’t do ad-hoc queries cheaply.
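To make the upcasting cost concrete, here is a minimal sketch of the idea: old events stay as stored, and a small function lifts them into the current shape when they are read. The V1/V2 records, the `PodName` field, and the `Upcasters` class are all hypothetical, and this is plain C#, not Marten's upcasting API.

```csharp
#nullable enable
using System;

// Hypothetical old and new shapes of the same event.
public sealed record IssueAssignedV1(int IssueNumber, string AgentId);
public sealed record IssueAssignedV2(int IssueNumber, string AgentId, string? PodName);

public static class Upcasters
{
    // Old events carry no pod name, so the upcast fills in null rather than guessing.
    public static IssueAssignedV2 Upcast(IssueAssignedV1 old) =>
        new(old.IssueNumber, old.AgentId, PodName: null);

    // Normalizes whatever version was stored into the current shape.
    public static IssueAssignedV2 ToCurrent(object stored) => stored switch
    {
        IssueAssignedV2 current => current,
        IssueAssignedV1 old => Upcast(old),
        _ => throw new ArgumentException($"Unsupported event type {stored.GetType().Name}")
    };
}
```

The discipline lies in keeping every old shape readable indefinitely: projections and handlers only ever see `IssueAssignedV2`, but the V1 record and its upcast function have to stay in the codebase for as long as V1 events exist in the store.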
For the rig’s use case (60–100 issues/day, a required full audit trail, and 6 agents producing structured lifecycle events), the tradeoffs clearly favor Marten. A CRUD store that updates a status column in place would be simpler to query, but it could not answer “what happened and when” without a separate audit log.