The Log is the Agent: Event-Sourced Reactive Graphs for Auditable, Forkable Agentic Systems

Yohei Nakajima

arxiv: 2605.21997 · v1 · pith:UAWB6CU6new · submitted 2026-05-21 · 💻 cs.AI · cs.MA

Pith reviewed 2026-05-22 06:22 UTC · model grok-4.3

classification 💻 cs.AI cs.MA

keywords: event sourcing · reactive graphs · agentic systems · deterministic replay · forkable agents · auditability · causal lineage · AI agent frameworks

The pith

Centering an append-only event log as the source of truth lets agent systems replay runs, fork cheaply at any point, and trace full lineage from goal to action.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that most agent frameworks build around language model loops and treat logging as an afterthought, but inverting the order makes the event log primary. The working graph becomes a deterministic projection of that log, and ordinary behaviors react to graph changes by emitting new events. Coordination occurs only through the shared graph with no direct commands between components. If this holds, agent executions gain deterministic replay from the log, low-cost forking that skips re-execution of common history, and complete causal chains from high-level goals down to specific model calls.
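The arrangement described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not ActiveGraph's actual API: the names `Event`, `project`, `plan_on_goal`, and `run` are hypothetical, and the graph is reduced to a dictionary of nodes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Event:
    """One immutable log entry (hypothetical shape, not the paper's schema)."""
    kind: str      # "node_added" or "node_updated"
    node_id: str
    data: dict


Graph = Dict[str, dict]  # node_id -> attributes


def project(log: List[Event]) -> Graph:
    """Pure fold of the log into a graph: no I/O, clock, or randomness."""
    graph: Graph = {}
    for event in log:
        if event.kind == "node_added":
            graph[event.node_id] = dict(event.data)
        elif event.kind == "node_updated":
            graph[event.node_id] = {**graph[event.node_id], **event.data}
    return graph


Behavior = Callable[[Event, Graph], List[Event]]


def plan_on_goal(event: Event, graph: Graph) -> List[Event]:
    """Reacts to a new goal node by emitting a task node; it calls no other behavior."""
    if event.kind == "node_added" and event.data.get("type") == "goal":
        task_id = f"task-for-{event.node_id}"
        return [Event("node_added", task_id, {"type": "task", "status": "open"})]
    return []


def run(log: List[Event], behaviors: List[Behavior]) -> List[Event]:
    """Deliver each event to every behavior and append whatever they emit."""
    cursor = 0
    while cursor < len(log):
        event = log[cursor]
        # Re-projecting the prefix at every step is quadratic; done here only for clarity.
        graph = project(log[: cursor + 1])
        for behavior in behaviors:
            log.extend(behavior(event, graph))
        cursor += 1
    return log


log = run([Event("node_added", "goal-1", {"type": "goal", "text": "assess vendor"})],
          [plan_on_goal])
graph = project(log)
```

The only way `plan_on_goal` influences the rest of the system is by returning events, which is the sense in which coordination happens through the shared graph rather than through direct commands.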

Core claim

By making the append-only event log the source of truth, deriving the working graph deterministically from it, and letting behaviors react to graph changes without direct inter-component instructions, the system supplies deterministic replay of any run, cheap forking that branches at any event without re-running the shared prefix, and end-to-end lineage from a high-level goal to each individual model call and artifact produced.

What carries the argument

The event-sourced reactive graph, in which the append-only log is the sole source of truth and the working graph is its deterministic projection, with behaviors reacting to graph updates and emitting events.

If this is right

Any complete run can be replayed deterministically from its log alone without access to the original environment.
A new run can be forked from any past event in the log without re-executing the preceding shared history.
Every artifact and decision carries an explicit causal chain back to the original high-level goal through the event sequence.
The full execution history remains auditable and reconstructible even after forking or later modifications.
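Forking and lineage, as listed above, can be illustrated with a small sketch. It assumes each event records the id of the event that caused it; the `cause` field and the `fork` and `lineage` helpers are hypothetical names chosen for illustration, not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    event_id: int
    kind: str
    payload: str
    cause: Optional[int] = None  # id of the event that triggered this one


def fork(log: Tuple[Event, ...], at_event_id: int) -> Tuple[Event, ...]:
    """Return the shared prefix up to and including the given event.

    Nothing is re-executed: the prefix events are reused as-is. A tuple slice
    still copies references; a persistent data structure could avoid that.
    """
    for index, event in enumerate(log):
        if event.event_id == at_event_id:
            return log[: index + 1]
    raise KeyError(at_event_id)


def lineage(log: Tuple[Event, ...], event_id: int) -> List[Event]:
    """Walk the cause links back to the root and return the chain root-first."""
    by_id = {event.event_id: event for event in log}
    chain = []
    current: Optional[Event] = by_id[event_id]
    while current is not None:
        chain.append(current)
        current = by_id[current.cause] if current.cause is not None else None
    return chain[::-1]


main = (
    Event(1, "goal", "assess vendor"),
    Event(2, "task", "collect filings", cause=1),
    Event(3, "model_call", "summarize filings", cause=2),
    Event(4, "artifact", "summary v1", cause=3),
)

# Branch after event 2 and take a different path from there.
branch = fork(main, 2) + (Event(5, "task", "interview references", cause=2),)

print([e.kind for e in lineage(main, 4)])
print([e.event_id for e in lineage(branch, 5)])
```

Because the original log is never mutated, `main` stays fully reconstructible after the branch is created.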

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same log could serve as a persistent record for regulatory review or post-hoc analysis of agent decisions in production settings.
Multiple teams or agents could collaborate by forking from a common log prefix and later merging selected branches back into a main history.
Self-improvement loops might analyze successful log paths and automatically generate new behaviors or prompts that are tested on forked copies.
Debugging becomes a matter of inspecting the log rather than reconstructing state from scattered memory stores.

Load-bearing premise

All coordination among components occurs solely through updates to a shared graph, with no direct instructions between them, and a determinism contract exists that keeps replay sound even when language models or other non-deterministic elements are involved.

What would settle it

Implement the system with an LLM call inside a behavior, record the log, then replay the identical log and check whether the output artifacts and graph state match the original run exactly; any mismatch would show the determinism contract fails.
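A toy version of this experiment is sketched below. It makes several assumptions: the model is replaced by a stub (`flaky_model`) that deliberately returns a different string on every call, the log is a list of plain dictionaries, and equality is checked with a SHA-256 digest over a canonical JSON serialization of the log rather than over a real graph state. All names are hypothetical.

```python
import hashlib
import itertools
import json
from typing import List, Optional

_call_counter = itertools.count()


def flaky_model(prompt: str) -> str:
    """Stand-in for an LLM: the output changes on every invocation."""
    return f"{prompt} -> draft {next(_call_counter)}"


def execute(goal: str, recorded: Optional[List[dict]] = None) -> List[dict]:
    """Run one LLM-backed behavior.

    When `recorded` is given, the logged response is substituted and the
    model is not invoked; otherwise the model is called and its output logged.
    """
    log = [{"kind": "goal", "text": goal}]
    prompt = f"Summarize: {goal}"
    if recorded is None:
        response = flaky_model(prompt)
    else:
        response = next(
            e["response"] for e in recorded
            if e["kind"] == "llm_response" and e["prompt"] == prompt
        )
    log.append({"kind": "llm_response", "prompt": prompt, "response": response})
    log.append({"kind": "artifact", "text": response.upper()})
    return log


def digest(log: List[dict]) -> str:
    """Hash a canonical serialization so two runs can be compared exactly."""
    canonical = json.dumps(log, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


original = execute("vendor risk")
replayed = execute("vendor risk", recorded=original)   # substitutes logged output
reinvoked = execute("vendor risk")                      # calls the model again

print(digest(original) == digest(replayed))
print(digest(original) == digest(reinvoked))
```

With this stub, the substituted replay matches the original digest and the re-invoked run does not, which is the kind of mismatch the test above is designed to expose.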

Figures

Figures reproduced from arXiv: 2605.21997 by Yohei Nakajima.

Original abstract

Most agent frameworks are built around the language model: a conversation loop comes first, then tools, then rules, and finally a logging layer bolted on for observability, with state persisted as retrievable "memory." We describe ActiveGraph, a runtime that inverts this arrangement. The append-only event log is the source of truth; the working graph is a deterministic projection of that log; and behaviors--ordinary functions, classes, LLM-backed routines, or logic attached to typed edges--react to changes in the graph and emit new events. No component instructs another; coordination happens entirely through the shared graph. This single design decision yields three properties that retrieval-and-summarization memory systems do not provide: deterministic replay of any run from its log, cheap forking that branches a run at any event without re-executing the shared prefix, and end-to-end lineage from a high-level goal down to the individual model call that produced each artifact. We present the architecture, a determinism contract that makes replay sound, and a worked diligence example whose full causal structure is reconstructable from the log alone. We discuss--without claiming to demonstrate--why this substrate is unusually well suited to self-improving agents, and how it extends the BabyAGI lineage and prior graph-memory research.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper centers agent systems on an append-only event log with reactive graph projections to enable replay, forking, and lineage, but the determinism contract for LLM calls needs explicit enforcement details to hold up.

The core move here is treating the event log as the agent itself rather than bolting logging onto an LLM loop. The graph becomes a deterministic projection, and behaviors react to changes by emitting events, with all coordination routed through the shared structure instead of direct calls between components. This setup is positioned to deliver replay from the log, cheap forking at any event, and full lineage back to individual model calls.

Referee Report

1 major / 2 minor

Summary. The paper proposes ActiveGraph, an event-sourced reactive graph runtime for agentic systems that inverts conventional designs by treating an append-only event log as the source of truth. The working graph is defined as a deterministic projection of this log, with behaviors (including LLM-backed routines) reacting to graph changes and emitting events without direct inter-component commands. This architecture is claimed to deliver three properties absent from retrieval-and-summarization memory systems: deterministic replay of any run from its log, cheap forking that branches at any event without re-executing the shared prefix, and end-to-end lineage from high-level goals to individual model calls. The manuscript presents the architecture, a determinism contract for replay soundness, and a worked diligence example whose causal structure is reconstructable from the log.

Significance. If the determinism contract is shown to enforce exact capture of LLM prompts and responses (preventing replay divergence), the design would provide stronger auditability, reproducibility, and forkability than existing agent frameworks. The inversion of log-as-source-of-truth plus reactive behaviors offers a clean substrate for self-improving agents and extends prior graph-memory and BabyAGI work. The worked example supplies a concrete demonstration of lineage reconstruction, which is a positive step toward falsifiable validation.

major comments (1)

[Determinism contract description] The determinism contract is invoked to guarantee sound replay, but the manuscript supplies no explicit invariant, equation, or substitution rule showing how LLM stochasticity is handled. For replay to remain deterministic, every model call must emit its exact prompt and response as an immutable event so that re-execution substitutes the logged output rather than re-invoking the model; without this specification, temperature > 0 or API non-determinism would break the projection and therefore the replay, forking, and lineage claims.

minor comments (2)

[Abstract] The abstract states that 'this single design decision yields three properties' but does not quantify the overhead of maintaining the reactive graph projection or the storage cost of the append-only log; adding a brief complexity discussion would strengthen the practical claims.
[Discussion] The discussion of suitability for self-improving agents is explicitly labeled as non-demonstrative; moving the relevant paragraphs to a dedicated 'Future Directions' subsection would improve readability.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading and constructive feedback. The major comment identifies a genuine gap in the explicit formulation of the determinism contract, which we address by expanding the manuscript.

Referee: The determinism contract is invoked to guarantee sound replay, but the manuscript supplies no explicit invariant, equation, or substitution rule showing how LLM stochasticity is handled. For replay to remain deterministic, every model call must emit its exact prompt and response as an immutable event so that re-execution substitutes the logged output rather than re-invoking the model; without this specification, temperature > 0 or API non-determinism would break the projection and therefore the replay, forking, and lineage claims.

Authors: We agree with the referee that the determinism contract must be stated with greater formality to rigorously support the replay, forking, and lineage claims. The original manuscript describes the contract in Section 3.2 in prose, requiring that LLM-backed behaviors emit events containing the exact prompt and response. However, it does not supply an explicit invariant or substitution rule. In the revised version we have added the following statement: The determinism invariant requires that every LLM behavior B, when invoked on prompt p, appends an immutable event of the form LLMResponse(prompt=p, response=r) where r is the precise model output; the projection operator then substitutes the logged r on replay without re-invoking the model. This substitution rule ensures that temperature > 0 or API nondeterminism cannot alter the projected graph state. We have also included a short argument that the resulting projection remains a deterministic function of the log alone. These additions directly address the concern without altering the core architecture.

revision: yes
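One way the invariant and the accompanying argument might be written down is sketched here. The notation (the transition function \delta, the projection \pi, and the log L_n) is introduced purely for illustration and is an assumption, not the manuscript's own formalism.

```latex
% Illustrative notation only; these symbols are assumptions, not the paper's.
\begin{align*}
  L_n &= \langle e_1, \dots, e_n \rangle
      && \text{append-only log} \\
  G_0 &= \varnothing, \qquad G_i = \delta(G_{i-1}, e_i)
      && \text{pure transition function } \delta \\
  \pi(L_n) &= G_n
      && \text{projection of the log} \\
  e_i = \mathrm{LLMResponse}(p, r) &\;\Rightarrow\; \text{replay uses the logged } r
      && \text{substitution rule} \\
  \pi(L_k \cdot L') &= \mathrm{fold}\bigl(\delta, \pi(L_k), L'\bigr)
      && \text{fork at event } k
\end{align*}
```

Under these assumptions the argument is a short induction: G_0 is fixed, and each G_i depends only on G_{i-1} and e_i, so G_n is a function of L_n alone. Since the model output r is carried inside the event, sampling temperature never enters \delta. The last line expresses why a fork at event k can reuse \pi(L_k) instead of re-running the shared prefix.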

Circularity Check

0 steps flagged

No circularity: properties follow directly from event-log architecture

The paper derives deterministic replay, cheap forking, and end-to-end lineage from the single architectural choice that the append-only event log is the source of truth, the graph is its deterministic projection, and behaviors react to graph changes without direct inter-component commands. No equations, fitted parameters, or self-referential definitions appear in the provided text; the determinism contract is presented as an explicit part of the architecture to ensure replay soundness rather than being assumed or reduced to the target properties. There are no load-bearing self-citations, uniqueness theorems imported from prior author work, or ansatzes smuggled via citation. The derivation is self-contained within the described design and does not reduce to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The central claim rests on standard concepts of event sourcing and reactive systems applied to agents, without new fitted parameters or ungrounded physical entities; the main additions are the proposed runtime and the determinism contract.

axioms (2)

domain assumption: Coordination happens entirely through the shared graph; no component instructs another.
Invoked directly in the description of how behaviors react to graph changes and emit events.

domain assumption: A determinism contract exists that makes replay sound.
Stated as necessary for the replay property to hold reliably.

invented entities (1)

ActiveGraph runtime (no independent evidence)
purpose: Implements the event-sourced reactive graph for agentic systems with the listed properties.
New system introduced by the paper to realize the inverted architecture.

pith-pipeline@v0.9.0 · 5757 in / 1705 out tokens · 94616 ms · 2026-05-22T06:22:44.949476+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "The append-only event log is the source of truth; the working graph is a deterministic projection of that log; and behaviors react to changes in the graph and emit new events."

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "A determinism contract and replay mechanism that makes any run byte-reproducible from its log, including a content-addressed cache that records model and tool responses"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

6 extracted references · 6 canonical work pages · 3 internal anchors

[1] Y. Nakajima. Task-driven Autonomous Agent Utilizing GPT-4, Pinecone, and LangChain for Diverse Applications (BabyAGI). 2023. https://github.com/yoheinakajima/babyagi
[2] C. Packer, S. Wooders, K. Lin, V. Fang, S. G. Patil, I. Stoica, and J. E. Gonzalez. MemGPT: Towards LLMs as Operating Systems. arXiv:2310.08560, 2023.
[3] P. Rasmussen, P. Paliychuk, T. Beauvais, J. Ryan, and D. Chalef. Zep: A Temporal Knowledge Graph Architecture for Agent Memory. arXiv:2501.13956, 2025.
[4] P. Chhikara, D. Khant, S. Aryan, T. Singh, and D. Yadav. Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory. arXiv:2504.19413, 2025.
[5] C. Latimer, N. Boschi, A. Neeser, C. Bartholomew, G. Srivastava, X. Wang, and N. Ramakrishnan. Hindsight is 20/20: Building Agent Memory that Retains, Recalls, and Reflects. arXiv:2512.12818, 2025.
[6] H. P. Nii. The Blackboard Model of Problem Solving and the Evolution of Blackboard Architectures. AI Magazine, 7(2):38–53, 1986.
