S-Bus: Automatic Read-Set Reconstruction for Multi-Agent LLM State Coordination
Pith reviewed 2026-05-20 15:30 UTC · model grok-4.3
The pith
S-Bus uses a server-side DeliveryLog to automatically reconstruct each agent's read set from HTTP GET operations and enforce Observable-Read Isolation that prevents structural race conditions during shared-shard collaboration.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The DeliveryLog mechanism automatically reconstructs each agent's read set from HTTP GET operations and provides Observable-Read Isolation that prevents structural race conditions when agents collaborate via shared shards, supported by machine-checked proofs in TLAPS, exhaustive TLC model checking at N=3, Dafny lemmas, and empirical parity with PostgreSQL SERIALIZABLE and Redis WATCH/MULTI showing zero Type-I corruptions across 427,308 conflicts.
What carries the argument
The DeliveryLog, a per-agent log of HTTP GET operations that reconstructs the read set at commit time without agent SDK changes under HTTP/1.1.
If this is right
- Agents can safely share state shards without introducing silent output corruptions.
- No changes to agent code or frameworks are required to obtain the isolation guarantee.
- The mechanism achieves zero Type-I errors on contention workloads that produce hundreds of thousands of HTTP-409 conflicts.
- Observable-Read Isolation remains neutral for dedicated-shard topologies but becomes harmful when preservation of concurrent contradictions occurs in single-shard writing.
Where Pith is reading between the lines
- The same read-set reconstruction idea could be applied to other observable request protocols beyond HTTP/1.1.
- Frameworks that currently rely on explicit locks or version checks might simplify their coordination layer if they adopt an observable-traffic log.
- Larger-scale empirical runs with agent counts well beyond the N=3 model would be a direct next measurement to check whether the zero-corruption result holds at production scale.
Load-bearing premise
The approach assumes agents communicate exclusively via observable HTTP/1.1 traffic and that the workload uses dedicated shards rather than single-shard collaborative writing.
What would settle it
Observation of even one Type-I corruption during a shared-shard contention run that exceeds the scale of the TLC-checked model or a counter-example to the retained typing axiom in the TLAPS proof.
read the original abstract
Concurrent LLM agents sharing mutable natural-language state produce Structural Race Conditions (SRCs): write-write and cross-shard stale-read conflicts that silently corrupt agent output. Existing multi-agent frameworks (LangGraph, CrewAI, AutoGen) provide no write-ownership semantics over shared state. We present S-Bus, an HTTP middleware whose central mechanism is a server-side DeliveryLog: a per-agent log of HTTP GET operations that automatically reconstructs each agent's read set at commit time without agent SDK changes under HTTP/1.1. The consistency property the DeliveryLog provides -- Observable-Read Isolation (ORI), a partial causal consistency over the HTTP-observable projection of the read set -- prevents structural race conditions when agents collaborate via shared shards. Three contributions: (C1) The DeliveryLog mechanism for automatic HTTP-traffic-based read-set reconstruction, with three-tier mechanised evidence: ReadSetSoundness and ORICommitSafety machine-checked in TLAPS (modulo one retained typing axiom); exhaustive TLC at N=3 (20,763,484 distinct states, zero violations); Dafny discharges 9 inductive soundness lemmas. (C2) Empirical structural-conflict prevention parity against PostgreSQL 17 SERIALIZABLE and Redis 7 WATCH/MULTI on shared-shard contention sweeps with 427,308 active HTTP-409 conflicts: zero Type-I corruptions across all three backends. (C3) ORI's operating envelope is topology-conditional: semantically neutral in dedicated-shard workloads; harmful in single-shard collaborative writing because preservation propagates concurrent contradictions. Source code: https://github.com/sajjadanwar0/sbus
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces S-Bus, an HTTP middleware for multi-agent LLM state coordination. Its central mechanism is a server-side DeliveryLog that automatically reconstructs each agent's read set from HTTP/1.1 GET operations without SDK changes. This enables Observable-Read Isolation (ORI), a partial causal consistency property over the observable read-set projection, which the authors claim prevents structural race conditions (write-write and cross-shard stale reads) on shared shards. Evidence includes TLAPS machine-checked proofs of ReadSetSoundness and ORICommitSafety (modulo one typing axiom), exhaustive TLC model checking at N=3 (20,763,484 states, zero violations), 9 Dafny inductive lemmas, and empirical parity with PostgreSQL SERIALIZABLE and Redis WATCH/MULTI (zero Type-I corruptions across 427,308 conflicts). The paper also analyzes ORI's topology-conditional envelope: neutral for dedicated shards but harmful for single-shard collaborative writing.
Significance. If the formal results and empirical findings hold, the work provides a practical, low-friction approach to state isolation in multi-agent LLM systems that avoids modifying agent code. The combination of machine-checked TLAPS proofs, exhaustive TLC checking, Dafny lemmas, and reproducible empirical comparison to established database mechanisms is a notable strength. The explicit discussion of when ORI becomes harmful adds useful nuance to the operating envelope.
major comments (3)
- [TLC Model Checking] TLC Model Checking section: exhaustive checking is performed only at N=3 (20,763,484 states, zero violations). Given the exponential growth of the state space, this leaves open the possibility of undetected violations of ReadSetSoundness or ORICommitSafety for N>3 or in cross-shard interleavings; the central claim that ORI prevents SRCs in general workloads therefore rests on an incomplete coverage argument.
- [TLAPS Proofs] TLAPS Proofs section: ReadSetSoundness and ORICommitSafety are machine-checked only modulo one retained typing axiom. The manuscript should either discharge the axiom or provide a concrete argument that it does not affect the isolation guarantees, as this directly impacts the soundness of the mechanized evidence.
- [Operating Envelope] Operating Envelope section: the claim that ORI is semantically neutral in dedicated-shard workloads but harmful in single-shard collaborative writing is stated qualitatively; no quantitative boundary conditions or workload characterization are supplied to delineate when preservation of concurrent contradictions becomes load-bearing.
minor comments (2)
- [Abstract] The abstract states 'three contributions' but the body does not explicitly label C1–C3 in the same order or with matching headings; adding explicit cross-references would improve readability.
- [Evaluation] Figure captions for the empirical conflict sweeps should include the exact number of runs and the definition of 'Type-I corruption' to allow direct replication.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and for recognizing the strengths of our multi-tier formal evidence and empirical comparisons. We address each major comment point by point below.
read point-by-point responses
-
Referee: [TLC Model Checking] TLC Model Checking section: exhaustive checking is performed only at N=3 (20,763,484 states, zero violations). Given the exponential growth of the state space, this leaves open the possibility of undetected violations of ReadSetSoundness or ORICommitSafety for N>3 or in cross-shard interleavings; the central claim that ORI prevents SRCs in general workloads therefore rests on an incomplete coverage argument.
Authors: We agree that exhaustive TLC checking at N=3, while covering 20,763,484 states with zero violations, cannot exhaustively rule out violations at larger N due to state-space explosion. The TLAPS proofs of ReadSetSoundness and ORICommitSafety are parameterized over arbitrary N and are not restricted to small models; the model checking validates the specification at a tractable scale. We will revise the manuscript to explicitly discuss this complementarity and to state the coverage limitations of model checking for N>3. revision: partial
-
Referee: [TLAPS Proofs] TLAPS Proofs section: ReadSetSoundness and ORICommitSafety are machine-checked only modulo one retained typing axiom. The manuscript should either discharge the axiom or provide a concrete argument that it does not affect the isolation guarantees, as this directly impacts the soundness of the mechanized evidence.
Authors: The retained typing axiom asserts well-typedness of DeliveryLog operations in the TLA+ encoding. We will add a concrete argument in the revised manuscript showing that the axiom is orthogonal to the isolation properties: it only concerns syntactic validity of read-set reconstruction and does not affect the causal-ordering or commit-safety invariants, which are proved independently of it. This argument preserves the soundness of the mechanized evidence. revision: yes
-
Referee: [Operating Envelope] Operating Envelope section: the claim that ORI is semantically neutral in dedicated-shard workloads but harmful in single-shard collaborative writing is stated qualitatively; no quantitative boundary conditions or workload characterization are supplied to delineate when preservation of concurrent contradictions becomes load-bearing.
Authors: We agree the current discussion is qualitative. In the revision we will augment the Operating Envelope section with quantitative indicators drawn from the empirical evaluation (agent counts, conflict rates, and measured overhead) that delineate when preservation of concurrent contradictions becomes load-bearing, e.g., performance impact thresholds observed in single-shard versus dedicated-shard configurations. revision: yes
Circularity Check
No significant circularity; claims rest on external formal verification and independent benchmarks
full rationale
The paper presents the DeliveryLog as a novel construction for HTTP-based read-set reconstruction and defines Observable-Read Isolation as its consistency property. Soundness is established via machine-checked proofs in TLAPS (ReadSetSoundness, ORICommitSafety), exhaustive TLC model checking at N=3, and Dafny inductive lemmas, plus direct empirical parity with PostgreSQL SERIALIZABLE and Redis WATCH/MULTI on 427k conflicts. These are independent external artifacts rather than self-citations or fitted parameters. No self-definitional equations, renamed predictions, or load-bearing author citations appear in the derivation chain. The topology-conditional operating envelope is stated explicitly as an assumption, not derived from prior results.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math One retained typing axiom in the TLAPS proof of ReadSetSoundness and ORICommitSafety
invented entities (2)
-
DeliveryLog
no independent evidence
-
Observable-Read Isolation (ORI)
no independent evidence
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/AbsoluteFloorClosure.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
DeliveryLog mechanism for automatic HTTP-traffic-based read-set reconstruction... Observable-Read Isolation (ORI)... READSETSOUNDNESS and ORICOMMITSAFETY machine-checked in TLAPS
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
ORI prevents G2-item over Robs... partial causal consistency over the HTTP-observable projection
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1]
- [2]
-
[3]
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
Q. Wu et al., “AutoGen,” arXiv:2308.08155, 2023
work page internal anchor Pith review Pith/arXiv arXiv 2023
- [4]
- [5]
- [6]
- [7]
- [8]
-
[9]
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
O. Khattab et al., “DSPy,” arXiv:2310.03714, 2023
work page internal anchor Pith review Pith/arXiv arXiv 2023
- [10]
-
[11]
Why do multi-agent LLM systems fail?
M. Cemri et al., “Why do multi-agent LLM systems fail?” 2025
work page 2025
-
[12]
Generative Agents: Interactive Simulacra of Human Behavior,
J. S. Park et al., “Generative Agents: Interactive Simulacra of Human Behavior,” UIST, 2023
work page 2023
-
[13]
Managing Update Conflicts in Bayou,
D. Terry et al., “Managing Update Conflicts in Bayou,” SOSP, 1995
work page 1995
-
[14]
MemGPT: Towards LLMs as Operating Systems
C. Packer et al., “MemGPT,” arXiv:2310.08560, 2023
work page internal anchor Pith review Pith/arXiv arXiv 2023
- [15]
- [16]
-
[17]
Making Geo-Replicated Systems Fast as Possible, Consistent when Necessary,
C. Li et al., “Making Geo-Replicated Systems Fast as Possible, Consistent when Necessary,” OSDI, 2012
work page 2012
-
[18]
https://github.com/sajjadanwar0/abus
-
[19]
Transactional Storage for Geo-Replicated Systems,
Y . Sovran et al., “Transactional Storage for Geo-Replicated Systems,” SOSP, 2011
work page 2011
-
[20]
Serializable Snapshot Isolation in PostgreSQL,
D. Ports & K. Grittner, “Serializable Snapshot Isolation in PostgreSQL,” VLDB, 2012
work page 2012
- [21]
- [22]
-
[23]
Coordination Avoidance in Database Systems,
P. Bailis et al., “Coordination Avoidance in Database Systems,” VLDB, 2014
work page 2014
- [24]
-
[25]
Time, Clocks, and the Ordering of Events in a Distributed System,
L. Lamport, “Time, Clocks, and the Ordering of Events in a Distributed System,” CACM, 21(7), 1978
work page 1978
- [26]
-
[27]
J. R. Wilcox et al., “Verdi,” PLDI, 2015
work page 2015
- [28]
- [29]
-
[30]
Speedy Transactions in Multicore In-Memory Databases,
S. Tu et al., “Speedy Transactions in Multicore In-Memory Databases,” SOSP, 2013
work page 2013
-
[31]
Hekaton: SQL Server’s Memory-Optimized OLTP Engine,
C. Diaconu et al., “Hekaton: SQL Server’s Memory-Optimized OLTP Engine,” SIGMOD, 2013
work page 2013
- [32]
- [33]
-
[34]
P. Bernstein & N. Goodman, “Concurrency Control,” ACM Surv., 1981
work page 1981
-
[35]
H. T. Kung & J. T. Robinson, “OCC,” ACM TODS, 1981
work page 1981
- [36]
- [37]
- [38]
- [39]
- [40]
- [41]
- [42]
-
[43]
Loro Team, “Loro: Movable Tree CRDT,” https://loro.dev, 2024
work page 2024
- [44]
-
[45]
Hypertext Transfer Protocol (HTTP/1.1): Conditional Requests,
R. Fielding and J. Reschke, “Hypertext Transfer Protocol (HTTP/1.1): Conditional Requests,” IETF RFC 7232, June 2014
work page 2014
-
[46]
The Measurement of Observer Agreement for Categorical Data,
J. R. Landis and G. G. Koch, “The Measurement of Observer Agreement for Categorical Data,”Biometrics, vol. 33, no. 1, pp. 159–174, 1977
work page 1977
-
[47]
Temporal Technologies, “Workflow Update API,” Temporal documenta- tion, 2024. https://docs.temporal.io/workflows#update
work page 2024
-
[48]
A. Beamer et al., “FoundationDB Directory Layer,” FoundationDB doc- umentation, 2024. https://apple.github.io/foundationdb/developer-guide. html#directories
work page 2024
-
[49]
Letta: Stateful Agents Beyond Context Windows,
C. Packer et al., “Letta: Stateful Agents Beyond Context Windows,” GitHub, 2024. https://github.com/letta-ai/letta
work page 2024
-
[50]
Large Language Models are not Fair Evaluators
P. Wang et al., “Large Language Models are not Fair Evaluators,” arXiv:2305.17926, 2023
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[51]
Scalable Transactions across Heterogeneous NoSQL Key-Value Data Stores,
A. Dey, A. Fekete, R. Nambiar, U. Röhm, “Scalable Transactions across Heterogeneous NoSQL Key-Value Data Stores,”PVLDB, vol. 6, no. 12, pp. 1434–1439, 2013
work page 2013
-
[52]
AgentScope: A Flexible yet Robust Multi-Agent Platform,
D. Gao et al., “AgentScope: A Flexible yet Robust Multi-Agent Platform,” arXiv:2402.14034, 2024
-
[53]
V oyager: An Open-Ended Embodied Agent with Large Language Models,
G. Wang et al., “V oyager: An Open-Ended Embodied Agent with Large Language Models,”Transactions on Machine Learning Research, 2024
work page 2024
-
[54]
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering,
J. Yang et al., “SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering,”NeurIPS, 2024
work page 2024
-
[55]
Verus: Verifying Rust Programs using Linear Ghost Types,
A. Lattuada et al., “Verus: Verifying Rust Programs using Linear Ghost Types,”OOPSLA, 2023
work page 2023
-
[56]
Creusot: A Foundry for the Deductive Verification of Rust Programs,
X. Denis, J.-H. Jourdan, C. Marché, “Creusot: A Foundry for the Deductive Verification of Rust Programs,” inFormal Methods: 25th Intl. Symp., 2023
work page 2023
-
[57]
Boki: Stateful Serverless Computing with Shared Logs,
Z. Jia, E. Witchel, “Boki: Stateful Serverless Computing with Shared Logs,”SOSP, 2021
work page 2021
-
[58]
Language Primitives and Type Discipline for Structured Communication-Based Programming,
K. Honda, V . T. Vasconcelos, M. Kubo, “Language Primitives and Type Discipline for Structured Communication-Based Programming,”ESOP, 1998. APPENDIX The rubric used by both LLM judges in the Exp. PH-3 validation study (§7.9) is reproduced verbatim below. The prompt was frozen before observing inter-judge agreement and was not revised. You are a strict cod...
work page 1998
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.