CobbleDB: Modelling Levelled Storage by Composition
Pith reviewed 2026-05-10 18:52 UTC · model grok-4.3
The pith
Equivalence between store specifications allows composing performance features into a correct database backing store.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We present a composition-based approach to building correct-by-construction database backing stores. In previous work, we specified the behaviour of several store variants and proved their correctness and equivalence. Here, we derive a Java implementation: the simplicity of the specification makes manual construction straightforward. We leverage spec-guaranteed store equivalence to compose performance features, then demonstrate practical value with CobbleDB, a reimplementation of RocksDB's levelled storage.
What carries the argument
Spec-guaranteed store equivalence, used to compose performance features from prior verified variants while preserving overall correctness.
If this is right
- Database stores can be constructed correctly by composing previously verified variants.
- Performance features can be added modularly without re-verifying the entire system.
- A reimplementation of RocksDB's levelled compaction becomes possible with formal guarantees.
- Implementation effort reduces to faithful translation from the specification.
Where Pith is reading between the lines
- This composition method could apply to other storage architectures such as LSM trees.
- It opens a route to libraries of reusable, verified store components for custom designs.
- Developers could prototype and combine new performance features more rapidly than before.
Load-bearing premise
The manual Java implementation faithfully matches the specification and the prior equivalence proofs apply directly to the composed features.
What would settle it
Running CobbleDB against RocksDB on the same sequence of operations and observing different results on reads or writes, or finding a counterexample where the composed store violates one of the original specifications.
Figures
read the original abstract
We present a composition-based approach to building correctby-construction database backing stores. In previous work, we specified the behaviour of several store variants and proved their correctness and equivalence. Here, we derive a Java implementation: the simplicity of the specification makes manual construction straightforward. We leverage spec-guaranteed store equivalence to compose performance features, then demonstrate practical value with CobbleDB, a reimplementation of RocksDB's levelled storage.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a composition-based approach to building correct-by-construction database backing stores. Building on prior formal specifications and equivalence proofs for store variants, the authors manually derive a Java implementation of CobbleDB that reimplements RocksDB's levelled storage. They claim that spec simplicity makes the translation straightforward and leverage the proven equivalences to compose performance features while preserving correctness.
Significance. If the implementation faithfully realizes the specifications, the work offers a method for constructing reliable storage engines by composing proven components, which could reduce bugs in production systems like RocksDB. The practical demonstration strengthens the case for applying formal methods to database storage, though its impact depends on verification of the hand-written code.
major comments (2)
- [Implementation section] Implementation section (around the description of manual Java construction from the spec): The paper states that 'the simplicity of the specification makes manual construction straightforward' and that equivalences transfer, but provides no mechanical verification (e.g., refinement proof, model checking, or differential testing) that the composed levelled-storage operations in CobbleDB match the spec on key behaviors. This is load-bearing for the central claim, as any divergence would invalidate the performance-feature composition.
- [Composition and demonstration section] Section on leveraging spec-guaranteed equivalence for composition: The approach relies on prior separate proofs applying directly to the composed features in the Java code, but without evidence that the manual translation preserves the equivalence relations for the specific levelled-storage operations (e.g., compaction or read paths), the practical value demonstration rests on an unverified assumption.
minor comments (2)
- [Abstract and Introduction] The abstract and introduction could more explicitly distinguish between the prior proofs and the new implementation steps to clarify what is novel versus reused.
- [Figures and pseudocode] Figure captions or pseudocode (if present) for the levelled storage composition would benefit from explicit mapping to the formal spec equations to aid readability.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on verification of the manual implementation. We respond to each major comment below, clarifying the paper's scope while agreeing to strengthen the presentation where appropriate.
read point-by-point responses
-
Referee: [Implementation section] Implementation section (around the description of manual Java construction from the spec): The paper states that 'the simplicity of the specification makes manual construction straightforward' and that equivalences transfer, but provides no mechanical verification (e.g., refinement proof, model checking, or differential testing) that the composed levelled-storage operations in CobbleDB match the spec on key behaviors. This is load-bearing for the central claim, as any divergence would invalidate the performance-feature composition.
Authors: The paper's primary contribution is composition of performance features at the specification level, where equivalences have already been formally proven. The Java implementation is presented as a direct manual derivation enabled by spec simplicity, with the implementation section describing the mapping of abstract operations to concrete Java methods. We agree that explicit mechanical verification (such as refinement proofs or differential testing) is absent and would further strengthen the demonstration. We will revise the implementation section to include additional concrete examples of operation mappings (e.g., for get/put and compaction) and a brief discussion of the manual derivation process and its limitations. revision: partial
-
Referee: [Composition and demonstration section] Section on leveraging spec-guaranteed equivalence for composition: The approach relies on prior separate proofs applying directly to the composed features in the Java code, but without evidence that the manual translation preserves the equivalence relations for the specific levelled-storage operations (e.g., compaction or read paths), the practical value demonstration rests on an unverified assumption.
Authors: Equivalences are established between specifications; the Java code composes the corresponding implementations following the same structure as the composed specs. For levelled-storage operations such as compaction and read paths, the code implements the behaviors defined by the composed specifications. We will revise the composition section to add explicit examples illustrating how the direct mapping from spec to code preserves the equivalence relations for these operations, making the transfer explicit rather than assumed. revision: partial
Circularity Check
No significant circularity; relies on prior separate proofs with manual implementation as explicit assumption
full rationale
The paper's derivation starts from prior specifications and equivalence proofs (cited as previous work), manually translates to Java citing spec simplicity, then composes features using those equivalences and demonstrates with CobbleDB. This chain does not reduce any claim to its inputs by construction, nor does it involve fitted parameters renamed as predictions, self-definitional loops, or a load-bearing self-citation chain that collapses without external verification. The manual translation step is presented as an assumption rather than a derived result, placing any mismatch risk under correctness rather than circularity. Prior proofs are treated as independent support.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Prior specifications of store variants are correct and their equivalence has been proven.
Reference graph
Works this paper leans on
-
[1]
Tomsic, Manuel Bravo, Zhongmiao Li, Tyler Crain, Annette Bieniusa, Nuno Preguiça, and Marc Shapiro
Deepthi Devaki Akkoorath, Alejandro Z. Tomsic, Manuel Bravo, Zhongmiao Li, Tyler Crain, Annette Bieniusa, Nuno Preguiça, and Marc Shapiro. 2016. Cure: Strong semantics meets high availability and low latency. In Int. Conf. on Distributed Comp. Sys. (ICDCS) . IEEE Comp. Society, Nara, Japan, 405–414. doi:10.1109/ICDCS.2016.98
-
[2]
Andrea Cerone, Giovanni Bernardi, and Alexey Gotsman. 2015. A Framework for Transactional Consistency Models with Atomic Visibility. In Int. Conf. on Concurrency Theory (CONCUR) (Leibniz Int. Proc. in Informatics (LIPIcs), Vol. 42) , Luca Aceto and David de Frutos Escrig (Eds.). Schloss Dagstuhl–Leibniz-Zentrum für Informatik, Madrid, Spain, 58–71. doi:10...
-
[3]
CLBG Contributors. [n. d.]. Measured: Which programming language is fastest? Benchmarks Game. https://benchmarksgame-team.pages.debian.net/ benchmarksgame/index.html Consulted 17 January 2026
work page 2026
-
[4]
AntidoteDB Contributors. 2026. https://github.com/AntidoteDB/antidote
work page 2026
-
[5]
RocksDB Contributors. 2026. Compaction. https://github.com/facebook/rocksdb/ wiki/Leveled-Compaction
work page 2026
-
[6]
RocksDB Contributors. 2026. MANIFEST. https://github.com/facebook/rocksdb/ wiki/MANIFEST
work page 2026
-
[7]
RocksDB Contributors. 2026. Write Stalls. https://github.com/facebook/rocksdb/ wiki/Write-Stalls
work page 2026
-
[8]
Brian Cooper and YCSB Contributors. 2026. brianfrankcooper/YCSB. https: //github.com/brianfrankcooper/YCSB
work page 2026
-
[9]
Shabnam Ghasemirad, Si Liu, Christoph Sprenger, Luca Multazzu, and David A. Basin. 2024. VerIso: Verifiable Isolation Guarantees for Database Transactions. Proc. VLDB Endownment 18, 5 (Oct. 2024), 1362–1375. https://www.vldb.org/ pvldb/vol18/p1362-ghasemirad.pdf
work page 2024
-
[10]
Saalik Hatia. 2023. Leveraging formal specification to implement a database back- end. Ph. D. Dissertation. Sorbonne Université. https://theses.hal.science/tel- 04291337
work page 2023
-
[11]
Haonan Lu, Shuai Mu, Siddhartha Sen, and Wyatt Lloyd. 2023. NCC: Natu- ral Concurrency Control for Strictly Serializable Datastores by Avoiding the Timestamp-Inversion Pitfall. In Symp. on Op. Sys. Design and Implementation (OSDI), Roxana Geambasu and Ed Nightingale (Eds.). Usenix, Boston, MA, USA, 305–323. https://www.usenix.org/conference/osdi23/present...
work page 2023
-
[12]
Emilie Ma, Ayush Pandey, Jaurel Fosset, and Saalik Hatia. [n. d.]. CobbleDB source code. https://gitlab.inria.fr/shapiro/CobbleDB. Commit 77ab1b9c
-
[13]
Gregory Malecha, Greg Morrisett, Avraham Shinnar, and Ryan Wisnesky. 2010. Toward a Verified Relational Database Management System. InSymp. on Principles of Prog. Lang. (POPL) (POPL ’10). Assoc. for Computing Machinery, Madrid, Spain, 237–248. doi:10.1145/1706299.1706329
-
[14]
Nisarg Patel, Siddharth Krishna, Dennis Shasha, and Thomas Wies. 2021. Veri- fying concurrent multicopy search structures. In Conf. on Object-Oriented Prog. Sys., Lang. and Applications (OOPSLA) (Proc. ACM Program. Lang., Vol. 5) . Assoc. for Computing Machinery Special Interest Group on Pg. Lang. (SIGPLAN), Assoc. for Computing Machinery, Chicago, Ill., ...
-
[15]
Redis. 2026. Redis. https://github.com/redis/redis
work page 2026
-
[16]
RocksDB Contributors. 2026. RocksDB: A Persistent Key-Value Store for Flash and RAM Storage. https://github.com/facebook/rocksdb
work page 2026
-
[17]
Edgard Schiebelbein, Saalik Hatia, Annette Bieniusa, Gustavo Petri, Carla Fer- reira, and Marc Shapiro. 2024. Models for Storage in Database Backends. In W. on Principles and Practice of Consistency for Distr. Data (PaPoC) , Assoc. for Computing Machinery (Ed.). Euro. Conf. on Comp. Sys. (EuroSys), Assoc. for Computing Machinery, Athens, Greece, 58–66. do...
-
[18]
Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski. 2011. Conflict-free Replicated Data Types. In Int. Symp. on Stabilization, Safety, and Security of Dist. Sys. (SSS) (Lecture Notes in Comp. Sc. (LNCS), Vol. 6976) , Xavier Défago, Franck Petit, and V. Villain (Eds.). Springer-Verlag, Grenoble, France, 386–400. doi:10.1007/978-3-642-24550-3_29
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.