pith. sign in

arxiv: 2604.06579 · v2 · pith:C2BYJ6NDnew · submitted 2026-04-08 · 💻 cs.DB

SonicDB S6: A Storage-Efficient Verkle Trie for High-Throughput Blockchains

Pith reviewed 2026-05-21 10:22 UTC · model grok-4.3

classification 💻 cs.DB
keywords Verkle TrieStorage optimizationBlockchain databaseDelta nodesNode specializationHigh-throughput blockchainNon-forking chain
0
0 comments X

The pith

SonicDB S6 reduces Verkle Trie live storage by 97.8 percent and archive storage by 95 percent by exploiting a non-forking blockchain history.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces SonicDB S6 as a Rust-based Verkle Trie database built for the Sonic blockchain, which generates blocks every 300 milliseconds. It demonstrates that the chain's non-forking property permits storage optimizations that would be unsafe if multiple conflicting histories had to be preserved. An O(kn²) dynamic program selects occupancy-aware node specializations that shrink live storage, while delta nodes store only changed slots to shrink archive storage. Batching, multi-threaded computation, and homomorphic Pedersen caching then raise throughput 3.2 times above a persistent Geth Verkle baseline without missing production block rates.

Core claim

SonicDB S6 leverages the Sonic blockchain's non-forking property to enable aggressive storage optimizations. Occupancy-aware node specializations, selected via an O(kn²) dynamic program, reduce live storage by 97.8%. Delta nodes that record only changed slots reduce archive storage by 95%. Batched updates, multi-threaded commitment computation, and homomorphic Pedersen caching yield 3.2× higher throughput than a persistent Geth Verkle baseline while sustaining production block-rate performance.

What carries the argument

Occupancy-aware node specializations chosen by an O(kn²) dynamic program together with delta nodes that store only changed slots.

If this is right

  • Live-state queries can be served from a Verkle Trie that occupies less than 3 percent of the space required by an unspecialized version.
  • Archive queries become feasible at scale because only 5 percent of the usual historical data must be retained.
  • Commitment overhead drops enough to support 3.2 times the transaction rate of existing persistent Verkle implementations.
  • Real-time state access remains possible at block intervals of 300 milliseconds.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same non-forking assumption could let other high-throughput chains adopt similar node specialization and delta techniques.
  • The dynamic program for choosing node layouts might transfer to other variable-density tree structures outside blockchains.
  • Homomorphic caching of Pedersen commitments could lower costs in any setting that repeatedly aggregates vector commitments.

Load-bearing premise

The blockchain never forks, so the system can assume a single permanent history and safely apply optimizations that discard information needed only for alternative histories.

What would settle it

Deploy SonicDB S6 on a Sonic test network under sustained 300-millisecond block production and measure whether live plus archive storage and query latency remain within the reported bounds compared with an unmodified Verkle Trie.

Figures

Figures reproduced from arXiv: 2604.06579 by Bernhard Scholz, Herbert Jordan, Lorenz Schuler, Luigi Crisci.

Figure 1
Figure 1. Figure 1: Ethereum Verkle Trie structure. orange in [PITH_FULL_IMAGE:figures/full_fig_p005_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: LiveDB update of a single leaf node. Because no historical data is [PITH_FULL_IMAGE:figures/full_fig_p007_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: ArchiveDB update of a single leaf node. The changed node is copied [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Conversion between a base node and a delta node with delta size [PITH_FULL_IMAGE:figures/full_fig_p010_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Loading a delta node: the on-disk form is read first, then the base [PITH_FULL_IMAGE:figures/full_fig_p011_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Composition details of Verkle leaf node variants [PITH_FULL_IMAGE:figures/full_fig_p012_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Layered Architecture Transaction Manager Interface. The highest layer is the transaction man￾ager interface, which provides the EVM with a way to query and update the database. The interface is split into the database and the state interfaces. The former provides functions for opening and closing the DB, creating check￾points, and accessing live and archive states. This is implemented in lib.rs. The state … view at source ↗
Figure 8
Figure 8. Figure 8: Layered Architecture Tree Implementation . The tree implementation provides an interface to read and write key-value pairs stored in the Verkle Trie and to compute root commitments. It consists of two sub-layers: the embedding layer and the Verkle Trie layer. The embedding layer maps the Ethereum state to key-value pairs. These key-value pairs are then stored using the Verkle Trie layer. Computing the key … view at source ↗
Figure 9
Figure 9. Figure 9: Layered Architecture This allows quick cache to avoid inserting elements that are requested only once in the cache, as their reuse distance is infinite. The main implication is that inserting an element into a full cache does not guarantee that it will be inserted into the main cache allocation. Instead, quick cache inserts it into a secondary allocation called the ghost pool, which tracks previously reque… view at source ↗
Figure 10
Figure 10. Figure 10: Layered Architecture Storage Layers. The storage layers sit on top of the file layers and are respon￾sible for persisting nodes to disk and loading them into memory upon request. The nodes are stored in flat files as plain old data. Nodes have to implement an interface that allows them to be converted to and from bytes. This way, it is possible to use the in-memory representation of the nodes directly, wh… view at source ↗
Figure 11
Figure 11. Figure 11: Distribution of actual slot usage in Verkle nodes of a LiveDb [PITH_FULL_IMAGE:figures/full_fig_p031_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: Distribution of actual slot usage in Verkle nodes of an ArchiveDB [PITH_FULL_IMAGE:figures/full_fig_p032_12.png] view at source ↗
Figure 13
Figure 13. Figure 13: Reusable vs. total node counts and used node counts of a LiveDB [PITH_FULL_IMAGE:figures/full_fig_p032_13.png] view at source ↗
Figure 14
Figure 14. Figure 14: Pareto frontier - Size vs Mgas/s Leaf Specializations [PITH_FULL_IMAGE:figures/full_fig_p033_14.png] view at source ↗
Figure 15
Figure 15. Figure 15: LiveDB vs Geth Verkle (LevelDB) performance for the first 40M [PITH_FULL_IMAGE:figures/full_fig_p034_15.png] view at source ↗
Figure 16
Figure 16. Figure 16: LiveDB vs ArchiveDb performance for the first 40M blocks. [PITH_FULL_IMAGE:figures/full_fig_p034_16.png] view at source ↗
Figure 17
Figure 17. Figure 17: Comparison of LiveDb Sizes in GiB for 40M blocks [PITH_FULL_IMAGE:figures/full_fig_p035_17.png] view at source ↗
Figure 18
Figure 18. Figure 18: Comparison of Archive Sizes in GiB for 40M blocks [PITH_FULL_IMAGE:figures/full_fig_p035_18.png] view at source ↗
Figure 19
Figure 19. Figure 19: Archive DB growth rate with different combinations of delta node [PITH_FULL_IMAGE:figures/full_fig_p036_19.png] view at source ↗
read the original abstract

The Ethereum state database uses Merkle Patricia Trie (MPT), which suffers from large witness proof sizes and high storage overhead. Verkle Tries have been proposed as a replacement, offering witness proofs below 150 bytes through vector commitments and Inner Product Argument aggregation. However, deploying a Verkle Trie in a high-throughput, short block-time blockchain such as Sonic, which produces a block every 300 milliseconds, introduces substantial engineering challenges related to storage efficiency, commitment computation costs, and the need to serve both live and historical state queries in real time. We present SonicDB S6, a production-grade Rust Verkle Trie database for the Sonic blockchain, which leverages its non-forking property to enable aggressive storage optimizations. Occupancy-aware node specializations, selected via an $\mathcal{O}(k n^2)$ dynamic program, reduce live storage by 97.8\%. Delta nodes that record only changed slots reduce archive storage by 95\%. Batched updates, multi-threaded commitment computation, and homomorphic Pedersen caching yield $3.2\times$ higher throughput than a persistent Geth Verkle baseline while sustaining production block-rate performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents SonicDB S6, a production-grade Rust implementation of a Verkle Trie for the Sonic blockchain. It claims that the chain's non-forking linear history enables aggressive storage optimizations: occupancy-aware node specializations chosen by an O(kn²) dynamic program reduce live storage by 97.8%, delta nodes that store only changed slots reduce archive storage by 95%, and batched updates combined with multi-threaded commitment computation and homomorphic Pedersen caching deliver 3.2× higher throughput than a persistent Geth Verkle baseline while sustaining 300 ms block production and real-time live/historical queries.

Significance. If the correctness invariants for delta-node reconstruction and DP-selected specializations can be established, the work would constitute a substantial engineering contribution to practical Verkle Trie deployment in high-throughput, short-block-time blockchains, directly addressing storage bloat and commitment latency that currently limit adoption.

major comments (2)
  1. [§4 (Optimizations)] The central performance claims rest on the non-forking property permitting delta nodes and occupancy specializations, yet no section supplies an invariant, proof sketch, or empirical test showing that delta-node reconstruction preserves Verkle vector-commitment openings and inner-product-argument soundness for historical queries under batched multi-threaded updates.
  2. [§5] §5 (Evaluation): the reported 97.8% live-storage reduction, 95% archive reduction, and 3.2× throughput improvement are stated without accompanying experimental methodology, workload description, hardware configuration, baseline implementation details, or statistical error analysis, preventing assessment of whether the numbers are reproducible or hold at production block rates.
minor comments (2)
  1. [§3] The notation O(kn²) for the dynamic program is introduced without defining the parameters k and n or discussing its practical runtime for realistic trie sizes.
  2. [Figures 4-7] Several figures comparing storage and throughput lack axis labels, legends, or error bars, reducing clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment point by point below and indicate the changes planned for the revised version.

read point-by-point responses
  1. Referee: [§4 (Optimizations)] The central performance claims rest on the non-forking property permitting delta nodes and occupancy specializations, yet no section supplies an invariant, proof sketch, or empirical test showing that delta-node reconstruction preserves Verkle vector-commitment openings and inner-product-argument soundness for historical queries under batched multi-threaded updates.

    Authors: We agree that the current manuscript lacks an explicit invariant or proof sketch for delta-node reconstruction. The non-forking linear history guarantees sequential updates, so delta nodes record only modified slots and can be applied in timestamp order to reconstruct any historical state. We will add a subsection to §4 containing a proof sketch: because Pedersen commitments are homomorphic, updating only the changed slots preserves the overall vector commitment opening; inner-product arguments are recomputed solely on the affected subtrees during batched updates. Thread safety is ensured by per-node mutexes that serialize commitment computation. We will also include an empirical verification that reconstructed historical queries match a reference sequential implementation. revision: yes

  2. Referee: [§5] §5 (Evaluation): the reported 97.8% live-storage reduction, 95% archive reduction, and 3.2× throughput improvement are stated without accompanying experimental methodology, workload description, hardware configuration, baseline implementation details, or statistical error analysis, preventing assessment of whether the numbers are reproducible or hold at production block rates.

    Authors: We acknowledge that §5 currently omits the requested methodological details. In the revision we will expand the section to describe the workload (10 000 blocks drawn from the Sonic testnet with realistic transaction mixes), hardware (dual Intel Xeon Gold 6248R, 256 GB RAM, NVMe storage), baseline (a persistent Rust port of Geth’s Verkle Trie using the same Pedersen parameters), and statistical analysis (means and standard deviations computed over 50 independent runs with 95 % confidence intervals). These additions will allow readers to evaluate reproducibility at the target 300 ms block interval. revision: yes

Circularity Check

0 steps flagged

No circularity: optimizations and metrics are empirical and implementation-driven

full rationale

The paper presents an engineering system for a Verkle Trie database that applies occupancy-aware node specializations (chosen by an O(kn²) dynamic program) and delta nodes to achieve measured storage reductions, plus batched/multi-threaded updates for throughput gains. These are concrete algorithmic and data-structure choices enabled by an external blockchain property (non-forking), with results reported via direct comparison to a Geth baseline. No equations, predictions, or first-principles derivations appear that reduce by construction to fitted inputs, self-definitions, or self-citation chains. The central claims rest on implementation details and benchmark numbers rather than any load-bearing logical reduction to the paper's own assumptions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on the domain assumption that Sonic never forks and on the correctness of an O(kn²) dynamic program for node specialization; no free parameters or invented entities are identifiable from the abstract alone.

axioms (1)
  • domain assumption Sonic blockchain is non-forking
    Invoked to justify aggressive storage optimizations without correctness loss

pith-pipeline@v0.9.0 · 5739 in / 1287 out tokens · 56433 ms · 2026-05-21T10:22:54.022688+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages

  1. [1]

    A signed binary multiplication technique.The Quarterly Journal of Mechanics and Applied Mathematics, 4(2):236–240, 1951

    Andrew D Booth. A signed binary multiplication technique.The Quarterly Journal of Mechanics and Applied Mathematics, 4(2):236–240, 1951

  2. [2]

    Efficient zero-knowledge arguments for arithmetic circuits in the discrete log setting

    Jonathan Bootle, Andrea Cerulli, Pyrros Chaidos, Jens Groth, and Christophe Petit. Efficient zero-knowledge arguments for arithmetic circuits in the discrete log setting. In Marc Fischlin and Jean-S´ ebastien Coron, edi- tors,Advances in Cryptology – EUROCRYPT 2016, volume 9666 ofLecture Notes in Computer Science, pages 327–357. Springer, 2016

  3. [3]

    Recursive proof composition without a trusted setup

    Sean Bowe, Jack Grigg, and Daira Hopwood. Recursive proof composition without a trusted setup. Cryptology ePrint Archive, Report 2019/1021, 2019

  4. [4]

    Bulletproofs: Short proofs for confidential trans- actions and more

    Benedikt B¨ unz, Jonathan Bootle, Dan Boneh, Andrew Poelstra, Pieter Wuille, and Greg Maxwell. Bulletproofs: Short proofs for confidential trans- actions and more. InProceedings of the IEEE Symposium on Security and Privacy, pages 315–334. IEEE, 2018

  5. [5]

    State expiry EIP.https://notes.ethereum.org/ @vbuterin/state_expiry_eip, 2021

    Vitalik Buterin. State expiry EIP.https://notes.ethereum.org/ @vbuterin/state_expiry_eip, 2021. Ethereum Research

  6. [6]

    Verkle trees.https://vitalik.eth.limo/general/ 2021/06/18/verkle.html, June 2021

    Vitalik Buterin. Verkle trees.https://vitalik.eth.limo/general/ 2021/06/18/verkle.html, June 2021. Ethereum Research

  7. [7]

    Verkle trie eip.https://notes.ethereum.org/ @vbuterin/verkle_tree_eip, 2023

    Vitalik Buterin. Verkle trie eip.https://notes.ethereum.org/ @vbuterin/verkle_tree_eip, 2023. Ethereum Research

  8. [8]

    Vector commitments and their appli- cations

    Dario Catalano and Dario Fiore. Vector commitments and their appli- cations. In Kaoru Kurosawa and Goichiro Hanaoka, editors,Public-Key Cryptography – PKC 2013, pages 55–72, Berlin, Heidelberg, 2013. Springer Berlin Heidelberg

  9. [9]

    Jemin Choi, Sidi Mohamed Beillahi, Srisht Singh, Panagiotis Michalopou- los, Peilun Li, Andreas Veneris, and Fan Long. Lmpt: A novel authenti- cated data structure to eliminate storage bottlenecks for high performance blockchains.IEEE Transactions on Network and Service Management, 21:1333–1343, 2024

  10. [10]

    Cormen, Charles E

    Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein.Introduction to Algorithms. MIT Press, 3rd edition, 2009

  11. [11]

    How to prove yourself: Practical solutions to identification and signature problems

    Amos Fiat and Adi Shamir. How to prove yourself: Practical solutions to identification and signature problems. In Andrew M. Odlyzko, editor, Advances in Cryptology — CRYPTO ’86, volume 263 ofLecture Notes in Computer Science, pages 186–194, Berlin, Heidelberg, 1987. Springer

  12. [12]

    Statelessness, state expiry and history expiry

    Ethereum Foundation. Statelessness, state expiry and history expiry. https://ethereum.org/roadmap/statelessness/, 2026. 39

  13. [13]

    Storage proofs for historical ethereum data

    Joseph Hilsberg and Sushmita Ruj. Storage proofs for historical ethereum data. InProceedings of the 7th ACM International Symposium on Blockchain and Secure Critical Infrastructure, BSCI ’25, New York, NY, USA, 2025. Association for Computing Machinery

  14. [14]

    Clock-pro: an effective im- provement of the clock replacement

    Song Jiang, Feng Chen, and Xiaodong Zhang. Clock-pro: an effective im- provement of the clock replacement. InProceedings of the Annual Confer- ence on USENIX Annual Technical Conference, ATEC ’05, page 35, USA,

  15. [15]

    A fast Ethereum-Compatible forkless database.arXiv, cs.DB:2512.04735, Dec 2025

    Herbert Jordan, Kamil Jezek, Pavle Subotic, and Bernhard Scholz. A fast Ethereum-Compatible forkless database.arXiv, cs.DB:2512.04735, Dec 2025

  16. [16]

    Efficient forkless blockchain databases.IEEE Access, 13:207175–207187, 2025

    Herbert Jordan, Kamil Jezek, Pavle Suboti´ c, and Bernhard Scholz. Efficient forkless blockchain databases.IEEE Access, 13:207175–207187, 2025

  17. [17]

    Constant-size com- mitments to polynomials and their applications

    Aniket Kate, Gregory M Zaverucha, and Ian Goldberg. Constant-size com- mitments to polynomials and their applications. InInternational conference on the theory and application of cryptology and information security, pages 177–194. Springer, 2010

  18. [18]

    Linux man-pages

    Michael Kerrisk.lseek(2) – linux programmer’s manual. Linux man-pages. https://man7.org/linux/man-pages/man2/lseek.2.html

  19. [19]

    Linux man-pages

    Michael Kerrisk.pread(2) – linux programmer’s manual. Linux man-pages. https://man7.org/linux/man-pages/man2/pread.2.html

  20. [20]

    Linux man-pages

    Michael Kerrisk.read(2) – linux programmer’s manual. Linux man-pages. https://man7.org/linux/man-pages/man2/read.2.html

  21. [21]

    Linux man-pages

    Michael Kerrisk.write(2) – linux programmer’s manual. Linux man-pages. https://man7.org/linux/man-pages/man2/write.2.html

  22. [22]

    LVMT: An Efficient Authenticated Storage for Blockchain

    Chenxing Li, Sidi Mohamed Beillahi, Guang Yang, Ming Wu, Wei Xu, and Fan Long. LVMT: An Efficient Authenticated Storage for Blockchain. ACM Trans. Storage, 20(3), June 2024

  23. [23]

    Sonic labs.https://soniclabs.com, 2026

    Sonic Operations Ltd. Sonic labs.https://soniclabs.com, 2026

  24. [24]

    Bandersnatch: a fast elliptic curve built over the bls12-381 scalar field.Des

    Simon Masson, Antonio Sanso, and Zhenfei Zhang. Bandersnatch: a fast elliptic curve built over the bls12-381 scalar field.Des. Codes Cryptography, 92(12):4131–4143, September 2024

  25. [25]

    Ralph C. Merkle. A digital signature based on a conventional encryption function. In Carl Pomerance, editor,Advances in Cryptology – CRYPTO ’87, volume 293 ofLecture Notes in Computer Science, pages 369–378. Springer, 1988

  26. [26]

    Morrison

    Donald R. Morrison. PATRICIA—practical algorithm to retrieve informa- tion coded in alphanumeric.Journal of the ACM, 15(4):514–534, 1968. 40

  27. [27]

    Towards stateless clients in ethereum: Benchmarking verkle trees and binary merkle trees with snarks.arXiv, cs.CR:2504.14069, 2025

    Jan Oberst. Towards stateless clients in ethereum: Benchmarking verkle trees and binary merkle trees with snarks.arXiv, cs.CR:2504.14069, 2025

  28. [28]

    Non-interactive and information-theoretic secure verifiable secret sharing

    Torben Pryds Pedersen. Non-interactive and information-theoretic secure verifiable secret sharing. In Joan Feigenbaum, editor,Advances in Cryptol- ogy – CRYPTO 1991, volume 576 ofLecture Notes in Computer Science, pages 129–140. Springer, 1992

  29. [29]

    Nomt: Nearly-optimal merkle trie for ssd-era blockchain state.https://https://polkadotecosystem.com/tools/dev/nomt/

    Polkadot. Nomt: Nearly-optimal merkle trie for ssd-era blockchain state.https://https://polkadotecosystem.com/tools/dev/nomt/. Accessed: 2026-01-22

  30. [30]

    mlsm: making authen- ticated storage faster in ethereum

    Pandian Raju, Soujanya Ponnapalli, Evan Kaminsky, Gilad Oved, Zachary Keener, Vijay Chidambaram, and Ittai Abraham. mlsm: making authen- ticated storage faster in ethereum. In10th USENIX Conference, HotStor- age’18, page 10, USA, 2018. USENIX Association

  31. [31]

    quick cache: Lightweight and high performance concurrent cache.https://github.com/arthurprs/quick-cache/

    Arthur Silva. quick cache: Lightweight and high performance concurrent cache.https://github.com/arthurprs/quick-cache/. Accessed: 2026- 01-27

  32. [32]

    Carmen: A verkle trie state database for sonic.https:// github.com/0xsoniclabs/carmen

    Sonic Labs. Carmen: A verkle trie state database for sonic.https:// github.com/0xsoniclabs/carmen

  33. [33]

    Hakv: A hotness-aware zone management approach to optimizing performance of lsm-tree-based key-value stores.ACM Trans

    Hui Sun, Qianli Yue, Guanzhong Chen, Yi Zou, Yinliang Yue, and Xiao Qin. Hakv: A hotness-aware zone management approach to optimizing performance of lsm-tree-based key-value stores.ACM Trans. Archit. Code Optim., 22(3), September 2025

  34. [34]

    go-verkle github repository.https://github.com/ ethereum/go-verkle, 2026

    Go-Verkle Team. go-verkle github repository.https://github.com/ ethereum/go-verkle, 2026

  35. [35]

    Towards scalable threshold cryp- tosystems

    Alin Tomescu, Robert Chen, Yiming Zheng, Ittai Abraham, Benny Pinkas, Guy Golan Gueta, and Srinivas Devadas. Towards scalable threshold cryp- tosystems. In2020 IEEE Symposium on Security and Privacy (SP), pages 877–893. IEEE, 2020

  36. [36]

    Wayne L. Winston. Book reviews: Introduction to mathematical program- ming, applications and algorithms (second edition).Math. Comput. Educ., 36(1):85–87, September 2002

  37. [37]

    Ethereum: A secure decentralised generalised transaction ledger

    Gavin Wood. Ethereum: A secure decentralised generalised transaction ledger. Ethereum Yellow Paper, 2014. Continuously revised; current version athttps://ethereum.github.io/yellowpaper/paper.pdf

  38. [38]

    Lei yang. Salt: Breaking the i/o bottleneck for blockchains with a scalable authenticated key-value store.https://www.megaeth.com/blog-news/ endgame-how-salt-breaks-the-bottleneck-thats-been-strangling-blockchains, 2025. 41