A Pragmatic Approach to Learned Indexing in RocksDB: Targeted Optimizations with Minimal System Modification
Pith reviewed 2026-05-25 02:17 UTC · model grok-4.3
The pith
Off-the-shelf learned indexes integrate into RocksDB via Memtable reuse and block-aware disk placement, yielding up to 2.1X higher read throughput.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By deploying off-the-shelf learned indexes separately in Memtables with a reuse mechanism that preserves structural knowledge across instances and replacing the disk index with a block-aware learned index that supports worst-case single-I/O lookups, MountDB achieves up to 1.5X higher write throughput and 2.1X higher read throughput than state-of-the-art systems while requiring no modifications to the storage layer or read path.
What carries the argument
The reuse mechanism that preserves structural knowledge across Memtable instances, combined with the block-aware adaptation of read-only learned indexes for worst-case single-I/O lookups.
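One way to picture the reuse idea is a new Memtable index that starts from the model its predecessor ended with, rather than from a cold state. The sketch below is hypothetical: `ToyMemtable`, its range-based linear model, and the `retrains` counter are names invented here for illustration, not MountDB or RocksDB code, and the paper's actual mechanism may differ.

```python
from bisect import bisect_left, insort


class ToyMemtable:
    """Toy in-memory buffer whose buckets are chosen by a linear model.

    Assumption for illustration: the "model" is just the key range (lo, hi),
    mapped linearly onto a fixed number of buckets.
    """

    def __init__(self, num_buckets=4, model=None):
        self.num_buckets = num_buckets
        self.model = model  # (lo, hi), or None for a cold start
        self.buckets = [[] for _ in range(num_buckets)]
        self.retrains = 0  # how often the model had to be refit

    def _bucket(self, key):
        lo, hi = self.model
        if hi == lo:
            return 0
        frac = (key - lo) / (hi - lo)
        return min(int(frac * self.num_buckets), self.num_buckets - 1)

    def _refit(self, new_key):
        # Widen the model to cover the new key, then redistribute stored keys.
        keys = [k for bucket in self.buckets for k in bucket]  # already sorted
        lo = hi = new_key
        if self.model is not None:
            lo, hi = min(lo, self.model[0]), max(hi, self.model[1])
        self.model = (lo, hi)
        self.buckets = [[] for _ in range(self.num_buckets)]
        for k in keys:
            self.buckets[self._bucket(k)].append(k)
        self.retrains += 1

    def put(self, key):
        if self.model is None or not (self.model[0] <= key <= self.model[1]):
            self._refit(key)
        insort(self.buckets[self._bucket(key)], key)

    def get(self, key):
        if self.model is None or not (self.model[0] <= key <= self.model[1]):
            return False
        bucket = self.buckets[self._bucket(key)]
        i = bisect_left(bucket, key)
        return i < len(bucket) and bucket[i] == key

    def export_model(self):
        # The state handed to the next Memtable instance.
        return self.model


# A cold Memtable refits while it discovers the key range.
cold = ToyMemtable()
for k in [50, 10, 90, 30, 70, 20]:
    cold.put(k)

# Its successor is seeded with the learned range and needs no refit
# as long as the key distribution stays similar.
warm = ToyMemtable(model=cold.export_model())
for k in [15, 85, 40, 60]:
    warm.put(k)
```

In this toy, the cold instance refits each time a key falls outside its current range, while the seeded instance absorbs similar keys without refitting. That is the intuition behind carrying structure across Memtable replacements, under the assumption that successive Memtables see similar key distributions.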
If this is right
- Up to 1.5X higher write throughput than state-of-the-art systems on large-scale diverse workloads.
- Up to 2.1X higher read throughput than state-of-the-art systems on large-scale diverse workloads.
- Learned indexes can be integrated into production systems with minimal overhead and no changes to the storage layer or read path.
- Established learned indexes can support concurrency and persistence when placed according to the Memtable and disk separation.
Where Pith is reading between the lines
- The same targeted placement and reuse pattern could be applied to other LSM-based key-value stores that maintain a similar Memtable-to-disk boundary.
- Block-aware learned indexes might improve I/O predictability in systems with varying block sizes or different storage media.
- Sustained memory footprint over long-running workloads could be measured to check whether model reuse keeps overall resource use low.
Load-bearing premise
The separation between in-memory Memtables and immutable on-disk files plus the reuse mechanism is sufficient to let off-the-shelf learned indexes support concurrency and persistence without correctness or performance regressions under write-heavy workloads.
What would settle it
A write-heavy workload with frequent Memtable replacements that produces data inconsistencies, model adaptation failures, or throughput below the B+-tree baseline.
Original abstract
Learned indexes have emerged as a promising alternative to traditional index structures, offering higher throughput and lower memory usage by approximating the cumulative key distribution function with lightweight models. Despite these benefits, adoption in production systems remains limited, partly because learned indexes that support concurrency and persistence as effectively as, e.g., the B+-Tree, do not yet exist, while many research prototypes introduce substantial complexity. In this paper, we investigate whether off-the-shelf learned indexes can be integrated into a production database with minimal storage-engine redesign. Using RocksDB as a case study, we exploit its separation between in-memory Memtables and immutable on-disk files to deploy specialized indexes at each level. We show that directly applying existing learned indexes is insufficient under write-heavy workloads because frequent Memtable replacement prevents models from fully adapting. To address this, we introduce a reuse mechanism that preserves structural knowledge across Memtable instances. At the storage level, we replace RocksDB's disk index with a learned index without modifying the storage layer or read path. We further adapt a read-only learned index to be block-aware, enabling worst-case single-I/O lookups. We implement these techniques in MountDB, an extension of RocksDB. Experiments on large-scale workloads with diverse data distributions and access patterns show up to 1.5X higher write throughput and 2.1X higher read throughput than state-of-the-art systems, demonstrating that established learned indexes can be integrated into production systems with minimal overhead and substantial performance benefits.
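To make "approximating the cumulative key distribution function with lightweight models" concrete, here is a minimal sketch of a read-only learned index: a single linear model predicts a key's position, and the largest prediction error seen at build time bounds the final search window. `LinearLearnedIndex` is a hypothetical name, and the single-segment model is a simplification chosen here; it is not one of the off-the-shelf indexes the paper uses. The sketch assumes a non-empty, sorted list of unique integer keys.

```python
from bisect import bisect_left


class LinearLearnedIndex:
    """Toy read-only learned index over sorted, unique integer keys."""

    def __init__(self, keys):
        self.keys = list(keys)
        n = len(self.keys)
        self.lo = self.keys[0]
        hi = self.keys[-1]
        # Linear CDF approximation: position ~ slope * (key - lo).
        self.slope = (n - 1) / (hi - self.lo) if hi > self.lo else 0.0
        # Largest error over the indexed keys bounds the search window.
        self.err = max(abs(self._predict(k) - i) for i, k in enumerate(self.keys))

    def _predict(self, key):
        return int(self.slope * (key - self.lo))

    def lookup(self, key):
        """Return the position of key, or None if it is absent."""
        n = len(self.keys)
        p = min(max(self._predict(key), 0), n - 1)
        left = max(p - self.err, 0)
        right = min(p + self.err + 1, n)
        # Binary search only inside the error-bounded window.
        i = bisect_left(self.keys, key, left, right)
        return i if i < n and self.keys[i] == key else None


index = LinearLearnedIndex([1, 5, 9, 14, 30, 31, 90])
positions = [index.lookup(k) for k in index.keys]
missing = index.lookup(2)
```

The tighter the model fits the key distribution, the smaller `err` becomes and the less work the final search does. This is why a model that has not yet adapted to the data performs poorly.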
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes integrating off-the-shelf learned indexes into RocksDB by exploiting the separation between in-memory Memtables and immutable on-disk files. It introduces a reuse mechanism to preserve model knowledge across frequent Memtable replacements under write-heavy workloads and adapts a read-only learned index to be block-aware for worst-case single-I/O lookups. These changes are implemented in MountDB without modifying the storage layer or read path, yielding up to 1.5× higher write throughput and 2.1× higher read throughput than state-of-the-art systems on large-scale workloads with diverse distributions and access patterns.
Significance. If the results hold, the work is significant for demonstrating a pragmatic path to adopting established learned indexes in production systems through targeted, minimal modifications rather than full redesigns. The reuse mechanism and block-aware adaptation address specific barriers (concurrency, persistence, adaptation lag), and the reliance on off-the-shelf indexes and on the existing Memtable/on-disk separation is an enabling strength. This could lower the barrier to deployment compared with research prototypes that introduce substantial complexity.
major comments (2)
- [Abstract, paragraph on Memtable replacement] The reuse mechanism is asserted to solve adaptation lag under write-heavy workloads, yet no detail is given on model-state transfer across replacements while preserving thread-safety and persistence guarantees under concurrent writes. This assumption is load-bearing for the central claim that the Memtable/on-disk separation suffices to support concurrency and persistence without correctness or performance regressions.
- [Abstract, storage-level paragraph] The claim that the disk index is replaced 'without modifying the storage layer or read path' and that a read-only learned index is adapted to be block-aware for single-I/O lookups requires that the learned index expose exactly the same lookup and iterator interfaces (including error bounds, block metadata handling, and semantics) as the original block-based index. Any deviation would force read-path changes, undermining the 'minimal modification' premise and the attribution of the reported throughput gains.
minor comments (1)
- The abstract reports throughput numbers without workload details, baseline descriptions, error bars, or data-exclusion rules; the full manuscript should ensure these are explicitly documented in the experimental section to support reproducibility.
Simulated Author's Rebuttal
We thank the referee for the thoughtful comments highlighting areas where the abstract could better support its claims. We address each point below with clarifications drawn from the full manuscript and indicate where revisions will be made.
- Referee: [Abstract, paragraph on Memtable replacement] The reuse mechanism is asserted to solve adaptation lag under write-heavy workloads, yet no detail is given on model-state transfer across replacements while preserving thread-safety and persistence guarantees under concurrent writes. This assumption is load-bearing for the central claim that the Memtable/on-disk separation suffices to support concurrency and persistence without correctness or performance regressions.
  Authors: The full manuscript (Section 3.2) details the reuse mechanism: during Memtable replacement, model parameters are transferred via a compact serialization of the CDF approximation, and the new Memtable is initialized from this state to reduce adaptation lag. Thread-safety is achieved through atomic model pointer swaps and snapshot-based reads that hide concurrent modifications from readers. Persistence is preserved because Memtable models are transient (SSTables on disk use the separate block-aware index) and recovery rebuilds from the WAL without relying on in-memory models. We agree the abstract should briefly reference this transfer process to make the concurrency claim self-contained and will revise the abstract accordingly.
  Revision: yes
- Referee: [Abstract, storage-level paragraph] The claim that the disk index is replaced 'without modifying the storage layer or read path' and that a read-only learned index is adapted to be block-aware for single-I/O lookups requires that the learned index expose exactly the same lookup and iterator interfaces (including error bounds, block metadata handling, and semantics) as the original block-based index. Any deviation would force read-path changes, undermining the 'minimal modification' premise and the attribution of the reported throughput gains.
  Authors: Sections 4.1 and 4.3 of the manuscript specify that the block-aware adaptation of the read-only learned index (based on an off-the-shelf model) produces outputs that map directly to existing block boundaries and metadata formats, preserving identical lookup, iterator, error-bound, and semantic interfaces. The read path remains unchanged because the index is a drop-in replacement at the file level; no new error handling or metadata logic is introduced. This compatibility is what allows the reported gains to be attributed to index efficiency rather than to interface changes.
  Revision: no
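The single-I/O property discussed in the second response can be illustrated with a small sketch in which the model predicts a block number rather than a record position, and any prediction error is corrected in memory before a block is fetched. Everything here is an assumption made for illustration: `BlockAwareIndex`, the in-memory array of per-block first keys, and the `io_count` counter are hypothetical and are not taken from the manuscript or from RocksDB. The sketch assumes a non-empty, sorted list of unique integer keys.

```python
from bisect import bisect_left, bisect_right


class BlockAwareIndex:
    """Toy index that resolves every lookup with at most one block read."""

    def __init__(self, sorted_keys, block_size):
        # Simulated on-disk blocks; reading one counts as an I/O.
        self.blocks = [
            sorted_keys[i:i + block_size]
            for i in range(0, len(sorted_keys), block_size)
        ]
        # In-memory first key of each block (assumed to fit in RAM).
        self.first_keys = [block[0] for block in self.blocks]
        nb = len(self.blocks)
        self.lo = self.first_keys[0]
        hi = self.first_keys[-1]
        # Linear model from key to block number, fit through the endpoints.
        self.slope = (nb - 1) / (hi - self.lo) if hi > self.lo else 0.0
        # Maximum error in blocks, measured on the block boundaries.
        self.err = max(
            abs(self._predict(k) - i) for i, k in enumerate(self.first_keys)
        )
        self.io_count = 0

    def _predict(self, key):
        return int(self.slope * (key - self.lo))

    def _read_block(self, j):
        self.io_count += 1
        return self.blocks[j]

    def get(self, key):
        nb = len(self.blocks)
        p = min(max(self._predict(key), 0), nb - 1)
        # A key inside block i can be mispredicted by up to err + 1 blocks,
        # so correct the guess against the in-memory first keys (no I/O).
        left = max(p - self.err - 1, 0)
        right = min(p + self.err + 1, nb)
        j = bisect_right(self.first_keys, key, left, right) - 1
        if j < 0:
            return False  # key is smaller than every stored key
        block = self._read_block(j)  # the only I/O for this lookup
        i = bisect_left(block, key)
        return i < len(block) and block[i] == key


keys = [x * x for x in range(200)]  # skewed key distribution
idx = BlockAwareIndex(keys, block_size=16)
hits = all(idx.get(k) for k in keys)
reads_for_hits = idx.io_count
```

Because the correction step touches only in-memory data, each lookup issues at most one block read regardless of how badly the model mispredicts. A design that instead scanned every block inside the error window could require several reads in the worst case.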
Circularity Check
No circularity: empirical integration paper with no derivations or fitted predictions
The paper describes a systems engineering effort to integrate off-the-shelf learned indexes into RocksDB by exploiting existing Memtable/on-disk separation, adding a reuse mechanism, and adapting a read-only index for block awareness. No equations, parameter fitting, or predictive claims appear in the abstract or described approach; performance numbers (1.5X write, 2.1X read) are reported from experiments rather than derived from any model. The central claim rests on implementation details and benchmarking, not on any self-referential reduction, self-citation chain, or renaming of known results. This is a standard empirical contribution whose validity is assessed by reproduction of the reported throughput gains, not by internal logical closure.