pith. machine review for the scientific record.

arxiv: 2604.03007 · v1 · submitted 2026-04-03 · 💻 cs.DC · cs.DB

Recognition: no theorem link

CIDER: Boosting Memory-Disaggregated Key-Value Stores with Pessimistic Synchronization

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 18:19 UTC · model grok-4.3

classification 💻 cs.DC cs.DB
keywords memory disaggregation · key-value stores · pessimistic synchronization · redundant I/Os · write-combining · contention-aware synchronization · throughput optimization · YCSB benchmark

The pith

Switching to pessimistic synchronization cuts redundant I/Os that bottleneck memory-disaggregated key-value stores.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that memory-disaggregated KV stores generate too many redundant network I/Os because their optimistic synchronization clashes with highly concurrent workloads. CIDER replaces this with pessimistic synchronization on the compute side, adds global write-combining to merge updates, and uses a contention-aware scheme to avoid unnecessary overhead when conflicts are rare. If the approach holds, the limited network bandwidth between compute and memory pools stops being the main limiter on throughput. A sympathetic reader cares because disaggregated memory promises flexible resource allocation but currently wastes that flexibility on avoidable traffic. The authors report that the change lifts the throughput of existing systems by up to 6.6× on standard YCSB workloads.
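The I/O gap described above can be illustrated with a toy cost model (the per-operation costs here are illustrative assumptions, not the paper's measurements): an optimistic update pays a remote read plus a CAS per attempt and retries on every conflict, while a queue-lock update pays a fixed acquire/write/release sequence.

```python
import random

def optimistic_ios(n_clients, conflict_prob, rng):
    """Remote I/Os when each attempt (read + CAS, 2 I/Os) fails and is
    retried with probability conflict_prob -- the retry storm that turns
    high contention into redundant network traffic."""
    total = 0
    for _ in range(n_clients):
        attempts = 1
        while rng.random() < conflict_prob:
            attempts += 1
        total += 2 * attempts
    return total

def pessimistic_ios(n_clients):
    """Remote I/Os under a queue lock: enqueue (1 atomic), write (1),
    release (1) -- constant per operation, no retries."""
    return 3 * n_clients

rng = random.Random(0)
print(optimistic_ios(512, 0.8, rng))  # grows sharply with contention
print(pessimistic_ios(512))           # 1536, regardless of contention
```

Note the flip side: under low contention the optimistic path is cheaper (2 vs 3 I/Os per operation), which is exactly why CIDER needs a contention-aware scheme rather than unconditional pessimism.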

Core claim

CIDER demonstrates that pessimistic synchronization, paired with global write-combining and contention-aware mechanisms, directly addresses the root cause of redundant I/Os in memory-disaggregated KV stores by aligning access control with high-concurrency patterns on disaggregated memory.

What carries the argument

The CIDER framework, which applies pessimistic synchronization together with global write-combining to merge cross-node writes, plus a contention-aware scheme that adapts locking behavior under varying conflict rates.
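A minimal sketch of the write-combining idea (last-writer-wins merging over the queue of pending updates; the paper's actual mechanism runs over an MCS lock and handles INSERT and DELETE separately for correctness, which this sketch omits):

```python
from collections import OrderedDict

def combine_updates(queued):
    """Merge queued (key, value) UPDATEs so each key is written to the
    memory pool exactly once.  The list order stands in for the lock
    acquisition order; a later update to a key supersedes earlier ones."""
    merged = OrderedDict()
    for key, value in queued:
        merged[key] = value  # last writer in queue order wins
    return list(merged.items())

queued = [("k1", "a"), ("k2", "b"), ("k1", "c"), ("k1", "d")]
print(combine_updates(queued))  # [('k1', 'd'), ('k2', 'b')]: 2 remote writes instead of 4
```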

If this is right

  • Throughput of existing memory-disaggregated KV stores rises by up to 6.6× under the YCSB benchmark.
  • Network traffic between compute and memory pools falls because redundant cross-node I/Os are eliminated.
  • Performance remains stable even when workloads shift between high and low contention levels.
  • No hardware changes are required; gains come from compute-side changes only.
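One way the stability point above could hold in practice is a compute-side, per-key mode controller with hysteresis. The thresholds below are assumptions for illustration; the review does not spell out CIDER's actual switching policy.

```python
class ModeController:
    """Per-key synchronization mode chooser (illustrative thresholds).

    Switches a key to pessimistic mode after repeated failed optimistic
    attempts, and back to optimistic mode after a quiet streak."""

    def __init__(self, up=3, down=8):
        self.up, self.down = up, down
        self.fails = 0      # consecutive conflicted (retried) operations
        self.quiet = 0      # consecutive uncontended operations
        self.pessimistic = False

    def record(self, conflicted: bool) -> str:
        if conflicted:
            self.fails += 1
            self.quiet = 0
            if self.fails >= self.up:
                self.pessimistic = True
        else:
            self.quiet += 1
            self.fails = 0
            if self.quiet >= self.down:
                self.pessimistic = False
        return "pessimistic" if self.pessimistic else "optimistic"
```

The hysteresis (separate up/down thresholds) keeps a key from oscillating between modes when contention hovers near the switching point.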

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same pessimistic-plus-write-combining pattern could be tested on disaggregated databases or object stores facing similar network bottlenecks.
  • An adaptive version that switches between optimistic and pessimistic modes based on measured contention might extend the approach to mixed workloads.
  • Future systems with faster interconnects could still benefit if write-combining reduces total data movement rather than just latency.
  • Measuring energy use on the memory nodes before and after CIDER would show whether lower I/O volume also cuts power draw.

Load-bearing premise

The root cause of the redundant I/Os is the mismatch between the optimistic synchronization used by existing memory-disaggregated KV stores and the highly concurrent workloads on disaggregated memory (DM).

What would settle it

Run the same high-concurrency YCSB workload on a state-of-the-art memory-disaggregated KV store with and without CIDER, then compare the exact count of remote I/Os generated; a large drop would support the claim.
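The experiment above needs an exact remote-I/O count rather than throughput alone. On Linux, cumulative RDMA port counters are exposed in sysfs, so a before/after delta isolates the traffic attributable to one run (the device name mlx5_0 is an assumption about the testbed; the sysfs layout is the standard InfiniBand one):

```python
from pathlib import Path

def port_counter(name, dev="mlx5_0", port=1, root="/sys/class/infiniband"):
    """Read one cumulative InfiniBand port counter, e.g. port_xmit_packets."""
    path = Path(root) / dev / "ports" / str(port) / "counters" / name
    return int(path.read_text())

def measure(run, name="port_xmit_packets", **kw):
    """Return the counter delta across a workload run."""
    before = port_counter(name, **kw)
    run()
    return port_counter(name, **kw) - before
```

Comparing the `port_xmit_packets` delta for the same YCSB run with and without CIDER would directly test the redundant-I/O claim rather than inferring it from throughput.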

Figures

Figures reproduced from arXiv: 2604.03007 by Jiacheng Shen, Xin Wang, Xuchuan Luo, Yangfan Zhou, Yuxuan Du.

Figure 1. The throughput and retry count of the pointer array with optimistic synchronization under a highly-contended write-intensive workload.
Figure 7. The workflow of global WC.
Figure 8. The structures of the lock node, lock entry, and data pointer.
Figure 9. The workflows of SEARCH, INSERT, and UPDATE operations under the optimistic mode.
Figure 10. The workflow of the UPDATE and DELETE operations under the pessimistic mode. Solid lines indicate RDMA_READ and RDMA_WRITE; dashed lines indicate atomic RDMA_CAS and RDMA_FAA.
Figure 11. The throughput comparison on a pointer array.
Figure 13. The throughput and latency of CIDER and base…
Figure 16. The end-to-end throughput on RACE. [plot: P50/P99 latency for CIDER, O-SYNC, CAS, and ShiftLock vs. number of clients; panels (a) write-intensive, (b) read-intensive, (c) write-only]
Figure 18. The end-to-end throughput on SMART. [plot: P50/P99 latency for CIDER, O-SYNC, CAS, and ShiftLock vs. number of clients; panels (a) write-intensive, (b) read-intensive, (c) write-only]
Figure 21. The efficiency comparison of different WC mechanisms. [plot: TPC-C and TATP throughput for CAS, CIDER, and ShiftLock]
Figure 23. The performance comparison as a function of the array…
Figure 24. The performance comparison as a function of the value…
read the original abstract

Memory-disaggregated key-value (KV) stores suffer from a severe performance bottleneck due to their I/O redundancy issues. A huge amount of redundant I/Os are generated when synchronizing concurrent data accesses, making the limited network between the compute and memory pools of DM a performance bottleneck. We identify the root cause for the redundant I/O lies in the mismatch between the optimistic synchronization of existing memory-disaggregated KV stores and the highly concurrent workloads on DM. In this paper, we propose to boost memory-disaggregated KV stores with pessimistic synchronization. We propose CIDER, a compute-side I/O optimization framework, to verify our idea. CIDER adopts a global write-combining technique to further reduce cross-node redundant I/Os. A contention-aware synchronization scheme is designed to improve the performance of pessimistic synchronization under low contention scenarios. Experimental results show that CIDER effectively improves the throughput of state-of-the-art memory-disaggregated KV stores by up to $6.6\times$ under the YCSB benchmark.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that memory-disaggregated KV stores incur severe redundant cross-node I/Os under high concurrency because their optimistic synchronization protocols retry on conflicts. CIDER, a compute-side framework, switches to pessimistic synchronization, adds global write-combining to batch updates, and uses a contention-aware fallback to avoid pessimism overhead under low contention, yielding up to 6.6× throughput on YCSB.

Significance. If the experimental claims hold, the work supplies a concrete, deployable alternative to optimistic designs that directly targets the network bottleneck in disaggregated memory; the combination of pessimistic locking with write-combining is a practical insight that could be adopted by systems such as FaRM or HERD variants and is strengthened by the reproducible YCSB evaluation.

major comments (2)
  1. [§4] §4.1–4.3 and Figure 7: the reported 6.6× throughput gain is presented without naming the exact baseline implementations, the precise YCSB workload mix (read/write ratio, key distribution), number of runs, or error bars; without these the central performance claim cannot be fully assessed.
  2. [§3.2] §3.2, Algorithm 1: the global write-combining logic is described at a high level but lacks a proof or invariant showing that it eliminates the redundant I/Os identified in §2 without introducing new ordering or consistency violations under the assumed DM model.
minor comments (2)
  1. [Abstract] Abstract: the phrase 'state-of-the-art memory-disaggregated KV stores' should name the concrete systems evaluated (the evaluation figures suggest RACE and SMART) so readers can map the 6.6× claim immediately.
  2. [§2.2] §2.2: the definition of 'redundant I/O' is informal; a short equation or pseudocode quantifying the extra round-trips per conflict would improve clarity.
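The quantification the second comment asks for is short: with an independent per-attempt conflict probability p, optimistic retries are geometric, so a protocol that spends r round trips per attempt (read + CAS would make r = 2, an assumption here) costs r/(1−p) expected round trips per operation, and everything beyond r is redundant.

```python
def expected_round_trips(p, per_attempt=2):
    """Expected remote round trips per optimistic operation when each
    attempt fails independently with probability p (geometric retries)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return per_attempt / (1.0 - p)

def redundant_round_trips(p, per_attempt=2):
    """Round trips beyond the conflict-free baseline of one attempt."""
    return expected_round_trips(p, per_attempt) - per_attempt

print(expected_round_trips(0.5))  # 4.0: half the traffic is redundant at p = 0.5
```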

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment of our work and the recommendation for minor revision. We address the two major comments below, providing clarifications and committing to improvements in the revised manuscript.

read point-by-point responses
  1. Referee: [§4] §4.1–4.3 and Figure 7: the reported 6.6× throughput gain is presented without naming the exact baseline implementations, the precise YCSB workload mix (read/write ratio, key distribution), number of runs, or error bars; without these the central performance claim cannot be fully assessed.

    Authors: We agree that these experimental details are essential for full assessment and reproducibility. In the revised version we will explicitly name the baseline implementations (the specific state-of-the-art optimistic memory-disaggregated KV stores), state the exact YCSB parameters (50/50 read/write ratio, Zipfian key distribution with the reported skew), report that all results are averages over 5 independent runs, and add error bars to Figure 7. These additions will be incorporated without altering any performance numbers. revision: yes

  2. Referee: [§3.2] §3.2, Algorithm 1: the global write-combining logic is described at a high level but lacks a proof or invariant showing that it eliminates the redundant I/Os identified in §2 without introducing new ordering or consistency violations under the assumed DM model.

    Authors: We acknowledge that a more explicit invariant would strengthen the section. Under the DM model the memory pool is passive and all operations are performed via RDMA; the global write-combiner serializes updates to each key at the compute side before issuing a single write, thereby removing the redundant read-modify-write sequences that arise from optimistic retries. Because pessimistic locks already enforce mutual exclusion and the combiner produces a total order on writes to the same key, no new ordering or consistency violations are introduced. We will add a short paragraph in §3.2 stating this invariant and its relation to the DM assumptions. revision: partial
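The invariant the rebuttal appeals to can be stated as a checkable property: after combining, the single write per key carries the value of the last update in queue order, so the resulting state equals what a plain serial execution of the queue would produce. A property-style sketch (queue order standing in for lock acquisition order; names here are illustrative):

```python
import random
from collections import OrderedDict

def combined_writes(queued):
    """One remote write per key, carrying the last queued value."""
    merged = OrderedDict()
    for key, value in queued:
        merged[key] = value
    return list(merged.items())

def apply(writes):
    """Apply a sequence of (key, value) writes to an empty state."""
    state = {}
    for key, value in writes:
        state[key] = value
    return state

# Property check: combining reaches the same final state as serial
# execution of the full queue, with at most one write per distinct key.
rng = random.Random(42)
queue = [(f"k{rng.randrange(8)}", rng.randrange(1000)) for _ in range(200)]
writes = combined_writes(queue)
assert apply(writes) == apply(queue)
assert len(writes) <= 8
```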

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper is an experimental systems contribution. It identifies a root cause (optimistic synchronization generating redundant cross-node I/Os under high concurrency on disaggregated memory), proposes CIDER with pessimistic synchronization plus global write-combining and contention-aware fallback, and validates the design via YCSB throughput measurements (up to 6.6×). No equations, fitted parameters renamed as predictions, self-definitional constructs, or load-bearing self-citations appear in the abstract or described derivation. The central claims rest on external benchmark results rather than reducing to the paper's own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The paper introduces CIDER as a new optimization framework but does not explicitly list free parameters, axioms, or invented entities; it relies on standard distributed-systems assumptions about network I/O costs and contention behavior.

pith-pipeline@v0.9.0 · 5484 in / 1171 out tokens · 40519 ms · 2026-05-13T18:19:19.584361+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

69 extracted references · 62 distinct works

  1. TATP Benchmark. 2025. https://tatpbenchmark.sourceforge.net/. Accessed: 2025.
  2. Marcos K. Aguilera, Naama Ben-David, Rachid Guerraoui, Antoine Murat, Athanasios Xygkis, and Igor Zablotchi. 2023. uBFT: Microsecond-Scale BFT using Disaggregated Memory. In ASPLOS 2023, Vancouver, BC, Canada. ACM.
  3. Emmanuel Amaro, Christopher Branner-Augmon, Zhihong Luo, Amy Ousterhout, Marcos K. Aguilera, Aurojit Panda, Sylvia Ratnasamy, and Scott Shenker. 2020. Can far memory improve job throughput? In EuroSys '20, Heraklion, Greece. ACM, 14:1–14:16. https://doi.org/10.1145/3342195.3387522
  4. Emmanuel Amaro, Stephanie Wang, Aurojit Panda, and Marcos K. Aguilera. 2023. Logical Memory Pools: Flexible and Local Disaggregated Memory. In HotNets 2023, Cambridge, MA, USA. ACM, 25–32. https://doi.org/10.1145/3626111.3628201
  5. Hang An, Fang Wang, Dan Feng, Xiaomin Zou, Zefeng Liu, and Jianshun Zhang. 2023. Marlin: A Concurrent and Write-Optimized B+-tree Index on Disaggregated Memory. In ICPP 2023, Salt Lake City, UT, USA. ACM, 695–704. https://doi.org/10.1145/3605573.3605576
  6. InfiniBand Trade Association. Accessed: 2025. Enabling the Modern Data Center – RDMA for the Enterprise. https://www.infinibandta.org
  7. Berk Atikoglu, Yuehai Xu, Eitan Frachtenberg, Song Jiang, and Mike Paleczny. 2012. Workload analysis of a large-scale key-value store. In SIGMETRICS/PERFORMANCE 2012. ACM, 53–64.
  8. Shai Bergman, Priyank Faldu, Boris Grot, Lluís Vilanova, and Mark Silberstein. 2022. Reconsidering OS memory optimizations in the presence of disaggregated memory. In ISMM '22, San Diego, CA, USA. ACM, 1–14. https://doi.org/10.1145/3520263.3534650
  9. Irina Calciu, M. Talha Imran, Ivan Puddu, Sanidhya Kashyap, Hasan Al Maruf, Onur Mutlu, and Aasheesh Kolli. 2021. Rethinking software runtimes for disaggregated memory. In ASPLOS '21, Virtual Event, USA. ACM.
  10. Lei Chen, Shi Liu, Chenxi Wang, Haoran Ma, Yifan Qiao, Zhe Wang, Chenggang Wu, Youyou Lu, Xiaobing Feng, Huimin Cui, Shan Lu, and Harry Xu. 2024. A Tale of Two Paths: Toward a Hybrid Data Plane for Efficient Far-Memory Applications. In OSDI 2024, Santa Clara, CA, USA. USENIX Association.
  11. Zhangyu Chen, Yu Hua, Bo Ding, and Pengfei Zuo. 2020. Lock-free Concurrent Level Hashing for Persistent Memory. In USENIX ATC 2020. USENIX Association, 799–812. https://www.usenix.org/conference/atc20/presentation/chen
  12. Dah-Ming Chiu and Raj Jain. 1989. Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks. Comput. Networks 17 (1989), 1–14. https://doi.org/10.1016/0169-7552(89)90019-6
  13. CXL Consortium. Accessed: 2025. Compute Express Link. https://www.computeexpresslink.org
  14. Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. 2010. Benchmarking cloud serving systems with YCSB. In SoCC 2010, Indianapolis, IN, USA. ACM, 143–154. https://doi.org/10.1145/1807128.1807152
  15. NVIDIA Corporation. Accessed: 2025. Advanced Transport. https://docs.nvidia.com/networking/display/ofedv502180/advanced+transport
  16. Ananth Devulapalli and Pete Wyckoff. 2005. Distributed Queue-Based Locking Using Advanced Network Features. In ICPP 2005, Oslo, Norway. IEEE Computer Society, 408–415. https://doi.org/10.1109/ICPP.2005.34
  17. Dmitry Duplyakin, Robert Ricci, Aleksander Maricq, Gary Wong, Jonathon Duerig, Eric Eide, Leigh Stoller, Mike Hibler, David Johnson, Kirk Webb, Aditya Akella, Kuang-Ching Wang, Glenn Ricart, Larry Landweber, Chip Elliott, Michael Zink, Emmanuel Cecchet, Snigdhaswin Kar, and Prabodh Mishra. 2019. The Design and Operation of CloudLab. In USENIX ATC 2019. USENIX Association.
  18. Bin Fan, David G. Andersen, and Michael Kaminsky. 2013. MemC3: Compact and Concurrent MemCache with Dumber Caching and Smarter Hashing. In NSDI 2013, Lombard, IL, USA. USENIX Association, 371–384.
  19. Jian Gao, Qing Wang, and Jiwu Shu. 2025. ShiftLock: Mitigate One-sided RDMA Lock Contention via Handover. In FAST 2025, Santa Clara, CA. USENIX Association, 355–372. https://www.usenix.org/conference/fast25/presentation/gao
  20. Zhiyuan Guo, Yizhou Shan, Xuhao Luo, Yutong Huang, and Yiying Zhang. 2022. Clio: a hardware-software co-designed disaggregated memory system. In ASPLOS '22, Lausanne, Switzerland. ACM, 417–433.
  21. Junhyeok Jang, Hanjin Choi, Hanyeoreum Bae, Seungjun Lee, Miryeong Kwon, and Myoungsoo Jung. 2023. CXL-ANNS: Software-Hardware Collaborative Memory Disaggregation and Computation for Billion-Scale Approximate Nearest Neighbor Search. In USENIX ATC 2023, Boston, MA, USA. USENIX Association. https://www.usenix.org/conference/atc23/presentation/jang
  22. Seung-Seob Lee, Yanpeng Yu, Yupeng Tang, Anurag Khandelwal, Lin Zhong, and Abhishek Bhattacharjee. 2021. MIND: In-Network Memory Management for Disaggregated Data Centers. In SOSP '21, Virtual Event / Koblenz, Germany. ACM, 488–504. https://doi.org/10.1145/3477132.3483561
  23. Se Kwon Lee, Soujanya Ponnapalli, Sharad Singhal, Marcos K. Aguilera, Kimberly Keeton, and Vijay Chidambaram. 2022. DINOMO: An Elastic, Scalable, High-Performance Key-Value Store for Disaggregated Persistent Memory. Proc. VLDB Endow. 15, 13 (2022), 4023–4037. https://www.vldb.org/pvldb/vol15/p4023-lee.pdf
  24. Pengfei Li, Yu Hua, Pengfei Zuo, Zhangyu Chen, and Jiajie Sheng. 2023. ROLEX: A Scalable RDMA-oriented Learned Key-Value Store for Disaggregated Memory Systems. In FAST 2023, Santa Clara, CA, USA. USENIX Association, 99–114.
  25. Xuchuan Luo, Jiacheng Shen, Pengfei Zuo, Xin Wang, Michael R. Lyu, and Yangfan Zhou. 2024. CHIME: A Cache-Efficient and High-Performance Hybrid Index on Disaggregated Memory. In SOSP 2024, Austin, TX, USA. ACM, 110–126. https://doi.org/10.1145/3694715.3695959
  26. Xuchuan Luo, Pengfei Zuo, Jiacheng Shen, Jiazhen Gu, Xin Wang, Michael R. Lyu, and Yangfan Zhou. 2023. SMART: A High-Performance Adaptive Radix Tree for Disaggregated Memory. In OSDI 2023, Boston, MA, USA. USENIX Association, 553–571.
  27. N.A. Lynch and A.A. Shvartsman. 1997. Robust emulation of shared memory using dynamic quorum-acknowledged broadcasts. In Proceedings of IEEE 27th International Symposium on Fault Tolerant Computing. 272–281.
  28. Haoran Ma, Yifan Qiao, Shi Liu, Shan Yu, Yuanjiang Ni, Qingda Lu, Jiesheng Wu, Yiying Zhang, Miryung Kim, and Harry Xu. 2024. DRust: Language-Guided Distributed Shared Memory with Fine Granularity, Full Transparency, and Ultra Efficiency. In OSDI 2024, Santa Clara, CA, USA. USENIX Association.
  29. Hasan Al Maruf, Yuhong Zhong, Hongyi Wang, Mosharaf Chowdhury, Asaf Cidon, and Carl A. Waldspurger. 2023. Memtrade: Marketplace for Disaggregated Memory Clouds. Proc. ACM Meas. Anal. Comput. Syst. 7, 2 (2023), 41:1–41:27. https://doi.org/10.1145/3589985
  30. John M. Mellor-Crummey and Michael L. Scott. 1991. Algorithms for Scalable Synchronization on Shared-Memory Multiprocessors. ACM Trans. Comput. Syst. 9, 1 (1991), 21–65. https://doi.org/10.1145/103727.103729
  31. Memcached Development Team. 2025. Memcached: a distributed memory object caching system. https://memcached.org/. Accessed: 2025.
  32. Xinhao Min, Kai Lu, Pengyu Liu, Jiguang Wan, Changsheng Xie, Daohui Wang, Ting Yao, and Huatao Wu. 2024. SepHash: A Write-Optimized Hash Index On Disaggregated Memory via Separate Segment Structure. Proc. VLDB Endow. 17, 5 (2024), 1091–1104. https://www.vldb.org/pvldb/vol17/p1091-lu.pdf
  33. Sumit Kumar Monga, Sanidhya Kashyap, and Changwoo Min. 2021. Birds of a Feather Flock Together: Scaling RDMA RPCs with Flock. In SOSP '21, Virtual Event / Koblenz, Germany. ACM, 212–227. https://doi.org/10.1145/3477132.3483576
  34. Sundeep Narravula, A. Marnidala, Abhinav Vishnu, Karthikeyan Vaidyanathan, and Dhabaleswar K. Panda. 2007. High Performance Distributed Lock Management Services using Network-based Remote Atomic Operations. In CCGrid 2007, Rio de Janeiro, Brazil. IEEE Computer Society.
  35. Vlad Nitu, Boris Teabe, Alain Tchana, Canturk Isci, and Daniel Hagimont. 2018. Welcome to zombieland: practical and energy-efficient memory disaggregation in a datacenter. In EuroSys 2018, Porto, Portugal. ACM, 16:1–16:12. https://doi.org/10.1145/3190508.3190537
  36. Feng Ren, Mingxing Zhang, Kang Chen, Huaxia Xia, Zuoning Chen, and Yongwei Wu. 2024. Scaling Up Memory Disaggregated Applications with SMART. In ASPLOS 2024, La Jolla, CA, USA. ACM, 351–367.
  37. Zhenyuan Ruan, Malte Schwarzkopf, Marcos K. Aguilera, and Adam Belay. 2020. AIFM: High-Performance, Application-Integrated Far Memory. In OSDI 2020, Virtual Event. USENIX Association, 315–332. https://www.usenix.org/conference/osdi20/presentation/ruan
  38. Salvatore Sanfilippo and Redis Ltd. 2025. Redis. https://redis.io. Accessed: 2025.
  39. Michael L. Scott. 2013. Shared-Memory Synchronization. Morgan & Claypool Publishers. https://doi.org/10.2200/S00499ED1V01Y201304CAC023
  40. Yizhou Shan, Yutong Huang, Yilun Chen, and Yiying Zhang. 2018. LegoOS: A Disseminated, Distributed OS for Hardware Resource Disaggregation. In OSDI 2018, Carlsbad, CA, USA. USENIX Association, 69–87. https://www.usenix.org/conference/osdi18/presentation/shan
  41. Jiacheng Shen, Pengfei Zuo, Xuchuan Luo, Yuxin Su, Jiazhen Gu, Hao Feng, Yangfan Zhou, and Michael R. Lyu. 2023. Ditto: An Elastic and Adaptive Memory-Disaggregated Caching System. In SOSP 2023, Koblenz, Germany. ACM, 675–691. https://doi.org/10.1145/3600006.3613144
  42. Jiacheng Shen, Pengfei Zuo, Xuchuan Luo, Tianyi Yang, Yuxin Su, Yangfan Zhou, and Michael R. Lyu. 2023. FUSEE: A Fully Memory-Disaggregated Key-Value Store. In FAST 2023, Santa Clara, CA, USA. USENIX Association, 81–98. https://www.usenix.org/conference/fast23/presentation/shen
  43. Transaction Processing Performance Council. 2025. TPC Benchmark C (TPC-C). https://www.tpc.org/tpcc/. Accessed: 2025.
  44. Lluís Vilanova, Lina Maudlej, Shai Bergman, Till Miemietz, Matthias Hille, Nils Asmussen, Michael Roitzsch, Hermann Härtig, and Mark Silberstein. 2022. Slashing the disaggregation tax in heterogeneous data centers with FractOS. In EuroSys '22, Rennes, France. ACM, 352–367.
  45. Chenxi Wang, Haoran Ma, Shi Liu, Yuanqi Li, Zhenyuan Ruan, Khanh Nguyen, Michael D. Bond, Ravi Netravali, Miryung Kim, and Guoqing Harry Xu. 2020. Semeru: A Memory-Disaggregated Managed Runtime. In OSDI 2020, Virtual Event. USENIX Association, 261–280.
  46. Chenxi Wang, Haoran Ma, Shi Liu, Yifan Qiao, Jonathan Eyolfson, Christian Navasca, Shan Lu, and Guoqing Harry Xu. 2022. MemLiner: Lining up Tracing and Application for a Far-Memory-Friendly Runtime. In OSDI 2022, Carlsbad, CA, USA. USENIX Association, 35–53.
  47. Qing Wang, Youyou Lu, and Jiwu Shu. 2022. Sherman: A Write-Optimized Distributed B+Tree Index on Disaggregated Memory. In SIGMOD '22, Philadelphia, PA, USA. ACM, 1033–1048. https://doi.org/10.1145/3514221.3517824
  48. Qing Wang, Youyou Lu, and Jiwu Shu. 2025. Designing an Efficient Tree Index on Disaggregated Memory. Commun. ACM 68, 5 (2025), 92–100. https://doi.org/10.1145/3709647
  49. Qing Wang, Youyou Lu, Erci Xu, Junru Li, Youmin Chen, and Jiwu Shu. 2021. Concordia: Distributed Shared Memory with In-Network Cache Coherence. In FAST 2021. USENIX Association, 277–292. https://www.usenix.org/conference/fast21/presentation/wang
  50. Xingda Wei, Zhiyuan Dong, Rong Chen, and Haibo Chen. 2018. Deconstructing RDMA-enabled Distributed Transactions: Hybrid is Better! In OSDI 2018, Carlsbad, CA, USA. USENIX Association, 233–251. https://www.usenix.org/conference/osdi18/presentation/wei
  51. Xingda Wei, Jiaxin Shi, Yanzhe Chen, Rong Chen, and Haibo Chen. 2015. Fast in-memory transaction processing using RDMA and HTM. In SOSP 2015, Monterey, CA, USA. ACM, 87–104. https://doi.org/10.1145/2815400.2815419
  52. Juncheng Yang, Yao Yue, and KV Rashmi. 2020. A large scale analysis of hundreds of in-memory cache clusters at Twitter. In OSDI 2020. USENIX Association, 191–208.
  53. Dong Young Yoon, Mosharaf Chowdhury, and Barzan Mozafari. 2018. Distributed Lock Management with RDMA: Decentralization without Starvation. In SIGMOD 2018, Houston, TX, USA. ACM, 1571–1586. https://doi.org/10.1145/3183713.3196890
  54. Zhuolong Yu, Yiwen Zhang, Vladimir Braverman, Mosharaf Chowdhury, and Xin Jin. 2020. NetLock: Fast, Centralized Lock Management Using Programmable Switches. In SIGCOMM '20. ACM.
  55. Daniel Zahka and Ada Gavrilovska. 2022. FAM-Graph: Graph Analytics on Disaggregated Memory. In IPDPS 2022, Lyon, France. IEEE, 81–92. https://doi.org/10.1109/IPDPS53621.2022.00017
  56. Hanze Zhang, Ke Cheng, Rong Chen, and Haibo Chen. 2024. Fast and Scalable In-network Lock Management Using Lock Fission. In OSDI 2024, Santa Clara, CA, USA. USENIX Association, 251–268. https://www.usenix.org/conference/osdi24/presentation/zhang-hanze
  57. Ming Zhang, Yu Hua, and Zhijun Yang. 2024. Motor: Enabling Multi-Versioning for Distributed Transactions on Disaggregated Memory. In OSDI 2024, Santa Clara, CA, USA. USENIX Association, 801–819. https://www.usenix.org/conference/osdi24/presentation/zhang-ming
  58. Ming Zhang, Yu Hua, Pengfei Zuo, and Lurong Liu. 2022. FORD: Fast One-sided RDMA-based Distributed Transactions for Disaggregated Persistent Memory. In FAST 2022, Santa Clara, CA, USA. USENIX Association, 51–68. https://www.usenix.org/conference/fast22/presentation/zhang-ming
  59. Qizhen Zhang, Philip A. Bernstein, Daniel S. Berger, and Badrish Chandramouli. 2021. Redy: Remote Dynamic Memory Cache. Proc. VLDB Endow. 15, 4 (2021), 766–779. https://www.vldb.org/pvldb/vol15/p766-zhang.pdf
  60. Qizhen Zhang, Yifan Cai, Sebastian Angel, Vincent Liu, Ang Chen, and Boon Thau Loo. 2020. Rethinking Data Management Systems for Disaggregated Data Centers. In CIDR 2020, Amsterdam, The Netherlands. http://cidrdb.org/cidr2020/papers/p6-zhang-cidr20.pdf
  61. Qizhen Zhang, Xinyi Chen, Sidharth Sankhe, Zhilei Zheng, Ke Zhong, Sebastian Angel, Ang Chen, Vincent Liu, and Boon Thau Loo. 2022. Optimizing Data-intensive Systems in Disaggregated Data Centers with TELEPORT. In SIGMOD '22, Philadelphia, PA, USA. ACM, 1345–1359.
  62. Pengfei Zuo, Jiazhao Sun, Liu Yang, Shuangwu Zhang, and Yu Hua. 2021. One-sided RDMA-Conscious Extendible Hashing for Disaggregated Memory. In USENIX ATC 2021. USENIX Association, 15–29. https://www.usenix.org/conference/atc21/presentation/zuo