Leveraging I/O Stalls for Efficient Scheduling in ANNS
Pith reviewed 2026-05-20 02:43 UTC · model grok-4.3
The pith
LIOS runs ANNS index updates inside search-thread I/O stalls to deliver up to 2.68× faster insertions while holding latency near a user target.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a scheduling framework called LIOS can execute index updates by filling the I/O stall intervals already present in search threads. It achieves this through three coordinated techniques: breaking updates into resumable subtasks that fit inside individual stall windows, limiting the expected overrun of any subtask to a chosen bound, and continuously adjusting the fraction of available stall time allocated to updates so that end-to-end search latency degradation converges on the user-specified target. When applied to two existing update-optimized disk-based ANNS systems, the approach yields insertion speedups of up to 2.68× and deletion speedups of up to 2.18× while保持着
What carries the argument
LIOS framework that splits updates into resumable subtasks sized for single I/O stalls, bounds expected overruns, and dynamically adjusts the idle-time fraction allocated to updates.
If this is right
- Search threads become a source of free CPU cycles for background index maintenance without extra thread creation.
- Update throughput rises while search latency degradation is kept near a configurable target rather than left uncontrolled.
- Existing update-optimized ANNS systems can adopt the scheduling logic with only modest internal changes.
- The fraction of stall time given to updates self-tunes across varying query and update mixes.
Where Pith is reading between the lines
- The same stall-filling idea could be tested on other I/O-heavy services such as key-value stores or log-structured merge trees where query threads also block on disk.
- If stall durations prove more variable than assumed, the overrun-bounding technique might need an online estimator of recent stall lengths.
- Applying the method to compaction or garbage-collection tasks inside the same ANNS index could further reduce write amplification.
Load-bearing premise
Every update operation can be divided into small enough resumable pieces that reliably fit inside one I/O stall window while still allowing the expected overrun to be bounded to a chosen threshold.
What would settle it
Run the same workload on the modified systems but with updates that cannot be split into subtasks small enough to fit typical stall windows; check whether insertion and deletion speedups remain above 2× and whether search latency stays within the user-specified degradation bound.
Figures
read the original abstract
Disk-based graph indexes for approximate nearest neighbor search (ANNS) must serve latency-sensitive queries and throughput-demanding updates concurrently. We observe that over 40% of search-thread CPU time is spent stalling on disk I/O; such idle cycles are invisible to thread-level scheduling yet available for other work. We present LIOS(Leverage I/O Stall), a framework that executes index updates inside search-side I/O stall windows. LIOS introduces three techniques: (i) splitting each update into resumable subtasks small enough to fit within a single stall window; (ii) bounding the expected overrun of update subtasks to a given threshold; and (iii) dynamically adjusting the fraction of idle time devoted to updates to drive end-to-end search latency degradation toward a user-specified target. We integrate LIOS into two update-optimized ANNS systems, FreshDiskANN and OdinANN. LIOS achieves speedups of up to 2.68$\times$ in insertion and 2.18$\times$ in deletion, with search latency degradation maintained near the user-specified target.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces LIOS, a framework for disk-based graph ANNS indexes that exploits I/O stall time in search threads to execute concurrent updates. It presents three techniques—splitting updates into resumable subtasks, bounding expected overrun via statistical estimators, and a feedback controller for dynamic idle-time allocation—and integrates them into FreshDiskANN and OdinANN. The central empirical claim is that these mechanisms deliver up to 2.68× insertion and 2.18× deletion speedups while keeping search latency degradation near a user-specified target.
Significance. If the reported speedups hold under broader workloads, the work provides a practical method for increasing update throughput in latency-sensitive disk-based ANNS systems by reclaiming otherwise idle CPU cycles. Direct integration and measurement on two production-oriented systems, rather than simulation or modeling, strengthens the evidence. The approach treats the 40 % stall observation as an empirical input rather than a universal premise, which is appropriate.
major comments (1)
- The load-bearing assumption that every update can be decomposed into resumable subtasks whose execution times fit inside typical I/O stall windows (and whose overrun can still be bounded) is only partially supported by the checkpointing description. A quantitative characterization of subtask size distribution or worst-case overrun probability would be needed to confirm that the target latency degradation remains controllable across varying graph densities.
minor comments (3)
- The abstract omits workload descriptions, baseline systems, and any mention of variance or number of runs; adding one sentence with these details would make the performance claims immediately assessable.
- Figure captions and axis labels for the latency-degradation plots should explicitly state the chosen overrun threshold and target degradation values used in each experiment.
- A short related-work paragraph contrasting LIOS with prior I/O-aware or stall-utilization schedulers in databases or storage systems would help readers situate the contribution.
Simulated Author's Rebuttal
We thank the referee for the positive assessment and recommendation for minor revision. We address the single major comment below.
read point-by-point responses
-
Referee: The load-bearing assumption that every update can be decomposed into resumable subtasks whose execution times fit inside typical I/O stall windows (and whose overrun can still be bounded) is only partially supported by the checkpointing description. A quantitative characterization of subtask size distribution or worst-case overrun probability would be needed to confirm that the target latency degradation remains controllable across varying graph densities.
Authors: We agree that additional quantitative support would strengthen the manuscript. Section 4 describes the decomposition of updates into resumable subtasks at natural boundaries (e.g., per-edge or per-vector operations) chosen to fit within observed I/O stalls, together with the statistical overrun estimator. To directly address the concern, we will add an empirical characterization in the revision: subtask execution-time distributions measured across graphs with varying densities (average degree 20–200), plus the resulting overrun probabilities under the bounding estimator. This will confirm controllability of latency degradation for the reported workloads. revision: yes
Circularity Check
No significant circularity
full rationale
The paper is an empirical systems contribution that observes I/O stall behavior in disk-based ANNS indexes (an input measurement) and implements three concrete scheduling techniques inside existing systems (FreshDiskANN and OdinANN). Reported speedups are obtained from direct integration and runtime measurements rather than from any model-derived prediction, fitted parameter, or first-principles derivation. No equations, uniqueness theorems, or ansatzes appear in the provided text that reduce to self-definition, self-citation chains, or renaming of known results. The work is therefore self-contained against external benchmarks and receives the default non-circularity finding.
Axiom & Free-Parameter Ledger
free parameters (2)
- overrun threshold
- target latency degradation
axioms (1)
- domain assumption Search threads spend over 40% of CPU time stalled on disk I/O
Reference graph
Works this paper leans on
-
[1]
Approximate nearest neighbor queries in fixed dimensions
Sunil Arya and David M Mount. Approximate nearest neighbor queries in fixed dimensions. InACM-SIAM Symposium on Discrete Algorithms (SODA), 1993
work page 1993
-
[2]
Efficient index- ing of billion-scale datasets of deep descriptors
Artem Babenko and Victor Lempitsky. Efficient index- ing of billion-scale datasets of deep descriptors. InIEEE Conference on Computer Vision and Pattern Recogni- tion (CVPR), 2016
work page 2016
-
[3]
SPANN: Highly-efficient billion-scale approxi- mate nearest neighborhood search
Qi Chen, Bing Zhao, Haidong Wang, Mingqin Li, Chuanjie Liu, Zengzhong Li, Mao Yang, and Jingdong Wang. SPANN: Highly-efficient billion-scale approxi- mate nearest neighborhood search. InAdvances in Neu- ral Information Processing Systems (NeurIPS), 2021
work page 2021
-
[4]
Cong Fu, Chao Xiang, Changxu Wang, and Deng Cai. Fast approximate nearest neighbor search with the navi- gating spreading-out graph.Proceedings of the VLDB Endowment (PVLDB), 2019
work page 2019
-
[5]
FreshDiskANN baseline imple- mentation
g4197. FreshDiskANN baseline imple- mentation. https://github.com/g4197/ FreshDiskANN-baseline, 2024. Commit8ea2f4d
work page 2024
-
[6]
Jianyang Gao and Cheng Long. RaBitQ: Quantizing high-dimensional vectors with a theoretical error bound for approximate nearest neighbor search. InACM SIG- MOD International Conference on Management of Data (SIGMOD), 2024
work page 2024
-
[7]
Achieving low-latency graph- based vector search via aligning best-first search algo- rithm with SSD
Hao Guo and Youyou Lu. Achieving low-latency graph- based vector search via aligning best-first search algo- rithm with SSD. InUSENIX Symposium on Operating Systems Design and Implementation (OSDI), 2025
work page 2025
-
[8]
Hao Guo and Youyou Lu. PipeANN implementa- tion. https://github.com/thustorage/PipeANN,
-
[9]
Hao Guo and Youyou Lu. OdinANN: Direct insert for consistently stable performance in billion-scale graph- based vector search. InUSENIX Conference on File and Storage Technologies (FAST), 2026
work page 2026
-
[10]
Ac- celerating large-scale inference with anisotropic vector quantization
Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, and Sanjiv Kumar. Ac- celerating large-scale inference with anisotropic vector quantization. InInternational Conference on Machine Learning (ICML), 2020
work page 2020
-
[11]
DiskANN: Fast accurate billion-point nearest neighbor search on a single node
Suhas Jayaram Subramanya, Fnu Devvrit, Harsha Vard- han Simhadri, Ravishankar Krishnawamy, and Rohan Kadekodi. DiskANN: Fast accurate billion-point nearest neighbor search on a single node. InAdvances in Neural Information Processing Systems (NeurIPS), 2019
work page 2019
-
[12]
Herve Jegou, Matthijs Douze, and Cordelia Schmid. Product quantization for nearest neighbor search.IEEE Transactions on Pattern Analysis and Machine Intelli- gence, 2010
work page 2010
-
[13]
Searching in one billion vectors: re-rank with source coding
Hervé Jégou, Romain Tavenard, Matthijs Douze, and Laurent Amsaleg. Searching in one billion vectors: re-rank with source coding. InIEEE International Con- ference on Acoustics, Speech, and Signal Processing (ICASSP), 2011
work page 2011
-
[14]
Billion- scale similarity search with gpus.IEEE Transactions on Big Data, 2019
Jeff Johnson, Matthijs Douze, and Hervé Jégou. Billion- scale similarity search with gpus.IEEE Transactions on Big Data, 2019
work page 2019
-
[15]
I/o passthru: Upstreaming a flexible and efficient i/o path in linux
Kanchan Joshi, Anuj Gupta, Javier González, Ankit Ku- mar, Krishna Kanth Reddy, Arun George, Simon Lund, and Jens Axboe. I/o passthru: Upstreaming a flexible and efficient i/o path in linux. InUSENIX Conference on File and Storage Technologies (FAST), 2024
work page 2024
-
[16]
Dingyi Kang, Dongming Jiang, Hanshen Yang, Hang Liu, and Bingzhe Li. Scalable disk-based approximate nearest neighbor search with page-aligned graph.arXiv preprint arXiv:2509.25487, 2025
-
[17]
Retrieval-augmented generation for knowledge- intensive NLP tasks
Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, et al. Retrieval-augmented generation for knowledge- intensive NLP tasks. InAdvances in Neural Information Processing Systems (NeurIPS), 2020
work page 2020
-
[18]
Liang Li, Shufeng Gong, Yanan Yang, Yiduo Wang, and Jie Wu. I/o optimizations for graph-based disk- resident approximate nearest neighbor search: A design space exploration.Proceedings of the VLDB Endow- ment (PVLDB), 2026
work page 2026
-
[19]
Yu A Malkov and Dmitry A Yashunin. Efficient and robust approximate nearest neighbor search using hierar- chical navigable small world graphs.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018. 13
work page 2018
-
[20]
SPACEV1B: A billion-scale vector dataset for text descriptors
Microsoft. SPACEV1B: A billion-scale vector dataset for text descriptors. https: //github.com/microsoft/SPTAG/tree/master/ datasets/SPACEV1B, 2020
work page 2020
-
[21]
Emanuel Parzen. On estimation of a probability den- sity function and mode.The Annals of Mathematical Statistics, 1962
work page 1962
-
[22]
Aditi Singh, Suhas Jayaram Subramanya, Ravis- hankar Krishnaswamy, and Harsha Vardhan Simhadri. FreshDiskANN: A fast and accurate graph-based ann index for streaming similarity search.arXiv preprint arXiv:2105.09613, 2021
-
[23]
Soar: Improved indexing for approx- imate nearest neighbor search
Philip Sun, David Simcha, Dave Dopson, Ruiqi Guo, and Sanjiv Kumar. Soar: Improved indexing for approx- imate nearest neighbor search. InAdvances in Neural Information Processing Systems (NeurIPS), 2023
work page 2023
-
[24]
Mengzhao Wang, Weizhi Xu, Xiaomeng Yi, Songlin Wu, Zhangyang Peng, Xiangyu Ke, Yunjun Gao, Xiao- liang Xu, Rentong Guo, and Charles Xie. Starling: An I/O-efficient disk-resident graph index framework for high-dimensional vector similarity search on data seg- ment. InACM SIGMOD International Conference on Management of Data (SIGMOD), 2024
work page 2024
-
[25]
Mengzhao Wang, Xiaoliang Xu, Qiang Yue, and Yuxi- ang Wang. A comprehensive survey and experimental comparison of graph-based approximate nearest neigh- bor search.Proceedings of the VLDB Endowment (PVLDB), 2021
work page 2021
-
[26]
In-place updates of a graph in- dex for streaming approximate nearest neighbor search
Haike Xu, Magdalen Dobson Manohar, Philip A Bern- stein, Badrish Chandramouli, Richard Wen, and Har- sha Vardhan Simhadri. In-place updates of a graph in- dex for streaming approximate nearest neighbor search. arXiv preprint arXiv:2502.13826, 2025
-
[27]
SPFresh: Incremental in-place update for billion-scale vector search
Yuming Xu, Hengyu Liang, Jin Li, Shuotao Xu, Qi Chen, Qianxi Zhang, Cheng Li, Ziyue Yang, Fan Yang, Yuqing Yang, et al. SPFresh: Incremental in-place update for billion-scale vector search. InACM Symposium on Op- erating Systems Principles (SOSP), 2023
work page 2023
-
[28]
Gorgeous: Revisiting the data layout for disk-resident high-dimensional vector search
Peiqi Yin, Xiao Yan, Qihui Zhou, Hui Li, Xiaolu Li, Lin Zhang, Meiling Wang, Xin Yao, and James Cheng. Gorgeous: Revisiting the data layout for disk-resident high-dimensional vector search. InACM SIGMOD In- ternational Conference on Management of Data (SIG- MOD), 2026
work page 2026
-
[29]
Weichen Zhao, Yuncheng Lu, Yao Tian, Hao Zhang, Jiehui Li, Minghao Zhao, Yakun Li, and Weining Qian. Optimizing SSD-resident graph indexing for high-throughput vector search.arXiv preprint arXiv:2602.22805, 2026. 14 A Additional Experiments We report two additional experiments on SIFT-10M using Testbed B. We first study how the observed latency statistics...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.