pith. sign in

arxiv: 2605.19335 · v1 · pith:BJG45WGCnew · submitted 2026-05-19 · 💻 cs.DB

Leveraging I/O Stalls for Efficient Scheduling in ANNS

Pith reviewed 2026-05-20 02:43 UTC · model grok-4.3

classification 💻 cs.DB
keywords Approximate Nearest Neighbor SearchDisk-based Graph IndexesI/O StallsConcurrent UpdatesSchedulingLatency ControlFreshDiskANN
0
0 comments X

The pith

LIOS runs ANNS index updates inside search-thread I/O stalls to deliver up to 2.68× faster insertions while holding latency near a user target.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that disk-based ANNS systems waste over 40% of search-thread CPU time on disk I/O stalls that are invisible to conventional schedulers. These idle windows can be repurposed for index updates without adding threads or violating latency goals. LIOS does this by splitting each update into resumable subtasks sized to fit single stalls, bounding the expected overrun of those subtasks, and dynamically tuning the share of idle time given to updates so that overall search latency stays close to a chosen threshold. The result matters for any disk-resident ANNS index that must serve latency-sensitive queries while ingesting or deleting data at high rates, because it turns an existing waste of CPU cycles into useful work. When integrated into FreshDiskANN and OdinANN, the method produces the reported speedups while keeping measured latency degradation near the target.

Core claim

The central claim is that a scheduling framework called LIOS can execute index updates by filling the I/O stall intervals already present in search threads. It achieves this through three coordinated techniques: breaking updates into resumable subtasks that fit inside individual stall windows, limiting the expected overrun of any subtask to a chosen bound, and continuously adjusting the fraction of available stall time allocated to updates so that end-to-end search latency degradation converges on the user-specified target. When applied to two existing update-optimized disk-based ANNS systems, the approach yields insertion speedups of up to 2.68× and deletion speedups of up to 2.18× while保持着

What carries the argument

LIOS framework that splits updates into resumable subtasks sized for single I/O stalls, bounds expected overruns, and dynamically adjusts the idle-time fraction allocated to updates.

If this is right

  • Search threads become a source of free CPU cycles for background index maintenance without extra thread creation.
  • Update throughput rises while search latency degradation is kept near a configurable target rather than left uncontrolled.
  • Existing update-optimized ANNS systems can adopt the scheduling logic with only modest internal changes.
  • The fraction of stall time given to updates self-tunes across varying query and update mixes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same stall-filling idea could be tested on other I/O-heavy services such as key-value stores or log-structured merge trees where query threads also block on disk.
  • If stall durations prove more variable than assumed, the overrun-bounding technique might need an online estimator of recent stall lengths.
  • Applying the method to compaction or garbage-collection tasks inside the same ANNS index could further reduce write amplification.

Load-bearing premise

Every update operation can be divided into small enough resumable pieces that reliably fit inside one I/O stall window while still allowing the expected overrun to be bounded to a chosen threshold.

What would settle it

Run the same workload on the modified systems but with updates that cannot be split into subtasks small enough to fit typical stall windows; check whether insertion and deletion speedups remain above 2× and whether search latency stays within the user-specified degradation bound.

Figures

Figures reproduced from arXiv: 2605.19335 by Juncheng Zhang, Patrick P.C. Lee, Yongkun Li, Yuanming Ren.

Figure 1
Figure 1. Figure 1: DiskANN organizes its index by storing raw vectors to￾gether with their out-neighbors on disk, while keeping a compressed representation of the vectors in memory to enable fast similarity comparisons. latency. We will open-source our LIOS prototype in the final version of the paper. 2 Background 2.1 Basics of Graph-based ANNS Systems We consider graph-based ANNS systems for vector storage in disk-based sce… view at source ↗
Figure 3
Figure 3. Figure 3: Proportion of prune and search times in the overall update time in SIFT-10M. Note that simple thread-level optimizations, such as adding search threads or reallocating cores, cannot readily reclaim this idle time. A stalled search thread mid-query occupies its CPU context and cannot be preempted with another query without incurring context-switch overhead and break￾ing query-level state. The stall is intra… view at source ↗
Figure 2
Figure 2. Figure 2: Idle ratios of BeamSearch and PipeSearch across dataset scales and index configurations. high computational cost that dominates index updates (§3.2). 3.1 Unexploited Idle Time in Search Search threads in graph-based ANNS systems spend a sub￾stantial fraction of their time waiting on disk I/O, consistently across different graph indexes and system configurations. We define the idle ratio as the fraction of … view at source ↗
Figure 4
Figure 4. Figure 4: Overview of LIOS’s co-execution design. strained by CPU computations rather than I/O. Observation #2. Index updates in graph-based ANNS systems are bottlenecked by CPU-intensive neighbor pruning, which accounts for 40.9–93.3% of total update time across both insertions and deletions. 4 LIOS Design 4.1 Overview Our observations in §3 reveal a natural complementarity: search threads spend over 40% of their C… view at source ↗
Figure 6
Figure 6. Figure 6: Temporal stability of idle-time distributions on SIFT-10M, including the KDEs of four independently sampled time windows and the aggregate of all samples. Top row: PipeSearch idle time; bottom row: BeamSearch I/O latency at #IO=4. outer loop advances to find the next non-done candidate: i = 3 is skipped (done[3] = T), and c4 is selected at i = 4 as the third neighbor and added to result. Now, we have |resu… view at source ↗
Figure 7
Figure 7. Figure 7: Exp #2 (Overrun-bounded time budgeting): SIFT-10M, Testbed B; θ=5%. Top row: BeamSearch (I/O time); bottom row: PipeSearch (idle time). Exp #2 (Overrun-bounded time budgeting) [PITH_FULL_IMAGE:figures/full_fig_p009_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Exp #3 (Adaptive utilization tuning): Utilization ratio and latency overrun over time on SIFT-100M. Delete phase Merge phase Confidence Interval Na¨ıve +Intra +Est. +UTuner +Prefetch Baseline 0 20 40 Latency (ms) 37.9 8.4 8.6 7.7 7.6 7.3 Na¨ıve +Intra +Est. +UTuner +Prefetch Baseline 0 5 10 15 Latency (ms) 15.4 8.3 8.5 7.7 7.5 7.3 (a) Delete-phase latency (b) Insert-phase latency Na¨ıve +Intra +Est. +UTune… view at source ↗
Figure 9
Figure 9. Figure 9: Exp #4 (Breakdown analysis): SIFT-10M (R=32, θ=5%). Components added cumulatively from naive co-execution. (a, b) Search latency; (c, d) update speedup vs. w/o LIOS. • w/o utilization tuner: use a fixed utilization ratio instead of LIOS’s adaptive control. • w/o cache prefetching: disable prefetching of search￾critical data after each co-execution episode [PITH_FULL_IMAGE:figures/full_fig_p010_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Exp #5 (Latency threshold sensitivity): SIFT-10M, Testbed B; R=96. Top row: BeamSearch; bottom row: PipeSearch. Delete Speedup Merge Speedup w/o LIOS w/ LIOS 4/28 8/24 12/20 16/16 20/12 24/8 26/6 27/5 28/4 Search/Update threads 0 1 2 Speedup 4/28 8/24 12/20 16/16 20/12 24/8 26/6 27/5 28/4 Search/Update threads 0.0 2.5 5.0 7.5 Latency (ms) 4/28 8/24 12/20 16/16 20/12 24/8 26/6 27/5 28/4 Search/Update threa… view at source ↗
Figure 11
Figure 11. Figure 11: Exp #6 (Thread configuration sensitivity): SIFT-10M, Testbed B; R=96, θ = 5%. Top: BeamSearch; bottom: PipeSearch. Left: speedup; right: mean search latency (w/o LIOS vs. w/ LIOS). exhibits the same pattern, reaching 40% and 42% at the 28/4 split. This is expected: more search threads generate more I/O idle periods, creating additional overlap opportunities for LIOS. Notably, LIOS provides meaningful spee… view at source ↗
Figure 12
Figure 12. Figure 12: shows that this added state is small: each check￾point remains at the KiB level; for example, selecting R=96 from a candidate pool of size 500 requires less than 10 KiB [PITH_FULL_IMAGE:figures/full_fig_p015_12.png] view at source ↗
read the original abstract

Disk-based graph indexes for approximate nearest neighbor search (ANNS) must serve latency-sensitive queries and throughput-demanding updates concurrently. We observe that over 40% of search-thread CPU time is spent stalling on disk I/O; such idle cycles are invisible to thread-level scheduling yet available for other work. We present LIOS(Leverage I/O Stall), a framework that executes index updates inside search-side I/O stall windows. LIOS introduces three techniques: (i) splitting each update into resumable subtasks small enough to fit within a single stall window; (ii) bounding the expected overrun of update subtasks to a given threshold; and (iii) dynamically adjusting the fraction of idle time devoted to updates to drive end-to-end search latency degradation toward a user-specified target. We integrate LIOS into two update-optimized ANNS systems, FreshDiskANN and OdinANN. LIOS achieves speedups of up to 2.68$\times$ in insertion and 2.18$\times$ in deletion, with search latency degradation maintained near the user-specified target.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 3 minor

Summary. The manuscript introduces LIOS, a framework for disk-based graph ANNS indexes that exploits I/O stall time in search threads to execute concurrent updates. It presents three techniques—splitting updates into resumable subtasks, bounding expected overrun via statistical estimators, and a feedback controller for dynamic idle-time allocation—and integrates them into FreshDiskANN and OdinANN. The central empirical claim is that these mechanisms deliver up to 2.68× insertion and 2.18× deletion speedups while keeping search latency degradation near a user-specified target.

Significance. If the reported speedups hold under broader workloads, the work provides a practical method for increasing update throughput in latency-sensitive disk-based ANNS systems by reclaiming otherwise idle CPU cycles. Direct integration and measurement on two production-oriented systems, rather than simulation or modeling, strengthens the evidence. The approach treats the 40 % stall observation as an empirical input rather than a universal premise, which is appropriate.

major comments (1)
  1. The load-bearing assumption that every update can be decomposed into resumable subtasks whose execution times fit inside typical I/O stall windows (and whose overrun can still be bounded) is only partially supported by the checkpointing description. A quantitative characterization of subtask size distribution or worst-case overrun probability would be needed to confirm that the target latency degradation remains controllable across varying graph densities.
minor comments (3)
  1. The abstract omits workload descriptions, baseline systems, and any mention of variance or number of runs; adding one sentence with these details would make the performance claims immediately assessable.
  2. Figure captions and axis labels for the latency-degradation plots should explicitly state the chosen overrun threshold and target degradation values used in each experiment.
  3. A short related-work paragraph contrasting LIOS with prior I/O-aware or stall-utilization schedulers in databases or storage systems would help readers situate the contribution.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the positive assessment and recommendation for minor revision. We address the single major comment below.

read point-by-point responses
  1. Referee: The load-bearing assumption that every update can be decomposed into resumable subtasks whose execution times fit inside typical I/O stall windows (and whose overrun can still be bounded) is only partially supported by the checkpointing description. A quantitative characterization of subtask size distribution or worst-case overrun probability would be needed to confirm that the target latency degradation remains controllable across varying graph densities.

    Authors: We agree that additional quantitative support would strengthen the manuscript. Section 4 describes the decomposition of updates into resumable subtasks at natural boundaries (e.g., per-edge or per-vector operations) chosen to fit within observed I/O stalls, together with the statistical overrun estimator. To directly address the concern, we will add an empirical characterization in the revision: subtask execution-time distributions measured across graphs with varying densities (average degree 20–200), plus the resulting overrun probabilities under the bounding estimator. This will confirm controllability of latency degradation for the reported workloads. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper is an empirical systems contribution that observes I/O stall behavior in disk-based ANNS indexes (an input measurement) and implements three concrete scheduling techniques inside existing systems (FreshDiskANN and OdinANN). Reported speedups are obtained from direct integration and runtime measurements rather than from any model-derived prediction, fitted parameter, or first-principles derivation. No equations, uniqueness theorems, or ansatzes appear in the provided text that reduce to self-definition, self-citation chains, or renaming of known results. The work is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 0 invented entities

The approach rests on an empirical observation about stall time and on the engineering premise that updates can be decomposed into small resumable pieces.

free parameters (2)
  • overrun threshold
    User-chosen bound on expected subtask overrun that controls how aggressively updates are scheduled.
  • target latency degradation
    User-specified search latency increase that the dynamic adjustment loop tries to approach.
axioms (1)
  • domain assumption Search threads spend over 40% of CPU time stalled on disk I/O
    Stated as the source of usable idle cycles; taken as given from the authors' measurements.

pith-pipeline@v0.9.0 · 5724 in / 1267 out tokens · 61672 ms · 2026-05-20T02:43:50.320693+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages

  1. [1]

    Approximate nearest neighbor queries in fixed dimensions

    Sunil Arya and David M Mount. Approximate nearest neighbor queries in fixed dimensions. InACM-SIAM Symposium on Discrete Algorithms (SODA), 1993

  2. [2]

    Efficient index- ing of billion-scale datasets of deep descriptors

    Artem Babenko and Victor Lempitsky. Efficient index- ing of billion-scale datasets of deep descriptors. InIEEE Conference on Computer Vision and Pattern Recogni- tion (CVPR), 2016

  3. [3]

    SPANN: Highly-efficient billion-scale approxi- mate nearest neighborhood search

    Qi Chen, Bing Zhao, Haidong Wang, Mingqin Li, Chuanjie Liu, Zengzhong Li, Mao Yang, and Jingdong Wang. SPANN: Highly-efficient billion-scale approxi- mate nearest neighborhood search. InAdvances in Neu- ral Information Processing Systems (NeurIPS), 2021

  4. [4]

    Fast approximate nearest neighbor search with the navi- gating spreading-out graph.Proceedings of the VLDB Endowment (PVLDB), 2019

    Cong Fu, Chao Xiang, Changxu Wang, and Deng Cai. Fast approximate nearest neighbor search with the navi- gating spreading-out graph.Proceedings of the VLDB Endowment (PVLDB), 2019

  5. [5]

    FreshDiskANN baseline imple- mentation

    g4197. FreshDiskANN baseline imple- mentation. https://github.com/g4197/ FreshDiskANN-baseline, 2024. Commit8ea2f4d

  6. [6]

    RaBitQ: Quantizing high-dimensional vectors with a theoretical error bound for approximate nearest neighbor search

    Jianyang Gao and Cheng Long. RaBitQ: Quantizing high-dimensional vectors with a theoretical error bound for approximate nearest neighbor search. InACM SIG- MOD International Conference on Management of Data (SIGMOD), 2024

  7. [7]

    Achieving low-latency graph- based vector search via aligning best-first search algo- rithm with SSD

    Hao Guo and Youyou Lu. Achieving low-latency graph- based vector search via aligning best-first search algo- rithm with SSD. InUSENIX Symposium on Operating Systems Design and Implementation (OSDI), 2025

  8. [8]

    PipeANN implementa- tion

    Hao Guo and Youyou Lu. PipeANN implementa- tion. https://github.com/thustorage/PipeANN,

  9. [9]

    OdinANN: Direct insert for consistently stable performance in billion-scale graph- based vector search

    Hao Guo and Youyou Lu. OdinANN: Direct insert for consistently stable performance in billion-scale graph- based vector search. InUSENIX Conference on File and Storage Technologies (FAST), 2026

  10. [10]

    Ac- celerating large-scale inference with anisotropic vector quantization

    Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, and Sanjiv Kumar. Ac- celerating large-scale inference with anisotropic vector quantization. InInternational Conference on Machine Learning (ICML), 2020

  11. [11]

    DiskANN: Fast accurate billion-point nearest neighbor search on a single node

    Suhas Jayaram Subramanya, Fnu Devvrit, Harsha Vard- han Simhadri, Ravishankar Krishnawamy, and Rohan Kadekodi. DiskANN: Fast accurate billion-point nearest neighbor search on a single node. InAdvances in Neural Information Processing Systems (NeurIPS), 2019

  12. [12]

    Product quantization for nearest neighbor search.IEEE Transactions on Pattern Analysis and Machine Intelli- gence, 2010

    Herve Jegou, Matthijs Douze, and Cordelia Schmid. Product quantization for nearest neighbor search.IEEE Transactions on Pattern Analysis and Machine Intelli- gence, 2010

  13. [13]

    Searching in one billion vectors: re-rank with source coding

    Hervé Jégou, Romain Tavenard, Matthijs Douze, and Laurent Amsaleg. Searching in one billion vectors: re-rank with source coding. InIEEE International Con- ference on Acoustics, Speech, and Signal Processing (ICASSP), 2011

  14. [14]

    Billion- scale similarity search with gpus.IEEE Transactions on Big Data, 2019

    Jeff Johnson, Matthijs Douze, and Hervé Jégou. Billion- scale similarity search with gpus.IEEE Transactions on Big Data, 2019

  15. [15]

    I/o passthru: Upstreaming a flexible and efficient i/o path in linux

    Kanchan Joshi, Anuj Gupta, Javier González, Ankit Ku- mar, Krishna Kanth Reddy, Arun George, Simon Lund, and Jens Axboe. I/o passthru: Upstreaming a flexible and efficient i/o path in linux. InUSENIX Conference on File and Storage Technologies (FAST), 2024

  16. [16]

    Scalable disk-based approximate nearest neighbor search with page-aligned graph.arXiv preprint arXiv:2509.25487, 2025

    Dingyi Kang, Dongming Jiang, Hanshen Yang, Hang Liu, and Bingzhe Li. Scalable disk-based approximate nearest neighbor search with page-aligned graph.arXiv preprint arXiv:2509.25487, 2025

  17. [17]

    Retrieval-augmented generation for knowledge- intensive NLP tasks

    Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, et al. Retrieval-augmented generation for knowledge- intensive NLP tasks. InAdvances in Neural Information Processing Systems (NeurIPS), 2020

  18. [18]

    I/o optimizations for graph-based disk- resident approximate nearest neighbor search: A design space exploration.Proceedings of the VLDB Endow- ment (PVLDB), 2026

    Liang Li, Shufeng Gong, Yanan Yang, Yiduo Wang, and Jie Wu. I/o optimizations for graph-based disk- resident approximate nearest neighbor search: A design space exploration.Proceedings of the VLDB Endow- ment (PVLDB), 2026

  19. [19]

    Efficient and robust approximate nearest neighbor search using hierar- chical navigable small world graphs.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018

    Yu A Malkov and Dmitry A Yashunin. Efficient and robust approximate nearest neighbor search using hierar- chical navigable small world graphs.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018. 13

  20. [20]

    SPACEV1B: A billion-scale vector dataset for text descriptors

    Microsoft. SPACEV1B: A billion-scale vector dataset for text descriptors. https: //github.com/microsoft/SPTAG/tree/master/ datasets/SPACEV1B, 2020

  21. [21]

    On estimation of a probability den- sity function and mode.The Annals of Mathematical Statistics, 1962

    Emanuel Parzen. On estimation of a probability den- sity function and mode.The Annals of Mathematical Statistics, 1962

  22. [22]

    FreshDiskANN: A fast and accurate graph-based ANN index for streaming similarity search.arXiv preprint arXiv:2105.09613,

    Aditi Singh, Suhas Jayaram Subramanya, Ravis- hankar Krishnaswamy, and Harsha Vardhan Simhadri. FreshDiskANN: A fast and accurate graph-based ann index for streaming similarity search.arXiv preprint arXiv:2105.09613, 2021

  23. [23]

    Soar: Improved indexing for approx- imate nearest neighbor search

    Philip Sun, David Simcha, Dave Dopson, Ruiqi Guo, and Sanjiv Kumar. Soar: Improved indexing for approx- imate nearest neighbor search. InAdvances in Neural Information Processing Systems (NeurIPS), 2023

  24. [24]

    Starling: An I/O-efficient disk-resident graph index framework for high-dimensional vector similarity search on data seg- ment

    Mengzhao Wang, Weizhi Xu, Xiaomeng Yi, Songlin Wu, Zhangyang Peng, Xiangyu Ke, Yunjun Gao, Xiao- liang Xu, Rentong Guo, and Charles Xie. Starling: An I/O-efficient disk-resident graph index framework for high-dimensional vector similarity search on data seg- ment. InACM SIGMOD International Conference on Management of Data (SIGMOD), 2024

  25. [25]

    A comprehensive survey and experimental comparison of graph-based approximate nearest neigh- bor search.Proceedings of the VLDB Endowment (PVLDB), 2021

    Mengzhao Wang, Xiaoliang Xu, Qiang Yue, and Yuxi- ang Wang. A comprehensive survey and experimental comparison of graph-based approximate nearest neigh- bor search.Proceedings of the VLDB Endowment (PVLDB), 2021

  26. [26]

    In-place updates of a graph in- dex for streaming approximate nearest neighbor search

    Haike Xu, Magdalen Dobson Manohar, Philip A Bern- stein, Badrish Chandramouli, Richard Wen, and Har- sha Vardhan Simhadri. In-place updates of a graph in- dex for streaming approximate nearest neighbor search. arXiv preprint arXiv:2502.13826, 2025

  27. [27]

    SPFresh: Incremental in-place update for billion-scale vector search

    Yuming Xu, Hengyu Liang, Jin Li, Shuotao Xu, Qi Chen, Qianxi Zhang, Cheng Li, Ziyue Yang, Fan Yang, Yuqing Yang, et al. SPFresh: Incremental in-place update for billion-scale vector search. InACM Symposium on Op- erating Systems Principles (SOSP), 2023

  28. [28]

    Gorgeous: Revisiting the data layout for disk-resident high-dimensional vector search

    Peiqi Yin, Xiao Yan, Qihui Zhou, Hui Li, Xiaolu Li, Lin Zhang, Meiling Wang, Xin Yao, and James Cheng. Gorgeous: Revisiting the data layout for disk-resident high-dimensional vector search. InACM SIGMOD In- ternational Conference on Management of Data (SIG- MOD), 2026

  29. [29]

    Optimizing SSD-resident graph indexing for high-throughput vector search.arXiv preprint arXiv:2602.22805, 2026

    Weichen Zhao, Yuncheng Lu, Yao Tian, Hao Zhang, Jiehui Li, Minghao Zhao, Yakun Li, and Weining Qian. Optimizing SSD-resident graph indexing for high-throughput vector search.arXiv preprint arXiv:2602.22805, 2026. 14 A Additional Experiments We report two additional experiments on SIFT-10M using Testbed B. We first study how the observed latency statistics...