pith. sign in

arxiv: 2512.09331 · v2 · submitted 2025-12-10 · 💻 cs.DC · cs.IR

Passing the Baton: High Throughput Distributed Disk-Based Vector Search with BatANN

Pith reviewed 2026-05-16 23:57 UTC · model grok-4.3

classification 💻 cs.DC cs.IR
keywords distributed vector searchapproximate nearest neighbordisk-based ANNhigh throughputglobal graphquery forwardingscalabilitythroughput scaling
0
0 comments X

The pith

BatANN distributes vector search by forwarding full query states to keep one global graph across servers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces BatANN as a distributed system for disk-based approximate nearest neighbor search on datasets too large for a single server. It preserves the efficiency of one unified graph by sending the complete current query state to whichever server stores the next graph neighborhood, allowing the search to continue locally there instead of scattering partial work. This yields substantially higher throughput than prior distributed approaches while holding latency low, all over ordinary TCP links. A sympathetic reader would care because vector search powers retrieval in large AI systems, and single-machine limits are already being hit on billion-scale collections.

Core claim

BatANN is a distributed disk-based approximate nearest neighbor system that retains the logarithmic search efficiency of a single global graph. When a query needs data stored on another machine, the full state of the query is forwarded so execution continues locally on that machine, delivering near-linear throughput scaling with the number of servers.

What carries the argument

Full-state query forwarding that sends the entire current search state to the remote server holding the next graph neighborhood, preserving locality during distributed execution.

If this is right

  • On 1B-point datasets at 0.95 recall using 10 servers, BatANN reaches 3.5-5.59x the throughput of scatter-gather baselines.
  • It reaches 1.44-2.09x the throughput of DistributedANN on the same workloads.
  • Mean latency remains below 3 ms.
  • All results are obtained over standard TCP without specialized networking hardware.
  • It is the first open-source distributed disk-based vector search system that uses one global graph.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The forwarding approach could support datasets in the tens of billions simply by adding more standard servers without rebuilding the index.
  • Similar state-passing ideas might apply to other graph-based retrieval methods that currently rely on separate per-server indexes.
  • Network costs for full-state forwarding appear lower than expected for these workloads, opening room for further scaling experiments on wider clusters.

Load-bearing premise

Partitioning a single global graph across disks on multiple servers preserves recall and search efficiency without meaningful loss.

What would settle it

Running the system on a 1B-vector dataset with 10 servers and observing either recall below 0.95 or throughput scaling below linear with added servers would show the central claim does not hold.

Figures

Figures reproduced from arXiv: 2512.09331 by Ben Landrum (1), Ken Birman (1) ((1) Cornell University), Nam Anh Dang (1).

Figure 1
Figure 1. Figure 1: Scatter–Gather vs. Global Index Many distributed vector search systems employ some variant of a scatter–gather paradigm. As seen in [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: System overview exotic today. We target a more common case, where nodes are connected by commodity networking hardware and communicate using TCP, but in doing so must embrace the sweet spot for this pro￾tocol, which can sustain very high throughput but is not ideal for round-trip interactions. Accordingly, our design prioritizes reduc￾ing the amount of time queries spend waiting on communication, in much t… view at source ↗
Figure 3
Figure 3. Figure 3: Number of hops vs. inter-partition hops for BI [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Comparison of inter-partition hops between [PITH_FULL_IMAGE:figures/full_fig_p005_4.png] view at source ↗
Figure 6
Figure 6. Figure 6: The system runs a dedicated ZeroMQ receiver thread [PITH_FULL_IMAGE:figures/full_fig_p006_6.png] view at source ↗
Figure 5
Figure 5. Figure 5: Comparison of Distance Computations and Disk [PITH_FULL_IMAGE:figures/full_fig_p006_5.png] view at source ↗
Figure 7
Figure 7. Figure 7: QPS recall graph of different search methods on [PITH_FULL_IMAGE:figures/full_fig_p007_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: QPS recall curve of BIGANN, MSSPACEV on 5 and [PITH_FULL_IMAGE:figures/full_fig_p008_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: QPS recall curve of BIGANN, MSSPACEV, DEEP on 5 and 10 servers for 100M. BatANN outperforms ScatterGather on [PITH_FULL_IMAGE:figures/full_fig_p009_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Distance comparisons and I/O vs. recall plot of [PITH_FULL_IMAGE:figures/full_fig_p009_10.png] view at source ↗
Figure 12
Figure 12. Figure 12: Latency Recall curve of BIGANN 1B on 5, 10 server [PITH_FULL_IMAGE:figures/full_fig_p010_12.png] view at source ↗
Figure 14
Figure 14. Figure 14: Throughput and latency comparison for 𝑊 = 1, 8 for BIGANN 1B on 10 servers. 𝑊 = 8 has higher performance across regimes with respect to both metrics. heuristic to detect which phase the search is currently in and adjust the I/O pipeline accordingly. Further optimizations for single-node DiskANN focus on reorga￾nizing vector layout on disk. DiskANN stores vectors contiguously by ID, which provides no local… view at source ↗
read the original abstract

Vector search underpins modern information-retrieval systems, including retrieval-augmented generation (RAG) pipelines and search engines over unstructured text and images. As datasets scale to billions of vectors, disk-based vector search has emerged as a practical solution. However, looking to the future, we must anticipate datasets too large for any single server and throughput demands that exceed the limits of locally attached SSDs. We present BatANN, a distributed disk-based approximate nearest neighbor (ANN) system that retains the logarithmic search efficiency of a single global graph while achieving near-linear throughput scaling in the number of servers. Our core innovation is that when accessing a neighborhood which is stored on another machine, we send the full state of the query to the other machine to continue executing there for improved locality. On 1B-point datasets at 0.95 recall using 10 servers, BatANN achieves 3.5-5.59x of the scatter-gather baseline and 1.44-2.09x the throughput of DistributedANN, respectively, while maintaining mean latency below 3 ms. Moreover, we get these results on standard TCP. To our knowledge, BatANN is the first open-source distributed disk-based vector search system to operate over a single global graph.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript presents BatANN, a distributed disk-based approximate nearest neighbor search system that partitions a single global graph across multiple servers. The core mechanism forwards the full query state (candidates, visited set, distance bounds) to the remote server when a neighborhood access crosses a partition boundary, aiming to preserve logarithmic search efficiency while achieving near-linear throughput scaling. On 1B-point datasets at 0.95 recall with 10 servers, it reports 3.5-5.59× throughput over a scatter-gather baseline and 1.44-2.09× over DistributedANN, with mean latency below 3 ms using standard TCP; it claims to be the first open-source distributed disk-based vector search system operating over a single global graph.

Significance. If the reported speedups and latency hold under the baton-passing model, the work would provide a practical path to scaling disk-based ANN beyond single-server SSD limits while retaining the efficiency advantages of a unified graph. The empirical gains on commodity TCP and the open-source positioning are concrete strengths that could influence large-scale retrieval pipelines in RAG and search engines.

major comments (2)
  1. [Abstract] Abstract: The headline throughput claims (3.5-5.59× over scatter-gather and 1.44-2.09× over DistributedANN) and sub-3 ms mean latency rest on the assumption that cross-partition traversals remain infrequent and that full-state forwarding incurs negligible overhead relative to SSD reads; however, the manuscript provides no measurements of average hops per query, serialized state size transferred per hop, or partition-cut sizes for the 1B-point experiments.
  2. [Experimental results] Experimental results: Performance numbers are reported for specific recall targets and server counts without error bars, variance across runs, or detailed methodology for how the global graph is partitioned and assigned to disks, which leaves the near-linear scaling claim difficult to evaluate.
minor comments (1)
  1. [Abstract] Abstract: The claim that BatANN is 'the first open-source' system of its kind should be supported by a more explicit comparison table or discussion in the related-work section rather than left as a knowledge claim.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and will revise the manuscript to incorporate additional measurements and methodological details.

read point-by-point responses
  1. Referee: [Abstract] Abstract: The headline throughput claims (3.5-5.59× over scatter-gather and 1.44-2.09× over DistributedANN) and sub-3 ms mean latency rest on the assumption that cross-partition traversals remain infrequent and that full-state forwarding incurs negligible overhead relative to SSD reads; however, the manuscript provides no measurements of average hops per query, serialized state size transferred per hop, or partition-cut sizes for the 1B-point experiments.

    Authors: We agree that explicit measurements of cross-partition behavior would strengthen the claims. In the revised version we will add a dedicated paragraph and table in the experimental evaluation section reporting, for the 1B-point workloads: (1) average cross-partition hops per query, (2) average serialized state size transferred per hop, and (3) partition-cut statistics (number of edges crossing partitions). These quantities will be collected under the exact same configuration and recall targets used for the throughput results, allowing readers to quantify the overhead of baton passing relative to SSD reads. revision: yes

  2. Referee: [Experimental results] Experimental results: Performance numbers are reported for specific recall targets and server counts without error bars, variance across runs, or detailed methodology for how the global graph is partitioned and assigned to disks, which leaves the near-linear scaling claim difficult to evaluate.

    Authors: We acknowledge that the current presentation lacks error bars and a sufficiently detailed partitioning description. In the revision we will (1) add standard-deviation error bars to all throughput and latency plots, computed over at least five independent runs, and (2) expand the experimental setup subsection to describe the graph partitioning algorithm, the objective used to minimize cut size, and the mapping of partitions to disks/servers. These additions will make the near-linear scaling results easier to reproduce and evaluate. revision: yes

Circularity Check

0 steps flagged

No circularity: performance claims are purely empirical benchmarks against external baselines

full rationale

The paper describes a distributed ANN system and reports measured throughput and latency numbers on 1B-point datasets (3.5-5.59x over scatter-gather, 1.44-2.09x over DistributedANN, <3 ms mean latency). These are direct experimental comparisons with no equations, fitted parameters, self-referential definitions, or derivation steps that reduce to the paper's own inputs. The single-global-graph partitioning assumption is stated as a design choice and evaluated empirically; it is not justified by any self-citation chain or ansatz that collapses into the claimed result. No load-bearing mathematical step exists to inspect for circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on standard distributed-systems assumptions about network latency, disk I/O, and graph partitioning; no free parameters or invented entities are introduced beyond the new system design itself.

axioms (1)
  • domain assumption Network and disk performance characteristics allow effective full-query-state forwarding without prohibitive overhead
    Invoked to support near-linear scaling and sub-3ms latency claims on standard TCP

pith-pipeline@v0.9.0 · 5539 in / 1249 out tokens · 51075 ms · 2026-05-16T23:57:00.918084+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

37 extracted references · 37 canonical work pages · 1 internal anchor

  1. [1]

    M., and Simhadri, H

    Adams, P., Li, M., Zhang, S., Tan, L., Chen, Q., Li, M., Li, Z., Risvik, K. M., and Simhadri, H. V.DistributedANN: Efficient Scaling of a Single DiskANN Graph Across Thousands of Computers. InThe 1st Workshop on Vector Databases(July 2025)

  2. [2]

    C., Krishnamurthy, A

    Ahalt, S. C., Krishnamurthy, A. K., Chen, P., and Melton, D. E.Competitive learning algorithms for vector quantization.Neural Networks 3, 3 (Jan. 1990), 277–290. 11

  3. [3]

    L.Quicker ADC : Unlocking the hidden potential of Product Quantization with SIMD.IEEE Transactions on Pattern Analysis and Machine Intelligence 43, 5 (May 2021), 1666–1677

    André, F., Kermarrec, A.-M., and Scouarnec, N. L.Quicker ADC : Unlocking the hidden potential of Product Quantization with SIMD.IEEE Transactions on Pattern Analysis and Machine Intelligence 43, 5 (May 2021), 1666–1677

  4. [4]

    M.Approximate nearest neighbor queries in fixed dimensions

    Arya, S., and Mount, D. M.Approximate nearest neighbor queries in fixed dimensions. InProceedings of the fourth annual ACM-SIAM symposium on discrete algorithms(Austin, Texas, USA, 1993), Soda ’93, Society for Industrial and Applied Mathematics, pp. 271–280. Number of pages: 10 tex.address: USA

  5. [5]

    2020), 101374

    Aumüller, M., Bernhardsson, E., and Faithfull, A.ANN-Benchmarks: A benchmarking tool for approximate nearest neighbor algorithms.Information Systems 87(Jan. 2020), 101374

  6. [6]

    L.Searching in metric spaces.Acm Computing Surveys 33, 3 (Sept

    Chávez, E., Navarro, G., Baeza-Y ates, R., and Marroqín, J. L.Searching in metric spaces.Acm Computing Surveys 33, 3 (Sept. 2001), 273–321

  7. [7]

    Deng, S., Yan, X., Ng, K. K. W., Jiang, C., and Cheng, J.Pyramid: A General Framework for Distributed Similarity Search, June 2019. arXiv:1906.10602 [cs]

  8. [8]

    Desrochers, C.A Fast General Purpose Lock-Free Queue for C++.moody- camel.com(2014)

  9. [9]

    S., Rodriguez, A., Kulaw- iak, D

    Dilocker, E., van Luijt, B., Voorbach, B., Hasan, M. S., Rodriguez, A., Kulaw- iak, D. A., Antas, M., and Duckworth, P.Weaviate, Nov. 2025. original-date: 2016-03-30T15:03:17Z

  10. [10]

    InProceedings of the 20th International Conference on World Wide Web(2011), ACM, pp

    Dong, W., Moses, C., and Li, K.Efficient k-nearest neighbor graph construction for generic similarity measures. InProceedings of the 20th International Conference on World Wide Web(2011), ACM, pp. 577–586

  11. [11]

    InProceedings of the USENIX annual technical conference (ATC)(July 2019), pp

    Duplyakin, D., Ricci, R., Maricq, A., Wong, G., Duerig, J., Eide, E., Stoller, L., Hibler, M., Johnson, D., Webb, K., Akella, A., W ang, K., Ricart, G., Landwe- ber, L., Elliott, C., Zink, M., and Cecchet, E.The design and operation of CloudLab. InProceedings of the USENIX annual technical conference (ATC)(July 2019), pp. 1–14

  12. [12]

    Gottesbüren, L., Dhulipala, L., Jayaram, R., and Lacki, J.Unleash- ing Graph Partitioning for Large-Scale Nearest Neighbor Search, Mar. 2024. arXiv:2403.01797

  13. [13]

    In19th USENIX Symposium on Operating Systems Design and Implementation (OSDI 25)(2025), pp

    Guo, H., and Lu, Y.Achieving {Low-Latency} {Graph-Based} Vector Search via Aligning {Best-First} Search Algorithm with {SSD}. In19th USENIX Symposium on Operating Systems Design and Implementation (OSDI 25)(2025), pp. 171–186

  14. [14]

    Guo, H., and Lu, Y.OdinANN: Direct Insert for Consistently Stable Performance in Billion-Scale Graph-Based Vector Search.24th USENIX Conference on File and Storage Technologies (FAST ’26)(2026)

  15. [15]

    In Proceedings of the 22nd ACM international conference on Information & Knowledge Management(San Francisco California USA, Oct

    Huang, P.-S., He, X., Gao, J., Deng, L., Acero, A., and Heck, L.Learning deep structured semantic models for web search using clickthrough data. In Proceedings of the 22nd ACM international conference on Information & Knowledge Management(San Francisco California USA, Oct. 2013), ACM, pp. 2333–2338

  16. [16]

    InThe 1st workshop on vector databases (2025)

    Ingber, A., and Liberty, E.Accurate and efficient metadata filtering in pinecone’s serverless vector database. InThe 1st workshop on vector databases (2025)

  17. [17]

    In2023 USENIX Annual Technical Conference (USENIX ATC 23)(2023), USENIX Association, pp

    Jang, J., Choi, H., Bae, H., Lee, S., Kwon, M., and Jung, M.CXL-ANNS: Software- Hardware collaborative memory disaggregation and computation for Billion- Scale approximate nearest neighbor search. In2023 USENIX Annual Technical Conference (USENIX ATC 23)(2023), USENIX Association, pp. 585–600

  18. [18]

    V., Krishnawamy, R., and Kadekodi, R.DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node

    Jayaram Subramanya, S., Devvrit, F., Simhadri, H. V., Krishnawamy, R., and Kadekodi, R.DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node. InAdvances in Neural Information Processing Systems(2019), vol. 32, Curran Associates, Inc

  19. [19]

    V., Zink, S., and Birman, K

    Jha, S., Behrens, J., Gkountouvas, T., Milano, M., Song, W., Tremel, E., Re- nesse, R. V., Zink, S., and Birman, K. P.Derecho: Fast state machine replication for cloud services.ACM Trans. Comput. Syst. 36, 2 (Apr. 2019)

  20. [20]

    Johnson, J., Douze, M., and Jégou, H.Billion-Scale Similarity Search with GPUs.IEEE Transactions on Big Data 7, 3 (July 2021), 535–547

  21. [21]

    2011), 117–128

    Jégou, H., Douze, M., and Schmid, C.Product Quantization for Nearest Neighbor Search.IEEE Transactions on Pattern Analysis and Machine Intelligence 33, 1 (Jan. 2011), 117–128

  22. [22]

    InAdvances in neural information processing systems(2020), H

    Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., Küttler, H., Lewis, M., Yih, W.-t., Rocktäschel, T., Riedel, S., and Kiela, D.Retrieval- augmented generation for knowledge-intensive NLP tasks. InAdvances in neural information processing systems(2020), H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin, Eds., vol. 33, Cur...

  23. [23]

    D., Shen, Z., Blelloch, G., Dhulipala, L., Gu, Y., Simhadri, H

    Manohar, M. D., Shen, Z., Blelloch, G., Dhulipala, L., Gu, Y., Simhadri, H. V., and Sun, Y.ParlayANN: Scalable and deterministic parallel graph-based approximate nearest neighbor search algorithms. InProceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (New York, NY, USA, 2024), PPoPP ’24, Association for...

  24. [24]

    Oakley, J., and Ferhatosmanoglu, H.SQUASH: Serverless and Dis- tributed Quantization-based Attributed Vector Similarity Search, Feb. 2025. arXiv:2502.01528 [cs]

  25. [25]

    Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., and Sutskever, I.Learning Transferable Visual Models From Natural Language Supervision. InProceedings of the 38th International Conference on Machine Learning(July 2021), PMLR, pp. 8748–8763. ISSN: 2640-3498

  26. [26]

    V., Aumüller, M., Ingber, A., Douze, M., Williams, G., Manohar, M

    Simhadri, H. V., Aumüller, M., Ingber, A., Douze, M., Williams, G., Manohar, M. D., Baranchuk, D., Liberty, E., Liu, F., Landrum, B., Karjikar, M., Dhuli- pala, L., Chen, M., Chen, Y., Ma, R., Zhang, K., Cai, Y., Shi, J., Chen, Y., Zheng, W., W an, Z., Yin, J., and Huang, B.Results of the big ANN: NeurIPS’23 compe- tition, 2024. arXiv: 2409.17424 [cs.IR]

  27. [27]

    FreshDiskANN: A fast and accurate graph-based ANN index for streaming similarity search.arXiv preprint arXiv:2105.09613,

    Singh, A., Subramanya, S. J., Krishnaswamy, R., and Simhadri, H. V. FreshDiskANN: A Fast and Accurate Graph-Based ANN Index for Streaming Similarity Search, May 2021. arXiv:2105.09613 [cs]

  28. [28]

    Tatsuno, K., Miyashita, D., Ikeda, T., Ishiyama, K., Sumiyoshi, K., and Deguchi, J.AiSAQ: All-in-Storage ANNS with Product Quantization for DRAM-free Information Retrieval, Feb. 2025. arXiv:2404.06004 [cs]. [29]The ZeroMQ Community. ZeroMQ: The intelligent transport layer, 2025

  29. [29]

    V., Sundar, H

    Upreti, N., Simhadri, H. V., Sundar, H. S., Sundaram, K., Boshra, S., Perumal- swamy, B., Atri, S., Chisholm, M., Singh, R. R., Y ang, G., Hass, T., Dudhey, N., Pattipaka, S., Hildebrand, M., Manohar, M., Moffitt, J., Xu, H., Datha, N., Gupta, S., Krishnaswamy, R., Gupta, P., Sahu, A., V arada, H., Barthwal, S., Mor, R., Codella, J., Cooper, S., Pilch, K....

  30. [30]

    InProceedings of the 2021 International Conference on Management of Data(Virtual Event China, June 2021), ACM, pp

    Wang, J., Yi, X., Guo, R., Jin, H., Xu, P., Li, S., Wang, X., Guo, X., Li, C., Xu, X., Yu, K., Yuan, Y., Zou, Y., Long, J., Cai, Y., Li, Z., Zhang, Z., Mo, Y., Gu, J., Jiang, R., Wei, Y., and Xie, C.Milvus: A Purpose-Built Vector Data Management System. InProceedings of the 2021 International Conference on Management of Data(Virtual Event China, June 2021...

  31. [31]

    Starling: An i/o- efficient disk-resident graph index framework for high- dimensional vector similarity search on data segment

    W ang, M., Xu, W., Yi, X., Wu, S., Peng, Z., Ke, X., Gao, Y., Xu, X., Guo, R., and Xie, C.Starling: An I/O-Efficient Disk-Resident Graph Index Framework for High-Dimensional Vector Similarity Search on Data Segment.Proceedings of the ACM on Management of Data 2, 1 (Mar. 2024), 1–27. arXiv:2401.02116 [cs]

  32. [32]

    In-place updates of a graph in- dex for streaming approximate nearest neighbor search

    Xu, H., Manohar, M. D., Bernstein, P. A., Chandramouli, B., Wen, R., and Simhadri, H. V.In-place updates of a graph index for streaming approximate nearest neighbor search.CoRR abs/2502.13826(Feb. 2025)

  33. [33]

    Xu, X., Wang, M., Wang, Y., and Ma, D.Two-stage routing with optimized guided search and greedy algorithm on proximity graph.Knowledge-Based Systems 229(2021), 107305

  34. [34]

    SPFresh: Incremental in-place update for billion-scale vector search

    Xu, Y., Liang, H., Li, J., Xu, S., Chen, Q., Zhang, Q., Li, C., Yang, Z., Yang, F., Yang, Y., and others. SPFresh: Incremental in-place update for billion-scale vector search. InProceedings of the 29th symposium on operating systems principles (2023), pp. 545–561

  35. [35]

    Yu. A. Malkov, Yu. A. Malkov, Malkov, Y. A., D. A. Y ashunin, and Y ashunin, D. A.Efficient and Robust Approximate Nearest Neighbor Search Using Hierar- chical Navigable Small World Graphs.IEEE Transactions on Pattern Analysis and Machine Intelligence 42, 4 (Apr. 2020), 824–836

  36. [36]

    In17th USENIX Symposium on Operating Systems Design and Implementation (OSDI 23)(Boston, MA, July 2023), USENIX Association, pp

    Zhang, Q., Xu, S., Chen, Q., Sui, G., Xie, J., Cai, Z., Chen, Y., He, Y., Yang, Y., Y ang, F., Y ang, M., and Zhou, L.VBASE: Unifying online vector similarity search and relational queries via relaxed monotonicity. In17th USENIX Symposium on Operating Systems Design and Implementation (OSDI 23)(Boston, MA, July 2023), USENIX Association, pp. 377–395

  37. [37]

    Towards Efficient and Scalable Distributed Vector Search with RDMA, July 2025

    Zhi, X., Chen, M., Yan, X., Lu, B., Li, H., Zhang, Q., Chen, Q., and Cheng, J. Towards Efficient and Scalable Distributed Vector Search with RDMA, July 2025. arXiv:2507.06653 [cs]. 12