pith. machine review for the scientific record.

arxiv: 2604.09173 · v1 · submitted 2026-04-10 · 💻 cs.DB · cs.OS

Recognition: unknown

Decoupling Vector Data and Index Storage for Space Efficiency

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 16:41 UTC · model grok-4.3

classification 💻 cs.DB cs.OS
keywords vector storage · decoupled storage · ANNS · approximate nearest neighbor · disk-based search · storage optimization · index metadata · vector datasets

The pith

Decoupling vector data from index metadata in disk-based approximate nearest neighbor search systems can reduce storage space by up to 58.7% while maintaining competitive or improved query and update performance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that current disk-based ANNS systems pay a high price for storing vector data and index metadata in the same place, which wastes space, forces extra disk reads on every search, and creates heavy write amplification on updates. It introduces DecoupleVS as a framework that stores the two components separately and then applies targeted compression, custom data layouts, and separate handling for queries and updates. This approach cuts overall storage needs on real billion-scale datasets while keeping search accuracy and speed high. A reader would care because vector search now powers many AI and recommendation systems, and the cost of storing and accessing these massive datasets on disk is a growing practical limit.
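As a back-of-the-envelope illustration of the write-amplification argument (the record sizes below are hypothetical, not taken from the paper): in a co-located graph layout each on-disk node record holds the vector next to its neighbor list, so any edge update rewrites the vector bytes too, while decoupling lets an update touch only the adjacency metadata.

```python
# Hypothetical record sizes for a graph-based disk index (illustrative only).
DIM = 128        # vector dimensionality
FLOAT_BYTES = 4  # float32 component
MAX_DEGREE = 32  # neighbor slots per node
ID_BYTES = 4     # uint32 neighbor ID

# Co-located: rewriting a node's neighbor list also rewrites its vector.
colocated_update = DIM * FLOAT_BYTES + MAX_DEGREE * ID_BYTES   # 640 bytes

# Decoupled: the adjacency list lives in its own file, so a graph edit
# touches only the metadata bytes.
decoupled_update = MAX_DEGREE * ID_BYTES                       # 128 bytes

print(colocated_update, decoupled_update)  # 640 128
```

With these toy numbers a graph edit writes 5x fewer bytes; the paper's measured gains will of course depend on its actual layouts and batching.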

Core claim

DecoupleVS is a storage management framework that decouples vector data from auxiliary index metadata in disk-based ANNS systems. It applies specialized techniques for compression, data layouts, query processing, and update handling on the separated components, achieving substantial storage reduction while preserving high search and update performance and accuracy on billion-scale datasets.

What carries the argument

DecoupleVS, the framework that separates vector data storage from index metadata storage to allow independent optimizations for compression, layout, and access patterns.

Load-bearing premise

The primary storage, read, and write problems in existing ANNS systems come from co-locating vector data with index metadata rather than from other design choices.

What would settle it

Measuring storage size and query/update latency on the same billion-scale dataset with both DecoupleVS and a monolithic system; if storage savings fall below 20% or performance degrades noticeably, the claimed benefit would not hold.
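That check is simple to operationalize. The byte counts below are hypothetical, chosen only to reproduce the abstract's headline figure:

```python
def storage_reduction_pct(monolithic_bytes, decoupled_bytes):
    """Percent storage saved by the decoupled layout vs. a monolithic baseline."""
    return 100.0 * (1.0 - decoupled_bytes / monolithic_bytes)

# A 1000 GiB monolithic index shrunk to 413 GiB (hypothetical numbers)
# reproduces the paper's reported maximum of 58.7%.
saving = storage_reduction_pct(1000, 413)
assert round(saving, 1) == 58.7
assert saving >= 20.0  # the threshold proposed above for the claim to hold
```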

Figures

Figures reproduced from arXiv: 2604.09173 by Di Wu, Juncheng Zhang, Patrick P. C. Lee, Rui Yang, Yanjing Ren, Yuanming Ren.

Figure 1. Basic workflow of a disk-based ANNS system, using a graph-based auxiliary index as an example.
Figure 2. Search latency breakdowns of DiskANN and PipeANN, including CPU time and I/O wait time, normalized to the search latency of DiskANN. The number above each bar gives the search latency in microseconds (µs); single-threaded execution eliminates multi-threading interference and limits variance.
Figure 4. Normalized value dispersion and information entropy of different datasets.
Figure 5. Delta compression for multi-dimensional vectors. (Panels label segment files, blocks and chunks, a per-chunk base vector, block- and chunk-level metadata cached in memory, a symbol frequency table, and a mutable space whose blocks become immutable when filled.)
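The chunk-plus-base-vector scheme of Figure 5 can be sketched minimally; the toy values and the choice of the first vector as base are assumptions for illustration, not the paper's exact method:

```python
import numpy as np

# A chunk of similar quantized vectors; take the first as the base vector.
chunk = np.array([[100, 200,  50],
                  [101, 202,  48],
                  [ 99, 197,  53]], dtype=np.int16)

base = chunk[0]
deltas = chunk - base       # residuals cluster near zero: cheap to entropy-code
restored = base + deltas    # exact round trip for integer deltas

assert (restored == chunk).all()
assert deltas.tolist() == [[0, 0, 0], [1, 2, -2], [-1, -3, 3]]
```

The small-magnitude residuals are what make a downstream entropy coder (e.g. the symbol frequency table in the figure) effective.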
Figure 6. Hierarchical storage layout for vector data. (The surrounding text notes that sorted neighbor IDs are stored compactly with an Elias-Fano two-level representation [10, 11]: lower bits of fixed width record each ID exactly, while the higher bits of all IDs are packed into a single bitmap.)
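The two-level neighbor-ID encoding the paper attributes to Elias and Fano [10, 11] can be sketched as a generic Elias-Fano coder; this is a textbook formulation, not the paper's exact on-disk layout:

```python
def ef_encode(sorted_ids, low_width):
    """Split each ID into low bits (stored exactly, fixed width) and a high
    part, unary-coded into one shared bitmap at position high_i + i."""
    mask = (1 << low_width) - 1
    lows = [x & mask for x in sorted_ids]
    bitmap = 0
    for i, x in enumerate(sorted_ids):
        bitmap |= 1 << ((x >> low_width) + i)
    return lows, bitmap

def ef_decode(lows, bitmap, low_width):
    """Walk the bitmap; the i-th set bit at position p encodes high = p - i."""
    ids, i, pos = [], 0, 0
    while i < len(lows):
        if (bitmap >> pos) & 1:
            ids.append(((pos - i) << low_width) | lows[i])
            i += 1
        pos += 1
    return ids

neighbor_ids = [3, 4, 7, 13, 14, 15, 21, 43]
lows, bitmap = ef_encode(neighbor_ids, low_width=2)
assert ef_decode(lows, bitmap, low_width=2) == neighbor_ids
```

Because the IDs are sorted, the high parts are non-decreasing, so the bitmap has exactly one set bit per ID and the whole structure stays close to the information-theoretic minimum.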
Figure 7. Exp#1 (Space efficiency). For 100M-scale datasets, sizes are normalized to SPANN; for billion-scale datasets, to DiskANN. The numbers above the bars indicate absolute storage sizes in GiB.
Figure 8. Exp#2 (Search throughput). Throughput (queries per second) against Recall@10 (%); higher is better. The points from left to right in each curve correspond to increasing candidate list sizes, which generally lead to higher recall.
Figure 9. Exp#3 (Search latency). P99 latency (ms) against Recall@10 (%); lower is better.
Figure 10. Exp#4 (Concurrent search and updates). Overall update performance on SIFT100M.
Figure 11. Exp#6 (Update performance breakdown). Average computation and disk I/O times per update operation, normalized to DiskANN; the numbers atop bars indicate absolute times in seconds. Results are averaged over 10 iterations, with error bars showing 95% confidence intervals.
read the original abstract

Managing large-scale vector datasets with disk-based approximate nearest neighbor search (ANNS) systems faces critical efficiency challenges stemming from the co-location of vector data and auxiliary index metadata. Our analysis of state-of-the-art ANNS systems reveals that such co-location incurs substantial storage overhead, generates excessive reads during search queries, and causes severe write amplification during updates. We present DecoupleVS, a decoupled vector storage management framework that enables specialized optimizations for vector data and auxiliary index metadata. DecoupleVS incorporates various design techniques for effective compression, data layouts, search queries, and updates, so as to significantly reduce storage space, while maintaining high search and update performance and high search accuracy. Evaluation on real-world public and proprietary billion-scale datasets shows that DecoupleVS reduces storage space by up to 58.7%, while delivering competitive or improved search query and update performance, compared to state-of-the-art monolithic disk-based ANNS systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper proposes DecoupleVS, a decoupled vector storage management framework for disk-based approximate nearest neighbor search (ANNS) systems. It identifies storage overhead, read amplification during queries, and write amplification during updates as consequences of co-locating vector data with auxiliary index metadata in monolithic designs. DecoupleVS applies specialized compression, data layouts, and separate query/update paths to reduce space while preserving search accuracy and performance. Evaluation on real-world public and proprietary billion-scale datasets reports up to 58.7% storage reduction with competitive or improved query and update performance relative to state-of-the-art monolithic disk-based ANNS systems.

Significance. If the empirical claims hold, the work addresses a practical bottleneck in large-scale vector databases where storage costs dominate. The decoupling strategy enables independent optimization of data and metadata, which is a direct systems contribution. Credit is due for the evaluation on both public and proprietary billion-scale datasets and for reporting non-degraded query/update performance alongside the space savings.

minor comments (3)
  1. The abstract states that DecoupleVS 'incorporates various design techniques for effective compression, data layouts, search queries, and updates'; a concise summary table or diagram early in the paper (e.g., in the system overview section) would clarify which techniques apply to which component and how they interact.
  2. The 58.7% space reduction is presented as the maximum observed; specifying the exact dataset, index type, and compression configuration that achieves this figure would strengthen the central empirical claim.
  3. The paper compares against 'state-of-the-art monolithic disk-based ANNS systems'; an explicit list of the baselines (with version numbers or citations) and a summary table of storage, latency, throughput, and recall metrics would improve readability and reproducibility.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of DecoupleVS and the recommendation for minor revision. The review correctly identifies the core challenges of co-located vector and index storage in disk-based ANNS and acknowledges the practical value of our decoupling approach, including the evaluation on billion-scale datasets. No major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper is a systems/empirical contribution that identifies storage co-location overheads in existing disk-based ANNS systems, proposes a decoupled framework (DecoupleVS) with compression, layout, query, and update techniques, and validates the approach via direct evaluation on public and proprietary billion-scale datasets. No derivation chain, fitted parameters, self-referential predictions, or load-bearing self-citations appear in the abstract or described structure. Central claims (up to 58.7% space reduction with competitive performance) rest on external experimental comparison rather than reduction to the paper's own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based on abstract only; no specific free parameters, axioms, or invented entities detailed.

pith-pipeline@v0.9.0 · 5467 in / 951 out tokens · 44202 ms · 2026-05-10T16:41:01.154166+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read Pith reviews without signing in.

Reference graph

Works this paper leans on

55 extracted references · 4 canonical work pages · 1 internal anchor

  1. [1] Apache. Cassandra. https://cassandra.apache.org/, 2025.
  2. [2] Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D. Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. Proc. of NeurIPS, 2020.
  3. [3] Zhichao Cao, Siying Dong, Sagar Vemuri, and David H. C. Du. Characterizing, modeling, and benchmarking RocksDB key-value workloads at Facebook. In Proc. of USENIX FAST, 2020.
  4. [4] Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. Bigtable: A distributed storage system for structured data. In Proc. of USENIX OSDI, 2006.
  5. [5] Qi Chen, Haidong Wang, Mingqin Li, Gang Ren, Scarlett Li, Jeffery Zhu, Jason Li, Chuanjie Liu, Lintao Zhang, and Jingdong Wang. SPTAG: A library for fast approximate nearest neighbor search. https://github.com/Microsoft/SPTAG, 2018.
  6. [6] Qi Chen, Bing Zhao, Haidong Wang, Mingqin Li, Chuanjie Liu, Zengzhong Li, Mao Yang, and Jingdong Wang. SPANN: Highly-efficient billion-scale approximate nearest neighborhood search. Proc. of NeurIPS, 2021.
  7. [7] Yann Collet. LZ4. https://github.com/lz4/lz4, 2011.
  8. [8] Paul Covington, Jay Adams, and Emre Sargin. Deep neural networks for YouTube recommendations. In Proc. of ACM RecSys, 2016.
  9. [9] Jarek Duda. Asymmetric numeral systems: Entropy coding combining speed of Huffman coding with compression rate of arithmetic coding. arXiv preprint arXiv:1311.2540, 2013.
  10. [10] Peter Elias. Efficient storage and retrieval by content and address of static files. Journal of the ACM, 21(2):246–260, 1974.
  11. [11] Robert Mario Fano. On the number of bits required to implement an associative memory. Massachusetts Institute of Technology, Project MAC, 1971.
  12. [12] Cong Fu, Chao Xiang, Changxu Wang, and Deng Cai. Fast approximate nearest neighbor search with the navigating spreading-out graph. Proceedings of the VLDB Endowment, 12(5):461–474, 2019.
  13. [13] Jianyang Gao and Cheng Long. RaBitQ: Quantizing high-dimensional vectors with a theoretical error bound for approximate nearest neighbor search. Proc. of ACM SIGMOD, 2(3):1–27, 2024.
  14. [14] Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, Jiawei Sun, Meng Wang, and Haofen Wang. Retrieval-augmented generation for large language models: A survey. arXiv preprint arXiv:2312.10997, 2023.
  15. [15] Tiezheng Ge, Kaiming He, Qifa Ke, and Jian Sun. Optimized product quantization. IEEE Trans. on Pattern Analysis and Machine Intelligence, 36(4):744–755, 2013.
  16. [16] Giuseppe Ottaviano. Succinct. https://github.com/ot/succinct, 2017.
  17. [17] Mihajlo Grbovic and Haibin Cheng. Real-time personalization using embeddings for search ranking at Airbnb. In Proc. of ACM SIGKDD, 2018.
  18. [18] Hao Guo and Youyou Lu. Achieving low-latency graph-based vector search via aligning best-first search algorithm with SSD. In Proc. of USENIX OSDI, 2025.
  19. [19] Hao Guo and Youyou Lu. OdinANN: Direct insert for consistently stable performance in billion-scale graph-based vector search. In Proc. of USENIX FAST, 2026.
  20. [20] Harsha Simhadri. Research talk: Approximate nearest neighbor search systems at scale. https://youtu.be/BnYNdSIKibQ?t=179, 2022.
  21. [21] Moshik Hershcovitch, Andrew Wood, Leshem Choshen, Guy Girmonsky, Roy Leibovitz, Or Ozeri, Ilias Ennmouri, Michal Malka, Peter Chin, Swaminathan Sundararaman, et al. ZipNN: Lossless compression for AI models. In Proc. of IEEE CLOUD, 2025.
  22. [22] IBM. AI and the future of unstructured data. https://www.ibm.com/think/insights/unstructured-data-trends, 2025.
  23. [23] Luke James. AI data centers are swallowing the world's memory and storage supply, setting the stage for a pricing apocalypse that could last a decade. https://www.tomshardware.com/pc-components/storage/perfect-storm-of-demand-and-supply-driving-up-storage-costs?ref=aisecret.us, 2025.
  24. [24] Suhas Jayaram Subramanya, Fnu Devvrit, Harsha Vardhan Simhadri, Ravishankar Krishnaswamy, and Rohan Kadekodi. DiskANN: Fast accurate billion-point nearest neighbor search on a single node. Proc. of NeurIPS, 2019.
  25. [25] Herve Jegou, Matthijs Douze, and Cordelia Schmid. Product quantization for nearest neighbor search. IEEE Trans. on Pattern Analysis and Machine Intelligence, 33(1):117–128, 2010.
  26. [26] Hervé Jégou, Romain Tavenard, Matthijs Douze, and Laurent Amsaleg. Searching in one billion vectors: re-rank with source coding. In Proc. of IEEE ICASSP, 2011.
  27. [27] Ken Zhang and Fendy Feng. Introducing the Milvus Sizing Tool: Calculating and optimizing your Milvus deployment resources. https://milvus.io/blog/introducing-the-milvus-sizing-tool-calculating-and-optimizing-your-milvus-deployment-resources.md, 2025.
  28. [28] Donald E. Knuth. Dynamic Huffman coding. Journal of Algorithms, 6(2):163–180, 1985.
  29. [29] Laurent Amsaleg and Hervé Jégou. Datasets for approximate nearest neighbor search. http://corpus-texmex.irisa.fr/, 2010.
  30. [30] Shengwen Liang, Ying Wang, Ziming Yuan, Cheng Liu, Huawei Li, and Xiaowei Li. VStore: In-storage graph based vector search accelerator. In Proc. of ACM/IEEE DAC, 2022.
  31. [31] Yu A. Malkov and Dmitry A. Yashunin. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. IEEE Trans. on Pattern Analysis and Machine Intelligence, 42(4):824–836, 2018.
  32. [32] Microsoft. SPTAG issue #416: Segmentation fault when building SIFT1B. https://github.com/microsoft/SPTAG/issues/416, 2024.
  33. [33] Marius Muja and David Lowe. FLANN: Fast library for approximate nearest neighbors user manual. Computer Science Department, University of British Columbia, Vancouver, BC, Canada, 5(6), 2009.
  34. [34] NeurIPS. BIG ANN-Benchmarks. https://big-ann-benchmarks.com/neurips21.html, 2021.
  35. [35] Wanyi Ning, Jingyu Wang, Qi Qi, Mengde Zhu, Haifeng Sun, Daixuan Cheng, Jianxin Liao, and Ce Zhang. FM-Delta: Lossless compression for storing massive fine-tuned foundation models. Proc. of NeurIPS, 2024.
  36. [36] OpenAI. The ChatGPT Retrieval Plugin lets you easily search and find personal or work documents by asking questions in everyday language. https://github.com/openai/chatgpt-retrieval-plugin, 2023.
  37. [37] Giuseppe Ottaviano and Rossano Venturini. Partitioned Elias-Fano indexes. In Proc. of ACM SIGIR, 2014.
  38. [38] PingCAP. TiKV. https://tikv.org, 2025.
  39. [39] Nils Reimers and Iryna Gurevych. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. In Proc. of EMNLP, 2019.
  40. [40] Suzanne Rigler, William Bishop, and Andrew Kennings. FPGA-based lossless data compression using Huffman and LZ77 algorithms. In 2007 Canadian Conference on Electrical and Computer Engineering, pages 1235–1238. IEEE, 2007.
  41. [41] Claude E. Shannon. A mathematical theory of communication. The Bell System Technical Journal, 27(3):379–423, 1948.
  42. [42] Aditi Singh, Suhas Jayaram Subramanya, Ravishankar Krishnaswamy, and Harsha Vardhan Simhadri. FreshDiskANN: A fast and accurate graph-based ANN index for streaming similarity search. arXiv preprint arXiv:2105.09613, 2021.
  43. [43] Bing Tian, Haikun Liu, Zhuohui Duan, Xiaofei Liao, Hai Jin, and Yu Zhang. Scalable billion-point approximate nearest neighbor search using SmartSSDs. In Proc. of USENIX ATC, 2024.
  44. [44] Bing Tian, Haikun Liu, Yuhang Tang, Shihai Xiao, Zhuohui Duan, Xiaofei Liao, Hai Jin, Xuecang Zhang, Junhua Zhu, and Yu Zhang. Towards high-throughput and low-latency billion-scale vector search via CPU/GPU collaborative filtering and re-ranking. In Proc. of USENIX FAST, 2025.
  45. [45] Godfried T. Toussaint. The relative neighbourhood graph of a finite planar set. Pattern Recognition, 12(4):261–268, 1980.
  46. [46] Simhadri Harsha Vardhan, Krishnaswamy Ravishankar, Srinivasa Gopal, Subramanya Suhas Jayaram, Antonijevic Andrija, Pryce Dax, Kaczynski David, Williams Shane, Gollapudi Siddarth, Sivashankar Varun, Karia Neel, Singh Aditi, Jaiswal Shikhar, Mahapatro Neelam, Adams Philip, Tower Bryan, and Patel Yash. DiskANN: Graph-structured indices for scalable, fast, fresh and filtered approximate nearest neighbor search.
  47. [47] Jianguo Wang, Chunbin Lin, Ruining He, Moojin Chae, Yannis Papakonstantinou, and Steven Swanson. MILC: Inverted list compression in memory. Proceedings of the VLDB Endowment, 10(8):853–864, 2017.
  48. [48] Mengzhao Wang, Weizhi Xu, Xiaomeng Yi, Songlin Wu, Zhangyang Peng, Xiangyu Ke, Yunjun Gao, Xiaoliang Xu, Rentong Guo, and Charles Xie. Starling: An I/O-efficient disk-resident graph index framework for high-dimensional vector similarity search on data segment. Proc. of ACM SIGMOD, 2024.
  49. [49] Zirui Wang, Tingfeng Lan, Zhaoyuan Su, Juncheng Yang, and Yue Cheng. ZipLLM: Towards efficient LLM storage reduction via tensor deduplication and delta compression. In Proc. of USENIX NSDI, 2026.
  50. [50] Weaviate. Configure replication in Weaviate ANN service. https://docs.weaviate.io/deploy/configuration/replication, 2025.
  51. [51] Haike Xu, Magdalen Dobson Manohar, Philip A. Bernstein, Badrish Chandramouli, Richard Wen, and Harsha Vardhan Simhadri. In-place updates of a graph index for streaming approximate nearest neighbor search. arXiv preprint arXiv:2502.13826, 2025.
  52. [52] Yuming Xu, Hengyu Liang, Jin Li, Shuotao Xu, Qi Chen, Qianxi Zhang, Cheng Li, Ziyue Yang, Fan Yang, Yuqing Yang, et al. SPFresh: Incremental in-place update for billion-scale vector search. In Proc. of ACM SOSP, 2023.
  53. [53] Yann Collet. New Generation Entropy coders. https://github.com/Cyan4973/FiniteStateEntropy, 2019.
  54. [54] Yucheng Zhang, Wen Xia, Dan Feng, Hong Jiang, Yu Hua, and Qiang Wang. Finesse: Fine-grained feature locality based fast resemblance detection for post-deduplication delta compression. In Proc. of USENIX FAST, 2019.
  55. [55] Zili Zhang, Fangyue Liu, Gang Huang, Xuanzhe Liu, and Xin Jin. Fast vector query processing for large datasets beyond GPU memory with reordered pipelining. In Proc. of USENIX NSDI, 2024.