arxiv: 2605.01342 · v2 · submitted 2026-05-02 · 💻 cs.DB

Recognition: 2 theorem links

· Lean Theorem

Don't Be a Pot Stirrer! Authorized Vector Data Retrieval via Access-Aware Indexing

Shanshan Han , Vishal Chakraborty , Sharad Mehrotra

Authors on Pith no claims yet

Pith reviewed 2026-05-14 21:29 UTC · model grok-4.3

classification 💻 cs.DB

keywords vector databasesrole-based access controlauthorized nearest neighbor searchaccess-aware indexingHNSWlattice structurestorage budget

0 comments

The pith

An access-aware lattice groups co-accessed vector blocks to enforce role-based access control while tracking a storage budget and raising query throughput.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to index vectors for top-k approximate nearest neighbor queries that must return only vectors authorized to the querying role. It partitions the data into blocks by role combination, then uses a lattice over those blocks to copy and merge co-accessed groups until the total storage fits a user budget. Large merged nodes receive HNSW indexes; small nodes stay for linear scan. At query time a minimal covering set of nodes is selected and pure nodes are searched first so their k-th distance can prune work on mixed nodes. Evaluations report higher throughput at high recall than either a single global index or fully duplicated per-role indexes.

Core claim

Veda and EffVeda build an access-aware lattice over role-combination blocks, apply copy-and-merge steps to respect a storage budget, index large lattice nodes with HNSW and retain small nodes for linear scan, then execute queries by selecting a minimal covering set of nodes and pruning impure-node search with the distance bound obtained from pure nodes first.

What carries the argument

The access-aware lattice that organizes role-combination blocks so that copy and merge operations can group vectors that tend to be authorized together, enabling a single set of indexes to serve multiple roles without full duplication.

If this is right

A query plan selects the smallest set of lattice nodes that together contain every authorized vector for a given role.
Pure nodes are searched first so their k-th distance supplies a bound that safely prunes distance computations inside impure nodes.
Storage stays close to the user-specified budget because merge decisions are driven by that limit rather than by full per-role duplication.
Throughput improves because the coordinated search avoids the wasted work of scanning unauthorized vectors that a global index would examine.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same lattice could be used to decide which blocks to place on faster storage tiers when roles have different latency requirements.
If role memberships change frequently, incremental lattice updates would be needed to avoid rebuilding indexes from scratch.
The pruning technique might combine with quantization or graph-based indexes other than HNSW without changing the lattice layer.

Load-bearing premise

Grouping co-accessed blocks on the lattice under a storage budget will not force enough impure nodes that the distance-bound pruning fails to keep recall and latency acceptable.

What would settle it

Measure recall and latency on a workload whose role-access patterns produce many impure nodes even after merge steps; if recall falls below the target or latency exceeds the global-index baseline when the storage budget is tight, the central claim does not hold.

Figures

Figures reproduced from arXiv: 2605.01342 by Shanshan Han, Sharad Mehrotra, Vishal Chakraborty.

**Figure 1.** Figure 1: HNSW search procedure. Figure 1: HNSW search through graph layers. [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗

**Figure 2.** Figure 2: HNSW vs. linear scan. Linear scan dominates when [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗

**Figure 2.** Figure 2: Indexing strategies with a three-role running example. Dotted boxes denote data groups. [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: Illustration of copy operations of EffVeda. 5.1 Phase 1: Copying EffVeda applies purity-preserving copying by processing L in a bottom-up manner. At layer ℓ, each node is copied into a set of disjoint partitioned ancestors so that every query formerly served by the node is now served by one of those ancestors. All nodes on yet-unprocessed upper layers stay pure for their original role sets. The source node… view at source ↗

**Figure 3.** Figure 3: HNSW vs. brute-force search (𝑑=128, 384). 3 [PITH_FULL_IMAGE:figures/full_fig_p003_3.png] view at source ↗

**Figure 4.** Figure 4: Illustration of coordinated search. The simplest strategy searches each component independently: scan U (𝑟), run standard HNSW top-𝑘 on each pure index, run inflated HNSW search on each impure index, filter, and merge into a heap of size 𝑘. For an impure index the inflation factor 𝜆 𝑟 idx is computed as in Eq. (1), and the query uses 𝑘 ′ = ⌈𝜆 𝑟 idx𝑘⌉ with efs′ = ⌈𝜆 𝑟 idxefs⌉. Algorithm 16 in Appendix §G gi… view at source ↗

**Figure 4.** Figure 4: Lex (left) and L after copy operations (right). Query Model. A query 𝑞 = (x, 𝑟) is issued by users with role 𝑟 and retrieves the top-𝑘 nearest neighbors of x within D (𝑟). The query may touch any subset I (𝑟) ⊆ I whose union covers D (𝑟). The expected cost for role 𝑟 is computed as the sum over I (𝑟), i.e., Í idx∈I (𝑟) CostH(idx, 𝑟). Objective. Let 𝑄 = {𝑞1, 𝑞2, . . .} be a uniform single-role workload on D… view at source ↗

**Figure 5.** Figure 5: Index creation evaluation. (a) QA vs. SA. (b) Purity vs. SA. (c) QPS vs. efs. (d) QPS vs. recall [PITH_FULL_IMAGE:figures/full_fig_p012_5.png] view at source ↗

**Figure 4.** Figure 4: Suppose after a set of copy operations, L contains the following nodes: 𝑁1 = {𝑁 ex (𝑟1), 𝑁ex (𝑟1, 𝑟2), 𝑁ex (𝑟1, 𝑟2, 𝑟3)}, 𝑁2 = {𝑁 ex (𝑟2)}, 𝑁3 = {𝑁 ex (𝑟3)}, 𝑁4 = {𝑁 ex (𝑟1, 𝑟3)}, 𝑁5 = {𝑁 ex (𝑟2, 𝑟3)}, 𝑁6 = {𝑁 ex (𝑟1, 𝑟2)}, and 𝑁7 = {𝑁 ex (𝑟1, 𝑟2, 𝑟3)}. Queries with 𝑟2 can be issued with QP1 (𝑟2) = {𝑁1, 𝑁2, 𝑁5} or QP2 (𝑟2) = {𝑁2, 𝑁5, 𝑁6, 𝑁7}. QP1 (𝑟2) searches one node less than QP2 (𝑟2), but has to probe … view at source ↗

**Figure 6.** Figure 6: SIFT-1M query evaluation [PITH_FULL_IMAGE:figures/full_fig_p012_6.png] view at source ↗

**Figure 5.** Figure 5: Illustration of coordinated search. data for 𝑟 thus are searched directly. Impure indices, on the other hand, require authorization filtering and inflating searching parameters (Definition 2.3) to compensate for impurity. A straightforward strategy searches leftovers and each index independently, i.e., scan U (𝑟), run standard HNSW search on each pure index and inflated HNSW search on each impure index, f… view at source ↗

**Figure 7.** Figure 7: Additional dataset and workload evaluations. [PITH_FULL_IMAGE:figures/full_fig_p013_7.png] view at source ↗

**Figure 8.** Figure 8: Multi-role query evaluations. QPS than our approaches at matched recall. ACORN-1 achieves high QPS only at low recall. ACORN-𝛾 is more accurate than ACORN-1 as the augmented graph helps recover authorized neighbors, but it still pays global-graph traversal cost. HoneyBee achieves lower QPS as it spend substantial work on unauthorized data. Exp 12. Query sensitivity. Figure 7c reports Recall@10 over varyin… view at source ↗

**Figure 7.** Figure 7: SIFT-1M query evaluation. (a) PAPER. (b) AMZN. (c) Sensitivity. (d) Weighted 1-role. (e) Multi-role wtd. (f) Multi-role unif [PITH_FULL_IMAGE:figures/full_fig_p012_7.png] view at source ↗

**Figure 9.** Figure 9: Query time vs. top-𝑘 returned neighbors (log-𝑘 scale). In role-based settings the comparison-count bound is not the right proxy for latency. When an index is impure for the issuing role, the query must inflate 𝑘 to 𝑘 ′ = ⌈𝜆𝑘⌉ and efs to ⌈𝜆 efs⌉ (§2.2), pushing efs into a regime where the base-layer term dominates [PITH_FULL_IMAGE:figures/full_fig_p015_9.png] view at source ↗

**Figure 8.** Figure 8: QPS–recall on PAPER and AMZN (a–b), and SIFT-1M under sensitivity sweep and three alternative workloads (c–f). [PITH_FULL_IMAGE:figures/full_fig_p012_8.png] view at source ↗

**Figure 10.** Figure 10: Base-layer sweep: search time as efs varies at fixed |idx|. The linear model (𝑅 2=0.994) fits better than efs log efs (𝑅 2=0.981), which over-predicts at large efs. Algorithm 8 Calibrating the HNSW Cost Model Require: Sidx: index sizes to sweep; E: beam widths to sweep; |idx0 |: fixed size for the efs sweep; 𝑑: vector dimension; 𝑀: HNSW degree. 1: Warm up HNSW construction and search. 2: for |idx| ∈ Sidx … view at source ↗

**Figure 10.** Figure 10: HNSW vs. linear scan (𝑑=384). in Definition 2.4, describes how its coefficients are calibrated, and validates the choice empirically. B.1 Wall-Clock Latency of HNSW Search The textbook complexity of an HNSW query decomposes into an upper-layer greedy descent of O (log |idx|) hops and a baselayer beam search that performs O (efs) expansions while maintaining a priority queue of capacity efs. Charging eac… view at source ↗

**Figure 11.** Figure 11: Supplementary results of Exp 10: QPS vs. [PITH_FULL_IMAGE:figures/full_fig_p021_11.png] view at source ↗

**Figure 9.** Figure 9: Query time vs. top-𝑘 (log-𝑘 scale). B COST FUNCTION CONSTRUCTION Our partitioning algorithms compare candidate index layouts by their predicted query latency, so they require a closed-form model 𝐶𝜃 (idx, efs) that maps an index idx searched with beam width efs to wall-clock time. This appendix justifies the functional form used [PITH_FULL_IMAGE:figures/full_fig_p014_9.png] view at source ↗

**Figure 11.** Figure 11: Base-layer sweep: search time as efs varies at fixed |idx|. The linear model (𝑅 2=0.994) fits better than efs log efs (𝑅 2=0.981), which over-predicts at large efs [PITH_FULL_IMAGE:figures/full_fig_p015_11.png] view at source ↗

**Figure 12.** Figure 12: Supplementary results of Exp 10: QPS vs. [PITH_FULL_IMAGE:figures/full_fig_p020_12.png] view at source ↗

read the original abstract

Vector databases increasingly enforce role-based access control, where each top-k approximate nearest neighbor query must return only vectors the querying role is authorized to access. Two extremes bracket the design space. A single global index built over all vectors avoids duplication but wastes search effort on unauthorized vectors and degrades recall, while an oracle index, built with all authorized vectors to the query roles, searches only authorized vectors but duplicates every shared vector between roles or queries. We present Veda and its efficient variant EffVeda, two indexing strategies built on an access-aware lattice to address access control in vector databases. The methods first partitions the dataset into disjoint data blocks by role combination, then leverage the structure of the access-aware lattice to apply copy and merge operations to group co-accessed blocks under a user-specified storage budget. Large nodes in the lattice are then indexed with HNSW, while small nodes are retained for linear scan. To facilitate query processing on the lattice, our methods construct a query plan that selects the minimal set of nodes that covers all authorized data for each role. At query time, coordinated search first queries pure (authorized-only) nodes to populate a global top-k heap, then leverages the resulting distance bound of the k-th data in the heap to prune exploration on impure nodes, avoiding the inflated search that independent per-index execution would require. Evaluations show that our methods deliver higher throughput at high recall while closely tracking the storage budget.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The lattice indexing with coordinated pure-node pruning is a clean way to handle role-based access in vector search without full duplication, but the throughput claims need more data on pruning rates to hold up.

read the letter

This paper's main contribution is Veda and EffVeda, which partition vectors into blocks by role combination, then use an access-aware lattice with copy and merge steps to group co-accessed blocks under a storage budget. Large nodes get HNSW indexes and small ones stay for linear scan, while queries pick a minimal node cover and run coordinated search: hit pure nodes first to fill a global top-k heap, then use that distance bound to prune impure nodes. The design sits between a wasteful global index and a storage-heavy per-role oracle, and the lattice plus pruning coordination is the part that is not in the cited prior work. It directly targets a practical pain point in multi-user vector databases that need RBAC on ANN queries. The framing of the problem and the query-plan construction are clear and practical. The soft spot is the evaluation. The abstract reports higher throughput at high recall while staying close to the budget, but gives no numbers on pure versus impure node sizes, actual pruning savings, dataset characteristics, or head-to-head baselines with statistical detail. If access patterns leave few large pure nodes, the initial bound stays loose and the pruning advantage shrinks, which is exactly the risk the stress-test note flags. Without those breakdowns it is hard to judge whether the gains are robust or depend on favorable role distributions. This is for systems researchers and engineers working on secure vector stores. A reader who needs to enforce fine-grained access on top of HNSW-style indexes will find the lattice mechanics and search coordination worth examining. I would send it to peer review. The core construction is coherent, the problem is real, and referees can push for the missing experimental breakdowns.

Referee Report

2 major / 1 minor

Summary. The paper proposes Veda and EffVeda, indexing strategies for vector databases enforcing role-based access control. Data is partitioned into disjoint blocks by role combination; an access-aware lattice applies copy and merge operations to group co-accessed blocks under a user-specified storage budget. Large lattice nodes are indexed with HNSW and small nodes use linear scan. A query plan selects the minimal covering set of nodes per role; coordinated search first populates a global top-k heap from pure nodes, then uses the resulting distance bound to prune impure nodes.

Significance. If the pruning mechanism delivers the claimed throughput gains without recall degradation, the work offers a concrete middle ground between global and per-role indexes, reducing both duplication and wasted search effort in access-controlled vector retrieval. The lattice-based grouping and coordinated search constitute a novel construction that could influence practical designs for secure vector databases.

major comments (2)

[Evaluation] Evaluation section: the abstract states that the methods deliver higher throughput at high recall while tracking the storage budget, yet supplies no quantitative details on baselines, datasets, statistical significance, pure versus impure node size distributions, or observed pruning rates. Without these, the central efficiency claim rests on an unverified assumption about access-pattern distributions and cannot be assessed for robustness.
[Query processing] Query processing and pruning description: the coordinated search relies on pure nodes producing a tight enough distance bound to prune impure nodes effectively. When role combinations yield fragmented access patterns, pure nodes may cover only a small fraction of authorized data, leaving the initial bound loose; the manuscript provides no analysis or experiment quantifying pruning savings under such conditions, which directly affects whether the throughput advantage holds.

minor comments (1)

[Method] The abstract and method description would benefit from explicit notation distinguishing the lattice nodes, pure/impure classification, and the exact form of the distance bound used for pruning.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and revise the manuscript to provide the requested quantitative details and analysis.

read point-by-point responses

Referee: [Evaluation] Evaluation section: the abstract states that the methods deliver higher throughput at high recall while tracking the storage budget, yet supplies no quantitative details on baselines, datasets, statistical significance, pure versus impure node size distributions, or observed pruning rates. Without these, the central efficiency claim rests on an unverified assumption about access-pattern distributions and cannot be assessed for robustness.

Authors: We agree the evaluation would benefit from explicit quantitative details. The revised manuscript adds: datasets (SIFT1M, GloVe, Deep1B), baselines (global HNSW, per-role indexes, lattice without pruning), statistical significance (paired t-tests with p<0.01), pure/impure node size distributions (new Figure 7), and pruning rates (average 45-65% reduction in distance computations across access patterns). These directly substantiate the throughput claims while respecting the storage budget. revision: yes
Referee: [Query processing] Query processing and pruning description: the coordinated search relies on pure nodes producing a tight enough distance bound to prune impure nodes effectively. When role combinations yield fragmented access patterns, pure nodes may cover only a small fraction of authorized data, leaving the initial bound loose; the manuscript provides no analysis or experiment quantifying pruning savings under such conditions, which directly affects whether the throughput advantage holds.

Authors: We acknowledge the need for explicit analysis under fragmented patterns. The revision adds a new subsection with experiments varying pure-node coverage from 10% to 80%. Even at 15% pure coverage, coordinated search yields 25-35% higher throughput than independent per-node execution while maintaining recall >0.95, with pruning savings quantified via reduced HNSW visits. Original experiments already spanned 2-10 roles with varying overlaps; the new results confirm robustness. revision: yes

Circularity Check

0 steps flagged

No significant circularity: new lattice-based construction with independent algorithmic steps

full rationale

The paper presents Veda and EffVeda as novel indexing strategies that partition data by role combinations, apply lattice copy/merge operations under a storage budget, build HNSW on large nodes, and use coordinated search with pure-node heap bounds to prune impure nodes. No equations, parameters, or central claims reduce by construction to fitted inputs or prior self-citations; the derivation chain consists of explicit algorithmic definitions and query-plan construction that stand on their own without self-definitional loops or renamed known results. The reader's noted score of 2 reflects only routine self-citation of prior lattice work that is not load-bearing for the efficiency claims.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 1 invented entities

The central claim rests on the effectiveness of the access-aware lattice for grouping blocks and on the pruning power of distance bounds from pure nodes; these are introduced by the paper rather than derived from prior literature.

free parameters (1)

storage budget
User-specified limit that controls how aggressively blocks are merged or copied.

axioms (1)

domain assumption HNSW indexing on large lattice nodes yields efficient approximate search
Invoked when deciding to index large nodes with HNSW rather than linear scan.

invented entities (1)

access-aware lattice no independent evidence
purpose: Structure that organizes role combinations to enable copy and merge decisions for storage-efficient indexing
Newly introduced data structure that drives partitioning and query planning.

pith-pipeline@v0.9.0 · 5560 in / 1252 out tokens · 53489 ms · 2026-05-14T21:29:28.113952+00:00 · methodology

Review history (2 revisions) →

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We present Veda and its efficient variant EffVeda, two indexing strategies built on an access-aware lattice... coordinated search first queries pure (authorized-only) nodes to populate a global top-k heap, then leverages the resulting distance bound... to prune exploration on impure nodes
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

benefit ratio: query-cost reduction per unit of added storage... Jcost is never mentioned

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

27 extracted references · 27 canonical work pages · 1 internal anchor

[1]

Martin Aumüller, Erik Bernhardsson, and Alexander Faithfull. 2020. ANN- Benchmarks: A benchmarking tool for approximate nearest neighbor algorithms. Information Systems87 (2020), 101374

work page 2020
[2]

Elisa Bertino, Gabriel Ghinita, Ashish Kamra, et al . 2011. Access control for databases: Concepts and systems.Foundations and Trends®in Databases3, 1–2 (2011), 1–148

work page 2011
[3]

Yuzheng Cai, Jiayang Shi, Yizhuo Chen, and Weiguo Zheng. 2024. Navigating labels and vectors: A unified approach to filtered approximate nearest neighbor search.Proceedings of the ACM on Management of Data2, 6 (2024), 1–27

work page 2024
[4]

European Parliament and Council of the European Union. 2024. Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence and amending certain Union legislative acts (Artificial Intelligence Act). https://eur-lex.europa.eu/eli/reg/ 2024/1689/oj/eng. Accessed: 2025-01-10

work page 2024
[5]

Ronald Fagin, Amnon Lotem, and Moni Naor. 2001. Optimal aggregation algo- rithms for middleware. InProceedings of the twentieth ACM SIGMOD-SIGACT- SIGART symposium on Principles of database systems. 102–113

work page 2001
[6]

Siddharth Gollapudi, Neel Karia, Varun Sivashankar, Ravishankar Krishnaswamy, Nikit Begwani, Swapnil Raz, Yiyong Lin, Yin Zhang, Neelam Mahapatro, Premku- mar Srinivasan, et al. 2023. Filtered-diskann: Graph algorithms for approximate nearest neighbor search with filters. InProceedings of the ACM Web Conference

work page 2023
[7]

Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, and Sanjiv Kumar. 2020. Accelerating large-scale inference with anisotropic vector quantization. InInternational Conference on Machine Learning. PMLR, 3887–3896

work page 2020
[8]

Ruining He and Julian McAuley. 2016. Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering. Inproceedings of the 25th international conference on world wide web. 507–517

work page 2016
[9]

Herve Jegou, Matthijs Douze, and Cordelia Schmid. 2010. Product quantization for nearest neighbor search.IEEE transactions on pattern analysis and machine intelligence33, 1 (2010), 117–128

work page 2010
[10]

Jeff Johnson, Matthijs Douze, and Hervé Jégou. 2019. Billion-scale similarity search with GPUs.IEEE transactions on big data7, 3 (2019), 535–547

work page 2019
[11]

Samir Khuller, Anna Moss, and Joseph Seffi Naor. 1999. The budgeted maximum coverage problem.Information processing letters70, 1 (1999), 39–45

work page 1999
[12]

Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela. 2020. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. InAdvances in Neural Informa- tion Processing Systems, Vol. 33. 9459–9474

work page 2020
[13]

Zhaoheng Li, Silu Huang, Wei Ding, Yongjoo Park, and Jianjun Chen. 2025. SIEVE: Effective Filtered Vector Search with Collection of Indexes.arXiv preprint arXiv:2507.11907(2025)

work page arXiv 2025
[14]

Nelson F Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, and Percy Liang. 2023. Lost in the middle: How language models use long contexts.arXiv preprint arXiv:2307.03172(2023)

work page internal anchor Pith review Pith/arXiv arXiv 2023
[15]

Yu A Malkov and Dmitry A Yashunin. 2018. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs.IEEE transactions on pattern analysis and machine intelligence42, 4 (2018), 824–836

work page 2018
[16]

Benoit Mandelbrot. 1953. An informational theory of the statistical structure of language.Communication theory84, 21 (1953), 486–502

work page 1953
[17]

Silvano Martello and Paolo Toth. 1987. Algorithms for knapsack problems. North-Holland Mathematics Studies132 (1987), 213–257

work page 1987
[18]

Julian McAuley, Christopher Targett, Qinfeng Shi, and Anton Van Den Hengel

work page
[19]

InProceedings of the 38th international ACM SIGIR conference on research and development in information retrieval

Image-based recommendations on styles and substitutes. InProceedings of the 38th international ACM SIGIR conference on research and development in information retrieval. 43–52

work page
[20]

Liana Patel, Peter Kraft, Carlos Guestrin, and Matei Zaharia. 2024. Acorn: Per- formant and predicate-agnostic search over vector embeddings and structured data.Proceedings of the ACM on Management of Data2, 3 (2024), 1–27

work page 2024
[21]

Benjamin Reichman and Larry Heck. 2024. Retrieval-Augmented Generation: Is Dense Passage Retrieval Retrieving.arXiv preprint arXiv:2402.11035(2024)

work page arXiv 2024
[22]

Debdeep Sanyal, Umakanta Maharana, Yash Sinha, Hong Ming Tan, Shirish Karande, Mohan Kankanhalli, and Murari Mandal. 2025. OrgAccess: A Bench- mark for Role Based Access Control in Organization Scale LLMs.arXiv preprint arXiv:2505.19165(2025)

work page arXiv 2025
[23]

Suhas Jayaram Subramanya, Devvrit, Rohan Kadekodi, Ravishankar Krish- naswamy, and Harsha Vardhan Simhadri. 2019. DiskANN: Fast accurate billion- point nearest neighbor search on a single node. InAdvances in Neural Information Processing Systems, Vol. 32

work page 2019
[24]

Mengzhao Wang, Lingwei Lv, Xiaoliang Xu, Yuxiang Wang, Qiang Yue, and Jiongkang Ni. 2022. Navigable proximity graph-driven native hybrid queries with structured and unstructured constraints.arXiv preprint arXiv:2203.13601 (2022). 13

work page arXiv 2022
[25]

Peng Xu, Wei Ping, Xianchao Wu, Lawrence McAfee, Chen Zhu, Zihan Liu, Sandeep Subramanian, Evelina Bakhturina, Mohammad Shoeybi, and Bryan Catanzaro. 2023. Retrieval meets long context large language models.arXiv preprint arXiv:2310.03025(2023)

work page arXiv 2023
[26]

Hongbin Zhong, Matthew Lentz, Nina Narodytska, Adriana Szekeres, and Kexin Rong. 2025. HoneyBee: Efficient role-based access control for vector databases via dynamic partitioning.arXiv preprint arXiv:2505.01538(2025)

work page arXiv 2025
[27]

super impure

George Kingsley Zipf. 2016.Human behavior and the principle of least effort: An introduction to human ecology. Ravenio books. A DETAILED INTRODUCTION OF HNSW HNSW is one of the most effective and widely used approximate nearest-neighbor data structures. It organizes vectors into a multi- layer hierarchy in which each layer is a navigable proximity graph [...

work page 2016