10 Pith papers cite this work.
citing papers explorer
- Don't Be a Pot Stirrer! Authorized Vector Data Retrieval via Access-Aware Indexing
  Veda and EffVeda partition vectors into disjoint role-combination blocks, apply lattice-based copy and merge operations within a storage budget, index large nodes with HNSW, and use coordinated search with distance bounds to deliver higher throughput at high recall.
- Semantic Recall for Vector Search
  Semantic Recall is a new evaluation metric for approximate nearest neighbor search that focuses only on semantically relevant results, with Tolerant Recall as a proxy when relevance labels are unavailable.
- Unified and Efficient Approach for Multi-Vector Similarity Search
  MV-HNSW is the first native hierarchical graph index for multi-vector data, achieving over 90% recall with up to 14x lower search latency than prior filter-and-refine approaches across seven datasets.
- GRAB-ANNS: High-Throughput Indexing and Hybrid Search via GPU-Native Bucketing
  GRAB-ANNS is a new GPU graph index that achieves up to 240x higher hybrid search throughput via bucket layouts and hybrid intra/inter-bucket edges.
- NuggetIndex: Governed Atomic Retrieval for Maintainable RAG
  NuggetIndex manages atomic nuggets with temporal validity and lifecycle metadata to filter outdated information before ranking, yielding 42% higher nugget recall, 9pp better temporal correctness, and 55% fewer conflicts than passage or unmanaged proposition baselines.
- An Agentic Approach to Metadata Reasoning
  Metadata Reasoner uses agentic LLM reasoning on metadata to select sufficient and minimal data sources, achieving 83.16% F1 on KramaBench and 85.5% F1 on noisy synthetic benchmarks while avoiding low-quality tables 99% of the time.
- LSM-VEC: A Large-Scale Disk-Based System for Dynamic Vector Search
  LSM-VEC integrates hierarchical graphs with LSM-tree levels for out-of-place dynamic updates, sampling-based search, and connectivity-aware reordering, outperforming prior disk-based ANN systems on billion-scale data with higher recall, lower latency, and over 66% memory reduction.
- Surface-Form Neural Sparse Retrieval: Robust Fuzzy Matching for Industrial Music Search
  A neural sparse retrieval system with granular subword tokenization (max 3 chars) achieves 91.4% recall@10 on a 6M music document corpus versus 57.7% for trigrams, with improved HCI exploration efficiency and zero added query latency.
- Low-Latency Out-of-Core ANN Search in High-Dimensional Space
  SkipDisk is a disk-memory hybrid ANN search that achieves 63-85% of HNSW latency at 10-20% memory footprint via dedicated pivots for tighter lower bounds, three-level pruning, and decoupled async I/O.
- NFTDELTA: Detecting Permission Control Vulnerabilities in NFT Contracts through Multi-View Learning
  NFTDELTA detects permission control vulnerabilities in NFT contracts by combining sequence and graph views of function CFGs, reporting 241 confirmed issues across 795 collections with 97.92% average precision.