Semantic Recall is a new evaluation metric for approximate nearest neighbor search that focuses only on semantically relevant results, with Tolerant Recall as a proxy when relevance labels are unavailable.
10 Pith papers cite this work.
citing papers explorer
- Semantic Recall for Vector Search
Semantic Recall is a new evaluation metric for approximate nearest neighbor search that focuses only on semantically relevant results, with Tolerant Recall as a proxy when relevance labels are unavailable.
- Unified and Efficient Approach for Multi-Vector Similarity Search
MV-HNSW is the first native hierarchical graph index for multi-vector data, achieving over 90% recall with up to 14x lower search latency than prior filter-and-refine approaches across seven datasets.
- GRAB-ANNS: High-Throughput Indexing and Hybrid Search via GPU-Native Bucketing
GRAB-ANNS is a new GPU graph index that achieves up to 240x higher hybrid search throughput via bucket layouts and hybrid intra/inter-bucket edges.
- NuggetIndex: Governed Atomic Retrieval for Maintainable RAG
NuggetIndex manages atomic nuggets with temporal validity and lifecycle metadata to filter outdated information before ranking, yielding 42% higher nugget recall, 9pp better temporal correctness, and 55% fewer conflicts than passage or unmanaged proposition baselines.
- An Agentic Approach to Metadata Reasoning
Metadata Reasoner uses agentic LLM reasoning on metadata to select sufficient and minimal data sources, achieving 83.16% F1 on KramaBench and 85.5% F1 on noisy synthetic benchmarks while avoiding low-quality tables 99% of the time.
- LSM-VEC: A Large-Scale Disk-Based System for Dynamic Vector Search
LSM-VEC integrates hierarchical graphs with LSM-tree levels for out-of-place dynamic updates, sampling-based search, and connectivity-aware reordering, outperforming prior disk-based ANN systems on billion-scale data with higher recall, lower latency, and over 66% memory reduction.
- Surface-Form Neural Sparse Retrieval: Robust Fuzzy Matching for Industrial Music Search
A neural sparse retrieval system with granular subword tokenization (max 3 chars) achieves 91.4% recall@10 on a 6M music document corpus versus 57.7% for trigrams, with improved HCI exploration efficiency and zero added query latency.
- Low-Latency Out-of-Core ANN Search in High-Dimensional Space
SkipDisk is a disk-memory hybrid ANN search that achieves 63-85% of HNSW latency at 10-20% memory footprint via dedicated pivots for tighter lower bounds, three-level pruning, and decoupled async I/O.
- Don't Stir the Pot! Authorized Vector Data Retrieval via Access-Aware Indexing
Veda and EffVeda partition vector data by role combinations, apply lattice-based copy/merge under storage budget, index large nodes with HNSW and small nodes with linear scan, then use query plans and coordinated search with pruning to answer authorized top-k queries.
- NFTDELTA: Detecting Permission Control Vulnerabilities in NFT Contracts through Multi-View Learning
NFTDELTA detects permission control vulnerabilities in NFT contracts by combining sequence and graph views of function CFGs, reporting 241 confirmed issues across 795 collections with 97.92% average precision.