hub Mixed citations

Billion-scale similarity search with GPUs

2017 · cs.CV · arXiv 1702.08734

Mixed citation behavior. Most common role is method (60%).

31 Pith papers citing it

Method 60% of classified citations

abstract

Similarity search finds application in specialized database systems handling complex data such as images or videos, which are typically represented by high-dimensional features and require specific indexing structures. This paper tackles the problem of better utilizing GPUs for this task. While GPUs excel at data-parallel tasks, prior approaches are bottlenecked by algorithms that expose less parallelism, such as k-min selection, or make poor use of the memory hierarchy. We propose a design for k-selection that operates at up to 55% of theoretical peak performance, enabling a nearest neighbor implementation that is 8.5x faster than prior GPU state of the art. We apply it in different similarity search scenarios, by proposing optimized design for brute-force, approximate and compressed-domain search based on product quantization. In all these setups, we outperform the state of the art by large margins. Our implementation enables the construction of a high accuracy k-NN graph on 95 million images from the Yfcc100M dataset in 35 minutes, and of a graph connecting 1 billion vectors in less than 12 hours on 4 Maxwell Titan X GPUs. We have open-sourced our approach for the sake of comparison and reproducibility.
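As an illustration of the brute-force setting and the k-selection step the abstract mentions, here is a minimal CPU-only sketch in Python. It is not the paper's GPU implementation: the function names `squared_l2` and `knn_brute_force` and the toy data are hypothetical, and the standard-library `heapq.nsmallest` stands in for the paper's k-selection design.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_brute_force(queries, database, k):
    """Return, for each query, the k closest database vectors.

    Each result is a list of (distance, index) pairs sorted by distance.
    Every query is compared against every database vector (brute force),
    then the k smallest distances are kept (k-selection).
    """
    results = []
    for q in queries:
        scored = ((squared_l2(q, v), i) for i, v in enumerate(database))
        results.append(heapq.nsmallest(k, scored))
    return results


# Toy data, chosen only for illustration.
database = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
queries = [(0.9, 0.1)]

neighbors = knn_brute_force(queries, database, k=2)
nearest_indices = [i for _, i in neighbors[0]]
print(nearest_indices)
```

The selection step is the part the abstract singles out as hard to parallelize. In this sketch it is a sequential heap-based scan, which is simple but does not reflect how a GPU kernel would organize the work.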

citation-role summary

method: 3 · background: 2

citation-polarity summary

use method: 3 · background: 1 · unclear: 1

representative citing papers

Dense Passage Retrieval for Open-Domain Question Answering

cs.CL · 2020-04-10 · accept · novelty 8.0

Dense dual-encoder retrievers outperform BM25 by 9-19% absolute in top-20 passage retrieval accuracy across open-domain QA datasets and enable new state-of-the-art end-to-end QA results.

BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models

cs.IR · 2021-04-17 · accept · novelty 8.0

BEIR is a heterogeneous zero-shot benchmark showing BM25 as a robust baseline while re-ranking and late-interaction models perform best on average at higher cost, with dense and sparse models lagging in generalization.

Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

cs.CL · 2019-08-27 · unverdicted · novelty 8.0

Sentence-BERT adapts BERT with siamese and triplet networks to produce sentence embeddings for efficient cosine-similarity comparisons, cutting computation time from hours to seconds on similarity search while matching BERT accuracy.

RWGBench: Evaluating Scholarly Positioning in Related Work Generation

cs.DL · 2026-05-30 · unverdicted · novelty 7.0

RWGBench is a citation-centric benchmark for related work generation built from 40k CS papers and a 100-paper test set, with multi-dimensional metrics that better match human expert judgment than standard similarity scores.

Modernizing User Privacy Preference Measurement through GPPI: A GDPR-aligned Privacy Preference Item Bank

cs.HC · 2026-05-23 · unverdicted · novelty 7.0

A 527-item GDPR-aligned privacy preference item bank was developed by extracting 669 statements from 99 GDPR articles and validating them through multi-round expert consensus and semantic clustering.

MIST: A Co-Design Framework for Heterogeneous, Multi-Stage LLM Inference

cs.AR · 2025-04-14 · unverdicted · novelty 7.0

MIST is a new simulator for heterogeneous multi-stage LLM inference that combines hardware traces with analytical models to explore configuration trade-offs in hybrid CPU-accelerator systems.

ExaGPT: Example-Based Machine-Generated Text Detection for Human Interpretability

cs.CL · 2025-02-17 · unverdicted · novelty 7.0

ExaGPT uses span-level similarity retrieval from human and LLM datastores to detect machine-generated text while supplying the matching spans as human-interpretable evidence, achieving up to 37-point accuracy gains over prior interpretable detectors at 1% FPR.

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

cs.CL · 2020-05-22 · accept · novelty 7.0

RAG models set new state-of-the-art results on open-domain QA by retrieving Wikipedia passages and conditioning a generative model on them, while also producing more factual text than parametric baselines.

The Decomposition Is the Fingerprint: Per-Component Identity for Agent Skills

cs.CR · 2026-06-30 · unverdicted · novelty 6.0

A per-component SimHash fingerprint supplies structural identity for AI agent skills, recovering family membership under paraphrase and refactoring with AUC 0.974 while localizing changes.

When More Cores Hurts: The Vector Database Scaling Paradox in HPC

cs.DC · 2026-06-08 · unverdicted · novelty 6.0

Large-scale HPC evaluation of Qdrant, Milvus, and Weaviate reveals that workload patterns limit scaling and extra cores can reduce throughput, exposing a cloud-to-HPC design mismatch.

NTILC: Neural Tool Invocation via Learned Compression

cs.SE · 2026-06-04 · unverdicted · novelty 6.0

NTILC replaces in-context tool registry lookup with learned latent retrieval using a signature-aware composite loss, reducing context consumption by over 95% and latency by up to 74%.

CourseBlueprint: A Structured Pipeline for Adaptive Pedagogical Video Generation Grounded in Course Corpora

cs.CY · 2026-05-22 · unverdicted · novelty 6.0

CourseBlueprint builds a typed pipeline over a 23-lecture biomedical imaging corpus to generate prerequisite-aware, learner-adaptive videos with auditable engagement contracts and slide grounding.

Text-Guided Visual Representation Learning for Robust Multimodal E-Commerce Recommendation

cs.IR · 2026-05-17 · unverdicted · novelty 6.0

TGQ-Former uses metadata-guided hybrid queries and dual-gated modulation to improve visual token selection in multimodal e-commerce retrieval, raising average Hit Rate@100 by 6.04% over baselines.

DataComp-LM: In search of the next generation of training sets for language models

cs.LG · 2024-06-17 · unverdicted · novelty 6.0

DCLM-Baseline dataset lets a 7B model reach 64% 5-shot MMLU accuracy after 2.6T tokens, beating prior open-data models by 6.6 points on MMLU with 40% less compute.

RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval

cs.CL · 2024-01-31 · unverdicted · novelty 6.0

RAPTOR introduces a tree-organized retrieval method using recursive abstractive summaries, achieving a 20% absolute accuracy improvement on the QuALITY benchmark when paired with GPT-4.

Unsupervised Adversarial Graph Alignment with Graph Embedding

cs.SI · 2019-07-01 · unverdicted · novelty 6.0

UAGA aligns two graph embedding spaces via adversarial training in a fully unsupervised setting, with an incremental extension iUAGA that uses discovered pseudo-anchors to refine both embeddings and alignments.

Pyramid: A General Framework for Distributed Similarity Search

cs.DC · 2019-06-25 · unverdicted · novelty 6.0

Pyramid is a distributed similarity search framework based on HNSW that partitions datasets into similar-item sub-datasets for efficient query processing and includes failure recovery and straggler mitigation.

QRAFTI: An Agentic Framework for Empirical Research in Quantitative Finance

cs.MA · 2026-04-20 · unverdicted · novelty 6.0

QRAFTI is a multi-agent framework using tool-calling and reflection-based planning to emulate quant research tasks like factor replication and signal testing on financial data.

TRUST Agents: A Collaborative Multi-Agent Framework for Fake News Detection, Explainable Verification, and Logic-Aware Claim Reasoning

cs.AI · 2026-04-14 · unverdicted · novelty 6.0

A multi-agent system for explainable fake news detection that decomposes claims, retrieves evidence, verifies with calibrated confidence, and aggregates logic verdicts, showing better interpretability than BERT/RoBERTa on the LIAR benchmark despite lower raw accuracy.

Beyond RAG for Cyber Threat Intelligence: A Systematic Evaluation of Graph-Based and Agentic Retrieval

cs.AI · 2026-04-13 · unverdicted · novelty 6.0

A hybrid graph-text retrieval system for cyber threat intelligence improves multi-hop question answering by up to 35% over vector-based RAG on a 3,300-question benchmark.

Decision-Aware Memory Cards: Counterfactual-Inspired Context Selection and Compression for Tool-Using LLM Agents

cs.AI · 2026-06-06 · unverdicted · novelty 5.0

CICL scores and compresses context evidence for LLM agents via action-shift and outcome-uplift metrics, lifting hit@1 from 0.58 to 0.78 on 50 SWE-bench retrieval tasks.

ESGLens: An LLM-Based RAG Framework for Interactive ESG Report Analysis and Score Prediction

cs.CL · 2026-03-29 · conditional · novelty 5.0

ESGLens applies RAG and LLM embeddings to extract GRI-aligned information from ESG reports and achieves 0.48 Pearson correlation when regressing environmental scores on 300 company reports.

The Role of Local Intrinsic Dimensionality in Benchmarking Nearest Neighbor Search

cs.IR · 2019-07-17 · unverdicted · novelty 5.0

Local intrinsic dimensionality enables selection of query sets with varying difficulty for nearest neighbor search benchmarking, and common real-world datasets are not diverse, as performance on one predicts performance on the others well.

An Empirical Comparison of FAISS and FENSHSES for Nearest Neighbor Search in Hamming Space

cs.LG · 2019-06-24 · unverdicted · novelty 5.0

Empirical benchmark of FAISS (main memory) versus FENSHSES (secondary memory) on Hamming-space nearest-neighbor search across indexing speed, latency, and RAM.

citing papers explorer

Showing 31 of 31 citing papers.

Dense Passage Retrieval for Open-Domain Question Answering cs.CL · 2020-04-10 · accept · none · ref 83 · internal anchor
Dense dual-encoder retrievers outperform BM25 by 9-19% absolute in top-20 passage retrieval accuracy across open-domain QA datasets and enable new state-of-the-art end-to-end QA results.
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models cs.IR · 2021-04-17 · accept · none · ref 32
BEIR is a heterogeneous zero-shot benchmark showing BM25 as a robust baseline while re-ranking and late-interaction models perform best on average at higher cost, with dense and sparse models lagging in generalization.
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks cs.CL · 2019-08-27 · unverdicted · none · ref 17
Sentence-BERT adapts BERT with siamese and triplet networks to produce sentence embeddings for efficient cosine-similarity comparisons, cutting computation time from hours to seconds on similarity search while matching BERT accuracy.
RWGBench: Evaluating Scholarly Positioning in Related Work Generation cs.DL · 2026-05-30 · unverdicted · none · ref 19 · internal anchor
RWGBench is a citation-centric benchmark for related work generation built from 40k CS papers and a 100-paper test set, with multi-dimensional metrics that better match human expert judgment than standard similarity scores.
Modernizing User Privacy Preference Measurement through GPPI: A GDPR-aligned Privacy Preference Item Bank cs.HC · 2026-05-23 · unverdicted · none · ref 52 · internal anchor
A 527-item GDPR-aligned privacy preference item bank was developed by extracting 669 statements from 99 GDPR articles and validating them through multi-round expert consensus and semantic clustering.
MIST: A Co-Design Framework for Heterogeneous, Multi-Stage LLM Inference cs.AR · 2025-04-14 · unverdicted · none · ref 30 · internal anchor
MIST is a new simulator for heterogeneous multi-stage LLM inference that combines hardware traces with analytical models to explore configuration trade-offs in hybrid CPU-accelerator systems.
ExaGPT: Example-Based Machine-Generated Text Detection for Human Interpretability cs.CL · 2025-02-17 · unverdicted · none · ref 15 · internal anchor
ExaGPT uses span-level similarity retrieval from human and LLM datastores to detect machine-generated text while supplying the matching spans as human-interpretable evidence, achieving up to 37-point accuracy gains over prior interpretable detectors at 1% FPR.
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks cs.CL · 2020-05-22 · accept · none · ref 26
RAG models set new state-of-the-art results on open-domain QA by retrieving Wikipedia passages and conditioning a generative model on them, while also producing more factual text than parametric baselines.
The Decomposition Is the Fingerprint: Per-Component Identity for Agent Skills cs.CR · 2026-06-30 · unverdicted · none · ref 22 · internal anchor
A per-component SimHash fingerprint supplies structural identity for AI agent skills, recovering family membership under paraphrase and refactoring with AUC 0.974 while localizing changes.
When More Cores Hurts: The Vector Database Scaling Paradox in HPC cs.DC · 2026-06-08 · unverdicted · none · ref 33 · internal anchor
Large-scale HPC evaluation of Qdrant, Milvus, and Weaviate reveals that workload patterns limit scaling and extra cores can reduce throughput, exposing a cloud-to-HPC design mismatch.
NTILC: Neural Tool Invocation via Learned Compression cs.SE · 2026-06-04 · unverdicted · none · ref 4 · internal anchor
NTILC replaces in-context tool registry lookup with learned latent retrieval using a signature-aware composite loss, reducing context consumption by over 95% and latency by up to 74%.
CourseBlueprint: A Structured Pipeline for Adaptive Pedagogical Video Generation Grounded in Course Corpora cs.CY · 2026-05-22 · unverdicted · none · ref 29 · internal anchor
CourseBlueprint builds a typed pipeline over a 23-lecture biomedical imaging corpus to generate prerequisite-aware, learner-adaptive videos with auditable engagement contracts and slide grounding.
Text-Guided Visual Representation Learning for Robust Multimodal E-Commerce Recommendation cs.IR · 2026-05-17 · unverdicted · none · ref 18 · internal anchor
TGQ-Former uses metadata-guided hybrid queries and dual-gated modulation to improve visual token selection in multimodal e-commerce retrieval, raising average Hit Rate@100 by 6.04% over baselines.
DataComp-LM: In search of the next generation of training sets for language models cs.LG · 2024-06-17 · unverdicted · none · ref 92 · internal anchor
DCLM-Baseline dataset lets a 7B model reach 64% 5-shot MMLU accuracy after 2.6T tokens, beating prior open-data models by 6.6 points on MMLU with 40% less compute.
RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval cs.CL · 2024-01-31 · unverdicted · none · ref 26 · internal anchor
RAPTOR introduces a tree-organized retrieval method using recursive abstractive summaries, achieving a 20% absolute accuracy improvement on the QuALITY benchmark when paired with GPT-4.
Unsupervised Adversarial Graph Alignment with Graph Embedding cs.SI · 2019-07-01 · unverdicted · none · ref 33 · internal anchor
UAGA aligns two graph embedding spaces via adversarial training in a fully unsupervised setting, with an incremental extension iUAGA that uses discovered pseudo-anchors to refine both embeddings and alignments.
Pyramid: A General Framework for Distributed Similarity Search cs.DC · 2019-06-25 · unverdicted · none · ref 29 · internal anchor
Pyramid is a distributed similarity search framework based on HNSW that partitions datasets into similar-item sub-datasets for efficient query processing and includes failure recovery and straggler mitigation.
QRAFTI: An Agentic Framework for Empirical Research in Quantitative Finance cs.MA · 2026-04-20 · unverdicted · none · ref 30
QRAFTI is a multi-agent framework using tool-calling and reflection-based planning to emulate quant research tasks like factor replication and signal testing on financial data.
TRUST Agents: A Collaborative Multi-Agent Framework for Fake News Detection, Explainable Verification, and Logic-Aware Claim Reasoning cs.AI · 2026-04-14 · unverdicted · none · ref 7
A multi-agent system for explainable fake news detection that decomposes claims, retrieves evidence, verifies with calibrated confidence, and aggregates logic verdicts, showing better interpretability than BERT/RoBERTa on the LIAR benchmark despite lower raw accuracy.
Beyond RAG for Cyber Threat Intelligence: A Systematic Evaluation of Graph-Based and Agentic Retrieval cs.AI · 2026-04-13 · unverdicted · none · ref 19
A hybrid graph-text retrieval system for cyber threat intelligence improves multi-hop question answering by up to 35% over vector-based RAG on a 3,300-question benchmark.
Decision-Aware Memory Cards: Counterfactual-Inspired Context Selection and Compression for Tool-Using LLM Agents cs.AI · 2026-06-06 · unverdicted · none · ref 16 · internal anchor
CICL scores and compresses context evidence for LLM agents via action-shift and outcome-uplift metrics, lifting hit@1 from 0.58 to 0.78 on 50 SWE-bench retrieval tasks.
ESGLens: An LLM-Based RAG Framework for Interactive ESG Report Analysis and Score Prediction cs.CL · 2026-03-29 · conditional · none · ref 36 · internal anchor
ESGLens applies RAG and LLM embeddings to extract GRI-aligned information from ESG reports and achieves 0.48 Pearson correlation when regressing environmental scores on 300 company reports.
The Role of Local Intrinsic Dimensionality in Benchmarking Nearest Neighbor Search cs.IR · 2019-07-17 · unverdicted · none · ref 15 · internal anchor
Local intrinsic dimensionality enables selection of query sets with varying difficulty for nearest neighbor search benchmarking, and common real-world datasets are not diverse, as performance on one predicts performance on the others well.
An Empirical Comparison of FAISS and FENSHSES for Nearest Neighbor Search in Hamming Space cs.LG · 2019-06-24 · unverdicted · none · ref 10 · internal anchor
Empirical benchmark of FAISS (main memory) versus FENSHSES (secondary memory) on Hamming-space nearest-neighbor search across indexing speed, latency, and RAM.
Securing the Agent: Vendor-Neutral, Multitenant Enterprise Retrieval and Tool Use cs.CR · 2026-05-06 · unverdicted · none · ref 12
A server-side architecture with policy-aware ingestion and ABAC-based retrieval gating prevents cross-tenant data leakage in multitenant enterprise RAG and agent systems.
When Agents Handle Secrets: A Survey of Confidential Computing for Agentic AI cs.CR · 2026-05-04 · unverdicted · none · ref 112 · 2 links
A survey providing a taxonomy of TEE platforms, an agent-centric threat model, and open challenges for applying confidential computing to secure agentic AI systems.
Using predefined vector systems to speed up neural network multimillion class classification cs.LG · 2026-04-01 · unverdicted · none · ref 11
Predefined vector systems structure neural network latent spaces to allow O(1) label prediction via index searches on embedding vectors, delivering up to 11.6x speedup on multimillion-class tasks while preserving accuracy and enabling new-class detection.
Product Quantization for Surface Soil Similarity cs.LG · 2025-06-03 · unverdicted · none · ref 7 · internal anchor
A pipeline using product quantization and systematic parameter evaluation creates data-driven soil taxonomies with higher specificity than human-derived classifications.
Evaluation of Chunking Strategies for Effective Text Embedding in Low-Resource Language on Agricultural Documents cs.CL · 2026-05-21 · unverdicted · none · ref 4 · internal anchor
Recursive character-based chunking at 300 characters outperforms Sentence-Based, Khmer-Aware, and LLM-Based methods on L2 distance, answer relevance, and Khmer IoU in a 5-fold evaluation on 18 Khmer agricultural QA pairs.
From Knowledge to Action: Outcomes of the 2025 Large Language Model (LLM) Hackathon for Applications in Materials Science and Chemistry cond-mat.mtrl-sci · 2026-05-04 · unverdicted · none · ref 136
Hackathon submissions indicate LLMs are moving from general assistants toward composable multi-agent systems for structuring scientific knowledge and automating tasks in materials science and chemistry.
MedCase-Structured: A Text-to-FHIR Dataset for Benchmarking Diagnostic Reasoning in Clinically Realistic EHR Settings cs.CL · 2026-05-28 · unreviewed · ref 4 · 2 links · internal anchor
