Hashing for Similarity Search: A Survey
Similarity search (nearest neighbor search) is the problem of finding, in a large database, the data items whose distances to a query item are the smallest. Various methods have been developed to address this problem, and recently considerable effort has been devoted to approximate search. In this paper, we present a survey of one of the main solutions, hashing, which has been widely studied since the pioneering work on locality sensitive hashing. We divide the hashing algorithms into two main categories: locality sensitive hashing, which designs hash functions without exploring the data distribution, and learning to hash, which learns hash functions according to the data distribution. We review them from various aspects, including hash function design, and the distance measure and search scheme in the hash coding space.
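To make the first category concrete, here is a minimal sketch of one well-known data-independent scheme, random hyperplane hashing for angular similarity. It is an illustration only and is not taken from the survey: the names `make_hyperplane_hash`, `hamming`, and `search`, and the parameters `n_bits` and `seed`, are hypothetical. Hash functions are drawn at random without looking at the data, and search ranks database items by Hamming distance in the hash coding space.

```python
import random


def make_hyperplane_hash(dim, n_bits, seed=0):
    """Build a hash function that maps a vector to a tuple of n_bits bits.

    Each bit is the sign of the dot product with a random Gaussian
    direction. The directions are drawn without inspecting any data,
    which is what makes this a data-independent (LSH-style) scheme.
    """
    rng = random.Random(seed)
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

    def hash_fn(vec):
        return tuple(
            1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
            for plane in planes
        )

    return hash_fn


def hamming(a, b):
    """Number of positions where two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def search(query, database, hash_fn, k=1):
    """Return indices of the k items whose codes are closest to the query code.

    This is an exhaustive scan over codes, kept simple for illustration;
    practical systems typically use hash tables or multi-index lookups.
    """
    q_code = hash_fn(query)
    ranked = sorted(
        range(len(database)),
        key=lambda i: hamming(q_code, hash_fn(database[i])),
    )
    return ranked[:k]


if __name__ == "__main__":
    db = [[1.0, 2.0, 3.0], [-1.0, -2.0, -3.0], [3.0, -1.0, 0.5]]
    h = make_hyperplane_hash(dim=3, n_bits=16, seed=42)
    print(search([2.0, 4.0, 6.0], db, h, k=2))
```

A learning-to-hash method would instead fit the projection directions (or a more general encoder) to the database, rather than sampling them at random as above.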
Forward citations
Cited by 4 Pith papers
- RNSG: A Range-Aware Graph Index for Efficient Range-Filtered Approximate Nearest Neighbor Search
  RNSG approximates the range-aware relative neighborhood graph (RRNG) to enable high-performance range-filtered ANN queries with one compact index instead of many.
- Algorithms for Similarity Search and Pseudorandomness
  Improved LSH frameworks for ANN search with space-time tradeoffs and matching lower bounds, a novel set-based ANN approach, self-tuning experiments, and deterministic/randomized pseudorandom generators with near-optim...
- Statistical Clear Sky Fitting Algorithm
  A statistical algorithm extracts a clear-sky performance signal from PV power measurements without external weather, irradiance, or configuration data.
- Learning Compressed Sentence Representations for On-Device Text Processing
  Four binarization strategies turn continuous sentence embeddings into binary form, cutting storage by over 98% with only about 2% performance drop on downstream tasks.