pith. machine review for the scientific record.

arXiv: 1509.02897 · v1 · submitted 2015-09-09 · 💻 cs.DS · cs.CG · cs.IR

Recognition: unknown

Practical and Optimal LSH for Angular Distance

classification 💻 cs.DS · cs.CG · cs.IR
keywords: algorithm, angular, distance, family, optimal, above, andoni, bound

We show the existence of a Locality-Sensitive Hashing (LSH) family for the angular distance that yields an approximate Near Neighbor Search algorithm with the asymptotically optimal running time exponent. Unlike earlier algorithms with this property (e.g., Spherical LSH [Andoni, Indyk, Nguyen, Razenshteyn 2014], [Andoni, Razenshteyn 2015]), our algorithm is also practical, improving upon the well-studied hyperplane LSH [Charikar, 2002] in practice. We also introduce a multiprobe version of this algorithm, and conduct experimental evaluation on real and synthetic data sets. We complement the above positive results with a fine-grained lower bound for the quality of any LSH family for angular distance. Our lower bound implies that the above LSH family exhibits a trade-off between evaluation time and quality that is close to optimal for a natural class of LSH functions.
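
The hyperplane LSH baseline that the abstract cites [Charikar, 2002] can be sketched in a few lines. This is a minimal illustration of that baseline only, not the new hash family the paper proposes, and the function names (`make_hyperplanes`, `hyperplane_hash`, `bit_agreement`) and parameters are hypothetical choices made for this example. Each hash bit records which side of a random hyperplane through the origin a vector falls on, so two vectors at angle θ agree on a given bit with probability 1 − θ/π.

```python
import math
import random


def make_hyperplanes(dim, num_bits, seed=0):
    """Draw num_bits random Gaussian normal vectors in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hyperplane_hash(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(h_i * v_i for h_i, v_i in zip(h, vector)) >= 0 else 0
        for h in hyperplanes
    )


def angle(u, v):
    """Angle between u and v in radians (the angular distance)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / (norm_u * norm_v))))


def bit_agreement(u, v, hyperplanes):
    """Fraction of hash bits on which u and v agree."""
    hu = hyperplane_hash(u, hyperplanes)
    hv = hyperplane_hash(v, hyperplanes)
    return sum(a == b for a, b in zip(hu, hv)) / len(hyperplanes)


if __name__ == "__main__":
    planes = make_hyperplanes(dim=2, num_bits=4000, seed=1)
    u, v = [1.0, 0.0], [0.0, 1.0]
    theta = angle(u, v)
    print("expected agreement:", 1.0 - theta / math.pi)
    print("observed agreement:", bit_agreement(u, v, planes))
```

With many bits, the observed agreement should approach 1 − θ/π, which is what makes the concatenated bits usable as a locality-sensitive hash for angular distance. The sketch uses only the standard library to stay self-contained; a practical implementation would vectorize the projections.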

This paper has not been read by Pith yet.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Reformer: The Efficient Transformer

    cs.LG 2020-01 accept novelty 8.0

    Reformer matches standard Transformer accuracy on long sequences while using far less memory and running faster via LSH attention and reversible residual layers.

  2. Using predefined vector systems to speed up neural network multimillion class classification

    cs.LG 2026-04 unverdicted novelty 5.0

    Predefined vector systems structure neural network latent spaces to allow O(1) label prediction via index searches on embedding vectors, delivering up to 11.6x speedup on multimillion-class tasks while preserving accu...