pith. sign in

arxiv: 1501.01062 · v3 · pith:PHQTRXDDnew · submitted 2015-01-06 · 💻 cs.DS

Optimal Data-Dependent Hashing for Approximate Near Neighbors

classification 💻 cs.DS
keywords hashingoptimalspacedatadata-dependentainr14approximateapproximation
0
0 comments X
read the original abstract

We show an optimal data-dependent hashing scheme for the approximate near neighbor problem. For an $n$-point data set in a $d$-dimensional space our data structure achieves query time $O(d n^{\rho+o(1)})$ and space $O(n^{1+\rho+o(1)} + dn)$, where $\rho=\tfrac{1}{2c^2-1}$ for the Euclidean space and approximation $c>1$. For the Hamming space, we obtain an exponent of $\rho=\tfrac{1}{2c-1}$. Our result completes the direction set forth in [AINR14] who gave a proof-of-concept that data-dependent hashing can outperform classical Locality Sensitive Hashing (LSH). In contrast to [AINR14], the new bound is not only optimal, but in fact improves over the best (optimal) LSH data structures [IM98,AI06] for all approximation factors $c>1$. From the technical perspective, we proceed by decomposing an arbitrary dataset into several subsets that are, in a certain sense, pseudo-random.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. On the practicality of quantum sieving algorithms for the shortest vector problem

    quant-ph 2024-10 unverdicted novelty 6.0

    Quantum sieving for SVP in dimension 400 needs ~10^13 physical qubits and ~10^31 years under optimistic assumptions, offering no practical speedup over classical methods.