Parallel Metric Skiplists and Nearest Neighbor Search

Rohin Garg; Xiangyun Ding; Yan Gu; Yihan Sun

read the original abstract

The metric skip-list is a data structure designed for efficient nearest and $k$-nearest neighbor search in metric spaces. For many real-world datasets with reasonable distributions - specifically, those with a constant expansion rate - it supports $\tilde{O}(n)$ construction time and $O(k\log n)$ query time, where $n$ is the input size and $k$ is the number of nearest neighbors in queries. Notably, unlike alternative approaches, it does not require a bounded aspect ratio, making it more flexible for input data distributions. However, the inherently sequential nature of its original construction has, to our knowledge, precluded any existing parallel algorithm. In this paper, we present highly parallel and work-efficient algorithms for constructing metric skip lists. Under the assumption of a constant expansion rate, our approach achieves an expected work of $O(n \log n)$ and a polylogarithmic span with high probability. Our design is based on novel algorithmic insights that improves the sequential procedure, enabling a divide-and-conquer strategy that facilitates parallelism while maintaining efficiency. With our algorithms, we can also support improved bounds for relevant applications using nearest neighbor as building blocks, including bichromatic closest pair (BCP), density-based clustering, and $k$-NN graph construction, among others. To our knowledge, many of these results represent the first solutions to achieve both work efficiency and polylogarithmic span, relying solely on the assumption of a constant expansion rate.

Parallel Metric Skiplists and Nearest Neighbor Search

discussion (0)