HDBSCAN: Density based Clustering over Location Based Services
pith:FCMBHUQ4 Add to your LaTeX paper
What is a Pith Number?\usepackage{pith}
\pithnumber{FCMBHUQ4}
Prints a linked pith:FCMBHUQ4 badge after your title and writes the identifier into PDF metadata. Compiles on arXiv with no extra files. Learn more
read the original abstract
Location Based Services (LBS) have become extremely popular and used by millions of users. Popular LBS run the entire gamut from mapping services (such as Google Maps) to restaurants (such as Yelp) and real-estate (such as Redfin). The public query interfaces of LBS can be abstractly modeled as a kNN interface over a database of two dimensional points: given an arbitrary query point, the system returns the k points in the database that are nearest to the query point. Often, k is set to a small value such as 20 or 50. In this paper, we consider the novel problem of enabling density based clustering over an LBS with only a limited, kNN query interface. Due to the query rate limits imposed by LBS, even retrieving every tuple once is infeasible. Hence, we seek to construct a cluster assignment function f(.) by issuing a small number of kNN queries, such that for any given tuple t in the database which may or may not have been accessed, f(.) outputs the cluster assignment of t with high accuracy. We conduct a comprehensive set of experiments over benchmark datasets and popular real-world LBS such as Yahoo! Flickr, Zillow, Redfin and Google Maps.
This paper has not been read by Pith yet.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.