
arXiv: 1812.11026 · v1 · pith:ZADGPFBV · submitted 2018-12-28 · stat.ME

Hybrid Wasserstein Distance and Fast Distribution Clustering

classification: stat.ME
keywords: distance, approximation, Wasserstein, clustering, differences, distribution, estimated, location-scale

We define a modified Wasserstein distance for distribution clustering which inherits many of the properties of the Wasserstein distance but which can be estimated easily and computed quickly. The modified distance is the sum of two terms. The first term, which has a closed form, measures the location-scale differences between the distributions. The second term is an approximation that measures the remaining distance after accounting for location-scale differences. We consider several forms of approximation, with our main emphasis being a tangent space approximation that can be estimated using nonparametric regression. We evaluate the strengths and weaknesses of this approach on simulated and real examples.
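
The two-term structure described in the abstract can be illustrated with a minimal one-dimensional sketch. Everything specific here is an assumption, not the paper's definition: the function name `hybrid_w2_squared_1d` is hypothetical, the location-scale term is taken to be the squared difference of means plus the squared difference of standard deviations, and the residual term is computed as the squared 2-Wasserstein distance between the standardized samples via empirical quantiles. In particular, the residual is computed directly here and is not the tangent space approximation the abstract emphasizes.

```python
import numpy as np


def hybrid_w2_squared_1d(x, y, n_grid=1000):
    """Illustrative two-term distance between two 1-D samples.

    Assumption (not taken from the paper): returns a tuple
    (location_scale_term, residual_term, total), all on the squared scale.
    Both samples must have nonzero standard deviation.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Closed-form term: differences in location (mean) and scale (std).
    mx, my = x.mean(), y.mean()
    sx, sy = x.std(), y.std()
    loc_scale = (mx - my) ** 2 + (sx - sy) ** 2

    # Residual term: compare the standardized samples through their
    # empirical quantile functions on a common grid of probabilities.
    u = (np.arange(n_grid) + 0.5) / n_grid
    qx = np.quantile((x - mx) / sx, u)
    qy = np.quantile((y - my) / sy, u)
    residual = np.mean((qx - qy) ** 2)

    return loc_scale, residual, loc_scale + residual
```

When one sample is an affine transformation of the other (for example `y = 3 + 2 * x`), the standardized samples coincide, so the residual term is numerically zero and the whole distance is carried by the closed-form term.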
