pith. sign in

arxiv: 1603.08578 · v2 · pith:B5OY325Onew · submitted 2016-03-28 · 🧮 math.ST · cs.IT· math.IT· stat.ML· stat.TH

Analysis of k-Nearest Neighbor Distances with Application to Entropy Estimation

classification 🧮 math.ST cs.ITmath.ITstat.MLstat.TH
keywords estimatork-nndistancesentropyfinite-sampleinformationmutualbehavior
0
0 comments X
read the original abstract

Estimating entropy and mutual information consistently is important for many machine learning applications. The Kozachenko-Leonenko (KL) estimator (Kozachenko & Leonenko, 1987) is a widely used nonparametric estimator for the entropy of multivariate continuous random variables, as well as the basis of the mutual information estimator of Kraskov et al. (2004), perhaps the most widely used estimator of mutual information in this setting. Despite the practical importance of these estimators, major theoretical questions regarding their finite-sample behavior remain open. This paper proves finite-sample bounds on the bias and variance of the KL estimator, showing that it achieves the minimax convergence rate for certain classes of smooth functions. In proving these bounds, we analyze finite-sample behavior of k-nearest neighbors (k-NN) distance statistics (on which the KL estimator is based). We derive concentration inequalities for k-NN distances and a general expectation bound for statistics of k-NN distances, which may be useful for other analyses of k-NN methods.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Statistical estimation of the Kullback-Leibler divergence

    math.ST 2019-06 unverdicted novelty 4.0

    Establishes asymptotic unbiasedness and L2-consistency for k-NN estimators of KL divergence between densities in R^d under broad conditions, including Gaussians, plus new results for Kozachenko-Leonenko entropy estimators.