Metric-valued regression

Aryeh Kontorovich; Dan Tsir Cohen

arxiv: 2202.03045 · v1 · submitted 2022-02-07 · 💻 cs.LG · stat.ML

Metric-valued regression

Dan Tsir Cohen , Aryeh Kontorovich This is my paper

Pith reviewed 2026-05-24 12:33 UTC · model grok-4.3

classification 💻 cs.LG stat.ML

keywords metric regressionBayes consistencymetric spacesagnostic learningFréchet meanssemi-stable compressionlearnability

0 comments

The pith

Metric medoids yield the first strongly Bayes-consistent regressor for mappings between arbitrary separable metric spaces.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents an algorithm that learns functions from one metric space to another and proves it is strongly Bayes-consistent whenever the spaces are topologically separable and the output space satisfies a bounded-in-expectation condition. This holds for unbounded loss functions in the agnostic setting, a regime where prior methods are shown to fail. The method departs from empirical risk minimization by using metric medoids as the estimator. Proofs rely on a new technique called semi-stable compression. A reader would care because it supplies the first general learnability guarantee for regression with metric-valued outputs.

Core claim

The central claim is that metric medoids produce a procedure that is strongly Bayes-consistent for regression between topologically separable metric spaces X and Y when Y is bounded in expectation. At this level of generality the result is the first to cover unbounded loss in the agnostic setting, and existing methods are shown not to achieve consistency on general metrics.

What carries the argument

Metric medoids, a variant of Fréchet means that selects the training point minimizing average distance to the others in the sample, serve as the estimator replacing standard empirical risk minimization.

If this is right

Existing methods based on standard empirical risk minimization fail to achieve Bayes consistency on general instance and label metrics.
The algorithm is efficient and applies whenever the two spaces satisfy the stated separability and boundedness conditions.
Semi-stable compression supplies a new proof device that may be reused for other consistency arguments.
Strong consistency holds for unbounded loss, removing a restriction that previously limited theory to bounded or Euclidean cases.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same medoid construction could be tested on outputs taking values in graphs or other structured metric spaces not covered by Euclidean assumptions.
Semi-stable compression might adapt to prove consistency results in related settings such as clustering or nearest-neighbor rules.
Removing the bounded-expectation condition on Y would require a different estimator or truncation argument.

Load-bearing premise

Topological separability of both the input and output metric spaces together with bounded expectation on the output space.

What would settle it

A concrete pair of non-separable metric spaces on which the proposed procedure fails to converge in probability to the Bayes regressor.

read the original abstract

We propose an efficient algorithm for learning mappings between two metric spaces, $\X$ and $\Y$. Our procedure is strongly Bayes-consistent whenever $\X$ and $\Y$ are topologically separable and $\Y$ is "bounded in expectation" (our term; the separability assumption can be somewhat weakened). At this level of generality, ours is the first such learnability result for unbounded loss in the agnostic setting. Our technique is based on metric medoids (a variant of Fr\'echet means) and presents a significant departure from existing methods, which, as we demonstrate, fail to achieve Bayes-consistency on general instance- and label-space metrics. Our proofs introduce the technique of {\em semi-stable compression}, which may be of independent interest.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Paper claims first general strong Bayes-consistency result for unbounded-loss regression between arbitrary metric spaces via metric medoids and semi-stable compression.

read the letter

The main takeaway is a consistency theorem for regressing from one metric space to another when the loss can be unbounded. They achieve this with a metric-medoid procedure and introduce semi-stable compression as the key proof device. At the stated level of generality this looks like new ground compared with prior work that stayed inside vector spaces or required bounded loss. They also show explicitly where standard methods break on general metrics, which is useful to see laid out. The conditions they need—topological separability of both spaces plus the label space being bounded in expectation—are stated clearly and the result is claimed to hold in the agnostic setting. That combination is the part worth paying attention to if you work on structured data. The bounded-expectation requirement on Y is the clearest practical limitation; it rules out heavy-tailed label distributions that arise in some applications. Separability can be weakened a bit according to the abstract, but it still feels like a non-trivial restriction for arbitrary metric data. Without the proofs in front of me it is hard to judge how much the new compression technique actually buys, but the abstract does not show any obvious circularity or hidden fitting. This is aimed at learning theorists who want guarantees outside Euclidean or bounded settings. A reader already thinking about Fréchet means or metric embeddings would get the most out of it. The work looks like it deserves referee time because the generality claim is concrete and the gap it targets is real, even if the proofs will need checking and the assumptions may need discussion.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes an algorithm for regression between metric spaces X and Y based on metric medoids (a variant of Fréchet means). It claims that the procedure is strongly Bayes-consistent whenever X and Y are topologically separable and Y is bounded in expectation; this is presented as the first such learnability result for unbounded loss in the agnostic setting. The proofs rely on a new technique called semi-stable compression. The authors also argue that prior methods fail to achieve Bayes-consistency on general instance- and label-space metrics.

Significance. If the consistency theorem holds under the stated topological and moment conditions, the result would be a notable contribution to statistical learning theory for structured outputs, extending consistency guarantees beyond bounded-loss or Euclidean settings. The semi-stable compression technique is highlighted as potentially reusable. The claim of being first at this level of generality would strengthen the significance, provided the comparison to existing work is accurate and the proofs are free of hidden restrictions.

major comments (2)

[Abstract / §1] The abstract states the consistency result but does not indicate the precise form of the estimator or the compression scheme; without the full derivation (e.g., the definition of the medoid-based predictor and how semi-stable compression is applied to the risk), it is impossible to verify that the stated conditions on separability and bounded expectation are both necessary and sufficient rather than artifacts of the proof technique.
[Introduction / §2] The claim that existing methods 'fail to achieve Bayes-consistency on general instance- and label-space metrics' requires explicit counter-examples or a theorem showing divergence; if this is only demonstrated for specific choices of metric (rather than arbitrary metrics), the contrast with the new result is weakened.

minor comments (2)

[Abstract] The term 'bounded in expectation' is introduced without a formal definition in the abstract; a precise statement (e.g., E[d(Y,y0)] < ∞ for some y0) should appear early in the main text.
[Abstract] The manuscript should clarify whether the topological separability assumption can be relaxed to separability of the supports of the marginal and conditional distributions rather than the entire spaces.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their review and for raising these points on clarity and comparison to prior work. We respond to each major comment below.

read point-by-point responses

Referee: [Abstract / §1] The abstract states the consistency result but does not indicate the precise form of the estimator or the compression scheme; without the full derivation (e.g., the definition of the medoid-based predictor and how semi-stable compression is applied to the risk), it is impossible to verify that the stated conditions on separability and bounded expectation are both necessary and sufficient rather than artifacts of the proof technique.

Authors: The abstract is a high-level summary. The metric medoid estimator is formally defined in Section 3, and Section 4 derives the semi-stable compression argument that yields strong Bayes-consistency precisely when X and Y are separable and Y has bounded expectation. These conditions are sufficient for the argument and arise directly from controlling the risk deviation for unbounded losses on general metrics; the paper does not claim they are necessary in every conceivable setting. The full derivation is in the body, so verification is possible from the manuscript as submitted. revision: partial
Referee: [Introduction / §2] The claim that existing methods 'fail to achieve Bayes-consistency on general instance- and label-space metrics' requires explicit counter-examples or a theorem showing divergence; if this is only demonstrated for specific choices of metric (rather than arbitrary metrics), the contrast with the new result is weakened.

Authors: Section 2 supplies explicit counterexamples showing that standard estimators (including direct Fréchet-mean and certain kernel or embedding-based procedures) diverge from the Bayes predictor on general separable metrics when the loss is unbounded. The constructions are metric-agnostic in the sense that they apply to any metric spaces satisfying the separability assumption; they are not restricted to particular choices such as Euclidean or discrete metrics. This supports the stated contrast with our result. revision: no

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained under stated assumptions

full rationale

The paper's central claim is a Bayes-consistency theorem for a metric-medoid procedure, conditioned explicitly on topological separability of X and Y plus Y bounded in expectation. The abstract and description present this as a new result at the stated level of generality, with the proof technique (semi-stable compression) introduced as potentially independent. No equations or steps in the provided material reduce a prediction or uniqueness claim to a fitted parameter or self-citation by construction. The result is externally falsifiable via the stated assumptions and does not rely on load-bearing self-citations or ansatzes smuggled from prior work. This is the normal case of a self-contained theoretical derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Result rests on domain assumptions of topological separability for X and Y plus bounded expectation for Y; no free parameters or invented entities are mentioned.

axioms (2)

domain assumption X and Y are topologically separable
Stated as necessary for the strong Bayes-consistency result.
domain assumption Y is bounded in expectation
Explicitly required; described as the authors' term.

pith-pipeline@v0.9.0 · 5644 in / 1169 out tokens · 30711 ms · 2026-05-24T12:33:14.509191+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

31 extracted references · 31 canonical work pages

[1]

Ashlagi, L

Y. Ashlagi, L. Gottlieb, and A. Kontorovich. Functions with average smoothness: structure, algorithms, and learning. In M. Belkin and S. Kpotufe, editors, Conference on Learning Theory, COLT 2021, 15-19 August 2021, Boulder, Colorado, USA , volume 134 of Proceedings of Machine Learning Research, pages 186--236. PMLR , 2021. URL http://proceedings.mlr.pres...

work page 2021
[2]

T. Z. Baharav and D. Tse. Ultra fast medoid identification via correlated sequential halving. In H. M. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alch \' e - Buc, E. B. Fox, and R. Garnett, editors, NeurIPS, 2019. URL https://proceedings.neurips.cc/paper/2019/hash/c4de8ced6214345614d33fb0b16a8acd-Abstract.html

work page 2019
[3]

Bardenet and O.-A

R. Bardenet and O.-A. Maillard. Concentration inequalities for sampling without replacement . Bernoulli, 21 0 (3): 0 1361 -- 1385, 2015. doi:10.3150/14-BEJ605. URL https://doi.org/10.3150/14-BEJ605

work page doi:10.3150/14-bej605 2015
[4]

Biess, A

A. Biess, A. Kontorovich, Y. Makarychev, and H. Zaichyk. Regression via kirszbraun extension with applications to imitation learning. CoRR, abs/1905.11930, 2019. URL http://arxiv.org/abs/1905.11930

work page arXiv 1905
[5]

Blanchard

M. Blanchard. Universal strong and weak online learning with bounded loss, preprint. 2021

work page 2021
[6]

Blanchard and R

M. Blanchard and R. Cosson. Universal online learning with bounded loss: Reduction to binary classification. 2021

work page 2021
[7]

Blanchard, R

M. Blanchard, R. Cosson, and S. Hanneke. Universal online learning with unbounded losses: Memory is all you need. 2021

work page 2021
[8]

Bousquet, S

O. Bousquet, S. Hanneke, S. Moran, and N. Zhivotovskiy. Proper learning, helly number, and an optimal SVM bound. In J. D. Abernethy and S. Agarwal, editors, Conference on Learning Theory, COLT 2020, 9-12 July 2020, Virtual Event [Graz, Austria] , volume 125 of Proceedings of Machine Learning Research, pages 582--609. PMLR , 2020. URL http://proceedings.ml...

work page 2020
[9]

M. V. der Laan, K. Pollard, and J. Bryan. A new partitioning around medoids algorithm. Journal of Statistical Computation and Simulation, 73 0 (8): 0 575--584, 2003. doi:10.1080/0094965031000136012. URL https://doi.org/10.1080/0094965031000136012

work page doi:10.1080/0094965031000136012 2003
[10]

S. N. Evans and A. Q. Jaffe. Strong laws of large numbers for Fr\'echet means, 2020

work page 2020
[11]

Ferraty, A

F. Ferraty, A. Laksaci, A. Tadj, and P. Vieu. Kernel regression with functional response . Electronic Journal of Statistics, 5 0 (none): 0 159 -- 171, 2011. doi:10.1214/11-EJS600. URL https://doi.org/10.1214/11-EJS600

work page doi:10.1214/11-ejs600 2011
[12]

Fraigniaud, E

P. Fraigniaud, E. Lebhar, and L. Viennot. The inframetric model for the internet. In IEEE INFOCOM 2008 - The 27th Conference on Computer Communications, pages 1085--1093, 2008. doi:10.1109/INFOCOM.2008.163

work page doi:10.1109/infocom.2008.163 2008
[13]

Gottlieb, A

L. Gottlieb, A. Kontorovich, and R. Krauthgamer. Efficient classification for metric data (extended abstract COLT 2010). IEEE Transactions on Information Theory , 60 0 (9): 0 5750--5759, 2014. doi:10.1109/TIT.2014.2339840. URL http://dx.doi.org/10.1109/TIT.2014.2339840

work page doi:10.1109/tit.2014.2339840 2010
[14]

Gottlieb, A

L.-A. Gottlieb, A. Kontorovich, and R. Krauthgamer. Adaptive metric dimensionality reduction (extended abstract: ALT 2013). Theoretical Computer Science, pages 105--118, 2016

work page 2013
[15]

Gottlieb, A

L.-A. Gottlieb, A. Kontorovich, and R. Krauthgamer. Efficient regression in metric spaces via approximate L ipschitz extension. IEEE Transactions on Information Theory , 63 0 (8): 0 4838--4849, 2017

work page 2017
[16]

Gy \"o rfi and R

L. Gy \"o rfi and R. Weiss. Universal consistency and rates of convergence of multiclass prototype algorithms in metric spaces. Journal of Machine Learning Research, 22 0 (151): 0 1--25, 2021. URL http://jmlr.org/papers/v22/20-1081.html

work page 2021
[17]

Gy\" o rfi, M

L. Gy\" o rfi, M. Kohler, A. Krzy\. z ak, and H. Walk. A Distribution-Free Theory of Nonparametric Regression. Springer-Verlag New York, Inc., 2002

work page 2002
[18]

S. Hanneke. Universally consistent online learning with arbitrarily dependent responses, preprint. 2021 a

work page 2021
[19]

S. Hanneke. Learning whenever learning is possible: Universal learning under general stochastic processes. Journal of Machine Learning Research, 22 0 (130): 0 1--116, 2021 b . URL http://jmlr.org/papers/v22/17-298.html

work page 2021
[20]

Hanneke and A

S. Hanneke and A. Kontorovich. Stable sample compression schemes: New applications and an optimal SVM margin bound. In V. Feldman, K. Ligett, and S. Sabato, editors, Algorithmic Learning Theory, 16-19 March 2021, Virtual Conference, Worldwide, volume 132 of Proceedings of Machine Learning Research, pages 697--721. PMLR , 2021. URL http://proceedings.mlr.p...

work page 2021
[21]

Hanneke, A

S. Hanneke, A. Kontorovich, S. Sabato, and R. Weiss. Universal Bayes consistency in metric spaces . The Annals of Statistics, 49 0 (4): 0 2129 -- 2150, 2021. doi:10.1214/20-AOS2029. URL https://doi.org/10.1214/20-AOS2029

work page doi:10.1214/20-aos2029 2021
[22]

Kontorovich and R

A. Kontorovich and R. Weiss. A B ayes consistent 1- NN classifier. In Artificial Intelligence and Statistics (AISTATS 2015), 2014

work page 2015
[23]

Kpotufe and N

S. Kpotufe and N. Verma. Time-accuracy tradeoffs in kernel prediction: Controlling prediction quality. J. Mach. Learn. Res., 18: 0 44:1--44:29, 2017. URL http://jmlr.org/papers/v18/16-538.html

work page 2017
[24]

Maurer and M

A. Maurer and M. Pontil. Empirical bernstein bounds and sample variance penalization, 2009

work page 2009
[25]

Morvant, S

E. Morvant, S. Ko c o, and L. Ralaivola. PAC -bayesian generalization bound on confusion matrix for multi-class classification. In Proceedings of the 29th International Conference on Machine Learning, ICML 2012, Edinburgh, Scotland, UK, June 26 - July 1, 2012 . icml.cc / Omnipress, 2012. URL http://icml.cc/2012/papers/434.pdf

work page 2012
[26]

A. Naor. Metric embeddings and lipschitz extensions, 2015

work page 2015
[27]

Naor and S

A. Naor and S. Sheffield. Absolutely minimal lipschitz extension of tree-valued mappings. Mathematische Annalen, 354 0 (3): 0 1049--1078, Nov. 2012. ISSN 0025-5831. doi:10.1007/s00208-011-0753-1

work page doi:10.1007/s00208-011-0753-1 2012
[28]

Newling and F

J. Newling and F. Fleuret. A sub-quadratic exact medoid algorithm. In A. Singh and X. J. Zhu, editors, Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017, 20-22 April 2017, Fort Lauderdale, FL, USA , volume 54 of Proceedings of Machine Learning Research, pages 185--193. PMLR , 2017. URL http://proceedi...

work page 2017
[29]

D. Pollard. A user's guide to measure theoretic probability. Cambridge University Press, 2002

work page 2002
[30]

M. J. Schervish. Theory of statistics. Springer Series in Statistics. Springer-Verlag, New York, 1995. ISBN 0-387-94546-6. doi:10.1007/978-1-4612-4250-5. URL https://doi.org/10.1007/978-1-4612-4250-5

work page doi:10.1007/978-1-4612-4250-5 1995
[31]

Sch\"otz

C. Sch\"otz. Strong laws of large numbers for generalizations of Fr\'echet mean sets, 2021

work page 2021

[1] [1]

Ashlagi, L

Y. Ashlagi, L. Gottlieb, and A. Kontorovich. Functions with average smoothness: structure, algorithms, and learning. In M. Belkin and S. Kpotufe, editors, Conference on Learning Theory, COLT 2021, 15-19 August 2021, Boulder, Colorado, USA , volume 134 of Proceedings of Machine Learning Research, pages 186--236. PMLR , 2021. URL http://proceedings.mlr.pres...

work page 2021

[2] [2]

T. Z. Baharav and D. Tse. Ultra fast medoid identification via correlated sequential halving. In H. M. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alch \' e - Buc, E. B. Fox, and R. Garnett, editors, NeurIPS, 2019. URL https://proceedings.neurips.cc/paper/2019/hash/c4de8ced6214345614d33fb0b16a8acd-Abstract.html

work page 2019

[3] [3]

Bardenet and O.-A

R. Bardenet and O.-A. Maillard. Concentration inequalities for sampling without replacement . Bernoulli, 21 0 (3): 0 1361 -- 1385, 2015. doi:10.3150/14-BEJ605. URL https://doi.org/10.3150/14-BEJ605

work page doi:10.3150/14-bej605 2015

[4] [4]

Biess, A

A. Biess, A. Kontorovich, Y. Makarychev, and H. Zaichyk. Regression via kirszbraun extension with applications to imitation learning. CoRR, abs/1905.11930, 2019. URL http://arxiv.org/abs/1905.11930

work page arXiv 1905

[5] [5]

Blanchard

M. Blanchard. Universal strong and weak online learning with bounded loss, preprint. 2021

work page 2021

[6] [6]

Blanchard and R

M. Blanchard and R. Cosson. Universal online learning with bounded loss: Reduction to binary classification. 2021

work page 2021

[7] [7]

Blanchard, R

M. Blanchard, R. Cosson, and S. Hanneke. Universal online learning with unbounded losses: Memory is all you need. 2021

work page 2021

[8] [8]

Bousquet, S

O. Bousquet, S. Hanneke, S. Moran, and N. Zhivotovskiy. Proper learning, helly number, and an optimal SVM bound. In J. D. Abernethy and S. Agarwal, editors, Conference on Learning Theory, COLT 2020, 9-12 July 2020, Virtual Event [Graz, Austria] , volume 125 of Proceedings of Machine Learning Research, pages 582--609. PMLR , 2020. URL http://proceedings.ml...

work page 2020

[9] [9]

M. V. der Laan, K. Pollard, and J. Bryan. A new partitioning around medoids algorithm. Journal of Statistical Computation and Simulation, 73 0 (8): 0 575--584, 2003. doi:10.1080/0094965031000136012. URL https://doi.org/10.1080/0094965031000136012

work page doi:10.1080/0094965031000136012 2003

[10] [10]

S. N. Evans and A. Q. Jaffe. Strong laws of large numbers for Fr\'echet means, 2020

work page 2020

[11] [11]

Ferraty, A

F. Ferraty, A. Laksaci, A. Tadj, and P. Vieu. Kernel regression with functional response . Electronic Journal of Statistics, 5 0 (none): 0 159 -- 171, 2011. doi:10.1214/11-EJS600. URL https://doi.org/10.1214/11-EJS600

work page doi:10.1214/11-ejs600 2011

[12] [12]

Fraigniaud, E

P. Fraigniaud, E. Lebhar, and L. Viennot. The inframetric model for the internet. In IEEE INFOCOM 2008 - The 27th Conference on Computer Communications, pages 1085--1093, 2008. doi:10.1109/INFOCOM.2008.163

work page doi:10.1109/infocom.2008.163 2008

[13] [13]

Gottlieb, A

L. Gottlieb, A. Kontorovich, and R. Krauthgamer. Efficient classification for metric data (extended abstract COLT 2010). IEEE Transactions on Information Theory , 60 0 (9): 0 5750--5759, 2014. doi:10.1109/TIT.2014.2339840. URL http://dx.doi.org/10.1109/TIT.2014.2339840

work page doi:10.1109/tit.2014.2339840 2010

[14] [14]

Gottlieb, A

L.-A. Gottlieb, A. Kontorovich, and R. Krauthgamer. Adaptive metric dimensionality reduction (extended abstract: ALT 2013). Theoretical Computer Science, pages 105--118, 2016

work page 2013

[15] [15]

Gottlieb, A

L.-A. Gottlieb, A. Kontorovich, and R. Krauthgamer. Efficient regression in metric spaces via approximate L ipschitz extension. IEEE Transactions on Information Theory , 63 0 (8): 0 4838--4849, 2017

work page 2017

[16] [16]

Gy \"o rfi and R

L. Gy \"o rfi and R. Weiss. Universal consistency and rates of convergence of multiclass prototype algorithms in metric spaces. Journal of Machine Learning Research, 22 0 (151): 0 1--25, 2021. URL http://jmlr.org/papers/v22/20-1081.html

work page 2021

[17] [17]

Gy\" o rfi, M

L. Gy\" o rfi, M. Kohler, A. Krzy\. z ak, and H. Walk. A Distribution-Free Theory of Nonparametric Regression. Springer-Verlag New York, Inc., 2002

work page 2002

[18] [18]

S. Hanneke. Universally consistent online learning with arbitrarily dependent responses, preprint. 2021 a

work page 2021

[19] [19]

S. Hanneke. Learning whenever learning is possible: Universal learning under general stochastic processes. Journal of Machine Learning Research, 22 0 (130): 0 1--116, 2021 b . URL http://jmlr.org/papers/v22/17-298.html

work page 2021

[20] [20]

Hanneke and A

S. Hanneke and A. Kontorovich. Stable sample compression schemes: New applications and an optimal SVM margin bound. In V. Feldman, K. Ligett, and S. Sabato, editors, Algorithmic Learning Theory, 16-19 March 2021, Virtual Conference, Worldwide, volume 132 of Proceedings of Machine Learning Research, pages 697--721. PMLR , 2021. URL http://proceedings.mlr.p...

work page 2021

[21] [21]

Hanneke, A

S. Hanneke, A. Kontorovich, S. Sabato, and R. Weiss. Universal Bayes consistency in metric spaces . The Annals of Statistics, 49 0 (4): 0 2129 -- 2150, 2021. doi:10.1214/20-AOS2029. URL https://doi.org/10.1214/20-AOS2029

work page doi:10.1214/20-aos2029 2021

[22] [22]

Kontorovich and R

A. Kontorovich and R. Weiss. A B ayes consistent 1- NN classifier. In Artificial Intelligence and Statistics (AISTATS 2015), 2014

work page 2015

[23] [23]

Kpotufe and N

S. Kpotufe and N. Verma. Time-accuracy tradeoffs in kernel prediction: Controlling prediction quality. J. Mach. Learn. Res., 18: 0 44:1--44:29, 2017. URL http://jmlr.org/papers/v18/16-538.html

work page 2017

[24] [24]

Maurer and M

A. Maurer and M. Pontil. Empirical bernstein bounds and sample variance penalization, 2009

work page 2009

[25] [25]

Morvant, S

E. Morvant, S. Ko c o, and L. Ralaivola. PAC -bayesian generalization bound on confusion matrix for multi-class classification. In Proceedings of the 29th International Conference on Machine Learning, ICML 2012, Edinburgh, Scotland, UK, June 26 - July 1, 2012 . icml.cc / Omnipress, 2012. URL http://icml.cc/2012/papers/434.pdf

work page 2012

[26] [26]

A. Naor. Metric embeddings and lipschitz extensions, 2015

work page 2015

[27] [27]

Naor and S

A. Naor and S. Sheffield. Absolutely minimal lipschitz extension of tree-valued mappings. Mathematische Annalen, 354 0 (3): 0 1049--1078, Nov. 2012. ISSN 0025-5831. doi:10.1007/s00208-011-0753-1

work page doi:10.1007/s00208-011-0753-1 2012

[28] [28]

Newling and F

J. Newling and F. Fleuret. A sub-quadratic exact medoid algorithm. In A. Singh and X. J. Zhu, editors, Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017, 20-22 April 2017, Fort Lauderdale, FL, USA , volume 54 of Proceedings of Machine Learning Research, pages 185--193. PMLR , 2017. URL http://proceedi...

work page 2017

[29] [29]

D. Pollard. A user's guide to measure theoretic probability. Cambridge University Press, 2002

work page 2002

[30] [30]

M. J. Schervish. Theory of statistics. Springer Series in Statistics. Springer-Verlag, New York, 1995. ISBN 0-387-94546-6. doi:10.1007/978-1-4612-4250-5. URL https://doi.org/10.1007/978-1-4612-4250-5

work page doi:10.1007/978-1-4612-4250-5 1995

[31] [31]

Sch\"otz

C. Sch\"otz. Strong laws of large numbers for generalizations of Fr\'echet mean sets, 2021

work page 2021