Multiclass Classification, Information, Divergence, and Surrogate Risk

Feng Ruan; John C. Duchi; Khashayar Khosravi

arxiv: 1603.00126 · v2 · pith:LAD5IVKJnew · submitted 2016-03-01 · math.ST · cs.IT · math.IT · stat.TH



keywords: classification, results, losses, multiclass, divergences, equivalence, information


We provide a unifying view of statistical information measures, multi-way Bayesian hypothesis testing, loss functions for multi-class classification problems, and multi-distribution $f$-divergences, elaborating equivalence results between all of these objects, and extending existing results for binary outcome spaces to more general ones. We consider a generalization of $f$-divergences to multiple distributions, and we provide a constructive equivalence between divergences, statistical information (in the sense of DeGroot), and losses for multiclass classification. A major application of our results is in multi-class classification problems in which we must infer both a discriminant function $\gamma$ (for making predictions on a label $Y$ from datum $X$) and a data representation (or, in the setting of a hypothesis testing problem, an experimental design), represented as a quantizer $\mathsf{q}$ from a family of possible quantizers $\mathsf{Q}$. In this setting, we characterize the equivalence between loss functions, meaning that optimizing either of two losses yields an optimal discriminant and quantizer $\mathsf{q}$, complementing and extending earlier results of Nguyen et al. to the multiclass case. Our results provide a more substantial basis than standard classification calibration results for comparing different losses: we describe the convex losses that are consistent for jointly choosing a data representation and minimizing the (weighted) probability of error in multiclass classification problems.
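To make the role of the quantizer concrete, the following is a minimal sketch, not the paper's construction. It assumes the 0-1 loss, a finite observation space, and a hypothetical three-class toy problem (the `prior` and `likelihoods` values are invented for illustration). Statistical information is taken here as the drop in Bayes error from prior to posterior, and the sketch searches all deterministic two-cell quantizers for the one that retains the most information.

```python
from itertools import product


def bayes_error(prior, likelihoods):
    """Bayes error under 0-1 loss: 1 - sum_x max_i prior[i] * P_i(x)."""
    n_outcomes = len(likelihoods[0])
    return 1.0 - sum(
        max(p * lik[x] for p, lik in zip(prior, likelihoods))
        for x in range(n_outcomes)
    )


def statistical_information(prior, likelihoods):
    """Prior Bayes error minus posterior Bayes error (0-1 loss assumed)."""
    prior_error = 1.0 - max(prior)
    return prior_error - bayes_error(prior, likelihoods)


def quantize(likelihoods, q, n_cells):
    """Push each class-conditional distribution through the map x -> q[x]."""
    quantized = []
    for lik in likelihoods:
        cells = [0.0] * n_cells
        for x, mass in enumerate(lik):
            cells[q[x]] += mass
        quantized.append(cells)
    return quantized


# Hypothetical toy problem: 3 classes, 4 possible observations.
prior = [1 / 3, 1 / 3, 1 / 3]
likelihoods = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.4, 0.4],
]

full_info = statistical_information(prior, likelihoods)

# Exhaustive search over all deterministic quantizers into 2 cells.
candidates = [
    (q, statistical_information(prior, quantize(likelihoods, q, 2)))
    for q in product(range(2), repeat=4)
]
best_q, best_info = max(candidates, key=lambda pair: pair[1])

print(f"information with full data: {full_info:.3f}")
print(f"best 2-cell quantizer {best_q}: {best_info:.3f}")
```

In this toy setting no quantizer can retain more information than the unquantized observation, which is the data-processing behavior one expects. Working with the 0-1 loss directly is only feasible here because the search space is tiny; the abstract's question is which convex surrogate losses lead to the same choice of quantizer.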


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Chernoff Information as a Privacy Constraint for Adversarial Classification and Membership Advantage
cs.IT 2024-03 unverdicted novelty 5.0

Chernoff DP is sandwiched between KL DP and ε-DP, outperforms KL in numerical Laplace-mechanism tests, and yields a new upper bound on adversary membership advantage compared with (ε,δ)-DP bounds.