Robust Bi-Tempered Logistic Loss Based on Bregman Divergences

Ehsan Amid; Manfred K. Warmuth; Rohan Anil; Tomer Koren

arxiv: 1906.03361 · v3 · pith:4JQD3AVWnew · submitted 2019-06-08 · 💻 cs.LG · stat.ML

keywords: loss, layer, temperature, bregman, divergences, generalization, logarithm, logistic

We introduce a temperature into the exponential function and replace the softmax output layer of neural nets by a high-temperature generalization. Similarly, the logarithm in the log loss we use for training is replaced by a low-temperature logarithm. By tuning the two temperatures, we create loss functions that are non-convex already in the single-layer case. When replacing the last layer of the neural nets by our bi-temperature generalization of logistic loss, the training becomes more robust to noise. We visualize the effect of tuning the two temperatures in a simple setting and show the efficacy of our method on large data sets. Our methodology is based on Bregman divergences and is superior to a related two-temperature method using the Tsallis divergence.
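
The abstract does not spell out the tempered functions, so the following is only a minimal sketch of the two building blocks, assuming the commonly used definitions log_t(x) = (x^(1-t) - 1) / (1 - t) and exp_t(x) = max(0, 1 + (1 - t) x)^(1 / (1 - t)). The names `log_t` and `exp_t` and the sample temperatures are illustrative choices, not taken from this page, and the sketch deliberately omits the normalization step that a full tempered softmax would need.

```python
import math


def log_t(x: float, t: float) -> float:
    """Tempered logarithm (assumed definition); equals math.log at t == 1."""
    if t == 1.0:
        return math.log(x)
    return (x ** (1.0 - t) - 1.0) / (1.0 - t)


def exp_t(x: float, t: float) -> float:
    """Tempered exponential (assumed definition); equals math.exp at t == 1."""
    if t == 1.0:
        return math.exp(x)
    base = max(0.0, 1.0 + (1.0 - t) * x)
    if base == 0.0 and t > 1.0:
        # For t > 1 the function diverges once 1 + (1 - t) * x reaches zero.
        return math.inf
    return base ** (1.0 / (1.0 - t))


# With the same temperature, exp_t inverts log_t.
roundtrip = exp_t(log_t(0.3, 0.5), 0.5)

# A temperature below 1 bounds the logarithm from below by -1 / (1 - t),
# so a badly misclassified example contributes only a bounded loss.
bounded = log_t(1e-12, 0.5)

# A temperature above 1 gives the exponential a heavier tail than exp.
heavy_tail = exp_t(-5.0, 1.5)
standard_tail = math.exp(-5.0)
```

Under these assumed definitions, the bounded logarithm and the heavy-tailed exponential are the two properties the abstract's "low temperature" and "high temperature" replacements rely on for robustness to noise.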
