pith. machine review for the scientific record.

arxiv: 1702.05659 · v1 · submitted 2017-02-18 · 💻 cs.LG

Recognition: unknown

On Loss Functions for Deep Neural Networks in Classification

Authors on Pith: no claims yet
classification 💻 cs.LG
keywords: deep, classification, functions, loss, nets, classifiers, losses, models
read the original abstract

Deep neural networks are currently among the most commonly used classifiers. Beyond easily achieving very good performance, one of the best selling points of these models is their modular design: one can conveniently adapt the architecture to specific needs, change connectivity patterns, attach specialised layers, and experiment with a large number of activation functions, normalisation schemes, and many other components. Yet while one finds an impressively wide spread of configurations for almost every aspect of deep nets, one element is, in the authors' opinion, underrepresented: when solving classification problems, the vast majority of papers and applications simply use log loss. In this paper we investigate how particular choices of loss function affect deep models and their learning dynamics, as well as the resulting classifiers' robustness to various effects. We perform experiments on classical datasets and provide some additional theoretical insights into the problem. In particular, we show that the L1 and L2 losses are, quite surprisingly, justified classification objectives for deep nets, by providing a probabilistic interpretation in terms of expected misclassification. We also introduce two losses which are not typically used as deep-net objectives and show that they are viable alternatives to the existing ones.
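The losses the abstract compares can be stated concretely. As a minimal sketch (not taken from the paper), the following computes log loss, L1 loss, and L2 loss between a softmax output vector and a one-hot target; the function names and the three-class example are illustrative assumptions, not the authors' setup.

```python
import math

def softmax(logits):
    # numerically stable softmax: subtract the max before exponentiating
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def log_loss(probs, onehot):
    # cross-entropy: -sum_k t_k * log(p_k); only the true class contributes
    return -sum(t * math.log(p) for p, t in zip(probs, onehot) if t > 0)

def l1_loss(probs, onehot):
    # L1 distance between the predicted distribution and the one-hot target
    return sum(abs(p - t) for p, t in zip(probs, onehot))

def l2_loss(probs, onehot):
    # squared L2 distance between prediction and one-hot target
    return sum((p - t) ** 2 for p, t in zip(probs, onehot))

# illustrative three-class example
probs = softmax([2.0, 1.0, 0.1])
target = [1, 0, 0]
print(log_loss(probs, target), l1_loss(probs, target), l2_loss(probs, target))
```

All three are minimised when the predicted distribution concentrates on the true class; they differ in gradient behaviour and, as the paper argues, L1/L2 admit a probabilistic reading in terms of expected misclassification.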

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Rethinking the Harmonic Loss via Non-Euclidean Distance Layers

    cs.LG 2026-03 unverdicted novelty 7.0

    Non-Euclidean distance variants of harmonic loss improve accuracy, gradient stability, and energy efficiency over cross-entropy and Euclidean harmonic loss in vision backbones and large language models.

  2. An Uncertainty-Aware Loss Function Incorporating Fuzzy Logic: Application to MRI Brain Image Segmentation

    cs.CV 2026-04 unverdicted novelty 4.0

    A loss function merging categorical cross-entropy with fuzzy entropy yields better segmentation metrics than standard cross-entropy on IBSR and OASIS brain MRI datasets using U-Net and U-Net++.