hub Mixed citations

Regularizing Neural Networks by Penalizing Confident Output Distributions

Mixed citation behavior. Most common role is background (60%).

15 Pith papers citing it
abstract

We systematically explore regularizing neural networks by penalizing low entropy output distributions. We show that penalizing low entropy output distributions, which has been shown to improve exploration in reinforcement learning, acts as a strong regularizer in supervised learning. Furthermore, we connect a maximum entropy based confidence penalty to label smoothing through the direction of the KL divergence. We exhaustively evaluate the proposed confidence penalty and label smoothing on 6 common benchmarks: image classification (MNIST and Cifar-10), language modeling (Penn Treebank), machine translation (WMT'14 English-to-German), and speech recognition (TIMIT and WSJ). We find that both label smoothing and the confidence penalty improve state-of-the-art models across benchmarks without modifying existing hyperparameters, suggesting the wide applicability of these regularizers.
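
The two regularizers named in the abstract can be sketched for a single example as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names and the hyperparameter names `beta` (penalty weight) and `eps` (smoothing amount) are assumptions, and label smoothing is shown with a uniform prior only. The confidence penalty subtracts the output entropy from the negative log-likelihood, which is the same as adding KL(p || u) up to a constant; uniform label smoothing instead adds KL(u || p) up to a constant and a rescaling of the likelihood term, which is the "direction of the KL divergence" connection.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    # Shannon entropy H(p) in nats.
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def kl(p, q):
    # KL(p || q) in nats; assumes q is strictly positive wherever p is.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def nll(probs, target):
    # Negative log-likelihood of the target class.
    return -math.log(probs[target])

def confidence_penalty_loss(logits, target, beta=0.1):
    # NLL minus beta times the output entropy: low-entropy
    # (confident) outputs are penalized. beta is an assumed name.
    probs = softmax(logits)
    return nll(probs, target) - beta * entropy(probs)

def label_smoothing_loss(logits, target, eps=0.1):
    # Cross-entropy against a target mixed with the uniform
    # distribution. eps is an assumed name.
    probs = softmax(logits)
    k = len(probs)
    smoothed = [(1.0 - eps) * (1.0 if i == target else 0.0) + eps / k
                for i in range(k)]
    return -sum(t * math.log(p) for t, p in zip(smoothed, probs))

if __name__ == "__main__":
    logits, target = [2.0, 0.5, -1.0], 0
    print(confidence_penalty_loss(logits, target))
    print(label_smoothing_loss(logits, target))
```

With K classes and uniform distribution u, the first loss equals NLL + beta * (KL(p || u) - log K) and the second equals (1 - eps) * NLL + eps * (KL(u || p) + log K), so the two differ mainly in which argument of the KL divergence the model output occupies.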

citation-role summary

background 3 · method 2

representative citing papers

DeepLévy: Learning Heavy-Tailed Uncertainty in Highly Volatile Time Series

cs.LG · 2026-05-11 · unverdicted · novelty 7.0 · 3 refs

DeepLévy learns mixtures of Lévy stable distributions for heavy-tailed time series forecasting by minimizing discrepancies between empirical and parametric characteristic functions, outperforming prior methods on tail risk metrics under extreme volatility.

Annotations Mitigate Post-Training Mode Collapse

cs.CL · 2026-05-11 · unverdicted · novelty 6.0

Annotation-anchored training reduces semantic diversity collapse in post-trained language models by a factor of six compared to standard supervised fine-tuning while preserving instruction-following and improving with scale.

Can LLMs Learn to Reason Robustly under Noisy Supervision?

cs.LG · 2026-04-05 · conditional · novelty 6.0

Online Label Refinement lets LLMs learn robust reasoning from noisy supervision by correcting labels when majority answers show rising rollout success and stable history, delivering 3-4% gains on math and reasoning benchmarks even at high noise levels.

Condensation Transition in Entropy-Constrained Probability Spaces

cond-mat.stat-mech · 2026-05-09 · unverdicted · novelty 5.0

Below a critical entropy H_c ≈ log K - 1 + γ in the large-K limit, the typical fixed-entropy distribution on the probability simplex condenses so that one component holds a macroscopic probability fraction while the rest form a uniform background.

Non-Intrusive Automatic Speech Recognition Refinement: A Survey

eess.AS · 2025-08-10 · accept · novelty 4.0

A survey that classifies non-intrusive ASR refinement methods into five categories, reviews domain adaptation and evaluation datasets, proposes standardized metrics, and identifies future research directions.
