Exploring Generalization in Deep Learning

4 Pith papers cite this work. Polarity classification is still indexing.

abstract

With a goal of understanding what drives generalization in deep networks, we consider several recently suggested explanations, including norm-based control, sharpness and robustness. We study how these measures can ensure generalization, highlighting the importance of scale normalization, and making a connection between sharpness and PAC-Bayes theory. We then investigate how well the measures explain different observed phenomena.

representative citing papers

A Sharper Picture of Generalization in Transformers

cs.LG · 2026-05-20 · unverdicted · novelty 6.0 · 2 refs

PAC-Bayes applied to low-sharpness flat minima yields non-vacuous generalization bounds for boolean functions whose Fourier spectra are sparse and low-degree, with parameters estimable by property testing.

Feature Starvation as Geometric Instability in Sparse Autoencoders

cs.LG · 2026-05-06 · unverdicted · novelty 6.0

Adaptive elastic net SAEs (AEN-SAEs) mitigate feature starvation in SAEs by combining ℓ2 structural stability with adaptive ℓ1 reweighting, producing a Lipschitz-continuous sparse coding map that recovers global feature support under mild assumptions.

citing papers explorer

Showing 4 of 4 citing papers after filters.

  • Sample Complexity of Scientific Discovery: PAC Learnability of Compositional Function Trees · cs.LG · 2026-06-28 · unverdicted · none · ref 65 · internal anchor

    Proves that the Rademacher complexity of depth-d compositional trees over a finite operator vocabulary is controlled by (K b L)^{d} / sqrt(n) under Lipschitz conditions on the operators.

  • A Sharper Picture of Generalization in Transformers · cs.LG · 2026-05-20 · unverdicted · none · ref 21 · 2 links · internal anchor

  • Feature Starvation as Geometric Instability in Sparse Autoencoders · cs.LG · 2026-05-06 · unverdicted · none · ref 29

  • Margin-Adaptive Confidence Ranking for Reliable LLM Judgement · cs.LG · 2026-05-14 · unverdicted · none · ref 84 · internal anchor

    Develops a margin-adaptive learned confidence estimator for LLMs with generalization guarantees to improve agreement rates with human judgments over heuristic baselines.