Large scale structure of neural network loss landscapes

Stanislav Fort, Stanislaw Jastrzebski · 2019

2 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Flatness and Gradient Alignment Are Both Necessary: Spectral-Aware Gradient-Aligned Exploration for Multi-Distribution Learning

cs.LG · 2026-05-08 · unverdicted · novelty 7.0

Excess risk decomposes into independent alignment and curvature terms, where the alignment term is the trace of the inverse average Hessian times the gradient covariance, so both flatness and gradient alignment are required; SAGE achieves this and sets a new state of the art on DomainBed.

Flat Channels to Infinity in Neural Loss Landscapes

cs.LG · 2025-06-17 · unverdicted · novelty 7.0

Neural loss landscapes contain flat channels to infinity along which gradient flow leads pairs of neurons to implement gated linear units.

citing papers explorer

Flatness and Gradient Alignment Are Both Necessary: Spectral-Aware Gradient-Aligned Exploration for Multi-Distribution Learning

cs.LG · 2026-05-08 · unverdicted · none · ref 2

Flat Channels to Infinity in Neural Loss Landscapes

cs.LG · 2025-06-17 · unverdicted · none · ref 37