5th International Conference on Learning Representations

Nitish Shirish Keskar, Dheevatsa Mudigere, Jorge Nocedal, Mikhail Smelyanskiy, Ping Tak Peter Tang · 2017

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Navigating Potholes with Geometry-Aware Sharpness Minimization

cs.LG · 2026-05-15 · unverdicted · novelty 7.0

LLQR+SAM pairs a slow learned geometry preconditioner with fast SAM perturbations to amplify escape from locally sharp 'potholes' while stabilizing flat basins, producing consistent gains over SAM and LLQR alone.

Symmetrization of Loss Functions for Robust Training of Neural Networks in the Presence of Noisy Labels

cs.LG · 2026-05-19 · unverdicted · novelty 6.0

Symmetrizing cross-entropy produces the unique convex multi-class unhinged loss, which locally approximates other symmetric losses, and enables new interpolating losses SGCE and alpha-MAE with competitive performance on noisy-label benchmarks.

From Backward Spreading to Forward Replay: Revisiting Target Construction in LLM Parameter Editing

cs.CL · 2026-05-01 · unverdicted · novelty 6.0

Proposes forward replay of target hidden states from the first editing layer instead of backward spreading, claiming equivalent complexity but higher accuracy for LLM parameter editing.

citing papers explorer

Navigating Potholes with Geometry-Aware Sharpness Minimization · cs.LG · 2026-05-15 · unverdicted · none · ref 7

Symmetrization of Loss Functions for Robust Training of Neural Networks in the Presence of Noisy Labels · cs.LG · 2026-05-19 · unverdicted · none · ref 114

From Backward Spreading to Forward Replay: Revisiting Target Construction in LLM Parameter Editing · cs.CL · 2026-05-01 · unverdicted · none · ref 47