Title resolution pending

1 Pith paper cites this work. Polarity classification is still being indexed.

1 Pith paper citing it

fields

cs.LG 1

years

2024 1

verdicts

UNVERDICTED 1

representative citing papers

Optimization Hyper-parameter Laws for Large Language Models

cs.LG · 2024-09-07 · unverdicted · novelty 6.0

Opt-Laws predicts LLM final training loss from LR schedules via SDE-derived convergence and escape features, with 94% Top-2 hit rate on held-out schedules and F1=0.92 for divergence detection.

citing papers explorer

  • Optimization Hyper-parameter Laws for Large Language Models cs.LG · 2024-09-07 · unverdicted · none · ref 10
