Symmetry in language statistics shapes the geometry of model representations

7 Pith papers cite this work. Polarity classification is still indexing.

Citing papers by year: 2026 (7)

representative citing papers

Quantifying Hyperparameter Transfer and the Importance of Embedding Layer Learning Rate

cs.LG · 2026-05-20 · unverdicted · novelty 6.0

A framework quantifies hyperparameter transfer via scaling-law fit quality, extrapolation robustness, and loss penalty, with ablations showing that μP's advantage over standard parameterization stems from maximizing the embedding layer learning rate to avoid bottlenecks and instabilities in AdamW.

There Will Be a Scientific Theory of Deep Learning

stat.ML · 2026-04-23 · unverdicted · novelty 2.0

A mechanics of the learning process is emerging in deep learning theory, characterized by dynamics, coarse statistics, and falsifiable predictions across idealized settings, limits, laws, hyperparameters, and universal behaviors.
