The Thirteenth International Conference on Learning Representations

2 Pith papers cite this work. Polarity classification is still indexing.

fields: cs.CL 1, cs.LG 1

years: 2026 2

verdicts: unverdicted 2

representative citing papers

Prescriptive Scaling Laws for Data Constrained Training

cs.LG · 2026-05-02 · unverdicted · novelty 6.0

A one-parameter scaling law models excess loss from data repetition as an additive overfitting penalty, recommending model capacity increases over excessive repetition and showing that strong weight decay reduces the penalty coefficient by ~70%.

Compute Optimal Tokenization

cs.CL · 2026-05-02 · unverdicted · novelty 6.0

In compute-optimal regimes, language model parameter count scales proportionally with data bytes rather than tokens, and the optimal compression rate decreases with increasing compute.
