loss” here and every ranking, coverage and entropy metric are full-catalogue evaluation quantities, computed against the full ∼13.6M-item Stage-2 catalogue: “loss

· 2004 · arXiv 0283.4393

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Scaling Laws for Behavioral Foundation Models over User Event Sequences

cs.LG · 2026-06-03 · unverdicted · novelty 7.0

Across 600 runs from 10^15 to 10^19 FLOPs, behavioral models show a 2% embedder is compute-optimal at all scales, training is data-heavy at low compute, and optimal negatives increase with budget until memory-limited.

citing papers explorer

Showing 1 of 1 citing paper.

Scaling Laws for Behavioral Foundation Models over User Event Sequences · cs.LG · 2026-06-03 · unverdicted · none · ref 26
