IRSL applies IRT to reduce scaling law estimation from O(M×N) to O(M+N) parameters, enabling reliable estimates with only 50 questions per benchmark after calibration and generalizable ability scores across related benchmarks.
Datadecide: How to predict best pretraining data with small experiments.arXiv preprint arXiv:2504.11393, 2025
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.LG 2years
2026 2representative citing papers
Maps common low-compute research strategies for foundation models onto statistical, internal, external, and construct validity threats via a causal-inference lens.
citing papers explorer
-
Item Response Scaling Laws: A Measurement Theory Approach for Efficient and Generalizable Neural Scaling Estimation
IRSL applies IRT to reduce scaling law estimation from O(M×N) to O(M+N) parameters, enabling reliable estimates with only 50 questions per benchmark after calibration and generalizable ability scores across related benchmarks.
-
Validity Threats for Foundation Model Research
Maps common low-compute research strategies for foundation models onto statistical, internal, external, and construct validity threats via a causal-inference lens.