Curated 50-example subsets of LAM benchmarks, via regression, predict human preferences at 0.98 correlation, outperforming the full benchmark and yielding the open-sourced HUMANS proxy.
K- Means provides native sample weight support in scikit-learn, enabling efficient task-aware clustering with O(n·D·K·I) complexity where I <100 iterations
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CL 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Putting HUMANS first: Efficient LAM Evaluation with Human Preference Alignment
Curated 50-example subsets of LAM benchmarks, via regression, predict human preferences at 0.98 correlation, outperforming the full benchmark and yielding the open-sourced HUMANS proxy.