The paper compiles practical lessons on reproducible LM evaluation and introduces the lm-eval library to mitigate common methodological problems in NLP.
The 61st Annual Meeting Of The Association For Computational Linguistics , year=
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CL 1years
2024 1verdicts
ACCEPT 1representative citing papers
citing papers explorer
-
Lessons from the Trenches on Reproducible Evaluation of Language Models
The paper compiles practical lessons on reproducible LM evaluation and introduces the lm-eval library to mitigate common methodological problems in NLP.