arXiv preprint arXiv:2102.00176
2 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- LIMO: Less is More for Reasoning
  LIMO achieves 63.3% on AIME24 and 95.6% on MATH500 via supervised fine-tuning on roughly 1% of the data used by prior models, supporting the claim that minimal strategic examples suffice when pre-training has already encoded domain knowledge.
- Traditional statistical representations outperform generative AI in identifying expert peer reviewers
  TF-IDF identifies labeled experts in the top 25 recommendations 79.5% of the time versus 51.5% for GPT-4o mini on an astronomy observatory dataset.