AutoSelection discovers data recipes from a 90K instruction pool that outperform full-data training and other selectors on reasoning tasks for SFT across multiple models.
From quantity to quality: Boosting LLM performance with self-guided data selection for instruction tuning
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2026 1verdicts
CONDITIONAL 1representative citing papers
citing papers explorer
-
From Instance Selection to Fixed-Pool Data Recipe Search for Supervised Fine-Tuning
AutoSelection discovers data recipes from a 90K instruction pool that outperform full-data training and other selectors on reasoning tasks for SFT across multiple models.