Quiet-STaR lets language models learn token-level rationales from general text, producing zero-shot gains on GSM8K and CommonsenseQA after continued pretraining.
She eats three for breakfast every morning and bakes muffins for her friends every day with four
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2024 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
- Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking