Emergent complexity and zero-shot transfer via unsupervised environment design
2 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 2
representative citing papers
A curriculum sampling questions with high variance in success rate improves reinforcement learning performance for LLM reasoning tasks.
citing papers explorer
- Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution
  Promptbreeder evolves both task prompts and the mutation prompts that improve them using LLMs, outperforming Chain-of-Thought and Plan-and-Solve on arithmetic and commonsense reasoning benchmarks.
- Learning to Reason at the Frontier of Learnability
  A curriculum sampling questions with high variance in success rate improves reinforcement learning performance for LLM reasoning tasks.