ThetaEvolve enables small open-source LLMs to achieve new best-known bounds on open problems such as circle packing by combining test-time RL with a large program database and lazy penalties.
Start with larger changes (e.g., +/- 0.05) to explore the local landscape, then gradually decrease the step size (e.g., to +/- 0.01, then +/- 0.001) to fine-tune the solution
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2025 1verdicts
CONDITIONAL 1representative citing papers
citing papers explorer
-
ThetaEvolve: Test-time Learning on Open Problems
ThetaEvolve enables small open-source LLMs to achieve new best-known bounds on open problems such as circle packing by combining test-time RL with a large program database and lazy penalties.