Executable code actions elicit better LLM agents
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.IR (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Code-Guided Reasoning for Small Language Models: Evaluating Executable MCQA Scaffolds
The Code-Guided Reasoning protocol reports a 28 percentage-point macro-accuracy gain for small language models on multiple-choice question answering (MCQA) when they use generated executable Python scaffolds instead of answering directly, evaluated on more than 20,000 items.