Reasoning aware self-consistency: Leveraging reasoning paths for efficient LLM sampling
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background (1)
years: 2026 (2)
verdicts: UNVERDICTED (2)
roles: background (1)
polarities: background (1)
representative citing papers
MoE routing states at boundary and delimiter anchors form basins that align with final answers, enabling RAD, a string-free multi-rollout selector that matches majority voting on math and code tasks.
citing papers explorer
- Inference-Time Budget Control for LLM Search Agents
A VOI-based controller for dual inference budgets improves multi-hop QA performance by prioritizing search actions and selectively finalizing answers.