Reasoning aware self-consistency: Leveraging reasoning paths for efficient LLM sampling
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 1
citation-polarity summary
years: 2026 2
verdicts: UNVERDICTED 2
roles: background 1
polarities: background 1
representative citing papers
MoE routing states at boundary and delimiter anchors form basins that align with final answers, enabling RAD, a string-free multi-rollout selector that matches majority voting on math and code tasks.
citing papers explorer
- Inference-Time Budget Control for LLM Search Agents
  A VOI-based controller for dual inference budgets improves multi-hop QA performance by prioritizing search actions and selectively finalizing answers.
- Does the Same Token Mean the Same State? MoE Routing as Signal for Reasoning Control
  MoE routing states at boundary and delimiter anchors form basins that align with final answers, enabling RAD, a string-free multi-rollout selector that matches majority voting on math and code tasks.