Do we truly need so many samples? multi-llm repeated sampling efficiently scale test-time compute

Jianhao Chen, Zishuo Xun, Bocheng Zhou, Han Qi, Hangfan Zhang, Qiaosheng Zhang, Yang Chen, Wei Hu, Yuzhong Qu, Wanli Ouyang, et al · 2025 · arXiv 2504.00762

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

read on arXiv browse 4 citing papers

citation-role summary

background 1 dataset 1

citation-polarity summary

background 1 use dataset 1

representative citing papers

ModeX: Evaluator-Free Best-of-N Selection for Open-Ended Generation

cs.CL · 2026-01-05 · unverdicted · novelty 7.0

ModeX selects the modal semantic output from multiple LLM generations via a similarity graph and recursive spectral clustering without needing reward models or evaluators.

One Step Forward and K Steps Back: Better Reasoning with Denoising Recursion Models

cs.LG · 2026-04-20 · unverdicted · novelty 6.0

Denoising Recursion Models train multi-step noise reversal in looped transformers and outperform the prior Tiny Recursion Model on ARC-AGI.

Complementing Self-Consistency with Cross-Model Disagreement for Uncertainty Quantification

cs.AI · 2026-04-18 · unverdicted · novelty 6.0

Cross-model semantic disagreement adds an epistemic uncertainty term that improves total uncertainty estimation over self-consistency alone, helping flag confident errors in LLMs.

Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models

cs.AI · 2025-03-12 · unverdicted · novelty 5.0

The paper unifies perspectives on Long CoT in reasoning LLMs by introducing a taxonomy, detailing characteristics of deep reasoning and reflection, and discussing emergence phenomena and future directions.

citing papers explorer

Showing 4 of 4 citing papers.

ModeX: Evaluator-Free Best-of-N Selection for Open-Ended Generation cs.CL · 2026-01-05 · unverdicted · none · ref 37
ModeX selects the modal semantic output from multiple LLM generations via a similarity graph and recursive spectral clustering without needing reward models or evaluators.
One Step Forward and K Steps Back: Better Reasoning with Denoising Recursion Models cs.LG · 2026-04-20 · unverdicted · none · ref 134
Denoising Recursion Models train multi-step noise reversal in looped transformers and outperform the prior Tiny Recursion Model on ARC-AGI.
Complementing Self-Consistency with Cross-Model Disagreement for Uncertainty Quantification cs.AI · 2026-04-18 · unverdicted · none · ref 4
Cross-model semantic disagreement adds an epistemic uncertainty term that improves total uncertainty estimation over self-consistency alone, helping flag confident errors in LLMs.
Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models cs.AI · 2025-03-12 · unverdicted · none · ref 83
The paper unifies perspectives on Long CoT in reasoning LLMs by introducing a taxonomy, detailing characteristics of deep reasoning and reflection, and discussing emergence phenomena and future directions.

Do we truly need so many samples? multi-llm repeated sampling efficiently scale test-time compute

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer