Power-SMC: Low-Latency Sequence-Level Power Sampling for Training-Free LLM Reasoning · 2026 · arXiv preprint arXiv:2602.10273

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 1

citation-polarity summary: background 1

representative citing papers

Sample Where You Struggle: Sharpening Base Model Reasoning via Entropy-Guided Power Sampling

cs.LG · 2026-06-07 · unverdicted · novelty 7.0

EGPS localizes MCMC moves to high-entropy decision points using forward-pass entropy, yielding up to 12.6× wall-clock speedup and best-or-tied accuracy on MATH500, HumanEval, and GPQA for Qwen2.5-Math-7B.

Sampling for Quality: Training-Free Reward-Guided LLM Decoding via Sequential Monte Carlo

cs.LG · 2026-04-07 · unverdicted · novelty 7.0

Sequential Monte Carlo sampling from a reward-augmented sequence distribution improves LLM performance on HumanEval by up to 54.9% and MATH500 by up to 8.8%, outperforming standard sampling and GRPO.

Power Distribution Bridges Sampling, Self-Reward RL, and Self-Distillation

cs.LG · 2026-05-06 · unverdicted · novelty 6.0

The power distribution is the target of power sampling, the closed-form solution to self-reward KL-regularized RL, and the basis for power self-distillation that matches sampling performance at lower cost.

The Model Knows, the Decoder Finds: Future Value Guided Particle Power Sampling

cs.AI · 2026-05-04 · unverdicted · novelty 6.0 · 2 refs

APPS approximates power targets p(x)^alpha via parallel particle propagation with proposal-corrected reweighting and future-value-guided selection at block boundaries, improving accuracy-runtime trade-offs in training-free decoding.

MOSAIC: Efficient Mixture-of-Agent Scheduling via Adaptive Aggregation and Inference Concurrency

cs.LG · 2026-06-02 · unverdicted · novelty 5.0

MOSAIC uses an Integer Linear Program scheduler for expert placement and prompt assignment plus adaptive aggregation to achieve 1.7–2.3× end-to-end speedup on 4-GPU MoA workloads while keeping accuracy within 0.1 pp.

Self-Consistency via Marginal Sharpening

cs.LG · 2026-05-27 · unverdicted · novelty 5.0

A new autoregressive parallel sampling procedure approximates sampling from the sharpened answer marginal to improve inference-time self-consistency in language models on reasoning benchmarks.

citing papers explorer

Sample Where You Struggle: Sharpening Base Model Reasoning via Entropy-Guided Power Sampling · cs.LG · 2026-06-07 · unverdicted · none · ref 10
Sampling for Quality: Training-Free Reward-Guided LLM Decoding via Sequential Monte Carlo · cs.LG · 2026-04-07 · unverdicted · none · ref 1
Power Distribution Bridges Sampling, Self-Reward RL, and Self-Distillation · cs.LG · 2026-05-06 · unverdicted · none · ref 147
The Model Knows, the Decoder Finds: Future Value Guided Particle Power Sampling · cs.AI · 2026-05-04 · unverdicted · none · ref 10 · 2 links
MOSAIC: Efficient Mixture-of-Agent Scheduling via Adaptive Aggregation and Inference Concurrency · cs.LG · 2026-06-02 · unverdicted · none · ref 199
Self-Consistency via Marginal Sharpening · cs.LG · 2026-05-27 · unverdicted · none · ref 16
