A probabilistic inference approach to inference-time scaling of llms using particle-based monte carlo methods

Isha Puri, Shivchander Sudalairaj, Guangxuan Xu, Kai Xu, Akash Srivastava · 2025 · arXiv 2502.01618

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

representative citing papers

ToolPRM: Fine-Grained Inference Scaling of Structured Outputs for Function Calling

cs.AI · 2025-10-16 · unverdicted · novelty 7.0

ToolPRM provides fine-grained intra-call process supervision via a new dataset and reward model, outperforming outcome and coarse-grained alternatives on function-calling benchmarks.

Inference-Time Scaling of Diffusion Language Models via Trajectory Refinement

cs.LG · 2025-07-11 · conditional · novelty 7.0

PG-DLM applies particle Gibbs sampling over full trajectories in diffusion language models to enable iterative refinement, yielding higher accuracy on reward-guided generation with theoretical convergence guarantees.

TreeCoder: Systematic Exploration and Optimisation of Decoding and Constraints for LLM Code Generation

cs.LG · 2025-11-27 · unverdicted · novelty 6.0

TreeCoder improves LLM code generation accuracy by representing decoding as an optimizable tree search over programs with first-class constraints for syntax, style, and execution, outperforming baselines on MBPP and SQL-Spider.

TokUR: Token-Level Uncertainty Estimation for Large Language Model Reasoning

cs.LG · 2025-05-16 · unverdicted · novelty 5.0

TokUR estimates token-level uncertainty via low-rank weight perturbations in LLMs, aggregates signals to correlate with correctness, and uses them to improve reasoning performance on math tasks.

citing papers explorer

Showing 4 of 4 citing papers.

ToolPRM: Fine-Grained Inference Scaling of Structured Outputs for Function Calling cs.AI · 2025-10-16 · unverdicted · none · ref 30
ToolPRM provides fine-grained intra-call process supervision via a new dataset and reward model, outperforming outcome and coarse-grained alternatives on function-calling benchmarks.
Inference-Time Scaling of Diffusion Language Models via Trajectory Refinement cs.LG · 2025-07-11 · conditional · none · ref 37
PG-DLM applies particle Gibbs sampling over full trajectories in diffusion language models to enable iterative refinement, yielding higher accuracy on reward-guided generation with theoretical convergence guarantees.
TreeCoder: Systematic Exploration and Optimisation of Decoding and Constraints for LLM Code Generation cs.LG · 2025-11-27 · unverdicted · none · ref 39
TreeCoder improves LLM code generation accuracy by representing decoding as an optimizable tree search over programs with first-class constraints for syntax, style, and execution, outperforming baselines on MBPP and SQL-Spider.
TokUR: Token-Level Uncertainty Estimation for Large Language Model Reasoning cs.LG · 2025-05-16 · unverdicted · none · ref 30
TokUR estimates token-level uncertainty via low-rank weight perturbations in LLMs, aggregates signals to correlate with correctness, and uses them to improve reasoning performance on math tasks.

A probabilistic inference approach to inference-time scaling of llms using particle-based monte carlo methods

fields

years

verdicts

representative citing papers

citing papers explorer