Optimal stopping vs best-of-n for inference time optimization. arXiv preprint arXiv:2510.01394, 2025

Yusuf Kalayci, Vinod Raman, Shaddin Dughmi · 2025 · arXiv 2510.01394

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

ATLAS: Agentic Test-time Learning-to-Allocate Scaling

cs.LG · 2026-06-01 · unverdicted · novelty 7.0

ATLAS introduces an LLM-orchestrated agentic framework for dynamic test-time scaling via extensible 'explore' actions, achieving higher accuracy with fewer API calls than fixed-workflow baselines on four benchmarks.

Query-Centric Optimization of AI Workflows via Approximate Query Processing and Proxy Models

cs.DB · 2026-06-30 · unverdicted · novelty 4.0

Query-centric AQP and proxy-model strategies reduce expensive model calls by 60-90% with under 10% error on TPC-DS and LLM tasks.
