John Wiley & Sons
8 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (8)
verdicts: UNVERDICTED (8)
citing papers explorer
- Wasserstein-p Central Limit Theorem Rates: From Local Dependence to Markov Chains
  The paper proves the first optimal O(n^{-1/2}) Wasserstein-1 CLT rates for locally dependent sequences and geometrically ergodic Markov chains, plus new W_p rates for p ≥ 2 under mild moments, with an application to U-statistics.
- State-Centric Decision Process
  SDP constructs a task-induced state space from raw text by having agents commit to and certify natural-language predicates as states, enabling structured planning and analysis in unstructured language environments.
- Iterative Critique-and-Routing Controller for Multi-Agent Systems with Heterogeneous LLMs
  A critique-and-routing controller cast as a finite-horizon MDP with policy-gradient optimization outperforms one-shot routing baselines on reasoning benchmarks while using the strongest agent for under 25% of calls.
- When Dynamics Shift, Robust Task Inference Wins: Offline Imitation Learning with Behavior Foundation Models Revisited
  Robust minimax task inference in BFMs achieves dynamics-shift robustness from nominal offline data alone and outperforms standard baselines.
- Toward Template-Free Explainability for Monte Carlo Tree Search
  Framework uses LLMs to map natural-language questions about MCTS to explanations based on tree statistics like visit counts and values, without hand-crafted formal logic.
- Value Mirror Descent for Reinforcement Learning
  Value mirror descent integrates mirror descent into value iteration for discounted MDPs, delivering near-optimal sample complexity of order |S||A|(1-γ)^{-3}ε^{-2} for general convex regularizers and bounded Bregman divergence between generated and optimal policies.
- Optimal sequential decision-making for error propagation mitigation in digital twins
  Error propagation mitigation in digital twins is cast as an MDP/POMDP with HMM-derived regimes as states, where the MDP policy maximizes reward and the POMDP recovers 95% of that performance.
- Sample Complexity for Markov Decision Processes and Stochastic Optimal Control with Static Risk Measures
  State augmentation allows dynamic programming and sample complexity bounds for MDPs and optimal control under static risk measures including CVaR.