Locality sets the fundamental round lower bound L_ε = floor(log(1/2ε)/log(1/γ)) for ε-accuracy on large-diameter graphs; direct propagation achieves it while gossip averaging pays extra 1/gap(W) factors.
Offline reinforcement learning with implicit Q -learning
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 2verdicts
UNVERDICTED 2roles
background 1polarities
unclear 1representative citing papers
A new RL framework for chronic disease management compresses time-to-control using clinician capability weighting and action intensity constraints, yielding 15 percentage point gains on synthetic type 2 diabetes simulations over standard offline RL.
citing papers explorer
-
Locality, Not Spectral Mixing, Governs Direct Propagation in Distributed Offline Dynamic Programming
Locality sets the fundamental round lower bound L_ε = floor(log(1/2ε)/log(1/γ)) for ε-accuracy on large-diameter graphs; direct propagation achieves it while gossip averaging pays extra 1/gap(W) factors.
-
Learning to Compress Time-to-Control: A Reinforcement Learning Framework for Chronic Disease Management
A new RL framework for chronic disease management compresses time-to-control using clinician capability weighting and action intensity constraints, yielding 15 percentage point gains on synthetic type 2 diabetes simulations over standard offline RL.