Advances in Neural Information Processing Systems , pages=

Stabilizing off-policy q-learning via bootstrapping error reduction , author=

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

browse 2 citing papers

representative citing papers

AdamO: A Collapse-Suppressed Optimizer for Offline RL

cs.LG · 2026-05-03 · unverdicted · novelty 6.0

AdamO modifies Adam with an orthogonality correction to ensure the spectral radius of the TD update operator stays below one, providing a theoretical stability guarantee for offline RL.

Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems

cs.LG · 2020-05-04 · unverdicted · novelty 2.0

Offline RL promises to extract high-utility policies from static datasets but faces fundamental challenges that current methods only partially address.

citing papers explorer

Showing 2 of 2 citing papers.

AdamO: A Collapse-Suppressed Optimizer for Offline RL cs.LG · 2026-05-03 · unverdicted · none · ref 13
AdamO modifies Adam with an orthogonality correction to ensure the spectral radius of the TD update operator stays below one, providing a theoretical stability guarantee for offline RL.
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems cs.LG · 2020-05-04 · unverdicted · none · ref 178
Offline RL promises to extract high-utility policies from static datasets but faces fundamental challenges that current methods only partially address.

Advances in Neural Information Processing Systems , pages=

fields

years

verdicts

representative citing papers

citing papers explorer