Reinforcement learning: An introduction

Richard S. Sutton, Andrew G. Barto · 2018

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Multi-Armed Sampling Problem and the End of Exploration

cs.LG · 2025-07-14 · conditional · novelty 8.0

Multi-armed sampling framework shows near-optimal regret is achievable with minimal exploration, unlike bandits, and unifies both via a continuous temperature family.

SALSA-RL: Stability Analysis in the Latent Space of Actions for Reinforcement Learning

cs.LG · 2025-02-21 · unverdicted · novelty 5.0

SALSA-RL introduces latent-space stability analysis for actions of pretrained RL agents using encoder-decoder and state-dependent linear dynamics to enable non-invasive interpretability.

Mission-Aligned Learning-Informed Control of Autonomous Systems: Formulation and Foundations

math.OC · 2025-07-06 · unverdicted · novelty 4.0

The paper formulates a two-level optimization scheme integrating control, classical planning, and reinforcement learning to improve safety and interpretability in autonomous systems.

Ensemble RL through Classifier Models: Enhancing Risk-Return Trade-offs in Trading Strategies

cs.LG · 2025-02-23 · unverdicted · novelty 2.0

Ensemble RL models that integrate A2C, PPO, and SAC with SVM, decision trees, and logistic regression outperform individual RL models on risk-adjusted metrics in trading, but performance is sensitive to the variance threshold tau.

citing papers explorer

Multi-Armed Sampling Problem and the End of Exploration

cs.LG · 2025-07-14 · conditional · none · ref 28

SALSA-RL: Stability Analysis in the Latent Space of Actions for Reinforcement Learning

cs.LG · 2025-02-21 · unverdicted · none · ref 47

Mission-Aligned Learning-Informed Control of Autonomous Systems: Formulation and Foundations

math.OC · 2025-07-06 · unverdicted · none · ref 71

Ensemble RL through Classifier Models: Enhancing Risk-Return Trade-offs in Trading Strategies

cs.LG · 2025-02-23 · unverdicted · none · ref 25
