Reinforcement learning: An introduction
4 Pith papers cite this work.
Years: 2025 (4)
citing papers explorer
- Multi-Armed Sampling Problem and the End of Exploration
  Multi-armed sampling framework shows near-optimal regret is achievable with minimal exploration, unlike bandits, and unifies both via a continuous temperature family.
- SALSA-RL: Stability Analysis in the Latent Space of Actions for Reinforcement Learning
  SALSA-RL introduces latent-space stability analysis for actions of pretrained RL agents using encoder-decoder and state-dependent linear dynamics to enable non-invasive interpretability.
- Mission-Aligned Learning-Informed Control of Autonomous Systems: Formulation and Foundations
  The paper formulates a two-level optimization scheme integrating control, classical planning, and reinforcement learning to improve safety and interpretability in autonomous systems.
- Ensemble RL through Classifier Models: Enhancing Risk-Return Trade-offs in Trading Strategies
  Ensemble RL models that integrate A2C, PPO, and SAC with SVM, decision trees, and logistic regression outperform individual RL models on risk-adjusted metrics in trading, but performance is sensitive to the variance threshold tau.