Towards Minimax Optimality of Model-based Robust Reinforcement Learning

Pierre Clavier, Erwan Le Pennec, Matthieu Geist · 2023 · arXiv 2302.05372

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning under Uncertainty

cs.LG · 2025-06-14 · unverdicted · novelty 7.0

DR-SAC is the first actor-critic distributionally robust RL algorithm for offline continuous control that derives a convergent robust soft policy iteration and reports up to 9.8x higher rewards than SAC under perturbations.

Near-Optimal Policy Identification in Robust Constrained Markov Decision Processes via Epigraph Form

cs.LG · 2024-08-29 · unverdicted · novelty 7.0

Presents the first algorithm to identify an ε-optimal policy in robust constrained MDPs via epigraph form and bisection search with Õ(ε^{-4}) robust policy evaluations.

Wolfpack Adversarial Attack for Robust Multi-Agent Reinforcement Learning

cs.LG · 2025-02-05 · unverdicted · novelty 6.0

Wolfpack attack framework disrupts MARL cooperation by targeting initial and assisting agents; WALL trains robust policies against it with reported experimental gains.

citing papers explorer

Showing 3 of 3 citing papers.

DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning under Uncertainty
cs.LG · 2025-06-14 · unverdicted · none · ref 6

Near-Optimal Policy Identification in Robust Constrained Markov Decision Processes via Epigraph Form
cs.LG · 2024-08-29 · unverdicted · none · ref 21

Wolfpack Adversarial Attack for Robust Multi-Agent Reinforcement Learning
cs.LG · 2025-02-05 · unverdicted · none · ref 1
