Extends robust MDPs to continuous time with policy gradient derivations using differential equation methods and proposes optimizers achieving linear convergence and specific sample complexities.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still being indexed.
Fields: cs.LG (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
RL training disrupts gradient-based adversarial attacks by inducing unstable low-magnitude gradients that limit the effectiveness of methods like PGD within practical budgets.
citing papers explorer
- Policy Gradient for Continuous-Time Robust Markov Decision Processes
Extends robust MDPs to continuous time with policy gradient derivations using differential equation methods and proposes optimizers achieving linear convergence and specific sample complexities.
- Reinforcement Learning Disrupts Gradient-Based Adversarial Optimization
RL training disrupts gradient-based adversarial attacks by inducing unstable low-magnitude gradients that limit the effectiveness of methods like PGD within practical budgets.