
$\sum_{k=t+1}^{T-1} g(x_k, \pi(x_k))$ # We can directly apply Corollary 2 of Sukhija et al. [62] to obtain $J_g(\pi, f') - J_g(\pi, f) = \mathbb{E}_{\tau^{f}_{\pi}}$

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

fields

cs.LG 1

years

2026 1

verdicts

CONDITIONAL 1

representative citing papers

Sampling-Based Safe Reinforcement Learning

cs.LG · 2026-05-19 · conditional · novelty 6.0

SBSRL approximates worst-case safety optimization over uncertain dynamics via finite sampling, adds epistemic-uncertainty-constrained exploration, and supplies high-probability safety guarantees plus finite-time sample-complexity bounds for near-optimal policies.

citing papers explorer

Showing 1 of 1 citing paper.

  • Sampling-Based Safe Reinforcement Learning cs.LG · 2026-05-19 · conditional · none · ref 71
