Safe and Efficient: A Primal-Dual Method for Offline Convex CMDPs under Partial Data Coverage
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: stat.ML (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Offline Constrained Reinforcement Learning under Partial Data Coverage
PDOCRL is an oracle-efficient primal-dual method for offline constrained RL under general function approximation that returns near-optimal policies with O(eps^{-2}) samples under partial optimal-policy coverage and a stronger realizability condition.