Q-learning and Pontryagin’s minimum principle
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: stat.ML (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Offline Constrained Reinforcement Learning under Partial Data Coverage
PDOCRL is an oracle-efficient primal-dual method for offline constrained RL under general function approximation that returns near-optimal policies with O(eps^{-2}) samples under partial optimal-policy coverage and a stronger realizability condition.