DARE provides a distribution-aware sample-level constraint release mechanism for offline-to-online RL based on behavioral consistency with a behavior model, supported by theoretical analysis and D4RL experiments showing improved stability and performance.
For your convenience, we provide below the pseudocode for Algorithm 1 in the paper.
Citing papers: 1 (field: cs.LG; year: 2025; verdict: unverdicted). Polarity classification is still indexing.
From Static Constraints to Dynamic Adaptation: Sample-Level Constraint Release for Offline-to-Online Reinforcement Learning