W-SparQ-BL models time-varying lower-level responses with multi-output GPs and sparse approximations to achieve sublinear dynamic regret in bilevel optimization under noise.
arXiv preprint arXiv:2302.12202
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
Space-sampled Value Decay is proposed as a simple forgetting mechanism for DQN and SAC modifications that shows positive but limited effects on returns in non-stationary RL environments.
- No-regret optimization of time-varying bilevel problems
W-SparQ-BL models time-varying lower-level responses with multi-output GPs and sparse approximations to achieve sublinear dynamic regret in bilevel optimization under noise.