Online reinforcement learning in non-stationary context-driven environments

Hamadanian, P · 2023 · arXiv 2302.02182

2 Pith papers cite this work. Polarity classification is still indexing.

Representative citing papers

cs.LG · 2026-04-20 · unverdicted · novelty 5.0

RASP-Tuner matches or beats GP-UCB and CMA-ES regret on seven of nine synthetic non-stationary tasks while running 8-12 times faster per step.

cs.LG · 2026-05-04

RASP-Tuner: Retrieval-Augmented Soft Prompts for Context-Aware Black-Box Optimization in Non-Stationary Environments · cs.LG · 2026-04-20 · unverdicted · none · ref 48
Efficient Preference Poisoning Attack on Offline RLHF · cs.LG · 2026-05-04 · unreviewed · ref 94