RIE-Greedy uses stochasticity from cross-validation regularization to induce Thompson Sampling-like exploration, claimed equivalent in the two-armed case and empirically competitive in large-scale settings.
Bypassing the monster: A faster and simpler optimal algorithm for contextual bandits under realizability.Mathematics of Operations Research, 47(3):1904–1931, 2022
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
stat.ML 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
RIE-Greedy: Regularization-Induced Exploration for Contextual Bandits
RIE-Greedy uses stochasticity from cross-validation regularization to induce Thompson Sampling-like exploration, claimed equivalent in the two-armed case and empirically competitive in large-scale settings.