RIE-Greedy uses stochasticity from cross-validation regularization to induce Thompson Sampling-like exploration, claimed equivalent in the two-armed case and empirically competitive in large-scale settings.
Thompson sampling for contextual bandits with linear payoffs
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
stat.ML 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
RIE-Greedy: Regularization-Induced Exploration for Contextual Bandits
RIE-Greedy uses stochasticity from cross-validation regularization to induce Thompson Sampling-like exploration, claimed equivalent in the two-armed case and empirically competitive in large-scale settings.