Best-action queries yield Õ(min{T/k, √(T-k)}) regret for i.i.d. stochastic rewards but only Ω(√(T-k)) regret for correlated stochastic or adversarial rewards in the bandit-feedback model.
ISBN 9781605585161
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
A new TWCTV regularizer using weighted Schatten-p norms on gradients and adaptive sparse weighting in the M-product framework is proposed for robust tensor completion, with an ADMM solver and claimed superior performance on image tasks.
citing papers explorer
- Multi-Armed Bandits With Best-Action Queries
  Best-action queries yield Õ(min{T/k, √(T-k)}) regret for i.i.d. stochastic rewards but only Ω(√(T-k)) regret for correlated stochastic or adversarial rewards in the bandit-feedback model.
- Robust Low-Rank Tensor Completion based on M-product with Weighted Correlated Total Variation and Sparse Regularization
  A new TWCTV regularizer using weighted Schatten-p norms on gradients and adaptive sparse weighting in the M-product framework is proposed for robust tensor completion, with an ADMM solver and claimed superior performance on image tasks.