Correlated bandits or: How to minimize mean-squared error online
1 Pith paper cites this work. Polarity classification is still indexing.
abstract
While the objective in traditional multi-armed bandit problems is to find the arm with the highest mean, in many settings the goal is instead to find the arm that best captures information about the other arms. This objective requires learning the underlying correlation structure, not just the means of the arms. Sensor placement for industrial surveillance and cellular network monitoring are two applications in which the underlying correlation structure plays an important role. Motivated by such applications, we formulate the correlated bandit problem, where the objective is to find the arm with the lowest mean-squared error (MSE) in estimating all the arms. To this end, we first derive an MSE estimator based on sample variances and covariances, and show that our estimator concentrates exponentially around the true MSE. Under a best-arm identification framework, we propose a successive-rejects-type algorithm and provide bounds on the probability of error in identifying the best arm. Using minimax theory, we also derive fundamental performance limits for the correlated bandit problem.
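As a rough illustration of what an MSE estimator built from sample variances and covariances could look like, the sketch below uses a plug-in rule. It is not taken from the paper. It assumes that arm j is predicted from arm i by the best linear predictor, so that the residual variance is var_j - cov_ij^2 / var_i, and it sums these residuals over all other arms. The function names `estimate_mse_per_arm` and `best_arm` are hypothetical, and all arms are assumed to be sampled jointly, which a bandit algorithm such as successive rejects would not do.

```python
import numpy as np

def estimate_mse_per_arm(samples):
    """Plug-in MSE estimate for each arm (illustrative assumption, not the paper's estimator).

    samples: array of shape (n, K), n joint observations of K arms.
    Returns a length-K array; entry i is the estimated total squared error
    of predicting every other arm from arm i with a linear predictor.
    """
    cov = np.cov(samples, rowvar=False)   # K x K sample covariance matrix
    var = np.diag(cov)                    # sample variances of the arms
    # resid[i, j] = var[j] - cov[i, j]^2 / var[i]: residual variance of arm j
    # after linear prediction from arm i.
    resid = var[None, :] - cov**2 / var[:, None]
    np.fill_diagonal(resid, 0.0)          # an arm predicts itself with zero error
    return resid.sum(axis=1)

def best_arm(samples):
    """Index of the arm with the smallest estimated MSE."""
    return int(np.argmin(estimate_mse_per_arm(samples)))
```

In this sketch an arm that is strongly correlated with many others gets a low score, which matches the abstract's notion of an arm that "best captures information about other arms".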
fields
cs.LG 1
years
2026 1
verdicts
UNVERDICTED 1
representative citing papers
Manifold Bandits: Bayesian Curriculum Learning over the Latent Geometry of Large Language Models
Introduces BMC, a manifold bandit framework that organizes problems into a hierarchical task tree and applies Bayesian learning to balance productivity, diversity, and utility in LLM curriculum sampling.