hub Mixed citations

Finite-time analysis of the multiarmed bandit problem. Mach

Mixed citation behavior. Most common role is background (40%).

16 Pith papers citing it
3,730 external citations · Crossref
Background: 40% of classified citations

citation-role summary

background 2 · method 2 · dataset 1

representative citing papers

On-line Learning in Tree MDPs by Treating Policies as Bandit Arms

cs.AI · 2026-05-06 · unverdicted · novelty 7.0

Bandit algorithms can be adapted to Tree MDPs by treating policies as arms with shared-data confidence bounds, achieving polynomial memory and instance-dependent bounds on sample complexity and regret that depend on terminal-state gaps rather than all policies.

Offline Local Search for Online Stochastic Bandits

cs.LG · 2026-04-10 · unverdicted · novelty 7.0

A generic conversion turns offline local search algorithms into online stochastic combinatorial bandit algorithms with O(log^3 T) approximate regret.

Playing the network backward: A Game Theoretic Attribution Framework

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

Backward attribution is reframed as integrals over trajectories in a two-player game on the network, unifying gradients and alpha-beta-LRP while enabling new adaptations that outperform prior methods on ViT-B/16 localization metrics.

Discrete Diffusion for Codebook-Based Beam Candidate Generation

eess.SP · 2026-04-09 · unverdicted · novelty 6.0

A discrete denoising diffusion model learns from probing histories to generate promising beam candidates, yielding better SNR, lower beam-miss probability, and reduced probe regret than baselines under tight probing budgets.

Best Agent Identification for General Game Playing

cs.LG · 2025-07-01 · unverdicted · novelty 6.0

An optimistic confidence-interval ranking procedure for best-arm identification across multiple independent bandits yields lower average simple regret and error probability than prior methods when selecting high-performing agents for each game in GVGAI and Ludii.

Calibrated Model-Based Deep Reinforcement Learning

cs.LG · 2019-06-19 · unverdicted · novelty 5.0

Augmenting model-based RL agents with calibrated predictive uncertainties improves planning, sample efficiency, and exploration on continuous control tasks.
