Lagrangian index policy for restless bandits with average reward

Konstantin Avrachenkov, Vivek S Borkar, Pratik Shah · 2024 · arXiv 2412.12641

2 Pith papers cite this work. Polarity classification is still indexing.



citation-role summary

background 1

citation-polarity summary

contest 1

representative citing papers

Restless Bandits with Individual Penalty Constraints: Near-Optimal Indices and Deep Reinforcement Learning

cs.LG · 2026-04-05 · unverdicted · novelty 7.0

The POW index policy for restless multi-armed bandits with per-arm penalty constraints is asymptotically optimal, computable offline per user, and learnable via deep RL.

Lagrange Index based Scheduling for Minimizing Age of Updates from Heterogeneous Sources

cs.NI · 2026-04-20 · unverdicted · novelty 5.0

Lagrange index heuristic for RMAB-SMDP scheduling minimizes weighted AoI under non-preemptive heterogeneous updates in wireless networks.

citing papers explorer

Showing 2 of 2 citing papers.

Restless Bandits with Individual Penalty Constraints: Near-Optimal Indices and Deep Reinforcement Learning

cs.LG · 2026-04-05 · unverdicted · none · ref 1

The POW index policy for restless multi-armed bandits with per-arm penalty constraints is asymptotically optimal, computable offline per user, and learnable via deep RL.

Lagrange Index based Scheduling for Minimizing Age of Updates from Heterogeneous Sources

cs.NI · 2026-04-20 · unverdicted · none · ref 14

Lagrange index heuristic for RMAB-SMDP scheduling minimizes weighted AoI under non-preemptive heterogeneous updates in wireless networks.
