arXiv preprint arXiv:2003.10113

Algorithms for non-stationary generalized linear bandits · 2020 · arXiv 2003.10113

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Nonstationary Generalized Linear Bandits with Discounted Online Mirror Descent

stat.ML · 2026-05-25 · unverdicted · novelty 7.0

DOMD-GLB is the first algorithm for nonstationary GLBs with O(1) per-round costs, achieving dynamic regret bounds of order Õ(c_μ^{-1/2} d^{3/4} P_T^{1/4} T^{3/4}) for drifting and Õ(c_μ^{-1/3} d^{2/3} Γ_T^{1/3} T^{2/3}) for piecewise-stationary environments.

Equilibrium and Pricing in Consumer Networks with Nonlinear Utilities: An Online Shape-Constrained Learning Approach

math.ST · 2026-05-13 · unverdicted · novelty 7.0

The paper establishes equilibrium existence and uniqueness for nonlinear utility consumer networks under contraction conditions and proposes a shape-constrained isotonic regression approach with strict no-regret convergence for learning utilities in targeted monopoly pricing.

DARLING: Detection Augmented Reinforcement Learning with Non-Stationary Guarantees

cs.LG · 2026-04-17 · unverdicted · novelty 7.0 · 2 refs

DARLING augments RL with change detection to match minimax lower bounds on dynamic regret for piecewise stationary tabular and linear MDPs under separability and reachability conditions.

Bandits for Efficient Experimentation: Adapting to Control Group, Preferences, and Context Drifts

cs.LG · 2026-06-08 · unverdicted · novelty 5.0

Dri-MED achieves Õ(κ d² log T / Δ̃) regret and Õ(d) constraint violations for drifting contextual bandits with personalized preferences and baseline constraints under practitioner-friendly assumptions.

citing papers explorer

Nonstationary Generalized Linear Bandits with Discounted Online Mirror Descent

stat.ML · 2026-05-25 · unverdicted · none · ref 1

Equilibrium and Pricing in Consumer Networks with Nonlinear Utilities: An Online Shape-Constrained Learning Approach

math.ST · 2026-05-13 · unverdicted · none · ref 61

DARLING: Detection Augmented Reinforcement Learning with Non-Stationary Guarantees

cs.LG · 2026-04-17 · unverdicted · none · ref 8 · 2 links

Bandits for Efficient Experimentation: Adapting to Control Group, Preferences, and Context Drifts

cs.LG · 2026-06-08 · unverdicted · none · ref 12
