Advances in Neural Information Processing Systems

Eluder dimension and the sample complexity of optimistic exploration

5 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Autoregressive Learning in Joint KL: Sharp Oracle Bounds and Lower Bounds

cs.LG · 2026-05-12 · unverdicted · novelty 8.0

Joint KL yields horizon-free approximation but an information-theoretic lower bound of order Omega(H) for estimation error in autoregressive learning, with matching computationally efficient upper bounds.

Harnessing Unimodality in Semiparametric Contextual Pricing via Oracle Price Map Learning

stat.ML · 2026-05-14 · unverdicted · novelty 7.0

ORBIT learns the (β-1)-smooth oracle price map via local polynomial approximation and bandit convex optimization in a semiparametric contextual pricing model, achieving regret Õ(T^{(2β-1)/(4β-3)} + √(dT)) with a matching lower bound for fixed d.

On Characterizing Learnability for Adversarial Noisy Bandits

cs.LG · 2026-05-09 · unverdicted · novelty 7.0

Learnability of adversarial noisy bandits is characterized by the convexified generalized maximin volume for oblivious adversaries and for adaptive adversaries when the arm space is countable.

Interpreting Reinforcement Learning Agents with Susceptibilities

cs.LG · 2026-05-08 · unverdicted · novelty 7.0

Susceptibilities applied to regret in deep RL agents reveal stagewise internal development in parameter space of a gridworld model that policy inspection alone cannot detect, validated via activation steering.

Optimal Online and Offline Algorithms for Contextual MNL with Applications to Assortment and Pricing

math.OC · 2026-04-21 · unverdicted · novelty 6.0

New algorithms for joint contextual MNL assortment and pricing deliver improved online regret bounds of order W sqrt(d T log N)/L0 and local suboptimality guarantees offline.

citing papers explorer


Autoregressive Learning in Joint KL: Sharp Oracle Bounds and Lower Bounds

cs.LG · 2026-05-12 · unverdicted · none · ref 18

Harnessing Unimodality in Semiparametric Contextual Pricing via Oracle Price Map Learning

stat.ML · 2026-05-14 · unverdicted · none · ref 42

On Characterizing Learnability for Adversarial Noisy Bandits

cs.LG · 2026-05-09 · unverdicted · none · ref 23

Interpreting Reinforcement Learning Agents with Susceptibilities

cs.LG · 2026-05-08 · unverdicted · none · ref 98

Optimal Online and Offline Algorithms for Contextual MNL with Applications to Assortment and Pricing

math.OC · 2026-04-21 · unverdicted · none · ref 30
