pith. sign in

arxiv: 2410.05117 · v2 · pith:7IQRWTXBnew · submitted 2024-10-07 · 💻 cs.LG · cs.IT· math.IT· math.ST· stat.ML· stat.TH

Assouad, Fano, and Le Cam with Interaction: A Unifying Lower Bound Framework and Characterization for Bandit Learnability

classification 💻 cs.LG cs.ITmath.ITmath.STstat.MLstat.TH
keywords lowerinteractiveboundsdecisionemphmakingboundestimation
0
0 comments X
read the original abstract

We develop a unifying framework for information-theoretic lower bound in statistical estimation and interactive decision making. Classical lower bound techniques -- such as Fano's method, Le Cam's method, and Assouad's lemma -- are central to the study of minimax risk in statistical estimation, yet are insufficient to provide tight lower bounds for \emph{interactive decision making} algorithms that collect data interactively (e.g., algorithms for bandits and reinforcement learning). Recent work of Foster et al. (2021, 2023) provides minimax lower bounds for interactive decision making using seemingly different analysis techniques from the classical methods. These results -- which are proven using a complexity measure known as the \emph{Decision-Estimation Coefficient} (DEC) -- capture difficulties unique to interactive learning, yet do not recover the tightest known lower bounds for passive estimation. We propose a unified view of these distinct methodologies through a new lower bound approach called \emph{interactive Fano method}. As an application, we introduce a novel complexity measure, the \emph{Fractional Covering Number}, which facilitates the new lower bounds for interactive decision making that extend the DEC methodology by incorporating the complexity of estimation. Using the fractional covering number, we (i) provide a unified characterization of learnability for \emph{any} stochastic bandit problem, (ii) close the remaining gap between the upper and lower bounds in Foster et al. (2021, 2023) (up to polynomial factors) for any interactive decision making problem in which the underlying model class is convex.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 6 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. The Sample Complexity of Multiclass and Sparse Contextual Bandits

    cs.LG 2026-05 unverdicted novelty 8.0

    Algorithms and matching lower bounds for s-sparse contextual bandits yield Õ((s/ε² + |A|/ε) log |Π|/δ) samples to output an ε-optimal policy.

  2. Bellman-sufficient Information Complexity

    cs.LG 2026-06 unverdicted novelty 7.0

    Introduces Bellman-sufficient information complexity as a representation-level framework for sequential decision making that unifies upper and lower bounds through a conditional Bellman information-risk sandwich.

  3. Pointwise Complexity for Gaussian Fields: Upper Envelopes, Algorithmic Lower Bounds, and Separation

    math.PR 2026-06 unverdicted novelty 7.0

    Proves pointwise majorizing-measure theorem for Gaussian processes, records Bayesian algorithmic lower bounds, and constructs a separation example among different complexity measures.

  4. Pointwise Complexity for Gaussian Fields: Upper Envelopes, Algorithmic Lower Bounds, and Separation

    math.PR 2026-06 unverdicted novelty 7.0

    Establishes a variance-aware pointwise majorizing-measure theorem for Gaussian fields, records Bayesian algorithmic lower bounds, and constructs a separation example among classical, algorithmic, and pointwise quantities.

  5. Pointwise Generalization in Deep Neural Networks

    cs.LG 2026-05 unverdicted novelty 7.0

    Proposes pointwise Riemannian Dimension from feature eigenvalues to derive tighter, representation-aware generalization bounds for deep networks in the nonlinear regime.

  6. Bellman-sufficient Information Complexity

    cs.LG 2026-06 unverdicted novelty 6.0

    Proposes Bellman-sufficient state representations and information indices Y=χ(Ω) to organize sequential decision making with a conditional Bellman information-risk sandwich providing matching upper and lower complexit...