pith. sign in

arxiv: 2605.25592 · v1 · pith:EQUE2LBEnew · submitted 2026-05-25 · 📊 stat.ML · cs.LG

Optimal Design for Multinomial Logit Model with Applications to Best Assortment Identification

classification 📊 stat.ML cs.LG
keywords designbanditslinearoptimalapproachesassortmentbestcomputationally
0
0 comments X
read the original abstract

We study optimal experimental design for multinomial logit (MNL) bandits, where an agent repeatedly selects a subset of $K$ items from a ground set of size $N$ and observes single-choice feedback. Unlike linear or generalized linear bandits, MNL bandits have a combinatorial action space, which makes classical optimal design approaches and naive optimization over all subsets computationally intractable. We propose a computationally efficient optimal design framework for MNL models that achieves both statistical efficiency and scalability through two complementary approaches: (i) an exact or certified-approximate reformulation of the design oracle as a $0$-$1$ mixed-integer linear program (MILP) with solver-certified early stopping, and (ii) a fully polynomial-time lifted design that replaces the nonlinear objective with a tractable surrogate. Using the Kiefer-Wolfowitz equivalence theorem, we establish near G-optimality guarantees and characterize the induced statistical-computational trade-offs. As an application, we develop a best assortment identification algorithm for MNL bandits with linear utilities and non-uniform revenues, and prove an instance-dependent sample complexity of $\tilde{O}\big(\frac{d \log N}{\Delta^2}\big)$, where $d$ is the feature dimension, $N$ is the number of arms, and $\Delta$ is the minimum revenue gap.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.