Schedules and Prioritization: A Behavioral Foundation for Multi-Armed Bandits and Stopping Problems
Pith reviewed 2026-06-27 07:36 UTC · model grok-4.3
The pith
Index optimality for bandits follows from preferences over local schedules together with a compensation axiom.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Bandit models can be derived from preferences over schedules in three steps. First, behavioral axioms yield a generalized stopping representation with a current utility, local discounting, and a continuation aggregator. Second, a common-tail compensation axiom prices calendar time across schedules. Third, a tight elapsed-calendar constraint produces a rested generalized bandit in which the index equals the shadow price of advancing the local clock.
What carries the argument
The common-tail compensation axiom that prices calendar time across schedules, which enables the reduction to index optimality in the rested generalized bandit.
If this is right
- Expected-utility, learning, robust, rank-dependent, Choquet, and Pandora models arise as special cases of the generalized framework.
- The optimal policy in the bandit is an index policy where the index is the shadow price of local clock advancement.
- Stopping problems are solved by comparing indices derived from the schedule preferences.
- Calendar time is consistently valued across different local schedules under the axiom.
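To make the index-policy claim concrete, here is a minimal sketch in the classical expected-utility special case only, not the paper's generalized construction. It assumes each arm is a known, finite, deterministic reward stream in its own local time and that calendar time is discounted geometrically by `beta`. Under those assumptions the classical Gittins-style index is the largest ratio of discounted reward to discounted time over all stopping horizons, which can be read as the per-period charge at which continuing just breaks even. The names `deterministic_index` and `index_policy` are hypothetical.

```python
from typing import List, Sequence


def deterministic_index(rewards: Sequence[float], beta: float) -> float:
    """Max over horizons tau >= 1 of (discounted reward) / (discounted time)."""
    if not rewards:
        raise ValueError("need at least one remaining reward")
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    best = float("-inf")
    num = 0.0     # running discounted reward up to the current horizon
    den = 0.0     # running discounted time up to the current horizon
    weight = 1.0  # beta ** t
    for r in rewards:
        num += weight * r
        den += weight
        weight *= beta
        best = max(best, num / den)
    return best


def index_policy(arms: List[Sequence[float]], beta: float, horizon: int) -> List[int]:
    """Each calendar period, advance the local clock of the highest-index arm."""
    clocks = [0] * len(arms)  # local time of each arm (rested: only moves when played)
    order: List[int] = []
    for _ in range(horizon):
        live = [i for i in range(len(arms)) if clocks[i] < len(arms[i])]
        if not live:
            break
        chosen = max(live, key=lambda i: deterministic_index(arms[i][clocks[i]:], beta))
        order.append(chosen)
        clocks[chosen] += 1
    return order


if __name__ == "__main__":
    arms = [[1.0, 1.0], [0.0, 10.0]]
    print(index_policy(arms, beta=0.9, horizon=4))
```

In this toy instance the second arm pays nothing in its first local period, yet its index exceeds that of the first arm because the index looks ahead to the best stopping horizon, so the policy advances the second arm's clock first.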
Where Pith is reading between the lines
- Human experiments could test whether people price calendar time consistently across projects as the axiom requires.
- The framework might apply to organizational prioritization where different tasks have their own local times.
- Violations of index optimality in practice could be traced to failures of the common-tail compensation in preferences.
Load-bearing premise
The common-tail compensation axiom holds so that calendar time can be priced across different schedules.
What would settle it
Preferences over schedules that violate the common-tail compensation axiom, for example by assigning different values to the same common tail depending on the preceding schedule, would prevent the reduction to the rested bandit and the shadow-price interpretation of the index.
Original abstract
Bandit models typically begin with arms, states, rewards, and transition rules. This paper instead begins with preferences over stopped local contingent schedules: possible unfoldings of a responsibility, project, experiment, or opportunity in its own local time. Behavioral axioms on single schedules characterize a generalized stopping representation with current utility, local discounting, and a broad continuation aggregator. A common-tail compensation axiom then allows calendar time to be priced across schedules. Imposing a tight elapsed-calendar constraint generates a rested generalized bandit and yields index optimality: the index is the shadow price of advancing a local clock. Expected-utility, learning, robust, rank-dependent, Choquet, and Pandora models arise as special cases.
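For the Pandora special case named in the abstract, the familiar index from the classical search problem is the reservation value of a box. The sketch below assumes the textbook setting rather than the paper's generalized one: a box with inspection cost `cost`, a discrete prize distribution, and no discounting, so the reservation value `z` solves `cost = E[max(X - z, 0)]`. The function name `reservation_value` and the bisection tolerance are illustrative choices.

```python
from typing import Sequence


def reservation_value(values: Sequence[float], probs: Sequence[float],
                      cost: float, tol: float = 1e-10) -> float:
    """Solve cost = E[max(X - z, 0)] for z by bisection (assumes cost > 0)."""
    if cost <= 0.0:
        raise ValueError("cost must be positive")
    if len(values) != len(probs) or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("values and probs must align and probs must sum to 1")

    def expected_gain(z: float) -> float:
        # Expected improvement over a fallback prize worth z.
        return sum(p * max(x - z, 0.0) for x, p in zip(values, probs))

    # expected_gain is continuous and non-increasing in z, with
    # expected_gain(lo) >= cost and expected_gain(hi) = 0 < cost.
    lo = min(values) - cost
    hi = max(values)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_gain(mid) > cost:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Prize is 0 or 10 with equal probability; opening the box costs 1.
    print(reservation_value([0.0, 10.0], [0.5, 0.5], cost=1.0))
```

With a prize of 0 or 10 at equal odds and an inspection cost of 1, the indifference condition is `0.5 * (10 - z) = 1`, so the reservation value is 8.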
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops a behavioral foundation for multi-armed bandits and stopping problems by starting from preferences over stopped local contingent schedules rather than from arms, states, or rewards. Behavioral axioms on single schedules are shown to characterize a generalized stopping representation involving current utility, local discounting, and a continuation aggregator. A common-tail compensation axiom is then used to price calendar time across schedules. Imposing a tight elapsed-calendar constraint produces a rested generalized bandit whose optimal policy is an index policy, with the index interpreted as the shadow price of advancing a local clock. Expected-utility, learning, robust, rank-dependent, Choquet, and Pandora models are recovered as special cases.
Significance. If the derivations are rigorous, the paper supplies a unified axiomatic microfoundation that shifts the primitive from structural elements to preferences over schedules and yields an economically interpretable index. The unification of several distinct models as special cases and the shadow-price characterization of the index are potentially valuable contributions to decision theory and bandit analysis.
major comments (2)
- [Abstract / common-tail compensation axiom section] Abstract and the section introducing the common-tail compensation axiom: this axiom is the load-bearing step that permits pricing calendar time across distinct schedules and thereby enables the reduction to a rested generalized bandit with index optimality. The manuscript must supply an explicit theorem (with proof) showing necessity of the axiom for the shadow-price result and must verify that the axiom holds in each of the listed special cases (expected utility, rank-dependent, Choquet, etc.).
- [Generalized stopping representation derivation] Section deriving the generalized stopping representation: the abstract states that behavioral axioms on single schedules characterize the representation with current utility, local discounting, and continuation aggregator, yet no functional forms, uniqueness argument, or verification that the representation is unique up to the stated parameters appear in the provided summary. An explicit uniqueness result tied to the axioms is required to support downstream claims.
minor comments (2)
- [Abstract] The abstract paragraph beginning 'A common-tail compensation axiom then allows...' could be expanded with a one-sentence pointer to the relevant theorem number for readers.
- [Notation] Notation for the continuation aggregator and local clock should be introduced with a short table or glossary if the same symbols are reused across sections.
Simulated Author's Rebuttal
We thank the referee for the constructive report and for identifying points where the axiomatic development can be made more explicit. We address each major comment below and will incorporate the requested additions in a revised manuscript.
point-by-point responses
- Referee: [Abstract / common-tail compensation axiom section] Abstract and the section introducing the common-tail compensation axiom: this axiom is the load-bearing step that permits pricing calendar time across distinct schedules and thereby enables the reduction to a rested generalized bandit with index optimality. The manuscript must supply an explicit theorem (with proof) showing necessity of the axiom for the shadow-price result and must verify that the axiom holds in each of the listed special cases (expected utility, rank-dependent, Choquet, etc.).
  Authors: We agree that an explicit necessity theorem linking the common-tail compensation axiom to the shadow-price index result would strengthen the paper. In the revision we will add a dedicated theorem (with proof) establishing necessity under the maintained assumptions, together with a verification subsection confirming that the axiom is satisfied in each recovered special case (expected utility, rank-dependent utility, Choquet expected utility, robust, learning, and Pandora models).
  Revision: yes
- Referee: [Generalized stopping representation derivation] Section deriving the generalized stopping representation: the abstract states that behavioral axioms on single schedules characterize the representation with current utility, local discounting, and continuation aggregator, yet no functional forms, uniqueness argument, or verification that the representation is unique up to the stated parameters appear in the provided summary. An explicit uniqueness result tied to the axioms is required to support downstream claims.
  Authors: The current draft presents the representation theorem but does not isolate a separate uniqueness proposition. We will add an explicit uniqueness result (showing uniqueness up to the stated parameters of current utility, local discounting, and continuation aggregator) directly tied to the behavioral axioms on single schedules. This will be placed immediately after the representation theorem to support all subsequent claims.
  Revision: yes
Circularity Check
Derivation from stated behavioral axioms with no self-referential or fitted reductions
full rationale
The paper derives the generalized stopping representation and index optimality directly from behavioral axioms on single schedules plus the common-tail compensation axiom as primitives. No equations or steps in the provided abstract reduce a prediction or index to a fitted parameter by construction, invoke self-citations for uniqueness, or rename known results. The central claim remains independent of its inputs and is not forced by definition or prior self-work.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: behavioral axioms on single schedules that characterize the generalized stopping representation
- Domain assumption: common-tail compensation axiom that prices calendar time across schedules