arxiv: 2606.12044 · v1 · pith:37Y5EVKPnew · submitted 2026-06-10 · 💰 econ.TH

Schedules and Prioritization: A Behavioral Foundation for Multi-Armed Bandits and Stopping Problems

Pith reviewed 2026-06-27 07:36 UTC · model grok-4.3

classification 💰 econ.TH
keywords multi-armed bandits · stopping problems · behavioral axioms · index policies · schedule preferences · generalized bandits · shadow price of time

The pith

Preferences over local schedules, together with a compensation axiom, yield index optimality for bandits.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper starts from behavioral axioms on preferences over stopped local contingent schedules to characterize a generalized stopping representation. A common-tail compensation axiom prices calendar time across schedules. Imposing a tight elapsed-calendar constraint then generates a rested generalized bandit in which the index is the shadow price of advancing a local clock. This foundation unifies several models including expected-utility, learning, robust, rank-dependent, Choquet, and Pandora as special cases. A sympathetic reader would care because it grounds the use of index policies in prioritization and stopping problems in observable preferences rather than assumed reward structures.

Core claim

Bandit models can be derived from preferences over schedules in three steps: first, axiomatize a generalized stopping representation with current utility, local discounting, and a continuation aggregator; second, use a common-tail compensation axiom to price calendar time across schedules; third, impose a tight elapsed-calendar constraint to obtain a rested generalized bandit in which the index equals the shadow price of advancing the local clock.
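
The summary does not state the paper's functional forms, so the following is only a sketch of one plausible shape for a stopping recursion with these three ingredients, followed by the classical benchmark it would have to nest. Every symbol here is an assumption introduced for illustration: h is a local history, s(h) a stopping payoff, u(h) current utility, \beta(h) a local discount factor, and \mathcal{I}_h a continuation aggregator over successor histories h'. The second line is the standard Gittins index for an expected-utility arm with constant discount factor \beta, which is a known result rather than something taken from the paper.

```latex
% Illustrative recursion (assumed form, not quoted from the paper):
V(h) \;=\; \max\Big\{\, s(h),\;\; u(h) + \beta(h)\,\mathcal{I}_h\big[\,V(h')\,\big] \Big\}

% Classical benchmark: \mathcal{I}_h = \mathbb{E}[\,\cdot \mid h\,], \beta(h) \equiv \beta.
% The Gittins index is the best achievable ratio of discounted reward
% to discounted time over stopping times \tau \ge 1:
\gamma(h) \;=\; \sup_{\tau \ge 1}\;
  \frac{\mathbb{E}\!\left[\sum_{t=0}^{\tau-1} \beta^{t}\, u(h_t) \,\middle|\, h_0 = h\right]}
       {\mathbb{E}\!\left[\sum_{t=0}^{\tau-1} \beta^{t} \,\middle|\, h_0 = h\right]}
```

Read as a ratio of reward to discounted time, the benchmark index is a price per unit of clock time, which is the sense in which a shadow-price interpretation is natural.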

What carries the argument

The argument rests on the common-tail compensation axiom, which prices calendar time across schedules and thereby enables the reduction to index optimality in the rested generalized bandit.
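
In the classical special case (expected utility, a constant discount factor, a finite-state Markov arm), the idea of pricing time can be made concrete through the well-known calibration characterization of the Gittins index: the index of a state is the per-period charge at which advancing the arm's clock for at least one step is worth exactly zero. The sketch below is illustrative only and is not the paper's construction; the function names `stopping_value` and `gittins_index`, the two-state arm, and the discount factor 0.9 are assumptions chosen for the example.

```python
def stopping_value(rewards, P, beta, charge, tol=1e-12, max_iter=100_000):
    """Optimal stopping value per state when each period pays r(s) - charge
    and stopping (worth 0) is allowed at any time. Solved by value iteration."""
    n = len(rewards)
    W = [0.0] * n
    for _ in range(max_iter):
        new = [
            max(0.0, rewards[s] - charge
                + beta * sum(P[s][t] * W[t] for t in range(n)))
            for s in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new, W)) < tol:
            return new
        W = new
    return W


def gittins_index(rewards, P, beta, state, tol=1e-9):
    """Bisect on the per-period charge until being forced to take one step
    from `state`, then stopping optimally, is worth zero."""
    n = len(rewards)
    lo, hi = min(rewards), max(rewards)  # the index lies in this range
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        W = stopping_value(rewards, P, beta, mid)
        forced = rewards[state] - mid + beta * sum(
            P[state][t] * W[t] for t in range(n)
        )
        if forced > 0:
            lo = mid  # charge too low: continuing is still profitable
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical arm: state 0 pays 0 and moves to state 1 with probability 0.5;
# state 1 is absorbing and pays 1 forever.
rewards = [0.0, 1.0]
P = [[0.5, 0.5], [0.0, 1.0]]
beta = 0.9

index_0 = gittins_index(rewards, P, beta, state=0)
index_1 = gittins_index(rewards, P, beta, state=1)
```

For this arm the index of state 0 is 9/11 (about 0.818) and the index of state 1 is 1, which can be checked by hand from the zero-value condition. An index policy would, at each step, advance whichever arm currently has the highest such price.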

If this is right

  • Expected-utility, learning, robust, rank-dependent, Choquet, and Pandora models arise as special cases of the generalized framework.
  • The optimal policy in the bandit is an index policy where the index is the shadow price of local clock advancement.
  • Stopping problems are solved by comparing indices derived from the schedule preferences.
  • Calendar time is consistently valued across different local schedules under the axiom.
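
As a concrete instance of an index in one of the listed special cases, the textbook Pandora (Weitzman) search problem without discounting assigns each box a reservation value z solving cost = E[max(X - z, 0)], and boxes are opened in descending order of z. The sketch below assumes discrete prize distributions and strictly positive opening costs; the function name `reservation_value` and the two example boxes are hypothetical, and the summary does not say whether the paper's Pandora case uses this exact normalization.

```python
def reservation_value(values, probs, cost, tol=1e-10):
    """Solve cost = E[max(X - z, 0)] for z by bisection.
    Assumes cost > 0 and probs summing to 1 over the listed prize values."""
    def expected_gain(z):
        return sum(p * max(x - z, 0.0) for x, p in zip(values, probs))

    # At z = min(values) - cost the expected gain is at least cost;
    # at z = max(values) it is zero, so the root lies in between.
    lo, hi = min(values) - cost, max(values)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_gain(mid) > cost:
            lo = mid  # threshold too low: opening is still worth more than it costs
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical boxes: (prize values, probabilities, opening cost).
boxes = {
    "risky": ([0.0, 10.0], [0.5, 0.5], 1.0),
    "safe": ([6.0], [1.0], 0.5),
}

indices = {name: reservation_value(*spec) for name, spec in boxes.items()}
opening_order = sorted(indices, key=indices.get, reverse=True)
```

Here the risky box has reservation value 8 and the safe box 5.5, so the risky box is opened first even though its expected prize (5) is below the safe prize (6). Search then stops once the best prize in hand exceeds the highest remaining reservation value.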

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Human experiments could test whether people price calendar time consistently across projects as the axiom requires.
  • The framework might apply to organizational prioritization where different tasks have their own local times.
  • Violations of index optimality in practice could be traced to failures of the common-tail compensation in preferences.

Load-bearing premise

The common-tail compensation axiom holds so that calendar time can be priced across different schedules.

What would settle it

Preferences over schedules that violate the common-tail compensation axiom, for example by assigning different values to the same common tail depending on the preceding schedule, would prevent the reduction to the rested bandit and the shadow-price interpretation of the index.

Original abstract

Bandit models typically begin with arms, states, rewards, and transition rules. This paper instead begins with preferences over stopped local contingent schedules: possible unfoldings of a responsibility, project, experiment, or opportunity in its own local time. Behavioral axioms on single schedules characterize a generalized stopping representation with current utility, local discounting, and a broad continuation aggregator. A common-tail compensation axiom then allows calendar time to be priced across schedules. Imposing a tight elapsed-calendar constraint generates a rested generalized bandit and yields index optimality: the index is the shadow price of advancing a local clock. Expected-utility, learning, robust, rank-dependent, Choquet, and Pandora models arise as special cases.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript develops a behavioral foundation for multi-armed bandits and stopping problems by starting from preferences over stopped local contingent schedules rather than from arms, states, or rewards. Behavioral axioms on single schedules are shown to characterize a generalized stopping representation involving current utility, local discounting, and a continuation aggregator. A common-tail compensation axiom is then used to price calendar time across schedules. Imposing a tight elapsed-calendar constraint produces a rested generalized bandit whose optimal policy is an index policy, with the index interpreted as the shadow price of advancing a local clock. Expected-utility, learning, robust, rank-dependent, Choquet, and Pandora models are recovered as special cases.

Significance. If the derivations are rigorous, the paper supplies a unified axiomatic microfoundation that shifts the primitive from structural elements to preferences over schedules and yields an economically interpretable index. The unification of several distinct models as special cases and the shadow-price characterization of the index are potentially valuable contributions to decision theory and bandit analysis.

major comments (2)
  1. [Abstract / common-tail compensation axiom section] Abstract and the section introducing the common-tail compensation axiom: this axiom is the load-bearing step that permits pricing calendar time across distinct schedules and thereby enables the reduction to a rested generalized bandit with index optimality. The manuscript must supply an explicit theorem (with proof) showing necessity of the axiom for the shadow-price result and must verify that the axiom holds in each of the listed special cases (expected utility, rank-dependent, Choquet, etc.).
  2. [Generalized stopping representation derivation] Section deriving the generalized stopping representation: the abstract states that behavioral axioms on single schedules characterize the representation with current utility, local discounting, and continuation aggregator, yet no functional forms, uniqueness argument, or verification that the representation is unique up to the stated parameters appear in the provided summary. An explicit uniqueness result tied to the axioms is required to support downstream claims.
minor comments (2)
  1. [Abstract] The abstract paragraph beginning 'A common-tail compensation axiom then allows...' could be expanded with a one-sentence pointer to the relevant theorem number for readers.
  2. [Notation] Notation for the continuation aggregator and local clock should be introduced with a short table or glossary if the same symbols are reused across sections.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive report and for identifying points where the axiomatic development can be made more explicit. We address each major comment below and will incorporate the requested additions in a revised manuscript.

Point-by-point responses
  1. Referee: [Abstract / common-tail compensation axiom section] Abstract and the section introducing the common-tail compensation axiom: this axiom is the load-bearing step that permits pricing calendar time across distinct schedules and thereby enables the reduction to a rested generalized bandit with index optimality. The manuscript must supply an explicit theorem (with proof) showing necessity of the axiom for the shadow-price result and must verify that the axiom holds in each of the listed special cases (expected utility, rank-dependent, Choquet, etc.).

    Authors: We agree that an explicit necessity theorem linking the common-tail compensation axiom to the shadow-price index result would strengthen the paper. In the revision we will add a dedicated theorem (with proof) establishing necessity under the maintained assumptions, together with a verification subsection confirming that the axiom is satisfied in each recovered special case (expected utility, rank-dependent utility, Choquet expected utility, robust, learning, and Pandora models). revision: yes

  2. Referee: [Generalized stopping representation derivation] Section deriving the generalized stopping representation: the abstract states that behavioral axioms on single schedules characterize the representation with current utility, local discounting, and continuation aggregator, yet no functional forms, uniqueness argument, or verification that the representation is unique up to the stated parameters appear in the provided summary. An explicit uniqueness result tied to the axioms is required to support downstream claims.

    Authors: The current draft presents the representation theorem but does not isolate a separate uniqueness proposition. We will add an explicit uniqueness result (showing uniqueness up to the stated parameters of current utility, local discounting, and continuation aggregator) directly tied to the behavioral axioms on single schedules. This will be placed immediately after the representation theorem to support all subsequent claims. revision: yes

Circularity Check

0 steps flagged

Derivation from stated behavioral axioms with no self-referential or fitted reductions

Full rationale

The paper derives the generalized stopping representation and index optimality directly from two primitives: behavioral axioms on single schedules and the common-tail compensation axiom. No equations or steps in the provided abstract reduce a prediction or index to a fitted parameter by construction, invoke self-citations for uniqueness, or rename known results. The central claim is therefore not built into its premises by definition or by prior self-work.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on two layers of behavioral axioms whose precise statements are not supplied in the abstract; these function as domain assumptions in decision theory. No free parameters or invented entities are mentioned.

axioms (2)
  • domain assumption: Behavioral axioms on single schedules that characterize the generalized stopping representation
    Invoked to obtain current utility, local discounting, and continuation aggregator (abstract).
  • domain assumption: Common-tail compensation axiom that prices calendar time across schedules
    Required to move from single-schedule representation to multi-schedule bandit problem (abstract).

pith-pipeline@v0.9.1-grok · 5641 in / 1520 out tokens · 22787 ms · 2026-06-27T07:36:37.094529+00:00 · methodology

