pith. machine review for the scientific record.

arXiv: 2604.27186 · v1 · submitted 2026-04-29 · 📡 eess.SY · cs.AI · cs.LG · cs.SY · q-fin.PM

Recognition: unknown

Learning to Spend: Model Predictive Control for Budgeting under Non-Stationary Returns


Pith reviewed 2026-05-07 10:27 UTC · model grok-4.3

keywords: model predictive control, budget allocation, non-stationary returns, reactive budgeting, digital marketing, intertemporal trade-offs, closed-loop control

The pith

Model predictive control improves budget allocation over reactive methods only when return efficiencies follow a predictable structure captured by a model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper treats periodic budget allocation with evolving returns as a closed-loop control problem and tests receding-horizon MPC against simple reactive policies that adjust spending based on current observations alone. In a simulation framework drawn from digital marketing, the comparison spans environments with different degrees of non-stationarity. Results indicate that unpredictable drift or stationary returns give MPC no systematic edge. Only when return efficiency exhibits a known structure over the horizon that the controller can model does MPC improve total returns by deliberately shifting allocations across periods to exploit intertemporal trade-offs. This finding clarifies for practitioners when the extra modeling effort of predictive control is justified in resource allocation tasks.

Core claim

We study finite-horizon budget allocation as a closed-loop economic control problem and evaluate receding-horizon Model Predictive Control (MPC) relative to reactive budgeting policies. Budgets are allocated periodically under execution noise and operational constraints, while return efficiency may evolve over time. Using a controlled simulation framework motivated by digital marketing, we compare reactive pacing to MPC across environments with increasing degrees of non-stationarity. Our results show that non-stationarity alone does not justify predictive control. When return dynamics are stationary or evolve through unpredictable stochastic drift, MPC offers no systematic advantage over reactive baselines.

What carries the argument

Receding-horizon Model Predictive Control that plans budget allocations by optimizing over a future horizon using an explicit model of return-efficiency dynamics, contrasted with reactive policies that allocate without foresight.
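
To make the contrast concrete, here is a minimal sketch of a single decision step under each policy. Everything in it is an illustrative assumption rather than the paper's formulation: the square-root return curve `return_t = e_t * sqrt(u_t)`, the function names `reactive_pacing` and `mpc_step`, and the efficiency forecast are all hypothetical. With that assumed return curve, maximizing total return subject to a fixed budget has the closed-form plan `u_t` proportional to `e_t**2`, so no numerical optimizer is needed.

```python
def reactive_pacing(remaining_budget, periods_left):
    """Reactive policy: spread what is left evenly, using no forecast."""
    return remaining_budget / periods_left


def mpc_step(remaining_budget, forecast):
    """Receding-horizon step: plan over the whole forecast, apply the first move.

    Assumes return_t = e_t * sqrt(u_t) (an illustrative choice). Maximizing the
    sum of returns subject to sum(u_t) == remaining_budget gives u_t
    proportional to e_t ** 2.
    """
    weights = [e * e for e in forecast]
    plan = [remaining_budget * w / sum(weights) for w in weights]
    return plan[0]  # only the first allocation is executed; re-plan next period


budget = 100.0
forecast = [1.0, 1.0, 2.0, 1.0]  # hypothetical efficiencies, peak in period 3

u_reactive = reactive_pacing(budget, len(forecast))
u_mpc = mpc_step(budget, forecast)
print(f"reactive spends {u_reactive:.2f} now, MPC spends {u_mpc:.2f} now")
```

In this toy case the MPC step spends less now than even pacing does, holding budget back for the period where the forecast says efficiency will be higher. That deferral is the intertemporal trade-off the summary refers to.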

If this is right

  • When return efficiencies contain known structure, MPC improves cumulative returns by shifting spend across periods to exploit the modeled trade-offs.
  • When return changes are stationary or purely stochastic, reactive policies achieve comparable performance without requiring a predictive model.
  • The value of MPC depends directly on how well the underlying return model matches the actual predictable component over the planning horizon.
  • The controlled simulation framework enables repeatable tests of budgeting performance as the degree of predictability in returns is varied.
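
The implications listed above can be illustrated with a toy closed-loop simulation. This is a sketch under stated assumptions, not the paper's simulation framework: the return curve `e_t * sqrt(u_t)`, the three efficiency paths, and the forecasters `persistence` and `known_structure` are hypothetical, and execution noise and operational constraints are left out.

```python
import math
import random


def rollout(efficiency, forecaster=None, budget=100.0):
    """Closed-loop run; returns total return sum(e_t * sqrt(u_t)).

    forecaster=None is reactive even pacing. Otherwise the controller re-plans
    every period with u proportional to forecast ** 2 (optimal for the assumed
    square-root return curve) and applies only the first allocation.
    """
    T = len(efficiency)
    remaining, total = budget, 0.0
    for t in range(T):
        if forecaster is None:
            u = remaining / (T - t)
        else:
            w = [f * f for f in forecaster(efficiency, t)]
            u = remaining * w[0] / sum(w)
        total += efficiency[t] * math.sqrt(u)
        remaining -= u
    return total


def persistence(eff, t):
    """Forecast with no structure to exploit: current efficiency persists."""
    return [eff[t]] * (len(eff) - t)


def known_structure(eff, t):
    """Oracle forecast: the predictable pattern is known in advance."""
    return eff[t:]


T = 12
rng = random.Random(0)
stationary = [1.0] * T
drift = [1.0]
for _ in range(T - 1):  # unpredictable random-walk drift, floored above zero
    drift.append(max(0.1, drift[-1] + rng.gauss(0.0, 0.05)))
seasonal = [1 + 0.5 * math.sin(2 * math.pi * t / T) for t in range(T)]

results = {
    "stationary": (rollout(stationary), rollout(stationary, persistence)),
    "random drift": (rollout(drift), rollout(drift, persistence)),
    "predictable": (rollout(seasonal), rollout(seasonal, known_structure)),
}
for name, (reactive, mpc) in results.items():
    print(f"{name:>12}: reactive={reactive:.3f}  mpc={mpc:.3f}")
```

Under a persistence forecast the MPC plan collapses to even pacing by construction, so the equal totals in the first two regimes are a property of this toy and not independent evidence for the paper's result. The sketch only shows the mechanism: a gap opens once the forecast carries structure that the reactive policy cannot see.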

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Businesses using digital ad budgets could first test whether historical return data supports a usable predictive model before switching from reactive pacing to MPC.
  • The same distinction between predictable and unpredictable non-stationarity could guide control choices in other domains such as production scheduling or inventory replenishment.
  • If return models can be learned and updated from streaming data, the boundary between cases where MPC helps and where it does not may shift.

Load-bearing premise

An accurate model of the predictable structure in return efficiencies is available to the MPC controller.

What would settle it

Running the same simulations after deliberately supplying the MPC with an inaccurate or absent model of return dynamics and checking whether its performance advantage over reactive policies disappears.
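
A minimal sketch of such a check, assuming a toy setting that is not taken from the paper: a seasonal efficiency path, a square-root return curve `e_t * sqrt(u_t)`, and three hypothetical models handed to the controller (accurate, flat, and anti-phase). The controller plans with its model but is scored on the true efficiencies.

```python
import math

T, BUDGET = 12, 100.0
# Hypothetical "true" efficiency path with predictable seasonal structure.
true_eff = [1 + 0.5 * math.sin(2 * math.pi * t / T) for t in range(T)]


def rollout(model):
    """Receding-horizon MPC that plans with `model`, scored on true_eff.

    model=None means reactive even pacing (no model at all).
    """
    remaining, total = BUDGET, 0.0
    for t in range(T):
        if model is None:
            u = remaining / (T - t)
        else:
            w = [m * m for m in model[t:]]  # plan: u proportional to m ** 2
            u = remaining * w[0] / sum(w)   # apply only the first allocation
        total += true_eff[t] * math.sqrt(u)
        remaining -= u
    return total


models = {
    "accurate": true_eff,
    "absent (flat)": [1.0] * T,
    "inaccurate (anti-phase)": [2 - e for e in true_eff],
}
baseline = rollout(None)
gaps = {name: rollout(m) - baseline for name, m in models.items()}
for name, gap in gaps.items():
    print(f"{name:>24}: MPC minus reactive = {gap:+.3f}")
```

In this toy the advantage is positive with the accurate model, vanishes with the flat one, and turns negative when the model points spend at the wrong periods. Whether the paper's environments behave the same way is exactly what the proposed experiment would test.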

Original abstract

We study finite-horizon budget allocation as a closed-loop economic control problem and evaluate receding-horizon Model Predictive Control (MPC) relative to reactive budgeting policies. Budgets are allocated periodically under execution noise and operational constraints, while return efficiency may evolve over time. Using a controlled simulation framework motivated by digital marketing, we compare reactive pacing to MPC across environments with increasing degrees of non-stationarity. Our results show that non-stationarity alone does not justify predictive control. When return dynamics are stationary or evolve through unpredictable stochastic drift, MPC offers no systematic advantage over reactive baselines. By contrast, when return efficiency exhibits predictable structure over the planning horizon, that is captured through an underlying model, MPC consistently outperforms reactive budgeting by exploiting intertemporal trade-offs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript frames finite-horizon budget allocation as a closed-loop control problem and compares receding-horizon MPC against reactive pacing policies under execution noise and operational constraints. Using controlled simulations motivated by digital marketing, it reports that non-stationarity alone does not favor MPC; systematic outperformance occurs only when return efficiency exhibits predictable structure over the planning horizon that is captured by an underlying model, allowing exploitation of intertemporal trade-offs.

Significance. If the central claim is supported by properly documented experiments, the work would usefully delineate when predictive control adds value in economic resource allocation, distinguishing structured predictability from mere non-stationarity. The controlled simulation approach is a positive feature for isolating regime-specific effects.

major comments (2)
  1. [Abstract / Simulation Results] Abstract and simulation section: the reported performance differential across non-stationarity regimes is presented without any description of experimental design, performance metrics, number of Monte Carlo runs, statistical tests, or robustness checks. This leaves the central claim that MPC 'consistently outperforms' under predictable structure weakly supported.
  2. [MPC formulation] MPC formulation and model section: the claim that MPC advantage requires an 'underlying model' that captures predictable return structure is load-bearing, yet the manuscript does not specify whether this model is supplied as an oracle, learned online from noisy observations, or subject to mismatch. Without this information the outperformance cannot be attributed to predictive control per se rather than model quality.
minor comments (1)
  1. Notation for return efficiency and planning horizon could be introduced more explicitly at first use to aid readability.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and positive evaluation of the controlled simulation framework. We agree that the experimental documentation requires strengthening to better support the central claims, and we will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [Abstract / Simulation Results] Abstract and simulation section: the reported performance differential across non-stationarity regimes is presented without any description of experimental design, performance metrics, number of Monte Carlo runs, statistical tests, or robustness checks. This leaves the central claim that MPC 'consistently outperforms' under predictable structure weakly supported.

    Authors: We agree that these details were insufficiently documented. In the revised manuscript we will add a dedicated 'Experimental Setup' subsection that specifies: the controlled generation of non-stationarity regimes, the performance metrics (cumulative return efficiency and regret to oracle), the number of Monte Carlo runs (50 independent trials per regime), statistical testing (paired Wilcoxon signed-rank tests with reported p-values), and robustness checks across noise levels and horizons. These additions will make the performance differentials fully reproducible and statistically grounded.

    Revision: yes

  2. Referee: [MPC formulation] MPC formulation and model section: the claim that MPC advantage requires an 'underlying model' that captures predictable return structure is load-bearing, yet the manuscript does not specify whether this model is supplied as an oracle, learned online from noisy observations, or subject to mismatch. Without this information the outperformance cannot be attributed to predictive control per se rather than model quality.

    Authors: We acknowledge the ambiguity. The simulations employ an oracle model that perfectly encodes the known predictable structure in order to isolate the value of receding-horizon optimization; this was not stated explicitly. In revision we will add a clarifying paragraph in the MPC formulation section stating that the model is supplied as ground truth for these controlled experiments, and we will include a short discussion (with one additional sensitivity plot) on how performance degrades under model mismatch or online estimation. This will allow readers to attribute gains specifically to predictive control under ideal model conditions.

    Revision: yes
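
For readers unfamiliar with the paired Wilcoxon signed-rank test named in the first response, the sketch below shows how per-trial returns from two policies might be compared. It is a pure-Python illustration using the normal approximation without tie or continuity correction. The function name `wilcoxon_signed_rank` and the ten return values are made up for the example; they are not results from the paper or the rebuttal.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Paired two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (W_plus, p_value). Zero differences are dropped and tied absolute
    differences share the average rank. Needs at least one nonzero difference.
    """
    d = [a - b for a, b in zip(x, y) if a != b]
    n = len(d)
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, di in zip(ranks, d) if di > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return w_plus, math.erfc(abs(z) / math.sqrt(2))


# Made-up per-trial cumulative returns, paired by simulation seed.
mpc_returns = [36.9, 36.2, 37.4, 36.8, 37.1, 36.5, 37.0, 36.7, 37.3, 36.6]
reactive_returns = [34.8, 34.9, 34.5, 34.7, 34.6, 35.0, 34.4, 34.9, 34.3, 34.8]

w_plus, p_value = wilcoxon_signed_rank(mpc_returns, reactive_returns)
print(f"W+ = {w_plus}, two-sided p = {p_value:.4f}")
```

Pairing by seed matters here: both policies face the same realized efficiency path in each trial, so the test looks at within-trial differences instead of comparing two unpaired samples.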

Circularity Check

0 steps flagged

No significant circularity; MPC advantage tied to explicit model availability in controlled simulations

Full rationale

The paper's central result—that MPC outperforms reactive policies only when return efficiency has predictable structure captured by an underlying model—is demonstrated through controlled simulations with varying non-stationarity. This comparison is self-contained: the model is an input to the MPC formulation (as is standard for model-based control), while reactive baselines lack it by design. No derivation step reduces a claimed prediction to a fitted parameter or self-citation by construction, and the abstract explicitly conditions the advantage on model presence rather than asserting unconditional superiority. The evaluation framework is externally falsifiable via the simulation environments described.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the simulation framework and 'underlying model' are referenced but not formalized.

pith-pipeline@v0.9.0 · 5447 in / 974 out tokens · 40171 ms · 2026-05-07T10:27:33.332379+00:00 · methodology

