Learning to Spend: Model Predictive Control for Budgeting under Non-Stationary Returns
Pith reviewed 2026-05-07 10:27 UTC · model grok-4.3
The pith
Model predictive control improves budget allocation over reactive methods only when return efficiencies follow a predictable structure captured by a model.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We study finite-horizon budget allocation as a closed-loop economic control problem and evaluate receding-horizon Model Predictive Control (MPC) relative to reactive budgeting policies. Budgets are allocated periodically under execution noise and operational constraints, while return efficiency may evolve over time. Using a controlled simulation framework motivated by digital marketing, we compare reactive pacing to MPC across environments with increasing degrees of non-stationarity. Our results show that non-stationarity alone does not justify predictive control. When return dynamics are stationary or evolve through unpredictable stochastic drift, MPC offers no systematic advantage over reactive baselines.
What carries the argument
Receding-horizon Model Predictive Control that plans budget allocations by optimizing over a future horizon using an explicit model of return-efficiency dynamics, contrasted with reactive policies that allocate without foresight.
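The contrast between the two policy families can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's formulation: the concave return model `return_t = e_t * sqrt(spend_t)`, the pro-rata rule for earmarking budget to the planning window, and the names `reactive_pacing`, `mpc_step`, and `simulate`. Under that return model, the budget-constrained optimum over a window allocates spend in proportion to `e_t ** 2`, which keeps the planner closed-form.

```python
import math
import random

def reactive_pacing(remaining, periods_left):
    """Even pacing: spread the remaining budget uniformly, with no forecast."""
    return remaining / periods_left

def mpc_step(remaining, forecast, periods_left):
    """Plan over the forecast window and return only the first planned spend.

    Assumes return_t = e_t * sqrt(spend_t) (illustrative, not from the paper),
    for which the optimal split of a fixed budget is proportional to e_t ** 2.
    """
    window = len(forecast)
    # Budget earmarked for the window: a pro-rata share of what is left.
    window_budget = remaining * window / periods_left
    weights = [e * e for e in forecast]
    return window_budget * weights[0] / sum(weights)

def simulate(efficiency, budget, policy, horizon=5, noise_sd=0.0, seed=0):
    """Run one closed-loop campaign and return the cumulative return."""
    rng = random.Random(seed)
    remaining, total_return = budget, 0.0
    n_periods = len(efficiency)
    for t in range(n_periods):
        periods_left = n_periods - t
        if policy == "mpc":
            forecast = efficiency[t:t + min(horizon, periods_left)]
            planned = mpc_step(remaining, forecast, periods_left)
        else:
            planned = reactive_pacing(remaining, periods_left)
        # Execution noise: realized spend deviates from the plan.
        spend = planned * (1.0 + rng.gauss(0.0, noise_sd))
        spend = min(max(spend, 0.0), remaining)
        total_return += efficiency[t] * math.sqrt(spend)
        remaining -= spend
    return total_return

# Hypothetical environments: flat efficiency and a weekly seasonal pattern.
flat = [1.0] * 28
seasonal = [1.0 + 0.5 * math.sin(2 * math.pi * t / 7) for t in range(28)]
for name, eff in [("flat", flat), ("seasonal", seasonal)]:
    mpc = simulate(eff, 100.0, "mpc", horizon=28)
    reactive = simulate(eff, 100.0, "reactive")
    print(f"{name}: mpc={mpc:.3f} reactive={reactive:.3f}")
```

In this toy setting the two policies coincide when efficiency is flat, and the planner gains only when the forecast contains variation it can shift spend toward, which mirrors the distinction the paper draws.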
If this is right
- When return efficiencies contain known structure, MPC improves cumulative returns by shifting spend across periods to exploit the modeled trade-offs.
- When return changes are stationary or purely stochastic, reactive policies achieve comparable performance without requiring a predictive model.
- The value of MPC depends directly on how well the underlying return model matches the actual predictable component over the planning horizon.
- The controlled simulation framework enables repeatable tests of budgeting performance as the degree of predictability in returns is varied.
Where Pith is reading between the lines
- Businesses using digital ad budgets could first test whether historical return data supports a usable predictive model before switching from reactive pacing to MPC.
- The same distinction between predictable and unpredictable non-stationarity could guide control choices in other domains such as production scheduling or inventory replenishment.
- If return models can be learned and updated from streaming data, the boundary between cases where MPC helps and where it does not may shift.
Load-bearing premise
An accurate model of the predictable structure in return efficiencies is available to the MPC controller.
What would settle it
Running the same simulations after deliberately supplying the MPC with an inaccurate or absent model of return dynamics and checking whether its performance advantage over reactive policies disappears.
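A minimal sketch of such a check follows. It is not the paper's simulator: the return model `e_t * sqrt(spend_t)`, the weekly sinusoidal efficiency, the three candidate models, and the function names `run_mpc` and `run_reactive` are all assumptions chosen so the planner has a closed form. The controller plans with `model_eff` while returns are realized under `true_eff`.

```python
import math

def run_mpc(true_eff, model_eff, budget):
    """Closed-loop run: plan with model_eff, collect returns under true_eff."""
    remaining, total = budget, 0.0
    for t in range(len(true_eff)):
        # Re-plan to the end of the campaign; under the assumed sqrt return
        # model the optimal split is proportional to squared efficiency.
        weights = [m * m for m in model_eff[t:]]
        spend = remaining * weights[0] / sum(weights)
        total += true_eff[t] * math.sqrt(spend)
        remaining -= spend
    return total

def run_reactive(true_eff, budget):
    """Even pacing baseline that uses no model at all."""
    remaining, total = budget, 0.0
    n_periods = len(true_eff)
    for t in range(n_periods):
        spend = remaining / (n_periods - t)
        total += true_eff[t] * math.sqrt(spend)
        remaining -= spend
    return total

T, budget = 28, 100.0
true_eff = [1.0 + 0.5 * math.sin(2 * math.pi * t / 7) for t in range(T)]
models = {
    "oracle": true_eff,
    "absent (flat)": [1.0] * T,
    "anti-phase": [1.0 - 0.5 * math.sin(2 * math.pi * t / 7) for t in range(T)],
}
baseline = run_reactive(true_eff, budget)
gains = {name: run_mpc(true_eff, m, budget) - baseline for name, m in models.items()}
for name, gain in gains.items():
    print(f"{name}: gain over reactive = {gain:+.3f}")
```

In this toy setting the flat model reduces the planner to even pacing, and the anti-phase model makes it worse than the baseline, so any advantage is attributable to model quality and not to re-planning as such.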
Original abstract
We study finite-horizon budget allocation as a closed-loop economic control problem and evaluate receding-horizon Model Predictive Control (MPC) relative to reactive budgeting policies. Budgets are allocated periodically under execution noise and operational constraints, while return efficiency may evolve over time. Using a controlled simulation framework motivated by digital marketing, we compare reactive pacing to MPC across environments with increasing degrees of non-stationarity. Our results show that non-stationarity alone does not justify predictive control. When return dynamics are stationary or evolve through unpredictable stochastic drift, MPC offers no systematic advantage over reactive baselines. By contrast, when return efficiency exhibits predictable structure over the planning horizon, that is captured through an underlying model, MPC consistently outperforms reactive budgeting by exploiting intertemporal trade-offs.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript frames finite-horizon budget allocation as a closed-loop control problem and compares receding-horizon MPC against reactive pacing policies under execution noise and operational constraints. Using controlled simulations motivated by digital marketing, it reports that non-stationarity alone does not favor MPC; systematic outperformance occurs only when return efficiency exhibits predictable structure over the planning horizon that is captured by an underlying model, allowing exploitation of intertemporal trade-offs.
Significance. If the central claim is supported by properly documented experiments, the work would usefully delineate when predictive control adds value in economic resource allocation, distinguishing structured predictability from mere non-stationarity. The controlled simulation approach is a positive feature for isolating regime-specific effects.
Major comments (2)
- [Abstract / Simulation Results] Abstract and simulation section: the reported performance differential across non-stationarity regimes is presented without any description of experimental design, performance metrics, number of Monte Carlo runs, statistical tests, or robustness checks. This leaves the central claim that MPC 'consistently outperforms' under predictable structure weakly supported.
- [MPC formulation] MPC formulation and model section: the claim that MPC advantage requires an 'underlying model' that captures predictable return structure is load-bearing, yet the manuscript does not specify whether this model is supplied as an oracle, learned online from noisy observations, or subject to mismatch. Without this information the outperformance cannot be attributed to predictive control per se rather than model quality.
Minor comments (1)
- Notation for return efficiency and planning horizon could be introduced more explicitly at first use to aid readability.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and positive evaluation of the controlled simulation framework. We agree that the experimental documentation requires strengthening to better support the central claims, and we will revise the manuscript accordingly.
Point-by-point responses
- Referee: [Abstract / Simulation Results] Abstract and simulation section: the reported performance differential across non-stationarity regimes is presented without any description of experimental design, performance metrics, number of Monte Carlo runs, statistical tests, or robustness checks. This leaves the central claim that MPC 'consistently outperforms' under predictable structure weakly supported.
  Authors: We agree that these details were insufficiently documented. In the revised manuscript we will add a dedicated 'Experimental Setup' subsection that specifies: the controlled generation of non-stationarity regimes, the performance metrics (cumulative return efficiency and regret to oracle), the number of Monte Carlo runs (50 independent trials per regime), statistical testing (paired Wilcoxon signed-rank tests with reported p-values), and robustness checks across noise levels and horizons. These additions will make the performance differentials fully reproducible and statistically grounded.
  Revision: yes
- Referee: [MPC formulation] MPC formulation and model section: the claim that MPC advantage requires an 'underlying model' that captures predictable return structure is load-bearing, yet the manuscript does not specify whether this model is supplied as an oracle, learned online from noisy observations, or subject to mismatch. Without this information the outperformance cannot be attributed to predictive control per se rather than model quality.
  Authors: We acknowledge the ambiguity. The simulations employ an oracle model that perfectly encodes the known predictable structure in order to isolate the value of receding-horizon optimization; this was not stated explicitly. In revision we will add a clarifying paragraph in the MPC formulation section stating that the model is supplied as ground truth for these controlled experiments, and we will include a short discussion (with one additional sensitivity plot) on how performance degrades under model mismatch or online estimation. This will allow readers to attribute gains specifically to predictive control under ideal model conditions.
  Revision: yes
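The paired Wilcoxon signed-rank test named in the first response can be sketched with the standard library alone. This is an illustrative version, not the authors' analysis code: it uses the normal approximation without tie or continuity correction, the function name `wilcoxon_signed_rank` is hypothetical, and the per-trial returns at the bottom are synthetic numbers generated only to show the call, not results from the paper.

```python
import math
import random
from statistics import NormalDist

def wilcoxon_signed_rank(x, y):
    """Two-sided paired Wilcoxon signed-rank test, normal approximation.

    Returns (w_plus, p_value). Zero differences are dropped; tied absolute
    differences get average ranks. No tie or continuity correction is applied.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    # Test statistic: sum of ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return w_plus, p_value

# Synthetic example: 50 paired trials, one return per policy per trial.
rng = random.Random(0)
reactive_returns = [rng.gauss(50.0, 2.0) for _ in range(50)]
mpc_returns = [r + rng.gauss(1.0, 1.0) for r in reactive_returns]
w_plus, p_value = wilcoxon_signed_rank(mpc_returns, reactive_returns)
print(f"W+ = {w_plus:.1f}, p = {p_value:.4g}")
```

Pairing matters here because both policies would be run on the same random environment draw in each trial, so the test operates on within-trial differences. With 50 pairs the normal approximation is usually considered adequate, though an exact or tie-corrected implementation would be preferable for a published analysis.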
Circularity Check
No significant circularity; MPC advantage tied to explicit model availability in controlled simulations
Full rationale
The paper's central result—that MPC outperforms reactive policies only when return efficiency has predictable structure captured by an underlying model—is demonstrated through controlled simulations with varying non-stationarity. This comparison is self-contained: the model is an input to the MPC formulation (as is standard for model-based control), while reactive baselines lack it by design. No derivation step reduces a claimed prediction to a fitted parameter or self-citation by construction, and the abstract explicitly conditions the advantage on model presence rather than asserting unconditional superiority. The evaluation framework is externally falsifiable via the simulation environments described.