arXiv: 2604.27186 · v1 · submitted 2026-04-29 · eess.SY · cs.AI · cs.LG · cs.SY · q-fin.PM


Learning to Spend: Model Predictive Control for Budgeting under Non-Stationary Returns

Nilavra Pathak, Smriti Shyamal, Prasant Mhasker, Christopher Swartz


Pith reviewed 2026-05-07 10:27 UTC · model grok-4.3


keywords: model predictive control, budget allocation, non-stationary returns, reactive budgeting, digital marketing, intertemporal trade-offs, closed-loop control


The pith

Model predictive control improves budget allocation over reactive methods only when return efficiencies follow a predictable structure captured by a model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper treats periodic budget allocation with evolving returns as a closed-loop control problem and tests receding-horizon MPC against simple reactive policies that adjust spending based on current observations alone. In a simulation framework drawn from digital marketing, the comparison spans environments with different degrees of non-stationarity. Results indicate that unpredictable drift or stationary returns give MPC no systematic edge. Only when return efficiency exhibits a known structure over the horizon that the controller can model does MPC improve total returns by deliberately shifting allocations across periods to exploit intertemporal trade-offs. This finding clarifies for practitioners when the extra modeling effort of predictive control is justified in resource allocation tasks.

Core claim

We study finite-horizon budget allocation as a closed-loop economic control problem and evaluate receding-horizon Model Predictive Control (MPC) relative to reactive budgeting policies. Budgets are allocated periodically under execution noise and operational constraints, while return efficiency may evolve over time. Using a controlled simulation framework motivated by digital marketing, we compare reactive pacing to MPC across environments with increasing degrees of non-stationarity. Our results show that non-stationarity alone does not justify predictive control. When return dynamics are stationary or evolve through unpredictable stochastic drift, MPC offers no systematic advantage over 1.

What carries the argument

Receding-horizon Model Predictive Control that plans budget allocations by optimizing over a future horizon using an explicit model of return-efficiency dynamics, contrasted with reactive policies that allocate without foresight.
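
The contrast between the two policies can be illustrated with a minimal sketch. This is not the paper's formulation: it assumes a hypothetical square-root response, return_t = efficiency_t * sqrt(spend_t), chosen only because the remaining-horizon optimum then has a closed form (spend proportional to the squared forecast efficiency). The function names, the budget of 100, and the efficiency profiles are all illustrative assumptions, and execution noise and operational constraints are left out.

```python
import math

def reactive_pacing(efficiency, budget):
    """Spend the remaining budget evenly over the remaining periods (no foresight)."""
    remaining, spends = budget, []
    horizon = len(efficiency)
    for t in range(horizon):
        spend = remaining / (horizon - t)
        spends.append(spend)
        remaining -= spend
    return spends

def receding_horizon_mpc(forecast, budget):
    """Re-plan over the remaining horizon each period and apply only the first spend.

    ASSUMPTION: return_t = efficiency_t * sqrt(spend_t) with positive efficiencies,
    so the optimal remaining-horizon plan is spend_j proportional to forecast_j ** 2.
    """
    remaining, spends = budget, []
    for t in range(len(forecast)):
        weights = [f * f for f in forecast[t:]]
        spend = remaining * weights[0] / sum(weights)
        spends.append(spend)
        remaining -= spend
    return spends

def total_return(efficiency, spends):
    """Cumulative return under the assumed square-root response."""
    return sum(e * math.sqrt(s) for e, s in zip(efficiency, spends))

budget = 100.0
stationary = [2.0] * 4                # hypothetical flat efficiency
structured = [1.0, 2.0, 3.0, 4.0]     # hypothetical known ramp in efficiency

for name, eff in [("stationary", stationary), ("structured", structured)]:
    reactive = total_return(eff, reactive_pacing(eff, budget))
    mpc = total_return(eff, receding_horizon_mpc(eff, budget))
    print(f"{name}: reactive={reactive:.2f} mpc={mpc:.2f}")
```

In this toy setting the two policies coincide when efficiency is flat (both return 40.00), while on the known ramp MPC shifts spend toward later, more efficient periods and returns about 54.77 against 50.00 for even pacing. The numbers illustrate the mechanism only and say nothing about the magnitudes reported in the paper.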

If this is right

When return efficiencies contain known structure, MPC improves cumulative returns by shifting spend across periods to exploit the modeled trade-offs.
When return changes are stationary or purely stochastic, reactive policies achieve comparable performance without requiring a predictive model.
The value of MPC depends directly on how well the underlying return model matches the actual predictable component over the planning horizon.
The controlled simulation framework enables repeatable tests of budgeting performance as the degree of predictability in returns is varied.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Businesses using digital ad budgets could first test whether historical return data supports a usable predictive model before switching from reactive pacing to MPC.
The same distinction between predictable and unpredictable non-stationarity could guide control choices in other domains such as production scheduling or inventory replenishment.
If return models can be learned and updated from streaming data, the boundary between cases where MPC helps and where it does not may shift.

Load-bearing premise

An accurate model of the predictable structure in return efficiencies is available to the MPC controller.

What would settle it

Running the same simulations after deliberately supplying the MPC with an inaccurate or absent model of return dynamics and checking whether its performance advantage over reactive policies disappears.
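
A rough sketch of such a check, under the same kind of toy assumptions rather than the paper's simulator: returns follow a hypothetical square-root response, the true efficiency is a known ramp, and the controller is handed three different forecasts (exact, flat to stand in for an absent model, and reversed to stand in for a badly mismatched one). All names and numbers are illustrative.

```python
import math

def mpc_return(true_eff, forecast, budget):
    """Receding-horizon allocation planned with `forecast`, scored against `true_eff`.

    ASSUMPTION: return_t = efficiency_t * sqrt(spend_t), so each re-plan spends
    the remaining budget in proportion to the squared forecast efficiencies.
    """
    remaining, total = budget, 0.0
    for t, e in enumerate(true_eff):
        weights = [f * f for f in forecast[t:]]
        spend = remaining * weights[0] / sum(weights)
        total += e * math.sqrt(spend)
        remaining -= spend
    return total

def reactive_return(true_eff, budget):
    """Even pacing of the remaining budget, scored the same way."""
    remaining, total = budget, 0.0
    horizon = len(true_eff)
    for t, e in enumerate(true_eff):
        spend = remaining / (horizon - t)
        total += e * math.sqrt(spend)
        remaining -= spend
    return total

true_eff = [1.0, 2.0, 3.0, 4.0]                   # hypothetical true efficiency ramp
models = {
    "oracle": true_eff,                           # exact model
    "absent (flat)": [1.0] * 4,                   # no usable structure
    "mismatched (reversed)": [4.0, 3.0, 2.0, 1.0],
}
baseline = reactive_return(true_eff, 100.0)
gaps = {name: mpc_return(true_eff, f, 100.0) - baseline for name, f in models.items()}
for name, gap in gaps.items():
    print(f"{name}: MPC minus reactive = {gap:+.2f}")
```

With a flat forecast the planner reduces exactly to even pacing, so the gap is zero; with the reversed forecast it front-loads spend into the least efficient periods and falls below the reactive baseline. In this sketch, at least, the advantage tracks the quality of the supplied model, which is the pattern the proposed check would look for.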


Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

MPC for budget allocation only beats reactive policies when an accurate model captures predictable return structure, but the simulations may lean on oracle-level model knowledge.


The main takeaway is that model predictive control for finite-horizon budget allocation only outperforms reactive policies when return efficiency has predictable structure over the planning horizon that an underlying model can capture. In stationary or unpredictable-drift regimes, MPC shows no systematic advantage. That conditional result is the paper's clearest contribution.

It applies receding-horizon MPC to periodic budget allocation with execution noise and constraints, using simulations motivated by digital marketing. The comparison across different non-stationarity levels is a straightforward extension of existing MPC techniques, and the finding that non-stationarity by itself does not justify predictive control is a practical and honest point. It avoids overclaiming by showing where the method adds value through intertemporal trade-offs and where it does not. The work is grounded enough to be useful for readers thinking about resource allocation under changing returns.

The soft spot is the role of the model. The claimed advantage depends on the model accurately representing the predictable structure, yet the abstract gives no information on whether that model is provided perfectly to the MPC, estimated online from noisy data, or allowed to have mismatch. If the experiments supply the controller with oracle knowledge while the reactive baseline operates without it, the outperformance is unsurprising and reduces to information quality rather than a property of predictive control. Details on experimental design, metrics, number of runs, or statistical checks are also missing, which leaves the simulation results hard to evaluate for robustness.

This is the sort of paper that could interest people working on applied control in marketing or similar budgeting domains. A reader looking for guidance on when to move beyond reactive rules would get value from the regime distinctions. It deserves peer review because the core conditional claim is worth checking with fuller methods and more realistic model estimation, even if the current evidence needs strengthening on those points.

Referee Report

2 major / 1 minor

Summary. The manuscript frames finite-horizon budget allocation as a closed-loop control problem and compares receding-horizon MPC against reactive pacing policies under execution noise and operational constraints. Using controlled simulations motivated by digital marketing, it reports that non-stationarity alone does not favor MPC; systematic outperformance occurs only when return efficiency exhibits predictable structure over the planning horizon that is captured by an underlying model, allowing exploitation of intertemporal trade-offs.

Significance. If the central claim is supported by properly documented experiments, the work would usefully delineate when predictive control adds value in economic resource allocation, distinguishing structured predictability from mere non-stationarity. The controlled simulation approach is a positive feature for isolating regime-specific effects.

major comments (2)

[Abstract / Simulation Results] Abstract and simulation section: the reported performance differential across non-stationarity regimes is presented without any description of experimental design, performance metrics, number of Monte Carlo runs, statistical tests, or robustness checks. This leaves the central claim that MPC 'consistently outperforms' under predictable structure weakly supported.
[MPC formulation] MPC formulation and model section: the claim that MPC advantage requires an 'underlying model' that captures predictable return structure is load-bearing, yet the manuscript does not specify whether this model is supplied as an oracle, learned online from noisy observations, or subject to mismatch. Without this information the outperformance cannot be attributed to predictive control per se rather than model quality.

minor comments (1)

Notation for return efficiency and planning horizon could be introduced more explicitly at first use to aid readability.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and positive evaluation of the controlled simulation framework. We agree that the experimental documentation requires strengthening to better support the central claims, and we will revise the manuscript accordingly.


Referee: [Abstract / Simulation Results] Abstract and simulation section: the reported performance differential across non-stationarity regimes is presented without any description of experimental design, performance metrics, number of Monte Carlo runs, statistical tests, or robustness checks. This leaves the central claim that MPC 'consistently outperforms' under predictable structure weakly supported.

Authors: We agree that these details were insufficiently documented. In the revised manuscript we will add a dedicated 'Experimental Setup' subsection that specifies: the controlled generation of non-stationarity regimes, the performance metrics (cumulative return efficiency and regret to oracle), the number of Monte Carlo runs (50 independent trials per regime), statistical testing (paired Wilcoxon signed-rank tests with reported p-values), and robustness checks across noise levels and horizons. These additions will make the performance differentials fully reproducible and statistically grounded.
Revision: yes
Referee: [MPC formulation] MPC formulation and model section: the claim that MPC advantage requires an 'underlying model' that captures predictable return structure is load-bearing, yet the manuscript does not specify whether this model is supplied as an oracle, learned online from noisy observations, or subject to mismatch. Without this information the outperformance cannot be attributed to predictive control per se rather than model quality.

Authors: We acknowledge the ambiguity. The simulations employ an oracle model that perfectly encodes the known predictable structure in order to isolate the value of receding-horizon optimization; this was not stated explicitly. In revision we will add a clarifying paragraph in the MPC formulation section stating that the model is supplied as ground truth for these controlled experiments, and we will include a short discussion (with one additional sensitivity plot) on how performance degrades under model mismatch or online estimation. This will allow readers to attribute gains specifically to predictive control under ideal model conditions.
Revision: yes
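
The evaluation named in the first response (regret to oracle, and a paired Wilcoxon signed-rank test over 50 trials) could look roughly like the sketch below. It is an assumption-laden illustration, not the authors' protocol: the square-root response, the efficiency ramp, the 5% multiplicative noise, and the seed are hypothetical, and the test uses the plain normal approximation with no tie or continuity correction.

```python
import math
import random

def wilcoxon_signed_rank(diffs):
    """Two-sided Wilcoxon signed-rank test via the normal approximation.

    Returns (w_plus, w_minus, p_value). Zero differences are dropped and tied
    magnitudes receive average ranks; no tie or continuity correction is applied.
    """
    d = [x for x in diffs if x != 0.0]
    n = len(d)
    if n == 0:
        raise ValueError("need at least one non-zero paired difference")
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1          # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    w_minus = sum(r for r, x in zip(ranks, d) if x < 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return w_plus, w_minus, math.erfc(abs(z) / math.sqrt(2))

rng = random.Random(0)                              # fixed seed for repeatability
base = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]               # hypothetical predictable component
budget, trials = 100.0, 50
norm = sum(b * b for b in base)
model_spend = [budget * b * b / norm for b in base]  # plan from the predictable component
even_spend = [budget / len(base)] * len(base)        # reactive even pacing

diffs, regrets = [], []
for _ in range(trials):
    # Realized efficiency = predictable component times small unpredictable noise.
    eff = [b * (1 + rng.uniform(-0.05, 0.05)) for b in base]
    oracle = math.sqrt(budget * sum(e * e for e in eff))  # hindsight optimum under sqrt response
    mpc = sum(e * math.sqrt(s) for e, s in zip(eff, model_spend))
    reactive = sum(e * math.sqrt(s) for e, s in zip(eff, even_spend))
    diffs.append(mpc - reactive)
    regrets.append(oracle - mpc)

w_plus, w_minus, p_value = wilcoxon_signed_rank(diffs)
print(f"mean regret to oracle: {sum(regrets) / trials:.4f}")
print(f"W+={w_plus:.0f} W-={w_minus:.0f} p={p_value:.2e}")
```

Pairing matters here because both policies face the same realized efficiencies in each trial, so the test is applied to per-trial differences rather than to two independent samples. For real experiments a vetted statistics library would be a safer choice than this hand-rolled approximation.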

Circularity Check

0 steps flagged

No significant circularity; MPC advantage tied to explicit model availability in controlled simulations


The paper's central result—that MPC outperforms reactive policies only when return efficiency has predictable structure captured by an underlying model—is demonstrated through controlled simulations with varying non-stationarity. This comparison is self-contained: the model is an input to the MPC formulation (as is standard for model-based control), while reactive baselines lack it by design. No derivation step reduces a claimed prediction to a fitted parameter or self-citation by construction, and the abstract explicitly conditions the advantage on model presence rather than asserting unconditional superiority. The evaluation framework is externally falsifiable via the simulation environments described.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the simulation framework and 'underlying model' are referenced but not formalized.


