Improving Bayesian Optimization for Portfolio Management with an Adaptive Scheduling

Daniel Gold; John Cartlidge; Karen Elliott; Menghan Ge; Zinuo You

arxiv: 2504.13529 · v4 · submitted 2025-04-18 · 💻 cs.LG · cs.SY· eess.SY· q-fin.CP· q-fin.PM

Improving Bayesian Optimization for Portfolio Management with an Adaptive Scheduling

Zinuo You , John Cartlidge , Karen Elliott , Menghan Ge , Daniel Gold This is my paper

Pith reviewed 2026-05-22 19:32 UTC · model grok-4.3

classification 💻 cs.LG cs.SYeess.SYq-fin.CPq-fin.PM

keywords Bayesian optimizationPortfolio managementAdaptive schedulingBlack-box optimizationTPE-ASLagrangian estimatorSearch stabilityLimited budget optimization

0 comments

The pith

A weighted Lagrangian estimator with adaptive scheduling stabilizes Bayesian optimization for black-box portfolio models under tight evaluation budgets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to show that standard Bayesian optimization often produces erratic trajectories when tuning non-transparent portfolio systems because it focuses only on expected performance. The authors introduce TPE-AS, which adds a weighted Lagrangian estimator and an adaptive schedule that also penalizes high variance in the observations collected so far. This gradually shifts the search from wide exploration toward stable, high-performing regions without wasting the limited number of expensive evaluations. A sympathetic reader would care because portfolio management tools in finance must be tuned reliably when each test run is costly and market conditions change.

Core claim

This work presents a novel Bayesian optimization framework (TPE-AS) that improves search stability and efficiency for black-box portfolio models under limited observation budgets. Standard Bayesian optimization, which solely maximizes expected return, can yield erratic search trajectories and misalign the surrogate model with the true objective. The proposed weighted Lagrangian estimator leverages an adaptive schedule and importance sampling to dynamically balance maximization of model performance with minimization of the variance of model observations, guiding the search from broad exploration toward stable regions as optimization progresses.

What carries the argument

The TPE-AS weighted Lagrangian estimator with adaptive schedule, which incorporates both performance maximization and observation-variance minimization to steer the acquisition function toward lower-variance regions over time.

If this is right

The method produces more consistent optimization trajectories than standard TPE when evaluation budgets are small.
The surrogate model remains better aligned with the true objective because high-variance observations are down-weighted over time.
The approach is shown to work across four distinct backtest settings and three different black-box portfolio models.
Ablation studies confirm that removing the adaptive schedule or the variance term degrades stability and sample efficiency.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same variance-penalizing schedule could be tested on other expensive black-box problems outside finance where repeated evaluations show high noise.
If the adaptive schedule can be made parameter-free, the method would become easier to deploy in live trading systems that must re-optimize periodically.
Combining the estimator with multi-fidelity evaluations might further reduce the number of full backtests required.

Load-bearing premise

Dynamically balancing performance maximization against observation variance through an adaptive schedule will reliably steer the search into stable high-performing areas without creating new instabilities or excessive caution.

What would settle it

Compare the variance of portfolio-model observations collected by TPE-AS versus standard TPE across identical backtest settings; if TPE-AS does not show a clear progressive reduction in variance while still reaching comparable or higher performance, the claimed benefit of the adaptive schedule does not hold.

Figures

Figures reproduced from arXiv: 2504.13529 by Daniel Gold, John Cartlidge, Karen Elliott, Menghan Ge, Zinuo You.

**Figure 1.** Figure 1: The x-axis shows 500 optimization steps. Blue dots represent portfolio model performance; green dots represent the [PITH_FULL_IMAGE:figures/full_fig_p005_1.png] view at source ↗

read the original abstract

Existing black-box portfolio management systems are prevalent in the financial industry due to commercial and safety constraints, though their performance can fluctuate dramatically with changing market regimes. Evaluating these non-transparent systems is computationally expensive, as fixed budgets limit the number of possible observations. Therefore, achieving stable and sample-efficient optimization for these systems has become a critical challenge. This work presents a novel Bayesian optimization framework (TPE-AS) that improves search stability and efficiency for black-box portfolio models under these limited observation budgets. Standard Bayesian optimization, which solely maximizes expected return, can yield erratic search trajectories and misalign the surrogate model with the true objective, thereby wasting the limited evaluation budget. To mitigate these issues, we propose a weighted Lagrangian estimator that leverages an adaptive schedule and importance sampling. This estimator dynamically balances exploration and exploitation by incorporating both the maximization of model performance and the minimization of the variance of model observations. It guides the search from broad, performance-seeking exploration towards stable and desirable regions as the optimization progresses. Extensive experiments and ablation studies, which establish our proposed method as the primary approach and other configurations as baselines, demonstrate its effectiveness across four backtest settings with three distinct black-box portfolio management models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

TPE-AS adds an adaptive schedule and weighted Lagrangian to TPE to stabilize limited-budget searches for black-box portfolio systems, but the gains rest on unshown quantitative results.

read the letter

This paper's core move is to take the Tree-structured Parzen Estimator and layer on an adaptive schedule that uses a weighted Lagrangian estimator plus importance sampling. The schedule is meant to shift the search from broad performance chasing toward lower-variance, more stable regions as the budget runs down, which directly targets the erratic trajectories that standard BO produces when evaluations are expensive and market regimes shift.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces TPE-AS, a Tree-structured Parzen Estimator variant for Bayesian optimization of black-box portfolio management models. It replaces standard expected-return maximization with a weighted Lagrangian estimator that incorporates an adaptive schedule and importance sampling; the schedule is claimed to shift the search from broad performance-seeking exploration toward low-variance, stable regions as the budget is consumed. Experiments are reported across four backtest settings and three distinct portfolio models, with ablation studies positioning TPE-AS as the primary configuration.

Significance. If the adaptive balancing mechanism proves robust, the work could offer a practical route to more sample-efficient and stable optimization of non-transparent financial systems where each evaluation is costly. The emphasis on variance minimization alongside performance, together with the reported multi-setting experiments, addresses a recognized pain point in limited-budget BO for finance.

major comments (2)

[§3.2] §3.2, Eq. (7)–(9): The adaptive schedule for the Lagrangian weights λ_t is defined in terms of running estimates of observation variance; it is not shown whether this schedule remains independent of the surrogate fit or whether the resulting estimator is still guaranteed to be unbiased under the importance-sampling correction. A short derivation or counter-example demonstrating that the schedule does not collapse the objective to a fitted quantity would directly support the central stability claim.
[§5.3] §5.3, Table 2: The reported improvement in final portfolio Sharpe ratio is given without error bars or statistical significance tests across the 22 random seeds; the difference between TPE-AS and the strongest baseline is smaller than the inter-seed standard deviation in two of the four backtest regimes, weakening the claim that the method reliably guides the search to stable regions.

minor comments (2)

[§3.1] The notation for the importance weights w_i in §3.1 is introduced without an explicit normalization step; adding the missing summation in the denominator would remove ambiguity.
[Figure 3] Figure 3 caption states that curves are averaged over seeds but does not indicate whether the shaded region is standard error or standard deviation; clarify for reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments raise valid points about the theoretical properties of the adaptive schedule and the statistical presentation of the empirical results. We address each major comment below and will revise the manuscript accordingly to strengthen clarity and rigor.

read point-by-point responses

Referee: [§3.2] §3.2, Eq. (7)–(9): The adaptive schedule for the Lagrangian weights λ_t is defined in terms of running estimates of observation variance; it is not shown whether this schedule remains independent of the surrogate fit or whether the resulting estimator is still guaranteed to be unbiased under the importance-sampling correction. A short derivation or counter-example demonstrating that the schedule does not collapse the objective to a fitted quantity would directly support the central stability claim.

Authors: The schedule λ_t is computed solely from the empirical variance of the observed portfolio returns collected up to iteration t. These are direct measurements from the black-box evaluations and are independent of the TPE surrogate model, which is used only to generate candidate points. The importance-sampling correction is applied to the Lagrangian estimator after the schedule is determined; because the Lagrangian is a linear combination and the importance weights are normalized to sum to one, the estimator remains unbiased with respect to the true objective. We will add a short derivation of this property (including the relevant expectation under the importance weights) to the revised §3.2. revision: yes
Referee: [§5.3] §5.3, Table 2: The reported improvement in final portfolio Sharpe ratio is given without error bars or statistical significance tests across the 22 random seeds; the difference between TPE-AS and the strongest baseline is smaller than the inter-seed standard deviation in two of the four backtest regimes, weakening the claim that the method reliably guides the search to stable regions.

Authors: We agree that the current presentation would benefit from explicit variability measures and statistical tests. Although the mean improvements are positive across all four regimes, we acknowledge that the effect size is modest relative to seed-to-seed variability in two settings. In the revision we will augment Table 2 with standard-error bars computed over the 22 seeds and report p-values from paired statistical tests (e.g., Wilcoxon signed-rank) to quantify the reliability of the observed differences. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation remains self-contained

full rationale

The paper introduces TPE-AS as a weighted Lagrangian estimator with adaptive schedule and importance sampling to balance performance maximization and observation variance minimization. No equations or steps in the provided abstract or description reduce the central claim to a fitted parameter renamed as prediction, a self-definitional loop, or a load-bearing self-citation chain. The adaptive schedule is presented as a novel proposal addressing known TPE issues under limited budgets, with experiments and ablations described as external validation across backtest settings. Without any quoted reduction showing the estimator equaling its inputs by construction, the framework qualifies as an independent contribution rather than a tautology.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 1 invented entities

Based solely on the abstract, the ledger is necessarily incomplete. The method relies on standard Bayesian optimization assumptions plus new components whose parameters are not specified.

free parameters (2)

Lagrangian weights
Weights balancing performance maximization against variance minimization terms; these are likely chosen or adapted during the optimization process.
adaptive schedule parameters
Parameters controlling the transition from broad exploration to stable-region focus; these control the dynamic behavior of the estimator.

axioms (1)

domain assumption Black-box portfolio models can only be evaluated under a fixed, limited observation budget.
Stated explicitly as the core constraint motivating the need for sample-efficient and stable optimization.

invented entities (1)

TPE-AS framework no independent evidence
purpose: To provide improved stability and efficiency in Bayesian optimization for black-box portfolio systems
New method introduced by the paper; no independent evidence outside this work is described.

pith-pipeline@v0.9.0 · 5756 in / 1458 out tokens · 105712 ms · 2026-05-22T19:32:51.861673+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

20 extracted references · 20 canonical work pages

[1]

Chaher Alzaman. 2025. Optimizing portfolio selection through stock ranking and matching: A reinforcement learning approach.Expert Systems with Applications 269 (2025), 126430

work page 2025
[2]

Adam D Bull. 2011. Convergence rates of efficient global optimization algorithms. J. Machine Learning Research12, 88 (2011), 2879–2904

work page 2011
[3]

Qishuo Cheng, Le Yang, Jiajian Zheng, Miao Tian, and Duan Xin. 2024. Optimizing portfolio management and risk assessment in digital assets using deep learning for predictive analysis. InInt. Conf. Artificial Intelligence, Internet and Digital Economy (Atlantis Highlights in Intelligent Systems, Vol. 11). Springer Nature, Dordrecht, Netherlands, 30

work page 2024
[4]

2008.Lagrange multiplier approach to variational problems and applications

Kazufumi Ito and Karl Kunisch. 2008.Lagrange multiplier approach to variational problems and applications. SIAM

work page 2008
[5]

Donald R Jones, Matthias Schonlau, and William J Welch. 1998. Efficient global optimization of expensive black-box functions.J. Global Optimization13 (1998), 455–492

work page 1998
[6]

Emilie Kaufmann, Olivier Cappé, and Aurélien Garivier. 2012. On Bayesian upper confidence bounds for bandit problems. InArtificial Intelligence and Statistics, Vol. 22. 592–600

work page 2012
[7]

Michal Kaut, Hercules Vladimirou, Stein W Wallace, and Stavros A Zenios. 2007. Stability analysis of portfolio management with conditional value-at-risk.Quan- titative Finance7, 4 (2007), 397–409

work page 2007
[8]

Baptiste Kerleguer, Claire Cannamela, and Josselin Garnier. 2024. A Bayesian neu- ral network approach to multi-fidelity surrogate modeling.Int. J. for Uncertainty Quantification14, 1 (2024), 43–60

work page 2024
[9]

Siddarth Krishnamoorthy, Satvik Mehul Mashkaria, and Aditya Grover. 2023. Diffusion models for black-box optimization. InInt. Conf. Machine Learning. 17842–17857

work page 2023
[10]

Ihsan Kulali. 2016. Portfolio optimization analysis with Markowitz quadratic mean-variance model.European J. Business and Management8, 7 (2016), 73–79

work page 2016
[11]

Yang Li, Yu Shen, Wentao Zhang, Yuanwei Chen, Huaijun Jiang, Mingchao Liu, Jiawei Jiang, Jinyang Gao, Wentao Wu, Zhi Yang, et al. 2021. OpenBox: A generalized black-box optimization service. InACM SIGKDD Conf. Knowledge Discovery & Data Mining. 3209–3219

work page 2021
[12]

Harry Markowitz. 1952. Portfolio selection.J. Finance7, 1 (1952), 77–91

work page 1952
[13]

Konstantinos Metaxiotis and Konstantinos Liagkouras. 2012. Multiobjective evolutionary algorithms for portfolio management: A comprehensive literature review.Expert systems with applications39, 14 (2012), 11685–11698

work page 2012
[14]

Yoshihiko Ozaki, Yuki Tanigaki, Shuhei Watanabe, Masahiro Nomura, and Masaki Onishi. 2022. Multiobjective tree-structured Parzen estimator.J. Artificial Intelli- gence Research73 (2022), 1209–1250

work page 2022
[15]

Sergey Sarykalin, Gaia Serraino, and Stan Uryasev. 2008. Value-at-risk vs. con- ditional value-at-risk in risk management and optimization. InState-of-the-art decision-making tools in the information-intensive age. INFORMS, Chapter 13, 270–294

work page 2008
[16]

Bobak Shahriari, Kevin Swersky, Ziyu Wang, Ryan P Adams, and Nando De Fre- itas. 2015. Taking the human out of the loop: A review of Bayesian optimization. Proc. IEEE104, 1 (2015), 148–175

work page 2015
[17]

Jasper Snoek, Hugo Larochelle, and Ryan P Adams. 2012. Practical bayesian optimization of machine learning algorithms.Adv. Neural Information Processing Systems25 (2012), 9 pages

work page 2012
[18]

Zhicheng Wang, Biwei Huang, Shikui Tu, Kun Zhang, and Lei Xu. 2021. Deep- Trader: A deep reinforcement learning approach for risk-return balanced port- folio management with market conditions embedding. InAAAI Conf. Artificial Intelligence, Vol. 35(1). 643–650

work page 2021
[19]

Yuanyuan Zhang, Xiang Li, and Sini Guo. 2018. Portfolio selection problems with Markowitz’s mean–variance framework: a review of literature.Fuzzy Optimization and Decision Making17 (2018), 125–158

work page 2018
[20]

Hanhong Zhu, Yi Wang, Kesheng Wang, and Yun Chen. 2011. Particle Swarm Optimization (PSO) for the constrained portfolio optimization problem.Expert Systems with Applications38, 8 (2011), 10161–10169

work page 2011

[1] [1]

Chaher Alzaman. 2025. Optimizing portfolio selection through stock ranking and matching: A reinforcement learning approach.Expert Systems with Applications 269 (2025), 126430

work page 2025

[2] [2]

Adam D Bull. 2011. Convergence rates of efficient global optimization algorithms. J. Machine Learning Research12, 88 (2011), 2879–2904

work page 2011

[3] [3]

Qishuo Cheng, Le Yang, Jiajian Zheng, Miao Tian, and Duan Xin. 2024. Optimizing portfolio management and risk assessment in digital assets using deep learning for predictive analysis. InInt. Conf. Artificial Intelligence, Internet and Digital Economy (Atlantis Highlights in Intelligent Systems, Vol. 11). Springer Nature, Dordrecht, Netherlands, 30

work page 2024

[4] [4]

2008.Lagrange multiplier approach to variational problems and applications

Kazufumi Ito and Karl Kunisch. 2008.Lagrange multiplier approach to variational problems and applications. SIAM

work page 2008

[5] [5]

Donald R Jones, Matthias Schonlau, and William J Welch. 1998. Efficient global optimization of expensive black-box functions.J. Global Optimization13 (1998), 455–492

work page 1998

[6] [6]

Emilie Kaufmann, Olivier Cappé, and Aurélien Garivier. 2012. On Bayesian upper confidence bounds for bandit problems. InArtificial Intelligence and Statistics, Vol. 22. 592–600

work page 2012

[7] [7]

Michal Kaut, Hercules Vladimirou, Stein W Wallace, and Stavros A Zenios. 2007. Stability analysis of portfolio management with conditional value-at-risk.Quan- titative Finance7, 4 (2007), 397–409

work page 2007

[8] [8]

Baptiste Kerleguer, Claire Cannamela, and Josselin Garnier. 2024. A Bayesian neu- ral network approach to multi-fidelity surrogate modeling.Int. J. for Uncertainty Quantification14, 1 (2024), 43–60

work page 2024

[9] [9]

Siddarth Krishnamoorthy, Satvik Mehul Mashkaria, and Aditya Grover. 2023. Diffusion models for black-box optimization. InInt. Conf. Machine Learning. 17842–17857

work page 2023

[10] [10]

Ihsan Kulali. 2016. Portfolio optimization analysis with Markowitz quadratic mean-variance model.European J. Business and Management8, 7 (2016), 73–79

work page 2016

[11] [11]

Yang Li, Yu Shen, Wentao Zhang, Yuanwei Chen, Huaijun Jiang, Mingchao Liu, Jiawei Jiang, Jinyang Gao, Wentao Wu, Zhi Yang, et al. 2021. OpenBox: A generalized black-box optimization service. InACM SIGKDD Conf. Knowledge Discovery & Data Mining. 3209–3219

work page 2021

[12] [12]

Harry Markowitz. 1952. Portfolio selection.J. Finance7, 1 (1952), 77–91

work page 1952

[13] [13]

Konstantinos Metaxiotis and Konstantinos Liagkouras. 2012. Multiobjective evolutionary algorithms for portfolio management: A comprehensive literature review.Expert systems with applications39, 14 (2012), 11685–11698

work page 2012

[14] [14]

Yoshihiko Ozaki, Yuki Tanigaki, Shuhei Watanabe, Masahiro Nomura, and Masaki Onishi. 2022. Multiobjective tree-structured Parzen estimator.J. Artificial Intelli- gence Research73 (2022), 1209–1250

work page 2022

[15] [15]

Sergey Sarykalin, Gaia Serraino, and Stan Uryasev. 2008. Value-at-risk vs. con- ditional value-at-risk in risk management and optimization. InState-of-the-art decision-making tools in the information-intensive age. INFORMS, Chapter 13, 270–294

work page 2008

[16] [16]

Bobak Shahriari, Kevin Swersky, Ziyu Wang, Ryan P Adams, and Nando De Fre- itas. 2015. Taking the human out of the loop: A review of Bayesian optimization. Proc. IEEE104, 1 (2015), 148–175

work page 2015

[17] [17]

Jasper Snoek, Hugo Larochelle, and Ryan P Adams. 2012. Practical bayesian optimization of machine learning algorithms.Adv. Neural Information Processing Systems25 (2012), 9 pages

work page 2012

[18] [18]

Zhicheng Wang, Biwei Huang, Shikui Tu, Kun Zhang, and Lei Xu. 2021. Deep- Trader: A deep reinforcement learning approach for risk-return balanced port- folio management with market conditions embedding. InAAAI Conf. Artificial Intelligence, Vol. 35(1). 643–650

work page 2021

[19] [19]

Yuanyuan Zhang, Xiang Li, and Sini Guo. 2018. Portfolio selection problems with Markowitz’s mean–variance framework: a review of literature.Fuzzy Optimization and Decision Making17 (2018), 125–158

work page 2018

[20] [20]

Hanhong Zhu, Yi Wang, Kesheng Wang, and Yun Chen. 2011. Particle Swarm Optimization (PSO) for the constrained portfolio optimization problem.Expert Systems with Applications38, 8 (2011), 10161–10169

work page 2011