Learning Threshold-Type Investment Strategies with Stochastic Gradient Method

Mikl\'os R\'asonyi; Zsolt Nika

arxiv: 1907.02457 · v1 · pith:JVXIP6GOnew · submitted 2019-07-04 · 💱 q-fin.PM · q-fin.CP

Learning Threshold-Type Investment Strategies with Stochastic Gradient Method

Zsolt Nika , Mikl\'os R\'asonyi This is my paper

Pith reviewed 2026-05-25 02:18 UTC · model grok-4.3

classification 💱 q-fin.PM q-fin.CP

keywords portfolio optimizationstochastic gradientthreshold strategieslog-optimal growthKiefer-Wolfowitz algorithmonline learningbuy-and-sell rules

0 comments

The pith

The Kiefer-Wolfowitz stochastic gradient method converges to the log-optimal solution in the class of threshold-type buy-and-sell strategies.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper shows that the Kiefer-Wolfowitz stochastic gradient method can be applied to parametrized threshold-type buy-and-sell strategies in online portfolio optimization. It establishes convergence to the log-optimal strategy within this restricted class and verifies the result through numerical tests on price models that include stochastic volatility and long memory. The work also examines how to select the algorithm's own hyperparameters from limited price samples alone. A reader would care because the method offers a concrete way to improve trading rules adaptively as new price data arrives.

Core claim

The central claim is that the Kiefer--Wolfowitz version of the Stochastic Gradient method converges to the log-optimal solution in the threshold-type, buy-and-sell strategy class. The authors demonstrate both theoretically and numerically that an optimal threshold strategy exists for the considered price processes and that the algorithm reaches it.

What carries the argument

Kiefer-Wolfowitz stochastic gradient updates applied to the parameters of threshold-type buy-and-sell strategies.

If this is right

There exists an optimal threshold-type strategy that the method can learn for the tested price dynamics.
Numerical experiments confirm convergence of the algorithm to this optimum.
Hyperparameters of the method can be chosen from a small sample of observed prices.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same gradient approach could be tested on other restricted families of trading rules beyond simple thresholds.
The technique supplies a practical route to adapt trading thresholds in real time as market data streams in.
It links stochastic approximation methods from statistics to the problem of online wealth maximization.

Load-bearing premise

An optimal threshold-type strategy exists for the price processes considered and can be reached by the stochastic gradient updates.

What would settle it

A price process with stochastic volatility on which the algorithm either diverges or converges to a strategy whose long-run growth rate is strictly below the best threshold strategy.

Figures

Figures reproduced from arXiv: 1907.02457 by Mikl\'os R\'asonyi, Zsolt Nika.

**Figure 1.** Figure 1: Function θ → g(θ). Both plot shows θ values on the 0.01 and 0.99 percentile range of Ht. 4. P∞ t=1 a 2 t c −2 t < ∞. A usual first guess choice is at = t −1 and ct = t −1/3 . Analyzing the growth function g(θ) in the univariate case help us to construct the step-sizes in a suitable way [PITH_FULL_IMAGE:figures/full_fig_p008_1.png] view at source ↗

**Figure 2.** Figure 2: Convergence of the Kiefer–Wolfowitz algorithm in [PITH_FULL_IMAGE:figures/full_fig_p009_2.png] view at source ↗

**Figure 3.** Figure 3: Investigating the effect of different scaling for two different parametrization of AR(1). For the [PITH_FULL_IMAGE:figures/full_fig_p010_3.png] view at source ↗

**Figure 4.** Figure 4: Investigating the effect of different scaling for two different parametrization of DGSV. For the [PITH_FULL_IMAGE:figures/full_fig_p010_4.png] view at source ↗

read the original abstract

In online portfolio optimization the investor makes decisions based on new, continuously incoming information on financial assets (typically their prices). In our study we consider a learning algorithm, namely the Kiefer--Wolfowitz version of the Stochastic Gradient method, that converges to the log-optimal solution in the threshold-type, buy-and-sell strategy class. The systematic study of this method is novel in the field of portfolio optimization; we aim to establish the theory and practice of Stochastic Gradient algorithm used on parametrized trading strategies. We demonstrate on a wide variety of stock price dynamics (e.g. with stochastic volatility and long-memory) that there is an optimal threshold type strategy which can be learned. Subsequently, we numerically show the convergence of the algorithm. Furthermore, we deal with the typically problematic question of how to choose the hyperparameters (the parameters of the algorithm and not the dynamics of the prices) without knowing anything about the price other than a small sample.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper numerically applies Kiefer-Wolfowitz to learn thresholds in a simple buy-sell class and shows it tracks the log-optimal on several simulated dynamics, but supplies no proof that the objective is unimodal.

read the letter

The core contribution is a concrete implementation of the Kiefer-Wolfowitz stochastic approximation on low-dimensional threshold parameters for online portfolio selection. They run the algorithm on price paths that include stochastic volatility and long-memory processes, report that it reaches a good threshold pair, and give a practical rule for choosing step sizes and perturbation sizes from a short initial sample of prices only. That last piece is useful because most stochastic approximation papers leave hyperparameter tuning to the user. The numerics appear to work on the examples they chose, and the target is the external log-optimal growth rate rather than an internally fitted quantity, so there is no obvious circularity. The systematic treatment of this particular method inside the threshold strategy class is new enough for the subfield. The main limitation is the missing analytic step. Kiefer-Wolfowitz convergence requires that the mean field has a unique stable point and that the finite-difference estimator behaves well; the paper does not verify that the expected log-wealth surface is unimodal in the two thresholds for the processes they consider. If volatility clustering or fractional noise creates secondary local maxima, the iterates can settle at a suboptimal pair. The numerical evidence does not rule this out, and no error bounds or almost-sure convergence statements are given. The work is therefore best read as an empirical demonstration rather than a completed theory. Readers already working on parametrized trading rules or stochastic approximation in finance will find the hyperparameter section and the range of test dynamics worth a look. It is solid enough to go to referees; the application is narrow but the practical details are honest and the experiments are reproducible from the description.

Referee Report

3 major / 2 minor

Summary. The paper claims that the Kiefer-Wolfowitz stochastic gradient method converges to the log-optimal solution within the class of threshold-type buy-and-sell strategies for online portfolio optimization. It aims to establish both the theory and practice of this approach, demonstrates numerically that optimal threshold strategies exist and can be learned for a range of price dynamics (including stochastic volatility and long-memory processes), shows convergence of the algorithm on selected paths, and addresses hyperparameter selection from small samples without detailed knowledge of the underlying price process.

Significance. If the convergence holds, the work would provide a practical, low-dimensional parametrization for learning trading strategies via stochastic approximation, extending these methods to finance with numerical evidence across diverse dynamics. The systematic numerical study on multiple processes is a clear strength and could be useful for practitioners, though the lack of theoretical verification or error bounds limits broader impact.

major comments (3)

[Abstract and theoretical sections] Abstract and theoretical development: the central claim that the KW method converges to the global log-optimum requires that the mean-field objective E[log-wealth] has a unique stable equilibrium under the considered dynamics, but no analytic verification, proof sketch, or check for unimodality/saddle points is supplied for fractional Brownian or Heston-type processes, contrary to the conditions of standard KW theorems (Kushner-Clark, Borkar). This is load-bearing for the asserted convergence.
[Numerical experiments] Numerical experiments: convergence is shown on selected paths, but no error bounds, bias analysis of the finite-difference gradient estimator, step-size conditions, or data-exclusion details are provided, leaving the reliability of the reported results for long-memory and SV dynamics unverified.
[Hyperparameter selection discussion] Hyperparameter choice: the procedure for selecting algorithm parameters from a small sample without knowledge of the price dynamics is presented as a solution to a typically problematic issue, but lacks robustness checks or explicit criteria that would allow reproduction or assessment of sensitivity.

minor comments (2)

Notation for the threshold parameters and the wealth process could be clarified with explicit definitions early in the manuscript to improve readability.
Figure legends and axis labels in the numerical results would benefit from additional detail on the specific dynamics and sample sizes used.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the constructive comments. We address each major point below, with honest assessment of what can be revised and what remains a limitation of the current work.

read point-by-point responses

Referee: [Abstract and theoretical sections] Abstract and theoretical development: the central claim that the KW method converges to the global log-optimum requires that the mean-field objective E[log-wealth] has a unique stable equilibrium under the considered dynamics, but no analytic verification, proof sketch, or check for unimodality/saddle points is supplied for fractional Brownian or Heston-type processes, contrary to the conditions of standard KW theorems (Kushner-Clark, Borkar). This is load-bearing for the asserted convergence.

Authors: We agree that the manuscript does not supply analytic verification or a proof sketch that the mean-field objective possesses a unique stable equilibrium for fractional Brownian motion or Heston dynamics. The work applies the standard Kiefer-Wolfowitz framework and demonstrates convergence numerically across these processes, but does not establish the required conditions analytically. We will revise the abstract and theoretical sections to state explicitly that convergence is shown numerically under the maintained assumption that the standard KW conditions hold, without claiming a full theoretical guarantee for the non-Markovian and stochastic-volatility cases. revision: yes
Referee: [Numerical experiments] Numerical experiments: convergence is shown on selected paths, but no error bounds, bias analysis of the finite-difference gradient estimator, step-size conditions, or data-exclusion details are provided, leaving the reliability of the reported results for long-memory and SV dynamics unverified.

Authors: We acknowledge that the numerical section lacks formal error bounds, bias analysis of the finite-difference estimator, and explicit step-size or data-exclusion protocols. The experiments are designed to illustrate practical behavior on representative paths rather than to deliver rigorous statistical guarantees. We will add a subsection discussing step-size selection heuristics, the form of the finite-difference perturbation, and the path-sampling procedure used, while noting that full error bounds for these dynamics lie beyond the paper's scope. revision: partial
Referee: [Hyperparameter selection discussion] Hyperparameter choice: the procedure for selecting algorithm parameters from a small sample without knowledge of the price dynamics is presented as a solution to a typically problematic issue, but lacks robustness checks or explicit criteria that would allow reproduction or assessment of sensitivity.

Authors: The hyperparameter procedure is presented as a pragmatic, data-driven heuristic. To improve reproducibility we will expand the relevant section with explicit selection criteria, a sensitivity table showing performance variation under small perturbations of the chosen values, and additional numerical checks on a second independent sample. revision: yes

standing simulated objections not resolved

Analytic verification (or even a proof sketch) that the mean-field objective E[log-wealth] possesses a unique stable equilibrium for fractional Brownian motion and Heston-type price dynamics under the Kiefer-Wolfowitz conditions

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained numerical convergence analysis

full rationale

The paper applies the Kiefer-Wolfowitz stochastic gradient algorithm to a low-dimensional parametrization of threshold-type strategies and numerically demonstrates convergence on simulated paths for various price processes. The target (log-optimal strategy) is defined externally via expected log-wealth maximization, not constructed from the algorithm's outputs or self-citations. No equations reduce the claimed optimum to a fitted quantity inside the paper, no uniqueness theorem is imported from the authors' prior work, and no ansatz or renaming is used to smuggle in the result. The central claim rests on standard stochastic approximation theory plus empirical verification rather than self-referential definitions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only review; ledger entries are inferred from stated assumptions rather than explicit derivations.

axioms (1)

domain assumption Existence of a log-optimal threshold-type strategy for the considered price dynamics
Required for the convergence claim to be meaningful.

pith-pipeline@v0.9.0 · 5692 in / 1022 out tokens · 15515 ms · 2026-05-25T02:18:02.536765+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

16 extracted references

[1]

Academic Press, 2010

Kendall Kim.Electronic and algorithmic trading technology: the complete guide. Academic Press, 2010

2010
[2]

High-frequency trading: a practical guide to algorithmic strategies and trading systems, volume 604

Irene Aldridge. High-frequency trading: a practical guide to algorithmic strategies and trading systems, volume 604. John Wiley & Sons, 2013

2013
[3]

Online portfolio selection: A survey.ACM Computing Surveys (CSUR), 46(3):35, 2014

Bin Li and Steven CH Hoi. Online portfolio selection: A survey.ACM Computing Surveys (CSUR), 46(3):35, 2014

2014
[4]

John Wiley & Sons, 2018

Marcos Lopez De Prado.Advances in ﬁnancial machine learning. John Wiley & Sons, 2018

2018
[5]

A stochastic approximation method.The annals of mathematical statistics, pages 400–407, 1951

Herbert Robbins and Sutton Monro. A stochastic approximation method.The annals of mathematical statistics, pages 400–407, 1951

1951
[6]

Stochastic estimation of the maximum of a regression function

Jack Kiefer, Jacob Wolfowitz, et al. Stochastic estimation of the maximum of a regression function. The Annals of Mathematical Statistics, 23(3):462–466, 1952

1952
[7]

A stochastic approximation algorithm for american lookback put options

Zhenhua Zhang, G Yin, and Zhian Liang. A stochastic approximation algorithm for american lookback put options. Stochastic Analysis and Applications, 29(2):332–351, 2011

2011
[8]

Stock liquidation via stochastic approximation using nasdaq daily and intra-day data

G Yin, Qing Zhang, F Liu, RH Liu, and Y Cheng. Stock liquidation via stochastic approximation using nasdaq daily and intra-day data. Mathematical Finance: An International Journal of Mathematics, Statistics and Financial Economics, 16(1):217–236, 2006

2006
[9]

Algorithms for cvar optimization in mdps

Yinlam Chow and Mohammad Ghavamzadeh. Algorithms for cvar optimization in mdps. InAdvances in neural information processing systems, pages 3509–3517, 2014

2014
[10]

Stochasticapproximationwithaveraginginnovationappliedtoﬁnance

SophieLaruelleandGillesPagès. Stochasticapproximationwithaveraginginnovationappliedtoﬁnance. Monte Carlo Methods and Applications, 18(1):1–51, 2012

2012
[11]

Discrete approximation in quantile problem of portfolio selection

Andrey Kibzun and Riho Lepp. Discrete approximation in quantile problem of portfolio selection. In Stochastic Optimization: Algorithms and Applications, pages 121–135. Springer, 2001

2001
[12]

Optimal split of orders across liquidity pools: a stochastic algorithm approach.SIAM Journal on Financial Mathematics, 2(1):1042–1076, 2011

Sophie Laruelle, Charles-Albert Lehalle, and Gilles Pages. Optimal split of orders across liquidity pools: a stochastic algorithm approach.SIAM Journal on Financial Mathematics, 2(1):1042–1076, 2011

2011
[13]

Empirical properties of asset returns: stylized facts and statistical issues.Quantitative Finance, pages 223–236, 2001

Rama Cont. Empirical properties of asset returns: stylized facts and statistical issues.Quantitative Finance, pages 223–236, 2001

2001
[14]

Asymptotic optimality and asymptotic equipartition properties of log-optimum investment.The Annals of Probability, 16(2):876–898, 1988

Paul H Algoet, Thomas M Cover, et al. Asymptotic optimality and asymptotic equipartition properties of log-optimum investment.The Annals of Probability, 16(2):876–898, 1988

1988
[15]

Log-optimal portfolios with memory eﬀect

Zsolt Nika and Miklos Rasonyi. Log-optimal portfolios with memory eﬀect. Applied Mathematical Finance, pages 1–29, 2018

2018
[16]

Cambridge university press, 2005

A Colin Cameron and Pravin K Trivedi.Microeconometrics: methods and applications. Cambridge university press, 2005. 11

2005

[1] [1]

Academic Press, 2010

Kendall Kim.Electronic and algorithmic trading technology: the complete guide. Academic Press, 2010

2010

[2] [2]

High-frequency trading: a practical guide to algorithmic strategies and trading systems, volume 604

Irene Aldridge. High-frequency trading: a practical guide to algorithmic strategies and trading systems, volume 604. John Wiley & Sons, 2013

2013

[3] [3]

Online portfolio selection: A survey.ACM Computing Surveys (CSUR), 46(3):35, 2014

Bin Li and Steven CH Hoi. Online portfolio selection: A survey.ACM Computing Surveys (CSUR), 46(3):35, 2014

2014

[4] [4]

John Wiley & Sons, 2018

Marcos Lopez De Prado.Advances in ﬁnancial machine learning. John Wiley & Sons, 2018

2018

[5] [5]

A stochastic approximation method.The annals of mathematical statistics, pages 400–407, 1951

Herbert Robbins and Sutton Monro. A stochastic approximation method.The annals of mathematical statistics, pages 400–407, 1951

1951

[6] [6]

Stochastic estimation of the maximum of a regression function

Jack Kiefer, Jacob Wolfowitz, et al. Stochastic estimation of the maximum of a regression function. The Annals of Mathematical Statistics, 23(3):462–466, 1952

1952

[7] [7]

A stochastic approximation algorithm for american lookback put options

Zhenhua Zhang, G Yin, and Zhian Liang. A stochastic approximation algorithm for american lookback put options. Stochastic Analysis and Applications, 29(2):332–351, 2011

2011

[8] [8]

Stock liquidation via stochastic approximation using nasdaq daily and intra-day data

G Yin, Qing Zhang, F Liu, RH Liu, and Y Cheng. Stock liquidation via stochastic approximation using nasdaq daily and intra-day data. Mathematical Finance: An International Journal of Mathematics, Statistics and Financial Economics, 16(1):217–236, 2006

2006

[9] [9]

Algorithms for cvar optimization in mdps

Yinlam Chow and Mohammad Ghavamzadeh. Algorithms for cvar optimization in mdps. InAdvances in neural information processing systems, pages 3509–3517, 2014

2014

[10] [10]

Stochasticapproximationwithaveraginginnovationappliedtoﬁnance

SophieLaruelleandGillesPagès. Stochasticapproximationwithaveraginginnovationappliedtoﬁnance. Monte Carlo Methods and Applications, 18(1):1–51, 2012

2012

[11] [11]

Discrete approximation in quantile problem of portfolio selection

Andrey Kibzun and Riho Lepp. Discrete approximation in quantile problem of portfolio selection. In Stochastic Optimization: Algorithms and Applications, pages 121–135. Springer, 2001

2001

[12] [12]

Optimal split of orders across liquidity pools: a stochastic algorithm approach.SIAM Journal on Financial Mathematics, 2(1):1042–1076, 2011

Sophie Laruelle, Charles-Albert Lehalle, and Gilles Pages. Optimal split of orders across liquidity pools: a stochastic algorithm approach.SIAM Journal on Financial Mathematics, 2(1):1042–1076, 2011

2011

[13] [13]

Empirical properties of asset returns: stylized facts and statistical issues.Quantitative Finance, pages 223–236, 2001

Rama Cont. Empirical properties of asset returns: stylized facts and statistical issues.Quantitative Finance, pages 223–236, 2001

2001

[14] [14]

Asymptotic optimality and asymptotic equipartition properties of log-optimum investment.The Annals of Probability, 16(2):876–898, 1988

Paul H Algoet, Thomas M Cover, et al. Asymptotic optimality and asymptotic equipartition properties of log-optimum investment.The Annals of Probability, 16(2):876–898, 1988

1988

[15] [15]

Log-optimal portfolios with memory eﬀect

Zsolt Nika and Miklos Rasonyi. Log-optimal portfolios with memory eﬀect. Applied Mathematical Finance, pages 1–29, 2018

2018

[16] [16]

Cambridge university press, 2005

A Colin Cameron and Pravin K Trivedi.Microeconometrics: methods and applications. Cambridge university press, 2005. 11

2005