Equilibrium Strategies for the N-agent Mean-Variance Investment Problem over a Random Horizon

Jie Xiong; Xiaoqing Liang; Ying Yang

arxiv: 2507.04611 · v2 · submitted 2025-07-07 · 🧮 math.OC

Pith reviewed 2026-05-19 06:50 UTC · model grok-4.3

classification 🧮 math.OC

keywords: mean-variance optimization · equilibrium strategies · random horizon · mean-field games · n-agent games · stochastic control · HJB equations · investment competition

The pith

Explicit equilibrium strategies are derived for competitive mean-variance games over random horizons.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper derives explicit equilibrium strategies for a group of agents competing in mean-variance portfolio optimization where the investment horizon is random. Each agent's risk aversion changes with their wealth, and stocks are correlated through common noise. By solving an extended system of HJB equations under the assumption of an exponentially distributed horizon, closed-form expressions for the strategies and value functions are obtained. These strategies depend on both an agent's own wealth and the wealth levels of competitors. The results recover known equilibria from prior work in certain limiting cases, showing consistency with existing theory on mean-field games and exponential preferences.

Core claim

Under an exponentially distributed random horizon, the authors explicitly obtain the equilibrium feedback strategies and the value function for both the n-agent game and the corresponding mean-field game. The agent's equilibrium feedback strategy depends not only on his/her current wealth but also on the wealth of other competitors. When the risk aversion is state-independent and the risk-free interest rate is zero, the equilibrium strategies degenerate to constants identical to the unique equilibrium obtained in prior work with exponential risk preferences (Lacker and Zariphopoulou, 2019). When the competition parameter goes to zero and the risk aversion equals some specific value, the equilibrium strategies coincide with the ones derived in Landriault et al. (2018).

What carries the argument

The extended Hamilton-Jacobi-Bellman (HJB) system of equations incorporating the random horizon and inter-agent competition, solved explicitly for exponential distributions to yield feedback strategies depending on own and others' wealth.

Load-bearing premise

The random time horizon follows an exponential distribution to permit closed-form solutions.
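One reason the exponential assumption is so convenient is the memoryless property: the remaining horizon has the same distribution no matter how much time has elapsed, which is what makes a time-homogeneous ansatz plausible. The following is a minimal sketch of that property only, not of the paper's model; the rate `lam` and the function names are illustrative assumptions.

```python
import math

def survival(t: float, lam: float) -> float:
    """P(tau > t) for an exponentially distributed horizon tau with rate lam."""
    return math.exp(-lam * t)

def conditional_survival(s: float, t: float, lam: float) -> float:
    """P(tau > s + t | tau > s), computed from the survival function."""
    return survival(s + t, lam) / survival(s, lam)

lam = 0.5  # hypothetical rate, chosen only for illustration
for elapsed in (0.0, 1.0, 5.0):
    # Memorylessness: the conditional law does not depend on the elapsed time.
    assert math.isclose(conditional_survival(elapsed, 2.0, lam), survival(2.0, lam))
```

For a horizon distribution without this property, the conditional survival probability would depend on the elapsed time, and the resulting equations would generally carry explicit time dependence.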

What would settle it

Substituting the derived strategies back into the extended HJB equations and verifying they satisfy the equilibrium conditions for an exponential horizon would confirm the result.
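The kind of check described here can be illustrated on a deliberately simple toy problem, which is not the paper's extended HJB system. Assume wealth grows deterministically, dX = rX dt, and the horizon τ is exponential with rate λ > r. Then g(x) = E[X_τ | X_0 = x] satisfies r·x·g'(x) + λ·(x − g(x)) = 0, and a linear ansatz gives g(x) = λx/(λ − r). The sketch below substitutes that candidate back and evaluates the residual; all names and parameter values are hypothetical.

```python
def candidate(x: float, lam: float, r: float) -> float:
    """Candidate g(x) = lam / (lam - r) * x for the toy problem (requires lam > r)."""
    return lam / (lam - r) * x

def residual(x: float, lam: float, r: float, h: float = 1e-4) -> float:
    """Residual of r*x*g'(x) + lam*(x - g(x)) = 0, with g' from a central difference."""
    dg = (candidate(x + h, lam, r) - candidate(x - h, lam, r)) / (2 * h)
    return r * x * dg + lam * (x - candidate(x, lam, r))

lam, r = 0.5, 0.03  # hypothetical horizon rate and interest rate
worst = max(abs(residual(x, lam, r)) for x in (0.5, 1.0, 2.0, 10.0))
assert worst < 1e-8  # the candidate satisfies the toy equation up to rounding
```

A verification of the paper's actual expressions would follow the same pattern, with the residual replaced by the full extended HJB system and the candidate replaced by the closed-form strategies and value functions.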

Figures

Figures reproduced from arXiv: 2507.04611 by Jie Xiong, Xiaoqing Liang, Ying Yang.

**Figure 4.1.** The equilibrium feedback strategy of the agent with respect to the competition parameter. [PITH_FULL_IMAGE:figures/full_fig_p025_4_1.png]

**Figure 4.2.** The equilibrium feedback strategy of the agent with respect to the risk aversion parameter. [PITH_FULL_IMAGE:figures/full_fig_p026_4_2.png]

**Figure 4.3.** The equilibrium feedback strategy of the agent with respect to a model parameter. [PITH_FULL_IMAGE:figures/full_fig_p027_4_3.png]

**Figure 4.4.** The equilibrium feedback strategy of the agent with respect to a model parameter. [PITH_FULL_IMAGE:figures/full_fig_p027_4_4.png]

Original abstract

We study equilibrium feedback strategies for a family of dynamic mean-variance problems with competition among a large group of agents. We assume that the time horizon is random and each agent's risk aversion depends dynamically on the current wealth. We consider both the finite population game and the corresponding mean-field one. Each agent can invest in a risk-free asset and a specific individual stock, which is correlated with other stocks by a common noise. By applying stochastic control theory, we derive the extended Hamilton-Jacobi-Bellman (HJB) system of equations for both $n$-agent and mean-field games. Under an exponentially distributed random horizon, in each case, we explicitly obtain the equilibrium feedback strategies and the value function. Our results show that the agent's equilibrium feedback strategy depends not only on his/her current wealth but also on the wealth of other competitors. Moreover, when the risk aversion is state-independent and the risk-free interest rate is zero, the equilibrium strategies degenerate to constants, which is identical to the unique equilibrium obtained in Lacker and Zariphopoulou (2019) with exponential risk preferences; when the competition parameter goes to zero and the risk aversion equals some specific value, the equilibrium strategies coincide with the ones derived in Landriault et al. (2018).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives explicit equilibrium strategies for n-agent and mean-field mean-variance games with exponential random horizon and wealth-dependent risk aversion, extending prior work but hinging on verification of the HJB solutions.

This paper derives explicit equilibrium feedback strategies and value functions for both the finite n-agent game and its mean-field limit in a mean-variance investment setting. The horizon is random, risk aversion depends on current wealth, and stocks are correlated through common noise. Under an exponentially distributed horizon, the authors obtain closed forms that reduce to the constant strategies of Lacker and Zariphopoulou (2019) when risk aversion is state-independent and the risk-free rate is zero, and that match Landriault et al. (2018) in the zero-competition limit for a specific value of the risk aversion. That consistency is useful and shows the extension is coherent with existing results. The derivations follow standard stochastic control and extended HJB methods for time-inconsistent problems, which is the right approach here.

The main soft spot is verification. The central claim rests on the candidate strategies actually solving the derived HJB system, including the extra terms from dynamic risk aversion, inter-agent wealth dependence, and common noise. If the paper includes a direct substitution check or verification lemma that confirms this identically, the results hold; without it, an algebraic slip in the ODE system could undermine the explicit expressions. The exponential horizon assumption is what enables the time-homogeneous ansatz and closed forms, so the work is narrower than a fully general random horizon, but that is a reasonable trade-off for explicitness.

This is for researchers working on mean-field games and time-inconsistent stochastic control in portfolio choice. A reader looking for usable formulas in competitive investment models with random horizons would find concrete expressions here. It deserves peer review because the explicit solutions are new in this combination of features and the setup is formally grounded, even if the verification needs careful referee attention.

Referee Report

1 major / 2 minor

Summary. The paper studies equilibrium feedback strategies for dynamic mean-variance investment problems among N agents (and the corresponding mean-field game) with a random time horizon and wealth-dependent risk aversion. Agents trade a risk-free asset and individual stocks correlated through common noise. Extended HJB systems are derived via stochastic control for both the finite-N and mean-field settings; under an exponentially distributed horizon, explicit equilibrium strategies and value functions are obtained. These strategies depend on own wealth and competitors' wealth. In special cases they recover the constant-strategy equilibrium of Lacker and Zariphopoulou (2019) and the equilibrium strategies of Landriault et al. (2018).

Significance. If the candidate solutions are verified to satisfy the extended HJB systems, the explicit closed forms would constitute a concrete advance in time-inconsistent mean-field stochastic control for finance, furnishing tractable equilibria that incorporate dynamic risk aversion, common noise, and inter-agent wealth dependence. The recovery of prior results in limiting cases provides a useful consistency check and could facilitate further analysis of competition effects in portfolio choice.

major comments (1)

[§4] Explicit solutions under exponential horizon: the manuscript states that the candidate feedback strategies and value functions are obtained by solving the time-homogeneous ODE system that arises from the extended HJB, yet no verification lemma or direct substitution is supplied showing that these expressions satisfy the full extended HJB identically, including the cross-derivative terms induced by common noise, the dynamic risk-aversion factor, and the equilibrium consistency condition in the mean-field limit. Because the explicit expressions are the central claim, this verification step is load-bearing.

minor comments (2)

[§2] The notation for the common-noise correlation matrix and the precise form of the dynamic risk-aversion function could be introduced with an explicit equation reference in the model section to improve readability.
[Introduction] A brief remark on why the exponential horizon is chosen beyond tractability (e.g., memoryless property enabling time-homogeneous ansatz) would help readers assess the modeling assumption.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the positive summary and for identifying a key point regarding verification of the explicit solutions. We address the major comment below and will incorporate the necessary changes in the revised manuscript.

Referee: [§4] Explicit solutions under exponential horizon: the manuscript states that the candidate feedback strategies and value functions are obtained by solving the time-homogeneous ODE system that arises from the extended HJB, yet no verification lemma or direct substitution is supplied showing that these expressions satisfy the full extended HJB identically, including the cross-derivative terms induced by common noise, the dynamic risk-aversion factor, and the equilibrium consistency condition in the mean-field limit. Because the explicit expressions are the central claim, this verification step is load-bearing.

Authors: We agree that an explicit verification is essential for the central claims. In the current draft the candidate solutions were obtained by substituting an ansatz into the extended HJB and reducing to an ODE system, but we did not perform the reverse substitution to confirm that the closed-form expressions satisfy the original system identically. In the revised version we will add a dedicated verification subsection (or appendix) that substitutes the explicit strategies and value functions back into the full extended HJB equations for both the finite-N and mean-field cases. This will explicitly check the cross-derivative terms arising from common noise, the wealth-dependent risk-aversion factor, and the equilibrium consistency condition, thereby confirming that the expressions solve the system.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained from stochastic control principles

The paper applies standard stochastic control theory to derive the extended HJB system for the time-inconsistent mean-variance game with random horizon and state-dependent risk aversion, then solves the resulting system explicitly under the exponential horizon assumption via a time-homogeneous ansatz. This constitutes an independent first-principles derivation rather than any self-definition, fitted-input prediction, or load-bearing self-citation. Comparisons to prior works (Lacker and Zariphopoulou, 2019; Landriault et al., 2018) are presented as special-case consistency checks after the main result is obtained, not as justifications for the ansatz or uniqueness. No step reduces the claimed equilibrium strategies or value functions to their inputs by construction.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard assumptions of stochastic differential games (existence of admissible controls, well-posedness of the wealth SDEs, and existence of a Nash equilibrium) plus the modeling choice that the horizon is exponentially distributed to close the HJB system. No new entities are postulated.

free parameters (2)

risk-aversion function
The paper allows risk aversion to depend dynamically on current wealth; the specific functional form is a modeling choice that enters the HJB system.
correlation structure via common noise
The individual stocks are driven by idiosyncratic noise plus a common factor; the intensity of the common noise is a free modeling parameter.

axioms (2)

domain assumption: Wealth processes follow linear SDEs driven by Brownian motions with common noise
Standard in continuous-time portfolio theory; invoked to set up the controlled dynamics before applying stochastic control.
domain assumption: Existence of equilibrium in the extended HJB system
The paper states that the extended HJB system is derived and solved; the existence step is presupposed for the explicit solution to be valid.
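To make the first axiom concrete, here is a minimal Euler-Maruyama sketch of wealth processes driven by idiosyncratic noise plus a shared common noise, stopped at an exponentially distributed horizon. The specific dynamics, dX_i = [r·X_i + π_i·(μ_i − r)] dt + π_i·(ν_i dW_i + σ_i dB), are an assumed form typical of common-noise investment models, not taken from the paper. The `policy` callable, the parameter names, and all numerical values are hypothetical.

```python
import math
import random

def simulate_wealth(x0, policy, mu, nu, sigma, r, horizon, steps, rng):
    """Euler-Maruyama wealth paths for n agents up to `horizon`.

    policy(i, x) returns the amount agent i invests in its stock given the
    current wealth vector x, so feedback on competitors' wealth is allowed.
    """
    n = len(x0)
    dt = horizon / steps
    x = list(x0)
    for _ in range(steps):
        d_b = rng.gauss(0.0, math.sqrt(dt))  # common-noise increment, shared by all agents
        amounts = [policy(i, x) for i in range(n)]  # evaluate feedback at the pre-step state
        for i in range(n):
            d_w = rng.gauss(0.0, math.sqrt(dt))  # idiosyncratic increment for agent i
            drift = r * x[i] + amounts[i] * (mu[i] - r)
            x[i] += drift * dt + amounts[i] * (nu[i] * d_w + sigma[i] * d_b)
    return x

rng = random.Random(0)
tau = rng.expovariate(0.5)  # exponential horizon with a hypothetical rate of 0.5
terminal = simulate_wealth(
    x0=[1.0, 1.0],
    # Hypothetical affine feedback in own wealth and average wealth.
    policy=lambda i, x: 0.4 * x[i] - 0.1 * sum(x) / len(x),
    mu=[0.08, 0.06], nu=[0.2, 0.25], sigma=[0.1, 0.1],
    r=0.02, horizon=tau, steps=200, rng=rng,
)
```

Because every agent receives the same increment `d_b` in each step, terminal wealths are correlated across agents even when the idiosyncratic terms are independent.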

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: By applying stochastic control theory, we derive the extended Hamilton-Jacobi-Bellman (HJB) system of equations for both n-agent and mean-field games. Under an exponentially distributed random horizon, in each case, we explicitly obtain the equilibrium feedback strategies and the value function.

IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · alpha_pin_under_high_calibration · unclear

Paper passage: V(x,y)=Ax²+Cy²+Dxy+Ex+Fy+I, G(x,y)=ax+cy+α, H(x,y)=ãx²+čy²+... (quadratic ansatz leading to cubic ODE for p)
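For readers who want the structure of the quadratic ansatz spelled out, the partial derivatives of V that would typically enter an HJB-type equation are written below. This is elementary calculus on the quoted form of V only; it does not reproduce the paper's ODE system, and the coefficients A, C, D, E, F, I are left unspecified, as in the passage.

```latex
\begin{aligned}
V(x,y) &= A x^2 + C y^2 + D x y + E x + F y + I,\\
\partial_x V &= 2 A x + D y + E, \qquad \partial_y V = 2 C y + D x + F,\\
\partial_{xx} V &= 2A, \qquad \partial_{yy} V = 2C, \qquad \partial_{xy} V = D.
\end{aligned}
```

The first derivatives are affine in both x and y while the second derivatives are constants. If x is read as the agent's own wealth and y as a competitor-wealth variable (an interpretation assumed here), this is consistent with feedback strategies that depend on both.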

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

20 extracted references · 20 canonical work pages

[1] Bielecki, T. R., Jin, H., Pliska, S. R., and Zhou, X. Y. (2005). Continuous-time mean-variance portfolio selection with bankruptcy prohibition. Mathematical Finance, 15(2):213–244.

[2] Björk, T. and Murgoci, A. (2010). A general theory of Markovian time inconsistent stochastic control problems. Available at SSRN 1694759.

[3] Björk, T., Murgoci, A., and Zhou, X. (2014). Mean-variance portfolio optimization with state-dependent risk aversion. Mathematical Finance, 24(1):1–24.

[4] Bo, L., Wang, S., and Zhou, C. (2024). A mean field game approach to optimal investment and risk control for competitive insurers. Insurance: Mathematics and Economics, 116:202–217.

[5] Chen, Z. and Yang, P. (2020). Robust optimal reinsurance-investment strategy with price jumps and correlated claims. Insurance: Mathematics and Economics, 92:27–46.

[6] Guan, G. and Hu, X. (2022). Time-consistent investment and reinsurance strategies for mean-variance insurers in n-agent and mean-field games. North American Actuarial Journal, 26(4):537–569.

[7] Kryger, E. M. and Steffensen, M. (2010). Some solvable portfolio problems with quadratic and collective objectives. Available at SSRN 1577265.

[8] Lacker, D. and Soret, A. (2020). Many-player games of optimal consumption and investment under relative performance criteria. Mathematics and Financial Economics, 14(2):263–281.

[9] Lacker, D. and Zariphopoulou, T. (2019). Mean field and n-agent games for optimal investment under relative performance criteria. Mathematical Finance, 29(4):1003–1038.

[10] Landriault, D., Li, B., Li, D., and Young, V. R. (2018). Equilibrium strategies for the mean-variance investment problem over a random horizon. SIAM Journal on Financial Mathematics, 9(3):1046–1073.

[11] Li, D. and Ng, W.-L. (2000). Optimal dynamic portfolio selection: Multiperiod mean-variance formulation. Mathematical Finance, 10(3):387–406.

[12] Liang, X., Bai, L., and Guo, J. (2014). Optimal time-consistent portfolio and contribution selection for defined benefit pension schemes under mean-variance criterion. The ANZIAM Journal, 56(1):66–90.

[13] Pun, C. S. (2018). Time-consistent mean-variance portfolio selection with only risky assets. Economic Modelling, 75:281–292.

[14] Strotz, R. H. (1973). Myopia and Inconsistency in Dynamic Utility Maximization. Springer.

[15] Xiong, J. and Zhou, X. Y. (2007). Mean-variance portfolio selection under partial information. SIAM Journal on Control and Optimization, 46(1):156–175.

[16] Yan, T., Han, B., Pun, C. S., and Wong, H. Y. (2020). Robust time-consistent mean-variance portfolio selection problem with multivariate stochastic volatility. Mathematics and Financial Economics, 14:699–724.

[17] Yong, J. and Zhou, X. (2012). Stochastic Controls: Hamiltonian Systems and HJB Equations. Springer Science & Business Media.

[18] Zeng, Y., Li, Z., and Lai, Y. (2013). Time-consistent investment and reinsurance strategies for mean-variance insurers with jumps. Insurance: Mathematics and Economics, 52(3):498–507.

[19] Zhang, L., Wang, P., and Shen, Y. (2025). Time-consistent investment strategy for a DC pension plan with the return of premiums clause and mispricing. Quantitative Finance, 25:117–141.

[20] Zhou, X. Y. and Li, D. (2000). Continuous-time mean-variance portfolio selection: A stochastic LQ framework. Applied Mathematics and Optimization, 42:19–33.
