End-to-End Large Portfolio Optimization for Variance Minimization with Neural Networks through Covariance Cleaning

Christian Bongiorno; Efstratios Manolakis; Rosario Nunzio Mantegna

arXiv: 2507.01918 · v3 · submitted 2025-07-02 · q-fin.PM · cs.AI · math.OC · physics.data-an · stat.ML

Pith reviewed 2026-05-19 06:13 UTC · model grok-4.3

classification q-fin.PM · cs.AI · math.OC · physics.data-an · stat.ML

keywords portfolio optimization · covariance estimation · neural networks · minimum variance · eigenvalue regularization · out-of-sample performance · financial machine learning · large-scale portfolios

The pith

A rotation-invariant neural network learns lag transforms and eigenvalue regularization to produce minimum-variance portfolios that outperform shrinkage estimators out of sample.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a neural network that jointly learns to transform historical returns and regularize the eigenvalues of covariance matrices in order to construct global minimum-variance portfolios. The architecture is designed to be rotation-invariant and dimension-agnostic, so a single model trained on panels of a few hundred stocks can be applied directly to one thousand equities. A sympathetic reader would care because the loss is the future realized minimum variance, and the resulting portfolios show lower volatility, reduced drawdowns, and higher Sharpe ratios than leading competitors across long out-of-sample periods. The model remains interpretable because each module maps to an explicit step in the analytical minimum-variance solution, and the performance edge persists under long-only constraints and realistic trading frictions.

Core claim

The authors present a rotation-invariant neural network that provides the global minimum-variance portfolio by learning lag-transforms of historical returns and marginal volatilities together with regularization of the eigenvalues of large equity covariance matrices. This explicit mapping supplies interpretability while the architecture stays agnostic to dimension, allowing one model calibrated on a few hundred stocks to be used without retraining on one thousand US equities. The network is optimized end-to-end on the future short-term realized minimum variance using actual returns; in out-of-sample tests spanning January 2000 to December 2024 it delivers lower realized volatility, smaller maximum drawdowns, and higher Sharpe ratios than the best competitors, including state-of-the-art non-linear shrinkage.

What carries the argument

A rotation-invariant neural network that mirrors the analytical form of the global minimum-variance solution while jointly learning lag-transforms and eigenvalue regularization.
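As a rough illustration of the pipeline this sentence describes, the sketch below computes global minimum-variance weights from a covariance matrix whose eigenvalues have been regularized while its eigenvectors are kept. The function names and the `shrink_to_mean` rule are hypothetical stand-ins chosen for this example; the paper learns its regularization with a neural network, which is not reproduced here, and the data are synthetic.

```python
import numpy as np

def gmv_weights(cov: np.ndarray) -> np.ndarray:
    """Global minimum-variance weights w = C^{-1} 1 / (1^T C^{-1} 1)."""
    ones = np.ones(cov.shape[0])
    x = np.linalg.solve(cov, ones)  # C^{-1} 1 without forming the inverse
    return x / x.sum()              # x.sum() equals 1^T C^{-1} 1

def shrink_to_mean(vals: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    # Hypothetical eigenvalue rule (an assumption, not the learned one):
    # pull every eigenvalue toward the mean of the spectrum.
    return (1.0 - alpha) * vals + alpha * vals.mean()

def clean_covariance(cov: np.ndarray) -> np.ndarray:
    """Keep the sample eigenvectors, replace the eigenvalues."""
    vals, vecs = np.linalg.eigh(cov)
    return (vecs * shrink_to_mean(vals)) @ vecs.T

# Synthetic example: 60 observations of 20 assets.
rng = np.random.default_rng(0)
returns = rng.standard_normal((60, 20))
sample_cov = np.cov(returns, rowvar=False)
cleaned = clean_covariance(sample_cov)
w = gmv_weights(cleaned)
equal = np.full(20, 1.0 / 20)
```

By construction `w` sums to one and has no larger variance under `cleaned` than the equal-weight vector `equal`; whether it also has lower variance on future returns is the empirical question the paper addresses.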

If this is right

Lower realized volatility than state-of-the-art non-linear shrinkage in out-of-sample tests from 2000 to 2024
Smaller maximum drawdowns across both short and long evaluation horizons
Higher Sharpe ratios that persist when the learned covariance is inserted into long-only optimizers
Performance advantages remain under realistic execution that includes auction orders, slippage, fees, and leverage financing
Stability of the edge during episodes of acute market stress

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same architecture could be retrained on multi-objective losses that directly penalize turnover or tail risk.
Because the network is dimension-agnostic, it offers a route to cross-asset or international portfolios without redesigning the model.
The explicit modules allow post-hoc inspection of the learned regularization rules to derive new analytical cleaning formulas.
Online updating of the trained weights could adapt the estimator to slow changes in market microstructure.

Load-bearing premise

A single model trained on panels of a few hundred stocks can be applied without retraining to one thousand equities while preserving its performance advantage, relying on rotation invariance and dimension-agnostic architecture.
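A small numerical check can make this premise concrete. The sketch below uses a hypothetical cleaning rule (`shrink_to_mean`, not the paper's learned network) that acts only on eigenvalues, and verifies on synthetic data that it commutes with orthogonal transformations and runs unchanged at two different dimensions. It shows why such a design can be applied across universe sizes; it says nothing about whether the performance edge survives the jump.

```python
import numpy as np

def shrink_to_mean(vals, alpha=0.5):
    # Hypothetical eigenvalue map: pull each eigenvalue toward the spectrum mean.
    return (1.0 - alpha) * vals + alpha * vals.mean()

def clean(cov):
    # Keep the sample eigenvectors and transform the eigenvalues only.
    vals, vecs = np.linalg.eigh(cov)
    return (vecs * shrink_to_mean(vals)) @ vecs.T

def random_orthogonal(n, rng):
    # Q factor of a Gaussian matrix is an orthogonal matrix.
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

rng = np.random.default_rng(1)
max_err = {}
for n in (5, 40):  # the same code handles both dimensions
    sample = rng.standard_normal((3 * n, n))
    cov = np.cov(sample, rowvar=False)
    q = random_orthogonal(n, rng)
    # Equivariance: cleaning the rotated matrix equals rotating the cleaned matrix.
    diff = clean(q @ cov @ q.T) - q @ clean(cov) @ q.T
    max_err[n] = float(np.abs(diff).max())
```

The entries of `max_err` should sit at floating-point noise level for both dimensions.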

What would settle it

An out-of-sample test on a fresh panel of one thousand equities in which the model, applied without retraining, shows no reduction in realized volatility or improvement in Sharpe ratio relative to non-linear shrinkage.

Figures

Figures reproduced from arXiv: 2507.01918 by Christian Bongiorno, Efstratios Manolakis, Rosario Nunzio Mantegna.

**Figure 2.** Training loss on the left panel, validation loss on the right panel. Different lines refer to independent training…

**Figure 3.** The upper plots show the calibrated weighting factors…

**Figure 4.** Eigenvalue sensitivity analysis. Left: median of the eigenvalues as a function of the rank, the colored bands…

**Figure 5.** The figure shows how the MLP (Model 3) transforms the standard deviation of the lag-transformed returns.

**Figure 6.** Long-only portfolio performances of the top 1,000 most capitalized stocks in the universe backtested with the…

Original abstract

We develop a rotation-invariant neural network that provides the global minimum-variance portfolio by jointly learning how to lag-transform historical returns and marginal volatilities and how to regularise the eigenvalues of large equity covariance matrices. This explicit mathematical mapping offers clear interpretability of each module's role, so the model cannot be regarded as a pure black box. The architecture mirrors the analytical form of the global minimum-variance solution yet remains agnostic to dimension, so a single model can be calibrated on panels of a few hundred stocks and applied, without retraining, to one thousand US equities, a cross-sectional jump that indicates robust generalization capability. The loss function is the future short-term realized minimum variance and is optimized end-to-end on real returns. In out-of-sample tests from January 2000 to December 2024, the estimator delivers systematically lower realized volatility, smaller maximum drawdowns, and higher Sharpe ratios than the best competitors, including state-of-the-art non-linear shrinkage, and these advantages persist across both short and long evaluation horizons even though the model's training focus is short-term. Furthermore, although the model is trained end-to-end to produce an unconstrained minimum-variance portfolio, we show that its learned covariance representation can be used in general optimizers under long-only constraints with virtually no loss in its performance advantage over competing estimators. These advantages persist when the strategy is executed under a highly realistic implementation framework that models market orders at the auctions, empirical slippage, exchange fees, and financing charges for leverage, and they remain stable during episodes of acute market stress.
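To make the training objective in the abstract tangible, here is a minimal sketch of a "future realized variance" loss: weights are built from past returns only and scored on the variance of the portfolio return over a later window. The window lengths, the use of the plain sample covariance, and the demeaned sample variance are assumptions for illustration; the paper's exact loss definition and estimator are not reproduced here.

```python
import numpy as np

def gmv_weights(cov):
    """Global minimum-variance weights w = C^{-1} 1 / (1^T C^{-1} 1)."""
    ones = np.ones(cov.shape[0])
    x = np.linalg.solve(cov, ones)
    return x / x.sum()

def realized_variance(weights, future_returns):
    """Sample variance of the portfolio return over the future window."""
    portfolio_returns = future_returns @ weights
    return portfolio_returns.var(ddof=1)

# Synthetic data: rows are days, columns are assets.
rng = np.random.default_rng(2)
n_assets, lookback, horizon = 10, 250, 5  # illustrative sizes, not the paper's
data = 0.01 * rng.standard_normal((lookback + horizon, n_assets))
past, future = data[:lookback], data[lookback:]

cov_hat = np.cov(past, rowvar=False)      # estimated from past data only
loss = realized_variance(gmv_weights(cov_hat), future)
```

In an end-to-end setup the covariance estimate would be produced by the trainable model and `loss` would be differentiated with respect to its parameters; this NumPy version only evaluates the objective.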

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

A rotation-invariant NN that learns lag transforms and eigenvalue cleaning for GMV portfolios claims to scale from hundreds to a thousand stocks without retraining, with decent OOS results but thin validation on the scaling claim.

This paper's main point is a neural network that outputs global minimum-variance portfolio weights by jointly learning lag transforms on returns and volatilities plus eigenvalue regularization, all in a rotation-invariant and dimension-agnostic setup. A single model trained on a few hundred stocks is then applied directly to a thousand equities, and the abstract reports lower realized volatility, smaller drawdowns, and higher Sharpe ratios than non-linear shrinkage over 2000-2024, even after realistic trading costs and under long-only constraints.

Referee Report

3 major / 2 minor

Summary. The paper introduces a rotation-invariant neural network for large-scale minimum-variance portfolio optimization. It jointly learns lag-transforms of historical returns and marginal volatilities together with eigenvalue regularization of the covariance matrix. The architecture is designed to be dimension-agnostic, allowing a single model trained on panels of a few hundred stocks to be applied without retraining to universes of one thousand equities. The loss is the future short-term realized minimum variance, optimized end-to-end. Out-of-sample results over January 2000–December 2024 are reported to show lower realized volatility, smaller maximum drawdowns, and higher Sharpe ratios than leading competitors including non-linear shrinkage estimators; advantages are claimed to persist under long-only constraints and realistic transaction-cost modeling.

Significance. If the empirical advantages and zero-shot generalization hold after addressing the points below, the work would offer a practically relevant advance in high-dimensional covariance estimation for portfolio construction. The explicit decomposition into interpretable modules (lag-transform, volatility scaling, eigenvalue cleaning) distinguishes it from black-box alternatives and could facilitate adoption in quantitative asset management. The end-to-end training on realized variance supplies a direct, falsifiable objective that aligns with the downstream task.

major comments (3)

[§4 (Out-of-sample evaluation) and architecture description] The central practical claim—that a model trained on panels of a few hundred stocks can be applied without retraining to one thousand equities while preserving its performance edge—rests on asserted rotation invariance and dimension-agnostic behavior. No ablation that isolates the effect of increasing cross-sectional dimension (holding architecture, training window, and hyperparameters fixed) or direct comparison against an identically architected model retrained on the larger panel is described. This omission is load-bearing for the generalization result highlighted in the abstract and §4.
[§4 and abstract] The abstract and results section report systematic outperformance in realized volatility, drawdowns, and Sharpe ratios relative to state-of-the-art non-linear shrinkage, yet no statistical significance tests (e.g., Diebold-Mariano, bootstrap confidence intervals on differences, or multiple-testing adjustments) are provided, nor are exact baseline implementations and hyperparameter choices fully detailed. Without these, it is difficult to judge whether the reported advantages are robust or sensitive to implementation specifics.
[§3 (Loss and training) and §4] The loss is defined on future short-term realized minimum variance, which supplies an external benchmark; however, the learned regularization and transform parameters are optimized end-to-end on the same historical panel used for evaluation. This creates a moderate risk that part of the reported advantage reflects in-sample fitting rather than genuine out-of-sample generalization, particularly given the long 2000–2024 window and absence of explicit look-ahead-bias safeguards.
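The second comment asks for interval estimates on performance differences. One way such a check could look is sketched below: a percentile confidence interval for the difference in return variance between two strategies, using a circular block bootstrap on paired daily returns so that short-range serial dependence is partly preserved. The function name, block length, and number of resamples are illustrative assumptions, and the inputs are synthetic rather than the paper's results.

```python
import numpy as np

def block_bootstrap_ci(r_a, r_b, block=10, n_boot=2000, level=0.95, seed=0):
    """Percentile CI for var(r_a) - var(r_b) via a circular block bootstrap."""
    r_a = np.asarray(r_a, dtype=float)
    r_b = np.asarray(r_b, dtype=float)
    t = len(r_a)
    rng = np.random.default_rng(seed)
    n_blocks = int(np.ceil(t / block))
    offsets = np.arange(block)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        starts = rng.integers(0, t, size=n_blocks)
        # Same resampled dates for both series keeps the comparison paired.
        idx = ((starts[:, None] + offsets[None, :]) % t).ravel()[:t]
        diffs[i] = r_a[idx].var(ddof=1) - r_b[idx].var(ddof=1)
    low, high = np.quantile(diffs, [(1 - level) / 2, (1 + level) / 2])
    return float(low), float(high)

# Synthetic strategies: A has twice the daily volatility of B.
rng = np.random.default_rng(3)
returns_a = 0.02 * rng.standard_normal(1000)
returns_b = 0.01 * rng.standard_normal(1000)
ci_low, ci_high = block_bootstrap_ci(returns_a, returns_b)
```

An interval that excludes zero, as it should for this deliberately lopsided synthetic pair, indicates a variance gap that resampling noise alone is unlikely to explain.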

minor comments (2)

[§3] Notation for the lag-transform and eigenvalue regularization modules could be clarified with explicit equations showing how each component maps to the analytical minimum-variance solution.
[Figures and tables in §4] Figure captions and table footnotes should explicitly state the exact number of assets in each training and test cross-section to make the dimension jump transparent.
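For orientation, the analytical solution that the first minor comment refers to is conventionally written as below. The first line is the standard closed form of the global minimum-variance problem. The second line is only one plausible way to express how volatility and eigenvalue modules could enter a covariance estimate, stated here as an assumption: $\sigma_i$ denotes (possibly transformed) marginal volatilities, $\mathbf{V}$ the eigenvectors of the sample correlation matrix, and $\xi_i$ the regularized eigenvalues. The paper's own notation may differ.

```latex
\begin{aligned}
\mathbf{w}^{\star}
  &= \arg\min_{\mathbf{w}\,:\,\mathbf{1}^{\top}\mathbf{w}=1}
     \mathbf{w}^{\top}\boldsymbol{\Sigma}\,\mathbf{w}
   = \frac{\boldsymbol{\Sigma}^{-1}\mathbf{1}}
          {\mathbf{1}^{\top}\boldsymbol{\Sigma}^{-1}\mathbf{1}}, \\
\widehat{\boldsymbol{\Sigma}}
  &= \mathbf{D}\,\mathbf{V}\,
     \operatorname{diag}(\xi_1,\dots,\xi_n)\,
     \mathbf{V}^{\top}\mathbf{D},
  \qquad
  \mathbf{D} = \operatorname{diag}(\sigma_1,\dots,\sigma_n).
\end{aligned}
```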

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. These have helped us identify opportunities to strengthen the empirical support and clarity of the manuscript. We address each major comment below and indicate the revisions we will make.

Referee: [§4 (Out-of-sample evaluation) and architecture description] The central practical claim—that a model trained on panels of a few hundred stocks can be applied without retraining to one thousand equities while preserving its performance edge—rests on asserted rotation invariance and dimension-agnostic behavior. No ablation that isolates the effect of increasing cross-sectional dimension (holding architecture, training window, and hyperparameters fixed) or direct comparison against an identically architected model retrained on the larger panel is described. This omission is load-bearing for the generalization result highlighted in the abstract and §4.

Authors: We agree that an explicit ablation isolating the cross-sectional dimension effect would provide stronger support for the claimed zero-shot generalization. In the revised manuscript we will add such an analysis to §4: we will retrain the identical architecture on randomly sampled panels of 200 and 500 stocks drawn from the original training universe and evaluate zero-shot performance on the full 1,000-stock test universe. We will also report results for a model retrained directly on the larger panel (subject to computational feasibility) while holding all other hyperparameters fixed. These additions will quantify whether the performance advantage is preserved by the rotation-invariant design. revision: yes
Referee: [§4 and abstract] The abstract and results section report systematic outperformance in realized volatility, drawdowns, and Sharpe ratios relative to state-of-the-art non-linear shrinkage, yet no statistical significance tests (e.g., Diebold-Mariano, bootstrap confidence intervals on differences, or multiple-testing adjustments) are provided, nor are exact baseline implementations and hyperparameter choices fully detailed. Without these, it is difficult to judge whether the reported advantages are robust or sensitive to implementation specifics.

Authors: We concur that formal statistical tests and fuller implementation details are necessary for robust interpretation. In the revision we will add Diebold-Mariano tests comparing realized volatility and Sharpe-ratio series, together with bootstrap confidence intervals on the performance differentials. We will also expand §4 and the appendix to document the precise hyperparameter settings and implementation choices for all non-linear shrinkage baselines, ensuring full reproducibility. revision: yes
Referee: [§3 (Loss and training) and §4] The loss is defined on future short-term realized minimum variance, which supplies an external benchmark; however, the learned regularization and transform parameters are optimized end-to-end on the same historical panel used for evaluation. This creates a moderate risk that part of the reported advantage reflects in-sample fitting rather than genuine out-of-sample generalization, particularly given the long 2000–2024 window and absence of explicit look-ahead-bias safeguards.

Authors: We appreciate the concern about potential temporal leakage. The training procedure already employs a strictly causal rolling-window scheme in which parameters are estimated only on data available up to each rebalancing date and the loss is evaluated on subsequent realized variance; the 2000–2024 evaluation itself follows a walk-forward protocol. Nevertheless, to address the referee’s point directly we will add an explicit subsection in §3 describing these safeguards and will include supplementary results that use more conservative hold-out designs (e.g., training exclusively on pre-2010 data for post-2010 evaluation). revision: partial
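The walk-forward protocol mentioned in the last response can be sketched as an index generator in which every test observation comes strictly after every training observation of the same split. The helper name and the window sizes are hypothetical; the paper's actual rebalancing schedule is not specified in this review.

```python
from typing import Iterator, Optional, Tuple

def walk_forward_splits(
    n_obs: int, train_len: int, test_len: int, step: Optional[int] = None
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges with the test window after the train window."""
    step = step or test_len  # by default, test windows do not overlap
    start = 0
    while start + train_len + test_len <= n_obs:
        train = range(start, start + train_len)
        test = range(start + train_len, start + train_len + test_len)
        yield train, test
        start += step

# 20 observations, 10 for fitting and 5 for evaluation per split.
splits = list(walk_forward_splits(n_obs=20, train_len=10, test_len=5))
# splits == [(range(0, 10), range(10, 15)), (range(5, 15), range(15, 20))]
```

A stricter hold-out, such as the pre-2010 versus post-2010 design the response proposes, corresponds to a single split with a fixed cut date.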

Circularity Check

0 steps flagged

No circularity: architecture design and empirical OOS evaluation remain independent of claimed outputs

The paper constructs a neural network whose modules explicitly mirror the known analytical GMV formula (inverse covariance weighting) while adding learned lag-transform and eigenvalue regularization; the loss is defined directly on future realized portfolio variance, an external benchmark independent of the fitted parameters. The dimension-agnostic and rotation-invariant properties are architectural choices that permit cross-sectional transfer by design, but the reported performance advantage is measured on a later time window (2000-2024) against external competitors and is not mathematically forced by the training objective or by any self-citation. No equation reduces the out-of-sample volatility or Sharpe improvement to a re-expression of the training inputs; the generalization claim is therefore an empirical assertion rather than a definitional tautology.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the premise that historical equity returns contain learnable structure that can be extracted via lag transforms and eigenvalue regularization to predict future realized variance; the model introduces many trainable parameters whose values are determined by optimization on the training window.

free parameters (1)

neural network parameters
Weights and biases of the rotation-invariant network are fitted end-to-end to minimize future realized minimum variance on historical returns.

axioms (1)

domain assumption: The analytical form of the global minimum-variance portfolio can be mirrored by a neural network architecture that remains agnostic to input dimension.
The paper states that the architecture mirrors the analytical GMV solution while staying dimension-agnostic.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

Relation between the paper passage and the cited Recognition theorem.

We develop a rotation-invariant neural network that provides the global minimum-variance portfolio by jointly learning how to lag-transform historical returns and marginal volatilities and how to regularise the eigenvalues of large equity covariance matrices.
IndisputableMonolith/Foundation/DimensionForcing.lean alexander_duality_circle_linking unclear

Relation between the paper passage and the cited Recognition theorem.

The architecture mirrors the analytical form of the global minimum-variance solution yet remains agnostic to dimension, so a single model can be calibrated on panels of a few hundred stocks and applied, without retraining, to one thousand US equities

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

83 extracted references · 83 canonical work pages

[1]

Portofolio selection

Harry Markowitz. Portofolio selection. Journal of Finance, 7:77–91, 1952

work page 1952
[2]

Efficient capital markets

Eugene F Fama. Efficient capital markets. Journal of Finance, 25(2):383–417, 1970

work page 1970
[3]

Estimation of a covariance matrix

Charles Stein. Estimation of a covariance matrix. In 39th Annual Meeting IMS, Atlanta, GA, 1975, 1975

work page 1975
[4]

An overview of machine learning for portfolio optimization

Yongjae Lee, Jang Ho Kim, Woo Chang Kim, and Frank J Fabozzi. An overview of machine learning for portfolio optimization. Journal of Portfolio Management, 51(2), 2024

work page 2024
[5]

Noise dressing of financial correlation matrices

Laurent Laloux, Pierre Cizeau, Jean-Philippe Bouchaud, and Marc Potters. Noise dressing of financial correlation matrices. Physical Review Letters, 83(7):1467, 1999

work page 1999
[6]

A well-conditioned estimator for large-dimensional covariance matrices.Journal of Multivariate Analysis, 88(2):365–411, 2004

Olivier Ledoit and Michael Wolf. A well-conditioned estimator for large-dimensional covariance matrices.Journal of Multivariate Analysis, 88(2):365–411, 2004

work page 2004
[7]

Cleaning large correlation matrices: tools from random matrix theory

Joël Bun, Jean-Philippe Bouchaud, and Marc Potters. Cleaning large correlation matrices: tools from random matrix theory. Physics Reports, 666:1–109, 2017

work page 2017
[8]

Optimal data splitting for holdout cross-validation in large covariance matrix estimation

Lamia Lamrani, Christian Bongiorno, and Marc Potters. Optimal data splitting for holdout cross-validation in large covariance matrix estimation. arXiv preprint arXiv:2503.15186, 2025

work page arXiv 2025
[9]

Nonlinear shrinkage estimation of large-dimensional covariance matrices

Olivier Ledoit and Michael Wolf. Nonlinear shrinkage estimation of large-dimensional covariance matrices. The Annals of Statistics, 2012

work page 2012
[10]

Eigenvectors of some large sample covariance matrix ensembles

Olivier Ledoit and Sandrine Péché. Eigenvectors of some large sample covariance matrix ensembles. Probability Theory and Related Fields, 151(1):233–264, 2011

work page 2011
[11]

Spectrum estimation: A unified framework for covariance matrix estimation and pca in large dimensions

Olivier Ledoit and Michael Wolf. Spectrum estimation: A unified framework for covariance matrix estimation and pca in large dimensions. Journal of Multivariate Analysis, 139:360–384, 2015

work page 2015
[12]

Direct nonlinear shrinkage estimation of large-dimensional covariance matrices

Olivier Ledoit and Michael Wolf. Direct nonlinear shrinkage estimation of large-dimensional covariance matrices. Technical report, Working Paper, 2017

work page 2017
[13]

Quadratic shrinkage for large covariance matrices

Olivier Ledoit and Michael Wolf. Quadratic shrinkage for large covariance matrices. Bernoulli, 28(3):1519–1547, 2022

work page 2022
[14]

Advances in high-dimensional covariance matrix estimation

Daniel Bartz. Advances in high-dimensional covariance matrix estimation . Technische Universitaet Berlin (Germany), 2016. 20 Bongiorno et al., End-to-End GMV Porfolio with NNs

work page 2016
[15]

Nonparametric eigenvalue-regularized precision or covariance matrix estimator.Annals of Statistics, 44(3):928–953, 2016

Clifford Lam. Nonparametric eigenvalue-regularized precision or covariance matrix estimator.Annals of Statistics, 44(3):928–953, 2016

work page 2016
[16]

A nonparametric eigenvalue-regularized integrated covariance matrix estimator for asset return data

Clifford Lam and Phoenix Feng. A nonparametric eigenvalue-regularized integrated covariance matrix estimator for asset return data. Journal of Econometrics, 206(1):226–257, 2018

work page 2018
[17]

Agnostic allocation portfolios: a sweet spot in the risk-based jungle? Journal of Portfolio Management, 46(4):22–38, 2020

Pierre-Alain Reigneron, Vincent Nguyen, Stefano Ciliberti, Philip Seager, and Jean-Philippe Bouchaud. Agnostic allocation portfolios: a sweet spot in the risk-based jungle? Journal of Portfolio Management, 46(4):22–38, 2020

work page 2020
[18]

Estimation of large financial covariances: A cross-validation approach

Vincent Tan and Stefan Zohren. Estimation of large financial covariances: A cross-validation approach. Journal of Portfolio Management, 51(4), 2025

work page 2025
[19]

Correlation, hierarchies, and networks in financial markets

Michele Tumminello, Fabrizio Lillo, and Rosario N Mantegna. Correlation, hierarchies, and networks in financial markets. Journal of Economic Behavior & Organization, 75(1):40–58, 2010

work page 2010
[20]

Covariance matrix filtering with bootstrapped hierarchies

Christian Bongiorno and Damien Challet. Covariance matrix filtering with bootstrapped hierarchies. PloS One, 16(1):e0245092, 2021

work page 2021
[21]

Reactive global minimum variance portfolios with k-bahc covariance cleaning

Christian Bongiorno and Damien Challet. Reactive global minimum variance portfolios with k-bahc covariance cleaning. The European Journal of Finance, 28(13-15):1344–1360, 2022

work page 2022
[22]

Mantegna

Rosario N. Mantegna. Hierarchical structure in financial markets. The European Physical Journal B-Condensed Matter and Complex Systems, 11:193–197, 1999

work page 1999
[23]

Cluster analysis for portfolio optimiza- tion

Vincenzo Tola, Fabrizio Lillo, Mauro Gallegati, and Rosario N Mantegna. Cluster analysis for portfolio optimiza- tion. Journal of Economic Dynamics and Control, 32(1):235–258, 2008

work page 2008
[24]

When do improved covariance matrix estimators enhance portfolio optimization? an empirical comparative study of nine estimators

Ester Pantaleo, Michele Tumminello, Fabrizio Lillo, and Rosario N Mantegna. When do improved covariance matrix estimators enhance portfolio optimization? an empirical comparative study of nine estimators. Quantitative Finance, 11(7):1067–1080, 2011

work page 2011
[25]

Two-step estimators of high-dimensional correlation matrices

Andrés García-Medina, Salvatore Miccichè, and Rosario N Mantegna. Two-step estimators of high-dimensional correlation matrices. Physical Review E, 108(4):044137, 2023

work page 2023
[26]

Dynamic conditional correlation: A simple class of multivariate generalized autoregressive conditional heteroskedasticity models

Robert Engle. Dynamic conditional correlation: A simple class of multivariate generalized autoregressive conditional heteroskedasticity models. Journal of Business & Economic Statistics, 20(3):339–350, 2002

work page 2002
[27]

Covariance matrix filtering and portfolio optimisation: the average oracle vs non-linear shrinkage and all the variants of dcc-nls

Christian Bongiorno and Damien Challet. Covariance matrix filtering and portfolio optimisation: the average oracle vs non-linear shrinkage and all the variants of dcc-nls. Quantitative Finance, pages 1–8, 2024

work page 2024
[28]

Filtering time-dependent covariance matrices using time-independent eigenvalues

Christian Bongiorno, Damien Challet, and Grégoire Loeper. Filtering time-dependent covariance matrices using time-independent eigenvalues. Journal of Statistical Mechanics: Theory and Experiment, 2023(2):023402, 2023

work page 2023
[29]

Model-based vs

Jean-David Fermanian, Benjamin Poignard, and Panos Xidonas. Model-based vs. agnostic methods for the prediction of time-varying covariance matrices. Annals of Operations Research, pages 1–38, 2024

work page 2024
[30]

Quantifying the information lost in optimal covariance matrix cleaning

Christian Bongiorno and Lamia Lamrani. Quantifying the information lost in optimal covariance matrix cleaning. Physica A: Statistical Mechanics and its Applications, 657:130225, 2025

work page 2025
[31]

Non-linear shrinkage of the price return covariance matrix is far from optimal for portfolio optimization

Christian Bongiorno and Damien Challet. Non-linear shrinkage of the price return covariance matrix is far from optimal for portfolio optimization. Finance Research Letters, 52:103383, 2023

work page 2023
[32]

Log-gases and random matrices (LMS-34)

Peter J Forrester. Log-gases and random matrices (LMS-34). Princeton university press, 1st edition, 2010. pp. 111-115

work page 2010
[33]

Dynamic portfolio optimization using a hybrid mlp-har approach

Caio Mário Mesquita, Cristiano Arbex Valle, and Adriano CM Pereira. Dynamic portfolio optimization using a hybrid mlp-har approach. In 2020 IEEE Symposium Series on Computational Intelligence (SSCI) , pages 1075–1082. IEEE, 2020

work page 2020
[34]

A deep learning framework for medium-term covariance forecasting in multi-asset portfolios

Pedro Reis, Ana Paula Serra, and João Gama. A deep learning framework for medium-term covariance forecasting in multi-asset portfolios. arXiv preprint arXiv:2503.01581, 2025

work page arXiv 2025
[35]

Enhancing portfolio optimization: A two-stage approach with deep learning and portfolio optimization

Shiguo Huang, Linyu Cao, Ruili Sun, Tiefeng Ma, and Shuangzhe Liu. Enhancing portfolio optimization: A two-stage approach with deep learning and portfolio optimization. Mathematics, 12(21):3376, 2024

work page 2024
[36]

Integrating prediction in mean-variance portfolio optimization

Andrew Butler and Roy H Kwon. Integrating prediction in mean-variance portfolio optimization. Quantitative Finance, 23(3):429–452, 2023

work page 2023
[37]

Deep learning for portfolio optimization

Zihao Zhang, Stefan Zohren, and Stephen Roberts. Deep learning for portfolio optimization. The Journal of Financial Data Science, 2(4):8–20, 2020

work page 2020
[38]

Distributionally robust end-to-end portfolio construction

Giorgio Costa and Garud N Iyengar. Distributionally robust end-to-end portfolio construction. Quantitative Finance, 23(10):1465–1482, 2023. 21 Bongiorno et al., End-to-End GMV Porfolio with NNs

work page 2023
[39]

End-to-end risk budgeting portfolio optimization with neural networks

A Sinem Uysal, Xiaoyue Li, and John M Mulvey. End-to-end risk budgeting portfolio optimization with neural networks. Annals of Operations Research, 339(1):397–426, 2024

work page 2024
[40]

Deep deterministic portfolio optimization

Ayman Chaouki, Stephen Hardiman, Christian Schmidt, Emmanuel Sérié, and Joachim De Lataillade. Deep deterministic portfolio optimization. The Journal of Finance and Data Science, 6:16–30, 2020

work page 2020
[41]

Deep reinforcement learning for stock portfolio optimization by connecting with modern portfolio theory

Junkyu Jang and NohYoon Seong. Deep reinforcement learning for stock portfolio optimization by connecting with modern portfolio theory. Expert Systems with Applications, 218:119556, 2023

work page 2023
[42]

Reinforcement learning for deep portfolio optimization

Ruyu Yan, Jiafei Jin, and Kun Han. Reinforcement learning for deep portfolio optimization. Electronic Research Archive, 32(9), 2024

work page 2024
[43]

Optimization-based spectral end-to-end deep reinforcement learning for equity portfolio management

Pengrui Yu, Siya Liu, Chengneng Jin, Runsheng Gu, and Xiaomin Gong. Optimization-based spectral end-to-end deep reinforcement learning for equity portfolio management. Pacific-Basin Finance Journal, 91:102746, 2025

work page 2025
[44]

Dominating estimators for the global minimum variance portfolio

Gabriel Frahm and Christoph Memmel. Dominating estimators for the global minimum variance portfolio. Technical Report 01/2009, Deutsche Bundesbank, January 2009

work page 2009
[45]

Muirhead

Robb J. Muirhead. Aspects of Multivariate Statistical Theory. John Wiley & Sons, 1st edition, 1982. pp. 390-405

work page 1982
[46]

Optimal versus naive diversification: How inefficient is the 1/n portfolio strategy? The Review of Financial Studies, 22(5):1915–1953, 2009

Victor DeMiguel, Lorenzo Garlappi, and Raman Uppal. Optimal versus naive diversification: How inefficient is the 1/n portfolio strategy? The Review of Financial Studies, 22(5):1915–1953, 2009

work page 1915
[47] Robert F Engle. Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica: Journal of the Econometric Society, pages 987–1007, 1982

[48] Gilles O. Zumbach. Volatility processes and volatility forecast with long memory. Quantitative Finance, 4(1):70, Oct. 2003

[49] Michael J Best and Robert R Grauer. On the sensitivity of mean-variance-efficient portfolios to changes in asset means: some analytical and computational results. The Review of Financial Studies, 4(2):315–342, 1991

[50] Eckhard Platen and Renata Rendek. Empirical evidence on Student-t log-returns of diversified world stock indices. Journal of Statistical Theory and Practice, 2(2):233–251, 2008

[51] Harry M Markowitz and Nilufer Usmen. The likelihood of various stock market return distributions, part 2: Empirical results. Journal of Risk and Uncertainty, 13:221–247, 1996

[52] Christian Bongiorno and Marco Berritta. Optimal covariance cleaning for heavy-tailed distributions: Insights from information theory. Physical Review E, 108(5):054133, 2023

[53] Ravi Jagannathan and Tongshu Ma. Risk reduction in large portfolios: Why imposing the wrong constraints helps. The Journal of Finance, 58(4):1651–1683, 2003

[54] Olivier Ledoit and Michael Wolf. Nonlinear shrinkage of the covariance matrix for portfolio selection: Markowitz meets Goldilocks. The Review of Financial Studies, 30(12):4349–4388, 2017

[55] Gilles O. Zumbach. The RiskMetrics 2006 methodology. Technical Report 185, RiskMetrics Group, Geneva, Switzerland, March 2007

[56] Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabás Poczos, Ruslan Salakhutdinov, and Alexander J. Smola. Deep sets. In Advances in Neural Information Processing Systems, volume 30, 2017

[57] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in Neural Information Processing Systems, volume 30, 2017

[58] Edward Wagstaff, Fabian B. Fuchs, Martin Engelcke, Michael A. Osborne, and Ingmar Posner. Universal approximation of functions on sets. Journal of Machine Learning Research, 23(21-0730), 2021

[59] Liyuan Liu, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, and Jiawei Han. Understanding the difficulty of training transformers. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 5747–5763, 2020

[60] Yihe Dong, Jean-Baptiste Cordonnier, and Andreas Loukas. Attention is not all you need: Pure attention loses rank doubly exponentially with depth. In International Conference on Machine Learning, pages 2793–2803. PMLR, 2021

[61] Felix A. Gers, Jürgen Schmidhuber, and Fred Cummins. Learning to forget: Continual prediction with LSTM. Neural Computation, 12(10):2451–2471, 2000

[62] Mike Schuster and Kuldip K. Paliwal. Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45(11):2673–2681, November 1997

[63] U.S. Securities and Exchange Commission. 17 CFR §240.12d2-2 – Removal from Listing and Registration. https://www.law.cornell.edu/cfr/text/17/240.12d2-2, 2024

[64] Frederik Michel Dekking. A Modern Introduction to Probability and Statistics: Understanding why and how. Springer Science & Business Media, 2005. pp. 231–243

[65] Robert F Engle, Olivier Ledoit, and Michael Wolf. Large dynamic covariance matrices. Journal of Business & Economic Statistics, 37(2):363–375, 2019

[66] J. P. Morgan Guaranty Trust Company and Reuters Ltd. RiskMetrics™: Technical document. Technical report, J. P. Morgan Guaranty Trust Company and Reuters Ltd., New York, December 1996

[67] Hiroyuki Kawakatsu. Simple multivariate conditional covariance dynamics using hyperbolically weighted moving averages. Journal of Econometric Methods, 10(1):33–52, 2021

[68] Assaf Almog, Ferry Besamusca, Mel MacMahon, and Diego Garlaschelli. Mesoscopic community structure of financial markets revealed by price and sign fluctuations. PLoS ONE, 10(7):e0133679, 2015

[69] G Udny Yule. On the methods of measuring association between two attributes. Journal of the Royal Statistical Society, 75(6):579–652, 1912

[70] Tomas Espana, Victor Le Coz, and Matteo Smerlak. Kendall correlation coefficients for portfolio optimization. arXiv preprint arXiv:2410.17366, 2024

[71] Brandon Amos and J Zico Kolter. OptNet: Differentiable optimization as a layer in neural networks. In International Conference on Machine Learning, pages 136–145. PMLR, 2017

[72] Raul Leote, Xiao Lu, and Pierre Moulin. Demystifying equity risk-based strategies: A simple alpha plus beta description. Journal of Portfolio Management, 38(3):56–70, 2012

[73] Jason C Hsu. Cap-weighted portfolios are sub-optimal portfolios. Journal of Investment Management, 4(3), 2004

[74] Thomas Guhr and Bernd Kälber. A new method to estimate the noise in financial correlation matrices. Journal of Physics A: Mathematical and General, 36(12):3009, 2003

[75] Oliver Kramer. Scikit-learn. In Machine Learning for Evolution Strategies, pages 45–53, 2016

[76] Patrick Ledoit. covShrinkage: A package for shrinkage estimation of covariance matrices. https://github.com/pald22/covShrinkage, 2022. Accessed: 2025-06-20

[77] Zhangshuang Sun, Xuerui Gao, Kangyang Luo, Yanqin Bai, Jiyuan Tao, and Guoqiang Wang. Enhancing high-dimensional dynamic conditional angular correlation model based on GARCH family models: Comparative performance analysis for portfolio optimization. Finance Research Letters, 75:106808, 2025

[78] Walt Woerheide and Don Persson. An index of portfolio diversification. Financial Services Review, 2(2):73–85, 1992

[79] Interactive Brokers. Commissions & Fees. https://www.interactivebrokers.com/en/pricing/commissions-home.php, 2025. Accessed: 2025-06-19

[80] Interactive Brokers LLC. Benchmark interest calculation reference rate descriptions. https://www.ibkrguides.com/kb/en-us/benchmark-interest-calculation-reference-rate-descriptions.htm, 2025. Last updated July 8, 2025
