Adaptive Multi-task Learning for Multi-sector Portfolio Optimization

Qingliang Fan; Ruike Wu; Yanrong Yang

arxiv: 2507.16433 · v2 · submitted 2025-07-22 · 📊 stat.ME · cs.LG

Pith reviewed 2026-05-19 03:33 UTC · model grok-4.3

keywords: multi-task learning, factor models, portfolio optimization, principal component analysis, subspace relatedness, adaptive learning, multi-sector

The pith

Data-adaptive multi-task learning quantifies subspace relatedness across sectors to improve factor model estimation and portfolio optimization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The authors develop a method to transfer information across sectors by learning how their factor-driven temporal patterns relate to each other. They show that this adaptive sharing of subspace information leads to more accurate recovery of factor models when estimated jointly rather than separately. The improvement carries over to portfolio optimization, which relies on those models for asset allocation in multi-sector settings with many assets. A projection-penalized version of principal component analysis is introduced as the practical algorithm to achieve this learning from data.

Core claim

We propose a novel data-adaptive multi-task learning methodology that quantifies and learns the relatedness among the principal temporal subspaces spanned by factors across multiple sectors under study. This approach improves the simultaneous estimation of multiple factor models and enhances multi-sector portfolio optimization, which heavily depends on the accurate recovery of these factor models. A novel and easy-to-implement algorithm, termed projection-penalized principal component analysis, is developed to accomplish the multi-task learning procedure.

What carries the argument

The projection-penalized principal component analysis algorithm, which enforces learned relatedness by penalizing differences in projections onto principal temporal subspaces across sectors.
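
To make the mechanism concrete, here is a minimal sketch of one way such a penalty could be optimized. It is not the authors' implementation. It assumes the objective is the per-sector fit (1/T) Σ_t ||r_t^(m) − B^(m)F_t^(m)||² summed over sectors, plus (λ/T) Σ_{m<m′} ||P^(m) − P^(m′)||_F², where P^(m) is the projector onto sector m's factor subspace. It further assumes that all sectors share the same T and the same number of factors k, and that factors are normalized to have orthonormal columns. Under those assumptions, updating one sector with the others held fixed reduces to taking the top k eigenvectors of R_m R_mᵀ + 2λ Σ_{j≠m} P_j. The names `top_eigvecs` and `projection_penalized_pca` are hypothetical.

```python
import numpy as np


def top_eigvecs(sym, k):
    """Return the k leading eigenvectors (as columns) of a symmetric matrix."""
    _, vecs = np.linalg.eigh(sym)       # eigenvalues come back in ascending order
    return vecs[:, ::-1][:, :k]


def projection_penalized_pca(returns, k, lam, n_iter=50):
    """Block-coordinate sketch (hypothetical helper, not the authors' code).

    returns : list of (T, N_m) arrays, one per sector, sharing the same T
    k       : number of factors per sector (assumed equal across sectors)
    lam     : penalty weight; lam = 0 reduces to separate PCA per sector
    """
    grams = [r @ r.T for r in returns]                  # (T, T) matrix per sector
    factors = [top_eigvecs(g, k) for g in grams]        # start from separate PCA
    for _ in range(n_iter):
        for m, g in enumerate(grams):
            # Sum of the other sectors' projection matrices P = F F^T.
            others = sum(f @ f.T for j, f in enumerate(factors) if j != m)
            factors[m] = top_eigvecs(g + 2.0 * lam * others, k)
    loadings = [r.T @ f for r, f in zip(returns, factors)]  # B = R^T F
    return factors, loadings
```

In this sketch λ is a fixed input. The data-adaptive choice of the penalty described in the paper is not reproduced here, so the code only illustrates how a projection penalty pulls the estimated temporal subspaces toward each other as λ grows.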

If this is right

More accurate simultaneous estimation of factor models for multiple sectors.
Enhanced performance in multi-sector portfolio optimization tasks.
Effective handling of large asset universes across different classes through adaptive information transfer.
Demonstrated advantages in both simulations and real daily return data from the Russell 3000 index.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach may generalize to other multi-group statistical estimation problems where shared low-dimensional structures exist.
It suggests that explicit modeling of subspace overlap can be a useful alternative to standard regularization in high-dimensional finance applications.
Future work could test the method on intraday data or international markets to check robustness.

Load-bearing premise

The premise that relatedness among principal temporal subspaces across sectors exists and can be learned from data in a way that genuinely improves estimation without adding bias or causing overfitting.

What would settle it

Observing no improvement or even worse performance in out-of-sample portfolio returns when using the multi-task method compared to estimating each sector's factor model independently.
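
A check of this kind could be set up roughly as follows. The sketch assumes a global minimum-variance rule as the portfolio construction step, which is an illustrative choice and not necessarily the rule used in the paper. The helpers `gmv_weights` and `out_of_sample_stats` are hypothetical names. The idea is to feed two covariance estimates (one from the multi-task fit, one from separate per-sector fits) through the same weights function and compare the held-out statistics.

```python
import numpy as np


def gmv_weights(cov):
    """Global minimum-variance weights: w = S^{-1} 1 / (1' S^{-1} 1)."""
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)
    return raw / raw.sum()


def out_of_sample_stats(weights, test_returns):
    """Mean, standard deviation and Sharpe ratio of held-out portfolio returns.

    test_returns : (T_test, N) array of excess returns (not annualized)
    """
    port = test_returns @ weights
    mean, std = port.mean(), port.std(ddof=1)
    return {"mean": mean, "std": std, "sharpe": mean / std}
```

Running both estimators over the same rolling windows and reporting the spread of these statistics across windows would show whether any difference exceeds sampling noise.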

Figures

Figures reproduced from arXiv: 2507.16433 by Qingliang Fan, Ruike Wu, Yanrong Yang.

Figure 2: Time plots of cumulative excess portfolio return in Chemicals and Allied Products.

Original abstract

Accurate transfer of information across multiple sectors to enhance model estimation is both significant and challenging in multi-sector portfolio optimization involving a large number of assets in different classes. Within the framework of factor modeling, we propose a novel data-adaptive multi-task learning methodology that quantifies and learns the relatedness among the principal temporal subspaces (spanned by factors) across multiple sectors under study. This approach not only improves the simultaneous estimation of multiple factor models but also enhances multi-sector portfolio optimization, which heavily depends on the accurate recovery of these factor models. Additionally, a novel and easy-to-implement algorithm, termed projection-penalized principal component analysis, is developed to accomplish the multi-task learning procedure. Diverse simulation designs and practical application on daily return data from Russell 3000 index demonstrate the advantages of multi-task learning methodology.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper puts forward a projection-penalized PCA for adaptive multi-task learning of cross-sector temporal subspaces in factor models, which could help with limited per-sector data in portfolio work, though the reported gains rest on thin validation details.

The core contribution is a projection-penalized PCA algorithm inside a data-adaptive multi-task setup that tries to measure and exploit relatedness between the leading temporal subspaces of factor models across sectors. This is positioned as a way to improve joint estimation when individual sectors have sparse or noisy returns, which then feeds into better multi-sector portfolio optimization. The abstract and simulations plus the Russell 3000 application are meant to show practical gains from this borrowing of strength. That framing is straightforward and targets a real constraint in institutional settings where assets are grouped by industry or style with uneven sample sizes. The algorithm itself looks like a concrete technical step rather than a re-labeling of existing penalties.

On the positive side, the emphasis on learning subspace overlap directly from the data avoids assuming a fixed hierarchy or complete pooling, which is a reasonable direction for factor work in finance. The stress-test point about possible bias when sectors differ in volatility or when true overlap is only moderate is worth watching, but the abstract does not supply enough on how the penalty term behaves under those conditions or whether the recovered factors stay stable. The reported advantages in simulations and real data are stated without error bars, explicit baseline tables, or checks for post-hoc tuning, so it is hard to judge how much the method actually moves the needle versus standard separate estimation or simpler shrinkage.

This paper is mainly for researchers in financial econometrics who already work with multi-factor models and want tools for grouped assets. A practitioner building broad-index portfolios might skim it for the idea, but would need the full empirical section to decide on implementation. It is coherent enough on its own terms to go to a serious referee, with the main request being tighter evidence on robustness and comparisons. I would recommend sending it out for review rather than desk rejection.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a novel data-adaptive multi-task learning methodology for multi-sector portfolio optimization. Within a factor modeling framework, it introduces projection-penalized principal component analysis to quantify and learn relatedness among principal temporal subspaces spanned by factors across sectors. The approach is claimed to improve simultaneous estimation of multiple factor models and thereby enhance multi-sector portfolio optimization. The method is evaluated via diverse simulation designs and an application to daily returns from the Russell 3000 index.

Significance. If the central claims are substantiated, the work could contribute to financial statistics by offering an adaptive mechanism for information transfer across sectors in high-dimensional factor models, potentially yielding more accurate covariance estimates for portfolio construction. The development of a new, easy-to-implement algorithm for this multi-task setting is a practical element worth noting, though the absence of detailed validation metrics in the abstract limits immediate assessment of its impact.

major comments (2)

Abstract and empirical sections: the reported advantages in simulations and Russell 3000 data are presented without details on validation metrics, error bars, baseline comparisons, or safeguards against post-hoc choices, which prevents verification of the claimed improvements in factor recovery and portfolio performance.
Method section on projection-penalized PCA: the central premise that the data-adaptive penalty reliably quantifies subspace relatedness without introducing bias is load-bearing, yet the manuscript does not analyze or simulate cases where sectors differ in volatility or factor strength; this risks the penalty pulling estimates toward an artificial common subspace and distorting leading eigenvectors used in covariance construction.

minor comments (1)

Clarify notation for the penalty term and its tuning in the algorithm description to improve reproducibility.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment point by point below and outline the revisions we will make to strengthen the presentation and analysis.

Referee: Abstract and empirical sections: the reported advantages in simulations and Russell 3000 data are presented without details on validation metrics, error bars, baseline comparisons, or safeguards against post-hoc choices, which prevents verification of the claimed improvements in factor recovery and portfolio performance.

Authors: We agree that more explicit reporting of metrics and safeguards would improve verifiability. In the revised manuscript we will expand the empirical sections to report specific validation metrics (e.g., mean squared error for factor loading recovery, out-of-sample Sharpe ratios and turnover for portfolios), include error bars or standard deviations across repeated simulation runs and data splits, list all baseline methods with their implementation details, and describe the cross-validation procedure used for hyperparameter selection. The abstract will be updated to reference these key quantitative improvements.

revision: yes

Referee: Method section on projection-penalized PCA: the central premise that the data-adaptive penalty reliably quantifies subspace relatedness without introducing bias is load-bearing, yet the manuscript does not analyze or simulate cases where sectors differ in volatility or factor strength; this risks the penalty pulling estimates toward an artificial common subspace and distorting leading eigenvectors used in covariance construction.

Authors: The penalty is constructed to be data-adaptive, with its magnitude governed by the empirical alignment of the principal temporal subspaces; our current simulation designs already vary factor strengths across sectors. To directly examine the referee's concern regarding heterogeneous volatilities, we will add a new simulation scenario in the revision in which sector-specific volatilities and factor strengths are deliberately mismatched. We will report the resulting estimated penalty values, subspace overlap measures, and eigenvector recovery errors relative to ground truth, thereby demonstrating that the procedure does not force an artificial common subspace when the data indicate otherwise.

revision: yes
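
The kind of mismatched-scale scenario discussed in this exchange could be prototyped along the following lines. This is an illustrative sketch only. The generator `simulate_sectors`, its `overlap` mixing parameter, and the error measure `subspace_error` are all hypothetical and are not taken from the paper's simulation designs. Each sector gets its own idiosyncratic volatility and loading scale, while `overlap` controls how much of the temporal factor subspace is shared.

```python
import numpy as np


def simulate_sectors(T, sizes, k, noise_sd, strength, overlap=1.0, seed=0):
    """Hypothetical data generator for sectors with mismatched scales.

    sizes    : number of assets per sector
    noise_sd : idiosyncratic volatility per sector
    strength : scale of the factor loadings per sector
    overlap  : in [0, 1]; 1 means every sector shares one temporal subspace
    Returns (returns, true_bases), both lists of arrays with T rows.
    """
    rng = np.random.default_rng(seed)
    shared = rng.standard_normal((T, k))
    returns, bases = [], []
    for n, sd, s in zip(sizes, noise_sd, strength):
        own = rng.standard_normal((T, k))
        f = overlap * shared + (1.0 - overlap) * own     # (T, k) factors
        b = s * rng.standard_normal((n, k))              # (n, k) loadings
        returns.append(f @ b.T + sd * rng.standard_normal((T, n)))
        bases.append(f)
    return returns, bases


def subspace_error(est, truth):
    """Frobenius distance between the projectors onto two column spaces."""
    qe, _ = np.linalg.qr(est)
    qt, _ = np.linalg.qr(truth)
    return np.linalg.norm(qe @ qe.T - qt @ qt.T, "fro")
```

Sweeping `overlap` from 1 toward 0 while keeping `noise_sd` and `strength` unequal, and recording `subspace_error` for each estimator, would indicate whether a penalized fit is being pulled toward a common subspace that the data do not support.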

Circularity Check

0 steps flagged

New projection-penalized PCA algorithm adds independent adaptive penalty for subspace relatedness

The paper proposes a novel data-adaptive multi-task learning method via a new algorithm (projection-penalized PCA) to quantify relatedness among principal temporal subspaces across sectors. This is presented as an explicit algorithmic contribution rather than a reduction of the target factor recovery or portfolio optimization result to a fitted parameter defined by the same quantity. No self-definitional loops, fitted-input predictions, or load-bearing self-citations are identifiable from the provided description; the derivation chain relies on the new penalty term and is validated through simulations and Russell 3000 data, keeping circularity low and non-load-bearing.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on standard factor-model assumptions in finance and the premise that subspace relatedness is learnable from data; no new free parameters or invented entities are explicitly introduced in the abstract.

axioms (1)

Domain assumption: Factor models provide a useful representation of asset returns across sectors.
The entire methodology is developed within the framework of factor modeling.
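
For readers less familiar with why factor recovery matters for allocation, the sketch below shows a standard PCA-based factor covariance estimate: a rank-k common component plus a diagonal residual. It is a generic textbook construction under the assumption of demeaned returns, not the estimator used in the paper, and `factor_covariance` is a hypothetical name.

```python
import numpy as np


def factor_covariance(returns, k):
    """PCA-based covariance estimate: low-rank factor part + diagonal residual.

    returns : (T, N) array of demeaned returns
    k       : number of factors to keep
    """
    T = returns.shape[0]
    sample = returns.T @ returns / T                 # (N, N) sample covariance
    vals, vecs = np.linalg.eigh(sample)              # ascending eigenvalues
    vals, vecs = vals[::-1][:k], vecs[:, ::-1][:, :k]
    low_rank = (vecs * vals) @ vecs.T                # common-factor component
    resid_var = np.clip(np.diag(sample - low_rank), 1e-12, None)
    return low_rank + np.diag(resid_var)
```

Because the leading eigenvectors determine the low-rank part, any error in the estimated factor subspace carries through to the covariance matrix and from there to the portfolio weights.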

pith-pipeline@v0.9.0 · 5661 in / 1340 out tokens · 42572 ms · 2026-05-19T03:33:02.605097+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage (projection-penalized factor model): min L = Σ_m (1/T) Σ_t ||r_t^(m) − B^(m)F_t^(m)||² + (λ/T) Σ_{m<m′} ||P^(m) − P^(m′)||_F²

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage (Definition 1): relatedness φ_i = Σ_{j≠i} ||P^(i) − P^(j)||
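
As a small illustration of the relatedness quantity φ_i = Σ_{j≠i} ||P^(i) − P^(j)||, the sketch below computes it from a list of basis matrices. The passage does not say which norm is meant, so the Frobenius norm is assumed here. The function names `projection` and `relatedness` are hypothetical.

```python
import numpy as np


def projection(basis):
    """Orthogonal projector onto the column space of a (T, K) basis."""
    q, _ = np.linalg.qr(basis)          # orthonormalize the columns
    return q @ q.T


def relatedness(bases):
    """phi_i = sum over j != i of ||P_i - P_j||_F (Frobenius norm assumed)."""
    projs = [projection(b) for b in bases]
    return np.array([
        sum(np.linalg.norm(p_i - p_j, "fro") for j, p_j in enumerate(projs) if j != i)
        for i, p_i in enumerate(projs)
    ])
```

A sector whose subspace coincides with the others gets a small φ_i, while a sector whose subspace is far from the rest gets a large one. For two one-dimensional subspaces that are orthogonal, the Frobenius distance between their projectors is √2.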

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
