arxiv: 2606.01650 · v2 · pith:3OOH2IM2new · submitted 2026-06-01 · 💱 q-fin.PM · q-fin.TR · stat.AP · stat.ME

Post Selection Estimation of Sharpe Ratios

Pith reviewed 2026-06-28 11:51 UTC · model grok-4.3

classification 💱 q-fin.PM · q-fin.TR · stat.AP · stat.ME
keywords Sharpe ratio · post-selection estimation · James-Stein estimator · empirical Bayes · simulation study · bias correction · portfolio selection · maximum Sharpe ratio

The pith

In the paper's simulations, James-Stein shrinkage gives the lowest bias and root mean square error, across most realistic parameter settings, when estimating the true Sharpe ratio of the asset with the highest observed sample Sharpe ratio.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Selecting an asset because it posted the highest Sharpe ratio in a finite sample inflates the reported value relative to its true population ratio. The paper examines several bias-correction methods, including polyhedral-lemma estimators, James-Stein shrinkage, debiasing the expected maximum, thresholding, and empirical Bayes procedures. Simulations that vary sample size, number of assets, spread of population ratios, and correlation structure show that James-Stein shrinkage produces the smallest bias and error across most realistic parameter combinations, with the GMLEB empirical Bayes estimator a close second. The same ordering holds when the estimators are used to rank the outputs of different selection processes.
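To make the selection effect concrete, here is a minimal sketch that simulates assets sharing one true Sharpe ratio and measures how far the largest sample Sharpe ratio sits above that truth. The function names, the defaults of 20 assets and 100 observations, and the i.i.d. Gaussian unit-volatility returns are illustrative assumptions, not the paper's configuration.

```python
import math
import random


def sample_sharpe(returns):
    """Per-period sample Sharpe ratio: mean divided by sample standard deviation."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var)


def max_sample_sharpe(n_obs, n_assets, true_sharpe, rng):
    """Largest sample Sharpe ratio among independent Gaussian assets that all
    share the same true Sharpe ratio (unit volatility is an assumption)."""
    return max(
        sample_sharpe([rng.gauss(true_sharpe, 1.0) for _ in range(n_obs)])
        for _ in range(n_assets)
    )


def selection_bias(n_obs=100, n_assets=20, true_sharpe=0.0, n_reps=200, seed=1):
    """Monte Carlo average of (selected sample Sharpe ratio - true Sharpe ratio)."""
    rng = random.Random(seed)
    gaps = [
        max_sample_sharpe(n_obs, n_assets, true_sharpe, rng) - true_sharpe
        for _ in range(n_reps)
    ]
    return sum(gaps) / n_reps


if __name__ == "__main__":
    print("one asset, no selection:", round(selection_bias(n_assets=1), 3))
    print("best of 20 assets:      ", round(selection_bias(n_assets=20), 3))
```

With a single asset there is no selection, so the average gap should be close to zero. With 20 assets the gap is positive, which is the upward bias the estimators in the paper try to remove.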

Core claim

We consider the problem of estimating the true Sharpe ratio of an asset selected for having the highest observed in-sample Sharpe ratio among many assets. We discuss estimators based on the polyhedral lemma, James Stein shrinkage, debiasing the expected maximum Sharpe ratio, thresholding and empirical Bayes. We test these estimators in simulations, computing bias and root mean square error across different values of sample size, number of assets, and spread and shape of population Sharpe ratios. We also compute rank correlation of the estimators against the underlying quantity, simulating how these estimators might be used to compare or rank the output of different teams which perform this selection process.

What carries the argument

James-Stein shrinkage estimator applied after selection of the maximum observed Sharpe ratio
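As a rough illustration of the idea, the sketch below applies a textbook positive-part James-Stein estimator that shrinks every sample Sharpe ratio toward the grand mean, then reads off the shrunken value for the asset with the highest sample ratio. It assumes independent assets and treats each sample Sharpe ratio as approximately normal with variance 1 / n_obs. The paper's exact variant is not given in this summary and may differ, and the function names are hypothetical.

```python
def james_stein(sample_sharpes, n_obs):
    """Positive-part James-Stein shrinkage toward the grand mean.

    Assumes each sample Sharpe ratio is roughly normal around its true value
    with variance 1 / n_obs, and that assets are independent.
    """
    p = len(sample_sharpes)
    if p < 4:
        raise ValueError("shrinking toward the grand mean needs at least 4 assets")
    sigma2 = 1.0 / n_obs
    grand_mean = sum(sample_sharpes) / p
    spread = sum((z - grand_mean) ** 2 for z in sample_sharpes)
    if spread == 0.0:
        return [grand_mean] * p
    # Shrink factor in [0, 1]: small spread relative to noise means heavy shrinkage.
    shrink = max(0.0, 1.0 - (p - 3) * sigma2 / spread)
    return [grand_mean + shrink * (z - grand_mean) for z in sample_sharpes]


def estimate_selected(sample_sharpes, n_obs):
    """Index of the asset with the highest sample Sharpe ratio and its shrunken estimate."""
    shrunk = james_stein(sample_sharpes, n_obs)
    best = max(range(len(sample_sharpes)), key=lambda i: sample_sharpes[i])
    return best, shrunk[best]


if __name__ == "__main__":
    observed = [0.0, 0.1, 0.2, 0.3, 0.4]
    index, estimate = estimate_selected(observed, n_obs=100)
    print(index, round(estimate, 4))  # prints: 4 0.36
```

In this toy input the grand mean is 0.2 and the shrink factor is 0.8, so the top asset's observed 0.4 is pulled back to 0.36.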

Load-bearing premise

The simulation designs, including the ranges of sample sizes, number of assets, and distributions of true Sharpe ratios, match the conditions under which these estimators would be used in practice.

What would settle it

A new simulation or real dataset in which the polyhedral-lemma estimator or a simple debiasing method records lower root mean square error than James-Stein across the same grid of sample sizes and asset counts would overturn the reported ranking.
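A check of this kind could be organised along the following lines. The sketch compares the naive estimate (the selected sample Sharpe ratio itself) with a textbook positive-part James-Stein estimate, reporting bias and RMSE for the selected asset. It draws sample Sharpe ratios directly from a normal approximation with variance 1 / n_obs rather than simulating returns, and the grid values and function names are hypothetical. Other estimators, such as a polyhedral-lemma one, would be added as further entries in the same error table.

```python
import math
import random


def james_stein_at_max(z, sigma2):
    """Positive-part James-Stein estimate (shrunk toward the grand mean) for the
    entry of z with the largest observed value; returns (index, estimate)."""
    p = len(z)
    grand_mean = sum(z) / p
    spread = sum((v - grand_mean) ** 2 for v in z)
    shrink = max(0.0, 1.0 - (p - 3) * sigma2 / spread) if spread > 0 else 0.0
    best = max(range(p), key=lambda i: z[i])
    return best, grand_mean + shrink * (z[best] - grand_mean)


def compare(n_obs, n_assets, mean_sharpe, sd_sharpe, n_reps=500, seed=7):
    """Bias and RMSE of the naive and James-Stein estimates of the selected
    asset's true Sharpe ratio, under a normal approximation to sample Sharpes."""
    rng = random.Random(seed)
    sigma2 = 1.0 / n_obs  # approximate sampling variance of a sample Sharpe ratio
    errors = {"naive": [], "james_stein": []}
    for _ in range(n_reps):
        truth = [rng.gauss(mean_sharpe, sd_sharpe) for _ in range(n_assets)]
        observed = [rng.gauss(t, math.sqrt(sigma2)) for t in truth]
        best, js_estimate = james_stein_at_max(observed, sigma2)
        errors["naive"].append(observed[best] - truth[best])
        errors["james_stein"].append(js_estimate - truth[best])
    return {
        name: {
            "bias": sum(e) / n_reps,
            "rmse": math.sqrt(sum(x * x for x in e) / n_reps),
        }
        for name, e in errors.items()
    }


if __name__ == "__main__":
    # Hypothetical grid; the paper's actual grid is not given in this summary.
    for n_obs in (100, 400):
        for n_assets in (10, 50):
            result = compare(n_obs, n_assets, mean_sharpe=0.05, sd_sharpe=0.05)
            print(n_obs, n_assets, result)
```

Running the same loop with an additional estimator and seeing it beat James-Stein on RMSE over the whole grid is the kind of evidence that would overturn the reported ranking.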

Figures

Figures reproduced from arXiv: 2606.01650 by Steven E. Pav.

Figure 1: The empirical biases of the tested estimators are shown versus ...
Figure 2: The empirical RMSE values of the tested estimators are shown versus ...
Figure 3: The empirical biases of the tested estimators are shown versus ...
Figure 4: The empirical RMSE values of the tested estimators are shown versus ...
Figure 5: The empirical RMSE values of the tested estimators are shown versus ...
Figure 6: Empirical CDFs of the regret of each method are plotted in these ...
Figure 7: The empirical biases of the tested estimators are shown versus the ...
Figure 8: The empirical RMSE values of the tested estimators are shown versus ...
Figure 9: Rank correlation coefficients of the various tested estimators are plot...
Figure 10: Rank correlation coefficients of the various tested estimators are ...
Figure 11: Rank correlation coefficients of the various tested estimators are ...
Figure 12: The empirical biases of the tested estimators are shown versus ...
Figure 13: The empirical RMSE values of the tested estimators are shown ...
Figure 14: Rank correlation coefficients of the various tested estimators are ...
Figure 15: Rank correlation coefficients of the various tested estimators are ...
Figure 16: Rank correlation coefficients of the various tested estimators are ...
Figure 17: Rank correlation coefficients of the various tested estimators are ...

Original abstract

We consider the problem of estimating the true Sharpe ratio of an asset selected for having the highest observed in-sample Sharpe ratio among many assets. We discuss estimators based on the polyhedral lemma, James Stein shrinkage, debiasing the expected maximum Sharpe ratio, thresholding and empirical Bayes. We test these estimators in simulations, computing bias and root mean square error across different values of sample size, number of assets, and spread and shape of population Sharpe ratios. We also compute rank correlation of the estimators against the underlying quantity, simulating how these estimators might be used to compare or rank the output of different teams which perform this selection process. We find that the James Stein estimator provides the best performance across many different realistic values of the relevant parameters, followed by the GMLEB estimator of Jiang and Zhang. These results are fairly robust to correlation of asset returns, with some caveats.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript addresses post-selection estimation of the true Sharpe ratio for the asset with the highest observed in-sample Sharpe ratio. It considers estimators based on the polyhedral lemma, James-Stein shrinkage, debiasing the expected maximum Sharpe ratio, thresholding, and empirical Bayes (including GMLEB). These are evaluated via Monte Carlo simulations for bias, RMSE, and rank correlation across variations in sample size n, number of assets p, spread and shape of population Sharpe ratios, and asset return correlations. The central finding is that the James-Stein estimator performs best across many parameter values, followed by the GMLEB estimator of Jiang and Zhang, with results described as fairly robust to correlations.

Significance. If the simulation-based ranking holds under realistic conditions, the work would supply practical tools for correcting selection-induced upward bias in reported Sharpe ratios, a frequent issue when ranking assets or teams in quantitative portfolio management. The inclusion of rank-correlation metrics for comparing selection processes across teams is a useful extension beyond point estimation.

major comments (2)
  1. [Simulation study] Simulation study (abstract and methods): The generative model is restricted to i.i.d. Gaussian returns or simple constant-correlation structures with stylized draws (uniform or normal) for population Sharpes. Real equity returns exhibit fat tails, volatility clustering, and factor-driven dependence; these features can shift the bias-variance trade-off of shrinkage estimators and potentially reverse the reported ranking of James-Stein versus GMLEB or polyhedral methods. No sensitivity checks under t-distributed or GARCH-type returns are described.
  2. [Simulation study] Simulation study (abstract): No details are supplied on the exact return-distribution assumptions, simulation protocol (e.g., number of Monte Carlo replications, exact correlation matrices tested), or verification that the implemented estimators match their theoretical derivations. This makes the performance claims (James-Stein best, GMLEB second) difficult to reproduce or assess for robustness.
minor comments (2)
  1. The abstract refers to the 'polyhedral lemma' without a specific citation or brief explanation of how it is applied to Sharpe-ratio post-selection.
  2. Notation for the population versus sample Sharpe ratios and for the selection indicator should be introduced explicitly before the simulation results are presented.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive report. We address each major comment below and indicate the changes we will make to the manuscript.

Point-by-point responses
  1. Referee: [Simulation study] Simulation study (abstract and methods): The generative model is restricted to i.i.d. Gaussian returns or simple constant-correlation structures with stylized draws (uniform or normal) for population Sharpes. Real equity returns exhibit fat tails, volatility clustering, and factor-driven dependence; these features can shift the bias-variance trade-off of shrinkage estimators and potentially reverse the reported ranking of James-Stein versus GMLEB or polyhedral methods. No sensitivity checks under t-distributed or GARCH-type returns are described.

    Authors: We agree that the simulation design is limited to i.i.d. Gaussian returns (or constant-correlation Gaussian) and stylized population Sharpe draws. This is a standard modeling choice that permits direct comparison with the theoretical derivations of the estimators, but it does not capture fat tails, volatility clustering, or factor structure. Consequently, we cannot rule out that the relative performance of James-Stein versus GMLEB or the polyhedral estimator could change under more realistic return processes. In the revision we will add an explicit limitations paragraph in the discussion section that states this caveat and identifies non-Gaussian robustness checks as an important direction for future work. We will not, however, expand the Monte Carlo study to include t-distributed or GARCH returns in the present revision. revision: partial

  2. Referee: [Simulation study] Simulation study (abstract): No details are supplied on the exact return-distribution assumptions, simulation protocol (e.g., number of Monte Carlo replications, exact correlation matrices tested), or verification that the implemented estimators match their theoretical derivations. This makes the performance claims (James-Stein best, GMLEB second) difficult to reproduce or assess for robustness.

    Authors: We apologize for the insufficient documentation. The manuscript will be revised to include a new subsection in the methods that fully specifies: (i) the exact distributional assumptions (i.i.d. normal returns with mean zero and unit volatility, population Sharpes drawn from the stated uniform or normal distributions), (ii) the Monte Carlo protocol (number of replications, random seeds, and parameter grid), (iii) the precise correlation matrices employed (identity and constant-correlation cases), and (iv) the verification steps confirming that each estimator was coded according to the cited theoretical references. These additions will make the simulation results reproducible from the revised text alone. revision: yes
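For readers who want a feel for the constant-correlation case mentioned in the responses above, the following minimal sketch generates Gaussian returns with unit volatility and a common pairwise correlation using a single shared factor. The function names, the seed, and the example values are hypothetical. This is one standard construction, not necessarily the one used in the paper.

```python
import math
import random


def constant_correlation_returns(n_obs, true_sharpes, rho, rng):
    """Gaussian returns with unit volatility, means equal to the true Sharpe
    ratios, and pairwise correlation rho (0 <= rho < 1), via one common factor."""
    a, b = math.sqrt(rho), math.sqrt(1.0 - rho)
    rows = []
    for _ in range(n_obs):
        common = rng.gauss(0.0, 1.0)  # shock shared by every asset this period
        rows.append([mu + a * common + b * rng.gauss(0.0, 1.0) for mu in true_sharpes])
    return rows  # rows[t][i] is the return of asset i in period t


def sample_correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run can be repeated
    rows = constant_correlation_returns(5000, [0.0, 0.1, 0.2], rho=0.5, rng=rng)
    first = [row[0] for row in rows]
    third = [row[2] for row in rows]
    print("sample correlation:", round(sample_correlation(first, third), 3))
```

Each return is the asset mean plus sqrt(rho) times the shared shock plus sqrt(1 - rho) times an idiosyncratic shock, so every asset has variance 1 and any two assets have covariance rho. Setting rho to 0 recovers the identity-correlation case.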

Circularity Check

0 steps flagged

No significant circularity; claims rest on independent Monte Carlo evaluation

Full rationale

The paper evaluates post-selection Sharpe ratio estimators (polyhedral, James-Stein, GMLEB, etc.) via Monte Carlo simulation, reporting bias, RMSE, and rank correlation against known population parameters. These simulations generate returns and population Sharpes independently of the estimators under test. No equation or procedure reduces a reported performance metric to a fitted input by construction, nor does any central claim rely on a self-citation chain, imported uniqueness theorem, or ansatz smuggled from prior work. The comparison to external benchmarks (Jiang & Zhang GMLEB, classic James-Stein) is independent of the present paper's fitted values.
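The three metrics named here can be computed from paired lists of estimates and known population values, as in the sketch below. Spearman's coefficient is used as one common rank correlation; this summary does not say which rank correlation the paper uses, and the function names are hypothetical.

```python
import math


def bias(estimates, truths):
    """Mean signed error of the estimates."""
    return sum(e - t for e, t in zip(estimates, truths)) / len(truths)


def rmse(estimates, truths):
    """Root mean square error of the estimates."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truths)) / len(truths))


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            result[order[k]] = average
        start = end + 1
    return result


def spearman(estimates, truths):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(estimates), ranks(truths)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    estimates = [0.5, 0.3, 0.4]  # e.g. one estimate per simulated team
    truths = [0.2, 0.1, 0.3]     # the known population values in the simulation
    print(round(bias(estimates, truths), 4))
    print(round(rmse(estimates, truths), 4))
    print(round(spearman(estimates, truths), 4))
```

Because the truths are generated before any estimator is applied, none of these metrics is determined by the estimator's own output, which is the independence the audit relies on.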

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no information on free parameters, axioms, or invented entities used in the derivations or simulations.

Reference graph

Works this paper leans on

24 extracted references · 8 canonical work pages · 1 internal anchor

  1. [1] David H Bailey, Jonathan M Borwein, Marcos López De Prado, and Qiji Jim Zhu. Pseudomathematics and financial charlatanism: The effects of backtest overfitting on out-of-sample performance. Notices of the AMS, 61(5):458–471, 2014. URL https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2308659

  2. [2] Carlo Bonferroni. Teoria statistica delle classi e calcolo delle probabilita. Pubblicazioni del R istituto superiore di scienze economiche e commerciali di firenze, 8:3–62, 1936

  3. [3] Paul H. Cootner, editor. The Random Character of Stock Market Prices. MIT Press, 1964. ISBN 9780262030090. URL https://books.google.com/books?id=gcuMygAACAAJ

  4. [4] Xinping Cui, Thorsten Dickhaus, Ying Ding, and Jason C. Hsu, editors. Handbook of Multiple Comparisons. Chapman and Hall/CRC, 2022. ISBN 9781032111551

  5. [5] Elizabeth D. Dolan and Jorge J. Moré. Benchmarking optimization software with performance profiles, 2002. URL https://arxiv.org/abs/cs/0102001

  6. [6] David L. Donoho and Iain M. Johnstone. Adapting to unknown smoothness via wavelet shrinkage. Journal of the American Statistical Association, 90(432):1200–1224, 1995. doi: 10.1080/01621459.1995.10476626. URL https://imjohnstone.su.domains/WEBLIST/1995/ausws.pdf

  7. [7] Bradley Efron. Tweedie's formula and selection bias. Journal of the American Statistical Association, 106(496):1602–1614, 2011. ISSN 01621459. doi: 10.1198/jasa.2011.tm11181. URL https://pmc.ncbi.nlm.nih.gov/articles/PMC3325056/

  8. [8] Peter Reinhard Hansen. A test for superior predictive ability. Journal of Business and Economic Statistics, 23(4), 2005. doi: 10.1198/073500105000000063. URL http://pubs.amstat.org/doi/abs/10.1198/073500105000000063

  9. [9] William James and Charles Stein. Estimation with quadratic loss. In Proceedings of the fourth Berkeley symposium on mathematical statistics and probability, pages 361–379. University of California Press, 1961

  10. [10] Wenhua Jiang and Cun-Hui Zhang. General maximum likelihood empirical Bayes estimation of normal means. The Annals of Statistics, 37(4):1647–1684, 2009. doi: 10.1214/08-AOS638. URL https://doi.org/10.1214/08-AOS638

  11. [11] Wenhua Jiang and Cun-Hui Zhang. General maximum likelihood empirical bayes estimation of normal means, 2009. URL https://arxiv.org/abs/0908.1709

  12. [12] Iain Johnstone and Bernard W. Silverman. EbayesThresh: R programs for empirical Bayes thresholding. Journal of Statistical Software, 12(8):1–38, 2005. doi: 10.18637/jss.v012.i08. URL https://www.jstatsoft.org/index.php/jss/article/view/v012i08

  13. [13] Iain M. Johnstone and Bernard W. Silverman. Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences. The Annals of Statistics, 32(4):1594–1649, 2004. doi: 10.1214/009053604000000030. URL https://projecteuclid.org/journals/annals-of-statistics/volume-32/issue-4/Needles-and-straw-in-haystacks--Empirical-Bayes-estimates-of/10...

  14. [14] Perry J. Kaufman. Commodity Trading Systems and Methods. Wiley, 1978. ISBN 9780471035695

  15. [15] Jason D. Lee, Dennis L. Sun, Yuekai Sun, and Jonathan E. Taylor. Exact post-selection inference, with application to the lasso, 2013. URL http://arxiv.org/abs/1311.6238. cite arxiv:1311.6238 Comment: Published at http://dx.doi.org/10.1214/15-AOS1371 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (h...

  16. [16] Giuseppe A. Paleologo. The Elements of Quantitative Investing. John Wiley and Sons, 2025. ISBN 9781394265459

  17. [17] Steven E. Pav. Conditional inference on the asset with maximum Sharpe ratio. Privately Published, 2019. URL http://arxiv.org/abs/1906.00573

  18. [18] Steven E. Pav. The Sharpe Ratio: Statistics and Applications. CRC Press,

  19. [19] Stephen Reid, Jonathan Taylor, and Robert Tibshirani. Post-selection point and interval estimation of signal sizes in Gaussian samples. Canadian Journal of Statistics, 45(2):128–148, 2017. URL https://arxiv.org/abs/1405.3340

  20. [20] William F. Sharpe. Mutual fund performance. Journal of Business, 39:119,

  21. [21] doi: 10.1086/294846. URL http://dx.doi.org/10.1086/294846

  22. [22] Mervyn J. Silvapulle and Pranab Kumar Sen. Constrained statistical inference: inequality, order, and shape restrictions. Wiley-Interscience, Hoboken, N.J., 2005. ISBN 0471208272. URL http://books.google.com/books?isbn=0471208272

  23. [23] Bernard W. Silverman, Ludger Evers, Kan Xu, Peter Carbonetto, and Matthew Stephens. EbayesThresh: Empirical Bayes Thresholding and Related Methods, 2017. URL https://CRAN.R-project.org/package=EbayesThresh. R package version 1.4-12

  24. [24] Halbert White. A reality check for data snooping. Econometrica, 68:1097–1127, 2000. doi: 10.1111/1468-0262.00152. URL https://www.ssc.wisc.edu/~bhansen/718/White2000.pdf

    A. AI Use Statement. We attempted to use AI in the preparation of this manuscript: • We asked gemini 3.0 for advice naming this paper. This resulted in several awful suggestions, which w...
    ... of” in the title, which we felt was better than the original, “on ...