Which Portfolios? The Construction Dependence of Factor Model Performance

Useong Shin

arXiv: 2606.19550 · v1 · pith:HYNMFJMEnew · submitted 2026-06-17 · 💱 q-fin.GN · q-fin.PR

Pith reviewed 2026-06-26 18:10 UTC · model grok-4.3

keywords: factor models · test assets · portfolio construction · asset pricing · pricing errors · random portfolios · model evaluation · CRSP

The pith

Factor model performance depends on how test portfolios are constructed.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that rankings among factor models such as FF3, FF5, FF6, and q5 change when the same models are tested on random portfolios built from the CRSP stock universe under different rules for stock selection, initial weighting, holding periods, and rebalancing. Buy-and-hold construction favors FF5 and FF6, while daily constant-weighting favors FF3, which is also the most stable model across designs. q5 achieves the highest maximum Sharpe ratio in spanning tests yet produces larger and more construction-sensitive pricing errors on these random portfolios. The shifts arise because each construction applies its own weighting to the vector of pricing errors produced by each model. This matters because asset pricing tests routinely rely on chosen test assets, so the apparent success of any model is partly a joint product of model and portfolio design.
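The two construction rules contrasted above can be illustrated with a minimal sketch. It is not the paper's procedure: the function names, the `fraction` parameter, and the equal initial weighting are assumptions made for illustration, and the summary does not give the paper's exact selection or weighting rules.

```python
import random


def draw_random_portfolio(n_stocks, fraction, seed=0):
    """Pick a random subset of stock indices with equal initial weights.

    Hypothetical helper: `fraction` is the share of the universe to hold.
    """
    rng = random.Random(seed)
    k = max(1, round(n_stocks * fraction))
    picks = sorted(rng.sample(range(n_stocks), k))
    return picks, [1.0 / k] * k


def constant_weight_returns(returns, weights):
    """Portfolio return when weights are reset to target every period."""
    return [sum(w * r for w, r in zip(weights, row)) for row in returns]


def buy_and_hold_returns(returns, weights):
    """Portfolio return when weights are set once and then left to drift."""
    w = list(weights)
    out = []
    for row in returns:
        port_ret = sum(wi * ri for wi, ri in zip(w, row))
        out.append(port_ret)
        # Each position grows by its own return; dividing by the portfolio's
        # gross return keeps the weights summing to one.
        w = [wi * (1.0 + ri) / (1.0 + port_ret) for wi, ri in zip(w, row)]
    return out


if __name__ == "__main__":
    # Two stocks, two periods: stock 0 earns 10% each period, stock 1 earns 0%.
    rets = [[0.10, 0.0], [0.10, 0.0]]
    print(constant_weight_returns(rets, [0.5, 0.5]))
    print(buy_and_hold_returns(rets, [0.5, 0.5]))
```

In the toy example both designs earn 5% in the first period, but buy-and-hold earns slightly more in the second because the weight on the rising stock has drifted above one half. The same stock returns therefore produce different portfolio return series, which is the raw material for the ranking shifts described above.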

Core claim

Factor-model performance depends not only on the model but also on how test assets are constructed. We form characteristic-unsorted random portfolios from a broad CRSP universe and vary stock selection, initial weighting, holding, and rebalancing. Rankings shift materially: buy-and-hold favors FF5 and FF6, whereas daily constant-weighting favors FF3, the most stable model across designs. Although q5 attains the highest maximum Sharpe ratio in factor-spanning tests, it leaves comparatively large and construction-sensitive pricing errors on random portfolios. These results reflect construction-specific weighting of each model's pricing-error vector. Test-asset construction, including dynamic w

What carries the argument

Construction-specific weighting of each model's pricing-error vector, which determines how a given portfolio design emphasizes or de-emphasizes different pricing mistakes across models.
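One way to make this mechanism concrete, offered as a sketch rather than the paper's own derivation: let $\alpha_i^{(m)}$ be stock $i$'s time-series regression intercept under model $m$, and let a portfolio hold fixed weights $w_i$ that are reset every period, as in constant-weighting. Because regression intercepts are linear in the dependent variable, the portfolio's pricing error is the weighted sum of the stock-level errors. The symbols $w$, $\alpha^{(m)}$, and $N$ are notation assumed here for illustration.

```latex
\alpha_p^{(m)} \;=\; \sum_{i=1}^{N} w_i \,\alpha_i^{(m)} \;=\; w^{\top} \alpha^{(m)}
```

Two designs with different $w$ read the same vector $\alpha^{(m)}$ differently, so a model whose large errors sit where one design places little weight can rank well under that design and poorly under another. Under buy-and-hold the weights drift with realized returns, so the fixed-weight identity no longer holds exactly.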

If this is right

Buy-and-hold portfolios favor FF5 and FF6.
Daily constant-weighting portfolios favor FF3, the most stable model across designs.
q5 attains the highest maximum Sharpe ratio in factor-spanning tests yet produces comparatively large and construction-sensitive pricing errors.
Test-asset construction including dynamic weight management functions as a design choice that affects model evaluation conclusions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Standard characteristic-sorted test assets may embed their own construction biases that interact with the same pricing-error weighting mechanism.
Model comparisons would benefit from systematic robustness checks across multiple portfolio constructions rather than a single default design.
The interaction between dynamic rebalancing rules and factor exposures could be examined directly by holding construction fixed while varying only rebalancing frequency.

Load-bearing premise

That varying stock selection, initial weighting, holding, and rebalancing on random portfolios from the CRSP universe sufficiently isolates construction dependence without other unmodeled biases or data artifacts affecting the observed ranking shifts.

What would settle it

Finding that model performance rankings remain identical across all tested variations in stock selection, weighting, holding, and rebalancing rules would falsify the claim of construction dependence.

Figures

Figures reproduced from arXiv: 2606.19550 by Useong Shin.

**Figure 6.1.** Realized and Model-Implied Mean Excess Returns: Buy and Hold, UNIF, 5%

**Figure 6.2.** Realized and Model-Implied Mean Excess Returns: Constant Weight, UNIF, 5%

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Factor model rankings shift with test asset construction rules, shown via random CRSP portfolios varying selection, weighting, and rebalancing.

The paper's core finding is that which factor model appears strongest depends on how the test portfolios are put together. Using unsorted random portfolios from CRSP, the author varies stock selection, initial weights, holding periods, and rebalancing rules. Buy-and-hold versions make FF5 and FF6 look better, while daily constant-weighting favors FF3, the most stable model across designs. q5 leads in factor-spanning Sharpe ratios but shows larger and more variable pricing errors across these setups. The paper attributes the differences to how each construction weights the models' pricing-error vectors.

This is a straightforward empirical extension of existing factor-testing work. It makes explicit that construction is not neutral and can alter conclusions about model performance. The design is direct and matches the claim without obvious internal contradictions.

The evidence is empirical rather than derived, so the strength rests on whether the reported shifts hold under the exact procedures used. From the description, the central pattern appears consistent. A minor soft spot is that the abstract leaves open how sensitive the shifts are to alternative error measures or universe restrictions, though the stress test finds no load-bearing flaw.

This work is mainly for researchers who run or evaluate factor model comparisons in asset pricing. Anyone who treats a single portfolio construction as the default will see the practical implication. It is worth sending to peer review so the methods and robustness checks can be examined in detail.

Referee Report

2 major / 2 minor

Summary. The paper claims that factor-model performance depends on test-asset construction in addition to the model itself. Using characteristic-unsorted random portfolios drawn from the CRSP universe, the authors vary stock selection, initial weighting, holding periods, and rebalancing rules. They report material shifts in model rankings (buy-and-hold favors FF5/FF6 while daily constant-weighting favors FF3) and note that q5 attains the highest maximum Sharpe ratio in factor-spanning tests yet leaves comparatively large and construction-sensitive pricing errors. The results are interpreted as evidence that different constructions apply different weights to each model's pricing-error vector, making construction a design choice in model evaluation.

Significance. If the empirical patterns are robust, the finding would underscore that test-asset construction choices materially affect inferences about relative model performance. This would add a practical caution to the asset-pricing literature on factor-model horse races and pricing-error diagnostics, emphasizing that rankings are not invariant to portfolio-formation mechanics.

major comments (2)

[Empirical design and results sections] The central empirical claim rests on the assertion that varying stock selection, weighting, holding, and rebalancing on random CRSP portfolios isolates construction dependence. The manuscript must supply explicit robustness checks (alternative exclusion rules, winsorization thresholds, and bootstrap or analytical standard errors on the reported ranking shifts) to rule out data-handling artifacts or unmodeled selection biases; without these, the observed shifts cannot be confidently attributed to construction alone.
[Factor-spanning and pricing-error results] The abstract states that q5 leaves 'comparatively large and construction-sensitive pricing errors' while attaining the highest maximum Sharpe ratio. The paper should report the precise pricing-error metric (e.g., mean absolute alpha, cross-sectional R², or GRS statistic) and the associated standard errors or p-values for each construction; otherwise the contrast with the Sharpe-ratio result remains unquantified.
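The pricing-error metrics named in the second major comment can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the function name and array layout are hypothetical, and the GRS statistic uses the standard finite-sample F form of Gibbons, Ross, and Shanken (1989) with maximum-likelihood covariance estimates. The summary does not say which metric the paper actually reports.

```python
import numpy as np


def alphas_and_grs(excess_returns, factors):
    """Time-series regressions of N test assets on K factors.

    excess_returns: (T, N) array of test-asset excess returns.
    factors:        (T, K) array of factor returns.
    Returns (alphas, mean_abs_alpha, grs_stat).
    """
    R = np.asarray(excess_returns, dtype=float)
    F = np.asarray(factors, dtype=float)
    T, N = R.shape
    K = F.shape[1]

    # OLS with an intercept; row 0 of coef holds the N alphas.
    X = np.column_stack([np.ones(T), F])
    coef, *_ = np.linalg.lstsq(X, R, rcond=None)
    alphas = coef[0]
    resid = R - X @ coef

    # Maximum-likelihood (divide by T) covariance estimates.
    sigma = resid.T @ resid / T
    mu = F.mean(axis=0)
    centered = F - mu
    omega = centered.T @ centered / T

    quad_alpha = alphas @ np.linalg.solve(sigma, alphas)
    quad_factor = mu @ np.linalg.solve(omega, mu)

    # Under the null of zero alphas and normal errors: F(N, T - N - K).
    grs = (T - N - K) / N * quad_alpha / (1.0 + quad_factor)
    return alphas, float(np.mean(np.abs(alphas))), float(grs)
```

Running this once per model and per construction would give the kind of table the referee asks for. Mean absolute alpha weights all test assets equally, while the GRS statistic weights alphas by the inverse residual covariance, so the two can rank models differently on the same portfolios.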

minor comments (2)

[Data section] Clarify the exact CRSP sample period, number of random portfolios drawn, and rebalancing frequency definitions in the main text or a dedicated data appendix.
[Results] Add a table or figure that directly juxtaposes the model rankings (with confidence intervals) across the main construction variants to make the 'material shifts' visually and quantitatively transparent.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight important areas for strengthening the empirical evidence. We address each major comment below and outline the revisions we will make.

Referee: [Empirical design and results sections] The central empirical claim rests on the assertion that varying stock selection, weighting, holding, and rebalancing on random CRSP portfolios isolates construction dependence. The manuscript must supply explicit robustness checks (alternative exclusion rules, winsorization thresholds, and bootstrap or analytical standard errors on the reported ranking shifts) to rule out data-handling artifacts or unmodeled selection biases; without these, the observed shifts cannot be confidently attributed to construction alone.

Authors: We agree that explicit robustness checks are needed to isolate construction effects from potential data artifacts. In the revised manuscript we will add results using alternative exclusion rules (e.g., different minimum-price and size thresholds), varied winsorization thresholds, and bootstrap standard errors on the reported differences in model rankings across constructions. These additions will directly address concerns about selection biases and data-handling artifacts. (Revision: yes)
Referee: [Factor-spanning and pricing-error results] The abstract states that q5 leaves 'comparatively large and construction-sensitive pricing errors' while attaining the highest maximum Sharpe ratio. The paper should report the precise pricing-error metric (e.g., mean absolute alpha, cross-sectional R², or GRS statistic) and the associated standard errors or p-values for each construction; otherwise the contrast with the Sharpe-ratio result remains unquantified.

Authors: We accept that the abstract's reference to pricing errors requires greater precision. The revised manuscript will explicitly identify the pricing-error metric used, report the associated standard errors or p-values for each construction, and thereby quantify the contrast with the maximum Sharpe ratio results. (Revision: yes)
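The bootstrap standard errors promised in the first response could take a form like the sketch below. It is an assumption-laden illustration, not the authors' method: the function names are hypothetical, the comparison metric is mean absolute alpha, and time periods are resampled independently, which ignores serial dependence (a block bootstrap would be the usual alternative).

```python
import numpy as np


def mean_abs_alpha(R, F):
    """Mean absolute time-series intercept of the columns of R on factors F."""
    X = np.column_stack([np.ones(len(F)), F])
    coef, *_ = np.linalg.lstsq(X, R, rcond=None)
    return float(np.mean(np.abs(coef[0])))


def bootstrap_maa_gap(R, F_a, F_b, n_boot=500, seed=0):
    """Point estimate and bootstrap standard error of MAA(model a) - MAA(model b).

    R:   (T, N) test-asset excess returns for one construction.
    F_a: (T, K_a) factors of model a; F_b: (T, K_b) factors of model b.
    """
    R = np.asarray(R, dtype=float)
    F_a = np.asarray(F_a, dtype=float)
    F_b = np.asarray(F_b, dtype=float)
    rng = np.random.default_rng(seed)
    T = R.shape[0]

    gaps = np.empty(n_boot)
    for b in range(n_boot):
        # Resample whole time periods so returns and factors stay aligned.
        idx = rng.integers(0, T, size=T)
        gaps[b] = mean_abs_alpha(R[idx], F_a[idx]) - mean_abs_alpha(R[idx], F_b[idx])

    point = mean_abs_alpha(R, F_a) - mean_abs_alpha(R, F_b)
    return point, float(gaps.std(ddof=1))
```

Applying the function to each construction's portfolios gives a gap and standard error per design. A ranking shift between two designs is more convincing when the gap changes sign by more than a couple of standard errors.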

Circularity Check

0 steps flagged

No significant circularity

The paper is a purely empirical study that constructs random characteristic-unsorted portfolios from CRSP, varies stock selection, weighting, holding, and rebalancing rules, and directly compares the resulting performance rankings across factor models (FF3, FF5, FF6, q5). No equations, derivations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided abstract or description. The central claim, that rankings are construction-dependent, is established by the observed shifts themselves rather than by any reduction to prior inputs or definitions. This matches the default expectation of a non-circular empirical paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Empirical study relying on standard factor models (FF3/5/6, q5) and CRSP data; no new free parameters, axioms, or invented entities are introduced or required for the central claim.

pith-pipeline@v0.9.1-grok · 5631 in / 1152 out tokens · 38715 ms · 2026-06-26T18:10:05.635561+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Anatomy of the Market: A Body-Tail Test of Factor Models
q-fin.GN 2026-06 unverdicted novelty 7.0

Body-tail decomposition of the market portfolio shows that q5 alone produces offsetting leg alphas and falls below its market baseline despite strong spanning performance.

Reference graph

Works this paper leans on

22 extracted references · 18 canonical work pages · cited by 1 Pith paper

[1] Gibbons, M. R., Ross, S. A., & Shanken, J. (1989). A test of the efficiency of a given portfolio. Econometrica, 57(5), 1121–1152. https://www.jstor.org/stable/1913625
[2] Lo, A. W., & MacKinlay, A. C. (1990). Data-snooping biases in tests of financial asset pricing models. The Review of Financial Studies, 3(3), 431–467. https://doi.org/10.1093/rfs/3.3.431
[3] Hansen, L. P., & Jagannathan, R. (1997). Assessing specification errors in stochastic discount factor models. The Journal of Finance, 52(2), 557–590. https://doi.org/10.1111/j.1540-6261.1997.tb04813.x
[4] Cochrane, J. H. (2005). Asset Pricing (Revised ed.). Princeton University Press. https://www.johnhcochrane.com/asset-pricing
[5] Lewellen, J., Nagel, S., & Shanken, J. (2010). A skeptical appraisal of asset-pricing tests. Journal of Financial Economics, 96(2), 175–194. https://doi.org/10.1016/j.jfineco.2009.09.001
[6] Barillas, F., & Shanken, J. (2017). Which alpha? The Review of Financial Studies, 30(4), 1316–1338. https://doi.org/10.1093/rfs/hhw101
[7] Barillas, F., & Shanken, J. (2018). Comparing asset pricing models. The Journal of Finance, 73(2), 715–754. https://doi.org/10.1111/jofi.12607
[8] Kozak, S., Nagel, S., & Santosh, S. (2018). Interpreting factor models. The Journal of Finance, 73(3), 1183–1223. https://doi.org/10.1111/jofi.12612
[9] Giglio, S., Xiu, D., & Zhang, D. (2025). Test assets and weak factors. The Journal of Finance, 80(1), 259–319. https://doi.org/10.1111/jofi.13415
[10] Jegadeesh, N., & Titman, S. (1993). Returns to buying winners and selling losers: Implications for stock market efficiency. The Journal of Finance, 48(1), 65–91. https://doi.org/10.1111/j.1540-6261.1993.tb04702.x
[11] Carhart, M. M. (1997). On persistence in mutual fund performance. The Journal of Finance, 52(1), 57–82. https://doi.org/10.1111/j.1540-6261.1997.tb03808.x
[12] Fama, E. F., & French, K. R. (1992). The cross-section of expected stock returns. The Journal of Finance, 47(2), 427–465. https://doi.org/10.1111/j.1540-6261.1992.tb04398.x
[13] Fama, E. F., & French, K. R. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33(1), 3–56. https://doi.org/10.1016/0304-405X(93)90023-5
[14] Fama, E. F., & French, K. R. (2015). A five-factor asset pricing model. Journal of Financial Economics, 116(1), 1–22. https://doi.org/10.1016/j.jfineco.2014.10.010
[15] Fama, E. F., & French, K. R. (2018). Choosing factors. Journal of Financial Economics, 128(2), 234–252. https://doi.org/10.1016/j.jfineco.2018.02.012
[16] Hou, K., Xue, C., & Zhang, L. (2015). Digesting anomalies: An investment approach. The Review of Financial Studies, 28(3), 650–705. https://doi.org/10.1093/rfs/hhu068
[17] Hou, K., Mo, H., Xue, C., & Zhang, L. (2019). Which factors? Review of Finance, 23(1), 1–35. https://doi.org/10.1093/rof/rfy032
[18] Hou, K., Xue, C., & Zhang, L. (2020). Replicating anomalies. The Review of Financial Studies, 33(5), 2019–2133. https://doi.org/10.1093/rfs/hhy131
[19] Hou, K., Mo, H., Xue, C., & Zhang, L. (2021). An augmented q-factor model with expected growth. Review of Finance, 25(1), 1–41. https://doi.org/10.1093/rof/rfaa004
[20] Hou, K., Mo, H., Xue, C., & Zhang, L. (2024). The economics of security analysis. Management Science, 70(1), 164–186. https://doi.org/10.1287/mnsc.2022.4640
[21] French, K. R. (2026). Kenneth R. French Data Library [Data set]. Accessed May 5, 2026. https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html
[22] Global-q.org. (2026). Factors and testing portfolios [Data set]. Accessed May 5, 2026. https://global-q.org/factors.html
