Continuous Timing Signals for Growth-Defensive Style Allocation: Factor Attribution, Risk Matching, and Out-of-Sample Evidence

Zheli Xiong


A continuous smooth score from macro signals times allocation between growth and defensive style portfolios to improve Sharpe ratio and drawdown control over static benchmarks.

Reviewed by Pith at T0; open to challenge. T0 means a machine referee read the full paper against a public rubric.


T0 review · grok-4.3

2026-06-30 17:43 UTC pith:MKRDPW7H

load-bearing objection: The paper gives a workable continuous timing rule for growth-defensive style allocation that beats some benchmarks on risk-adjusted metrics in 2017-2026 backtests, but the 'selected' policy and missing implementation details leave the results open to selection bias.

arxiv 2605.20636 v2 pith:MKRDPW7H submitted 2026-05-20 q-fin.PM

classification q-fin.PM

keywords: growth defensive allocation, style timing, continuous signals, macro timing, factor attribution, portfolio allocation, smooth score, risk matching

verification ladder: T0 review · T1 audit · T2 compute · T3 formal · T4 reserved

The pith

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether known growth and defensive style exposures can be dynamically allocated using continuous macro-market timing signals instead of discrete regime rules. Factor attribution confirms the relative growth-defensive portfolio carries standard style betas with insignificant alpha, framing the task as style timing. The framework builds a smooth score from rate relief, SPY drawdown depth, high-VIX relief, and growth-crowding penalty; softplus smooths interactions, tanh maps the score to weights, and EWMA smooths realized positions. In the June 2017 to May 2026 window with 10 bp costs and 50 percent maximum tilt, the policy delivers 19.24 percent CAGR, Sharpe ratio 1.01, and maximum drawdown of -31.63 percent, beating 50/50 mixes, single-signal versions, SPY, and volatility-matched growth while falling short of pure growth in raw return. Walk-forward checks and post-2022 data add support for risk-adjusted value from continuous timing.
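The 10 bp cost assumption can be made concrete with a minimal sketch of net returns for a two-asset G/D portfolio. The function name `net_portfolio_returns`, the choice to charge costs on both legs of each rebalance, and the neglect of intraday weight drift are all assumptions for illustration; the paper's exact cost convention is not stated in the abstract.

```python
import numpy as np

def net_portfolio_returns(w_g, r_g, r_d, cost_bps=10.0):
    """Daily net returns of a G/D portfolio with proportional trading costs.

    w_g      : weight in G held over each day (D receives 1 - w_g)
    r_g, r_d : simple daily returns of the G and D baskets
    cost_bps : cost per unit of traded notional, in basis points (assumed)
    """
    w_g = np.asarray(w_g, dtype=float)
    r_g = np.asarray(r_g, dtype=float)
    r_d = np.asarray(r_d, dtype=float)

    gross = w_g * r_g + (1.0 - w_g) * r_d

    # Change in the G weight; the first day is assumed to carry no entry cost.
    delta_w = np.abs(np.diff(w_g, prepend=w_g[0]))

    # Assumption: both legs trade, so traded notional is |dw_G| + |dw_D| = 2|dw_G|.
    cost = 2.0 * delta_w * cost_bps / 1e4
    return gross - cost

# Hypothetical three-day example: the weight jumps from 50% to 100% G on day 3.
example = net_portfolio_returns([0.5, 0.5, 1.0], [0.01, 0.01, 0.01], [0.0, 0.0, 0.0])
```

Charging one leg only would halve the cost term, so this convention should be read as the more conservative of two plausible choices rather than the paper's own.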

Core claim

The G-D relative portfolio is a recognizable style portfolio whose market beta is 0.273, HML beta is -0.552, momentum beta is 0.117, and annualized alpha is 1.95 percent with Newey-West t-statistic 0.81. Replacing discrete rules with a continuous score that combines rate relief, SPY drawdown depth, high-VIX stress relief, and growth-crowding penalty, smoothed via softplus and tanh and EWMA, produces a 19.24 percent CAGR, 1.01 Sharpe ratio, and -31.63 percent maximum drawdown in the main 2017-2026 window under 10 bp costs and 50 percent active tilt. This beats 50/50 G/D, TNX-only, core-only, SPY, and volatility-matched 100 percent G benchmarks yet does not exceed 100 percent G or the best high-G static portfolios in raw CAGR.

What carries the argument

The continuous smooth score that combines rate relief, SPY drawdown depth, high-VIX stress relief, and growth-crowding penalty, with interactions smoothed by softplus functions, total score mapped to G/D weights by hyperbolic tangent, and realized weights smoothed by EWMA.
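The chain described here (softplus on interactions, tanh mapping to weights, EWMA on realized weights) can be sketched as follows. This is an illustration under stated assumptions, not the author's specification: the way the four signals are combined, the coefficient names in `coef`, the 50/50 base weight, the `scale` constant, and the EWMA decay `lam` are all hypothetical, since the review notes that the exact values and formulas are not given.

```python
import numpy as np

def softplus(x, beta=1.0):
    # Smooth approximation of max(x, 0); logaddexp avoids overflow for large x.
    return np.logaddexp(0.0, beta * np.asarray(x, dtype=float)) / beta

def smooth_score(rate_relief, drawdown_depth, vix_relief, crowding, coef):
    # Hypothetical composition: a linear rate term, a softplus-smoothed
    # drawdown x VIX-relief interaction, and a softplus-smoothed crowding penalty.
    interaction = softplus(drawdown_depth) * softplus(vix_relief)
    return (coef["rate"] * np.asarray(rate_relief, dtype=float)
            + coef["interaction"] * interaction
            - coef["crowding"] * softplus(crowding))

def target_weight_g(score, max_tilt=0.5, base=0.5, scale=1.0):
    # tanh bounds the active tilt in (-max_tilt, +max_tilt) around the base weight.
    # Assumption: the 50% maximum tilt is measured around a 50/50 G/D base.
    return base + max_tilt * np.tanh(np.asarray(score, dtype=float) / scale)

def ewma(x, lam=0.9):
    # Recursive exponentially weighted average; lam is a hypothetical decay.
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    out[0] = x[0]
    for t in range(1, len(x)):
        out[t] = lam * out[t - 1] + (1.0 - lam) * x[t]
    return out

# Hypothetical inputs, already standardized; coefficients are placeholders.
coef = {"rate": 1.0, "interaction": 0.5, "crowding": 0.5}
score = smooth_score([0.2, -0.1, 0.4], [0.0, 1.0, 0.5], [0.0, 0.5, 1.0], [1.0, 0.0, -1.0], coef)
weights = ewma(target_weight_g(score))
```

Because every step is smooth and bounded, small changes in any input move the G weight by small amounts, which is the property that separates this construction from if-then regime rules.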

Load-bearing premise

The specific combination of rate relief, SPY drawdown depth, high-VIX stress relief, and growth-crowding penalty, when smoothed with softplus and tanh functions and selected for the reported performance, will continue to deliver risk-adjusted value out of sample without being driven by in-sample optimization over the 2017-2026 window.

What would settle it

Performance of the smooth-score policy in a new period after May 2026 that fails to beat the same set of benchmarks on Sharpe ratio or drawdown reduction while preserving the factor loadings would falsify the central claim.


If this is right

The relative G-D portfolio behaves as a style portfolio with market beta 0.273, HML beta -0.552, momentum beta 0.117, and insignificant alpha rather than a new anomaly.
Continuous timing with 50 percent maximum tilt improves Sharpe ratio to 1.01 and limits maximum drawdown to -31.63 percent relative to 50/50 and single-signal mixes.
The approach maintains interpretability while allowing dynamic allocation that reduces drawdowns compared with volatility-matched pure growth.
Walk-forward and post-2022 validations confirm drawdown reduction and risk-adjusted value beyond the main sample.
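The three headline statistics in these points (CAGR, Sharpe ratio, maximum drawdown) can be computed from a daily return series as in the sketch below. The 252-day year, the zero default risk-free rate, and the function name `performance_summary` are assumptions; the paper's own annualization convention is not stated in the abstract.

```python
import numpy as np

def performance_summary(daily_returns, periods_per_year=252, rf_daily=0.0):
    """CAGR, annualized Sharpe ratio, and maximum drawdown from simple returns."""
    r = np.asarray(daily_returns, dtype=float)
    equity = np.cumprod(1.0 + r)              # equity curve starting from 1.0

    years = len(r) / periods_per_year
    cagr = equity[-1] ** (1.0 / years) - 1.0

    excess = r - rf_daily
    sharpe = np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

    # Running peak includes the starting value of 1.0, so a loss on day one counts.
    peak = np.maximum.accumulate(np.concatenate(([1.0], equity)))[1:]
    max_drawdown = (equity / peak - 1.0).min()  # negative number, e.g. -0.3163

    return {"cagr": cagr, "sharpe": sharpe, "max_drawdown": max_drawdown}

# Hypothetical three-period "year" to make the arithmetic easy to follow.
toy = performance_summary([0.10, -0.50, 0.10], periods_per_year=3)
```

In the toy series the equity curve runs 1.10, 0.55, 0.605, so the drawdown from the 1.10 peak is 50 percent and the one-year CAGR is -39.5 percent.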

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same continuous-score construction could be applied to other style pairs such as value versus momentum to test whether macro timing generalizes across factor rotations.
If the crowding penalty proves central, the framework implies that behavioral crowding measures can be integrated into style timing without losing interpretability.
Extending the score to include additional macro variables or regime-dependent weights offers a direct way to test whether further signals improve out-of-sample stability.
Comparing the continuous method against discrete if-then rules on identical data would isolate the contribution of smoothness itself.
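The last point, isolating the contribution of smoothness, could be approached by feeding one score series through both a hard threshold and a tanh mapping and comparing turnover. The sketch below uses a synthetic oscillating score; the 50/50 base weight, the 50% tilt around it, and the sinusoidal score are hypothetical choices made only to show the mechanics.

```python
import numpy as np

def step_weight(score, max_tilt=0.5, base=0.5):
    # Discrete if-then rule: full tilt toward G when the score is positive,
    # full tilt toward D when it is negative.
    return base + max_tilt * np.sign(score)

def tanh_weight(score, max_tilt=0.5, base=0.5):
    # Continuous rule: the tilt scales with the size of the score.
    return base + max_tilt * np.tanh(score)

def total_turnover(weights):
    # Sum of absolute day-to-day changes in the G weight.
    return np.abs(np.diff(weights)).sum()

# Synthetic score hovering near zero; the phase shift avoids exact zeros.
t = np.arange(200)
score = 0.2 * np.sin(2.0 * np.pi * t / 20.0 + 0.3)

step_turnover = total_turnover(step_weight(score))
tanh_turnover = total_turnover(tanh_weight(score))
```

When the score hovers near zero, the step rule flips between the two extremes on every sign change, while the tanh rule moves only slightly, so the discrete rule pays far more in turnover for the same information.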

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit.

Desk Editor's Note

The paper gives a workable continuous timing rule for growth-defensive style allocation that beats some benchmarks on risk-adjusted metrics in 2017-2026 backtests, but the 'selected' policy and missing implementation details leave the results open to selection bias.


The core result is a smooth score built from rate relief, SPY drawdown, VIX relief, and growth-crowding signals, passed through softplus interactions, tanh mapping, and EWMA smoothing to produce G/D weights with a 50% tilt cap. In the main window it delivers 19.24% CAGR and 1.01 Sharpe after 10 bp costs, beating the 50/50 mix, TNX-matched, core-matched, SPY, and volatility-matched 100% G benchmarks on risk-adjusted terms while cutting drawdowns. The Fama-French attribution is useful: it shows the G-D spread is mostly style exposure (market beta 0.273, HML -0.552) with insignificant alpha, so the exercise is framed correctly as timing known factors rather than hunting new alpha.

The continuous formulation is a clear step past binary regime switches, and the walk-forward plus post-2022 checks add some out-of-sample flavor. That part is done reasonably.

The weak point is the selection process. The paper calls the rule the 'selected' policy and reports its performance on the same 2017-2026 interval used to evaluate it. Without an explicit statement that the four signals, the softplus/tanh forms, the EWMA decay, and the tilt limit were locked before looking at full-period returns, the edge could be an artifact of trying combinations until the numbers looked good. The abstract gives no exact weights, no full data sources, and no statistical tests on the outperformance, which makes the central claim harder to evaluate.

This is aimed at practitioners who run tactical equity style overlays and want a transparent, continuous signal. It is not aimed at people working on asset pricing theory. The work shows clear thinking on the attribution side and honest benchmark comparisons, so it is coherent on its own terms. A serious editor should send it to referees, but only after the authors supply the exact selection protocol, full formulas, and code or data that lets others replicate the policy choice.

Referee Report

3 major / 2 minor

Summary. The paper claims that a continuous smooth-score policy for growth (G) versus defensive (D) ETF allocation, built from four signals (rate relief, SPY drawdown depth, high-VIX stress relief, growth-crowding penalty) smoothed via softplus interactions, tanh mapping, and EWMA, delivers 19.24% CAGR and 1.01 Sharpe (50% max tilt, 10 bp costs) in the June 2017–May 2026 window. This outperforms 50/50 G/D, matched TNX-only, matched core-only, SPY, and volatility-matched 100% G benchmarks. Fama-French five-factor plus momentum attribution shows the G-D spread has market beta 0.273, HML beta -0.552, momentum beta 0.117, and alpha 1.95% (t=0.81). Walk-forward and post-2022 checks are presented as supporting evidence for risk-adjusted value of continuous style timing.

Significance. If the performance edge survives explicit pre-specification of signal choice and functional forms, the work supplies concrete evidence that continuous, interpretable macro signals can improve risk-adjusted style allocation relative to static or discrete-regime alternatives. The Fama-French decomposition and multiple benchmark comparisons are strengths that anchor the result in existing style-factor literature rather than claiming a new anomaly.

major comments (3)

[Abstract] Abstract and main-results description: the reported 19.24% CAGR / 1.01 Sharpe is for the 'selected' smooth-score policy. No protocol is given showing that the four-signal combination, softplus/tanh functional forms, EWMA decay, and 50% tilt cap were fixed before inspecting returns on the full 2017-2026 window. This selection step is load-bearing for the central claim of outperformance and walk-forward validation.
[Abstract] Abstract: performance differentials versus the matched TNX-only, core-only, and volatility-matched 100% G benchmarks are stated without Newey-West t-statistics or bootstrap p-values, unlike the Fama-French alpha (t=0.81). This omission weakens the quantitative support for the risk-adjusted improvement claim.
[Methodology] Methodology description: free parameters (maximum active tilt, score signal weights, EWMA decay) are listed but no table or appendix shows the exact numerical values used for the reported policy, nor the precise data sources for the TNX and core series. Reproducibility of the 19.24% CAGR figure is therefore limited.
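The second comment asks for Newey-West t-statistics on performance differentials. A minimal sketch of that test for the mean of a daily return differential (policy minus benchmark) is given below. The Bartlett-kernel form is the standard Newey-West estimator; the lag length of 5 and the function name are assumptions, and nothing here reproduces the paper's numbers.

```python
import numpy as np

def newey_west_tstat(x, lags=5):
    """t-statistic for H0: mean(x) = 0 with a Newey-West (Bartlett) variance."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()

    # Long-run variance: gamma_0 plus Bartlett-weighted autocovariances.
    long_run_var = d @ d / n
    for k in range(1, lags + 1):
        gamma_k = d[k:] @ d[:-k] / n
        long_run_var += 2.0 * (1.0 - k / (lags + 1.0)) * gamma_k

    standard_error = np.sqrt(long_run_var / n)
    return x.mean() / standard_error

# Hypothetical differential series; with lags=0 the statistic reduces to the
# ordinary t-ratio computed with the population variance.
diff = np.array([1.0, 2.0, 3.0, 4.0])
t_plain = newey_west_tstat(diff, lags=0)
```

A differential in Sharpe ratios rather than mean returns would need a different test, so this sketch covers only the simplest of the comparisons the referee lists.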

minor comments (2)

[Abstract] Abstract: the maximum drawdown of -31.63% is given without the corresponding drawdown figures for the benchmark portfolios in the same sentence, making direct risk comparison harder.
[Abstract] The paper notes that the policy does not exceed 100% G in raw CAGR; this qualification is useful but could be stated more prominently when summarizing the main-window results.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive report and recommendation for major revision. We address each major comment below with specific responses. We agree that reproducibility details and additional statistical tests should be added, and will revise the manuscript accordingly. On pre-specification, we provide economic rationale while acknowledging the iterative nature of signal selection.


Referee: [Abstract] Abstract and main-results description: the reported 19.24% CAGR / 1.01 Sharpe is for the 'selected' smooth-score policy. No protocol is given showing that the four-signal combination, softplus/tanh functional forms, EWMA decay, and 50% tilt cap were fixed before inspecting returns on the full 2017-2026 window. This selection step is load-bearing for the central claim of outperformance and walk-forward validation.

Authors: We acknowledge that the manuscript does not include an explicit pre-specification protocol or hold-out validation for the exact four-signal combination and functional forms. The signals were chosen based on a priori economic reasoning: rate relief (falling yields favor growth), SPY drawdown depth (defensive rotation), VIX relief (risk-on environment), and crowding penalty (avoiding overextended growth). Softplus interactions, tanh mapping, and EWMA were selected for producing continuous, differentiable weights rather than through return optimization. We will add a new subsection in the methodology detailing this economic motivation and literature alignment to mitigate data-snooping concerns, while noting that the 'selected' label already signals some post-hoc refinement. revision: partial
Referee: [Abstract] Abstract: performance differentials versus the matched TNX-only, core-only, and volatility-matched 100% G benchmarks are stated without Newey-West t-statistics or bootstrap p-values, unlike the Fama-French alpha (t=0.81). This omission weakens the quantitative support for the risk-adjusted improvement claim.

Authors: We agree this is an omission that weakens the presentation. In the revised manuscript we will add Newey-West t-statistics (and, where appropriate, bootstrap p-values) for the reported performance differentials versus the matched TNX-only, core-only, and volatility-matched 100% G benchmarks, bringing them in line with the Fama-French attribution already provided. revision: yes
Referee: [Methodology] Methodology description: free parameters (maximum active tilt, score signal weights, EWMA decay) are listed but no table or appendix shows the exact numerical values used for the reported policy, nor the precise data sources for the TNX and core series. Reproducibility of the 19.24% CAGR figure is therefore limited.

Authors: This comment is correct; the current draft lacks the numerical parameter table and exact series definitions needed for full reproducibility. We will add an appendix table that reports every free parameter value used in the main 19.24% CAGR specification (maximum tilt = 50%, all signal weights, softplus/tanh scaling constants, EWMA decay factor) together with the precise Bloomberg or FRED tickers/series identifiers for TNX and the core defensive basket. This will enable exact replication. revision: yes

Circularity Check

1 step flagged

Selected policy performance on 2017-2026 window reduces to fitted inputs by construction

specific steps

fitted input called prediction [Abstract]
"In the main aligned comparison window from June 28, 2017 to May 15, 2026, with 10bp transaction costs, the selected smooth-score policy uses a 50% maximum active tilt and obtains a 19.24% CAGR, a Sharpe ratio of 1.01, and a maximum drawdown of -31.63%. It improves over 50/50 G/D, matched TNX-only, matched core-only, SPY, and volatility-matched 100% G benchmarks."

The policy is labeled 'selected' for this window after combining rate relief, SPY drawdown, high-VIX relief, and growth-crowding penalty with softplus interactions, tanh mapping, and EWMA smoothing. Reporting the optimized metrics on the identical interval makes the performance a direct consequence of the selection step rather than an independent test.

full rationale

The paper's central empirical claim is the outperformance of a 'selected smooth-score policy' on the exact 2017-2026 window used to evaluate it. The score components, softplus/tanh/EWMA forms, and 50% tilt are presented as chosen for that window, with the reported CAGR/Sharpe/drawdown as the result. Walk-forward and post-2022 are cited but do not establish that signal selection and functional forms were fixed ex-ante independent of full-window inspection. This matches the fitted-input-called-prediction pattern; no other circularity patterns (self-citation, self-definition, etc.) are present in the provided text. The Fama-French attribution is independent and does not participate in the circularity.
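The mechanism behind this flag can be illustrated with a small simulation: pick the best of many candidate rules on one window, then look at the same rule on fresh data. Everything below is synthetic and hypothetical (pure-noise returns, 50 candidates, a 1,000-day window); it shows the general selection effect, not anything measured on the paper's data.

```python
import numpy as np

def annualized_sharpe(returns, periods_per_year=252):
    # Column-wise annualized Sharpe ratio with a zero risk-free rate.
    return np.sqrt(periods_per_year) * returns.mean(axis=0) / returns.std(axis=0, ddof=1)

def selection_gap(n_candidates=50, n_days=1000, seed=0):
    """Select the best in-sample candidate among pure-noise strategies."""
    rng = np.random.default_rng(seed)

    # Every candidate has zero true edge by construction.
    returns = rng.normal(0.0, 0.01, size=(2 * n_days, n_candidates))
    in_sample, out_of_sample = returns[:n_days], returns[n_days:]

    in_sample_sharpe = annualized_sharpe(in_sample)
    best = int(np.argmax(in_sample_sharpe))

    return {
        "best_in_sample": in_sample_sharpe[best],
        "same_rule_out_of_sample": annualized_sharpe(out_of_sample)[best],
        "median_in_sample": np.median(in_sample_sharpe),
    }

result = selection_gap()
```

The selected candidate's in-sample Sharpe is the maximum of many noisy draws, so it sits above the typical candidate even though no candidate has real skill; only the out-of-sample figure is an unbiased read on the chosen rule.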

Axiom & Free-Parameter Ledger

3 free parameters · 1 axiom · 0 invented entities

Abstract-only review; multiple tuning parameters in the smooth score (tilt limit, signal weights, smoothing constants) appear selected or fitted to historical performance without independent justification provided.

free parameters (3)

Maximum active tilt
Set at 50% for the reported policy; directly controls allocation aggressiveness and is chosen rather than derived.
Score signal weights
Coefficients combining rate relief, drawdown, VIX, and crowding terms are not stated and must be tuned to achieve the reported metrics.
EWMA decay parameter
Controls weight smoothing; value not given and affects realized turnover and performance.

axioms (1)

domain assumption Fama-French five-factor plus momentum model correctly attributes the G-D relative returns as style exposures
Used to interpret the portfolio as non-anomalous style timing rather than alpha.
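The attribution this axiom rests on is a time-series regression of the G-D return on factor returns. A minimal ordinary-least-squares sketch follows; the data are synthetic, only three factor columns are simulated, the planted coefficients simply reuse the three betas quoted in the review as illustrative targets, and the arithmetic annualization of alpha (daily alpha times 252) is an assumption. Newey-West standard errors are omitted here.

```python
import numpy as np

def factor_attribution(portfolio_returns, factor_returns, periods_per_year=252):
    """OLS alpha and betas of a return series on a set of factor returns."""
    y = np.asarray(portfolio_returns, dtype=float)
    factors = np.asarray(factor_returns, dtype=float)

    # Design matrix: intercept column followed by one column per factor.
    X = np.column_stack([np.ones(len(y)), factors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)

    return {"alpha_annual": coef[0] * periods_per_year, "betas": coef[1:]}

# Synthetic check: plant known loadings and confirm the regression recovers them.
rng = np.random.default_rng(42)
synthetic_factors = rng.normal(0.0, 0.01, size=(500, 3))  # stand-ins for MKT, HML, MOM
planted_betas = np.array([0.273, -0.552, 0.117])
synthetic_g_minus_d = synthetic_factors @ planted_betas   # zero alpha, no noise

fit = factor_attribution(synthetic_g_minus_d, synthetic_factors)
```

With noise-free synthetic data the regression returns the planted betas and an alpha of zero, which confirms the mechanics only; real G-D returns would add residual noise and call for the full six-factor set.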

pith-pipeline@v0.9.1-grok · 5916 in / 1680 out tokens · 59857 ms · 2026-06-30T17:43:38.406504+00:00 · methodology


Original abstract

This paper studies conditional allocation between a growth/technology ETF basket, denoted by $G$, and a defensive income/value-oriented ETF basket, denoted by $D$. The objective is not to discover a new standalone alpha factor, but to examine whether known style exposures can be dynamically allocated using macro-market timing signals. Fama-French five-factor plus momentum attribution shows that the relative portfolio $G-D$ is a recognizable style portfolio: its market beta is 0.273, its HML beta is -0.552, its momentum beta is 0.117, and its annualized alpha is 1.95\% with a Newey-West t-statistic of only 0.81. The empirical object is therefore interpreted as a growth-versus-defensive style allocation problem rather than a new return anomaly. The allocation framework replaces discrete regime labels and if-then trading rules with a continuous smooth score. The score combines rate relief, SPY drawdown depth, high-VIX stress relief, and a growth-crowding penalty. Interaction terms are smoothed with softplus functions, the total score is mapped to G/D weights through a hyperbolic tangent function, and realized weights are smoothed with EWMA. In the main aligned comparison window from June 28, 2017 to May 15, 2026, with 10bp transaction costs, the selected smooth-score policy uses a 50\% maximum active tilt and obtains a 19.24\% CAGR, a Sharpe ratio of 1.01, and a maximum drawdown of -31.63\%. It improves over 50/50 G/D, matched TNX-only, matched core-only, SPY, and volatility-matched 100\% G benchmarks. It does not, however, exceed 100\% G or the best high-G static portfolios in raw CAGR. Walk-forward and post-2022 validations provide additional evidence of drawdown reduction and risk-adjusted allocation value. Overall, the evidence supports continuous, interpretable style timing, while also showing that high static growth exposure remains a strong benchmark.

Figures

Figures reproduced from arXiv: 2605.20636 by Zheli Xiong.

**Figure 2.** Selected Smooth-Score Policy Equity Curves, 2017-06-28 to 2026-05-15

**Figure 3.** Aligned Main Equity Curves, 2017-06-28 to 2026-05-15

**Figure 4.** Buy-and-Hold G/D Baseline Equity Curves, 2017-06-28 to 2026-05-15

Accompanying text (Section 8, Volatility-Matched and Static G/D Comparisons): The selected smooth policy has much lower average G exposure than 100% G. A direct comparison of CAGR therefore mixes signal quality with risk level. The vol-matched comparison scales 100% G to the smooth strategy’s annualized volatility. The required scaling weight is 81.95% …
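The volatility-matching step described in this excerpt amounts to scaling the 100% G position by the ratio of the two annualized volatilities. A minimal sketch follows, assuming the unscaled remainder sits in cash earning a hypothetical zero return and a 252-day year; the paper's treatment of the residual position is not stated in the excerpt.

```python
import numpy as np

def vol_match_weight(target_returns, asset_returns, periods_per_year=252):
    """Weight on the asset that matches the target's annualized volatility."""
    sigma_target = np.std(target_returns, ddof=1) * np.sqrt(periods_per_year)
    sigma_asset = np.std(asset_returns, ddof=1) * np.sqrt(periods_per_year)
    return sigma_target / sigma_asset

def vol_matched_returns(asset_returns, weight, cash_return=0.0):
    # Assumption: the remaining (1 - weight) is held in cash at cash_return.
    return weight * np.asarray(asset_returns, dtype=float) + (1.0 - weight) * cash_return

# Hypothetical series: the target is constructed to have 80% of the asset's volatility.
g_returns = np.array([0.02, -0.02, 0.01, -0.01])
smooth_returns = 0.8 * g_returns
weight = vol_match_weight(smooth_returns, g_returns)
matched = vol_matched_returns(g_returns, weight)
```

Read this way, the reported 81.95% scaling weight would mean the smooth policy ran at roughly 82 percent of the volatility of 100% G, assuming the paper uses the same ratio.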

**Figure 5.** Vol-Matched and Static G/D Equity Curves, 2017-06-28 to 2026-05-15

Accompanying text (Section 9, Out-of-Sample Validation): The validation section combines walk-forward expanding, walk-forward rolling, and fixed-parameter validation into one aligned OOS window. The parameter pool is the candidate local grid described above. The initial training window is 252 trading days, and the test block length is 63 trading days. Walk-forward expan…
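The window sizes quoted in this excerpt (252-day initial training window, 63-day test blocks) can be turned into index splits as sketched below. The function name and the choice to keep the rolling training window at the initial length are assumptions; the excerpt is truncated before it describes the rolling variant in detail.

```python
def walk_forward_splits(n_obs, initial_train=252, test_block=63, mode="expanding"):
    """Return (train_range, test_range) pairs of row indices for walk-forward tests.

    mode="expanding": training always starts at row 0 and grows over time.
    mode="rolling"  : training keeps a fixed length of initial_train rows (assumed).
    """
    if mode not in ("expanding", "rolling"):
        raise ValueError("mode must be 'expanding' or 'rolling'")

    splits = []
    test_start = initial_train
    while test_start < n_obs:
        test_end = min(test_start + test_block, n_obs)   # last block may be short
        train_start = 0 if mode == "expanding" else test_start - initial_train
        splits.append((range(train_start, test_start), range(test_start, test_end)))
        test_start = test_end
    return splits

# Hypothetical sample of 400 trading days.
expanding = walk_forward_splits(400, mode="expanding")
rolling = walk_forward_splits(400, mode="rolling")
```

Each training range ends exactly where its test range begins, so parameters chosen on the training rows never see the returns they are evaluated on.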

**Figure 6.** OOS Validation Equity Curves, 2018-06-28 to 2026-05-15

**Figure 7.** Post-2022 OOS Validation Equity Curves, 2022-01-03 to 2026-05-15

**Figure 8.** Bond/Credit Extension Main Equity Curves, 2017-06-28 to 2026-05-15

**Figure 9.** Bond/Credit Extension OOS Equity Curves, 2018-06-28 to 2026-05-15

**Figure 10.** Bond/Credit Extension Post-2022 Equity Curves, 2022-01-03 to 2026-05-15


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Relief-Gated Relative Rotation for QQQ-DIA Allocation: Globally Screened Relative States, Fixed Position Mapping, Incremental Interaction Admission, and Walk-Forward Validation
q-fin.PM 2026-07 conditional novelty 3.0

A screened signal-stack strategy rotating between QQQ and DIA improves risk-adjusted returns versus static benchmarks in walk-forward tests from 2018–2022, but does not consistently beat QQQ on raw return.

Reference graph

Works this paper leans on

11 extracted references · 11 canonical work pages · cited by 1 Pith paper

[1] Bailey, D. H., Borwein, J. M., López de Prado, M., and Zhu, Q. J. (2017). The probability of backtest overfitting. Journal of Computational Finance, 20(4):39–69.

[2] Brandt, M. W., Santa-Clara, P., and Valkanov, R. (2009). Parametric portfolio policies: Exploiting characteristics in the cross-section of equity returns. Review of Financial Studies, 22(9):3411–3447.

[3] Campbell, J. Y. and Thompson, S. B. (2008). Predicting excess stock returns out of sample: Can anything beat the historical average? Review of Financial Studies, 21(4):1509–1531.

[4] Carhart, M. M. (1997). On persistence in mutual fund performance. Journal of Finance, 52(1):57–82.

[5] Fama, E. F. and French, K. R. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33(1):3–56.

[6] Fama, E. F. and French, K. R. (2015). A five-factor asset pricing model. Journal of Financial Economics, 116(1):1–22.

[7] Goyal, A. and Welch, I. (2008). A comprehensive look at the empirical performance of equity premium prediction. Review of Financial Studies, 21(4):1455–1508.

[8] Guidolin, M. and Timmermann, A. (2007). Asset allocation under multivariate regime switching. Journal of Economic Dynamics and Control, 31(11):3503–3544.

[9] Hamilton, J. D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica, 57(2):357–384.

[10] Hansen, P. R. (2005). A test for superior predictive ability. Journal of Business & Economic Statistics, 23(4):365–380.

[11] White, H. (2000). A reality check for data snooping. Econometrica, 68(5):1097–1126.
