A Nontrivial Upper Bound on the Out-of-Sample R² in Return Forecasting
Pith reviewed 2026-05-16 06:40 UTC · model grok-4.3
The pith
A coin-flip oracle model establishes a quadratic upper bound on the out-of-sample R-squared for return forecasts.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The study establishes that the R²_OOS of the coin-flip oracle model, whose analytical expression is a quadratic function of directional accuracy, serves as a tractable upper bound on the actual R²_OOS of practical return forecasting models.
What carries the argument
The coin-flip oracle model that outperforms practical models in mean squared error for any given directional accuracy.
If this is right
- Practical models' out-of-sample R² cannot surpass the quadratic function evaluated at their directional accuracy.
- The upper bound is independent of the specific predictor variables used.
- This allows direct comparison of model performance against the theoretical maximum for their accuracy level.
- Common predictive models in finance are shown to respect this bound in multiple scenarios.
Where Pith is reading between the lines
- Researchers could use this bound to set realistic expectations for forecast improvements.
- The approach might extend to other time-series forecasting domains beyond returns.
- It highlights that directional accuracy alone does not determine R-squared; magnitude consistency also matters.
- Testing the bound in non-financial prediction tasks could reveal similar limits.
Load-bearing premise
The coin-flip oracle model theoretically achieves lower mean squared error than any practical model that has the same directional accuracy.
What would settle it
Observing a practical forecasting model with out-of-sample R² higher than the quadratic value computed from its directional accuracy would falsify the claimed upper bound.
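The falsification check described above can be sketched in a few lines. This is a minimal illustration, not the paper's code: the abstract does not state the quadratic, so the sketch assumes the fixed-magnitude coin-flip construction described in the simulated rebuttal further down this page, scored against a zero-forecast benchmark. The names `oracle_r2_bound` and `exceeds_bound` are hypothetical.

```python
import numpy as np

def oracle_r2_bound(p, realized):
    """Assumed bound: (2p - 1)^2 * (E|y|)^2 / E[y^2].

    This form comes from a fixed-magnitude coin-flip oracle with a
    zero-forecast benchmark; the paper's own expression may differ.
    """
    y = np.asarray(realized, dtype=float)
    # Ratio (E|y|)^2 / E[y^2]; at most 1 by Jensen's inequality.
    kappa = np.mean(np.abs(y)) ** 2 / np.mean(y ** 2)
    return float((2.0 * p - 1.0) ** 2 * kappa)

def exceeds_bound(r2_oos, p, realized, tol=1e-12):
    """True if a model's reported R2_OOS sits above the assumed bound."""
    return r2_oos > oracle_r2_bound(p, realized) + tol

# Hypothetical realized returns, for illustration only.
y = np.array([0.02, -0.01, 0.03, -0.02])
print(oracle_r2_bound(0.60, y))
print(exceeds_bound(0.05, 0.60, y))
```

A single out-of-sample run returning `True` would be the kind of observation the paragraph above describes, provided the bound formula used here matches the one in the paper.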
Original abstract
This study establishes a nontrivial upper bound on the out-of-sample $R^2$ ($R^2_{\text{OOS}}$) in return forecasting. In particular, we define a coin-flip oracle model that, under the same directional accuracy, theoretically outperforms practical models in terms of MSE. The $R^2_{\text{OOS}}$ of the oracle model, whose analytical expression is a quadratic function of directional accuracy, can therefore serve as a tractable upper bound on the actual $R^2_{\text{OOS}}$. Empirical analyses across multiple forecasting scenarios reveal that the $R^2_{\text{OOS}}$ values of common predictive models are fundamentally bounded by this quadratic function.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to establish a nontrivial upper bound on out-of-sample R² in return forecasting by defining a coin-flip oracle model that, for a given directional accuracy p, theoretically achieves lower MSE than practical models. The oracle's R²_OOS is derived as a quadratic function of p and is asserted to serve as a tractable upper bound, with empirical analyses across forecasting scenarios showing that common predictive models fall below this bound.
Significance. If the coin-flip oracle is indeed the MSE-minimizing predictor for fixed directional accuracy, the quadratic bound would provide a useful, data-independent benchmark for assessing the performance of return-forecasting models and quantifying fundamental limits to predictability. The analytical form strengthens the result by making the bound directly computable from observed directional accuracy alone.
major comments (2)
- [Oracle model definition and MSE derivation (abstract and theoretical section)] The central claim that the coin-flip oracle minimizes MSE among all predictors sharing the same directional accuracy p is not established. An alternative predictor that outputs E[y | sign correct] on correct-sign realizations and an appropriate conditional value on errors can achieve identical p while strictly lowering MSE by exploiting magnitude information conditional on the sign outcome; this would render the derived quadratic an invalid upper bound. The manuscript provides no explicit optimality proof or comparison against such alternatives.
- [Abstract and theoretical derivation] The abstract states that the oracle outperforms real models in MSE under the same directional accuracy, yet the precise prediction rule of the oracle (fixed-magnitude outputs with coin-flip errors) and the algebraic steps leading to the quadratic R²_OOS expression are not supplied, preventing verification that the bound follows directly from the construction.
minor comments (1)
- [Empirical analyses] Clarify in the empirical section how directional accuracy p is computed from the data (e.g., whether it uses the same sign convention and sample periods as the theoretical oracle) to ensure the reported R²_OOS values can be directly compared to the quadratic bound.
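One way to make the two quantities in this comment concrete is sketched below. The conventions are assumptions rather than the paper's: directional accuracy is taken as the share of periods in which forecast and realized return have the same sign (a zero on only one side counts as a miss), and R²_OOS is measured against a zero-forecast benchmark unless another benchmark series is passed in. The function names are hypothetical.

```python
import numpy as np

def directional_accuracy(realized, forecast):
    """Share of periods where forecast and realized return share a sign."""
    y = np.asarray(realized, dtype=float)
    f = np.asarray(forecast, dtype=float)
    return float(np.mean(np.sign(y) == np.sign(f)))

def r2_oos(realized, forecast, benchmark=None):
    """1 - SSE(model) / SSE(benchmark); benchmark defaults to a zero forecast."""
    y = np.asarray(realized, dtype=float)
    f = np.asarray(forecast, dtype=float)
    b = np.zeros_like(y) if benchmark is None else np.asarray(benchmark, dtype=float)
    return float(1.0 - np.sum((y - f) ** 2) / np.sum((y - b) ** 2))

# Hypothetical out-of-sample data, for illustration only.
y = np.array([0.02, -0.01, 0.03, -0.02])
f = np.array([0.002, 0.001, 0.003, -0.001])
print(directional_accuracy(y, f), r2_oos(y, f))
```

Both statistics should be computed on the same out-of-sample window, otherwise the R²_OOS and the accuracy fed into the bound describe different samples.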
Simulated Author's Rebuttal
We thank the referee for the careful and constructive review. The comments highlight the need for greater clarity on the oracle construction and its optimality properties. We respond point by point below and will revise the manuscript to incorporate the requested details while preserving the central contribution.
Point-by-point responses
- Referee: [Oracle model definition and MSE derivation (abstract and theoretical section)] The central claim that the coin-flip oracle minimizes MSE among all predictors sharing the same directional accuracy p is not established. An alternative predictor that outputs E[y | sign correct] on correct-sign realizations and an appropriate conditional value on errors can achieve identical p while strictly lowering MSE by exploiting magnitude information conditional on the sign outcome; this would render the derived quadratic an invalid upper bound. The manuscript provides no explicit optimality proof or comparison against such alternatives.
Authors: The coin-flip oracle is an ex-ante predictor that achieves directional accuracy p by emitting a fixed magnitude m with the correct sign chosen randomly with probability p. The referee's proposed alternative cannot be implemented as a feasible predictor because it requires conditioning the output on the ex-post realization of whether the sign is correct, which depends on the unobserved y and is unavailable at forecast time. Under the information constraint of achieving accuracy p using only a directional signal, any non-constant magnitude choice increases average squared error without raising p. We will add a formal argument establishing this optimality (under standard symmetry assumptions on returns) to the theoretical section. revision: partial
- Referee: [Abstract and theoretical derivation] The abstract states that the oracle outperforms real models in MSE under the same directional accuracy, yet the precise prediction rule of the oracle (fixed-magnitude outputs with coin-flip errors) and the algebraic steps leading to the quadratic R²_OOS expression are not supplied, preventing verification that the bound follows directly from the construction.
Authors: We apologize for the omission. The oracle emits a fixed magnitude m (chosen to minimize MSE for given p), with the correct sign with probability p and the incorrect sign with probability 1-p. Under symmetry, its MSE equals E[y²] + m² - 2m(2p-1)E[|y|], which is minimized at m = (2p-1)E[|y|]. Normalizing by E[y²] (equal to Var(y) for zero-mean returns) then gives R²_OOS = 1 - MSE/E[y²] = m²/E[y²] = (2p-1)² (E[|y|])² / E[y²], which is quadratic in p. The revised manuscript will state the prediction rule explicitly and display the full algebraic derivation immediately after the definition. revision: yes
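The algebra behind the MSE expression quoted in this response can be written out step by step. The sketch below is a reconstruction under stated assumptions, not the paper's derivation: the sign flip is drawn independently of |y|, and R²_OOS is measured against a zero-forecast benchmark, so the normalizer is E[y²].

```latex
\begin{align*}
\hat{y} &= m\,s, \qquad
s = \begin{cases}
  \operatorname{sign}(y)  & \text{with probability } p \\
  -\operatorname{sign}(y) & \text{with probability } 1-p
\end{cases} \\
\mathrm{MSE}(m) &= E[(y-\hat{y})^2] = E[y^2] + m^2 - 2m(2p-1)E[|y|] \\
\frac{d\,\mathrm{MSE}}{dm} = 0 \;&\Rightarrow\; m^* = (2p-1)E[|y|] \\
\mathrm{MSE}(m^*) &= E[y^2] - (2p-1)^2 (E[|y|])^2 \\
R^2_{\text{OOS}} &= 1 - \frac{\mathrm{MSE}(m^*)}{E[y^2]}
  = (2p-1)^2 \frac{(E[|y|])^2}{E[y^2]}
  = \frac{(m^*)^2}{E[y^2]}
\end{align*}
```

Since (E[|y|])² ≤ E[y²] by Jensen's inequality, this expression never exceeds (2p-1)², and it is zero at p = 1/2.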
Circularity Check
No circularity: oracle R² derived directly from model definition as function of directional accuracy
full rationale
The paper defines a specific coin-flip oracle predictor (correct sign with probability p, incorrect sign otherwise, fixed magnitude) and computes its out-of-sample R² analytically as a quadratic function of p. This algebraic expression is then proposed as an upper bound on attainable R²_OOS for any model sharing the same directional accuracy. The derivation step itself is a straightforward calculation from the oracle's assumed error process and does not reduce to a fitted parameter, self-citation chain, or redefinition of inputs; the bound claim rests on the separate (and externally challengeable) assertion that this oracle minimizes MSE for given p. No load-bearing step collapses by construction to the paper's own inputs or prior self-citations. The algebraic part of the result is therefore self-contained and can be checked against external benchmarks.
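The optimality assertion singled out above can be probed numerically. The Monte Carlo sketch below is an illustration under assumptions, not a reproduction of the paper: returns are simulated as zero-mean Gaussian, the benchmark is a zero forecast, and the comparison predictor (`aware`, a hypothetical name) is given the realized magnitude |y|, which is not observable at forecast time. It therefore speaks only to the arithmetic of the referee's objection, not to the feasibility point raised in the authors' response.

```python
import numpy as np

def simulate(p=0.6, n=200_000, seed=0):
    """Compare a fixed-magnitude coin-flip oracle with a magnitude-aware one."""
    rng = np.random.default_rng(seed)
    y = 0.05 * rng.standard_normal(n)       # hypothetical zero-mean returns
    hit = rng.random(n) < p                 # coin flip, independent of y
    direction = np.where(hit, np.sign(y), -np.sign(y))

    e_abs = np.mean(np.abs(y))
    e_sq = np.mean(y ** 2)

    def r2(forecast):
        # R2_OOS against a zero-forecast benchmark (assumption).
        return float(1.0 - np.mean((y - forecast) ** 2) / e_sq)

    m_star = (2 * p - 1) * e_abs            # MSE-minimizing fixed magnitude
    fixed = m_star * direction              # fixed-magnitude oracle
    aware = (2 * p - 1) * np.abs(y) * direction  # uses |y|: infeasible ex ante

    return {
        "accuracy": float(hit.mean()),
        "fixed": r2(fixed),
        "aware": r2(aware),
        "analytic": float((2 * p - 1) ** 2 * e_abs ** 2 / e_sq),
    }

print(simulate())
```

In this setup both predictors share the same directional accuracy, so any gap between `fixed` and `aware` reflects magnitude information alone.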
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: A coin-flip oracle with the same directional accuracy as a practical model necessarily has lower or equal MSE.
invented entities (1)
- coin-flip oracle model: no independent evidence