arxiv: 2605.20137 · v3 · pith:H6KQZZFCnew · submitted 2026-05-19 · 💱 q-fin.GN

A Three-Variable Benchmark for Post-GFC Covered Interest Parity Deviations

Pith reviewed 2026-05-22 09:04 UTC · model grok-4.3

classification 💱 q-fin.GN
keywords covered interest parity · CIP deviations · benchmark · post-GFC · NFCI · dollar index · yield curve slope · G10 currencies

The pith

Three lagged public variables form a daily benchmark for post-GFC covered interest parity deviations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to supply a public daily benchmark for government-bond covered interest parity (CIP) deviations after the global financial crisis (GFC), filling a gap that has left researchers without a standard reference comparable to factor models in asset pricing. It reports that three readily available lagged series (the National Financial Conditions Index, the nominal broad U.S. dollar index, and the Treasury 10-year minus 2-year yield slope) deliver strong in-sample explanatory performance for the observed deviations across G10 currencies plus the Korean won at various tenors. The same three variables retain explanatory power in leave-one-year-out tests. Cointegration checks, quarter-end filters, and aggregation comparisons suggest that the fit reflects a lasting background component rather than fleeting spikes or spurious level correlations. The resulting benchmark is therefore intended to support consistent daily regressions without reliance on proprietary data.

Core claim

The paper claims that a linear combination of three lagged public state variables (the National Financial Conditions Index, the nominal broad U.S. dollar index, and the Treasury 10-year minus 2-year slope) delivers strong in-sample and leave-one-year-out explanatory power for post-GFC government-bond CIP deviations in G10 plus KRW currency-tenor panels. Cointegration, quarter-end, and aggregation-difference diagnostics suggest that the benchmark isolates a persistent background component rather than short-maturity quarter-end spikes or spurious level correlation.

What carries the argument

A three-variable linear benchmark that uses lagged values of NFCI, the nominal broad U.S. dollar index, and the Treasury 10-year minus 2-year slope to predict CIP deviations at daily frequency.
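
To make the construction concrete, here is a minimal sketch of such a lagged linear benchmark for a single currency-tenor series. It is an illustration under stated assumptions, not the author's code: the one-day lag, the function names (`lagged_design`, `ols`, `solve_linear`), and the synthetic input series are all hypothetical, and the page does not state the paper's lag length or estimation details. The sketch uses only the standard library so that it runs without dependencies.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve_linear(xtx, xty)

def lagged_design(nfci, usd, slope, lag=1):
    """Rows [1, NFCI_{t-lag}, USD_{t-lag}, SLOPE_{t-lag}] for t = lag .. T-1."""
    return [[1.0, nfci[t - lag], usd[t - lag], slope[t - lag]]
            for t in range(lag, len(nfci))]

# Synthetic stand-ins for the three public series (NOT real data).
T = 60
nfci = [math.sin(0.3 * t) for t in range(T)]
usd = [100.0 + 5.0 * math.cos(0.7 * t) for t in range(T)]
slope = [((7 * t) % 11) / 10.0 for t in range(T)]

# Hypothetical coefficients used only to generate a noiseless target series.
true_beta = [2.0, 0.5, -0.1, 3.0]
X = lagged_design(nfci, usd, slope, lag=1)   # lag=1 is an assumption
y = [sum(b * v for b, v in zip(true_beta, row)) for row in X]

beta_hat = ols(X, y)                          # recovers true_beta up to rounding
fitted = [sum(b * v for b, v in zip(beta_hat, row)) for row in X]
```

Because the target here is generated without noise, the fit is exact by construction; with real CIP deviations the fitted values would play the role of the "baseline fitted values" that the figures refer to.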

If this is right

  • Enables daily-frequency regressions on CIP deviations that are comparable to standard factor models in asset pricing.
  • Distinguishes persistent background deviations from transient quarter-end effects.
  • Supports leave-one-year-out validation as a check against overfitting.
  • Applies uniformly across multiple tenors and G10 plus KRW currency pairs.
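
The leave-one-year-out check mentioned above can be sketched as follows. This is a hedged illustration: the helper names are hypothetical, a univariate line fit stands in for the three-variable regression to keep the example short, and the out-of-sample R² shown (one minus model squared error over the squared error of the training-sample mean) is one common convention, not necessarily the paper's metric.

```python
from typing import Iterator, List, Sequence, Tuple

def leave_one_year_out(years: Sequence[int]) -> Iterator[Tuple[int, List[int], List[int]]]:
    """Yield (held_out_year, train_indices, test_indices) for each distinct year."""
    for held_out in sorted(set(years)):
        train = [i for i, yr in enumerate(years) if yr != held_out]
        test = [i for i, yr in enumerate(years) if yr == held_out]
        yield held_out, train, test

def fit_line(x, y):
    """Univariate OLS stand-in for the benchmark fit: returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def oos_r2(actual, predicted, baseline):
    """1 - SSE(model) / SSE(constant baseline), evaluated on held-out data."""
    sse_model = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sse_base = sum((a - baseline) ** 2 for a in actual)
    return 1.0 - sse_model / sse_base

# Toy data: three "years" of four observations each (NOT real data).
years = [2019] * 4 + [2020] * 4 + [2021] * 4
x = [0.1 * i for i in range(12)]
y = [1.0 + 2.0 * xi for xi in x]

scores = {}
for year, train, test in leave_one_year_out(years):
    intercept, slope = fit_line([x[i] for i in train], [y[i] for i in train])
    preds = [intercept + slope * x[i] for i in test]
    train_mean = sum(y[i] for i in train) / len(train)
    scores[year] = oos_r2([y[i] for i in test], preds, train_mean)
```

Each held-out year is predicted from coefficients estimated on the remaining years only, which is what makes the score a guard against overfitting rather than a restatement of in-sample fit.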

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Researchers could test whether additional candidate drivers of CIP deviations retain incremental power once this benchmark is included.
  • The same three variables might serve as controls when studying related phenomena such as cross-currency basis swaps or bank funding spreads.
  • Extensions to real-time releases of the input series could allow monitoring of CIP conditions during future stress episodes.

Load-bearing premise

The three variables capture a genuine persistent economic component rather than statistical artifacts, omitted short-term patterns, or data-specific features.

What would settle it

Substantial deterioration in out-of-sample explanatory power or outright failure of cointegration tests when the same three variables are applied to data after 2022 or to additional currency panels.

Figures

Figures reproduced from arXiv: 2605.20137 by Useong Shin.

Figure 4.1: Actual CIP deviations and baseline fitted values. The out-of-sample fitted values …
Figure 4.2: Leave-one-year-out out-of-sample performance
Figure 4.3: Expanding-window out-of-sample performance
Figure 5.1: Non-overlapping aggregation-difference performance

Original abstract

This paper proposes a public daily-frequency benchmark for post-GFC government-bond CIP deviations. Although CIP deviations are observed daily, the literature lacks a canonical benchmark for daily regressions comparable to standard factor models in asset pricing. Using G10 plus KRW currency-tenor panels, I show that three lagged public state variables (NFCI, the nominal broad U.S. dollar index, and the Treasury 10-year minus 2-year slope) deliver strong in-sample and leave-one-year-out performance. Cointegration, quarter-end, and aggregation-difference diagnostics suggest that the benchmark captures a persistent background component rather than short-maturity quarter-end spikes or spurious level correlation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. This paper proposes a public daily-frequency benchmark for post-GFC government-bond CIP deviations. Using G10 plus KRW currency-tenor panels, three lagged public state variables—NFCI, the nominal broad U.S. dollar index, and the Treasury 10-year minus 2-year slope—deliver strong in-sample and leave-one-year-out performance. Cointegration, quarter-end, and aggregation-difference diagnostics suggest that the benchmark captures a persistent background component rather than short-maturity quarter-end spikes or spurious level correlation.

Significance. If the central claims hold, this would supply a simple, replicable public benchmark for daily CIP regressions analogous to standard factor models in asset pricing. The focus on lagged, publicly available variables and explicit out-of-sample plus diagnostic checks is a constructive contribution that could reduce data-mining concerns and support further work on post-GFC financial conditions.

major comments (2)
  1. [§4.2] §4.2 Cointegration Diagnostics: residual-based tests (Engle-Granger or Phillips-Ouliaris) applied to highly autocorrelated series such as NFCI and the broad USD index can exhibit low power and size distortions in moderate samples. The manuscript should report results under alternative lag selections, deterministic terms, and perhaps Johansen trace tests; if the no-cointegration null is not rejected for most currency-tenor pairs under these variations, the claim that the benchmark isolates a true persistent background factor rather than correlated I(1) processes is materially weakened.
  2. [Table 3] Table 3 (or equivalent regression-results table), leave-one-year-out panel: the reported R² and t-statistics for the three-variable specification must be shown alongside single-variable and random-walk benchmarks with the same lag structure; without these comparisons the incremental explanatory power of the three-variable benchmark cannot be assessed and the 'strong performance' claim remains unsubstantiated.
minor comments (2)
  1. [Abstract] Abstract: include one or two headline quantitative metrics (e.g., average in-sample R² or out-of-sample RMSE) so readers can gauge the magnitude of the claimed performance without immediately consulting the tables.
  2. [§2] §2 Data and variable construction: clarify the exact aggregation method used for the daily NFCI and slope series when aligning with currency-tenor CIP observations; any implicit smoothing or interpolation should be stated explicitly.
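
For reference, the residual-based procedure named in major comment 1 can be written out in its textbook two-step form. The notation below is an assumption introduced for illustration (the page does not reproduce the paper's equations): $y_{i,t}$ is the CIP deviation for currency-tenor pair $i$, the one-day lag is hypothetical, and $p$ is the number of augmentation lags.

```latex
% Step 1: levels regression for currency-tenor pair i (lag length 1 is an assumption)
\begin{equation}
  y_{i,t} = \alpha_i + \beta_{i,1}\,\mathrm{NFCI}_{t-1} + \beta_{i,2}\,\mathrm{USD}_{t-1}
          + \beta_{i,3}\,\mathrm{SLOPE}_{t-1} + u_{i,t}
\end{equation}
% Step 2: augmented Dickey-Fuller regression on the step-1 residuals
\begin{equation}
  \Delta \hat{u}_{i,t} = \rho_i\,\hat{u}_{i,t-1}
          + \sum_{j=1}^{p} \phi_{i,j}\,\Delta \hat{u}_{i,t-j} + \varepsilon_{i,t},
  \qquad H_0:\ \rho_i = 0 \ \text{(no cointegration)}
\end{equation}
```

In this form the referee's "alternative lag selections" correspond to the choice of $p$, and "deterministic terms" to whether step 1 includes only a constant or also a trend. Because $\hat{u}_{i,t}$ is estimated rather than observed, the t-statistic on $\rho_i$ is compared with Engle-Granger critical values, not the standard Dickey-Fuller ones.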

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the robustness of our proposed benchmark. We respond to each major comment below and indicate the revisions we will implement.

Point-by-point responses
  1. Referee: [§4.2] §4.2 Cointegration Diagnostics: residual-based tests (Engle-Granger or Phillips-Ouliaris) applied to highly autocorrelated series such as NFCI and the broad USD index can exhibit low power and size distortions in moderate samples. The manuscript should report results under alternative lag selections, deterministic terms, and perhaps Johansen trace tests; if the no-cointegration null is not rejected for most currency-tenor pairs under these variations, the claim that the benchmark isolates a true persistent background factor rather than correlated I(1) processes is materially weakened.

    Authors: We agree that residual-based tests can have limited power against alternatives involving highly persistent series. In the revision we will add Johansen trace-test results for the same panels, using alternative lag lengths selected by AIC/BIC and both constant-only and constant-plus-trend specifications. These supplementary tables will be placed alongside the existing Engle-Granger results so readers can judge whether the evidence for cointegration is robust across methods. revision: yes

  2. Referee: [Table 3] Table 3 (or equivalent regression-results table), leave-one-year-out panel: the reported R² and t-statistics for the three-variable specification must be shown alongside single-variable and random-walk benchmarks with the same lag structure; without these comparisons the incremental explanatory power of the three-variable benchmark cannot be assessed and the 'strong performance' claim remains unsubstantiated.

    Authors: We accept that incremental explanatory power is best demonstrated by direct comparison. The revised leave-one-year-out table will report R² and t-statistics for each of the three individual lagged regressors, for the three-variable specification, and for a simple random-walk benchmark, all estimated with identical lag structure and sample. This will make the contribution of the multivariate benchmark transparent. revision: yes
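
The comparison promised in response 2 can be sketched as follows. This is a minimal illustration with made-up numbers, not results from the paper: `deviation` and `model_pred` are hypothetical values in basis points, and the random-walk benchmark is taken to be the no-change forecast (the previous day's deviation), which is an assumption about what the referee intends.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def random_walk_forecast(series):
    """No-change forecast: the prediction for day t is the value on day t-1."""
    return list(series[:-1])

# Hypothetical CIP deviations in basis points (illustrative only, NOT real data).
deviation = [-20.0, -22.0, -21.0, -25.0, -24.0, -27.0]
# Hypothetical out-of-sample forecasts from some model for days 1..5.
model_pred = [-21.5, -21.5, -24.0, -24.5, -26.0]

actual = deviation[1:]                    # days that both forecasts target
rw_pred = random_walk_forecast(deviation)

rmse_model = rmse(actual, model_pred)
rmse_rw = rmse(actual, rw_pred)
relative_rmse = rmse_model / rmse_rw      # below 1 means the model beats the random walk
```

Reporting the ratio alongside the raw errors puts every specification on the same scale. For a series as persistent as a daily CIP deviation, the no-change forecast is typically a demanding benchmark, which is why the referee asks for it.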

Circularity Check

0 steps flagged

No circularity: empirical benchmark uses external lagged public variables with out-of-sample validation

Full rationale

The paper proposes a benchmark for CIP deviations based on three explicitly public and lagged state variables (NFCI, broad USD index, Treasury slope) and evaluates their performance via in-sample regressions and leave-one-year-out cross-validation, along with separate cointegration, quarter-end, and aggregation diagnostics. These elements constitute standard econometric reporting on observed data rather than any self-definitional loop, fitted parameter renamed as prediction, or load-bearing self-citation. The variables are external to the target series, the validation methods are independent of the fitted coefficients, and no ansatz or uniqueness theorem is invoked. The chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

No free parameters, invented entities, or non-standard axioms are mentioned in the abstract; the work rests on standard econometric assumptions for regression, cointegration, and out-of-sample testing.

axioms (1)
  • domain assumption: The three public variables capture a persistent component of CIP deviations after appropriate diagnostics.
    Invoked via the cointegration and quarter-end diagnostics described in the abstract.

pith-pipeline@v0.9.0 · 5627 in / 1057 out tokens · 34684 ms · 2026-05-22T09:04:35.710703+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.