crossfit: A Graph-Based Cross-Fitting Engine in R

arXiv: 2605.15856 · v1 · pith:OE3BD6IU · submitted 2026-05-15 · 📊 stat.CO

Etienne Peyrot, François Petit

Pith reviewed 2026-05-19 18:03 UTC · model grok-4.3

classification 📊 stat.CO

keywords cross-fitting, R package, semiparametric estimation, double machine learning, DAG, nuisance models, fold allocation

The pith

The crossfit R package provides a general-purpose cross-fitting engine using user-specified DAGs of nuisance models with custom fold allocations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces crossfit, an R package that acts as a general-purpose, estimator-agnostic engine for cross-fitting. Users specify a target functional and a directed acyclic graph (DAG) of nuisance models, each with node-specific training fold widths and target-specific evaluation windows. The engine executes reproducible schedules over folds, panels, and repetitions to enforce out-of-sample nuisance predictions. It supports modes such as disjoint and independence-enforcing allocations to control data sharing and dependence between nuisance branches, plus caching and validation for simulation-heavy work.

Core claim

crossfit is a software tool that executes cross-fitting by traversing a user-defined DAG of nuisance models, applying node-specific fold widths for training and target-specific windows for evaluation, while enforcing out-of-sample use to support valid semiparametric inference such as in double/debiased machine learning. The package returns either a scalar estimate or a cross-fitted predictor function and includes explicit scheduling, reuse-aware caching, and failure isolation.
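The out-of-sample rule described above can be illustrated with a small, language-neutral sketch. It is not the crossfit API: the functions `fit_line` and `cross_fit_plr`, the interleaved fold layout, and the toy data are all assumptions made for illustration. The sketch cross-fits a partialling-out estimator for a partially linear model, fitting both nuisance regressions on the training folds and forming residuals only on the held-out fold.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares line y ~ a + b*x; returns a prediction function."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    return lambda x_new: intercept + slope * x_new


def cross_fit_plr(x, d, y, n_folds=5):
    """Partialling-out estimate of theta in y = theta*d + g(x) + noise."""
    n = len(y)
    # Hypothetical fold layout: row i belongs to fold i mod n_folds.
    folds = [list(range(k, n, n_folds)) for k in range(n_folds)]
    num = den = 0.0
    for k, eval_idx in enumerate(folds):
        train_idx = [i for j, f in enumerate(folds) if j != k for i in f]
        # Nuisance models are fit on the training folds only.
        m_hat = fit_line([x[i] for i in train_idx], [d[i] for i in train_idx])
        l_hat = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
        # Residuals use out-of-sample predictions on the held-out fold.
        for i in eval_idx:
            d_res = d[i] - m_hat(x[i])
            y_res = y[i] - l_hat(x[i])
            num += d_res * y_res
            den += d_res ** 2
    return num / den


# Deterministic toy data: y is exactly 2*d + 3*x, so theta = 2.
x = [i / 10 for i in range(50)]
d = [0.5 * xi + ((7 * i) % 11) / 11 for i, xi in enumerate(x)]
y = [2 * di + 3 * xi for di, xi in zip(d, x)]
theta_hat = cross_fit_plr(x, d, y)
```

Because the toy outcome is an exact linear combination of `d` and `x`, the residual-on-residual ratio recovers theta = 2 up to floating-point error, which makes the sketch easy to check by hand.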

What carries the argument

Directed acyclic graph (DAG) of nuisance models with node-specific training fold widths and target-specific evaluation windows, executed by the engine's scheduler and caching logic to generate cross-fitted outputs without data leakage.
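As a rough illustration of what traversing a nuisance DAG involves, the sketch below orders nodes so that each one runs after the nodes it depends on, and rejects cyclic specifications. It uses Python's standard `graphlib` module (Python 3.9+) and is not the package's scheduler. The node names and the reading of the Figure 1 arrow (nui_1 consuming predictions from nui_2) are assumptions.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(deps):
    """Return nodes ordered so every node comes after the nodes it depends on.

    `deps` maps each node to the set of nodes whose predictions it consumes.
    """
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as err:
        # err.args[1] holds the nodes that form the detected cycle.
        raise ValueError(f"nuisance graph has a cycle: {err.args[1]}") from err


# Hypothetical graph: the target needs nui_1 and nui_2,
# and nui_1 itself consumes predictions from nui_2.
deps = {"target": {"nui_1", "nui_2"}, "nui_1": {"nui_2"}, "nui_2": set()}
order = execution_order(deps)
```

In this example the order is forced: nui_2 first, then nui_1, then the target. Validating the graph before any model is fit is one way to provide the kind of defensive checking the paper describes.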

If this is right

Enables valid estimation of low-dimensional targets amid high-dimensional nuisance functions by enforcing out-of-sample predictions.
Allows precise control of dependence between nuisance components via disjoint and independence-enforcing allocation modes that duplicate reused nodes.
Delivers explicit and auditable fold schedules suitable for simulation-heavy benchmarking and method development.
Provides reuse-aware caching and failure isolation to support efficient execution over large experiment grids.
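The difference between shared and disjoint training data can be sketched as follows. This is a hypothetical allocation rule, not the one crossfit implements: the function name, the cyclic ordering of folds after the evaluation fold, and the `widths` mapping (a stand-in for per-node training widths) are all assumptions.

```python
def allocate_training_folds(widths, eval_fold, n_folds, disjoint):
    """Assign training folds to each nuisance node for one evaluation fold.

    `widths` maps node name -> number of training folds. Folds are taken
    cyclically after the evaluation fold, which is never used for training.
    """
    available = [(eval_fold + s) % n_folds for s in range(1, n_folds)]
    if disjoint and sum(widths.values()) > len(available):
        raise ValueError("not enough folds for a disjoint allocation")
    allocation, start = {}, 0
    for node, width in widths.items():
        if width > len(available):
            raise ValueError(f"{node}: width exceeds available folds")
        allocation[node] = available[start:start + width]
        if disjoint:
            start += width  # next node begins where this one ended
    return allocation


widths = {"nui_1": 2, "nui_2": 2}
shared = allocate_training_folds(widths, eval_fold=0, n_folds=5, disjoint=False)
separate = allocate_training_folds(widths, eval_fold=0, n_folds=5, disjoint=True)
```

With five folds and fold 0 held out, the shared allocation trains both nodes on folds 1 and 2, while the disjoint allocation gives folds 1 and 2 to nui_1 and folds 3 and 4 to nui_2. The disjoint mode therefore needs more folds for the same training widths.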

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The DAG-based specification of fold geometry could be used to empirically measure how different dependence structures affect finite-sample variance of cross-fitted estimators.
This design may be adapted to other languages or ML libraries to create consistent cross-fitting backends for causal inference pipelines.
Explicit caching rules could reduce redundant computation when testing many variants of the same target functional.

Load-bearing premise

The package's internal scheduler and caching logic correctly execute the user-specified DAG and fold-allocation rules without introducing unintended data leakage or dependence between nuisance branches.

What would settle it

Run a simulation using the independence-enforcing mode on models known to share data and check whether nuisance predictions from different branches show zero correlation on held-out evaluation data.
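A structural audit of the fold schedule is a cheaper complement to such a simulation. The sketch below assumes a hypothetical schedule layout (a list of steps, each with an evaluation row set and one training row set per branch) and reports two kinds of violation: training on evaluation rows, and two branches sharing training rows. It does not inspect crossfit's actual schedule objects.

```python
from itertools import combinations


def audit_schedule(schedule):
    """Return a list of human-readable violations found in a fold schedule.

    `schedule` is a list of dicts with keys 'eval' (set of row indices) and
    'train' (mapping branch name -> set of row indices). Hypothetical layout.
    """
    problems = []
    for step, entry in enumerate(schedule):
        for branch, rows in entry["train"].items():
            leaked = rows & entry["eval"]
            if leaked:
                problems.append(
                    f"step {step}: {branch} trains on eval rows {sorted(leaked)}"
                )
        for (a, rows_a), (b, rows_b) in combinations(entry["train"].items(), 2):
            shared = rows_a & rows_b
            if shared:
                problems.append(
                    f"step {step}: {a} and {b} share rows {sorted(shared)}"
                )
    return problems


clean = [{"eval": {0, 1}, "train": {"nui_1": {2, 3}, "nui_2": {4, 5}}}]
leaky = [{"eval": {0, 1}, "train": {"nui_1": {1, 2}, "nui_2": {2, 3}}}]
clean_report = audit_schedule(clean)
leaky_report = audit_schedule(leaky)
```

An empty report only shows that the index sets are separated as intended. Whether the resulting predictions behave as independent in finite samples is the empirical question the simulation would address.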

Figures

Figures reproduced from arXiv: 2605.15856 by Etienne Peyrot, Fran\c{c}ois Petit.

**Figure 1.** Fold allocation modes in crossfit. The same method is executed under three fold allocation modes. target denotes the user-defined target functional (the final quantity computed), and nui 1/nui 2 denote two required nuisance models; the arrow nui 2 → nui 1 indicates a nuisance-of-nuisance dependency. Top: nuisance dependency structure and node-specific training widths (train_fold = number of folds used to t…

Original abstract

Cross-fitting is a key ingredient in many semiparametric estimation procedures, such as double/debiased machine learning (DML), enabling valid estimation of low-dimensional targets in the presence of high-dimensional nuisance functions by enforcing out-of-sample use of nuisance predictions. crossfit is an R package that provides a general-purpose, estimator-agnostic cross-fitting engine. Users specify (i) a target functional and (ii) a directed acyclic graph (DAG) of nuisance models, with node-specific training fold widths and target-specific evaluation windows. The engine executes a reproducible schedule over folds, panels, and repetitions, returning either a scalar estimate (mode="estimate") or a cross-fitted predictor function for application to new data (mode="predict"). Beyond standard cross-fitting, crossfit implements fold-allocation modes that control how training data are shared across nuisance components, including disjoint and independence-enforcing allocations that duplicate reused nodes to reduce dependence between nuisance branches. The implementation targets simulation-heavy benchmarking and method development, with explicit and auditable schedules, defensive validation of specifications and nuisance dependencies, reuse-aware caching to avoid redundant refits, and failure isolation policies for large experiment grids. The crossfit package is available on CRAN, openly developed on GitHub under GPL-3, and is intended as a lightweight, tested foundation to prototype and empirically evaluate cross-fitted estimators with explicit control over fold geometry, dependence, and computation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

crossfit packages a DAG-driven scheduler for cross-fitting in R with dependence controls via node duplication, but the text does not show tests confirming the independence modes block leakage.

This package gives R users a structured way to define cross-fitting schedules through a directed acyclic graph of nuisance models, with per-node training widths and allocation modes that duplicate reused nodes to limit dependence between branches. Cross-fitting itself is already standard; the concrete addition here is the explicit graph specification together with the disjoint and independence-enforcing modes. The package targets people who run many simulation studies or prototype semiparametric estimators and want to avoid hand-coding fold logic each time. The implementation includes reuse-aware caching, reproducible schedules across repetitions, and options to return either a scalar estimate or a fitted predictor. Those features are practical for the intended use case of benchmarking and method development. The description also notes defensive validation and failure isolation for large grids, which should be useful in practice.

The soft spot is verification. The abstract states that duplication reduces dependence between nuisance branches and that caching avoids redundant refits, yet it supplies no pseudocode for the scheduler, no cache-key invariant, and no unit-test results that would confirm the isolation actually holds under the claimed modes. If the full manuscript includes those checks or small reproducible examples, the gap is minor; if not, the central guarantee is left untested in the write-up.

This is for R-focused researchers who need a reusable engine for complex nuisance graphs rather than for readers looking for new statistical theory. A reader building custom DML pipelines or running Monte Carlo studies will get direct value from the fold-control options. It deserves a serious referee because the engineering choices are described clearly enough to evaluate, even if the paper is mainly a software contribution. I would send it for peer review with the request that reviewers examine the code for the leakage-prevention logic.

Referee Report

1 major / 1 minor

Summary. The manuscript introduces the crossfit R package as a general-purpose, estimator-agnostic cross-fitting engine for semiparametric procedures such as double/debiased machine learning. Users specify a target functional together with a directed acyclic graph (DAG) of nuisance models, including node-specific training fold widths and target-specific evaluation windows. The engine runs reproducible schedules over folds, panels, and repetitions and returns either a scalar estimate (mode='estimate') or a cross-fitted predictor (mode='predict'). It additionally implements fold-allocation modes, notably disjoint and independence-enforcing allocations that duplicate reused nodes to reduce dependence between nuisance branches, together with caching, defensive validation, and failure-isolation features aimed at simulation-heavy benchmarking.

Significance. If the scheduler and caching logic correctly realize the claimed independence-enforcing duplication without introducing data leakage, the package would supply a lightweight, auditable foundation for prototyping and empirically evaluating cross-fitted estimators in R. Explicit control over fold geometry, dependence structure, and computational reuse would be particularly valuable for simulation studies and method development; the CRAN availability and open GitHub development further support reproducibility.

major comments (1)

[Abstract] Abstract: the claim that independence-enforcing allocations 'duplicate reused nodes to reduce dependence between nuisance branches' is load-bearing for the out-of-sample guarantee that justifies the entire construction. The manuscript describes the intended behavior and defensive validation but supplies neither pseudocode for the scheduler, an invariant statement on cache keys under duplication, nor unit-test results confirming that shared cache entries or fold indices do not create unintended leakage between branches.

minor comments (1)

The abstract would benefit from a concise, self-contained usage snippet showing how a simple DAG and mode are specified.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive comments and for emphasizing the centrality of the independence-enforcing allocation mechanism. We respond to the single major comment below.

Referee: [Abstract] Abstract: the claim that independence-enforcing allocations 'duplicate reused nodes to reduce dependence between nuisance branches' is load-bearing for the out-of-sample guarantee that justifies the entire construction. The manuscript describes the intended behavior and defensive validation but supplies neither pseudocode for the scheduler, an invariant statement on cache keys under duplication, nor unit-test results confirming that shared cache entries or fold indices do not create unintended leakage between branches.

Authors: We agree that the independence-enforcing duplication is central to the validity claim and that the current manuscript would be strengthened by explicit technical documentation. In the revised version we will add (i) pseudocode for the scheduler’s duplication logic in a new appendix, (ii) a precise invariant stating that cache keys are formed from the tuple (node_id, fold_id, duplication_tag) so that duplicated nodes receive distinct keys, and (iii) a concise report of unit-test results that verify no shared cache hits or fold-index collisions occur between branches. These additions will be placed in supplementary material to preserve the main text’s brevity while directly addressing the concern.

Revision: yes
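The cache-key invariant proposed in the simulated rebuttal can be sketched in a few lines. The class `FitCache` and its `get_or_fit` method are hypothetical and say nothing about how the package stores fits; only the key tuple (node_id, fold_id, duplication_tag) comes from the rebuttal text above.

```python
from typing import Callable, Dict, Hashable, Tuple

CacheKey = Tuple[str, int, Hashable]  # (node_id, fold_id, duplication_tag)


class FitCache:
    """Memoise fitted nuisance models under (node_id, fold_id, duplication_tag)."""

    def __init__(self) -> None:
        self._store: Dict[CacheKey, object] = {}
        self.refits = 0  # number of times a model was actually fit

    def get_or_fit(self, node_id, fold_id, duplication_tag,
                   fit: Callable[[], object]):
        key = (node_id, fold_id, duplication_tag)
        if key not in self._store:
            self._store[key] = fit()  # only fit on a cache miss
            self.refits += 1
        return self._store[key]


cache = FitCache()
# Same node, fold, and tag: the second call reuses the first fit.
a = cache.get_or_fit("nui_2", 0, "branch_a", lambda: object())
a_again = cache.get_or_fit("nui_2", 0, "branch_a", lambda: object())
# Same node and fold but a different duplication tag: a separate fit.
b = cache.get_or_fit("nui_2", 0, "branch_b", lambda: object())
```

Under this keying, a node duplicated for two branches never shares a cached fit across them, while repeated requests within one branch are served from the cache. A unit test of the kind the referee asks for could assert exactly those two properties.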

Circularity Check

0 steps flagged

No derivation chain or fitted predictions present; software implementation is self-contained

The manuscript describes an R package implementing cross-fitting over user-specified DAGs of nuisance models, with modes for fold allocation and caching. No equations, first-principles derivations, predictions of new quantities, or parameter fits are claimed. The central contribution is the engine's execution of user-defined schedules and independence rules; its correctness is evaluated against external benchmarks (CRAN tests, GitHub code, simulation use cases) rather than reducing to self-referential definitions or self-citations. No load-bearing step matches any enumerated circularity pattern.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical free parameters, axioms, or invented entities are introduced; the paper describes software features rather than a theoretical derivation.

pith-pipeline@v0.9.0 · 5783 in / 1054 out tokens · 51989 ms · 2026-05-19T18:03:16.118805+00:00 · methodology

Reference graph

Works this paper leans on

15 extracted references · 15 canonical work pages · 1 internal anchor

[1] Bickel PJ. On Adaptive Estimation. The Annals of Statistics. 1982;10(3):647-71. Available from: https://doi.org/10.1214/aos/1176345863

[2] Schick A. On Asymptotically Efficient Estimation in Semiparametric Models. The Annals of Statistics. 1986;14(3):1139-51. Available from: https://doi.org/10.1214/aos/1176350055

[3] Zheng W, van der Laan MJ. Cross-Validated Targeted Minimum-Loss-Based Estimation. In: Targeted Learning: Causal Inference for Observational and Experimental Data. New York, NY: Springer New York; 2011. p. 459-74. Available from: https://doi.org/10.1007/978-1-4419-9782-1_27

[4] Newey WK, Robins JR. Cross-Fitting and Fast Remainder Rates for Semiparametric Estimation. arXiv preprint arXiv:1801.09138. 2018. Available from: https://arxiv.org/abs/1801.09138

[5] Chernozhukov V, Chetverikov D, Demirer M, Duflo E, Hansen C, Newey W, et al. Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal. 2018;21(1):C1-C68. Available from: https://doi.org/10.1111/ectj.12097

[6] Foster DJ, Syrgkanis V. Orthogonal statistical learning. The Annals of Statistics. 2023;51(3):879-908. Available from: https://doi.org/10.1214/23-AOS2258

[7] Bouvier F, Peyrot E, Balendran A, Ségalas C, Roberts I, Petit F, et al. Do machine learning methods lead to similar individualized treatment rules? A comparison study on real data. Statistics in Medicine. 2024;43(11):2043-61. Available from: https://doi.org/10.1002/sim.10059

[8] Bach P, Kurz MS, Chernozhukov V, Spindler M, Klaassen S. DoubleML: An Object-Oriented Implementation of Double Machine Learning in R. Journal of Statistical Software. 2024;108(3):1-56. Available from: https://www.jstatsoft.org/article/view/v108i03

[9] Okasa G. Meta-Learners for Estimation of Causal Effects: Finite Sample Cross-Fit Performance. arXiv. 2022. arXiv:2201.12692 [econ.EM]. Available from: https://arxiv.org/abs/2201.12692

[10] Ellul S, Vansteelandt S, Carlin JB, Moreno-Betancur M. Causal Machine Learning Methods and Use of Cross-Fitting in Settings With High-Dimensional Confounding. Statistics in Medicine. 2025;44(20-22):e70272. Available from: https://doi.org/10.1002/sim.70272

[11] McClean A, Balakrishnan S, Kennedy EH, Wasserman L. Double Cross-fit Doubly Robust Estimators: Beyond Series Regression. arXiv preprint arXiv:2403.15175. 2024. Available from: https://arxiv.org/abs/2403.15175

[12] Fisher A, Fisher V. Three-way Cross-Fitting and Pseudo-Outcome Regression for Estimation of Conditional Effects and other Linear Functionals. arXiv preprint arXiv:2306.07230. 2023. Available from: https://arxiv.org/abs/2306.07230

[13] Ke D, Zhou X, Yang Q, Song X. Doubly Robust Triple Cross-Fit Estimation for Causal Inference with Imaging Data. Statistics in Biosciences. 2024. Online first. Available from: https://doi.org/10.1007/s12561-024-09458-1

[14] Chiang HD, Kato K, Ma Y, Sasaki Y. Multiway Cluster Robust Double/Debiased Machine Learning. Journal of Business & Economic Statistics. 2022;40(3):1046-56. Available from: https://doi.org/10.1080/07350015.2021.1895815

[15] Zivich PN, Breskin A. Machine Learning for Causal Inference: On the Use of Cross-fit Estimators. Epidemiology. 2021;32(3):393-401. Available from: https://doi.org/10.1097/EDE.0000000000001332
