arxiv: 2605.06385 · v1 · submitted 2026-05-07 · 💻 cs.LG

Data-Driven Covariate Selection for Nonparametric and Cycle-Agnostic Causal Effect Estimation

Pith reviewed 2026-05-08 12:45 UTC · model grok-4.3

classification 💻 cs.LG
keywords covariate selection · causal effect estimation · conditional independence · cyclic causal models · adjustment sets · nonparametric estimation · data-driven methods · sigma-acyclification

The pith

Conditional independence-based covariate selection for causal effects works in cyclic models without modification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that a local data-driven method for picking adjustment sets from conditional independence tests, already known to be sound in acyclic graphs, remains valid when feedback loops create cycles. The key step is proving that these independence relations stay unchanged after sigma-acyclification converts the cyclic graph to an acyclic one for analysis. A reader would care because many real systems contain cycles yet current tools either ban them outright or demand full global structure learning first. The outcome is one procedure that works for both cyclic and acyclic cases with no extra steps required. Synthetic data experiments confirm the extension performs reliably.

Core claim

The central claim is that the soundness and completeness guarantees of the local covariate selection method based on conditional independence extend to cyclic causal models. This extension follows directly from the invariance of conditional independence assertions under sigma-acyclification. The result yields a unified cycle-agnostic framework for identifying valid adjustment sets and estimating causal effects that requires no changes between cyclic and acyclic settings.
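A minimal sketch of what a local, conditional-independence-based selection rule can look like. It searches for a covariate W and a set Z such that W is dependent on Y given Z but independent of Y given Z ∪ {X}, the pair of CI statements quoted in the appendix excerpt at the end of this page. The Fisher z-test (which assumes roughly linear-Gaussian data), the function names, the toy data and the thresholds are all illustrative assumptions, not the authors' implementation.

```python
import itertools
import math

import numpy as np


def partial_corr_pvalue(data, i, j, cond):
    """Two-sided Fisher z-test p-value for the partial correlation of columns i, j given cond."""
    idx = [i, j] + list(cond)
    corr = np.corrcoef(data[:, idx], rowvar=False)
    prec = np.linalg.pinv(corr)
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])
    r = max(min(r, 0.999999), -0.999999)  # keep the log below finite
    n = data.shape[0]
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - len(cond) - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def find_adjustment_set(data, x, y, covariates, alpha=0.001, max_size=2):
    """Return (w, z) with w dependent on y given z and independent of y given z + [x], else None."""
    for w in covariates:
        others = [c for c in covariates if c != w]
        for k in range(max_size + 1):
            for z in itertools.combinations(others, k):
                dependent = partial_corr_pvalue(data, w, y, z) < alpha
                independent = partial_corr_pvalue(data, w, y, list(z) + [x]) >= alpha
                if dependent and independent:
                    return w, z
    return None


def make_demo_data(n=5000, seed=0):
    """Hypothetical toy model: W -> X -> Y with an observed confounder C of X and Y."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(n)
    c = rng.standard_normal(n)
    x = w + c + rng.standard_normal(n)
    y = 2.0 * x + 1.5 * c + rng.standard_normal(n)
    return np.column_stack([w, c, x, y])  # columns: W=0, C=1, X=2, Y=3


if __name__ == "__main__":
    print(find_adjustment_set(make_demo_data(), x=2, y=3, covariates=[0, 1]))
```

In this toy model the empty set is rejected because conditioning on X opens the collider W → X ← C, so the search should settle on Z = {C}. The routine only issues CI queries about observed variables and never inspects a graph, which is why an invariance result about CI statements is enough to carry it over to cyclic models.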

What carries the argument

Invariance of conditional independence assertions under sigma-acyclification, which preserves the information needed to select valid adjustment sets even when cycles are present.
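In symbols, the invariance can be written as follows, using the notation of the appendix excerpt quoted at the end of this page: $G$ is the cyclic graph and $G^{\mathrm{acy}}$ its σ-acyclification. The quantification over arbitrary disjoint node sets $A$, $B$, $C$ is an assumption about the scope of the paper's lemma, which is not reproduced in full here.

```latex
A \perp_{\sigma} B \mid C \ \text{ in } G
\quad\Longleftrightarrow\quad
A \perp_{d} B \mid C \ \text{ in } G^{\mathrm{acy}}
```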

If this is right

  • The identical local procedure identifies valid adjustment sets in the presence of feedback loops.
  • No global causal graph recovery is needed for covariate selection regardless of cycles.
  • Nonparametric causal effect estimation applies directly to cyclic models using the same data-driven steps.
  • Empirical reliability holds on synthetic data generated from cyclic structures.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Other properties established only for acyclic models may transfer to cyclic ones through the same invariance argument.
  • Software for causal inference can drop separate acyclicity checks and use one selection routine.
  • The approach suggests a route to handling mixed structures that contain both cycles and acyclic components.

Load-bearing premise

Conditional independence assertions remain unchanged when a cyclic causal graph is transformed by sigma-acyclification.
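For orientation, a small sketch of one way to build an acyclification of a directed graph with optional bidirected edges. It follows the construction as commonly stated in the cyclic-SCM literature listed in the reference graph: edges into a strongly connected component are copied to every member of that component, and the members are joined by bidirected edges. The paper's own definition is not shown on this page, so this construction and the function names are assumptions.

```python
from itertools import combinations


def reachable(adj, start):
    """Nodes reachable from start along directed edges, including start itself."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def strong_components(nodes, directed):
    """Map each node to its strongly connected component (as a frozenset)."""
    adj = {}
    for u, v in directed:
        adj.setdefault(u, set()).add(v)
    reach = {u: reachable(adj, u) for u in nodes}
    return {
        u: frozenset(v for v in nodes if v in reach[u] and u in reach[v])
        for u in nodes
    }


def acyclify(nodes, directed, bidirected=()):
    """Return (directed, bidirected) edge sets of the acyclified graph."""
    sc = strong_components(nodes, directed)
    new_directed = set()
    for u, v in directed:
        if u not in sc[v]:  # edge enters v's component from outside
            new_directed.update((u, w) for w in sc[v])
    new_bidirected = set()
    for comp in set(sc.values()):  # fully connect each component
        new_bidirected.update(frozenset(p) for p in combinations(sorted(comp), 2))
    for u, v in bidirected:  # spread existing bidirected edges over components
        for a in sc[u]:
            for b in sc[v]:
                if a != b:
                    new_bidirected.add(frozenset((a, b)))
    return new_directed, new_bidirected


if __name__ == "__main__":
    # Hypothetical example: A -> B, feedback loop B <-> C via B -> C -> B, and C -> D.
    edges = [("A", "B"), ("B", "C"), ("C", "B"), ("C", "D")]
    print(acyclify(["A", "B", "C", "D"], edges))
```

On the example graph the loop between B and C disappears: A gains an edge to both B and C, the pair is joined by a bidirected edge, and C → D is kept. The premise above amounts to saying that d-separation in this output graph agrees with σ-separation in the input graph.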

What would settle it

A cyclic causal graph together with data generated from it in which the conditional-independence procedure selects an adjustment set that fails to identify the true causal effect.
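A sketch of how such a check could be run in the simplest setting, a linear SCM with a feedback loop solved at equilibrium. The graph, the coefficients, the function names and the hand-picked adjustment set are all hypothetical, and linear regression stands in for a nonparametric estimator. In a real test, the adjustment set would come from the selection procedure rather than being supplied by hand.

```python
import numpy as np


def simulate_linear_scm(B, n, rng):
    """Sample equilibrium solutions of v = B v + e, where B[i, j] is the edge j -> i."""
    d = B.shape[0]
    noise = rng.standard_normal((n, d))
    return noise @ np.linalg.inv(np.eye(d) - B).T


def true_total_effect(B, x, y):
    """Effect of do(X) on Y: cut the edges into X, then read off the equilibrium response."""
    B_do = B.copy()
    B_do[x, :] = 0.0
    return np.linalg.inv(np.eye(B.shape[0]) - B_do)[y, x]


def adjusted_estimate(data, x, y, adjust):
    """OLS coefficient of X when regressing Y on X, the adjustment set, and an intercept."""
    design = np.column_stack([data[:, [x] + list(adjust)], np.ones(len(data))])
    coef, *_ = np.linalg.lstsq(design, data[:, y], rcond=None)
    return coef[0]


def demo_graph():
    """Columns: W=0, C=1, X=2, Y=3, M=4, with a feedback loop between Y and M."""
    B = np.zeros((5, 5))
    B[2, 0] = 1.0  # W -> X
    B[2, 1] = 1.0  # C -> X
    B[3, 2] = 1.0  # X -> Y
    B[3, 1] = 1.5  # C -> Y
    B[3, 4] = 0.5  # M -> Y
    B[4, 3] = 0.4  # Y -> M
    return B


if __name__ == "__main__":
    B = demo_graph()
    data = simulate_linear_scm(B, 20000, np.random.default_rng(0))
    print("true effect:     ", true_total_effect(B, 2, 3))
    print("adjusting for C: ", adjusted_estimate(data, 2, 3, [1]))
    print("no adjustment:   ", adjusted_estimate(data, 2, 3, []))
```

In this model the true effect is 1 / (1 − 0.5 · 0.4) = 1.25 because the loop amplifies the direct coefficient. Adjusting for C should recover it up to sampling noise, while the unadjusted regression is confounded. A counterexample of the kind described above would be a cyclic model in which the set returned by the CI procedure leaves a gap between the first two numbers that does not shrink with sample size.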

Figures

Figures reproduced from arXiv: 2605.06385 by Ana Leticia Garcez Vicente, Gijs van Seeventer, Saber Salehkaleybar.

Figure 1: Socioeconomic status can influence lifestyle. Lifestyle can ...
Figure 2: Relative error of linear models in acyclic (red square) and cyclic (blue circle) settings across ...
Figure 3: Edge fraction accuracy (first row) and precision (second row) of linear and non-linear ...
Figure 4: The y-axis shows the wall run time in seconds, while the x-axis is the number of nodes.
Figure 5: Performance across varying graph sizes of the different Markov blanket discovery algo...
Figure 6: Number of independence tests across varying graph sizes of the different Markov blanket ...
Figure 7: Wall runtime in seconds (s) of the adjustment set discovery method under the pre-treatment ...
Figure 8: Wall runtime in seconds (s) of the adjustment set discovery method under the pre-treatment ...
Figure 9: Performance across varying graph sizes. The x-axis represents the number of nodes, while ...
Figure 10: Precision of linear models in acyclic (red square) and cyclic (blue circle) settings across ...
Figure 11: Empty fraction of linear and non-linear models in acyclic (red square) and cyclic (blue ...

Original abstract

Estimating causal effects from observational data requires identifying valid adjustment sets. This task is especially challenging in realistic settings where latent confounding and feedback loops are present. Existing approaches typically assume acyclicity or rely on global causal structure learning, limiting applicability and computational efficiency. In this work, we study a local, data-driven method for covariate selection based on conditional independence information. While this method is known to be sound and complete in acyclic causal models, its validity in the presence of cycles has remained unclear. Our main contribution is to show that these guarantees extend to cyclic causal models. In particular, our result relies on the invariance of conditional independence assertions under $\sigma$-acyclification. These findings establish a unified, cycle-agnostic perspective on covariate selection and causal effect estimation, showing that the method applies across cyclic and acyclic settings without modification. Empirically, we validate this on extensive synthetic data, showing reliable performance in cyclic causal models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper presents a local data-driven covariate selection procedure for nonparametric causal effect estimation that relies on conditional independence tests. It establishes that the procedure, previously shown to be sound and complete for acyclic causal models, extends without modification to cyclic models because conditional independence assertions are invariant under σ-acyclification. The result yields a cycle-agnostic method applicable even in the presence of latent confounding, with supporting empirical validation on synthetic data.

Significance. If the invariance result is rigorously established, the work unifies covariate selection across cyclic and acyclic graphs, eliminating the need for acyclicity assumptions or global structure learning. This is a meaningful advance for realistic settings with feedback loops, provided the preservation of adjustment-relevant conditional independences holds under latent confounding.

major comments (3)
  1. [Main theorem / invariance argument (following the abstract claim)] The central extension rests on invariance of conditional independence assertions under σ-acyclification. The manuscript must explicitly verify that this transformation preserves precisely the conditional independences used to identify valid adjustment sets (including those involving latent confounders), as failure here would undermine soundness or completeness even if the algorithm is unchanged.
  2. [Proof of extension to cyclic models] The soundness and completeness in acyclic models is taken as given; the cyclic extension therefore requires a self-contained argument or counter-example analysis showing that σ-acyclification does not alter the local covariate-selection decisions when cycles and latent variables coexist.
  3. [Experimental section] Empirical validation on synthetic data is reported, but the experiments must include explicit stress tests with cycles plus latent confounding to confirm that the invariance survives the combination of features that the skeptic identifies as potentially problematic.
minor comments (2)
  1. [Notation and preliminaries] Clarify the precise definition of σ-acyclification and its relation to the original graph when latent variables are present.
  2. [Related work] Ensure all references to prior acyclic results are cited with specific theorem numbers for traceability.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback and for recognizing the potential of the cycle-agnostic covariate selection approach. We agree that the invariance of conditional independence under σ-acyclification is the linchpin of the extension and will strengthen both the formal argument and the empirical validation in the revision. Below we respond to each major comment.

  1. Referee: The central extension rests on invariance of conditional independence assertions under σ-acyclification. The manuscript must explicitly verify that this transformation preserves precisely the conditional independences used to identify valid adjustment sets (including those involving latent confounders), as failure here would undermine soundness or completeness even if the algorithm is unchanged.

    Authors: We will add a dedicated subsection (and supporting lemmas in the appendix) that explicitly maps the conditional independences relevant to adjustment-set identification before and after σ-acyclification. The argument will show that any d-separation statement involving observed or latent variables that determines membership in a valid adjustment set is preserved, because σ-acyclification only removes directed cycles without altering the ancestral relationships or the separating sets used by the local selection procedure. This directly addresses soundness and completeness under latent confounding. revision: yes

  2. Referee: The soundness and completeness in acyclic models is taken as given; the cyclic extension therefore requires a self-contained argument or counter-example analysis showing that σ-acyclification does not alter the local covariate-selection decisions when cycles and latent variables coexist.

    Authors: We will supply a self-contained proof that the local decisions (i.e., which covariates are retained or discarded by the conditional-independence tests) remain identical after σ-acyclification. The proof proceeds by showing that the set of conditional independence queries issued by the algorithm is invariant, because any path that would be blocked or opened by a cycle is replaced by an equivalent acyclic path that yields the same independence relation. We will also include a brief counter-example analysis confirming that no spurious independences are introduced when latent confounders are present. revision: yes

  3. Referee: Empirical validation on synthetic data is reported, but the experiments must include explicit stress tests with cycles plus latent confounding to confirm that the invariance survives the combination of features that the skeptic identifies as potentially problematic.

    Authors: We will expand the experimental section with a new suite of simulations that jointly vary cycle density and the presence of latent confounders. These stress tests will report covariate-selection accuracy, false-positive rates for adjustment-set membership, and downstream causal-effect estimation error under the exact conditions highlighted by the referee. The additional results will be presented alongside the existing acyclic baselines for direct comparison. revision: yes

Circularity Check

0 steps flagged

No significant circularity; central extension rests on a graph transformation and conditional-independence invariance proved in the paper

The derivation chain begins from the known soundness/completeness of the local covariate-selection procedure in acyclic models (cited as established) and extends it to cyclic models by showing that conditional-independence assertions are invariant under σ-acyclification. This invariance is presented as the paper's main mathematical contribution rather than being presupposed by definition or by a self-citation chain. No fitted parameters are relabeled as predictions, no ansatz is smuggled via prior self-work, and the acyclic base case is treated as external input rather than derived from the cyclic result. The argument is therefore self-contained against external benchmarks and receives only a minor self-citation penalty.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper relies on standard assumptions from causal graphical models (conditional independence semantics) and invokes the invariance property of σ-acyclification as the key step for the cyclic extension; no free parameters or new invented entities are introduced in the abstract.

axioms (2)
  • Domain assumption: conditional independence assertions are invariant under σ-acyclification.
    This invariance is the central property used to extend the acyclic guarantees to cyclic models.
  • Domain assumption: the covariate selection method is sound and complete for acyclic causal models.
    The paper treats this as established prior knowledge and builds the cyclic result on top of it.

pith-pipeline@v0.9.0 · 5468 in / 1361 out tokens · 59870 ms · 2026-05-08T12:45:32.150992+00:00 · methodology

Reference graph

Works this paper leans on

33 extracted references · 1 canonical work pages

  1. Miguel A Hernán and James M Robins. Estimating causal effects from epidemiological data. Journal of Epidemiology & Community Health, 60(7):578–586, 2006

  2. Joshua D Angrist and Jörn-Steffen Pischke. Mostly harmless econometrics: An empiricist’s companion. Princeton university press, 2009

  3. Al K Akobeng. Understanding randomised controlled trials. Archives of disease in childhood, 90(8):840–844, 2005

  4. Paul Rosenbaum. Observation and experiment: An introduction to causal inference. Harvard University Press, 2017

  5. Elias Bareinboim and Judea Pearl. Causal inference and the data-fusion problem. Proceedings of the National Academy of Sciences, 113(27):7345–7352, 2016

  6. Denis Talbot, Awa Diop, Miceline Mésidor, Yohann Chiu, Caroline Sirois, Andrew J. Spieker, Antoine Pariente, Pernelle Noize, Marc Simard, Miguel Angel Luque Fernandez, Michael Schomaker, Kenji Fujita, Danijela Gnjidic, and Mireille E. Schnitzer. Guidelines and best practices for the use of targeted maximum likelihood and machine learning when estimating c...

  7. Zheng Li, Xichen Guo, Feng Xie, Yan Zeng, Hao Zhang, and Zhi Geng. Local learning for covariate selection in nonparametric causal effect estimation with latent variables. In D. Belgrave, C. Zhang, H. Lin, R. Pascanu, P. Koniusz, M. Ghassemi, and N. Chen, editors, Advances in Neural Information Processing Systems, volume 38, pages 124423–124455. Curran Asso...

  8. Debo Cheng, Jiuyong Li, Lin Liu, Jiji Zhang, Jixue Liu, and Thuc Duy Le. Local search for efficient causal effect estimation. IEEE Transactions on Knowledge and Data Engineering, 35(9):8823–8837, 2022

  9. Jonas Peters, Dominik Janzing, and Bernhard Schölkopf. Elements of causal inference: foundations and learning algorithms. MIT press, 2017

  10. Christopher P Wild, Elisabete Weiderpass, and Bernard W Stewart. World cancer report. 2020

  11. Stephan Bongers, Patrick Forré, Jonas Peters, and Joris M Mooij. Foundations of structural causal models with cycles and latent variables. The Annals of Statistics, 49(5):2885–2915, 2021

  12. Patrick Forré and Joris M Mooij. Markov properties for graphical models with cycles and latent variables. arXiv preprint arXiv:1710.08775, 2017

  13. Judea Pearl. Causality. Cambridge university press, 2009

  14. Jennifer L Hill. Bayesian nonparametric modeling for causal inference. Journal of Computational and Graphical Statistics, 20(1):217–240, 2011

  15. Guido W Imbens and Donald B Rubin. Causal inference in statistics, social, and biomedical sciences. Cambridge university press, 2015

  16. Stefan Wager and Susan Athey. Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association, 113(523):1228–1242, 2018

  17. Judea Pearl. [bayesian analysis in expert systems]: Comment: Graphical models, causality and intervention. Statistical Science, 8(3):266–269, 1993

  18. Emilija Perković, Johannes Textor, Markus Kalisch, and Marloes H Maathuis. Complete graphical characterization and construction of adjustment sets in markov equivalence classes of ancestral graphs. Journal of Machine Learning Research, 18(220):1–62, 2018

  19. Peter Spirtes and Clark Glymour. An algorithm for fast recovery of sparse causal graphs. Social science computer review, 9(1):62–72, 1991

  20. Marloes H Maathuis, Markus Kalisch, and Peter Bühlmann. Estimating high-dimensional intervention effects from observational data. 2009

  21. Daniel Malinsky and Peter Spirtes. Estimating bounds on causal effects in high-dimensional and possibly confounded systems. International Journal of Approximate Reasoning, 88:371–384, 2017

  22. Antti Hyttinen, Frederick Eberhardt, and Matti Järvisalo. Do-calculus when the true graph is unknown. In UAI, volume 15, pages 395–404, 2015

  23. Doris Entner, Patrik Hoyer, and Peter Spirtes. Data-driven covariate selection for nonparametric estimation of causal effects. In Artificial intelligence and statistics, pages 256–264. PMLR, 2013

  24. Jacqueline Maasch, Kyra Gan, Violet Chen, Agni Orfanoudaki, Nil-Jana Akpinar, and Fei Wang. Local causal discovery for structural evidence of direct discrimination. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 19349–19357, 2025

  25. Ioannis Tsamardinos, Constantin F Aliferis, Alexander R Statnikov, and Er Statnikov. Algorithms for large scale markov blanket discovery. In FLAIRS, volume 2, pages 376–81, 2003

  26. Sandeep Yaramakala and Dimitris Margaritis. Speculative markov blanket discovery for optimal feature selection. In Fifth IEEE International Conference on Data Mining (ICDM’05), pages 4–pp. IEEE, 2005

  27. Constantin F Aliferis, Ioannis Tsamardinos, and Alexander Statnikov. Hiton: a novel markov blanket algorithm for optimal variable selection. In AMIA annual symposium proceedings, volume 2003, page 21, 2003

  28. Jean-Philippe Pellet and André Elisseeff. Using markov blankets for causal structure learning. Journal of Machine Learning Research, 9(7), 2008

  29. Joris M Mooij and Tom Claassen. Constraint-based causal discovery using partial ancestral graphs in the presence of cycles. In Conference on Uncertainty in Artificial Intelligence, pages 1159–1168. PMLR, 2020

  30. Patrick Forré and Joris M Mooij. Causal calculus in the presence of cycles, latent confounders and selection bias. In Uncertainty in Artificial Intelligence, pages 71–80. PMLR, 2020

  31. Jiji Zhang. Causal reasoning with ancestral graphs. Journal of Machine Learning Research, 9(7), 2008. A Markov blanket discovery algorithms. We considered four standard Markov blanket discovery algorithms: IAMB [25], Fast-IAMB [26], HITON-MB

  32. and TC (Total Conditioning) [28]. All methods aim to recover the Markov blanket MB(X) of a target variable X using conditional independence (CI) tests. The TC algorithm relies on an exhaustive characterization: a variable Y belongs to MB(X) iff X and Y are dependent conditional on all remaining variables. This characterization is sound and complete under fa...

  33. For every $Z \subseteq O \setminus \{X, Y\}$, $W \not\perp_{\sigma} Y \mid Z \iff W \not\perp_{d} Y \mid Z$ and $W \perp_{\sigma} Y \mid Z \cup \{X\} \iff W \perp_{d} Y \mid Z \cup \{X\}$, where the right-hand side is evaluated in $G^{\mathrm{acy}}$. Statement (1) is exactly the Markov blanket equivalence under σ-acyclification (Lemma 4.2). For (2), let $W \in O \setminus \{X, Y\}$ and $Z \subseteq O \setminus \{X, Y\}$. By the equivalence of σ-separation in $G$ and d-separation in $G^{\mathrm{acy}}$, we have $W \perp_{\sigma} Y \mid Z \iff W \perp$...