
arxiv: 2509.12981 · v2 · submitted 2025-09-16 · 💻 cs.LG · stat.ML

Causal Discovery via Quantile Partial Effect

Pith reviewed 2026-05-18 15:48 UTC · model grok-4.3

classification 💻 cs.LG stat.ML
keywords causal discovery · quantile partial effect · identifiability · observational distribution · causal direction · Fisher information · basis function test

The pith

Assuming the quantile partial effect of cause on effect lies in a finite linear span makes the causal direction identifiable from the observational distribution alone.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates that if the quantile partial effect of one variable on another fits inside a finite linear span, the direction of causation can be recovered solely from the joint observational distribution of the two variables. This result generalizes earlier identifiability theorems that required additive noise, heteroscedastic noise, or other restrictions inside functional causal models. Because the quantile partial effect is computed directly from the data, the approach does not invoke hidden mechanisms, noise distributions, or the Markov condition. Instead it exploits visible asymmetries in the shape of the observed distribution. Readers should care because the method supplies concrete tests, via basis expansions for pairs and Fisher information for larger sets, that have been checked on many bivariate and multivariate datasets.

Core claim

When the quantile partial effect (QPE) of cause on effect is assumed to lie in a finite linear span, cause and effect are identifiable from their observational distribution. This generalizes previous identifiability results based on Functional Causal Models with additive or heteroscedastic noise, among others. Since QPE resides entirely at the observational level, the parametric assumption does not require considering mechanisms, noise, or the Markov assumption, but rather directly utilizes the asymmetry of shape characteristics in the observational distribution. By performing basis function tests on the estimated QPE, causal directions can be distinguished. For multivariate causal discovery, Fisher Information is sufficient as a statistical measure to determine causal order when assumptions are made about the second moment of QPE.

What carries the argument

Quantile Partial Effect (QPE) obtained from conditional quantile regression, which measures the shift in the outcome distribution at different probability levels and is assumed to lie in a finite linear span.
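
This page gives no formula for the QPE, so the following is only a sketch of one standard way to write it. It assumes a conditional CDF F(y | x) that is differentiable in x and has a strictly positive density f(y | x); the symbols Q, τ, and ψ are illustrative notation and may not match the paper's. Differentiating the defining identity of the conditional quantile with respect to x gives:

```latex
\begin{align}
  F\bigl(Q(\tau \mid x) \mid x\bigr) &= \tau, \\
  \frac{\partial Q(\tau \mid x)}{\partial x}
    &= -\left.\frac{\partial_x F(y \mid x)}{f(y \mid x)}\right|_{y = Q(\tau \mid x)}
    =: \psi(x, y).
\end{align}
```

Read this way, the QPE at a point (x, y) is the slope in x of the conditional quantile curve that passes through that point.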

If this is right

  • Basis function tests on the estimated QPE distinguish causal directions in bivariate settings.
  • Fisher information recovers causal order in multivariate settings once the second moment of the QPE is bounded.
  • The method applies without specifying functional mechanisms or noise distributions.
  • The procedures succeed on a wide range of synthetic and real bivariate and multivariate causal discovery datasets.
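
The Fisher information named in the second bullet can be illustrated numerically. The sketch below computes only the textbook location Fisher information J = ∫ p'(x)² / p(x) dx of a one-dimensional density by quadrature. It is not the paper's multivariate ordering procedure, whose exact statistic this summary does not spell out. The names `fisher_information` and `gaussian_density`, the Gaussian example, and the grid settings are assumptions made for illustration.

```python
import numpy as np

def fisher_information(density, grid):
    """Approximate J = integral of p'(x)**2 / p(x) dx on a uniform grid."""
    p = density(grid)
    dp = np.gradient(p, grid)             # finite-difference derivative p'(x)
    safe_p = np.maximum(p, 1e-300)        # guard against division by zero in far tails
    integrand = dp ** 2 / safe_p
    dx = grid[1] - grid[0]
    return float(np.sum(integrand) * dx)  # Riemann sum; tail contributions are negligible

def gaussian_density(sigma):
    """Return the N(0, sigma^2) density as a function of x (illustrative example)."""
    def p(x):
        return np.exp(-0.5 * (x / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return p

grid = np.linspace(-20.0, 20.0, 4001)
j_wide = fisher_information(gaussian_density(2.0), grid)    # exact value is 1 / 2**2
j_narrow = fisher_information(gaussian_density(1.0), grid)  # exact value is 1 / 1**2
```

For a Gaussian with standard deviation σ the exact value is 1/σ², which gives a quick sanity check on the quadrature: the narrower density carries more Fisher information.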

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The finite-span assumption could be relaxed to other function classes such as smoothness or sparsity constraints to enlarge the set of identifiable models.
  • The link between QPE and score functions suggests possible integration with other information-based causal discovery techniques.
  • Similar quantile-effect tests might be developed for time-series or longitudinal data where the shape asymmetry evolves over time.

Load-bearing premise

The quantile partial effect of the cause on the effect must lie inside some finite-dimensional linear span.

What would settle it

A dataset with known true direction in which the QPE of cause on effect requires an infinite basis yet the finite-basis test recovers the correct direction, or a dataset in which the finite-span condition holds but the recovered direction is incorrect.

Figures

Figures reproduced from arXiv: 2509.12981 by Dehui Du, Xingzhe Sun, Yikang Chen.

Figure 1: Distributions and their QPEs for ANM Y = sin(X) + U and HNM Y = X^3 + (1 + tanh((X − 1)^2))U. (a) Joint distribution (heatmap) and samples (scatterplot); (b) Conditional density of Y | X (heatmap), conditional quantiles (white curves), and their gradients (white arrows); (c) QPE of Y | X (3D surface) and its intersection with the Y-Z plane (white curves); (d) Conditional density and conditional quantiles of X…
Figure 2: True and estimated QPEs of Y | X at samples from HNM Y = X^3 + (1 + tanh((X − 1)^2))U. From left to right: (a) True QPE; (b) QPE-k (Section 3.3); (c) Causal velocity model (Xi et al., 2025) (V-NN); (d) QPE-f (Section 3.4). The black lines represent the intersection of the QPE surface with the Y-Z plane. Only QPE-f's trend along the Y-axis tends to match the true QPE.
Figure 3: ODRs of FICO and baselines on multivariate causal discovery datasets. Lower is better.
Figure 4: Convergence behavior of SKEW, SCORE, and FICO on HNM-GP datasets.
Figure 5: ODRs of FICO and baselines on 12 multivariate causal discovery datasets. Lower is better.
Figure 6: Relationship between FICO's ODR and hyperparameters.
Original abstract

Quantile Partial Effect (QPE) is a statistic associated with conditional quantile regression, measuring the effect of covariates at different levels. Our theory demonstrates that when the QPE of cause on effect is assumed to lie in a finite linear span, cause and effect are identifiable from their observational distribution. This generalizes previous identifiability results based on Functional Causal Models (FCMs) with additive, heteroscedastic noise, etc. Meanwhile, since QPE resides entirely at the observational level, this parametric assumption does not require considering mechanisms, noise, or even the Markov assumption, but rather directly utilizes the asymmetry of shape characteristics in the observational distribution. By performing basis function tests on the estimated QPE, causal directions can be distinguished, which is empirically shown to be effective in experiments on a large number of bivariate causal discovery datasets. For multivariate causal discovery, leveraging the close connection between QPE and score functions, we find that Fisher Information is sufficient as a statistical measure to determine causal order when assumptions are made about the second moment of QPE. We validate the feasibility of using Fisher Information to identify causal order on multiple synthetic and real-world multivariate causal discovery datasets.
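
The statement that the QPE view covers additive and heteroscedastic noise models can be checked numerically on the two models named in the Figure 1 caption. The sketch below assumes standard normal noise U (this page does not state the noise law) and uses the identity QPE = −∂ₓF(y | x) / f(y | x), which follows from implicit differentiation of the conditional quantile. The helper names `make_qpe`, `qpe_numeric`, and `qpe_closed_form` are hypothetical. For a location-scale model Y = f(X) + g(X)U the closed form is f′(x) + g′(x)(y − f(x)) / g(x), so at fixed x the QPE is affine in y, and for the additive model it does not depend on y at all.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def make_qpe(f, df, g, dg):
    """Build numeric and closed-form QPEs for Y = f(X) + g(X) * U, U ~ N(0, 1) (assumed)."""
    def cond_cdf(y, x):
        return norm_cdf((y - f(x)) / g(x))

    def cond_pdf(y, x):
        return norm_pdf((y - f(x)) / g(x)) / g(x)

    def qpe_numeric(x, y, h=1e-5):
        # Central difference of the conditional CDF in x, divided by the density.
        dF_dx = (cond_cdf(y, x + h) - cond_cdf(y, x - h)) / (2.0 * h)
        return -dF_dx / cond_pdf(y, x)

    def qpe_closed_form(x, y):
        return df(x) + dg(x) * (y - f(x)) / g(x)

    return qpe_numeric, qpe_closed_form

# ANM from the Figure 1 caption: Y = sin(X) + U, so the QPE should equal cos(x).
anm_numeric, anm_closed = make_qpe(math.sin, math.cos, lambda x: 1.0, lambda x: 0.0)

# HNM from the Figure 1 caption: Y = X^3 + (1 + tanh((X - 1)^2)) U.
hnm_numeric, hnm_closed = make_qpe(
    lambda x: x ** 3,
    lambda x: 3.0 * x ** 2,
    lambda x: 1.0 + math.tanh((x - 1.0) ** 2),
    lambda x: 2.0 * (x - 1.0) / math.cosh((x - 1.0) ** 2) ** 2,
)

for x, y in [(0.5, 0.3), (1.2, 2.0), (-0.4, 0.1)]:
    print(x, y, anm_numeric(x, y), math.cos(x), hnm_numeric(x, y), hnm_closed(x, y))
```

The numeric and closed-form columns should agree up to finite-difference error. Whether this affine-in-y structure is exactly the finite linear span the paper has in mind is not stated on this page, so treat the example as an illustration of the quantity rather than of the paper's test.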

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes using Quantile Partial Effect (QPE) for causal discovery. It demonstrates that assuming the QPE of the cause on the effect lies in a finite linear span allows identification of cause and effect from the observational distribution alone. This generalizes results from functional causal models with additive or heteroscedastic noise without invoking mechanisms, noise distributions, or the Markov assumption. For bivariate cases, basis function tests on estimated QPE distinguish directions, while for multivariate cases, Fisher information is used based on the connection between QPE and score functions under second-moment assumptions on QPE. The approach is empirically validated on multiple bivariate and multivariate synthetic and real-world datasets.

Significance. If the theoretical results hold, the paper contributes a new observational asymmetry for causal identifiability that operates directly at the level of quantile statistics, potentially extending causal discovery to settings where traditional assumptions do not apply. The empirical results on a large number of datasets provide evidence of practical utility. The connection to Fisher information offers a statistical measure for causal ordering in higher dimensions.

major comments (2)
  1. [§3, main identifiability theorem] The central identifiability claim requires that the finite linear span assumption on QPE(cause on effect) holds asymmetrically. The manuscript does not provide an explicit condition or proof ensuring that the reverse QPE(effect on cause) does not also lie in a finite linear span for the same joint distribution (e.g., when both are low-degree polynomials). Without this, the basis-function test cannot guarantee unique direction identification, which is load-bearing for the claim that directions are distinguishable from observational data alone.
  2. [§4.2, Fisher information derivation] The multivariate claim that Fisher Information suffices to determine causal order relies on assumptions about the second moment of QPE and its link to score functions. A detailed derivation is needed showing how the second-moment condition produces the ordering statistic, including regularity conditions and any error analysis.
minor comments (2)
  1. [Introduction] The definition and notation for QPE should be stated with an explicit equation in the introduction or preliminaries section for clarity.
  2. [Experiments] In the experiments section, specify the choice of basis functions, how the span dimension is selected, and any sensitivity analysis performed.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major comment below and indicate the revisions we will make to improve the manuscript.

Point-by-point responses
  1. Referee: [§3, main identifiability theorem] The central identifiability claim requires that the finite linear span assumption on QPE(cause on effect) holds asymmetrically. The manuscript does not provide an explicit condition or proof ensuring that the reverse QPE(effect on cause) does not also lie in a finite linear span for the same joint distribution (e.g., when both are low-degree polynomials). Without this, the basis-function test cannot guarantee unique direction identification, which is load-bearing for the claim that directions are distinguishable from observational data alone.

    Authors: The identifiability result in Section 3 is conditional on the assumption that the QPE in the true causal direction lies in a finite linear span. The basis-function test then compares the fit of the estimated QPE to this span in each candidate direction and selects the direction with superior fit. We acknowledge that the manuscript does not explicitly characterize the set of distributions for which the assumption holds symmetrically. Such symmetric cases exist (e.g., linear or low-degree polynomial relationships), and in those cases the direction is not identifiable from the observational distribution alone, consistent with classical results. For generic nonlinear relationships the assumption is typically asymmetric. In the revision we will add a remark after Theorem 3 that (i) states the conditional nature of the result, (ii) gives a simple sufficient condition for asymmetry (nonlinearity of the conditional quantile function outside a finite-dimensional subspace), and (iii) notes that the test statistic remains well-defined and selects the better-fitting direction even when both directions satisfy the assumption to different degrees. revision: yes

  2. Referee: [§4.2, Fisher information derivation] The multivariate claim that Fisher Information suffices to determine causal order relies on assumptions about the second moment of QPE and its link to score functions. A detailed derivation is needed showing how the second-moment condition produces the ordering statistic, including regularity conditions and any error analysis.

    Authors: We agree that the current derivation in Section 4.2 is concise and would benefit from additional detail. In the revision we will expand the section to include: (1) a step-by-step derivation starting from the definition of the quantile partial effect, through its second-moment assumption, to the explicit expression for the Fisher information matrix as an ordering statistic; (2) the precise regularity conditions (twice continuous differentiability of the conditional density, finite second moments of the QPE, and integrability of the score); and (3) a short finite-sample error bound relating the estimation error of the QPE (via quantile regression) to the error in the estimated Fisher information. These additions will be placed immediately after the current statement of the multivariate procedure. revision: yes
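
The link between the QPE and score functions that the second response refers to can be sketched in a few lines. Writing ψ(x, y) for the QPE and f(y | x) for the conditional density (illustrative notation, assuming enough smoothness to differentiate), the identity ∂ₓF(y | x) = −ψ(x, y) f(y | x) can be differentiated in y. This is one possible route and is not necessarily the derivation the authors will add:

```latex
\begin{align}
  \partial_x F(y \mid x) &= -\psi(x, y)\, f(y \mid x), \\
  \partial_x f(y \mid x) &= -\partial_y\bigl[\psi(x, y)\, f(y \mid x)\bigr], \\
  \partial_x \log f(y \mid x) &= -\partial_y \psi(x, y) - \psi(x, y)\, \partial_y \log f(y \mid x).
\end{align}
```

The last line expresses the x-component of the conditional score through the QPE and the y-component of the score, which suggests how moment conditions on the QPE could translate into statements about Fisher information, itself a second moment of the score.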

Circularity Check

0 steps flagged

Derivation chain is self-contained without circular reductions

full rationale

The paper's central identifiability result rests on an explicit parametric assumption that the QPE of the true cause on the effect lies in a finite linear span, combined with basis-function tests applied to estimated QPE from the observational joint. This structure does not reduce any prediction or identifiability claim to a fitted quantity by construction, nor does it rely on self-citation load-bearing or imported uniqueness theorems. The multivariate extension invokes a connection to score functions and Fisher information under stated second-moment assumptions on QPE, but these remain observational statistics whose validity is tested empirically rather than forced by redefinition. No step equates the target causal order to an input fit or renames a known result; the derivation therefore retains independent content from the stated assumption and the proposed asymmetry tests.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The approach rests primarily on the domain assumption regarding the linear span of QPE and the second moment condition, with no new invented entities or free parameters explicitly fitted to the causal result itself.

free parameters (1)
  • linear span dimension for QPE
    The finite dimension of the linear span in which QPE is assumed to lie may require selection or fitting in application.
axioms (2)
  • domain assumption: The quantile partial effect of the cause on the effect lies in a finite linear span
    This is the core parametric assumption enabling identifiability directly from the observational distribution.
  • domain assumption: Second-moment assumptions on the QPE, needed to use Fisher Information in multivariate settings
    Required for determining causal order in the multivariate case.

pith-pipeline@v0.9.0 · 5732 in / 1445 out tokens · 62549 ms · 2026-05-18T15:48:15.433289+00:00 · methodology



Reference graph

Works this paper leans on

65 extracted references · 65 canonical work pages

  1. [1]

    On Pearl’s Hierarchy and the Foundations of Causal Inference

    Elias Bareinboim, Juan D. Correa, Duligur Ibeling, and Thomas Icard. On Pearl’s Hierarchy and the Foundations of Causal Inference, pp. 507–556. Association for Computing Machinery, New York, NY, USA, 1st edition, 2022. ISBN 9781450395861

  2. [2]

    Dagma: Learning dags via m-matrices and a log-determinant acyclicity characterization

    Kevin Bello, Bryon Aragam, and Pradeep Ravikumar. Dagma: Learning dags via m-matrices and a log-determinant acyclicity characterization. Advances in Neural Information Processing Systems, 35:8226–8239, 2022

  3. [3]

    Cause-effect inference by comparing regression errors

    Patrick Bloebaum, Dominik Janzing, Takashi Washio, Shohei Shimizu, and Bernhard Schoelkopf. Cause-effect inference by comparing regression errors. In Amos Storkey and Fernando Perez-Cruz (eds.), Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, volume 84 of Proceedings of Machine Learning Research, pp. 90...

  4. [4]

    Certain cases in which the vanishing of the wronskian is a sufficient condition for linear dependence

    Maxime Bocher. Certain cases in which the vanishing of the wronskian is a sufficient condition for linear dependence. Trans. Am. Math. Soc., 2(2):139, April 1901

  5. [5]

    Cam: Causal additive models, high-dimensional order search and penalized regression

    Peter Bühlmann, Jonas Peters, and Jan Ernest. Cam: Causal additive models, high-dimensional order search and penalized regression. The Annals of Statistics, 42(6):2526–2556, 2014. ISSN 00905364

  6. [6]

    Exogenous isomorphism for counterfactual identifiability

    Yikang Chen and Dehui Du. Exogenous isomorphism for counterfactual identifiability. In Forty-second International Conference on Machine Learning, 2025

  7. [7]

    Optimal structure identification with greedy search

    David Maxwell Chickering. Optimal structure identification with greedy search. Journal of Machine Learning Research, 3(Nov):507–554, 2002

  8. [8]

    A bayesian method for the induction of probabilistic networks from data

    Gregory F Cooper and Edward Herskovits. A bayesian method for the induction of probabilistic networks from data. Mach. Learn., 9(4):309–347, October 1992

  9. [9]

    Inferring deterministic causal relations

    Povilas Daniušis, Dominik Janzing, Joris Mooij, Jakob Zscheischler, Bastian Steudel, Kun Zhang, and Bernhard Schölkopf. Inferring deterministic causal relations. In Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence, UAI'10, pp. 143–150, Arlington, Virginia, USA, 2010. AUAI Press. ISBN 9780974903965

  10. [10]

    Density estimation using real NVP

    Laurent Dinh, Jascha Sohl-Dickstein, and Samy Bengio. Density estimation using real NVP. In International Conference on Learning Representations, 2017

  11. [11]

    Bivariate causal discovery via conditional divergence

    Bao Duong and Thin Nguyen. Bivariate causal discovery via conditional divergence. In Bernhard Schölkopf, Caroline Uhler, and Kun Zhang (eds.), Proceedings of the First Conference on Causal Learning and Reasoning, volume 177 of Proceedings of Machine Learning Research, pp. 236–252. PMLR, 11–13 Apr 2022

  12. [12]

    Heteroscedastic causal structure learning

    Bao Duong and Thin Nguyen. Heteroscedastic causal structure learning. In Frontiers in Artificial Intelligence and Applications. IOS Press, September 2023

  13. [13]

    Neural spline flows

    Conor Durkan, Artur Bekasov, Iain Murray, and George Papamakarios. Neural spline flows. In Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d'Alché-Buc, Emily B. Fox, and Roman Garnett (eds.), Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-...

  14. [14]

    Conditional Distribution Variability Measures for Causality Detection

    José A. R. Fonollosa. Conditional Distribution Variability Measures for Causality Detection, pp. 339–347. Springer International Publishing, Cham, 2019. ISBN 978-3-030-21810-2. doi:10.1007/978-3-030-21810-2_12

  15. [15]

    Learning functional causal models with generative neural networks

    Olivier Goudet, Diviyan Kalainathan, Philippe Caillou, Isabelle Guyon, David Lopez-Paz, and Michele Sebag. Learning functional causal models with generative neural networks. In Explainable and interpretable models in computer vision and machine learning, pp. 39–80. Springer, 2018

  16. [16]

    Causal feature selection

    Isabelle Guyon, Constantin Aliferis, et al. Causal feature selection. In Computational methods of feature selection, pp. 79–102. Chapman and Hall/CRC, 2007

  17. [17]

    Cause effect pairs in machine learning

    Isabelle Guyon, Alexander Statnikov, and Berna Bakir Batu. Cause effect pairs in machine learning. Springer Nature, Cham, Switzerland, October 2019

  18. [18]

    Nonlinear causal discovery with additive noise models

    Patrik Hoyer, Dominik Janzing, Joris M Mooij, Jonas Peters, and Bernhard Schölkopf. Nonlinear causal discovery with additive noise models. In D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou (eds.), Advances in Neural Information Processing Systems, volume 21. Curran Associates, Inc., 2008

  19. [19]

    Neural autoregressive flows

    Chin-Wei Huang, David Krueger, Alexandre Lacoste, and Aaron Courville. Neural autoregressive flows. In Jennifer Dy and Andreas Krause (eds.), Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pp. 2078–2087. PMLR, 10–15 Jul 2018

  20. [20]

    Estimation of a structural vector autoregression model using non-gaussianity

    Aapo Hyvärinen, Kun Zhang, Shohei Shimizu, and Patrik O. Hoyer. Estimation of a structural vector autoregression model using non-gaussianity. Journal of Machine Learning Research, 11(56):1709–1731, 2010

  21. [21]

    On the identifiability and estimation of causal location-scale noise models

    Alexander Immer, Christoph Schultheiss, Julia E Vogt, Bernhard Schölkopf, Peter Bühlmann, and Alexander Marx. On the identifiability and estimation of causal location-scale noise models. In Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, and Jonathan Scarlett (eds.), Proceedings of the 40th International Conference...

  22. [22]

    Causal normalizing flows: from theory to practice

    Adrián Javaloy, Pablo Sánchez-Martín, and Isabel Valera. Causal normalizing flows: from theory to practice. Advances in Neural Information Processing Systems, 36:58833–58864, 2023

  23. [23]

    Causal discovery toolbox: Uncovering causal relationships in python

    Diviyan Kalainathan, Olivier Goudet, and Ritik Dutta. Causal discovery toolbox: Uncovering causal relationships in python. Journal of Machine Learning Research, 21(37):1–5, 2020

  24. [24]

    Causal autoregressive flows

    Ilyes Khemakhem, Ricardo Monti, Robert Leech, and Aapo Hyvarinen. Causal autoregressive flows. In Arindam Banerjee and Kenji Fukumizu (eds.), Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, volume 130 of Proceedings of Machine Learning Research, pp. 3520–3528. PMLR, 13–15 Apr 2021

  25. [25]

    Quantile Regression

    Roger Koenker. Quantile Regression. Econometric Society Monographs. Cambridge University Press, 2005

  26. [26]

    A skewness-based criterion for addressing heteroscedastic noise in causal discovery

    Yingyu Lin, Yuxing Huang, Wenqin Liu, Haoran Deng, Ignavier Ng, Kun Zhang, Mingming Gong, Yian Ma, and Biwei Huang. A skewness-based criterion for addressing heteroscedastic noise in causal discovery. In The Thirteenth International Conference on Learning Representations, 2025

  27. [27]

    Generating realistic in silico gene networks for performance assessment of reverse engineering methods

    Daniel Marbach, Thomas Schaffter, Claudio Mattiussi, and Dario Floreano. Generating realistic in silico gene networks for performance assessment of reverse engineering methods. Journal of Computational Biology, 16(2):229–239, 2009. doi:10.1089/cmb.2008.09TT. PMID: 19183003

  28. [28]

    Identifiability of cause and effect using regularized regression

    Alexander Marx and Jilles Vreeken. Identifiability of cause and effect using regularized regression. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD '19, pp. 852–861, New York, NY, USA, 2019a. Association for Computing Machinery. ISBN 9781450362016. doi:10.1145/3292500.3330854

  29. [29]

    Telling cause from effect by local and global regression

    Alexander Marx and Jilles Vreeken. Telling cause from effect by local and global regression. Knowl. Inf. Syst., 60(3):1277–1305, September 2019b

  30. [30]

    Causal discovery with score matching on additive models with arbitrary noise

    Francesco Montagna, Nicoletta Noceti, Lorenzo Rosasco, Kun Zhang, and Francesco Locatello. Causal discovery with score matching on additive models with arbitrary noise. In Mihaela van der Schaar, Cheng Zhang, and Dominik Janzing (eds.), Proceedings of the Second Conference on Causal Learning and Reasoning, volume 213 of Proceedings of Machine Learning Res...

  31. [31]

    Distinguishing cause from effect using observational data: Methods and benchmarks

    Joris M. Mooij, Jonas Peters, Dominik Janzing, Jakob Zscheischler, and Bernhard Schölkopf. Distinguishing cause from effect using observational data: Methods and benchmarks. Journal of Machine Learning Research, 17(32):1–102, 2016

  32. [32]

    Counterfactual identifiability of bijective causal models

    Arash Nasr-Esfahany, Mohammad Alizadeh, and Devavrat Shah. Counterfactual identifiability of bijective causal models. In Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, and Jonathan Scarlett (eds.), Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research,...

  33. [33]

    Standardizing structural causal models

    Weronika Ormaniec, Scott Sussex, Lars Lorch, Bernhard Schölkopf, and Andreas Krause. Standardizing structural causal models. In The Thirteenth International Conference on Learning Representations, 2025

  34. [34]

    Causality: Models, Reasoning and Inference

    Judea Pearl. Causality: Models, Reasoning and Inference. Cambridge University Press, USA, 2nd edition, 2009. ISBN 052189560X

  35. [35]

    Causal discovery with continuous additive noise models

    Jonas Peters, Joris M. Mooij, Dominik Janzing, and Bernhard Schölkopf. Causal discovery with continuous additive noise models. Journal of Machine Learning Research, 15(58):2009–2053, 2014

  36. [36]

    Beware of the simulated dag! causal discovery benchmarks may be easy to game

    Alexander Reisach, Christof Seiler, and Sebastian Weichwald. Beware of the simulated dag! causal discovery benchmarks may be easy to game. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. Wortman Vaughan (eds.), Advances in Neural Information Processing Systems, volume 34, pp. 27772–27784. Curran Associates, Inc., 2021

  37. [37]

    A scale-invariant sorting criterion to find a causal order in additive noise models

    Alexander Reisach, Myriam Tami, Christof Seiler, Antoine Chambaz, and Sebastian Weichwald. A scale-invariant sorting criterion to find a causal order in additive noise models. In A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine (eds.), Advances in Neural Information Processing Systems, volume 36, pp. 785–807. Curran Associates, Inc., 2023

  38. [38]

    Score matching enables causal discovery of nonlinear additive noise models

    Paul Rolland, Volkan Cevher, Matthäus Kleindessner, Chris Russell, Dominik Janzing, Bernhard Schölkopf, and Francesco Locatello. Score matching enables causal discovery of nonlinear additive noise models. In Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvari, Gang Niu, and Sivan Sabato (eds.), Proceedings of the 39th International Con...

  39. [39]

    Causal protein-signaling networks derived from multiparameter single-cell data

    Karen Sachs, Omar Perez, Dana Pe'er, Douglas A. Lauffenburger, and Garry P. Nolan. Causal protein-signaling networks derived from multiparameter single-cell data. Science, 308(5721):523–529, 2005. doi:10.1126/science.1105809

  40. [40]

    Diffusion models for causal discovery via topological ordering

    Pedro Sanchez, Xiao Liu, Alison Q O'Neil, and Sotirios A. Tsaftaris. Diffusion models for causal discovery via topological ordering. In The Eleventh International Conference on Learning Representations, 2023

  41. [41]

    A linear non-gaussian acyclic model for causal discovery

    Shohei Shimizu, Patrik O. Hoyer, Aapo Hyvärinen, and Antti Kerminen. A linear non-gaussian acyclic model for causal discovery. Journal of Machine Learning Research, 7(72):2003–2030, 2006

  42. [42]

    Directlingam: A direct method for learning a linear non-gaussian structural equation model

    Shohei Shimizu, Takanori Inazumi, Yasuhiro Sogawa, Aapo Hyvärinen, Yoshinobu Kawahara, Takashi Washio, Patrik O. Hoyer, and Kenneth Bollen. Directlingam: A direct method for learning a linear non-gaussian structural equation model. Journal of Machine Learning Research, 12(33):1225–1248, 2011

  43. [43]

    Directed cyclic graphical representations of feedback models

    Peter Spirtes. Directed cyclic graphical representations of feedback models. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, UAI'95, pp. 491–498, San Francisco, CA, USA, 1995. Morgan Kaufmann Publishers Inc. ISBN 1558603859

  44. [44]

    An algorithm for fast recovery of sparse causal graphs

    Peter Spirtes and Clark Glymour. An algorithm for fast recovery of sparse causal graphs. Social Science Computer Review, 9(1):62–72, 1991

  45. [45]

    A bound for the error in the normal approximation to the distribution of a sum of dependent random variables

    Charles Stein. A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. In Proceedings of the sixth Berkeley symposium on mathematical statistics and probability, volume 2: Probability theory, volume 6, pp. 583–603. University of California Press, 1972

  46. [46]

    Identifying patient-specific root causes with the heteroscedastic noise model

    Eric V. Strobl and Thomas A. Lasko. Identifying patient-specific root causes with the heteroscedastic noise model. Journal of Computational Science, 72:102099, 2023. ISSN 1877-7503. doi:10.1016/j.jocs.2023.102099

  47. [47]

    Cause-effect inference in location-scale noise models: Maximum likelihood vs. independence testing

    Xiangyu Sun and Oliver Schulte. Cause-effect inference in location-scale noise models: Maximum likelihood vs. independence testing. Advances in Neural Information Processing Systems, 36:5447–5483, 2023

  48. [48]

    Distinguishing cause from effect using quantiles: Bivariate quantile causal discovery

    Natasa Tagasovska, Valérie Chavez-Demoulin, and Thibault Vatter. Distinguishing cause from effect using quantiles: Bivariate quantile causal discovery. In Hal Daumé III and Aarti Singh (eds.), Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, pp. 9311–9323. PMLR, 13–18 Jul 2020

  49. [49]

    Optimal transport for causal discovery

    Ruibo Tu, Kun Zhang, Hedvig Kjellstrom, and Cheng Zhang. Optimal transport for causal discovery. In International Conference on Learning Representations, 2022

  50. [50]

    Syntren: a generator of synthetic gene expression data for design and analysis of structure learning algorithms

    Tim Van den Bulcke, Koenraad Van Leemput, Bart Naudts, Piet van Remortel, Hongwu Ma, Alain Verschoren, Bart De Moor, and Kathleen Marchal. Syntren: a generator of synthetic gene expression data for design and analysis of structure learning algorithms. BMC Bioinformatics, 7(1):43, 2006

  51. [51]

    Unconstrained monotonic neural networks

    Antoine Wehenkel and Gilles Louppe. Unconstrained monotonic neural networks. In Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d'Alché-Buc, Emily B. Fox, and Roman Garnett (eds.), Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, V...

  52. [52]

    Distinguishing cause from effect with causal velocity models

    Johnny Xi, Hugh Dance, Peter Orbanz, and Benjamin Bloem-Reddy. Distinguishing cause from effect with causal velocity models. In Forty-second International Conference on Machine Learning, 2025

  53. [53]

    Inferring cause and effect in the presence of heteroscedastic noise

    Sascha Xu, Osman A Mian, Alexander Marx, and Jilles Vreeken. Inferring cause and effect in the presence of heteroscedastic noise. In Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvari, Gang Niu, and Sivan Sabato (eds.), Proceedings of the 39th International Conference on Machine Learning, volume 162 of Proceedings of Machine Learning Research...

  54. [54]

    Information-theoretic causal discovery in topological order

    Sascha Xu, Sarah Mameche, and Jilles Vreeken. Information-theoretic causal discovery in topological order. In The 28th International Conference on Artificial Intelligence and Statistics, 2025

  55. [55]

    Ordering-based causal discovery for linear and nonlinear relations

    Zhuopeng Xu, Yujie Li, Cheng Liu, and Ning Gui. Ordering-based causal discovery for linear and nonlinear relations. In A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang (eds.), Advances in Neural Information Processing Systems, volume 37, pp. 4315--4340. Curran Associates, Inc., 2024

  56. [56]

    Causalvae: Disentangled representation learning via neural structural causal models

    Mengyue Yang, Furui Liu, Zhitang Chen, Xinwei Shen, Jianye Hao, and Jun Wang. Causalvae: Disentangled representation learning via neural structural causal models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9593--9602, June 2021

  57. [57]

    DAG-GNN: DAG structure learning with graph neural networks

    Yue Yu, Jie Chen, Tian Gao, and Mo Yu. DAG-GNN: DAG structure learning with graph neural networks. In Kamalika Chaudhuri and Ruslan Salakhutdinov (eds.), Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pp. 7154--7163. PMLR, 09--15 Jun 2019

  58. [58]

    gcastle: A python toolbox for causal discovery

    Keli Zhang, Shengyu Zhu, Marcus Kalander, Ignavier Ng, Junjian Ye, Zhitang Chen, and Lujia Pan. gcastle: A python toolbox for causal discovery. arXiv preprint arXiv:2111.15155, 2021

  59. [59]

    On the identifiability of the post-nonlinear causal model

    Kun Zhang and Aapo Hyvärinen. On the identifiability of the post-nonlinear causal model. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, UAI '09, pp. 647–655, Arlington, Virginia, USA, 2009. AUAI Press. ISBN 9780974903958

  60. [60]

    Dags with no tears: Continuous optimization for structure learning

    Xun Zheng, Bryon Aragam, Pradeep K Ravikumar, and Eric P Xing. Dags with no tears: Continuous optimization for structure learning. Advances in neural information processing systems, 31, 2018

  61. [61]

    Causal-learn: Causal discovery in python

    Yujia Zheng, Biwei Huang, Wei Chen, Joseph Ramsey, Mingming Gong, Ruichu Cai, Shohei Shimizu, Peter Spirtes, and Kun Zhang. Causal-learn: Causal discovery in python. Journal of Machine Learning Research, 25(60):1--8, 2024
