Causal Discovery in the Era of Agents
Pith reviewed 2026-06-26 08:19 UTC · model grok-4.3
The pith
Agents should handle data inspection and assumption explanation in causal discovery but must not generate edges, directions, or conclusions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Agents should inspect data, retrieve context, explain method assumptions and clarify graph outputs, but they should not supply edges, orientations, priors, constraints or causal conclusions. Causal claims remain grounded in data, explicit assumptions, formal algorithms, diagnostics and user or domain-expert decisions, as instantiated in the causal-learn+ platform that coordinates analysis around the causal-learn ecosystem without allowing language-model outputs to become causal evidence.
What carries the argument
The principle that agents assist the workflow while causal claims remain grounded exclusively in data, assumptions, algorithms and expert decisions, implemented as causal-learn+.
If this is right
- Causal discovery pipelines can incorporate language models for context retrieval and output clarification without the models determining structure.
- Expert knowledge enters only through explicit user or domain-expert input rather than model-proposed constraints.
- Method selection and diagnostics stay under algorithmic control even when agents suggest candidates.
- Interpretation of results remains traceable to data and formal assumptions rather than textual associations.
Where Pith is reading between the lines
- The same separation principle could apply to other scientific workflows where generative models risk substituting associations for measurements.
- Platforms built this way may make it easier to audit exactly which steps relied on data versus assistance.
- If the separation proves hard to enforce, hybrid human-AI review checkpoints would become necessary at every handoff.
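The audit idea in the list above can be made concrete with a small provenance check. This is a minimal sketch, not part of causal-learn+ or causal-learn: the `Step` record, the `EVIDENCE_SOURCES` set and the `audit` function are hypothetical names introduced here, and the example pipeline is invented for illustration.

```python
from dataclasses import dataclass
from typing import List

# Sources allowed to shape the graph under the paper's principle (assumed labels).
EVIDENCE_SOURCES = {"data", "algorithm", "expert"}


@dataclass(frozen=True)
class Step:
    name: str
    source: str          # "data", "algorithm", "expert", or "agent"
    affects_graph: bool  # True if the step can change edges, priors or constraints


def audit(steps: List[Step]) -> List[str]:
    """Return the names of steps where a non-evidence source could shape the graph."""
    return [s.name for s in steps
            if s.affects_graph and s.source not in EVIDENCE_SOURCES]


# Hypothetical pipeline: three steps respect the separation, one violates it.
pipeline = [
    Step("summarize missing values", "agent", affects_graph=False),
    Step("run discovery algorithm", "algorithm", affects_graph=True),
    Step("forbid edge proposed by agent", "agent", affects_graph=True),
    Step("explain output graph", "agent", affects_graph=False),
]
flagged = audit(pipeline)
print(flagged)
```

An empty result would mean every graph-shaping step traces back to data, a formal algorithm or an expert decision. The check is only as reliable as the provenance labels recorded for each step.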
Load-bearing premise
That agent roles can be kept strictly separate from causal inference steps so language-model outputs never leak into final edges, priors or conclusions.
What would settle it
An empirical case where an agent limited to inspection and explanation still produces a causal graph whose structure matches a known language-model hallucination rather than the data diagnostics.
Original abstract
Recent attempts to combine large language models (LLMs) with causal discovery ask models to infer pairwise directions, propose graph structures, or inject language-model outputs as priors and constraints. These approaches promise faster analysis, but they also obscure whether causal evidence is supported by data and assumptions or by textual associations, prompt artifacts and hallucinated mechanisms. We argue for a different role for agents in causal discovery. Agents should inspect data, retrieve context, explain method assumptions and clarify graph outputs, but they should not supply edges, orientations, priors, constraints or causal conclusions. We propose the principle that agents assist the workflow, while causal claims remain grounded in data, explicit assumptions, formal algorithms, diagnostics and user or domain-expert decisions. We instantiate this principle in causal-learn+, an online platform that coordinates data analysis, preprocessing, method recommendation, expert-knowledge incorporation, formal discovery and interpretation around the algorithmic ecosystem of causal-learn. A case study on Big Five personality data illustrates an agent-assisted causal discovery pipeline that does not turn language-model unreliability into causal evidence. The platform is available at causallearn.com.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper argues that LLMs and agents should not directly infer causal edges, orientations, priors, constraints or conclusions in discovery tasks, as this risks conflating textual associations with data-supported evidence. Instead, agents should only assist by inspecting data, retrieving context, explaining method assumptions and clarifying outputs. The authors propose a principle that causal claims must remain grounded exclusively in data, explicit assumptions, formal algorithms, diagnostics and domain-expert decisions. They instantiate the principle in the causal-learn+ online platform, which coordinates preprocessing, method recommendation, expert-knowledge incorporation, formal discovery via the causal-learn library and interpretation. A descriptive case study on Big Five personality data is presented to illustrate an agent-assisted workflow that avoids turning LLM outputs into causal evidence.
Significance. If the proposed separation of roles can be reliably enforced, the work could help preserve the epistemic grounding of causal discovery methods by excluding unreliable LLM-generated content from the inference pipeline. The availability of the causal-learn+ platform and the explicit statement of the principle provide a concrete starting point for discussion in the causal discovery community.
major comments (2)
- [causal-learn+ description] § on causal-learn+ instantiation: the steps of 'method recommendation' and 'expert-knowledge incorporation' necessarily involve agent outputs that select algorithms or shape constraints; no mechanism is described that isolates these outputs from the formal discovery pipeline, so prompt artifacts could still determine which method runs or which domain constraints are applied, directly contradicting the central claim that agents supply neither priors nor constraints.
- [case study] Case study section: the illustration on Big Five personality data is purely descriptive and provides no quantitative comparison (e.g., edge recovery rates, false-positive rates, or stability metrics) against either direct LLM-based discovery or a non-agent baseline; without such controls the case study cannot demonstrate that the separation principle improves causal accuracy.
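The quantitative controls requested in the second major comment could be computed along the following lines. This is a minimal sketch under stated assumptions: it presupposes a reference graph (for example from simulated data, since no ground truth is given for the Big Five data), represents graphs as sets of directed `(cause, effect)` pairs, and uses hypothetical function names `edge_metrics` and `edge_stability` that do not come from the paper or from causal-learn.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str]  # (cause, effect)


def edge_metrics(reference: Iterable[Edge], estimated: Iterable[Edge],
                 n_nodes: int) -> Dict[str, float]:
    """Directed-edge precision, recall and false-positive rate against a reference graph."""
    ref, est = set(reference), set(estimated)
    tp = len(ref & est)   # edges recovered with the correct direction
    fp = len(est - ref)   # estimated edges absent from the reference
    fn = len(ref - est)   # reference edges that were missed
    # Ordered node pairs (no self-loops) that are not edges in the reference.
    negatives = n_nodes * (n_nodes - 1) - len(ref)
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "false_positive_rate": fp / negatives if negatives else 0.0,
    }


def edge_stability(runs: List[Iterable[Edge]]) -> Dict[Edge, float]:
    """Fraction of runs (e.g. bootstrap resamples) in which each edge appears."""
    counts = Counter(edge for run in runs for edge in set(run))
    return {edge: count / len(runs) for edge, count in counts.items()}


# Toy example: one edge recovered, one edge reversed.
reference = [("A", "B"), ("B", "C")]
estimated = [("A", "B"), ("C", "B")]
metrics = edge_metrics(reference, estimated, n_nodes=3)
stability = edge_stability([[("A", "B")], [("A", "B"), ("B", "C")]])
print(metrics, stability)
```

Running the same functions on graphs from an agent-assisted pipeline, a direct LLM-based method and a non-agent baseline would give the kind of side-by-side comparison the comment asks for. A reversed edge counts here as both a false positive and a false negative, which is one convention among several.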
minor comments (2)
- [abstract/introduction] The abstract and introduction use 'agents' and 'LLMs' interchangeably in places; a brief clarification of the distinction would improve precision.
- [platform description] No pseudocode or explicit workflow diagram is provided for how the platform routes agent outputs versus formal algorithm outputs; adding one would clarify the separation.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which help clarify the boundaries of our proposed principle. We respond to each major comment below.
Point-by-point responses
- Referee: § on causal-learn+ instantiation: the steps of 'method recommendation' and 'expert-knowledge incorporation' necessarily involve agent outputs that select algorithms or shape constraints; no mechanism is described that isolates these outputs from the formal discovery pipeline, so prompt artifacts could still determine which method runs or which domain constraints are applied, directly contradicting the central claim that agents supply neither priors nor constraints.
  Authors: We agree that the current description of causal-learn+ does not sufficiently detail how agent-generated suggestions are isolated from the formal pipeline. The manuscript states that agents assist the workflow while causal claims remain grounded exclusively in data, explicit assumptions, formal algorithms, diagnostics and domain-expert decisions; however, to make this separation explicit, we will revise the causal-learn+ section to describe a mandatory user-approval gate: agent recommendations for methods or constraints are presented as non-binding suggestions, logged separately, and only incorporated after explicit user or expert confirmation. This revision will also add that the formal discovery step (via causal-learn) operates solely on the approved inputs without further agent intervention.
  Revision: yes
- Referee: Case study section: the illustration on Big Five personality data is purely descriptive and provides no quantitative comparison (e.g., edge recovery rates, false-positive rates, or stability metrics) against either direct LLM-based discovery or a non-agent baseline; without such controls the case study cannot demonstrate that the separation principle improves causal accuracy.
  Authors: The case study is presented strictly as an illustration of the agent-assisted workflow on real data, showing how the platform coordinates preprocessing, method selection, expert input, algorithmic discovery and interpretation without converting LLM outputs into causal evidence. The manuscript does not claim or attempt to demonstrate that the separation principle yields higher causal accuracy than direct LLM-based methods; such a claim would require a controlled benchmark study, which lies outside the scope of the current work focused on the principle and platform design. We therefore do not plan to add quantitative comparisons to the case study.
  Revision: no
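The mandatory user-approval gate promised in the first response could look roughly like the sketch below. It is an illustration of the described behavior (non-binding suggestions, a separate log, incorporation only after explicit confirmation), not the platform's implementation: `Suggestion`, `ApprovalGate` and `run_discovery` are hypothetical names, and the `algorithms` mapping stands in for whatever formal discovery routines the platform actually calls.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Suggestion:
    kind: str                          # "method" or "constraint"
    payload: str                       # e.g. an algorithm name or a constraint description
    rationale: str                     # agent's explanation, shown to the user
    approved_by: Optional[str] = None  # set only when a human confirms


@dataclass
class ApprovalGate:
    # Every agent suggestion is logged here, approved or not.
    suggestion_log: List[Suggestion] = field(default_factory=list)

    def propose(self, kind: str, payload: str, rationale: str) -> Suggestion:
        """Record a non-binding agent suggestion; it has no effect until approved."""
        suggestion = Suggestion(kind, payload, rationale)
        self.suggestion_log.append(suggestion)
        return suggestion

    def approve(self, suggestion: Suggestion, reviewer: str) -> None:
        """Mark a suggestion as confirmed by a named user or domain expert."""
        if not reviewer:
            raise ValueError("approval requires a named reviewer")
        suggestion.approved_by = reviewer

    def approved_inputs(self, kind: str) -> List[str]:
        return [s.payload for s in self.suggestion_log
                if s.kind == kind and s.approved_by]


def run_discovery(data: Any, gate: ApprovalGate,
                  algorithms: Dict[str, Callable[[Any, List[str]], Any]]) -> Any:
    """Run the formal step using approved inputs only; agents are not consulted here."""
    methods = gate.approved_inputs("method")
    if len(methods) != 1:
        raise RuntimeError("exactly one approved method is required")
    return algorithms[methods[0]](data, gate.approved_inputs("constraint"))
```

In this design an unapproved suggestion stays in the log for auditing but never reaches `run_discovery`, so a prompt artifact can influence what the user is shown but not what the algorithm receives. Whether the user's approval is itself swayed by the agent's rationale is a separate question that this gate does not address.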
Circularity Check
No circularity: methodological stance without derivational reduction
The paper advances a normative principle for agent roles in causal discovery and describes its instantiation in the causal-learn+ platform. No equations, fitted parameters, predictions, or formal derivations appear in the provided text. The central claim is an argument for workflow separation grounded in data and explicit algorithms rather than any self-referential reduction or self-citation chain. No load-bearing step reduces by construction to its own inputs, satisfying the criteria for a score of 0.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: LLM agents can perform inspection, retrieval and explanation tasks without injecting textual associations or hallucinations into the causal evidence pipeline