Consistency evaluation of benchmarks used for causal discovery

Chen Wang; Chihui Chen; Lina Yao; Yuzhe Zhang

arxiv: 2606.01789 · v1 · pith:UARD3O5Lnew · submitted 2026-06-01 · 💻 cs.AI

Pith reviewed 2026-06-28 14:23 UTC · model grok-4.3

keywords causal discovery · benchmark evaluation · large language models · consistency checking · graphical causal models · domain knowledge

The pith

Eleven popular causal discovery benchmarks show large differences in consistency with domain literature.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces an automated pipeline to assess how well benchmark causal graphs align with recent domain research. The pipeline retrieves thousands of papers from scientific databases and uses large language models to judge consistency with the graphs. It applies this to 11 real-world benchmarks, processing over 38,000 papers in total. The results reveal significant variation in how consistent these benchmarks are with current knowledge. This matters because misaligned benchmarks can mislead evaluations of causal discovery methods, especially those based on LLMs.

Core claim

The paper establishes that popular benchmarks for causal discovery vary significantly in their consistency with domain research papers, as evaluated by an LLM-based pipeline that checks 38,081 papers across 11 benchmarks.

What carries the argument

An automated pipeline that retrieves relevant research papers and prompts LLMs to evaluate consistency between benchmark causal graphs and the papers.
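As a rough illustration of the final aggregation step only, the sketch below turns a list of per-paper judgments into a per-benchmark inconsistency rate. The `Judgment` record, its three labels, the choice to divide by relevant papers, and the sample data are all assumptions made for illustration; the paper's actual prompts, label set, and retrieval code are not shown on this page.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgment:
    """One LLM verdict for one (benchmark, paper) pair. Hypothetical schema."""
    benchmark: str
    paper_id: str
    label: str  # assumed labels: "consistent", "inconsistent", "irrelevant"


def inconsistency_rates(judgments):
    """Return {benchmark: inconsistent / relevant}, skipping irrelevant papers."""
    relevant = Counter()
    inconsistent = Counter()
    for j in judgments:
        if j.label == "irrelevant":
            continue  # papers with no usable information do not count
        relevant[j.benchmark] += 1
        if j.label == "inconsistent":
            inconsistent[j.benchmark] += 1
    return {b: inconsistent[b] / n for b, n in relevant.items()}


# Made-up judgments, not data from the paper.
sample = [
    Judgment("asia", "p1", "inconsistent"),
    Judgment("asia", "p2", "consistent"),
    Judgment("asia", "p3", "irrelevant"),
    Judgment("sachs", "p4", "consistent"),
]
rates = inconsistency_rates(sample)
print(rates)
```

Benchmarks whose papers are all judged irrelevant are left out of the result rather than assigned a rate of zero, which avoids reporting a misleadingly clean score.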

If this is right

Evaluations of causal discovery methods may be affected differently depending on the benchmark chosen.
LLM-based causal discovery methods are particularly sensitive to benchmark misalignment caused by new discoveries in the literature.
Future benchmark creation should incorporate consistency checks with domain literature.
Researchers should select benchmarks with higher consistency for more reliable evaluations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Human verification of the LLM judgments could refine the consistency scores.
This approach could be extended to other AI benchmarks beyond causal discovery.
Periodic re-evaluation of benchmarks as new research emerges would help maintain their validity.

Load-bearing premise

The LLM prompts produce judgments of consistency that accurately reflect the true alignment between graphs and papers without systematic bias.

What would settle it

If independent human experts review a sample of the paper-graph pairs and find substantially different consistency rates than the LLM pipeline, the reported variation would not hold.
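One way such a comparison could be summarized is with Cohen's kappa, which corrects raw agreement for agreement expected by chance. The sketch below is a minimal, dependency-free version; the label lists are invented, and nothing here reflects a validation the paper has actually run.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected if both raters labelled independently
    # with their own marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    categories = count_a.keys() | count_b.keys()
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical verdicts on four paper-graph pairs.
llm = ["consistent", "consistent", "inconsistent", "inconsistent"]
human = ["consistent", "inconsistent", "inconsistent", "inconsistent"]
kappa = cohens_kappa(llm, human)
print(round(kappa, 3))
```

A kappa near 1 would support the LLM judgments, while a value near 0 would mean the agreement is no better than chance.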

Figures

Figures reproduced from arXiv: 2606.01789 by Chen Wang, Chihui Chen, Lina Yao, Yuzhe Zhang.

**Figure 1.** All benchmarks by year period. Panel title: Benchmark Counts by Year Period (paper # > 2). x-axis: Year Period; y-axis: Number of Benchmarks. Extracted bar values: before 2000: 8; 2000-2004: 2; 2005-2009: 3; 2010-2014: 1; 2015-2019: 4; 2020-2024: 3; after 2024: 0.

**Figure 2.** Benchmarks used in at least 3 papers by year.

**Figure 4.** Number of papers/year by searching “causal

**Figure 5.** Number of papers/year by searching “causal

**Figure 6.** The causal graph of the Asia benchmark dataset: a semi-synthetic benchmark dataset. Accompanying text: for Vi, Vj and Z ⊆ V \ {Vi, Vj}, if Z d-separates Vi and Vj, the pair of variables are mutually independent conditioned on Z; otherwise they are correlated. Note that Z can be empty.

**Figure 7.** Benchmarks’ inconsistency rates. Panel title: Benchmark Inconsistency Rate Since Benchmark Cutoff Year. x-axis: Benchmark; y-axis: Inconsistency Rate (%). Extracted bar values, in axis order: sachs 26.3%, fmri 28.6%, insurance 30.7%, ecoli 32.8%, cancer 37.9%, alzheimer 39.2%, diabetes 45.7%, alarm 47.2%, child 53.9%, asia 57.4%.

**Figure 8.** Benchmarks’ inconsistency rates since the

**Figure 9.** Benchmarks’ relevant paper number per year period.

**Figure 10.** Benchmarks’ inconsistent paper number per year period.

**Figure 11.** Benchmark numbers used in LLM-based papers.

**Figure 12.** Top 20 benchmarks used in arXiv LLM-based causal discovery papers.

**Figure 13.** Inconsistent paper number per year of Sachs.

**Figure 14.** Inconsistent paper number per year of Insurance.

**Figure 15.** Inconsistent paper number per year of Ecoli.

**Figure 16.** Inconsistent paper number per year of Alzheimer.

**Figure 17.** Inconsistent paper number per year of Alarm.

**Figure 18.** Inconsistent paper number per year of Cancer.

**Figure 19.** Inconsistent paper number per year of Arctic Sea Ice.

**Figure 20.** Inconsistent paper number per year of Fmri.

**Figure 21.** Inconsistent paper number per year of Diabetes.

**Figure 22.** Inconsistent paper number per year of Child.

**Figure 23.** Inconsistent paper number per year of Asia.

**Figure 24.** Retrieved papers by year.

**Figure 25.** Retrieved papers that contain relevant information by year.
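The d-separation criterion quoted alongside Figure 6 can be checked mechanically. The sketch below uses the standard moralized ancestral graph test: keep only the ancestors of the queried nodes, connect co-parents, drop edge directions, delete Z, and see whether Vi can still reach Vj. The edge list is an assumption: it is the commonly distributed eight-node version of the Asia network, written from memory rather than extracted from the paper's figure, and the function names are illustrative.

```python
from itertools import combinations

# Assumed structure of the Asia network, stored as child -> parents.
ASIA = {
    "asia": [], "smoke": [],
    "tub": ["asia"], "lung": ["smoke"], "bronc": ["smoke"],
    "either": ["tub", "lung"],
    "xray": ["either"], "dysp": ["either", "bronc"],
}


def ancestors_of(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents[v])
    return seen


def d_separated(parents, x, y, z=()):
    """True if the set z d-separates x and y (x and y must not be in z)."""
    z = set(z)
    keep = ancestors_of(parents, {x, y} | z)
    # Moralize the ancestral subgraph: undirected parent-child edges
    # plus an edge between every pair of parents sharing a child.
    adj = {v: set() for v in keep}
    for child in keep:
        for p in parents[child]:
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(parents[child], 2):
            adj[p].add(q)
            adj[q].add(p)
    # Search from x, never stepping onto a node in z.
    seen, stack = {x}, [x]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w == y:
                return False
            if w not in seen and w not in z:
                seen.add(w)
                stack.append(w)
    return True


print(d_separated(ASIA, "tub", "lung"))              # empty Z
print(d_separated(ASIA, "tub", "lung", {"either"}))  # conditioning on a collider
```

With the assumed edges, `tub` and `lung` are d-separated by the empty set, and conditioning on their common child `either` connects them, which illustrates why Z being empty is a meaningful case.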

Original abstract

In graphical causal models, causal discovery aims to construct a causal graph from numerical data and plain-text domain knowledge. However, evaluating causal discovery methods remains a challenge because progress in domain research often leaves benchmark causal graphs containing misaligned knowledge. This problem especially affects the evaluation of large language model (LLM) based causal discovery methods, as they are sensitive to new discoveries in the literature. This work is the first to systematically study the quality of benchmark causal graphs. Specifically, we design a pipeline that automatically retrieves relevant research papers from scientific databases and prompts LLMs to check the consistency between the benchmark causal graphs and domain research papers. We evaluate 11 popular real-world benchmarks, for which our pipeline processes 38,081 domain papers in total. Our results show that popular benchmarks vary significantly in their consistency with domain research, with clear implications for causal discovery research.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper scales up an LLM pipeline to flag inconsistencies across 11 causal benchmarks against 38k papers, but the LLM judgments have no reported validation.

The paper's main move is to automate a literature sweep that pulls 38,081 papers and asks LLMs to score how well each of 11 real-world causal discovery benchmarks matches the domain literature. They report clear differences in consistency across the benchmarks and argue this matters especially for LLM-based discovery methods.

The scale is the part that works. Manually checking thousands of papers against benchmark graphs is not feasible, so the automated retrieval-plus-LLM pipeline is a practical way to surface the problem that benchmarks can drift from current knowledge. That is a real issue in the subfield and the paper is the first to attempt it at this volume.

The soft spot is exactly where the stress-test note points: every quantitative claim rests on the LLM consistency judgments, yet the manuscript supplies no human-expert agreement numbers, no calibration set, and no checks on prompt sensitivity or inter-LLM stability. Without those, it is impossible to tell whether the reported variation reflects genuine misalignment or artifacts in how the model reads the papers. That assumption is load-bearing.

The work is aimed at people who select or maintain causal discovery benchmarks and at anyone running evaluations on the standard datasets. A reader who wants to know which benchmarks are most likely to be outdated will get something useful from the scale alone. It deserves peer review because the underlying problem is worth addressing and the method is reproducible in principle, but any referee will need to see validation experiments before the specific inconsistency numbers can be used with confidence.

Referee Report

2 major / 0 minor

Summary. The paper introduces an automated pipeline that retrieves relevant domain papers (38,081 in total) from scientific databases and prompts LLMs to judge consistency between the causal graphs in 11 popular real-world benchmarks and the retrieved literature. It reports that the benchmarks vary significantly in their consistency with domain research and draws implications for causal discovery evaluation, especially for LLM-based methods sensitive to new findings.

Significance. If the LLM consistency judgments prove reliable, the work identifies a previously unquantified source of misalignment in standard benchmarks and supplies a scalable method for ongoing evaluation. The scale of the literature search is a clear strength. The result would directly affect how benchmark graphs are trusted when assessing causal discovery algorithms.

major comments (2)

[Abstract (pipeline description)] The central quantitative results rest entirely on LLM judgments of consistency, yet the manuscript provides no validation of those judgments (human-expert agreement rates, calibration set, inter-LLM consistency, or handling of ambiguous papers). This assumption is load-bearing because every reported variation across the 11 benchmarks is derived from the LLM outputs.
[Abstract (pipeline description)] No information is given on prompt engineering, sensitivity analysis, or inter-rater metrics for the LLM consistency checks. Without these, it is impossible to determine whether the observed differences among benchmarks reflect genuine literature misalignment or artifacts of the judge.
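One piece of the question raised in these comments, namely whether two benchmarks' rates differ by more than sampling noise, can be examined with a confidence interval on each rate. The sketch below computes a Wilson score interval for a proportion. The counts are hypothetical, and an interval of this kind says nothing about systematic judge bias, which is the referee's main concern.

```python
import math


def wilson_interval(k, n, z=1.96):
    """Wilson score interval for k successes out of n trials (z=1.96 is about 95%)."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    p = k / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical counts: 50 inconsistent papers out of 100 relevant ones.
low, high = wilson_interval(50, 100)
print(f"{low:.3f} to {high:.3f}")
```

If the intervals of two benchmarks overlap heavily, their ranking should be read with caution even before judge reliability is considered.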

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments highlighting the need for validation and methodological transparency in our LLM-based pipeline. We agree these are important and will incorporate the requested details and analyses in a major revision.

Referee: The central quantitative results rest entirely on LLM judgments of consistency, yet the manuscript provides no validation of those judgments (human-expert agreement rates, calibration set, inter-LLM consistency, or handling of ambiguous papers). This assumption is load-bearing because every reported variation across the 11 benchmarks is derived from the LLM outputs.

Authors: We acknowledge the manuscript currently lacks explicit validation of the LLM consistency judgments. In the revision we will add a dedicated validation subsection that reports: (i) human-expert agreement rates on a stratified sample of 500 paper-graph pairs, (ii) a calibration set of 100 expert-annotated examples, (iii) inter-LLM consistency across GPT-4, Claude-3, and Llama-3, and (iv) explicit handling rules for ambiguous papers (e.g., “insufficient information” category with frequency statistics). These additions will directly quantify the reliability of the reported benchmark differences. revision: yes

Referee: No information is given on prompt engineering, sensitivity analysis, or inter-rater metrics for the LLM consistency checks. Without these, it is impossible to determine whether the observed differences among benchmarks reflect genuine literature misalignment or artifacts of the judge.

Authors: We agree that prompt details and robustness checks are missing. The revised manuscript will include: the complete system and user prompts in an appendix, a description of the iterative prompt-engineering process (including few-shot examples and chain-of-thought instructions), sensitivity results across three prompt variants and two temperature settings, and inter-rater metrics (both LLM-LLM and LLM-human) already referenced in the new validation section. These additions will allow readers to assess whether the observed benchmark variations are robust to judge artifacts. revision: yes
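The stratified sample mentioned in the first response could be drawn along the following lines. This is a sketch under assumptions: strata are taken to be benchmarks, allocation is proportional with largest-remainder rounding, and the function name and input format are hypothetical. The simulated rebuttal does not say how the strata would be defined.

```python
import random
from collections import defaultdict


def stratified_sample(pairs, total, seed=0):
    """Draw `total` paper ids from (benchmark, paper_id) pairs, proportionally per benchmark."""
    by_benchmark = defaultdict(list)
    for benchmark, paper_id in pairs:
        by_benchmark[benchmark].append(paper_id)
    n = len(pairs)
    if not 0 <= total <= n:
        raise ValueError("total must be between 0 and the number of pairs")
    # Ideal (fractional) share of each benchmark, rounded down first.
    quotas = {b: total * len(ids) / n for b, ids in by_benchmark.items()}
    alloc = {b: int(q) for b, q in quotas.items()}
    # Hand the remaining slots to the largest fractional remainders.
    leftover = total - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda b: quotas[b] - alloc[b], reverse=True)
    for b in by_remainder[:leftover]:
        alloc[b] += 1
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    return {b: rng.sample(by_benchmark[b], alloc[b]) for b in sorted(by_benchmark)}


# Made-up population: 60 Asia pairs and 40 Sachs pairs.
population = [("asia", f"a{i}") for i in range(60)] + [("sachs", f"s{i}") for i in range(40)]
picked = stratified_sample(population, total=10)
print({b: len(ids) for b, ids in picked.items()})
```

Proportional allocation keeps small benchmarks from being overrepresented. An equal-size allocation per benchmark would be a reasonable alternative if per-benchmark agreement rates are the target.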

Circularity Check

0 steps flagged

No significant circularity; empirical pipeline is self-contained

The paper describes an empirical pipeline that retrieves domain papers from databases and applies LLM prompts to judge consistency with benchmark causal graphs. No equations, fitted parameters, predictions derived from fits, or self-referential definitions appear in the derivation. The central claim (variation in benchmark consistency) is produced by processing external literature rather than by construction from the paper's own inputs or prior self-citations. Self-citation load-bearing, ansatz smuggling, and renaming patterns are absent. The absence of validation metrics for LLM judgments is a potential reliability concern but does not create circularity under the defined criteria.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim depends on the unverified reliability of LLM consistency judgments and on the assumption that the retrieved papers form a representative sample of domain knowledge.

axioms (2)

domain assumption: LLMs prompted with benchmark graphs and paper abstracts can produce accurate and unbiased consistency assessments.
The pipeline is built around this capability; no human validation step is mentioned in the abstract.
domain assumption: The 38,081 retrieved papers constitute a sufficient and unbiased sample of the relevant domain literature for each benchmark.
The scale is reported but selection criteria and coverage are not detailed in the abstract.

pith-pipeline@v0.9.1-grok · 5681 in / 1265 out tokens · 19728 ms · 2026-06-28T14:23:34.617490+00:00 · methodology

Reference graph

Works this paper leans on

96 extracted references · 9 canonical work pages · 1 internal anchor

[1]

Journal of the Royal Statistical Society: Series B (Methodological), 50(2):157–194

Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society: Series B (Methodological), 50(2):157–194. Ahmed Abdulaal, Adamos Hadjivasiliou, Nina Montana-Brown, Tiantian He, Ayodeji Ijishakin, Ivana Drobnjak, Daniel Castro, and Daniel Alexander
[2]

In International Conference on Learning Representations, volume 2024, pages 57559–57610

Causal modelling agents: Causal graph discovery through synergising metadata- and data-driven reasoning. In International Conference on Learning Representations, volume 2024, pages 57559–57610. Bruce Abramson, John Brown, Ward Edwards, Allan Murphy, and Robert L Winkler. 1996. Hailfinder: A bayesian system for forecasting severe weather. International Jou...

2024
[3]

Virginia Aglietti, Alan Malek, Ira Ktena, and Silvia Chiappa

Collaborative causal discovery with atomic interventions. Advances in Neural Information Processing Systems, 34:12761–12773. Virginia Aglietti, Alan Malek, Ira Ktena, and Silvia Chiappa. 2023. Constrained causal bayesian optimization. In International Conference on Machine Learning, pages 304–321. PMLR. Sina Akbari, Ehsan Mokhtarian, AmirEmad Ghassami, ...

work page arXiv 2023
[4]

Springer

Proceedings, pages 247–256. Springer. Alexis Bellot, Junzhe Zhang, and Elias Bareinboim
[5]

In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 11043–11051

Scores for learning discrete causal graphs with unobserved confounders. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 11043–11051. John Binder, Daphne Koller, Stuart Russell, and Keiji Kanazawa. 1997. Adaptive probabilistic networks with hidden variables. Machine Learning, 29(2):213–244. Philippe Brouillard, Chandler ...

work page arXiv 1997
[6]

Haoyue Dai, Yiwen Qiu, Ignavier Ng, Xinshuai Dong, Peter Spirtes, and Kun Zhang

Bcd nets: Scalable variational approaches for bayesian causal discovery. Advances in Neural Information Processing Systems, 34:7095–7110. Haoyue Dai, Yiwen Qiu, Ignavier Ng, Xinshuai Dong, Peter Spirtes, and Kun Zhang. 2025. Latent variable causal discovery under selection bias. arXiv preprint arXiv:2512.11219. Haoyue Dai, Peter Spirtes, and Kun Zhang. 2022...

work page arXiv 2025
[7]

Anish Dhir, Ruby Sedgwick, Avinash Kori, Ben Glocker, and Mark Van Der Wilk

Bivariate causal discovery using bayesian model selection. arXiv preprint arXiv:2306.02931. Anish Dhir, Ruby Sedgwick, Avinash Kori, Ben Glocker, and Mark Van Der Wilk. 2024. Continuous bayesian model selection for multivariate causal discovery. arXiv preprint arXiv:2411.10154. Shuyu Dong, Michele Sebag, Kento Uemura, Akito Fujii, Shuang Chang, Yusuke Koy...

work page arXiv 2024
[8]

Hwang, Y

Multimodal pooled perturb-cite-seq screens in patient models define mechanisms of cancer immune evasion. Nature genetics, 53(3):332–341. Jensen FV. 1996. An introduction to Bayesian networks. London, UK: UCL Press. Amanda Gentzel, Dan Garant, and David Jensen. 2019. The case for evaluating causal models using interventional measures and empirical data. Ad...

work page arXiv 1996
[9]

Efficient Causal Graph Discovery Using Large Language Models

An expert system for control of waste water treatment—a pilot project. Technical report, Techni- cal report, Judex Datasystemer A/S, Aalborg, 1989. In Danish. Thomas Jiralerspong, Xiaoyin Chen, Yash More, Vedant Shah, and Yoshua Bengio. 2024. Efficient causal graph discovery using large language models. Preprint, arXiv:2402.01207. Diviyan Kalainathan and ...

work page internal anchor Pith review Pith/arXiv arXiv 1989
[10]

Peter JF Lucas, Linda C Van der Gaag, and Ameen Abu-Hanna

Can large language models build causal graphs? In NeurIPS 2022 Workshop on Causality for Real-world Impact. Peter JF Lucas, Linda C Van der Gaag, and Ameen Abu-Hanna. 2004. Bayesian networks in biomedicine and health-care. Alessandro Magrini, Stefano Di Blasi, Federico Mattia Stefanini, and 1 others. 2017. A conditional linear gaussian network to assess t...

2022
[11]

In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 8975–8982

Discovering fully oriented causal networks. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 8975–8982. Ehsan Mokhtarian, Mohmmadsadegh Khorasani, Jalal Etesami, and Negar Kiyavash. 2023. Novel ordering-based approaches for causal structure learning in the presence of unobserved variables. In Proceedings of the AAAI Confer...

work page doi:10.24432/c5fs30 2023
[12]

Tim Van den Bulcke, Koenraad Van Leemput, Bart Naudts, Piet van Remortel, Hongwu Ma, Alain Verschoren, Bart De Moor, and Kathleen Marchal

International Conference on Learning Representations, ICLR. Tim Van den Bulcke, Koenraad Van Leemput, Bart Naudts, Piet van Remortel, Hongwu Ma, Alain Verschoren, Bart De Moor, and Kathleen Marchal. 2006. Syntren: a generator of synthetic gene expression data for design and analysis of structure learning algorithms. BMC bioinformatics, 7(1):43. Anike...

2006
[13]

In International Conference on Machine Learning, pages 50650–50668

Optimal kernel choice for score function-based causal discovery. In International Conference on Machine Learning, pages 50650–50668. PMLR. X Wang, Y Du, S Zhu, L Ke, Z Chen, J Hao, and J Wang
[14]

In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21), pages 3566–3573

Ordering-based causal discovery with reinforcement learning. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21), pages 3566–3573. IJCAI International Joint Conferences on Artificial Intelligence Organization. Yunxia Wang, Fuyuan Cao, Kui Yu, and Jiye Liang
[15]

In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 8584–8593

Efficient causal structure learning from multiple interventional datasets with unknown targets. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 8584–8593. Yunxia Wang, CAO Fuyuan, Kui Yu, and Jiye Liang. 2025a. Federated causal structure learning with non-identical variable sets. In Forty-second International Conference...
[16]

In Proceedings of the AAAI Conference on Artificial Intelligence, volume 40, pages 36757–36765

Robust causal discovery under imperfect structural constraints. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 40, pages 36757–36765. Zidong Wang, Fei Liu, Qi Feng, Qingfu Zhang, and Xiaoguang Gao. 2025b. Llm-enhanced score function evolution for causal structure learning. In Proceedings of the Thirty-Fourth International J...

work page arXiv 1994
[17]

original paper

Causal-driven skill prerequisite structure discovery. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 20604–20612. Yan Zeng, Shohei Shimizu, Ruichu Cai, Feng Xie, Michio Yamamoto, and Zhifeng Hao. 2021. Causal discovery with multi-domain lingam for latent factors. In Causal Analysis Workshop Series, pages 1–4. PMLR....

work page arXiv 2021
[18]

Sachs Original paper (Sachs et al., 2005) Papers using it: (Eulig et al., 2025; Shen et al., 2025; Roy et al., 2025; Shahverdikondori et al., 2024; Kang et al., 2026; Aglietti et al., 2023; Olko et al., 2023; Annadani et al., 2023; Perry et al., 2022; Dai et al., 2022; Addanki and Kasiviswanathan, 2021; Cundy et al., 2021; Wang et al., 2025a; Li et al., 2...

2005
[19]

Child Original paper (Spiegelhalter, 1992) Papers using it: (Shen et al., 2025; Peyrard and West, 2020; Olko et al., 2023; Wang et al., 2025a, 2024; Vashishtha et al., 2025; Duong et al., 2025; Ke et al., 2022; Lippe et al., 2021; Guo et al., 2024a; Zhang et al., 2023b, 2022; Wang et al., 2022; Ling et al., 2025b; Guo et al., 2024b; Cui et al., 2022)

1992
[20]

Alarm Original paper (Beinlich et al., 1989) Papers using it: (Akbari et al., 2021; Roy et al., 2025; Xie et al., 2024; Li et al., 2022; Olko et al., 2023; Wang et al., 2025a; Duong et al., 2025; Lippe et al., 2021; Guo et al., 2024a; Zhang et al., 2023b, 2022; Wang et al., 2022; Zhang et al., 2021; Ling et al., 2025b; Guo et al., 2024b; Cui et al., 2022)

1989
[21]

Asia Original paper (lau, 1988) Papers using it: (Shen et al., 2025; Roy et al., 2025; Shahverdikondori et al., 2024; Olko et al., 2023; Kocaoglu, 2023; Addanki and Kasiviswanathan, 2021; Vashishtha et al., 2025; Duong et al., 2025; Ke et al., 2022; Lippe et al., 2021; Bellot et al., 2024; Zhang et al., 2023b, 2022)

1988
[22]

Insurance Original paper (Binder et al., 1997) Papers using it: (Mokhtarian et al., 2023; Akbari et al., 2021; Feng et al., 2025a,b; Peyrard and West, 2020; Wang et al., 2025a; Guo et al., 2024a; Zhang et al., 2022; Wang et al., 2022; Zhang et al., 2021; Guo et al., 2024b; Cui et al., 2022)

1997
[23]

Cancer Original paper (Korb and Nicholson, 2010) Papers using it: (Feng et al., 2025a,b; Shahverdikondori et al., 2024; Peyrard and West, 2020; Olko et al., 2023; Vashishtha et al., 2025; Duong et al., 2025; Lippe et al., 2021; Zhang et al., 2023b, 2022)

2010
[24]

Barley Original paper (Kristensen and Rasmussen, 2002) Papers using it: (Akbari et al., 2021; Wang et al., 2025a; Ling et al., 2025c; Zhang et al., 2022, 2021; Ling et al., 2025b)

2002
[25]

Hailfinder Original paper (Abramson et al., 1996) Papers using it: (Mokhtarian et al., 2023; Akbari et al., 2021; Li et al., 2022; Zhang et al., 2021; Ling et al., 2025b; Cui et al., 2022)

1996
[26]

Fmri hippocampus Original paper (Poldrack et al., 2015) Papers using it: (Li et al., 2024b; Chen et al., 2024; Zeng et al., 2021)

2015
[27]

Alzheimer Original paper (Petersen et al., 2010; Shen et al., 2020) Papers using it: (Abdulaal et al., 2024; Feng et al., 2025a; Vashishtha et al., 2025)

2010
[28]

Arctic sea ice Original paper (Huang et al., 2021) Papers using it: (Abdulaal et al., 2024; Feng et al., 2025b; Kiciman et al., 2023)

2021
[29]

Diabetes Original paper (Long et al., 2022) Papers using it: (Feng et al., 2025a,b; Lippe et al., 2021)

2022
[30]

Ecoli70(100) Original paper (Schäfer and Strimmer, 2005) Papers using it: (Mokhtarian et al., 2023; Akbari et al., 2021; Kang et al., 2026; Chen et al., 2021)

2005
[31]

Gene expression Original paper (Sethuraman et al., 2023) Papers using it: (Guruswamy Sethuraman and Fekri, 2026; Xie et al., 2024; Li et al., 2025b; Guo et al., 2024a; Ling et al., 2025b)

2023
[32]

Hepar2 Original paper (Onisko, 2003) Papers using it: (Mokhtarian et al., 2023; Roy et al., 2025; Li et al., 2022)

2003
[33]

Pigs Original paper (FV, 1996) Papers using it: (Lippe et al., 2021; Guo et al., 2024a; Ling et al., 2025b)

1996
[34]

Reged Original paper (Statnikov et al., 2015) Papers using it: (Mian et al., 2021; Kaltenpoth and Vreeken, 2023; Guo et al., 2024b)

2015
[35]

Andes Original paper (Conati et al., 1997) Papers using it: (Xie et al., 2024; Li et al., 2022)

1997
[36]

Arth150 Original paper (Opgen-Rhein and Strimmer, 2007) Papers using it: (Mokhtarian et al., 2023; Akbari et al., 2021; Kang et al., 2026)

2007
[37]

Auto mpg Original paper (Quinlan, 1993) Papers using it: (Eulig et al., 2025; Shen et al., 2025)

1993
[38]

Carpo Original paper Link Papers using it: (Mokhtarian et al., 2023; Akbari et al., 2021)

2023
[39]

HK stock Original paper (Huang et al., 2020) Papers using it: (Li et al., 2024b; Cai et al., 2023)

2020
[40]

Link Original paper (Jensen and Kong, 1999) Papers using it: (Wang et al., 2022; Ling et al., 2025b)

1999
[41]

Mildew Original paper (Jensen and Jensen, 1996) Papers using it: (Xie et al., 2024; Li et al., 2022)

1996
[42]

Neuropain Original paper (Tu et al., 2019) Papers using it: (Liu et al., 2024a; Feng et al., 2025a; Kiciman et al., 2023; Vashishtha et al., 2025)

2019
[43]

Obesity Original paper (Long et al., 2022) Papers using it: (Feng et al., 2025a,b)

2022
[44]

Syntren Original paper (Van den Bulcke et al., 2006) Papers using it: (Dhir et al., 2024; Ling et al., 2025b)

2006
[45]

WIN95PTS Original paper (Heckerman et al., 1995) Papers using it: (Xie et al., 2024; Wang et al., 2022; Zhang et al., 2021)

1995
[46]

CE-Tueb Original paper (Mooij et al., 2016) Papers using it: (Dhir et al., 2023)

2016
[47]

CE-cha Original paper (Guyon et al., 2019) Papers using it: (Dhir et al., 2023)

2019
[48]

Cognition and Aging in the Chronic Fatigue Syndrome Original paper (Heins et al., 2013) Papers using it: (Qiao et al., 2024a)

2013
[49]

DWDClimate Original paper (Mooij et al., 2016) Papers using it: (Shen et al., 2025)

2016
[50]

MAGIC-IRRI Original paper (Scutari, 2016) Papers using it: (Kang et al., 2026)

2016
[51]

MAGIC-NIAB Original paper (Scutari et al., 2014) Papers using it: (Kang et al., 2026)

2014
[52]

New York Times Original paper Link Papers using it: (Liu et al., 2024a)
[53]

Pittsburgh Bridges Original paper (Reich and Fenves, 1989) Papers using it: (Ni, 2022)

1989
[54]

Categorical Cause-Effect Pairs Original paper (Ni, 2022) Papers using it: (Ni, 2022)

2022
[55]

Abalone Original paper (Warwick et al., 1994) Papers using it: (Ni, 2022)

1994
[56]

Alcohol Original paper (Long et al., 2022) Papers using it: (Feng et al., 2025b)

2022
[57]

Algibra I Original paper Link Papers using it: (Yu et al., 2024)

2024
[58]

APM Original paper Link Papers using it: (Eulig et al., 2025)

2025
[59]

Apple gastronome Original paper (Liu et al., 2024a) Papers using it: (Feng et al., 2025a)
[60]

Big five Original paper Link Papers using it: (Dai et al., 2025; Dong et al., 2024)

2025
[61]

Brain tumor Original paper Link Papers using it: (Liu et al., 2024a)
[62]

Chemical Original paper (Ke et al., 2021) Papers using it: (Zhao et al., 2025)

2021
[63]

(no?)Chemistry image Original paper (Schäfer and Strimmer, 2005) Papers using it: tba

2005
[64]

climatic analysis Original paper (Compo et al., 2011) Papers using it: (Liu et al., 2024a)

2011
[65]

COVID-19 Original paper Link Papers using it: (Wang et al., 2025b; Vashishtha et al., 2025)

2025
[66]

Credit Original paper (Quinlan, 1987) Papers using it: (Li et al., 2022)

1987
[67]

dream Original paper (Kalainathan and Goudet, 2019) Papers using it: (Roy et al., 2025)

2019
[68]

football Original paper Link Papers using it: (Qiao et al., 2024b)
[69]

G7 Original paper (Demirer et al., 2018) Papers using it: (Jalaldoust et al., 2022)

2018
[70]

General social survey Original paper Link Papers using it: (Li et al., 2025b)
[71]

IHDP Original paper (Hill, 2011) Papers using it: (Ashman et al., 2023)

2011
[72]

Lucas Original paper (Lucas et al., 2004) Papers using it: (Roy et al., 2025)

2004
[73]

Magnetic Original paper (Hwang et al., 2024) Papers using it: (Zhao et al., 2025)

2024
[74]

(tba)Micro24 Original paper (Schäfer and Strimmer, 2005) Papers using it: tba

2005
[75]

(tba)Micro25 Original paper (Schäfer and Strimmer, 2005) Papers using it: tba

2005
[76]

Munin Original paper (Andreassen et al., 1989) Papers using it: (Dong et al., 2025; Wang et al., 2022)

1989
[77]

Pacific walker circulation Original paper (Runge et al., 2019) Papers using it: (Liu and Kuang, 2023)

2019
[78]

Pathfinder Original paper (Heckerman et al., 1992) Papers using it: (Zhang et al., 2021)

1992
[79]

Pharmacokinetics Original paper (Grzegorzewski et al., 2021) Papers using it: (Li et al., 2024a)

2021
[80]

Physics Original paper (Lee et al., 2024) Papers using it: (Kang et al., 2026)

2024

Showing first 80 references.

[1] [1]

Journal of the Royal Statistical Society: Series B (Methodological), 50(2):157–194

Local computations with probabilities on graphi- cal structures and their application to expert systems. Journal of the Royal Statistical Society: Series B (Methodological), 50(2):157–194. Ahmed Abdulaal, Adamos Hadjivasiliou, Nina Montana-Brown, Tiantian He, Ayodeji Ijishakin, Ivana Drobnjak, Daniel Castro, and Daniel Alexander

[2] [2]

InInternational Conference on Learning Representations, volume 2024, pages 57559–57610

Causal modelling agents: Causal graph dis- covery through synergising metadata-and data-driven reasoning. InInternational Conference on Learning Representations, volume 2024, pages 57559–57610. Bruce Abramson, John Brown, Ward Edwards, Allan Murphy, and Robert L Winkler. 1996. Hailfinder: A bayesian system for forecasting severe weather. International Jou...

2024

[3]

Virginia Aglietti, Alan Malek, Ira Ktena, and Silvia Chiappa

Collaborative causal discovery with atomic interventions. Advances in Neural Information Processing Systems, 34:12761–12773. Virginia Aglietti, Alan Malek, Ira Ktena, and Silvia Chiappa. 2023. Constrained causal Bayesian optimization. In International Conference on Machine Learning, pages 304–321. PMLR. Sina Akbari, Ehsan Mokhtarian, AmirEmad Ghassami, ...

2023

[4]

Springer

Proceedings, pages 247–256. Springer. Alexis Bellot, Junzhe Zhang, and Elias Bareinboim

[5]

In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 11043–11051

Scores for learning discrete causal graphs with unobserved confounders. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 11043–11051. John Binder, Daphne Koller, Stuart Russell, and Keiji Kanazawa. 1997. Adaptive probabilistic networks with hidden variables. Machine Learning, 29(2):213–244. Philippe Brouillard, Chandler ...

1997

[6]

Haoyue Dai, Yiwen Qiu, Ignavier Ng, Xinshuai Dong, Peter Spirtes, and Kun Zhang

BCD Nets: Scalable variational approaches for Bayesian causal discovery. Advances in Neural Information Processing Systems, 34:7095–7110. Haoyue Dai, Yiwen Qiu, Ignavier Ng, Xinshuai Dong, Peter Spirtes, and Kun Zhang. 2025. Latent variable causal discovery under selection bias. arXiv preprint arXiv:2512.11219. Haoyue Dai, Peter Spirtes, and Kun Zhang. 2022...

2025

[7]

Anish Dhir, Ruby Sedgwick, Avinash Kori, Ben Glocker, and Mark Van Der Wilk

Bivariate causal discovery using Bayesian model selection. arXiv preprint arXiv:2306.02931. Anish Dhir, Ruby Sedgwick, Avinash Kori, Ben Glocker, and Mark Van Der Wilk. 2024. Continuous Bayesian model selection for multivariate causal discovery. arXiv preprint arXiv:2411.10154. Shuyu Dong, Michele Sebag, Kento Uemura, Akito Fujii, Shuang Chang, Yusuke Koy...

2024

[8]

Hwang, Y

Multimodal pooled Perturb-CITE-seq screens in patient models define mechanisms of cancer immune evasion. Nature Genetics, 53(3):332–341. Jensen FV. 1996. An introduction to Bayesian networks. London, UK: UCL Press. Amanda Gentzel, Dan Garant, and David Jensen. 2019. The case for evaluating causal models using interventional measures and empirical data. Ad...

1996

[9]

Efficient Causal Graph Discovery Using Large Language Models

An expert system for control of waste water treatment: a pilot project. Technical report, Judex Datasystemer A/S, Aalborg, 1989. In Danish. Thomas Jiralerspong, Xiaoyin Chen, Yash More, Vedant Shah, and Yoshua Bengio. 2024. Efficient causal graph discovery using large language models. Preprint, arXiv:2402.01207. Diviyan Kalainathan and ...

1989

[10]

Peter JF Lucas, Linda C Van der Gaag, and Ameen Abu-Hanna

Can large language models build causal graphs? In NeurIPS 2022 Workshop on Causality for Real-world Impact. Peter JF Lucas, Linda C Van der Gaag, and Ameen Abu-Hanna. 2004. Bayesian networks in biomedicine and health-care. Alessandro Magrini, Stefano Di Blasi, Federico Mattia Stefanini, and 1 others. 2017. A conditional linear Gaussian network to assess t...

2022

[11]

In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 8975–8982

Discovering fully oriented causal networks. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 8975–8982. Ehsan Mokhtarian, Mohmmadsadegh Khorasani, Jalal Etesami, and Negar Kiyavash. 2023. Novel ordering-based approaches for causal structure learning in the presence of unobserved variables. In Proceedings of the AAAI Confer...

doi:10.24432/c5fs30 2023

[12]

Tim Van den Bulcke, Koenraad Van Leemput, Bart Naudts, Piet van Remortel, Hongwu Ma, Alain Verschoren, Bart De Moor, and Kathleen Marchal

International Conference on Learning Representations, ICLR. Tim Van den Bulcke, Koenraad Van Leemput, Bart Naudts, Piet van Remortel, Hongwu Ma, Alain Verschoren, Bart De Moor, and Kathleen Marchal. 2006. Syntren: a generator of synthetic gene expression data for design and analysis of structure learning algorithms. BMC Bioinformatics, 7(1):43. Anike...

2006

[13]

In International Conference on Machine Learning, pages 50650–50668

Optimal kernel choice for score function-based causal discovery. In International Conference on Machine Learning, pages 50650–50668. PMLR. X Wang, Y Du, S Zhu, L Ke, Z Chen, J Hao, and J Wang

[14]

In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21), pages 3566–3573

Ordering-based causal discovery with reinforcement learning. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21), pages 3566–3573. IJCAI International Joint Conferences on Artificial Intelligence Organization. Yunxia Wang, Fuyuan Cao, Kui Yu, and Jiye Liang

[15]

In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 8584–8593

Efficient causal structure learning from multiple interventional datasets with unknown targets. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 8584–8593. Yunxia Wang, Fuyuan Cao, Kui Yu, and Jiye Liang. 2025a. Federated causal structure learning with non-identical variable sets. In Forty-second International Conference...

[16]

In Proceedings of the AAAI Conference on Artificial Intelligence, volume 40, pages 36757–36765

Robust causal discovery under imperfect structural constraints. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 40, pages 36757–36765. Zidong Wang, Fei Liu, Qi Feng, Qingfu Zhang, and Xiaoguang Gao. 2025b. LLM-enhanced score function evolution for causal structure learning. In Proceedings of the Thirty-Fourth International J...

1994

[17]

original paper

Causal-driven skill prerequisite structure discovery. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 20604–20612. Yan Zeng, Shohei Shimizu, Ruichu Cai, Feng Xie, Michio Yamamoto, and Zhifeng Hao. 2021. Causal discovery with multi-domain LiNGAM for latent factors. In Causal Analysis Workshop Series, pages 1–4. PMLR....

2021

[18]

Sachs Original paper (Sachs et al., 2005) Papers using it: (Eulig et al., 2025; Shen et al., 2025; Roy et al., 2025; Shahverdikondori et al., 2024; Kang et al., 2026; Aglietti et al., 2023; Olko et al., 2023; Annadani et al., 2023; Perry et al., 2022; Dai et al., 2022; Addanki and Kasiviswanathan, 2021; Cundy et al., 2021; Wang et al., 2025a; Li et al., 2...

2005

[19]

Child Original paper (Spiegelhalter, 1992) Papers using it: (Shen et al., 2025; Peyrard and West, 2020; Olko et al., 2023; Wang et al., 2025a, 2024; Vashishtha et al., 2025; Duong et al., 2025; Ke et al., 2022; Lippe et al., 2021; Guo et al., 2024a; Zhang et al., 2023b, 2022; Wang et al., 2022; Ling et al., 2025b; Guo et al., 2024b; Cui et al., 2022)

1992

[20]

Alarm Original paper (Beinlich et al., 1989) Papers using it: (Akbari et al., 2021; Roy et al., 2025; Xie et al., 2024; Li et al., 2022; Olko et al., 2023; Wang et al., 2025a; Duong et al., 2025; Lippe et al., 2021; Guo et al., 2024a; Zhang et al., 2023b, 2022; Wang et al., 2022; Zhang et al., 2021; Ling et al., 2025b; Guo et al., 2024b; Cui et al., 2022)

1989

[21]

Asia Original paper (Lauritzen and Spiegelhalter, 1988) Papers using it: (Shen et al., 2025; Roy et al., 2025; Shahverdikondori et al., 2024; Olko et al., 2023; Kocaoglu, 2023; Addanki and Kasiviswanathan, 2021; Vashishtha et al., 2025; Duong et al., 2025; Ke et al., 2022; Lippe et al., 2021; Bellot et al., 2024; Zhang et al., 2023b, 2022)

1988

[22]

Insurance Original paper (Binder et al., 1997) Papers using it: (Mokhtarian et al., 2023; Akbari et al., 2021; Feng et al., 2025a,b; Peyrard and West, 2020; Wang et al., 2025a; Guo et al., 2024a; Zhang et al., 2022; Wang et al., 2022; Zhang et al., 2021; Guo et al., 2024b; Cui et al., 2022)

1997

[23]

Cancer Original paper (Korb and Nicholson, 2010) Papers using it: (Feng et al., 2025a,b; Shahverdikondori et al., 2024; Peyrard and West, 2020; Olko et al., 2023; Vashishtha et al., 2025; Duong et al., 2025; Lippe et al., 2021; Zhang et al., 2023b, 2022)

2010

[24]

Barley Original paper (Kristensen and Rasmussen, 2002) Papers using it: (Akbari et al., 2021; Wang et al., 2025a; Ling et al., 2025c; Zhang et al., 2022, 2021; Ling et al., 2025b)

2002

[25]

Hailfinder Original paper (Abramson et al., 1996) Papers using it: (Mokhtarian et al., 2023; Akbari et al., 2021; Li et al., 2022; Zhang et al., 2021; Ling et al., 2025b; Cui et al., 2022)

1996

[26]

fMRI hippocampus Original paper (Poldrack et al., 2015) Papers using it: (Li et al., 2024b; Chen et al., 2024; Zeng et al., 2021)

2015

[27]

Alzheimer Original paper (Petersen et al., 2010; Shen et al., 2020) Papers using it: (Abdulaal et al., 2024; Feng et al., 2025a; Vashishtha et al., 2025)

2010

[28]

Arctic sea ice Original paper (Huang et al., 2021) Papers using it: (Abdulaal et al., 2024; Feng et al., 2025b; Kiciman et al., 2023)

2021

[29]

Diabetes Original paper (Long et al., 2022) Papers using it: (Feng et al., 2025a,b; Lippe et al., 2021)

2022

[30]

Ecoli70(100) Original paper (Schäfer and Strimmer, 2005) Papers using it: (Mokhtarian et al., 2023; Akbari et al., 2021; Kang et al., 2026; Chen et al., 2021)

2005

[31]

Gene expression Original paper (Sethuraman et al., 2023) Papers using it: (Guruswamy Sethuraman and Fekri, 2026; Xie et al., 2024; Li et al., 2025b; Guo et al., 2024a; Ling et al., 2025b)

2023

[32]

Hepar2 Original paper (Onisko, 2003) Papers using it: (Mokhtarian et al., 2023; Roy et al., 2025; Li et al., 2022)

2003

[33]

Pigs Original paper (Jensen, 1996) Papers using it: (Lippe et al., 2021; Guo et al., 2024a; Ling et al., 2025b)

1996

[34]

Reged Original paper (Statnikov et al., 2015) Papers using it: (Mian et al., 2021; Kaltenpoth and Vreeken, 2023; Guo et al., 2024b)

2015

[35]

Andes Original paper (Conati et al., 1997) Papers using it: (Xie et al., 2024; Li et al., 2022)

1997

[36]

Arth150 Original paper (Opgen-Rhein and Strimmer, 2007) Papers using it: (Mokhtarian et al., 2023; Akbari et al., 2021; Kang et al., 2026)

2007

[37]

Auto mpg Original paper (Quinlan, 1993) Papers using it: (Eulig et al., 2025; Shen et al., 2025)

1993

[38]

Carpo Original paper Link Papers using it: (Mokhtarian et al., 2023; Akbari et al., 2021)

2023

[39]

HK stock Original paper (Huang et al., 2020) Papers using it: (Li et al., 2024b; Cai et al., 2023)

2020

[40]

Link Original paper (Jensen and Kong, 1999) Papers using it: (Wang et al., 2022; Ling et al., 2025b)

1999

[41]

Mildew Original paper (Jensen and Jensen, 1996) Papers using it: (Xie et al., 2024; Li et al., 2022)

1996

[42]

Neuropain Original paper (Tu et al., 2019) Papers using it: (Liu et al., 2024a; Feng et al., 2025a; Kiciman et al., 2023; Vashishtha et al., 2025)

2019

[43]

Obesity Original paper (Long et al., 2022) Papers using it: (Feng et al., 2025a,b)

2022

[44]

Syntren Original paper (Van den Bulcke et al., 2006) Papers using it: (Dhir et al., 2024; Ling et al., 2025b)

2006

[45]

WIN95PTS Original paper (Heckerman et al., 1995) Papers using it: (Xie et al., 2024; Wang et al., 2022; Zhang et al., 2021)

1995

[46]

CE-Tueb Original paper (Mooij et al., 2016) Papers using it: (Dhir et al., 2023)

2016

[47]

CE-cha Original paper (Guyon et al., 2019) Papers using it: (Dhir et al., 2023)

2019

[48]

Cognition and Aging in the Chronic Fatigue Syndrome Original paper (Heins et al., 2013) Papers using it: (Qiao et al., 2024a)

2013

[49]

DWDClimate Original paper (Mooij et al., 2016) Papers using it: (Shen et al., 2025)

2016

[50]

MAGIC-IRRI Original paper (Scutari, 2016) Papers using it: (Kang et al., 2026)

2016

[51]

MAGIC-NIAB Original paper (Scutari et al., 2014) Papers using it: (Kang et al., 2026)

2014

[52]

New York Times Original paper Link Papers using it: (Liu et al., 2024a)

[53]

Pittsburgh Bridges Original paper (Reich and Fenves, 1989) Papers using it: (Ni, 2022)

1989

[54]

Categorical Cause-Effect Pairs Original paper (Ni, 2022) Papers using it: (Ni, 2022)

2022

[55]

Abalone Original paper (Warwick et al., 1994) Papers using it: (Ni, 2022)

1994

[56]

Alcohol Original paper (Long et al., 2022) Papers using it: (Feng et al., 2025b)

2022

[57]

Algebra I Original paper Link Papers using it: (Yu et al., 2024)

2024

[58]

APM Original paper Link Papers using it: (Eulig et al., 2025)

2025

[59]

Apple gastronome Original paper (Liu et al., 2024a) Papers using it: (Feng et al., 2025a)

[60]

Big five Original paper Link Papers using it: (Dai et al., 2025; Dong et al., 2024)

2025

[61]

Brain tumor Original paper Link Papers using it: (Liu et al., 2024a)

[62]

Chemical Original paper (Ke et al., 2021) Papers using it: (Zhao et al., 2025)

2021

[63]

(no?)Chemistry image Original paper (Schäfer and Strimmer, 2005) Papers using it: tba

2005

[64]

climatic analysis Original paper (Compo et al., 2011) Papers using it: (Liu et al., 2024a)

2011

[65]

COVID-19 Original paper Link Papers using it: (Wang et al., 2025b; Vashishtha et al., 2025)

2025

[66]

Credit Original paper (Quinlan, 1987) Papers using it: (Li et al., 2022)

1987

[67]

dream Original paper (Kalainathan and Goudet, 2019) Papers using it: (Roy et al., 2025)

2019

[68]

football Original paper Link Papers using it: (Qiao et al., 2024b)
