A Survey on Federated Causal Discovery and Inference
Pith reviewed 2026-06-26 10:21 UTC · model grok-4.3
The pith
Federated causal discovery and inference methods are classified by three design decisions on learning, partitioning, and knowledge sharing.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper establishes that any federated causal discovery solution rests on three design decisions—how structures are learned, how data are partitioned, and what structural knowledge each party obtains—and that these decisions define the axes of methodological paradigm, federation topology, and structural scope. It further shows that federated causal inference methods can be grouped by target estimand and by estimation strategy, and that discovery supplies the structure needed for valid inference, forming stages of a single pipeline.
What carries the argument
The three core design decisions (structure learning method, data partitioning scheme, and per-party structural knowledge) that generate the taxonomies of methodological paradigm, federation topology, and structural scope.
If this is right
- The taxonomies let researchers locate gaps where no methods exist for particular combinations of learning approach, topology, and scope.
- Treating discovery and inference as sequential stages means that errors in structure recovery directly limit the validity of downstream effect estimates.
- Practical factors such as temporal dynamics, data heterogeneity, and non-identical variable sets must be handled inside the same three-axis framework.
- Privacy, communication cost, and theoretical guarantees are shared constraints that apply across both discovery and inference stages.
- Open problems remain in extending the framework to new application domains and in proving finite-sample guarantees under federation constraints.
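The gap-finding use of the taxonomy in the first bullet can be sketched as a search over grid cells. This is a minimal Python sketch, not a procedure from the survey. The axis values (`constraint`, `score`, `continuous`, `horizontal`, `vertical`, `global`, `local`) and the method names are hypothetical placeholders, because the summary above names the three axes but not their categories.

```python
from itertools import product

# Hypothetical axis values: the summary names the axes, not their categories.
AXES = {
    "paradigm": ("constraint", "score", "continuous"),
    "topology": ("horizontal", "vertical"),
    "scope": ("global", "local"),
}

def empty_cells(assignments):
    """Return taxonomy cells that no method occupies.

    `assignments` maps a method name to a (paradigm, topology, scope) tuple.
    """
    occupied = set(assignments.values())
    return [cell for cell in product(*AXES.values()) if cell not in occupied]

# Placeholder methods, invented for illustration only.
methods = {
    "method_a": ("constraint", "horizontal", "global"),
    "method_b": ("continuous", "horizontal", "global"),
    "method_c": ("score", "vertical", "local"),
}

gaps = empty_cells(methods)
print(len(gaps), "of", 3 * 2 * 2, "cells are unoccupied")  # 9 of 12 cells are unoccupied
```

With real assignments taken from the surveyed papers, the returned list would be the candidate gaps; whether an empty cell is a research opportunity or an infeasible combination still needs a human judgment.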
Where Pith is reading between the lines
- The same three decisions could be used to classify methods in other privacy-sensitive graphical modeling tasks beyond causality.
- New methods might deliberately combine choices from different axes to produce hybrid algorithms that trade off privacy and accuracy in controlled ways.
- Empirical studies could test whether methods placed in the same taxonomy cell behave similarly on benchmark graphs with controlled federation constraints.
- The pipeline view suggests that joint optimization of discovery and inference stages might reduce error propagation compared with separate treatment.
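The error-propagation point can be made concrete with a standard linear example. The model below is an assumption introduced for illustration and does not come from the paper: a single confounder $Z$, treatment $T$, outcome $Y$, and mutually independent zero-mean noise terms $\eta$ and $\varepsilon$.

```latex
% Assumed linear structural model (illustrative, not from the survey)
T = \alpha Z + \eta, \qquad Y = \tau T + \gamma Z + \varepsilon .

% If discovery misses the role of Z and Y is regressed on T alone,
% the estimated coefficient converges to
\frac{\operatorname{Cov}(Y, T)}{\operatorname{Var}(T)}
  = \tau + \gamma \, \frac{\operatorname{Cov}(Z, T)}{\operatorname{Var}(T)}
  = \tau + \frac{\gamma \alpha \sigma_Z^2}{\alpha^2 \sigma_Z^2 + \sigma_\eta^2} .
```

The second term is the bias inherited from the structural error. It vanishes only when $\gamma \alpha = 0$, that is, when $Z$ does not actually confound $T$ and $Y$, so in this toy setting no amount of downstream estimation effort repairs a missed confounder.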
Load-bearing premise
The three design decisions form a complete and non-overlapping classification that covers every federated causal discovery method.
What would settle it
Publication of an FCD method whose design choices cannot be placed on the three axes without forcing an arbitrary fourth category or creating unavoidable overlap between existing categories.
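One way to make this test operational is to check, for each method, that it matches exactly one category on every axis. The following Python sketch assumes each category can be written as a predicate over a method description; the rule set, the feature names (`shared_variables`, `shared_samples`), and the example methods are all hypothetical.

```python
def classify(method, rules):
    """Return the categories on one axis whose predicate accepts `method`."""
    return [name for name, accepts in rules.items() if accepts(method)]

def audit(methods, axes):
    """Report methods that fit zero categories (a gap) or several (an overlap)."""
    problems = []
    for label, method in methods.items():
        for axis, rules in axes.items():
            hits = classify(method, rules)
            if len(hits) != 1:
                kind = "uncovered" if not hits else "overlap"
                problems.append((label, axis, kind, hits))
    return problems

# Hypothetical single-axis rule set keyed on how the data are split.
axes = {
    "topology": {
        "horizontal": lambda m: m["shared_variables"] and not m["shared_samples"],
        "vertical": lambda m: m["shared_samples"] and not m["shared_variables"],
    }
}

# Invented method descriptions, for illustration only.
methods = {
    "same_variables_new_patients": {"shared_variables": True, "shared_samples": False},
    "partial_overlap_in_both": {"shared_variables": True, "shared_samples": True},
}

problems = audit(methods, axes)
print(problems)  # [('partial_overlap_in_both', 'topology', 'uncovered', [])]
```

An `uncovered` result corresponds to a method that forces a new category, and an `overlap` result to a method that sits in two categories at once, which are the two failure modes named above.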
Original abstract
Causal reasoning, which encompasses the discovery of causal structures and the inference of causal effects, is fundamental to data-driven decision making. In practice, data for reliable causal analysis are often distributed across institutions and cannot be centralized due to privacy regulations or communication constraints. Federated learning (FL) addresses this by enabling collaborative analysis without raw data sharing, giving rise to the rapidly growing field of federated causal discovery (FCD) and inference (FCI). However, the interdisciplinary nature of this field and the absence of a comprehensive survey present barriers to entry for researchers. This paper bridges that gap by providing a systematic review through multi-dimensional taxonomies. Grounded in the three core design decisions underlying any FCD solution, namely how structures are learned, how data are partitioned, and what structural knowledge each party obtains, we organize FCD along three axes: methodological paradigm, federation topology, and structural scope. We further examine key practical dimensions, including temporal dynamics, data heterogeneity, missing data, and non-identical variable sets. For FCI, we categorize methods by target estimand (average versus individualized/conditional treatment effects) and by estimation strategy, from classical weighting methods to modern deep generative architectures. Unlike prior works that treat FCD and FCI separately, we formalize their connection as complementary stages of a unified federated causal reasoning pipeline, where FCD supplies the structural knowledge required for valid effect estimation in FCI. Finally, we highlight their shared concerns regarding privacy, communication efficiency, theoretical guarantees, and application domains, and conclude by identifying open challenges for future research.
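The classical weighting strategy mentioned in the abstract can be illustrated with a small federated estimator. This is a sketch under stated assumptions, not a method taken from the survey: data are split by unit across sites, each unit's propensity score is already known, and the function names `local_summary` and `aggregate_ate` are hypothetical. Each site shares four weighted sums, and a coordinator combines them into a normalized inverse-probability-weighted (Hajek) estimate of the average treatment effect.

```python
def local_summary(records):
    """Compute the four sums a site would share instead of raw records.

    Each record is (treated, outcome, propensity), where propensity is the
    assumed-known probability of treatment for that unit.
    """
    s1 = w1 = s0 = w0 = 0.0
    for treated, outcome, propensity in records:
        if treated:
            weight = 1.0 / propensity
            s1 += weight * outcome
            w1 += weight
        else:
            weight = 1.0 / (1.0 - propensity)
            s0 += weight * outcome
            w0 += weight
    return s1, w1, s0, w0

def aggregate_ate(summaries):
    """Combine site summaries into a normalized IPW (Hajek) ATE estimate."""
    s1, w1, s0, w0 = (sum(column) for column in zip(*summaries))
    return s1 / w1 - s0 / w0

# Toy records, invented for illustration.
site_a = [(1, 3.0, 0.5), (0, 1.0, 0.5)]
site_b = [(1, 5.0, 0.5), (0, 2.0, 0.5), (0, 0.0, 0.5)]

ate = aggregate_ate([local_summary(site_a), local_summary(site_b)])
print(ate)  # 3.0
```

Because the estimator depends on the data only through these sums, the federated result equals the one computed on pooled records. Sharing sums is not a privacy guarantee by itself; a real deployment would likely add secure aggregation or differential privacy, which this sketch omits.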
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript is a survey on federated causal discovery (FCD) and federated causal inference (FCI). It organizes FCD methods via a three-axis taxonomy (methodological paradigm, federation topology, structural scope) derived from three core design decisions (how structures are learned, how data are partitioned, and what structural knowledge each party obtains). It further reviews practical dimensions such as temporal dynamics and heterogeneity, formalizes FCD and FCI as complementary stages of a unified pipeline, categorizes FCI methods by estimand and strategy, and discusses shared concerns including privacy and open challenges.
Significance. If the proposed taxonomy proves comprehensive, the survey would provide a useful entry point and organizational framework for an emerging interdisciplinary area, with the explicit linkage of discovery to inference as a constructive contribution. The coverage of both methodological and practical aspects strengthens its potential utility for researchers.
Major comments (1)
- [Abstract and taxonomy introduction section] The assertion that the three core design decisions constitute a complete, non-overlapping basis for all FCD solutions is presented without an explicit exhaustive mapping of the cited literature onto the resulting 3D grid, and without a discussion of potential overlaps or collisions with the supplementary dimensions (temporal dynamics, heterogeneity, etc.). The claimed generality of the taxonomy is therefore unverified, even though the central organizational contribution depends on it.
Minor comments (2)
- [Taxonomy figures] The taxonomy diagrams would benefit from explicit cell-by-cell examples drawn from the surveyed papers to improve immediate readability.
- Ensure the reference list includes all works mentioned in the text and consider adding a table summarizing the mapping of key papers to the three axes.
Simulated Author's Rebuttal
We thank the referee for the constructive comment on the taxonomy presentation. We address it point by point below.
- Referee: [Abstract and taxonomy introduction section] The assertion that the three core design decisions constitute a complete, non-overlapping basis for all FCD solutions is presented without an explicit exhaustive mapping of the cited literature onto the resulting 3D grid, and without a discussion of potential overlaps or collisions with the supplementary dimensions (temporal dynamics, heterogeneity, etc.). The claimed generality of the taxonomy is therefore unverified, even though the central organizational contribution depends on it.
Authors: The three core design decisions (how structures are learned, how data are partitioned, and what structural knowledge each party obtains) are presented as the logical foundation for any FCD solution because they directly encode the fundamental constraints of federated settings. The manuscript then organizes the reviewed literature along the resulting three axes in dedicated sections. We acknowledge that the abstract and taxonomy introduction do not contain an explicit exhaustive mapping table of all cited works onto the 3D grid, nor a dedicated discussion of interactions with the supplementary dimensions. In the revised version we will add both: (i) a summary table mapping representative papers to taxonomy cells to make completeness verifiable, and (ii) a short paragraph clarifying that supplementary dimensions such as temporal dynamics and heterogeneity are treated as orthogonal refinements that operate within the core axes rather than creating overlaps or collisions in the primary classification. This addresses the concern while preserving the taxonomy's grounding in the three design decisions.
Revision: yes
Circularity Check
No circularity: survey aggregates external results without self-referential derivation
This is a literature survey paper whose central contribution is a multi-dimensional taxonomy for organizing existing FCD/FCI methods. The three core design decisions (structure learning, data partitioning, structural knowledge per party) are presented as an organizing lens drawn from the literature rather than derived from any fitted parameters, self-citations, or equations within the paper itself. No equations, predictions, or uniqueness theorems are claimed; the work explicitly states it reviews and categorizes published methods. The taxonomy is therefore not equivalent to its inputs by construction, and the paper remains self-contained against external benchmarks. No load-bearing steps reduce to self-definition or fitted inputs.