Computational Identifiability

Kyunghyun Cho; Lucius E.J. Bynum; Rajesh Ranganath

arxiv: 2606.19361 · v1 · pith:EQQABIY7new · submitted 2026-06-08 · 💻 cs.LG · cs.AI · cs.NA · math.NA · stat.CO · stat.ME · stat.ML

Pith reviewed 2026-06-27 16:50 UTC · model grok-4.3

keywords: computational identifiability · causal identification · empirical estimator · finite samples · causal graphs · counterfactual queries · mixed data

The pith

A causal query is computationally identifiable when a finite search finds an estimator within a chosen error tolerance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper distinguishes theoretical identifiability, which relies on asymptotic or infinite-data conditions, from computational identifiability. The latter is defined by whether a specified finite search procedure can locate an empirical estimator for the target causal query that meets a chosen error tolerance. Success under the procedure's prior over parameters and other assumptions counts as identifiability. Readers would care because the method supports concrete checks in settings with small samples, unclear graphs, mixed data sources, or counterfactual targets that idealized theory leaves unresolved.

Core claim

Identification conditions describe the computability of a target query or parameter of interest as a function of the type and amount of information available. Theoretical identifiability assumes asymptotic properties, infinite data, or other idealized conditions. Computational identifiability instead defines a finite computational search procedure for an empirical estimator; if this process finds an estimator empirically within a desired error tolerance, then identifiability is satisfied, conditional on the specified assumptions of the search and on the search procedure itself.
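
One way to write the claim down is sketched here; it is an illustration, not the paper's own definition. The notation is assumed: $\pi$ is the prior over model parameters $\theta$, $D_n$ is a dataset of size $n$ drawn from the model at $\theta$, $\tau(\theta)$ is the target query, $\hat{\tau}_{\mathcal{A}}$ is the estimator returned by the search procedure $\mathcal{A}$, and $\delta$ is a hypothetical confidence level that the source does not specify.

```latex
\[
  p_{\mathcal{A}}(\epsilon, n)
  \;=\;
  \Pr_{\theta \sim \pi,\; D_n \sim P_\theta}
  \Bigl( \bigl\lvert \hat{\tau}_{\mathcal{A}}(D_n) - \tau(\theta) \bigr\rvert \le \epsilon \Bigr),
  \qquad
  \text{declare identifiability when } p_{\mathcal{A}}(\epsilon, n) \ge 1 - \delta .
\]
```

Written this way, the dependence on the prior $\pi$, the procedure $\mathcal{A}$, the tolerance $\epsilon$, and the sample size $n$ is explicit, which is the conditionality the core claim describes.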

What carries the argument

A finite computational search procedure for an empirical estimator of the target query, conditioned on a prior distribution over parameters.
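
A minimal sketch of such a check follows, assuming a toy linear model with an observed confounder and a uniform prior over its coefficients. The model, the prior, the least-squares estimator, and the function names (`simulate`, `adjusted_ols`, `success_rate`) are all illustrative assumptions; the paper itself works with meta-trained models rather than this estimator.

```python
import random


def simulate(n, a, b, c, rng):
    """Draw n rows (x, z, y) from a toy linear model with an observed confounder z."""
    rows = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        x = a * z + rng.gauss(0.0, 1.0)
        y = b * x + c * z + rng.gauss(0.0, 1.0)
        rows.append((x, z, y))
    return rows


def adjusted_ols(rows):
    """Coefficient on x from a no-intercept least-squares fit of y on (x, z)."""
    sxx = sum(x * x for x, _, _ in rows)
    szz = sum(z * z for _, z, _ in rows)
    sxz = sum(x * z for x, z, _ in rows)
    sxy = sum(x * y for x, _, y in rows)
    szy = sum(z * y for _, z, y in rows)
    det = sxx * szz - sxz * sxz
    # Solve the 2x2 normal equations for the x coefficient.
    return (sxy * szz - sxz * szy) / det


def success_rate(estimator, n, eps, n_draws=200, seed=0):
    """Fraction of prior draws for which the estimate of b lands within eps."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        # Assumed prior: each coefficient uniform on [-1, 1].
        a, b, c = (rng.uniform(-1.0, 1.0) for _ in range(3))
        rows = simulate(n, a, b, c, rng)
        if abs(estimator(rows) - b) <= eps:
            hits += 1
    return hits / n_draws
```

Calling `success_rate(adjusted_ols, n=500, eps=0.2)` and then repeating with `n=5` gives a rough sense of how the same query can pass the check at one sample size and fail it at another.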

If this is right

It permits checking identifiability with small finite samples.
It resolves questions involving ambiguous graphical criteria.
It handles mixed observational and interventional data.
It applies to counterfactual data and estimands.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Practitioners could replace some graphical tests with simulation-based searches tuned to their data size.
Identifiability might vary with available compute budget rather than being a fixed property of the graph alone.
Different search procedures could be compared on the same query to test sensitivity of the conclusion.
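
The extensions above can be illustrated with a small sketch, under stated assumptions: the "search" is a comparison of two hypothetical candidate estimators on a fixed budget of simulated datasets, and the winner is then scored on fresh draws. The toy model, the uniform prior, and every name here (`CANDIDATES`, `draw_task`, `search`, `held_out_success`) are illustrative and do not come from the paper or its code.

```python
import random


def simulate(n, a, b, c, rng):
    """Draw n rows (x, z, y) from a toy linear model with an observed confounder z."""
    rows = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        x = a * z + rng.gauss(0.0, 1.0)
        y = b * x + c * z + rng.gauss(0.0, 1.0)
        rows.append((x, z, y))
    return rows


def naive_ols(rows):
    """Slope of y on x alone; ignores the confounder z."""
    sxx = sum(x * x for x, _, _ in rows)
    sxy = sum(x * y for x, _, y in rows)
    return sxy / sxx


def adjusted_ols(rows):
    """Coefficient on x from a no-intercept least-squares fit of y on (x, z)."""
    sxx = sum(x * x for x, _, _ in rows)
    szz = sum(z * z for _, z, _ in rows)
    sxz = sum(x * z for x, z, _ in rows)
    sxy = sum(x * y for x, _, y in rows)
    szy = sum(z * y for _, z, y in rows)
    return (sxy * szz - sxz * szy) / (sxx * szz - sxz * sxz)


CANDIDATES = {"naive": naive_ols, "adjusted": adjusted_ols}


def draw_task(n, rng):
    """One simulated dataset plus the true effect b (assumed uniform prior)."""
    a, b, c = (rng.uniform(-1.0, 1.0) for _ in range(3))
    return simulate(n, a, b, c, rng), b


def search(n, budget, rng):
    """Score every candidate on `budget` simulated datasets; return the best name."""
    tasks = [draw_task(n, rng) for _ in range(budget)]

    def mean_abs_error(estimator):
        return sum(abs(estimator(rows) - b) for rows, b in tasks) / len(tasks)

    return min(CANDIDATES, key=lambda name: mean_abs_error(CANDIDATES[name]))


def held_out_success(name, n, eps, n_draws, rng):
    """Fraction of fresh prior draws where the chosen estimator is within eps."""
    estimator = CANDIDATES[name]
    hits = 0
    for _ in range(n_draws):
        rows, b = draw_task(n, rng)
        hits += abs(estimator(rows) - b) <= eps
    return hits / n_draws
```

Varying `budget` or swapping the candidate set changes what the search can find, which is one concrete sense in which the conclusion could depend on compute and on the procedure rather than on the graph alone.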

Load-bearing premise

Success of the chosen search procedure with its prior distribution corresponds to genuine identifiability of the target query.

What would settle it

A concrete case where the search locates an estimator within tolerance for a query that standard identification theory proves is not identifiable, or where the search fails for a query known to be identifiable.
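
A toy version of such a test case can be sketched as follows, assuming a linear model with a hidden confounder `u` and a naive regression of `y` on `x`. In this construction the naive slope converges to `b + c/2`, so it is off target whenever `c` is nonzero. The model, the prior, and the names `simulate_hidden`, `naive_ols`, `success_rate`, and `prior_width` are assumptions made for illustration only.

```python
import random


def simulate_hidden(n, b, c, rng):
    """Rows (x, y); the confounder u is generated but never returned."""
    rows = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        x = u + rng.gauss(0.0, 1.0)
        y = b * x + c * u + rng.gauss(0.0, 1.0)
        rows.append((x, y))
    return rows


def naive_ols(rows):
    """Slope of y on x; biased by c/2 in this toy model because u is unobserved."""
    sxx = sum(x * x for x, _ in rows)
    sxy = sum(x * y for x, y in rows)
    return sxy / sxx


def success_rate(prior_width, n=400, eps=0.15, n_draws=200, seed=0):
    """Fraction of prior draws where the naive estimate of b is within eps.

    prior_width is a hypothetical knob: confounding strength c is drawn
    uniformly from [-prior_width, prior_width].
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        b = rng.uniform(-1.0, 1.0)
        c = rng.uniform(-prior_width, prior_width)
        rows = simulate_hidden(n, b, c, rng)
        hits += abs(naive_ols(rows) - b) <= eps
    return hits / n_draws
```

Comparing `success_rate(1.0)` with `success_rate(0.05)` shows the same estimator failing under a wide prior on the confounding strength and passing under a tight one, so a passing check in a case like this reflects the prior as much as the query.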

Figures

Figures reproduced from arXiv: 2606.19361 by Kyunghyun Cho, Lucius E.J. Bynum, Rajesh Ranganath.

**Figure 2.** (a)-(f) DAGs for each experimental setting we consider. (Right) Diagram showing computational identifiability curves, visualizing Definition 4 across a range of possible ϵ values. The empirical (or posterior) probability of identifiability can be read at a given desired error tolerance...

**Figure 3.** (a) Meta-model performance in the transportability setting with varying mixtures of...

**Figure 4.** (a) Meta-model performance in the counterfactual setting estimating CATE vs. ITE...

**Figure 5.** Computational identifiability can have a non-monotonic relationship with dataset size...

**Figure 6.** Identifiability curves from meta-models trained on each of the cases in Equation (5), as a...

**Figure 7.** Identifiability curves from per-dataset estimation methods instead of meta-trained models,...
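
The identifiability curves described in the Figure 2 caption can be approximated, as a rough sketch, by sweeping the tolerance over a grid and recording the fraction of runs whose absolute error falls within it. The function name `identifiability_curve` and the example numbers are hypothetical; the paper's Definition 4 is not reproduced on this page, so this is only one plausible reading of the caption.

```python
def identifiability_curve(abs_errors, eps_grid):
    """For each tolerance in eps_grid, the fraction of runs with error within it."""
    n = len(abs_errors)
    return [sum(err <= eps for err in abs_errors) / n for eps in eps_grid]


# Made-up absolute errors from four hypothetical runs.
curve = identifiability_curve([0.05, 0.1, 0.3, 0.8], [0.0, 0.1, 0.5, 1.0])
print(curve)  # [0.0, 0.5, 0.75, 1.0]
```

Reading the curve at a chosen tolerance gives the empirical probability of success at that tolerance, which is how the caption describes using the plot.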

Original abstract

Identification conditions describe the computability of a target query or parameter of interest as a function of the type and amount of information available. In causal identification, this information is often expressed in the form of a causal graph, and data are observed or collected for some subset of variables in the graph. Target queries may be for a single effect alone or for a class of effects in a given model. The derivation of an identification algorithm then defines mathematically the process by which the desired causal effect(s) can be uniquely determined, theoretically, in expectation. Identifiability in expectation, or 'theoretical identifiability,' generally assumes asymptotic properties, infinite data, or other mathematically idealized conditions. In this paper, we explore a fundamental distinction between this theoretical, idealized notion of identifiability and a proposed alternative that is computation-bound. The framework we propose - 'computational identifiability' - is to instead define a finite computational search procedure for an empirical estimator. If this process finds an estimator empirically, within a desired error tolerance, then identifiability is satisfied, conditional on the specified assumptions of the search (i.e., a prior distribution over the parameters) and conditional on the search procedure itself. Through several experiments, we demonstrate how this framework allows us to answer fine-grained, practical identification questions, such as identification with small finite samples, with ambiguous graphical criteria, with mixed observational-interventional data, and across counterfactual data and estimands. Code is available at https://github.com/lbynum/metadentify.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper defines a search-based computational identifiability that handles finite samples and mixed data, but makes the property depend on the chosen procedure and prior.

The core move is to replace the usual asymptotic or graph-based identifiability with a version that holds if a finite search procedure recovers an estimator inside a stated error tolerance. This lets the authors pose questions about small samples, ambiguous graphs, observational-plus-interventional data, and counterfactuals that standard identification theory does not directly address.

The experiments are presented as demonstrations of these practical cases, and the code is public, which is useful. The framing is explicit that success is conditional on the search method and the prior over parameters, so the circularity is acknowledged rather than hidden.

The main limitation is that the value of the definition still rests on whether the search procedure itself is reliable and not overly sensitive to its own hyperparameters. If the procedure fails to find an estimator, it is unclear whether the target is truly non-identifiable or simply hard for that particular search to locate. The paper would be stronger with more detail on how the search is constructed and with direct comparisons to existing identification algorithms on the same tasks.

This is mainly for applied causal researchers who already work with finite data and need a way to check identifiability under realistic constraints. It is not yet a replacement for theoretical results but could complement them. The work shows clear thinking about the gap between theory and practice, so it deserves a serious referee to evaluate the search implementation and the experimental evidence.

Referee Report

1 major / 1 minor

Summary. The paper proposes 'computational identifiability' as a computation-bound alternative to theoretical identifiability in causal inference. It defines this notion as the success of a finite search procedure in empirically locating an estimator within a desired error tolerance, conditional on a prior over parameters and on the procedure itself. Experiments are claimed to demonstrate utility for practical questions including finite-sample identification, ambiguous graphical criteria, mixed observational-interventional data, and counterfactual estimands.

Significance. If the central definition can be made non-circular and the search procedure shown to correspond to genuine identifiability rather than procedure-specific success, the framework could supply a practical, finite-data complement to classical identification theory in causal ML. The public code release is a clear strength for reproducibility.

major comments (1)

[Abstract] Abstract (definition of computational identifiability): the claim that identifiability holds precisely when the search succeeds, conditional on the search procedure and prior, makes the central notion dependent on the very computational object whose success it declares. This circularity is load-bearing for the distinction from theoretical identifiability and requires explicit resolution (e.g., via a non-tautological criterion that the procedure must satisfy independently of its own output).

minor comments (1)

The abstract states that experiments address fine-grained questions but supplies no information on baselines, error bars, sample sizes, or the concrete form of the search procedure; these details must appear in the main text and tables for the empirical claims to be evaluable.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive report and the opportunity to clarify our framework. We address the sole major comment below regarding the definition of computational identifiability. We believe the original definition is not circular, as the procedure is fixed independently, but we will revise the manuscript to make this explicit and add a non-tautological criterion as suggested.

Referee: [Abstract] Abstract (definition of computational identifiability): the claim that identifiability holds precisely when the search succeeds, conditional on the search procedure and prior, makes the central notion dependent on the very computational object whose success it declares. This circularity is load-bearing for the distinction from theoretical identifiability and requires explicit resolution (e.g., via a non-tautological criterion that the procedure must satisfy independently of its own output).

Authors: We thank the referee for this observation. The definition is not circular because the search procedure is an a priori, independently specified computational object (a concrete algorithm with its own prior, search strategy, tolerance, and stopping rules) that does not reference the target query's identifiability status or output. Computational identifiability is then the empirical observation that this fixed procedure succeeds in locating an estimator within tolerance on simulated or held-out data. This is analogous to defining decidability relative to a specific Turing machine without circularity. The distinction from theoretical identifiability is preserved because the latter invokes asymptotic or infinite-data guarantees, while ours is strictly finite and procedure-relative. To resolve the concern explicitly, we will revise the abstract, Section 2, and related discussion to state a non-tautological criterion: the procedure must be a well-defined search (e.g., grid search, optimization, or sampling) that operates without presupposing the numerical value of the target parameter and whose success is validated by out-of-sample error metrics independent of the procedure's internal state. This clarification will be incorporated in the revision.

revision: yes

Circularity Check

0 steps flagged

No significant circularity; framework is explicitly definitional

The paper proposes computational identifiability as an alternative to theoretical identifiability, explicitly defining it as holding when a finite search procedure finds an estimator within error tolerance, conditional on the procedure and prior. This definition is stated outright in the abstract with no claim of independent derivation or reduction to hidden inputs. No equations, self-citations, or fitted predictions are invoked to establish the central notion; experiments apply the defined framework to practical cases. The conditionality on the search is acknowledged as part of the definition rather than an unstated assumption that undermines a separate result.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 1 invented entity

The framework rests on a new definition that conditions identifiability on an unspecified search procedure and prior; these are introduced without independent justification beyond the proposal.

free parameters (2)

error tolerance: The tolerance threshold that determines search success is a free parameter of the definition.
prior distribution over parameters: The prior used inside the search is stated as an assumption but not derived from data or external principle.

axioms (1)

domain assumption: Standard causal graph and identification-in-expectation assumptions from prior literature.
The new definition is presented as an alternative to, but still built upon, classical causal identification theory.

invented entities (1)

computational identifiability (no independent evidence)
purpose: To replace theoretical identifiability with a search-based criterion.
New concept introduced by the paper; no independent evidence outside the definition itself.

pith-pipeline@v0.9.1-grok · 5825 in / 1346 out tokens · 25824 ms · 2026-06-27T16:50:39.656299+00:00 · methodology


[1] [1]

Identification of causal effects using instrumental variables.Journal of the American statistical Association, 91(434):444–455, 1996

Joshua D Angrist, Guido W Imbens, and Donald B Rubin. Identification of causal effects using instrumental variables.Journal of the American statistical Association, 91(434):444–455, 1996

1996

[2] [2]

Gridded transformer neural processes for spatio-temporal data

Matthew Ashman, Cristiana Diaconu, Eric Langezaal, Adrian Weller, and Richard E Turner. Gridded transformer neural processes for spatio-temporal data. In Aarti Singh, Maryam Fazel, Daniel Hsu, Simon Lacoste-Julien, Felix Berkenkamp, Tegan Maharaj, Kiri Wagstaff, and Jerry Zhu, editors,Proceedings of the 42nd International Conference on Machine Learning, v...

2025

[3] [3]

Causalpfn: Amortized causal effect estimation via in-context learning

Vahid Balazadeh, Hamidreza Kamkari, Valentin Thomas, Junwei Ma, Bingru Li, Jesse C Cresswell, and Rahul Krishnan. Causalpfn: Amortized causal effect estimation via in-context learning. InThe Thirty-ninth Annual Conference on Neural Information Processing Systems

[4] Vahid Balazadeh, Hamidreza Kamkari, Medha Barath, Ricardo Silva, and Rahul G Krishnan. IV-ICL: Bounding causal effects with instrumental variables via in-context learning. arXiv preprint arXiv:2605.12924, 2026.

[5] Alexander Balke and Judea Pearl. Counterfactual probabilities: Computational methods, bounds and applications. In Uncertainty in Artificial Intelligence, pages 46–54. Elsevier, 1994.

[6] Alexander Balke and Judea Pearl. Bounds on treatment effects from studies with imperfect compliance. Journal of the American Statistical Association, 92(439):1171–1176, 1997.

[7] Elias Bareinboim and Judea Pearl. Causal inference by surrogate experiments: z-identifiability. In Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, pages 113–120, 2012.

[8] Lucius EJ Bynum, Aahlad Manas Puli, Diego Herrero-Quevedo, Nhi Nguyen, Carlos Fernandez-Granda, Kyunghyun Cho, and Rajesh Ranganath. Black box causal inference: Effect estimation via meta prediction. arXiv preprint arXiv:2503.05985, 2025.

[9] Zhihong Cai, Manabu Kuroki, Judea Pearl, and Jin Tian. Bounds on direct effects in the presence of confounded intermediate variables. Biometrics, 64(3):695–701, 2008.

[10] Juan Correa and Elias Bareinboim. A calculus for stochastic interventions: Causal effect identification and surrogate experiments. Proceedings of the AAAI Conference on Artificial Intelligence, 34(06):10093–10100, Apr. 2020. doi: 10.1609/aaai.v34i06.6567. URL https://ojs.aaai.org/index.php/AAAI/article/view/6567

[11] Juan Correa and Elias Bareinboim. General transportability of soft interventions: Completeness results. Advances in Neural Information Processing Systems, 33:10902–10912, 2020.

[12] Juan Correa, Sanghack Lee, and Elias Bareinboim. Nested counterfactual identification from arbitrary surrogate experiments. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. Wortman Vaughan, editors, Advances in Neural Information Processing Systems, volume 34, pages 6856–6867. Curran Associates, Inc., 2021. URL https://proceedings.neurips.cc/ ...

[13] Anish Dhir, Cristiana Diaconu, Valentinian Mihai Lungu, James Requeima, Richard E Turner, and Mark van der Wilk. Estimating interventional distributions with uncertain causal graphs through meta-learning. In The Thirty-ninth Annual Conference on Neural Information Processing Systems.

[14] Frederick Eberhardt and Richard Scheines. Interventions and causal inference. Philosophy of Science, 74:981–995, 2007.

[15] Marta Garnelo, Dan Rosenbaum, Christopher Maddison, Tiago Ramalho, David Saxton, Murray Shanahan, Yee Whye Teh, Danilo Rezende, and SM Ali Eslami. Conditional neural processes. In International Conference on Machine Learning, pages 1704–1713. PMLR, 2018.

[16] Marta Garnelo, Jonathan Schwarz, Dan Rosenbaum, Fabio Viola, Danilo J Rezende, SM Eslami, and Yee Whye Teh. Neural processes. arXiv preprint arXiv:1807.01622, 2018.

[17] AmirEmad Ghassami, Ilya Shpitser, and Eric Tchetgen Tchetgen. Partial identification of causal effects using proxy variables. arXiv preprint arXiv:2304.04374, 2023.

[18] James J Heckman and Edward J Vytlacil. Instrumental variables, selection models, and tight bounds on the average treatment effect. Working Paper 259, National Bureau of Economic Research, August 2000. URL http://www.nber.org/papers/t0259

[19] Leonard Henckel, Emilija Perković, and Marloes H Maathuis. Graphical criteria for efficient total effect estimation via adjustment in causal linear models. Journal of the Royal Statistical Society Series B: Statistical Methodology, 84(2):579–599, 2022.

[20] Joel L Horowitz and Charles F Manski. Nonparametric analysis of randomized experiments with missing covariate and outcome data. Journal of the American Statistical Association, 95(449):77–84, 2000.

[21] Yimin Huang and Marco Valtorta. Pearl’s calculus of intervention is complete. In Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence, pages 217–224, 2006.

[22] Emil Javurek, Dennis Frauen, Marie Brockschmidt, Jonas Schweisthal, and Stefan Feuerriegel. Amortizing causal sensitivity analysis via prior data-fitted networks. arXiv preprint arXiv:2605.10590, 2026.

[23] Brendan Kline and Elie Tamer. Recent developments in partial identification. Annual Review of Economics, 15:125–150, 2023.

[24] Juho Lee, Yoonho Lee, Jungtaek Kim, Adam Kosiorek, Seungjin Choi, and Yee Whye Teh. Set transformer: A framework for attention-based permutation-invariant neural networks. In International Conference on Machine Learning, pages 3744–3753. PMLR, 2019.

[25] Sanghack Lee and Elias Bareinboim. Causal effect identifiability under partial-observability. In International Conference on Machine Learning, pages 5692–5701. PMLR, 2020.

[26] Roderick JA Little and Donald B Rubin. Statistical analysis with missing data. John Wiley & Sons, 2019.

[27] Yuchen Ma, Dennis Frauen, Emil Javurek, and Stefan Feuerriegel. Foundation models for causal inference via prior-data fitted networks. arXiv preprint arXiv:2506.10914, 2025.

[28] Aurghya Maiti, Drago Plecko, and Elias Bareinboim. Counterfactual identification under monotonicity constraints. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 26841–26850, 2025.

[29] Daniel Malinsky, Ilya Shpitser, and Thomas Richardson. A potential outcomes calculus for identifying conditional path-specific effects. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 3080–3088. PMLR, 2019.

[30] Wang Miao, Zhi Geng, and Eric J Tchetgen Tchetgen. Identifying causal effects with proxy variables of an unmeasured confounder. Biometrika, 105(4):987–993, 2018.

[31] Xenia Miscouridou, Adler Perotte, Noémie Elhadad, and Rajesh Ranganath. Deep survival analysis: Nonparametrics and missingness. In Machine Learning for Healthcare Conference, pages 244–256. PMLR, 2018.

[32] Peiman Mohseni, Nick Duffield, Bani Mallick, and Arman Hasanzadeh. Adaptive conditional quantile neural processes. In Uncertainty in Artificial Intelligence, pages 1445–1455. PMLR, 2023.

[33] Francesco Montagna, Max Cairney-Leeming, Dhanya Sridhar, and Francesco Locatello. Demystifying amortized causal discovery with transformers. arXiv preprint arXiv:2405.16924, 2024.

[34] Judea Pearl. Causal diagrams for empirical research (with discussions). In Probabilistic and causal inference: The works of Judea Pearl, pages 255–316. 2022.

[35] Judea Pearl and James M Robins. Probabilistic evaluation of sequential plans from causal models with hidden variables. In UAI, volume 95, pages 444–453, 1995.

[36] Aahlad Puli and Rajesh Ranganath. General control functions for causal effect estimation from IVs. Advances in Neural Information Processing Systems, 33:8440–8451, 2020.

[37] Aahlad Puli, Adler Perotte, and Rajesh Ranganath. Causal estimation with functional confounders. Advances in Neural Information Processing Systems, 33:5115–5125, 2020.

[38] Jingang Qu, David Holzmüller, Gaël Varoquaux, and Marine Le Morvan. Tabiclv2: A better, faster, scalable, and open tabular foundation model. arXiv preprint arXiv:2602.11139, 2026.

[39] Arvind Raghavan and Elias Bareinboim. Causal identification from counterfactual data: Completeness and bounding results. arXiv preprint arXiv:2602.23541, 2026.

[40] Rajesh Ranganath, Adler Perotte, Noémie Elhadad, and David Blei. Deep survival analysis. In Machine Learning for Healthcare Conference, pages 101–114. PMLR, 2016.

[41] Jake Robertson, Arik Reuter, Siyuan Guo, Noah Hollmann, Frank Hutter, and Bernhard Schölkopf. Do-pfn: In-context learning for causal effect estimation. In The Thirty-ninth Annual Conference on Neural Information Processing Systems.

[42] Andrea Rotnitzky and Ezequiel Smucler. Efficient adjustment sets for population average causal treatment effect estimation in graphical models. Journal of Machine Learning Research, 21(188):1–86, 2020. URL http://jmlr.org/papers/v21/19-1026.html

[43] Michael C Sachs, Gustav Jonzon, Arvid Sjölander, and Erin E Gabriel. A general method for deriving tight symbolic bounds on causal effects. Journal of Computational and Graphical Statistics, 32(2):567–576, 2023.

[44] Uri Shalit, Fredrik D Johansson, and David Sontag. Estimating individual treatment effect: generalization bounds and algorithms. In International Conference on Machine Learning, pages 3076–3085. PMLR, 2017.

[45] Shohei Shimizu, Patrik O Hoyer, Aapo Hyvärinen, Antti Kerminen, and Michael Jordan. A linear non-Gaussian acyclic model for causal discovery. Journal of Machine Learning Research, 7(10), 2006.

[46] Ilya Shpitser and Judea Pearl. Identification of joint interventional distributions in recursive semi-Markovian causal models. In AAAI, pages 1219–1226, 2006.

[47] Ilya Shpitser and Judea Pearl. Complete identification methods for the causal hierarchy. Journal of Machine Learning Research, 9:1941–1979, 2008.

[48] James H Stock and Motohiro Yogo. Testing for weak instruments in linear IV regression, 2002.

[49] Jiyuan Tan, Jose Blanchet, and Vasilis Syrgkanis. Consistency of neural causal partial identification. Advances in Neural Information Processing Systems, 37:68956–68999, 2024.

[50] Eric J Tchetgen Tchetgen, Andrew Ying, Yifan Cui, Xu Shi, and Wang Miao. An introduction to proximal causal learning. arXiv preprint arXiv:2009.10982, 2020.

[51] Kevin Xia, Kai-Zhan Lee, Yoshua Bengio, and Elias Bareinboim. The causal-neural connection: Expressiveness, learnability, and inference. Advances in Neural Information Processing Systems, 34:10823–10836, 2021.

[52] Kevin Xia, Yushu Pan, and Elias Bareinboim. Neural causal models for counterfactual identification and estimation. In Proceedings of the Eleventh International Conference on Learning Representations, 2022. URL https://openreview.net/forum?id=vouQcZS8KfW

[53] Lily Zhang, Veronica Tozzo, John Higgins, and Rajesh Ranganath. Set norm and equivariant skip connections: Putting the deep in deep sets. In International Conference on Machine Learning, pages 26559–26574. PMLR, 2022.

A Further generalizing causal meta-prediction frameworks

[Figure: x-axis is error tolerance (ϵ) in σ²_Y units]