From Prediction to Intervention: The Evolution of AI in Biomedicine
Pith reviewed 2026-05-21 01:12 UTC · model grok-4.3
The pith
AI systems unable to model interventions will be structurally excluded from biomedical decision-making.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We argue that AI in biomedicine is undergoing a structural transition from observational systems that learn associations from historical data to interventional systems that model how biological processes respond to perturbations. Observational architectures are limited to previously observed states and cannot reliably generalize to novel therapies. Disease-level models explicitly represent state, dynamics, and intervention response to enable simulation of outcomes under intervention rather than extrapolation from the past. This follows from the structure of biomedical decision-making and means that systems unable to model intervention will be structurally excluded from it.
What carries the argument
The distinction between observational intelligence and interventional intelligence, with disease-level models serving as the mechanism to represent and simulate intervention responses in biological systems.
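The three components named in the core claim (state, dynamics, intervention response) can be made concrete with a toy sketch. Everything in it is a hypothetical illustration and not the paper's model: the `DiseaseModel` class, the one-dimensional "burden" state, the logistic growth rule, and the `kill_rate` parameter standing in for an intervention are all assumptions chosen for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DiseaseModel:
    """Toy disease-level model (illustrative assumption, not the paper's).

    State: a single number, the disease burden.
    Dynamics: discrete logistic growth.
    Intervention operator: a kill term proportional to the burden.
    """

    growth_rate: float = 0.3  # hypothetical per-step growth rate
    capacity: float = 1.0     # hypothetical carrying capacity

    def step(self, burden: float, kill_rate: float = 0.0) -> float:
        """Advance the state one step under an optional intervention."""
        growth = self.growth_rate * burden * (1.0 - burden / self.capacity)
        return max(0.0, burden + growth - kill_rate * burden)

    def simulate(self, burden0: float, steps: int, kill_rate: float = 0.0) -> list[float]:
        """Roll the state forward under a fixed intervention strength."""
        trajectory = [burden0]
        for _ in range(steps):
            trajectory.append(self.step(trajectory[-1], kill_rate))
        return trajectory


model = DiseaseModel()
untreated = model.simulate(burden0=0.1, steps=20)
treated = model.simulate(burden0=0.1, steps=20, kill_rate=0.5)
```

The point of the sketch is only that the intervention enters as an explicit argument to the dynamics, so a `kill_rate` never observed before can still be simulated. Whether such a simulation is trustworthy depends entirely on whether the assumed dynamics are right.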
If this is right
- Value shifts from data processing and prediction to supporting intervention decisions.
- By construction, predictive models cannot generalize to unobserved interventions.
- Biomedical AI development will focus on explicit modeling of dynamics and perturbation responses.
- Decision-making will prioritize simulation of 'what happens if' over 'what is likely based on the past'.
Where Pith is reading between the lines
- This view implies a need for closer integration between AI and mechanistic modeling approaches in systems biology.
- It raises questions about how to validate interventional models when true intervention data is scarce or unethical to obtain directly.
- The framework could extend to other domains like policy-making where decisions involve novel actions without historical precedents.
Load-bearing premise
Predictive architectures that learn statistical associations from historical data are structurally incapable of representing how biological systems evolve under perturbation by novel interventions.
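One common way to state this premise precisely borrows the do-notation from the causal inference literature. This is an assumption made here for illustration; the summary above does not say the paper adopts this formalism. An observational learner estimates a conditional distribution, while an intervention decision needs the interventional one:

```latex
\[
\underbrace{P(Y \mid X = x)}_{\text{observational}}
\qquad \text{versus} \qquad
\underbrace{P(Y \mid \mathrm{do}(X = x))}_{\text{interventional}}
\]

% The two can be linked when a set of covariates Z satisfies the backdoor criterion:
\[
P(Y \mid \mathrm{do}(X = x)) \;=\; \sum_{z} P(Y \mid X = x, Z = z)\, P(Z = z)
\]
```

Here $Y$ is the outcome, $X$ the treatment, and $Z$ a set of measured confounders. The second identity holds only under its stated condition, which is why the premise is contestable rather than automatic: when the condition is met, an interventional quantity is recoverable from observational data.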
What would settle it
A direct test: train a predictive model on historical data, then check whether it accurately predicts the effects of an intervention or therapy that was absent from the training data. Failure would support the claim; success would challenge it.
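The shape of such a test can be sketched on a toy problem. The setup below is entirely hypothetical: a saturating dose-response curve (`true_response`) plays the role of biology, the "historical" data covers only low doses, and a straight-line fit plays the role of the predictive model. It is built so that extrapolation fails, so it illustrates the test protocol and is not evidence for or against the paper's claim.

```python
def true_response(dose: float, emax: float = 1.0, ec50: float = 2.0) -> float:
    """Hypothetical ground truth: a saturating (Emax-style) dose-response curve."""
    return emax * dose / (ec50 + dose)


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# "Historical" data: only doses between 0.0 and 1.0 were ever observed.
train_doses = [i / 10 for i in range(11)]
train_outcomes = [true_response(d) for d in train_doses]
intercept, slope = fit_line(train_doses, train_outcomes)


def predict(dose: float) -> float:
    """Predictive model learned from the historical dose range only."""
    return intercept + slope * dose


# Error inside the observed range versus error at a dose never seen in training.
in_range_error = abs(predict(0.5) - true_response(0.5))
novel_dose = 10.0
novel_error = abs(predict(novel_dose) - true_response(novel_dose))
```

In this toy case the fit is close within the observed range and far off at the novel dose. A real version of the test would need held-out interventions with known outcomes, which is the hard part.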
Original abstract
Artificial intelligence has advanced rapidly in biomedicine through large-scale multimodal data integration, enabling increasingly accurate prediction of clinical outcomes and patient stratification. These systems, however, remain fundamentally observational: they learn statistical associations from historical data and operate within previously observed biological and clinical states, limiting their ability to generalize to novel therapies or unobserved interventions. We argue that AI in biomedicine is undergoing a structural transition. As biomedical decision-making increasingly depends on reasoning about intervention rather than extrapolation from past observations, predictive architectures become structurally insufficient. Systems that learn from historical data cannot, by construction, represent how biological systems evolve under perturbation, and therefore cannot reliably support decision-making in the presence of novel interventions. We introduce a conceptual framework distinguishing observational and interventional intelligence and define disease-level models as systems that explicitly represent the state, dynamics, and intervention response of biological processes. These models enable a shift from inference to simulation: reasoning about what will happen under intervention rather than what is likely based on the past. This transition also implies a shift in where value is created: from data processing and prediction toward systems that support and define decision-making under intervention. It follows directly from the structure of biomedical decision-making and defines the next stage of AI in medicine. Systems that cannot model intervention will be structurally excluded from decision-making.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that AI systems in biomedicine are transitioning from observational predictive models, which learn statistical associations from historical data, to interventional models. It introduces 'disease-level models' that explicitly represent biological states, dynamics, and responses to interventions, enabling simulation-based reasoning. The core thesis is that predictive architectures are structurally insufficient for decision-making involving novel interventions and will therefore be excluded from such roles.
Significance. If the central distinction holds, the framework could usefully redirect research priorities toward causal and simulation-based AI in biomedicine, particularly for supporting decisions on unseen therapies. The conceptual separation of observational versus interventional intelligence provides a clear lens for evaluating current limitations, though its influence would be strengthened by empirical grounding or formal criteria.
major comments (2)
- Abstract, second paragraph: The claim that systems learning statistical associations 'cannot, by construction, represent how biological systems evolve under perturbation' is load-bearing for the structural exclusion argument. It equates observational learning with inability to support intervention decisions but does not address or refute methods such as counterfactual estimation or transportability assumptions that derive interventional conclusions from observational data. This requires explicit justification to avoid appearing definitional rather than demonstrated.
- Section introducing the conceptual framework (following the abstract): The definition of disease-level models as systems that 'explicitly represent the state, dynamics, and intervention response' is presented without a precise characterization or comparison to related existing approaches (e.g., structural causal models or mechanistic simulations). Without this, it is difficult to assess whether the proposed shift constitutes a substantive advance or a relabeling of ongoing work in causal AI.
minor comments (2)
- Abstract, final sentence: The phrase 'It follows directly from the structure of biomedical decision-making' is vague; specify the particular structural features of decision-making that lead to this conclusion.
- Throughout the manuscript: Include citations to key literature on causal inference in biomedicine and examples of predictive models used for treatment effect estimation to contextualize the argument.
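The first major comment points to methods that derive interventional conclusions from observational data. A minimal sketch of one such method, backdoor adjustment over a single measured confounder, is given below. The counts are invented for illustration, and the sketch assumes that disease severity is the only confounder, which would have to be argued in any real application.

```python
# Hypothetical counts: (severity, treated) -> (n_patients, n_recovered).
# Severity 0 = mild, 1 = severe. Severe patients are treated more often.
counts = {
    (0, 1): (100, 90),
    (0, 0): (400, 320),
    (1, 1): (400, 200),
    (1, 0): (100, 40),
}


def naive_rate(treated: int) -> float:
    """Observed recovery rate P(recovered | treated), ignoring severity."""
    n = sum(c[0] for (_, x), c in counts.items() if x == treated)
    r = sum(c[1] for (_, x), c in counts.items() if x == treated)
    return r / n


def adjusted_rate(treated: int) -> float:
    """Backdoor-adjusted rate: sum over z of P(recovered | treated, z) * P(z)."""
    total = sum(n for n, _ in counts.values())
    rate = 0.0
    for z in (0, 1):
        p_z = sum(c[0] for (zz, _), c in counts.items() if zz == z) / total
        n, r = counts[(z, treated)]
        rate += (r / n) * p_z
    return rate


naive_effect = naive_rate(1) - naive_rate(0)           # association
adjusted_effect = adjusted_rate(1) - adjusted_rate(0)  # estimated effect of treating
```

With these invented counts the naive comparison makes treatment look harmful, while the adjusted estimate is positive. The sketch covers only a treatment that already appears in the data; it says nothing about a therapy absent from it.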
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback on our manuscript. Their comments have helped us identify areas where the conceptual distinctions can be clarified. We respond to each major comment below, indicating the changes we plan to make in the revised version.
- Referee: Abstract, second paragraph: The claim that systems learning statistical associations 'cannot, by construction, represent how biological systems evolve under perturbation' is load-bearing for the structural exclusion argument. It equates observational learning with inability to support intervention decisions but does not address or refute methods such as counterfactual estimation or transportability assumptions that derive interventional conclusions from observational data. This requires explicit justification to avoid appearing definitional rather than demonstrated.
  Authors: We appreciate the referee's observation that our claim requires further justification to distinguish it from existing methods. Our position is that while counterfactual estimation and transportability can provide interventional insights under specific assumptions derived from observational data, they do not enable the kind of open-ended simulation of novel perturbations that disease-level models aim to support. For truly novel interventions, these methods often require additional assumptions that may not be verifiable. We will revise the abstract to include a brief discussion of these methods and explain why they remain limited for the structural reasons outlined in the paper.
  Revision: yes
- Referee: Section introducing the conceptual framework (following the abstract): The definition of disease-level models as systems that 'explicitly represent the state, dynamics, and intervention response' is presented without a precise characterization or comparison to related existing approaches (e.g., structural causal models or mechanistic simulations). Without this, it is difficult to assess whether the proposed shift constitutes a substantive advance or a relabeling of ongoing work in causal AI.
  Authors: We agree that the definition would benefit from greater precision and explicit comparison. In the revised version, we will expand this section to include a more formal characterization of disease-level models, specifying their requirements for representing dynamics and intervention responses. We will also add a discussion comparing them to structural causal models, which typically do not simulate full temporal dynamics, and to mechanistic simulations, which may not incorporate machine learning from large datasets. This will clarify the distinct contribution of our framework.
  Revision: yes
Circularity Check
The core claim that predictive systems cannot support interventions follows tautologically from their definition as observational learners.
specific steps
- self-definitional [Abstract, second paragraph]
  "Systems that learn from historical data cannot, by construction, represent how biological systems evolve under perturbation, and therefore cannot reliably support decision-making in the presence of novel interventions."
  The paper first characterizes predictive architectures as learning statistical associations from historical data (observational only). It then claims that this definition entails they 'cannot, by construction' represent perturbations or support interventions. The structural exclusion is therefore equivalent to the premise by the paper's own wording, with no additional derivation or criterion provided.
full rationale
The paper's central derivation defines current biomedical AI as systems that 'learn statistical associations from historical data' and 'operate within previously observed biological and clinical states.' It then asserts that such systems 'cannot, by construction, represent how biological systems evolve under perturbation' and are therefore 'structurally insufficient' for interventional decision-making. This reduction makes the insufficiency follow directly from the definitional premise rather than from an independent argument, formal criterion, or external evidence showing why accurate outcome prediction would fail to support interventions. The transition to 'interventional intelligence' and 'disease-level models' is presented as following 'directly from the structure of biomedical decision-making,' but no separate justification is supplied beyond the initial framing. This matches a self-definitional pattern with moderate circularity; the rest of the framework (value shift, exclusion from decision-making) inherits the same linkage.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Predictive AI systems learn statistical associations from historical data and cannot represent evolution under perturbation.
invented entities (1)
- disease-level models: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction [unclear]
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Systems that learn from historical data cannot, by construction, represent how biological systems evolve under perturbation... We introduce a conceptual framework distinguishing observational and interventional intelligence and define disease-level models as systems that explicitly represent the state, dynamics, and intervention response...
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel [unclear]
  Relation between the paper passage and the cited Recognition theorem: unclear.
  The Disease Model Triad... State Representation, State Dynamics, Intervention Operator
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.