From Prediction to Intervention: The Evolution of AI in Biomedicine

Aleksandr Sarachakov; Alexander Bagaev; Andrew Feinberg; Ferran Prat; Michael Feinberg; Viktor Svekolkin

arxiv: 2605.16293 · v1 · pith:IQWFPEGUnew · submitted 2026-04-14 · 💻 cs.CY · cs.AI

Pith reviewed 2026-05-21 01:12 UTC · model grok-4.3

keywords AI biomedicine · interventional intelligence · observational models · disease-level models · biomedical decision making · intervention simulation · novel therapies

The pith

AI systems unable to model interventions will be structurally excluded from biomedical decision-making.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that AI in biomedicine is moving beyond observational prediction, which relies on statistical associations from historical data, to interventional intelligence capable of simulating biological responses to new therapies. This transition is necessary because medical decisions now often involve novel interventions where past observations provide no guidance. The authors introduce disease-level models that represent the state, dynamics, and intervention effects of biological processes, shifting from inference to simulation. A reader would care because this redefines the role of AI from analyzing existing patterns to enabling proactive decision support in uncharted clinical scenarios. If the claim holds, systems limited to prediction will no longer suffice for high-stakes biomedical choices.

Core claim

We argue that AI in biomedicine is undergoing a structural transition from observational systems that learn associations from historical data to interventional systems that model how biological processes respond to perturbations. Observational architectures are limited to previously observed states and cannot reliably generalize to novel therapies. Disease-level models explicitly represent state, dynamics, and intervention response to enable simulation of outcomes under intervention rather than extrapolation from the past. This follows from the structure of biomedical decision-making and means that systems unable to model intervention will be structurally excluded from it.

What carries the argument

The distinction between observational intelligence and interventional intelligence, with disease-level models serving as the mechanism to represent and simulate intervention responses in biological systems.
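
One way to make the distinction concrete is to write the two kinds of query side by side. The notation below borrows the do-operator from Pearl's causal framework, which the paper lists among its references; the symbols X (patient covariates), A (therapeutic action), and Y (outcome) are assumptions introduced here for illustration and are not the authors' own formalism.

```latex
% Observational query: an association estimated from historical records
P(Y \mid X = x,\ A = a)

% Interventional query: the outcome when action a is imposed on the system
P(Y \mid X = x,\ \mathrm{do}(A = a))
```

The two quantities coincide only under extra assumptions, such as the absence of unmeasured confounding, and the first is not estimable at all for an action a that never appears in the data.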

If this is right

Value shifts from data processing and prediction to supporting intervention decisions.
Predictive models cannot support generalization to unobserved interventions by construction.
Biomedical AI development will focus on explicit modeling of dynamics and perturbation responses.
Decision-making will prioritize simulation of 'what happens if' over 'what is likely based on past'.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This view implies a need for closer integration between AI and mechanistic modeling approaches in systems biology.
It raises questions about how to validate interventional models when true intervention data is scarce or unethical to obtain directly.
The framework could extend to other domains like policy-making where decisions involve novel actions without historical precedents.

Load-bearing premise

Predictive architectures that learn statistical associations from historical data are structurally incapable of representing how biological systems evolve under perturbation by novel interventions.

What would settle it

Training a predictive model on historical data and then testing whether it can accurately predict the effects of a completely new intervention or therapy that was not present in the training data would provide a direct test; failure to do so would support the claim while success would challenge it.
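
The test described above can be sketched as a leave-one-intervention-out evaluation. The code below is a minimal illustration, not the authors' protocol: the record layout, the function names, and the toy per-arm mean predictor are all hypothetical, and the outcome values are synthetic.

```python
from statistics import mean

def split_by_intervention(records, novel):
    """Hold out every record whose intervention equals `novel`."""
    train = [r for r in records if r["intervention"] != novel]
    test = [r for r in records if r["intervention"] == novel]
    return train, test

def fit_mean_by_intervention(train):
    """Toy associational model (hypothetical): mean outcome per seen arm,
    with the global training mean as the fallback for unseen arms."""
    by_arm = {}
    for r in train:
        by_arm.setdefault(r["intervention"], []).append(r["outcome"])
    table = {arm: mean(values) for arm, values in by_arm.items()}
    fallback = mean([r["outcome"] for r in train])
    return lambda intervention: table.get(intervention, fallback)

def held_out_mae(records, novel, fit=fit_mean_by_intervention):
    """Mean absolute error on an intervention absent from training."""
    train, test = split_by_intervention(records, novel)
    if not train or not test:
        raise ValueError("need records both with and without the novel arm")
    predict = fit(train)
    return mean([abs(predict(r["intervention"]) - r["outcome"]) for r in test])

# Synthetic records; the arm names and outcome values are invented.
records = [
    {"intervention": "drug_a", "outcome": 0.2},
    {"intervention": "drug_a", "outcome": 0.4},
    {"intervention": "drug_b", "outcome": 0.6},
    {"intervention": "drug_b", "outcome": 0.8},
    {"intervention": "drug_c", "outcome": 0.1},
    {"intervention": "drug_c", "outcome": 0.3},
]

mae_novel = held_out_mae(records, novel="drug_c")
print(f"MAE on held-out intervention: {mae_novel:.2f}")
```

The harness is agnostic about the result: any model can be passed as `fit`, and a low error on the held-out arm would count against the paper's premise just as a high error would count for it.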

Figures

Figures reproduced from arXiv: 2605.16293 by Aleksandr Sarachakov, Alexander Bagaev, Andrew Feinberg, Ferran Prat, Michael Feinberg, Viktor Svekolkin.

**Figure 1.** From observational to interventional intelligence in biomedicine. Current AI systems in biomedicine operate within an observational paradigm, learning statistical associations from historical multimodal data to generate predictions such as risk scores, classifications, and patient stratification. While highly effective within previously observed distributions, these systems are inherently constrained in t…

**Figure 2.** The Disease Model Triad. Disease-level models are defined by three interdependent components: state representation, which encodes the biological configuration of the system across tumor, immune, and microenvironmental dimensions; state dynamics, which describe how this state evolves over time under endogenous processes; and the intervention operator, which maps therapeutic actions to transitions in system…

**Figure 3.** From state representation to decision through simulation. Disease-level models operate by representing the biological state of a system, modeling its dynamics, and applying interventions as perturbations to that state. Simulation of these perturbations generates trajectories that reflect how disease evolves under different therapeutic actions. Decisions are then derived from these simulated trajectories …
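
The triad and the simulate-then-decide loop described in the captions for Figures 2 and 3 can be sketched in a few lines. This is a toy illustration only: the three state fields follow the tumor, immune, and microenvironment dimensions named in the caption, but the update rule, the coefficients, the intervention functions, and the decision criterion (lowest final tumor burden) are all assumptions invented here, not the paper's model.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List

@dataclass(frozen=True)
class State:
    """State representation (field meanings and units are hypothetical)."""
    tumor: float     # tumor burden
    immune: float    # immune activity
    microenv: float  # microenvironmental suppression, 0..1

def dynamics(s: State, dt: float = 1.0) -> State:
    """State dynamics: one step of an invented growth-versus-kill rule."""
    growth = 0.10 * s.tumor
    kill = 0.05 * s.immune * s.tumor * (1.0 - s.microenv)
    return replace(s, tumor=max(0.0, s.tumor + dt * (growth - kill)))

# Intervention operator: maps a therapeutic action to a state transition.
Intervention = Callable[[State], State]

def no_treatment(s: State) -> State:
    return s

def boost_immune(s: State) -> State:
    return replace(s, immune=s.immune * 2.0)

def relieve_suppression(s: State) -> State:
    return replace(s, microenv=s.microenv * 0.5)

def simulate(s0: State, intervention: Intervention, steps: int = 10) -> List[State]:
    """Apply the intervention once, then roll the dynamics forward."""
    s = intervention(s0)
    trajectory = [s]
    for _ in range(steps):
        s = dynamics(s)
        trajectory.append(s)
    return trajectory

def choose(s0: State, options: Dict[str, Intervention], steps: int = 10) -> str:
    """Decision rule (assumed): pick the lowest simulated final tumor burden."""
    final = {name: simulate(s0, op, steps)[-1].tumor for name, op in options.items()}
    return min(final, key=final.get)

s0 = State(tumor=1.0, immune=2.0, microenv=0.5)
options = {
    "no_treatment": no_treatment,
    "boost_immune": boost_immune,
    "relieve_suppression": relieve_suppression,
}
best = choose(s0, options)
print(best)
```

Keeping the intervention as a function on states, separate from the dynamics, is what lets a previously unseen action be simulated: a new therapy only needs a new operator, while the dynamics are reused unchanged.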

Original abstract

Artificial intelligence has advanced rapidly in biomedicine through large-scale multimodal data integration, enabling increasingly accurate prediction of clinical outcomes and patient stratification. These systems, however, remain fundamentally observational: they learn statistical associations from historical data and operate within previously observed biological and clinical states, limiting their ability to generalize to novel therapies or unobserved interventions. We argue that AI in biomedicine is undergoing a structural transition. As biomedical decision-making increasingly depends on reasoning about intervention rather than extrapolation from past observations, predictive architectures become structurally insufficient. Systems that learn from historical data cannot, by construction, represent how biological systems evolve under perturbation, and therefore cannot reliably support decision-making in the presence of novel interventions. We introduce a conceptual framework distinguishing observational and interventional intelligence and define disease-level models as systems that explicitly represent the state, dynamics, and intervention response of biological processes. These models enable a shift from inference to simulation -- reasoning about what will happen under intervention rather than what is likely based on the past. This transition also implies a shift in where value is created: from data processing and prediction toward systems that support and define decision-making under intervention. It follows directly from the structure of biomedical decision-making and defines the next stage of AI in medicine. Systems that cannot model intervention will be structurally excluded from decision-making.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that AI systems in biomedicine are transitioning from observational predictive models, which learn statistical associations from historical data, to interventional models. It introduces 'disease-level models' that explicitly represent biological states, dynamics, and responses to interventions, enabling simulation-based reasoning. The core thesis is that predictive architectures are structurally insufficient for decision-making involving novel interventions and will therefore be excluded from such roles.

Significance. If the central distinction holds, the framework could usefully redirect research priorities toward causal and simulation-based AI in biomedicine, particularly for supporting decisions on unseen therapies. The conceptual separation of observational versus interventional intelligence provides a clear lens for evaluating current limitations, though its influence would be strengthened by empirical grounding or formal criteria.

major comments (2)

Abstract, second paragraph: The claim that systems learning statistical associations 'cannot, by construction, represent how biological systems evolve under perturbation' is load-bearing for the structural exclusion argument. It equates observational learning with inability to support intervention decisions but does not address or refute methods such as counterfactual estimation or transportability assumptions that derive interventional conclusions from observational data. This requires explicit justification to avoid appearing definitional rather than demonstrated.
Section introducing the conceptual framework (following the abstract): The definition of disease-level models as systems that 'explicitly represent the state, dynamics, and intervention response' is presented without a precise characterization or comparison to related existing approaches (e.g., structural causal models or mechanistic simulations). Without this, it is difficult to assess whether the proposed shift constitutes a substantive advance or a relabeling of ongoing work in causal AI.
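
The referee's first point, that interventional conclusions can sometimes be derived from observational data, can be illustrated with a backdoor adjustment on a small synthetic table. Everything in this sketch is assumed for illustration: the counts are invented, Z stands for a single measured confounder (for example disease severity), and the adjustment is valid only if Z is the sole confounder.

```python
# (z, a) -> (number of patients, number recovered); synthetic counts.
# z: confounder stratum (0 = mild, 1 = severe), a: treatment (0 or 1).
counts = {
    (0, 1): (20, 18), (0, 0): (80, 64),
    (1, 1): (80, 40), (1, 0): (20, 8),
}

def naive_rate(a):
    """P(Y=1 | A=a): pools strata, so it inherits confounding by z."""
    n = sum(c[0] for (z, arm), c in counts.items() if arm == a)
    recovered = sum(c[1] for (z, arm), c in counts.items() if arm == a)
    return recovered / n

def adjusted_rate(a):
    """Backdoor estimate of P(Y=1 | do(A=a)) = sum_z P(z) * P(Y=1 | a, z)."""
    total = sum(n for n, _ in counts.values())
    rate = 0.0
    for z in (0, 1):
        p_z = sum(n for (zz, _), (n, _) in counts.items() if zz == z) / total
        n, recovered = counts[(z, a)]
        rate += p_z * (recovered / n)
    return rate

naive_effect = naive_rate(1) - naive_rate(0)
adjusted_effect = adjusted_rate(1) - adjusted_rate(0)
print(f"naive: {naive_effect:+.2f}, adjusted: {adjusted_effect:+.2f}")
```

In this table the pooled comparison suggests the treatment harms, while the adjusted estimate recovers the benefit present in both strata. The estimate covers only a treatment that already appears in the data; it says nothing about a therapy with no records.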

minor comments (2)

Abstract, final sentence: The phrase 'It follows directly from the structure of biomedical decision-making' is vague; specify the particular structural features of decision-making that lead to this conclusion.
Throughout the manuscript: Include citations to key literature on causal inference in biomedicine and examples of predictive models used for treatment effect estimation to contextualize the argument.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback on our manuscript. Their comments have helped us identify areas where the conceptual distinctions can be clarified. We respond to each major comment below, indicating the changes we plan to make in the revised version.

Referee: Abstract, second paragraph: The claim that systems learning statistical associations 'cannot, by construction, represent how biological systems evolve under perturbation' is load-bearing for the structural exclusion argument. It equates observational learning with inability to support intervention decisions but does not address or refute methods such as counterfactual estimation or transportability assumptions that derive interventional conclusions from observational data. This requires explicit justification to avoid appearing definitional rather than demonstrated.

Authors: We appreciate the referee's observation that our claim requires further justification to distinguish it from existing methods. Our position is that while counterfactual estimation and transportability can provide interventional insights under specific assumptions derived from observational data, they do not enable the kind of open-ended simulation of novel perturbations that disease-level models aim to support. For truly novel interventions, these methods often require additional assumptions that may not be verifiable. We will revise the abstract to include a brief discussion of these methods and explain why they remain limited for the structural reasons outlined in the paper.

Revision: yes

Referee: Section introducing the conceptual framework (following the abstract): The definition of disease-level models as systems that 'explicitly represent the state, dynamics, and intervention response' is presented without a precise characterization or comparison to related existing approaches (e.g., structural causal models or mechanistic simulations). Without this, it is difficult to assess whether the proposed shift constitutes a substantive advance or a relabeling of ongoing work in causal AI.

Authors: We agree that the definition would benefit from greater precision and explicit comparison. In the revised version, we will expand this section to include a more formal characterization of disease-level models, specifying their requirements for representing dynamics and intervention responses. We will also add a discussion comparing them to structural causal models, which typically do not simulate full temporal dynamics, and to mechanistic simulations, which may not incorporate machine learning from large datasets. This will clarify the distinct contribution of our framework.

Revision: yes

Circularity Check

1 step flagged

The core claim that predictive systems cannot support interventions follows tautologically from their definition as observational learners.

specific steps

self-definitional [Abstract, second paragraph]
"Systems that learn from historical data cannot, by construction, represent how biological systems evolve under perturbation, and therefore cannot reliably support decision-making in the presence of novel interventions."

The paper first characterizes predictive architectures as learning statistical associations from historical data (observational only). It then claims that this definition entails they 'cannot, by construction' represent perturbations or support interventions. The structural exclusion is therefore equivalent to the premise by the paper's own wording, with no additional derivation or criterion provided.

full rationale

The paper's central derivation defines current biomedical AI as systems that 'learn statistical associations from historical data' and 'operate within previously observed biological and clinical states.' It then asserts that such systems 'cannot, by construction, represent how biological systems evolve under perturbation' and are therefore 'structurally insufficient' for interventional decision-making. This reduction makes the insufficiency follow directly from the definitional premise rather than from an independent argument, formal criterion, or external evidence showing why accurate outcome prediction would fail to support interventions. The transition to 'interventional intelligence' and 'disease-level models' is presented as following 'directly from the structure of biomedical decision-making,' but no separate justification is supplied beyond the initial framing. This matches a self-definitional pattern with moderate circularity; the rest of the framework (value shift, exclusion from decision-making) inherits the same linkage.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim depends on the premise that observational learning cannot support intervention reasoning, presented as a structural property of current architectures without independent derivation or external benchmarks.

axioms (1)

domain assumption: Predictive AI systems learn statistical associations from historical data and cannot represent evolution under perturbation.
This premise is stated directly in the abstract as the basis for declaring predictive architectures structurally insufficient.

invented entities (1)

disease-level models (no independent evidence)
purpose: Systems that explicitly represent the state, dynamics, and intervention response of biological processes to enable simulation-based reasoning.
Introduced as a new conceptual category to bridge the gap between prediction and intervention; no independent evidence or falsifiable predictions are provided for their construction or performance.

pith-pipeline@v0.9.0 · 5777 in / 1293 out tokens · 43804 ms · 2026-05-21T01:12:59.512617+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "Systems that learn from historical data cannot, by construction, represent how biological systems evolve under perturbation... We introduce a conceptual framework distinguishing observational and interventional intelligence and define disease-level models as systems that explicitly represent the state, dynamics, and intervention response..."

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "The Disease Model Triad... State Representation, State Dynamics, Intervention Operator"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

10 extracted references · 10 canonical work pages

[1]

A guide to deep learning in healthcare

Andre Esteva, Alexandre Robicquet, Bharath Ramsundar, Volodymyr Kuleshov, Jeffrey DeFauw, Katherine Chou, Claire Cui, Greg Corrado, Sebastian Thrun, and Jeff Dean. A guide to deep learning in healthcare. Nature Medicine, 25:24-29, 2019. doi:10.1038/s41591-018-0316-z

work page doi:10.1038/s41591-018-0316-z 2019
[2]

World models

David Ha and Jürgen Schmidhuber. World models. In Advances in Neural Information Processing Systems (NeurIPS), 2018

work page 2018
[3]

A path towards autonomous machine intelligence

Yann LeCun. A path towards autonomous machine intelligence. Technical report, 2022. URL https://openreview.net/pdf?id=BZ5a1r-kVsf

work page 2022
[4]

Foundation models for generalist medical artificial intelligence

Michael Moor, Oishi Banerjee, Zahra Shakeri Hossein Abad, Harlan M. Krumholz, Aurelio Lozano, Curtis P. Langlotz, et al. Foundation models for generalist medical artificial intelligence. Nature, 616:259-265, 2023. doi:10.1038/s41586-023-05881-4

work page doi:10.1038/s41586-023-05881-4 2023
[5]

Can you trust your model's uncertainty? Evaluating predictive uncertainty under dataset shift

Yaniv Ovadia, Emily Fertig, Jie Ren, Zachary Nado, D. Sculley, Sebastian Nowozin, Joshua V. Dillon, Balaji Lakshminarayanan, and Jasper Snoek. Can you trust your model's uncertainty? Evaluating predictive uncertainty under dataset shift. In Advances in Neural Information Processing Systems (NeurIPS), 2019

work page 2019
[6]

Causality: Models, Reasoning, and Inference

Judea Pearl. Causality: Models, Reasoning, and Inference. Cambridge University Press, 2nd edition, 2009

work page 2009
[7]

The Book of Why: The New Science of Cause and Effect

Judea Pearl and Dana Mackenzie. The Book of Why: The New Science of Cause and Effect. Basic Books, 2018

work page 2018
[8]

Do ImageNet classifiers generalize to ImageNet?

Benjamin Recht, Rebecca Roelofs, Ludwig Schmidt, and Vaishaal Shankar. Do ImageNet classifiers generalize to ImageNet? In International Conference on Machine Learning (ICML), 2019

work page 2019
[9]

From development to deployment: Dataset shift, causality, and shift-stable models in health AI

Adarsh Subbaswamy and Suchi Saria. From development to deployment: Dataset shift, causality, and shift-stable models in health AI. Biostatistics, 21(2):345-352, 2020. doi:10.1093/biostatistics/kxz041

work page doi:10.1093/biostatistics/kxz041 2020
[10]

High-performance medicine: The convergence of human and artificial intelligence

Eric J. Topol. High-performance medicine: The convergence of human and artificial intelligence. Nature Medicine, 25:44-56, 2019. doi:10.1038/s41591-018-0300-7

work page doi:10.1038/s41591-018-0300-7 2019