Querying Counterfactuals on Tissue Graphs with Supervised Disentanglement

Abdul Moeed; Daniel Dimitrov; Marc Jan Bonder; Martin Rohbeck; Oliver Stegle; Pavlo Lutsik; Stefan Schrod

arxiv: 2606.08493 · v2 · pith:ZZ3AM5SAnew · submitted 2026-06-07 · 🧬 q-bio.GN · cs.LG· stat.ML

Querying Counterfactuals on Tissue Graphs with Supervised Disentanglement

Abdul Moeed , Stefan Schrod , Martin Rohbeck , Marc Jan Bonder , Pavlo Lutsik , Oliver Stegle , Daniel Dimitrov This is my paper

Pith reviewed 2026-06-27 17:43 UTC · model grok-4.3

classification 🧬 q-bio.GN cs.LGstat.ML

keywords tissue graphscounterfactual predictionspatial transcriptomicsdisentangled representationsgraph perturbationscell neighborhoodscancer subdomains

0 comments

The pith

Supervised disentanglement separates a cell's intrinsic state from its spatial neighbors to answer counterfactual queries on tissue graphs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper first defines tissue graph counterfactuals as interventions that either rewire cell-to-cell connections or change the expression profiles of neighboring cells. It then presents a framework that applies supervised disentanglement to factor each cell's representation into an intrinsic component and a spatial-context component, with the latter used as the conditioning input for generating the counterfactual expression. Benchmarks on more than 2.5 million spatially resolved cells from colorectal cancer and mouse brain show gains over both spatial and non-spatial baselines in perturbation accuracy, disentanglement quality, and scalability. The same separation also surfaces biologically distinct cancer subdomains without labels and supports targeted neighbor-change simulations.

Core claim

Tissue graph counterfactuals are formalized as two families of spatial interventions: edge perturbations that rewire neighbor relations and node perturbations that alter neighbor expression. Supervised disentanglement decomposes each cell into an intrinsic state vector and a spatial context vector; the spatial vector is then treated as a controllable conditioning variable so that new neighbor configurations produce predicted expression changes for the target cell.

What carries the argument

Supervised disentanglement that isolates a cell's intrinsic state from its spatial context so the spatial component can serve as a conditioning input for counterfactual prediction.

If this is right

The method outperforms both spatially aware and non-spatial baselines on in-silico edge and node perturbations across millions of cells.
Unsupervised clustering on the learned intrinsic states recovers biologically distinct cancer subdomains.
The framework supports direct simulation of targeted changes to a cell's neighbor set.
Runtime and memory scale to tissue graphs containing millions of spatially resolved cells.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same intrinsic-versus-context split could be tested on time-series spatial data to track how neighbor changes drive state transitions during disease progression.
If the separation holds, the model could be paired with perturbation screens to prioritize which neighbor alterations are most likely to shift a target cell's phenotype.
Extending the conditioning variable to include additional modalities such as chromatin state or metabolite levels would test whether spatial context alone captures the dominant drivers of expression change.

Load-bearing premise

Supervised disentanglement cleanly isolates a cell's intrinsic state from its spatial context without leaking biologically relevant information or degrading prediction accuracy.

What would settle it

Collect a new spatial dataset containing cells whose actual post-perturbation expression is known; run the model on the pre-perturbation graphs and check whether its predicted counterfactual expressions match the observed post-perturbation values within measurement error.

Figures

Figures reproduced from arXiv: 2606.08493 by Abdul Moeed, Daniel Dimitrov, Marc Jan Bonder, Martin Rohbeck, Oliver Stegle, Pavlo Lutsik, Stefan Schrod.

**Figure 1.** Figure 1: Tissue graph counterfactuals and Cellina overview. (Left) Two interventions on a focal cell v with neighbors u: edge perturbation rewires v’s neighborhood N (v) to a counterfactual neighbor pool P, and node perturbation alters neighbor expression on a feature (gene) subset S. (Right) Cellina encodes intrinsic identity z ∼ q(z | xv) and spatial representation s from v’s local neighborhood, and decodes p(x |… view at source ↗

**Figure 2.** Figure 2: Counterfactual expression prediction with Cellina. Observed IGF2 expression in Fibroblasts with preannotated spatial domains in a tissue section. Rewiring FibroblastsREF neighborhoods to those of FibroblastsCRC yields predicted IGF2 expression recapitulating CRC Fibroblasts. True spatial perturbations remain experimentally difficult to obtain at scale, with emerging spatial perturbation screens curren… view at source ↗

**Figure 3.** Figure 3: Cellina identifies spatial subdomains and enables pathway-specific perturbations in CRC. (a) UMAP of Cellina-generated Fibroblast counts; counterfactual cells (CRC1 CF, CRC2 CF) integrate with their respective observed CRC subdomain populations. (b) Spearman ρ for Cellinapredicted counterfactuals across cell types, comparing global (pre-annotated) CRC neighborhoods versus Cellina-identified subdomains. (c… view at source ↗

read the original abstract

Tissue graph counterfactuals ask how a cell's expression would change under altered spatial neighbor contexts. Such queries are central to predicting cell behavior in tissues, but lack a unified definition, with existing methods targeting specific intervention types or treating cells as i.i.d. In this work, we first formalize tissue graph counterfactuals as a class of spatial interventions that either rewire connections between cells (edge perturbation) or modify the expression of their neighbors (node perturbation). We then introduce Cellina (https://cellina.readthedocs.io) - a framework that uses supervised disentanglement to decompose a cell's intrinsic state from its spatial context, using the latter as a conditioning input for counterfactual predictions. Across benchmarks spanning over 2.5 million spatially-resolved cells in colorectal cancer and mouse brain, Cellina outperforms spatially-informed and non-spatial competitors in in-silico graph perturbations, disentanglement, and scalability. Additionally, we show that Cellina reveals biologically distinct cancer subdomains in an unsupervised manner and enables targeted neighbor perturbation simulations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper formalizes tissue-graph counterfactuals as edge or node perturbations and applies supervised disentanglement in Cellina to predict expression changes, but the abstract supplies no verifiable details on the method or benchmarks.

read the letter

The main point is that they give a clean definition of spatial counterfactuals on tissue graphs—either rewiring a cell's neighbors or altering neighbor expression—and then build Cellina to handle those queries by disentangling intrinsic cell state from spatial context.

The formalization itself is useful because prior work handled specific interventions without a shared language. Scaling the evaluation to 2.5 million cells across colorectal cancer and mouse brain datasets is also a concrete step, and the claim that it beats both spatial and non-spatial baselines on in-silico perturbations plus disentanglement metrics is worth checking. The unsupervised subdomain discovery in cancer tissue is an extra angle that could interest biologists.

The soft spot is that everything rests on the disentanglement step succeeding without leakage or loss of relevant signal, yet the abstract gives no equations, training details, or ablation results to judge whether that holds. No error bars, split information, or benchmark construction notes appear either, so the outperformance numbers cannot be assessed from what is here. The assumption that spatial context can be cleanly isolated for conditioning looks plausible on paper but needs the full methods to evaluate.

This is for people already working on spatial transcriptomics or graph models of tissues who want a unified way to run neighbor-perturbation simulations. A reader focused on that niche would get value from the framing even if the empirical claims require confirmation.

I would send it to peer review so the implementation and numbers can be examined properly.

Referee Report

2 major / 1 minor

Summary. The manuscript formalizes tissue graph counterfactuals as a class of spatial interventions consisting of edge perturbations (rewiring cell connections) or node perturbations (modifying neighbor expression). It introduces Cellina, a framework that applies supervised disentanglement to decompose each cell's intrinsic state from its spatial context and uses the spatial component as a conditioning input for counterfactual predictions. On benchmarks covering >2.5 million spatially resolved cells from colorectal cancer and mouse brain, Cellina is reported to outperform both spatially-informed and non-spatial baselines in in-silico perturbation accuracy, disentanglement quality, and scalability; additional results show unsupervised identification of biologically distinct cancer subdomains and targeted neighbor-perturbation simulations.

Significance. If the central claims hold, the work supplies a unified, scalable approach to counterfactual reasoning on tissue graphs, addressing a gap between existing methods that target narrow intervention types or ignore spatial structure. The scale of the evaluated datasets and the demonstration of unsupervised subdomain discovery constitute concrete strengths. The supervised disentanglement strategy, if shown to avoid leakage, could become a reusable technique for spatial transcriptomics.

major comments (2)

[Abstract] Abstract: the central claim that supervised disentanglement cleanly separates intrinsic state from spatial context (enabling leakage-free conditioning for counterfactuals) is load-bearing for all reported performance gains, yet the abstract supplies no quantitative validation, ablation, or leakage diagnostic; without these, the outperformance cannot be assessed.
[Abstract] Abstract: the statement that Cellina 'outperforms' competitors across in-silico graph perturbations, disentanglement, and scalability is presented without reference to error bars, data-split protocols, number of replicates, or statistical tests; this information is required to evaluate whether the reported superiority is robust.

minor comments (1)

The provision of a documentation link (cellina.readthedocs.io) is helpful for reproducibility and should be retained.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments regarding the abstract. The abstract is intended as a concise summary of the work, with full quantitative details, protocols, and diagnostics provided in the main text and supplementary materials. We address each point below and indicate where revisions can be made.

read point-by-point responses

Referee: [Abstract] Abstract: the central claim that supervised disentanglement cleanly separates intrinsic state from spatial context (enabling leakage-free conditioning for counterfactuals) is load-bearing for all reported performance gains, yet the abstract supplies no quantitative validation, ablation, or leakage diagnostic; without these, the outperformance cannot be assessed.

Authors: The abstract summarizes the core contribution without space for full metrics. The manuscript provides quantitative validation of the disentanglement (via performance on counterfactual tasks, ablation studies removing the spatial component, and leakage checks through cross-validation of intrinsic vs. context factors) in the Results and Methods sections, supported by experiments on >2.5M cells. We can revise the abstract to include a brief clause referencing these validations (e.g., 'validated via ablations and leakage diagnostics on large-scale benchmarks') if the editor permits an expanded abstract. revision: partial
Referee: [Abstract] Abstract: the statement that Cellina 'outperforms' competitors across in-silico graph perturbations, disentanglement, and scalability is presented without reference to error bars, data-split protocols, number of replicates, or statistical tests; this information is required to evaluate whether the reported superiority is robust.

Authors: The abstract reports the high-level outcome; the full experimental details—including multiple data splits, replicate counts, error bars (standard deviations across runs), and statistical tests (e.g., paired t-tests or Wilcoxon tests)—are documented in the Methods, Results, and Supplementary sections. We can add a short parenthetical in the abstract (e.g., 'with statistical significance across replicates') or a reference to the supplementary materials to address this. revision: partial

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The provided abstract and description formalize tissue graph counterfactuals as edge/node perturbations and introduce Cellina via supervised disentanglement for conditioning predictions. No equations, derivations, or self-citations are exhibited that reduce any claimed prediction or result to fitted inputs or prior author work by construction. The central claims rest on external benchmark evaluations across 2.5M cells rather than internal self-definition or renaming. The derivation chain is self-contained against the stated benchmarks with no load-bearing circular steps identifiable from the text.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only abstract available; no equations, parameters, or modeling details provided to identify free parameters, axioms, or invented entities.

pith-pipeline@v0.9.1-grok · 5731 in / 1025 out tokens · 14726 ms · 2026-06-27T17:43:41.402044+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

14 extracted references · 3 canonical work pages

[1]

How attentive are graph attention networks?arXiv preprint arXiv:2105.14491,

Shaked Brody, Uri Alon, and Eran Yahav. How attentive are graph attention networks?arXiv preprint arXiv:2105.14491,

Pith/arXiv arXiv
[2]

doi: 10.1016/j.cell.2024.11.015

ISSN 0092-8674. doi: 10.1016/j.cell.2024.11.015. URLhttps://doi.org/10.1016/j.cell.2024.11.015. Helena L Crowell, Irene Ruano, Zedong Hu, Yourae Hong, Gin Caratù, Hubert Piessevaux, Ash- ley Heck, Rachel Liu, Max Walter, Megan Vandenberg, et al. Tracing colorectal malignancy transformation from cell to tissue scale.bioRxiv,

work page doi:10.1016/j.cell.2024.11.015 2024
[3]

Fast graph representation learning with pytorch geometric.arXiv preprint arXiv:1903.02428,

11 Matthias Fey and Jan Eric Lenssen. Fast graph representation learning with pytorch geometric.arXiv preprint arXiv:1903.02428,

Pith/arXiv arXiv 1903
[4]

Strand: Sequence-conditioned transport for single-cell perturbations

Boyang Fu, George Dasoulas, Sameer Gabbita, Xiang Lin, Shanghua Gao, Xiaorui Su, Soumya Ghosh, and Marinka Zitnik. Strand: Sequence-conditioned transport for single-cell perturbations. arXiv preprint arXiv:2602.10156,

arXiv
[5]

Demystifying inter-class disentanglement.arXiv preprint arXiv:1906.11796,

Aviv Gabbay and Yedid Hoshen. Demystifying inter-class disentanglement.arXiv preprint arXiv:1906.11796,

arXiv 1906
[6]

Ilyes Khemakhem, Diederik Kingma, Ricardo Monti, and Aapo Hyvarinen

URL https://proceedings.neurips.cc/paper_files/paper/2022/ file/aa933b5abc1be30baece1d230ec575a7-Paper-Conference.pdf. Ilyes Khemakhem, Diederik Kingma, Ricardo Monti, and Aapo Hyvarinen. Variational autoencoders and nonlinear ica: A unifying framework. InInternational conference on artificial intelligence and statistics, pages 2207–2217. PMLR,

2022
[8]

Diederik P Kingma and Max Welling

URLhttps://arxiv.org/abs/2004.11362. Diederik P Kingma and Max Welling. Auto-encoding variational bayes.arXiv preprint arXiv:1312.6114,

arXiv 2004
[9]

Variational graph auto-encoders.arXiv preprint arXiv:1611.07308,

Thomas N Kipf and Max Welling. Variational graph auto-encoders.arXiv preprint arXiv:1611.07308,

Pith/arXiv arXiv
[10]

doi: 10.1038/s41592-021-01336-8

ISSN 1548-7105. doi: 10.1038/s41592-021-01336-8. URLhttps://doi.org/10.1038/s41592-021-01336-8. Jing Ma, Ruocheng Guo, Saumitra Mishra, Aidong Zhang, and Jundong Li. Clear: Generative counterfactual explanations on graphs.Advances in neural information processing systems, 35: 25895–25907,

work page doi:10.1038/s41592-021-01336-8
[11]

URL https://proceedings.neurips.cc/paper_files/paper/2024/file/ 24c4d51f3ef48dd2dbab78243ecb26a1-Paper-Datasets_and_Benchmarks_Track.pdf

doi: 10.52202/079017-0650. URL https://proceedings.neurips.cc/paper_files/paper/2024/file/ 24c4d51f3ef48dd2dbab78243ecb26a1-Paper-Datasets_and_Benchmarks_Track.pdf. Jovan Tanevski, Ricardo Omar Ramirez Flores, Attila Gabor, Denis Schapiro, and Julio Saez- Rodriguez. Explainable multiview framework for dissecting spatial relationships from highly multiplex...

work page doi:10.52202/079017-0650 2024
[12]

Perturbench: Benchmarking machine learning models for cellular perturbation analysis.arXiv preprint arXiv:2408.10609,

Yan Wu, Esther Wershof, Sebastian M Schmon, Marcel Nassar, Bła˙zej Osi´nski, Ridvan Eksi, Zichao Yan, Rory Stark, Kun Zhang, and Thore Graepel. Perturbench: Benchmarking machine learning models for cellular perturbation analysis.arXiv preprint arXiv:2408.10609,

arXiv
[13]

We convert both to normalized expression as pg,v = cg,vP g′ cg′,v ·ℓ 0 We consider perturbation settings where each gene is evaluated between a control and a perturbed condition

Let c(obs) g,v denote the observed raw count of gene g in cell v, and c(pred) g,v the corresponding model-predicted raw count. We convert both to normalized expression as pg,v = cg,vP g′ cg′,v ·ℓ 0 We consider perturbation settings where each gene is evaluated between a control and a perturbed condition. The log-fold change (logFC) for genegis defined as ...

arXiv 2024
[14]

Bio Conservation

were not used in our analyses for the following reasons: slide 110 contained sequencing artefacts in the form of major empty patches disrupting neighborhood computations, while slide 222 did not contain any REF cells. For evaluations, we merged fine-grained subtypes (e.g., epithelial subpopulations annotated as Epi1-Epi4) into broad cell type categories. ...

2023
[15]

with subsampling to 103, 104 and 105 cells (Figure A6). For a fair comparison, we set the same batch size and number of epochs for each model and only compare the train loop (without pre-processing workloads).Cellinacomes out as one of the fastest-to-train models in the benchmark suite, owing to the efficient pseudobulk-based φ(v) computed a priori, omitt...

2023

[1] [1]

How attentive are graph attention networks?arXiv preprint arXiv:2105.14491,

Shaked Brody, Uri Alon, and Eran Yahav. How attentive are graph attention networks?arXiv preprint arXiv:2105.14491,

Pith/arXiv arXiv

[2] [2]

doi: 10.1016/j.cell.2024.11.015

ISSN 0092-8674. doi: 10.1016/j.cell.2024.11.015. URLhttps://doi.org/10.1016/j.cell.2024.11.015. Helena L Crowell, Irene Ruano, Zedong Hu, Yourae Hong, Gin Caratù, Hubert Piessevaux, Ash- ley Heck, Rachel Liu, Max Walter, Megan Vandenberg, et al. Tracing colorectal malignancy transformation from cell to tissue scale.bioRxiv,

work page doi:10.1016/j.cell.2024.11.015 2024

[3] [3]

Fast graph representation learning with pytorch geometric.arXiv preprint arXiv:1903.02428,

11 Matthias Fey and Jan Eric Lenssen. Fast graph representation learning with pytorch geometric.arXiv preprint arXiv:1903.02428,

Pith/arXiv arXiv 1903

[4] [4]

Strand: Sequence-conditioned transport for single-cell perturbations

Boyang Fu, George Dasoulas, Sameer Gabbita, Xiang Lin, Shanghua Gao, Xiaorui Su, Soumya Ghosh, and Marinka Zitnik. Strand: Sequence-conditioned transport for single-cell perturbations. arXiv preprint arXiv:2602.10156,

arXiv

[5] [5]

Demystifying inter-class disentanglement.arXiv preprint arXiv:1906.11796,

Aviv Gabbay and Yedid Hoshen. Demystifying inter-class disentanglement.arXiv preprint arXiv:1906.11796,

arXiv 1906

[6] [6]

Ilyes Khemakhem, Diederik Kingma, Ricardo Monti, and Aapo Hyvarinen

URL https://proceedings.neurips.cc/paper_files/paper/2022/ file/aa933b5abc1be30baece1d230ec575a7-Paper-Conference.pdf. Ilyes Khemakhem, Diederik Kingma, Ricardo Monti, and Aapo Hyvarinen. Variational autoencoders and nonlinear ica: A unifying framework. InInternational conference on artificial intelligence and statistics, pages 2207–2217. PMLR,

2022

[7] [8]

Diederik P Kingma and Max Welling

URLhttps://arxiv.org/abs/2004.11362. Diederik P Kingma and Max Welling. Auto-encoding variational bayes.arXiv preprint arXiv:1312.6114,

arXiv 2004

[8] [9]

Variational graph auto-encoders.arXiv preprint arXiv:1611.07308,

Thomas N Kipf and Max Welling. Variational graph auto-encoders.arXiv preprint arXiv:1611.07308,

Pith/arXiv arXiv

[9] [10]

doi: 10.1038/s41592-021-01336-8

ISSN 1548-7105. doi: 10.1038/s41592-021-01336-8. URLhttps://doi.org/10.1038/s41592-021-01336-8. Jing Ma, Ruocheng Guo, Saumitra Mishra, Aidong Zhang, and Jundong Li. Clear: Generative counterfactual explanations on graphs.Advances in neural information processing systems, 35: 25895–25907,

work page doi:10.1038/s41592-021-01336-8

[10] [11]

URL https://proceedings.neurips.cc/paper_files/paper/2024/file/ 24c4d51f3ef48dd2dbab78243ecb26a1-Paper-Datasets_and_Benchmarks_Track.pdf

doi: 10.52202/079017-0650. URL https://proceedings.neurips.cc/paper_files/paper/2024/file/ 24c4d51f3ef48dd2dbab78243ecb26a1-Paper-Datasets_and_Benchmarks_Track.pdf. Jovan Tanevski, Ricardo Omar Ramirez Flores, Attila Gabor, Denis Schapiro, and Julio Saez- Rodriguez. Explainable multiview framework for dissecting spatial relationships from highly multiplex...

work page doi:10.52202/079017-0650 2024

[11] [12]

Perturbench: Benchmarking machine learning models for cellular perturbation analysis.arXiv preprint arXiv:2408.10609,

Yan Wu, Esther Wershof, Sebastian M Schmon, Marcel Nassar, Bła˙zej Osi´nski, Ridvan Eksi, Zichao Yan, Rory Stark, Kun Zhang, and Thore Graepel. Perturbench: Benchmarking machine learning models for cellular perturbation analysis.arXiv preprint arXiv:2408.10609,

arXiv

[12] [13]

We convert both to normalized expression as pg,v = cg,vP g′ cg′,v ·ℓ 0 We consider perturbation settings where each gene is evaluated between a control and a perturbed condition

Let c(obs) g,v denote the observed raw count of gene g in cell v, and c(pred) g,v the corresponding model-predicted raw count. We convert both to normalized expression as pg,v = cg,vP g′ cg′,v ·ℓ 0 We consider perturbation settings where each gene is evaluated between a control and a perturbed condition. The log-fold change (logFC) for genegis defined as ...

arXiv 2024

[13] [14]

Bio Conservation

were not used in our analyses for the following reasons: slide 110 contained sequencing artefacts in the form of major empty patches disrupting neighborhood computations, while slide 222 did not contain any REF cells. For evaluations, we merged fine-grained subtypes (e.g., epithelial subpopulations annotated as Epi1-Epi4) into broad cell type categories. ...

2023

[14] [15]

with subsampling to 103, 104 and 105 cells (Figure A6). For a fair comparison, we set the same batch size and number of epochs for each model and only compare the train loop (without pre-processing workloads).Cellinacomes out as one of the fastest-to-train models in the benchmark suite, owing to the efficient pseudobulk-based φ(v) computed a priori, omitt...

2023