Modeling Cell-Cycle-Aware Single-Cell Drug Perturbation Responses

Dingping Zhao; Jie Lin

arxiv: 2606.30695 · v1 · pith:UKNTXEYXnew · submitted 2026-06-29 · 🧬 q-bio.QM · cs.AI

Pith reviewed 2026-07-01 01:53 UTC · model grok-4.3

keywords: single-cell perturbation · cell cycle · drug response · gene expression prediction · phase prediction · circular supervision · out-of-distribution

The pith

A closed-loop cell-cycle head that derives supervision from predicted treated expression improves both gene-expression and phase predictions in single-cell drug perturbation models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that cell-cycle phase changes should be treated as a primary prediction target rather than nuisance variation. It does so by deriving supervision signals from the model's own predictions of treated gene expression and feeding them into a learnable circular head that outputs G1/S/G2M phase probabilities. This approach yields better out-of-distribution transcriptional predictions on a large benchmark of drug-treated cells than models that simply condition on cell-cycle state as an input covariate. A sympathetic reader would care because many drugs alter cell proliferation, yet existing perturbation models rarely forecast these shifts explicitly.
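The closed-loop idea described above can be pictured with a minimal sketch: a phase label is read off the model's own predicted treated expression using a score-and-threshold rule, and that label then serves as the training target for the phase head. The marker subsets, the `phase_from_expression` helper, and the background-subtraction scoring are illustrative assumptions; the abstract does not say how scCycleMol computes its targets.

```python
from statistics import mean

# Small illustrative subsets of widely used S and G2/M marker genes.
# Assumption: real pipelines use much longer marker lists.
S_MARKERS = ("MCM2", "PCNA", "TYMS")
G2M_MARKERS = ("TOP2A", "MKI67", "CDK1")


def phase_from_expression(expr):
    """Assign G1/S/G2M from one expression profile (gene -> value).

    Simplified scoring: mean marker expression minus the mean over all
    genes in the profile. Hypothetical helper, not the paper's method.
    """
    background = mean(expr.values())
    s_score = mean(expr[g] for g in S_MARKERS) - background
    g2m_score = mean(expr[g] for g in G2M_MARKERS) - background
    if s_score < 0 and g2m_score < 0:
        return "G1"  # neither program is elevated
    return "G2M" if g2m_score > s_score else "S"


# Made-up "predicted treated expression" for a single cell.
predicted_treated = {
    "MCM2": 3.0, "PCNA": 3.0, "TYMS": 3.0,
    "TOP2A": 0.5, "MKI67": 0.5, "CDK1": 0.5,
    "ACTB": 1.0, "GAPDH": 1.0,
}

# In a closed loop, this label would supervise the phase head.
phase_target = phase_from_expression(predicted_treated)
```

Because the target is a function of the prediction, any systematic bias in the predicted marker genes flows directly into the label, which is the concern raised under the load-bearing premise.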

Core claim

scCycleMol derives cell-cycle supervision from its own predicted treated expression and propagates it through a learnable full-expression cell-cycle head with circular G1/S/G2M phase targets, producing improved out-of-distribution expression prediction on a SciPlex3 benchmark with over 600k cells and 186 perturbation conditions compared with conditional perturbation baselines.

What carries the argument

The circular cell-cycle head that receives closed-loop supervision derived from the model's predicted treated expression and produces phase targets for the full expression profile.
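To make "circular phase targets" concrete, here is a hedged sketch of one common way to treat phases as points on a circle. The even spacing of G1, S, and G2M, the `1 - cos` loss, and the `sharpness` parameter are all assumptions chosen for illustration; the paper's actual head and loss are not specified in the abstract.

```python
import math

PHASES = ("G1", "S", "G2M")
# Assumption: phases sit at evenly spaced angles on the unit circle.
PHASE_ANGLE = {p: 2.0 * math.pi * i / len(PHASES) for i, p in enumerate(PHASES)}


def circular_loss(pred_angle, target_phase):
    """1 - cos(angular difference): 0 when aligned, 2 when opposite.

    The cosine makes the loss periodic, so an angle just below 2*pi is
    treated as close to G1 (angle 0) rather than far from it.
    """
    return 1.0 - math.cos(pred_angle - PHASE_ANGLE[target_phase])


def phase_probabilities(pred_angle, sharpness=4.0):
    """Softmax over cosine similarity to each phase anchor."""
    logits = {p: sharpness * math.cos(pred_angle - a) for p, a in PHASE_ANGLE.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {p: math.exp(v - top) for p, v in logits.items()}
    total = sum(exps.values())
    return {p: e / total for p, e in exps.items()}
```

The point of the circular form is that G2M and G1 are neighbors, as they are biologically, whereas an ordinary three-class encoding has no notion of adjacency between the last and first phase.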

If this is right

Out-of-distribution expression prediction improves over conditional perturbation baselines.
Cell-cycle phase accuracy increases while expression prediction remains nearly unchanged.
Pretraining on external datasets further enhances both expression and phase performance.
The framework applies across multiple cancer cell lines and thousands of genes under standardized dose and molecule metadata.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The closed-loop strategy could be applied to supervising other latent states that shift under perturbation without requiring separate labels.
Accurate phase forecasts may help identify which drugs selectively affect dividing versus quiescent cells in mixed populations.
The method opens a route to testing whether explicit phase modeling changes which gene programs are recovered as drivers of drug response.

Load-bearing premise

Supervision signals derived from the model's own predicted treated expression provide unbiased cell-cycle targets that do not reinforce the model's errors through the closed loop.

What would settle it

Replacing the self-derived cell-cycle targets with independent experimental phase measurements on treated cells and checking whether the reported gains in expression and phase prediction disappear.

Figures

Figures reproduced from arXiv: 2606.30695 by Dingping Zhao, Jie Lin.

**Figure 2.** Dose-resolved OOD r² distributions on SciPlex3. Each box summarizes cell-line–molecule combinations at one treatment dose for all modeled genes or DE genes. The zero-dose baseline predicts treated expression from matched DMSO controls

**Figure 3.** Expression–cell-cycle trade-off across objectives and pretraining sources. Points farther right

**Figure 4.** Cell-cycle supervision ablations on the OOD split. (a) Allowing the circular phase loss to update

Abstract

Single-cell drug perturbation models should predict not only transcriptional response magnitude, but also whether a treatment alters the proliferative state of a cell. This is challenging because cell-cycle variation is often treated as nuisance variation, and benchmark pipelines rarely treat drug-induced phase changes as a primary prediction target. We introduce scCycleMol, a cell-cycle-aware perturbation prediction framework built on a curated 24-hour SciPlex3 benchmark with standardized molecule identities, dose and cell-line metadata, and gene expression with cell-cycle supervision derived from treated states. Instead of using cell-cycle state as an input covariate, scCycleMol derives supervision from predicted treated expression and propagates it through a learnable full-expression cell-cycle head with circular G1/S/G2M phase targets. We evaluate marker-based supervision, molecular representations, and pretraining strategies to isolate sources of improvement. Across a SciPlex3 benchmark with over 600k cells, 186 perturbation conditions, multiple cancer cell lines, and thousands of genes, scCycleMol improves out-of-distribution expression prediction compared with conditional perturbation baselines. The best LINCS-pretrained circular model achieves 0.9093 expected all-gene r squared and 0.6843 expected differentially expressed gene r squared, compared with 0.6800 and 0.5400 for LINCS-pretrained ChemCPA. Closed-loop cell-cycle supervision improves phase accuracy by about 0.5 to 0.6 points while maintaining nearly unchanged expression prediction. A Tahoe-pretrained variant reaches 0.9609 phase accuracy, highlighting the benefit of explicit cell-cycle-aware supervision in perturbation modeling.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

scCycleMol's closed-loop cell-cycle head delivers reported gains on SciPlex3 but the self-derived targets leave open the risk of reinforcing prediction errors.

The key point is that this paper adds a learnable circular cell-cycle head whose targets are pulled from the model's own predicted treated expression, and it reports clear numerical lifts over ChemCPA on a large SciPlex3 benchmark.

What is new is the explicit closed-loop setup: cell-cycle supervision is not taken from observed labels but generated from the treated predictions and fed back through a full-expression phase head with circular G1/S/G2M targets. The work also evaluates marker-based supervision, different molecular representations, and pretraining on LINCS versus Tahoe. On a benchmark of over 600k cells, the best LINCS-pretrained version reaches an expected all-gene r² of 0.9093 and an expected r² of 0.6843 on differentially expressed genes, versus 0.6800 and 0.5400 for LINCS-pretrained ChemCPA. Closed-loop supervision adds a further gain of about 0.5 to 0.6 points in phase accuracy.

The paper does a reasonable job of curating the benchmark with standardized metadata and running ablations to isolate the cell-cycle component. Treating phase change as a primary target rather than nuisance variation is a sensible shift for cancer-relevant perturbation modeling.

The soft spot is the circularity. Because the phase targets come directly from the model's treated predictions, any early bias in expression estimates can be fed back as supervision. The abstract supplies no external anchor such as orthogonal phase measurements or held-out ground-truth labels, so it is unclear whether the loop corrects errors or simply locks them in. No error bars, statistical tests, or split details are mentioned either, which makes the size of the gains harder to judge.

This is for groups already working on single-cell drug response models who want to incorporate proliferation state explicitly. A reader focused on architecture choices and pretraining effects will find the comparisons useful.

It deserves peer review because the benchmark scale and the concrete idea are worth checking in detail, even if the closed-loop assumption needs scrutiny.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces scCycleMol, a cell-cycle-aware framework for predicting single-cell drug perturbation responses on a curated 24-hour SciPlex3 benchmark (>600k cells, 186 conditions). Rather than treating cell-cycle state as an input covariate, it derives circular G1/S/G2M phase targets from the model's own predicted treated expression, propagates them through a learnable full-expression cell-cycle head, and reports that the best LINCS-pretrained circular variant achieves expected all-gene r² of 0.9093 and DE-gene r² of 0.6843 (vs. 0.6800/0.5400 for LINCS-pretrained ChemCPA), with an additional 0.5–0.6 point gain in phase accuracy from the closed-loop supervision.

Significance. If the closed-loop supervision can be shown to supply unbiased targets, the approach would usefully shift perturbation modeling from treating proliferative-state changes as nuisance variation to treating them as an explicit, jointly optimized prediction target, with potential downstream value for understanding drug effects on cell proliferation across cancer cell lines.

major comments (2)

[Abstract] Abstract: the headline gains are attributed to closed-loop cell-cycle supervision derived from the model's predicted treated expression, yet the abstract supplies no external anchor (orthogonal phase measurements, held-out ground-truth labels, or ablation of the feedback loop) to demonstrate that the targets remain unbiased and do not reinforce early prediction errors.
[Abstract] Abstract: no statistical testing, error bars, or cross-condition variance is reported for the r² values (0.9093/0.6843 vs. 0.6800/0.5400), making it impossible to assess whether the reported improvements are distinguishable from sampling variability or from differences in data splits and pretraining.
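As an illustration of the kind of uncertainty estimate the second comment asks for, the sketch below computes a percentile bootstrap confidence interval for the mean paired difference in per-condition r² between two models. The function name, the resampling unit (conditions), and the example scores are assumptions made up for this sketch; none of the numbers come from the paper.

```python
import random
from statistics import mean


def bootstrap_mean_diff_ci(scores_a, scores_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean paired difference a - b.

    scores_a and scores_b hold one score per condition, in the same order.
    Returns (point_estimate, lower, upper).
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    # Resample conditions with replacement and record each resample's mean.
    boot_means = sorted(
        mean(rng.choice(diffs) for _ in range(n)) for _ in range(n_boot)
    )
    lower = boot_means[int((alpha / 2) * n_boot)]
    upper = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return mean(diffs), lower, upper


# Made-up per-condition r2 values for two hypothetical models.
model_a = [0.90, 0.85, 0.92, 0.88, 0.91]
model_b = [0.70, 0.65, 0.71, 0.60, 0.68]
point, lower, upper = bootstrap_mean_diff_ci(model_a, model_b)
```

An interval that excludes zero would indicate that the gap is larger than condition-to-condition variability. It would not by itself rule out differences caused by data splits or pretraining, which need matched experimental controls.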

minor comments (2)

The precise definition of 'expected' r² (averaging across perturbations, genes, or both) and the exact train/validation/test partitioning of the 600k-cell SciPlex3 benchmark should be stated explicitly.
Implementation details for the learnable circular phase head (loss weighting, phase discretization, and how the head is trained jointly with the expression predictor) are needed to allow reproduction of the closed-loop procedure.
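The first minor comment matters because the averaging scheme behind an "expected" r² can change the number substantially. The sketch below contrasts two plausible definitions on made-up data: averaging per-condition r² versus computing a single r² over all values pooled together. Both definitions and the helper names are assumptions for illustration; the abstract does not say which, if either, the paper uses.

```python
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination; assumes y_true is not constant."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def macro_r2(conditions):
    """Mean of per-condition r2; every condition counts equally."""
    return mean(r2_score(t, p) for t, p in conditions)


def pooled_r2(conditions):
    """One r2 over the values of all conditions concatenated."""
    y_true = [v for t, _ in conditions for v in t]
    y_pred = [v for _, p in conditions for v in p]
    return r2_score(y_true, y_pred)


# Two made-up conditions as (observed, predicted) pairs: the first is
# predicted perfectly, the second has its two values swapped.
conditions = [
    ([0.0, 10.0], [0.0, 10.0]),
    ([4.0, 6.0], [6.0, 4.0]),
]
macro = macro_r2(conditions)
pooled = pooled_r2(conditions)
```

On this toy input the macro average is -1.0 while the pooled value is about 0.85, because pooling lets the wide range of the first condition dominate the total variance. The same predictions can therefore look poor or strong depending on the definition.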

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments on our manuscript. We address the two major comments point-by-point below, with proposed revisions to strengthen the abstract's presentation of the closed-loop supervision results and the reported metrics.

Referee: [Abstract] Abstract: the headline gains are attributed to closed-loop cell-cycle supervision derived from the model's predicted treated expression, yet the abstract supplies no external anchor (orthogonal phase measurements, held-out ground-truth labels, or ablation of the feedback loop) to demonstrate that the targets remain unbiased and do not reinforce early prediction errors.

Authors: We acknowledge that the abstract does not explicitly reference external validation anchors for the closed-loop targets. The manuscript provides internal evidence via ablations (marker-based vs. closed-loop vs. no cell-cycle head) demonstrating that closed-loop supervision yields a 0.5–0.6 point gain in phase accuracy while leaving expression r² essentially unchanged, which is consistent with the targets not simply reinforcing early errors. However, the SciPlex3 benchmark lacks orthogonal phase labels, so we cannot supply such an external anchor. We will revise the abstract to briefly note the ablation results and the reliance on internal consistency checks.
revision: yes

Referee: [Abstract] Abstract: no statistical testing, error bars, or cross-condition variance is reported for the r² values (0.9093/0.6843 vs. 0.6800/0.5400), making it impossible to assess whether the reported improvements are distinguishable from sampling variability or from differences in data splits and pretraining.

Authors: We agree that the abstract would benefit from reporting variability and statistical context for the r² metrics. The quoted values are expected averages over the 186 conditions and multiple cell lines, but standard deviations across runs or conditions and any formal tests were not included. We will revise the abstract (and main results section) to report error bars or cross-condition variance and to indicate whether the differences exceed sampling variability.
revision: yes

Circularity Check

0 steps flagged

No significant circularity in claimed performance metrics

The paper reports empirical r-squared values for gene expression (0.9093 all-gene, 0.6843 DE) and phase accuracy on a held-out SciPlex3 benchmark of >600k cells, compared against the independent baseline ChemCPA. The closed-loop supervision technique derives phase targets from predicted expression for training the phase head, but no equation, result, or metric is shown to equal its inputs by construction; the reported numbers are standard held-out statistics, not redefined quantities. The derivation chain consists of architectural choices and pretraining evaluated on external data, remaining self-contained against the benchmark without load-bearing self-citation or self-definitional reductions.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review performed on abstract only; no explicit free parameters, axioms, or invented entities are identifiable from the provided text.

pith-pipeline@v0.9.1-grok · 5816 in / 1183 out tokens · 34871 ms · 2026-07-01T01:53:07.708071+00:00 · methodology

