Modeling Cell-Cycle-Aware Single-Cell Drug Perturbation Responses
Pith reviewed 2026-07-01 01:53 UTC · model grok-4.3
The pith
A closed-loop cell-cycle head that derives supervision from predicted treated expression improves both gene-expression and phase predictions in single-cell drug perturbation models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
scCycleMol derives cell-cycle supervision from its own predicted treated expression and propagates it through a learnable full-expression cell-cycle head with circular G1/S/G2M phase targets, producing improved out-of-distribution expression prediction on a SciPlex3 benchmark with over 600k cells and 186 perturbation conditions compared with conditional perturbation baselines.
What carries the argument
The circular cell-cycle head that receives closed-loop supervision derived from the model's predicted treated expression and produces phase targets for the full expression profile.
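One way to picture circular phase targets is to place G1, S, and G2M at evenly spaced angles on the unit circle, so that G2M sits next to G1 just as mitosis leads back into G1. The sketch below is a minimal illustration under that assumption. The even spacing, the function names, and the cosine-based loss are all hypothetical; the paper's actual target encoding and loss are not given in this summary.

```python
import math

# Assumed layout (not from the paper): three phases evenly spaced on a circle.
PHASES = ["G1", "S", "G2M"]
PHASE_ANGLE = {p: 2 * math.pi * i / len(PHASES) for i, p in enumerate(PHASES)}


def encode_phase(phase):
    """Return the (cos, sin) target vector for a phase label."""
    theta = PHASE_ANGLE[phase]
    return (math.cos(theta), math.sin(theta))


def decode_phase(vec):
    """Map a predicted 2-D vector to the phase with the nearest angle."""
    theta = math.atan2(vec[1], vec[0])
    # 1 - cos(delta) is zero for matching angles and wraps correctly at 2*pi.
    return min(PHASES, key=lambda p: 1 - math.cos(theta - PHASE_ANGLE[p]))


def circular_loss(pred_vec, phase):
    """1 - cosine similarity between a prediction and the phase target."""
    tx, ty = encode_phase(phase)
    norm = math.hypot(pred_vec[0], pred_vec[1]) or 1.0  # guard zero vectors
    return 1 - (pred_vec[0] * tx + pred_vec[1] * ty) / norm


if __name__ == "__main__":
    for phase in PHASES:
        target = encode_phase(phase)
        print(phase, target, decode_phase(target), circular_loss(target, phase))
```

With this layout every pair of phases is equally far apart, which is a property of having exactly three evenly spaced classes; a continuous pseudotime angle would be a different design choice.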
If this is right
- Out-of-distribution expression prediction improves over conditional perturbation baselines.
- Closed-loop supervision raises cell-cycle phase accuracy by about 0.5 to 0.6 points while expression prediction remains nearly unchanged.
- Pretraining on external datasets (LINCS, Tahoe) further improves both expression and phase performance.
- The framework applies across multiple cancer cell lines and thousands of genes under standardized dose and molecule metadata.
Where Pith is reading between the lines
- The closed-loop strategy could be applied to supervising other latent states that shift under perturbation without requiring separate labels.
- Accurate phase forecasts may help identify which drugs selectively affect dividing versus quiescent cells in mixed populations.
- The method opens a route to testing whether explicit phase modeling changes which gene programs are recovered as drivers of drug response.
Load-bearing premise
Supervision signals derived from the model's own predicted treated expression provide unbiased cell-cycle targets that do not reinforce the model's errors through the closed loop.
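To make the premise concrete, the sketch below derives a phase label from an expression vector by scoring S and G2M marker sets, then shows how a biased prediction changes the derived label. It is a simplified illustration only. The marker lists are short illustrative subsets, the scoring rule (mean marker expression minus the mean of all other genes, with G1 assigned when neither score is positive) is a stand-in for marker-based scoring, and all expression values are invented. None of this is confirmed as scCycleMol's procedure.

```python
from statistics import mean

# Illustrative marker subsets (assumption); real pipelines use longer lists.
S_MARKERS = ["MCM2", "PCNA", "TYMS"]
G2M_MARKERS = ["TOP2A", "MKI67", "CCNB1"]


def marker_score(expr, markers):
    """Mean marker expression minus mean expression of all other genes."""
    inside = [expr[g] for g in markers if g in expr]
    outside = [v for g, v in expr.items() if g not in markers]
    return mean(inside) - mean(outside)


def derive_phase(expr):
    """Assign G1 if neither program is elevated, else the higher-scoring one."""
    s = marker_score(expr, S_MARKERS)
    g2m = marker_score(expr, G2M_MARKERS)
    if s <= 0 and g2m <= 0:
        return "G1"
    return "S" if s > g2m else "G2M"


# Invented toy values: the "predicted" profile underestimates the S program.
observed = {"MCM2": 2.0, "PCNA": 2.2, "TYMS": 1.8, "TOP2A": 0.3,
            "MKI67": 0.2, "CCNB1": 0.4, "ACTB": 1.0, "GAPDH": 1.0}
predicted = dict(observed, MCM2=0.5, PCNA=0.6, TYMS=0.4)

if __name__ == "__main__":
    print("target from observed expression: ", derive_phase(observed))
    print("target from predicted expression:", derive_phase(predicted))
```

In this toy case the label derived from the observed profile is S, while the label derived from the biased prediction is G1. A phase head trained on the second label would be fitted to the model's own error, which is the failure mode the premise has to rule out.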
What would settle it
Replacing the self-derived cell-cycle targets with independent experimental phase measurements on treated cells and checking whether the reported gains in expression and phase prediction disappear.
Original abstract
Single-cell drug perturbation models should predict not only transcriptional response magnitude, but also whether a treatment alters the proliferative state of a cell. This is challenging because cell-cycle variation is often treated as nuisance variation, and benchmark pipelines rarely treat drug-induced phase changes as a primary prediction target. We introduce scCycleMol, a cell-cycle-aware perturbation prediction framework built on a curated 24-hour SciPlex3 benchmark with standardized molecule identities, dose and cell-line metadata, and gene expression with cell-cycle supervision derived from treated states. Instead of using cell-cycle state as an input covariate, scCycleMol derives supervision from predicted treated expression and propagates it through a learnable full-expression cell-cycle head with circular G1/S/G2M phase targets. We evaluate marker-based supervision, molecular representations, and pretraining strategies to isolate sources of improvement. Across a SciPlex3 benchmark with over 600k cells, 186 perturbation conditions, multiple cancer cell lines, and thousands of genes, scCycleMol improves out-of-distribution expression prediction compared with conditional perturbation baselines. The best LINCS-pretrained circular model achieves 0.9093 expected all-gene r² and 0.6843 expected differentially expressed gene r², compared with 0.6800 and 0.5400 for LINCS-pretrained ChemCPA. Closed-loop cell-cycle supervision improves phase accuracy by about 0.5 to 0.6 points while maintaining nearly unchanged expression prediction. A Tahoe-pretrained variant reaches 0.9609 phase accuracy, highlighting the benefit of explicit cell-cycle-aware supervision in perturbation modeling.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces scCycleMol, a cell-cycle-aware framework for predicting single-cell drug perturbation responses on a curated 24-hour SciPlex3 benchmark (>600k cells, 186 conditions). Rather than treating cell-cycle state as an input covariate, it derives circular G1/S/G2M phase targets from the model's own predicted treated expression, propagates them through a learnable full-expression cell-cycle head, and reports that the best LINCS-pretrained circular variant achieves expected all-gene r² of 0.9093 and DE-gene r² of 0.6843 (vs. 0.6800/0.5400 for LINCS-pretrained ChemCPA), with an additional 0.5–0.6 point gain in phase accuracy from the closed-loop supervision.
Significance. If the closed-loop supervision can be shown to supply unbiased targets, the approach would usefully shift perturbation modeling from treating proliferative-state changes as nuisance variation to treating them as an explicit, jointly optimized prediction target, with potential downstream value for understanding drug effects on cell proliferation across cancer cell lines.
major comments (2)
- [Abstract] The headline gains are attributed to closed-loop cell-cycle supervision derived from the model's predicted treated expression, yet the abstract supplies no external anchor (orthogonal phase measurements, held-out ground-truth labels, or ablation of the feedback loop) to demonstrate that the targets remain unbiased and do not reinforce early prediction errors.
- [Abstract] No statistical testing, error bars, or cross-condition variance is reported for the r² values (0.9093/0.6843 vs. 0.6800/0.5400), making it impossible to assess whether the reported improvements are distinguishable from sampling variability or from differences in data splits and pretraining.
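The second comment asks for cross-condition variability. One common way to supply it is a paired bootstrap over conditions, sketched below under stated assumptions: per-condition r² scores are available for both models on the same conditions, and the function name `paired_bootstrap_ci` and its parameters are hypothetical. The scores in the usage example are synthetic placeholders, not values from the paper.

```python
import random
from statistics import mean


def paired_bootstrap_ci(scores_a, scores_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for mean(scores_a - scores_b), resampling conditions."""
    if len(scores_a) != len(scores_b):
        raise ValueError("scores must be paired per condition")
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    # Resample conditions with replacement and record each mean difference.
    boot_means = sorted(mean(rng.choices(diffs, k=n)) for _ in range(n_boot))
    lo = boot_means[int((alpha / 2) * n_boot)]
    hi = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return mean(diffs), lo, hi


if __name__ == "__main__":
    # Synthetic per-condition scores for 186 conditions (placeholders only).
    rng = random.Random(1)
    model_a = [0.80 + rng.gauss(0, 0.05) for _ in range(186)]
    model_b = [0.70 + rng.gauss(0, 0.08) for _ in range(186)]
    diff, lo, hi = paired_bootstrap_ci(model_a, model_b)
    print(f"mean difference {diff:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

Resampling whole conditions rather than individual cells treats the condition as the unit of generalization, which matches an out-of-distribution evaluation; it does not capture variance across training seeds or data splits, which would need repeated runs.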
minor comments (2)
- The precise definition of 'expected' r² (averaging across perturbations, genes, or both) and the exact train/validation/test partitioning of the 600k-cell SciPlex3 benchmark should be stated explicitly.
- Implementation details for the learnable circular phase head (loss weighting, phase discretization, and how the head is trained jointly with the expression predictor) are needed to allow reproduction of the closed-loop procedure.
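The first minor comment matters because different averaging conventions can give very different r² values on the same predictions. The sketch below contrasts two possible readings: r² computed per condition and then averaged, versus a single r² over all conditions pooled. Both function names and the toy numbers are hypothetical, and this summary does not say which convention, if either, the paper's "expected" r² follows.

```python
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mu = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mu) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def macro_r2(true_by_cond, pred_by_cond):
    """Average of per-condition r2 values (each condition weighted equally)."""
    return mean(r2_score(true_by_cond[c], pred_by_cond[c]) for c in true_by_cond)


def pooled_r2(true_by_cond, pred_by_cond):
    """Single r2 over the values from all conditions concatenated."""
    y_true = [v for c in true_by_cond for v in true_by_cond[c]]
    y_pred = [v for c in true_by_cond for v in pred_by_cond[c]]
    return r2_score(y_true, y_pred)


if __name__ == "__main__":
    # Toy data: condition A is predicted perfectly, condition B is reversed.
    truth = {"A": [1.0, 2.0, 3.0], "B": [10.0, 10.5, 11.0]}
    preds = {"A": [1.0, 2.0, 3.0], "B": [11.0, 10.5, 10.0]}
    print("macro r2: ", macro_r2(truth, preds))
    print("pooled r2:", pooled_r2(truth, preds))
```

In this toy case the macro average is negative because condition B is predicted badly, while the pooled value stays close to 1 because between-condition variance dominates the denominator. Stating the convention explicitly is what lets a reader compare the reported numbers against baselines.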
Simulated Author's Rebuttal
We thank the referee for their constructive comments on our manuscript. We address the two major comments point-by-point below, with proposed revisions to strengthen the abstract's presentation of the closed-loop supervision results and the reported metrics.
- Referee: [Abstract] The headline gains are attributed to closed-loop cell-cycle supervision derived from the model's predicted treated expression, yet the abstract supplies no external anchor (orthogonal phase measurements, held-out ground-truth labels, or ablation of the feedback loop) to demonstrate that the targets remain unbiased and do not reinforce early prediction errors.
  Authors: We acknowledge that the abstract does not explicitly reference external validation anchors for the closed-loop targets. The manuscript provides internal evidence via ablations (marker-based vs. closed-loop vs. no cell-cycle head) demonstrating that closed-loop supervision yields a 0.5–0.6 point gain in phase accuracy while leaving expression r² essentially unchanged, which is consistent with the targets not simply reinforcing early errors. However, the SciPlex3 benchmark lacks orthogonal phase labels, so we cannot supply such an external anchor. We will revise the abstract to briefly note the ablation results and the reliance on internal consistency checks.
  Revision: yes
- Referee: [Abstract] No statistical testing, error bars, or cross-condition variance is reported for the r² values (0.9093/0.6843 vs. 0.6800/0.5400), making it impossible to assess whether the reported improvements are distinguishable from sampling variability or from differences in data splits and pretraining.
  Authors: We agree that the abstract would benefit from reporting variability and statistical context for the r² metrics. The quoted values are expected averages over the 186 conditions and multiple cell lines, but standard deviations across runs or conditions and any formal tests were not included. We will revise the abstract (and main results section) to report error bars or cross-condition variance and to indicate whether the differences exceed sampling variability.
  Revision: yes
Circularity Check
No significant circularity in claimed performance metrics
The paper reports empirical r-squared values for gene expression (0.9093 all-gene, 0.6843 DE) and phase accuracy on a held-out SciPlex3 benchmark of >600k cells, compared against the independent baseline ChemCPA. The closed-loop supervision technique derives phase targets from predicted expression for training the phase head, but no equation, result, or metric is shown to equal its inputs by construction; the reported numbers are standard held-out statistics, not redefined quantities. The derivation chain consists of architectural choices and pretraining evaluated on external data, remaining self-contained against the benchmark without load-bearing self-citation or self-definitional reductions.