Cellular State Transformations using Generative Adversarial Networks
Pith reviewed 2026-05-25 12:33 UTC · model grok-4.3
The pith
A conditioned GAN generator can perturb gene expression profiles to simulate realistic transitions between cellular RNA states.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A generator conditioned to perturb any input gene expression profile simulates a realistic transition between source and target RNA expression states. The perturbed samples follow a distribution similar to that of the original samples in the dataset, which suggests that the perturbations are biologically meaningful. The genes most positively and negatively perturbed by the generator can be identified, and the biological functions enriched among those genes are realistic.
What carries the argument
The conditioned generator within the Transcriptome State Perturbation Generator (TSPG) GAN framework, which produces output profiles from input profiles and target state information.
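To make the generator's interface concrete, the following is a minimal, untrained sketch of a conditioned generator: it takes an input profile plus a target state label and returns a perturbed profile. Everything here is an illustrative assumption rather than the TSPG architecture: the name `make_toy_generator`, the single linear layer, the additive perturbation, the one-hot encoding of the target state, and the clipping to [0, 1] (which presumes expression values were scaled to that range).

```python
import random


def make_toy_generator(n_genes, n_states, scale=0.05, seed=0):
    """Build an untrained toy generator (hypothetical, not the TSPG model)."""
    rng = random.Random(seed)
    n_inputs = n_genes + n_states
    # One linear layer with small random weights: (n_genes + n_states) x n_genes.
    weights = [[rng.gauss(0.0, scale) for _ in range(n_genes)]
               for _ in range(n_inputs)]

    def generate(profile, target_state):
        if len(profile) != n_genes:
            raise ValueError("profile length must equal n_genes")
        if not 0 <= target_state < n_states:
            raise ValueError("target_state out of range")
        # Condition on the target state by appending a one-hot vector.
        one_hot = [1.0 if s == target_state else 0.0 for s in range(n_states)]
        inputs = list(profile) + one_hot
        # The layer outputs a per-gene perturbation (assumed additive).
        delta = [sum(x * weights[i][j] for i, x in enumerate(inputs))
                 for j in range(n_genes)]
        # Assumption: expression is scaled to [0, 1], so clip the result.
        perturbed = [min(1.0, max(0.0, x + d)) for x, d in zip(profile, delta)]
        return perturbed, delta

    return generate


generator = make_toy_generator(n_genes=4, n_states=2)
profile = [0.2, 0.4, 0.6, 0.8]
to_state_0, delta_0 = generator(profile, target_state=0)
to_state_1, delta_1 = generator(profile, target_state=1)
```

The same input profile yields a different output for each target state, which is the property the conditioning is meant to provide. In a real GAN the weights would be learned against a discriminator instead of drawn at random.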
If this is right
- Key genes driving the transition can be extracted directly from the generator's output.
- Enriched functions among the most perturbed genes match known biology for the states.
- The method can reveal condition-defining gene expression patterns without requiring paired experimental data for every transition.
- Perturbations remain within the distribution of the original dataset rather than producing arbitrary values.
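The first point above can be sketched as a ranking of genes by their mean perturbation. This is a minimal sketch under stated assumptions: the function name `rank_perturbed_genes`, the use of the mean difference across samples as the score, and the toy gene names and values are all hypothetical and are not taken from the TSPG code.

```python
def rank_perturbed_genes(gene_names, source_profiles, perturbed_profiles, top_k=3):
    """Return (most_up, most_down) lists of (gene, mean_perturbation).

    source_profiles and perturbed_profiles are lists of samples; each sample
    is a list of expression values aligned with gene_names.
    """
    n_samples = len(source_profiles)
    mean_delta = []
    for g in range(len(gene_names)):
        # Perturbation = generator output minus its input, averaged over samples.
        total = sum(p[g] - s[g]
                    for s, p in zip(source_profiles, perturbed_profiles))
        mean_delta.append(total / n_samples)
    # Ascending order: most negatively perturbed genes come first.
    order = sorted(range(len(gene_names)), key=lambda g: mean_delta[g])
    most_down = [(gene_names[g], mean_delta[g]) for g in order[:top_k]]
    most_up = [(gene_names[g], mean_delta[g]) for g in reversed(order[-top_k:])]
    return most_up, most_down


# Toy data with placeholder gene names.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
source = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
perturbed = [[3.0, 2.0, 1.0, 4.5], [2.0, 2.0, 2.0, 4.5]]
up, down = rank_perturbed_genes(genes, source, perturbed, top_k=1)
```

The resulting gene lists are what one would pass to an enrichment tool. Ranking by mean difference is only one option; a median or a per-gene standardized score may be more robust to outlier samples.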
Where Pith is reading between the lines
- The same conditioning approach could be tested on other high-dimensional biological measurements such as proteomics or metabolomics.
- If the generator preserves distribution, it may serve as a tool for in silico hypothesis generation about how cells respond to specific inputs.
- Direct comparison of generator outputs against time-series expression data from real transitions would provide a stronger test than distribution matching alone.
Load-bearing premise
Similarity in statistical distribution between generated and real samples is sufficient evidence that the perturbations are biologically meaningful and that the enriched functions of the most perturbed genes are realistic.
What would settle it
An experiment in which the genes identified as most perturbed by the generator show no corresponding change or incorrect functional enrichment when the same source-to-target transition is measured in actual biological samples would falsify the claim.
Original abstract
We introduce a novel method to unite deep learning with biology by which generative adversarial networks (GANs) generate transcriptome perturbations and reveal condition-defining gene expression patterns. We find that a generator conditioned to perturb any input gene expression profile simulates a realistic transition between source and target RNA expression states. The perturbed samples follow a similar distribution to original samples from the dataset, also suggesting these are biologically meaningful perturbations. Finally, we show that it is possible to identify the genes most positively and negatively perturbed by the generator and that the enriched biological function of the perturbed genes are realistic. We call the framework the Transcriptome State Perturbation Generator (TSPG), which is open source software available at https://github.com/ctargon/TSPG.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces the Transcriptome State Perturbation Generator (TSPG), a conditional GAN framework that takes source gene expression profiles and generates perturbations to simulate transitions to target cellular states. It asserts that the generated profiles match the statistical distribution of real samples from the dataset and that the genes most positively or negatively perturbed by the generator exhibit biologically realistic functional enrichments via GO analysis.
Significance. If the central claims hold after proper validation, TSPG could offer a generative tool for in silico exploration of transcriptomic state changes and hypothesis generation in systems biology. The open-source release of the software is a strength that supports potential reproducibility and extension by the community.
Major comments (3)
- [Results] Results section (distribution similarity claim): the assertion that perturbed samples follow a similar distribution to original samples provides no quantitative metrics (e.g., MMD, Wasserstein distance), statistical tests, error bars, or baseline comparisons against alternative models or random perturbations, leaving the central claim of realistic transitions unverifiable from the presented evidence.
- [Results] Results section (GO enrichment): the functional enrichment of the most perturbed genes is presented without controls for dataset-specific co-expression modules or orthogonal validation such as overlap with experimentally measured differentially expressed genes for matched source-target pairs, so it does not establish that the perturbations capture condition-defining mechanisms rather than marginal distribution matching.
- [Methods] Methods section: insufficient detail is given on data splits, training/validation procedures, and hyperparameter choices, which is required to evaluate whether the supervised GAN respects regulatory structure or simply reproduces training-set statistics.
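As an illustration of the metrics named in the first major comment, the sketch below computes a squared MMD with an RBF kernel and a one-dimensional Wasserstein distance. It is a hedged, standard-library-only example: the function names, the kernel bandwidth `gamma=0.1`, the equal-sample-size restriction, and the synthetic Gaussian data are assumptions chosen for illustration and do not come from the paper.

```python
import math
import random


def rbf_mmd2(xs, ys, gamma=0.1):
    """Biased estimate of squared MMD between two samples (RBF kernel)."""
    def kernel(a, b):
        return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

    def mean_kernel(A, B):
        return sum(kernel(a, b) for a in A for b in B) / (len(A) * len(B))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def wasserstein_1d(u, v):
    """1-Wasserstein distance between two equal-size 1-D samples.

    Assumes equal sample sizes, so the distance reduces to the mean absolute
    difference between sorted values. Apply it per gene.
    """
    if len(u) != len(v):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(u), sorted(v))) / len(u)


# Synthetic stand-ins for real target samples and two candidate generators.
rng = random.Random(0)
real = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(40)]
close = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(40)]
shifted = [[rng.gauss(2.0, 1.0) for _ in range(5)] for _ in range(40)]

mmd_close = rbf_mmd2(real, close)
mmd_shifted = rbf_mmd2(real, shifted)
```

A baseline such as randomly perturbed source samples would be scored the same way, so that the generator's distance to the real target samples can be compared against it instead of being read in isolation.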
Minor comments (1)
- [Abstract] Abstract: the description of the generator conditioning could be clarified with a brief reference to the specific conditioning mechanism used.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. The comments highlight important areas for strengthening the validation and reproducibility of TSPG. We address each point below and have revised the manuscript to include the requested quantitative metrics, additional controls, and expanded methodological details.
Point-by-point responses
- Referee: [Results] Results section (distribution similarity claim): the assertion that perturbed samples follow a similar distribution to original samples provides no quantitative metrics (e.g., MMD, Wasserstein distance), statistical tests, error bars, or baseline comparisons against alternative models or random perturbations, leaving the central claim of realistic transitions unverifiable from the presented evidence.
  Authors: We agree that quantitative metrics are necessary to support the distribution similarity claim. In the revised manuscript we now report MMD and Wasserstein distances between generated and real target distributions, together with statistical tests, error bars from repeated runs, and explicit comparisons against random perturbations and a simple baseline model. These results are added to the Results section and supplementary figures.
  Revision: yes
- Referee: [Results] Results section (GO enrichment): the functional enrichment of the most perturbed genes is presented without controls for dataset-specific co-expression modules or orthogonal validation such as overlap with experimentally measured differentially expressed genes for matched source-target pairs, so it does not establish that the perturbations capture condition-defining mechanisms rather than marginal distribution matching.
  Authors: The referee correctly notes the absence of controls. We have added (i) enrichment comparisons against co-expression modules derived from the same dataset and (ii) overlap analysis with published differentially expressed gene lists for the relevant source-to-target transitions. These controls are now presented in the revised Results section to better distinguish condition-specific signals from marginal matching.
  Revision: yes
- Referee: [Methods] Methods section: insufficient detail is given on data splits, training/validation procedures, and hyperparameter choices, which is required to evaluate whether the supervised GAN respects regulatory structure or simply reproduces training-set statistics.
  Authors: We have substantially expanded the Methods section to specify the exact train/validation/test splits (including how samples were partitioned by condition or cell type), the training protocol with validation-based early stopping, the hyperparameter search procedure, and regularization choices intended to encourage learning of regulatory patterns rather than memorization of training statistics.
  Revision: yes
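The overlap analysis mentioned in the second response could be scored with a one-sided hypergeometric test, sketched below. This is an assumption about how such a control might be run, not a description of the authors' analysis; the function names, the choice of test, and the placeholder gene identifiers are hypothetical.

```python
from math import comb


def overlap_pvalue(n_universe, n_generator_hits, n_published_de, n_overlap):
    """P(overlap >= n_overlap) under the hypergeometric null.

    Null model: the generator's top genes are a random draw of size
    n_generator_hits from a universe containing n_published_de DE genes.
    """
    upper = min(n_generator_hits, n_published_de)
    total = comb(n_universe, n_generator_hits)
    tail = sum(
        comb(n_published_de, k)
        * comb(n_universe - n_published_de, n_generator_hits - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def gene_overlap_test(universe, generator_genes, published_genes):
    """Return (shared genes, p-value), restricted to genes in the universe."""
    universe = set(universe)
    hits = set(generator_genes) & universe
    published = set(published_genes) & universe
    shared = hits & published
    p = overlap_pvalue(len(universe), len(hits), len(published), len(shared))
    return shared, p


# Toy example with placeholder identifiers.
universe = [f"G{i}" for i in range(10)]
shared, p_value = gene_overlap_test(
    universe,
    generator_genes=["G0", "G1", "G2"],
    published_genes=["G0", "G1", "G2"],
)
```

The universe should be the set of genes the generator could have perturbed (the genes in the training matrix), since a larger universe makes any overlap look more significant than it is.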
Circularity Check
No significant circularity; standard conditional GAN application on external data
Full rationale
The paper applies a conditional GAN (TSPG) trained on transcriptomic datasets to generate perturbations that match source-to-target state transitions. The claim that generated samples follow a similar distribution is the explicit training objective of the adversarial setup and is presented as empirical validation rather than a first-principles derivation. No equations, self-citations, or ansatzes reduce any result to a definitionally equivalent input. The biological interpretation step is interpretive and does not create a self-definitional or fitted-input-called-prediction loop. The derivation chain remains independent of the target claims.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: GAN training converges to a generator whose output distribution matches the real data distribution sufficiently for downstream biological interpretation.