A multi-stage soft computing framework for complex disease modelling and decision support: A liver cirrhosis case study
Pith reviewed 2026-05-08 04:53 UTC · model grok-4.3
The pith
A multi-stage framework using single-cell transcriptomics, network analysis, and CNNs identifies an endothelial subpopulation and seven signature genes in liver cirrhosis while outperforming conventional machine learning.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Using liver cirrhosis as a representative case, the framework identified a disease-associated endothelial subpopulation and extracted seven robust signature genes (HSPB1, GADD45A, CLDN5, ATP1B3, C1QBP, ENPP2, and PARL). The CNN-based representation learning module, applied after converting molecular features into two-dimensional disease maps, outperformed conventional machine learning pipelines in classification. The overall framework is presented as disease-agnostic and extensible to other omics-driven biomedical applications.
What carries the argument
A multi-stage pipeline that first stabilizes gene modules with high-dimensional weighted gene co-expression network analysis (hdWGCNA) and then restructures tabular molecular features into two-dimensional disease maps, so that a convolutional neural network can capture non-linear interactions among them.
If this is right
- The framework supports identification of cellular subpopulations and gene signatures that can inform targeted therapeutic exploration through molecular docking.
- It offers a reusable structure for other complex diseases involving high-dimensional omics data with limited samples and feature correlations.
- Conversion of features to two-dimensional maps combined with CNN analysis improves handling of non-linear interactions compared with conventional pipelines.
- The multi-stage design yields both predictive classification gains and interpretable outputs such as signature genes for downstream decision support.
Where Pith is reading between the lines
- The 2D disease map conversion step could be tested on other high-dimensional data types such as proteomics to check whether the CNN advantage generalizes.
- Independent clinical cohorts could be used to check whether the seven signature genes function as reliable biomarkers across populations.
- Linking the identified genes and cell types more directly to longitudinal patient outcomes might strengthen the case for clinical translation.
- The modular stages could be swapped or extended to incorporate additional data modalities without redesigning the entire pipeline.
Load-bearing premise
That high-dimensional weighted gene co-expression network analysis reliably stabilizes gene modules under sparsity and noise, and that converting tabular molecular features into two-dimensional disease maps enables a CNN to capture non-linear interactions better than direct tabular processing.
What would settle it
The claim would be undermined if running the CNN classification step on an independent liver cirrhosis single-cell dataset yielded no accuracy gain over standard machine learning pipelines, or if the same endothelial subpopulation and seven signature genes failed to appear consistently in additional patient samples.
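A paired test on the same held-out samples is one way to make the "no accuracy gain" check concrete. The sketch below implements an exact two-sided McNemar test using only the standard library. The function name and the toy labels are illustrative assumptions, and nothing here reproduces the paper's evaluation protocol.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test on paired predictions.

    Returns (b, c, p_value), where b counts samples only model A classified
    correctly and c counts samples only model B classified correctly.
    """
    b = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a == t and m != t)
    c = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a != t and m == t)
    n = b + c
    if n == 0:
        return b, c, 1.0  # the two models never disagree on correctness
    k = min(b, c)
    # Binomial tail under H0: discordant pairs split 50/50 between the models.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy example (made-up labels): model A is right on all ten samples,
# model B is right on the last two only.
y_true = [1] * 10
pred_cnn = [1] * 10
pred_baseline = [0] * 8 + [1] * 2
b, c, p_value = mcnemar_exact(y_true, pred_cnn, pred_baseline)
print(b, c, p_value)
```

A large p-value on an independent cohort would mean the data cannot distinguish the two classifiers, which is the outcome described above as settling the question against the CNN advantage.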
Original abstract
Liver cirrhosis is a major global health problem causing millions of deaths annually, and timely detection with aggressive treatment can significantly improve patients' quality of life. Modelling complex diseases from biomedical data is computationally challenging due to high dimensionality, strong feature correlations, noise, and limited labelled samples. Conventional Machine Learning (ML) pipelines often struggle with robustness, interpretability, and generalisation under such conditions. In this study, we propose an ML-driven multi-stage decision framework for complex disease modelling and therapeutic exploration. The framework integrates single-cell transcriptomic profiling, high-dimensional network-based feature stabilisation, multi-model learning, deep representation construction, and post-hoc decision support. Specifically, single-cell sequencing data were analysed to identify key cellular subpopulations, followed by high-dimensional weighted gene co-expression network analysis (hdWGCNA) to stabilise gene modules under sparsity and noise. To enhance non-linear feature interaction modelling, tabular molecular features were restructured into two-dimensional disease maps and analysed using a CNN. Finally, molecular docking was incorporated as a decision-support module to evaluate candidate therapeutic compounds. Using liver cirrhosis as a representative case, the framework identified a disease-associated endothelial subpopulation and extracted seven robust signature genes (HSPB1, GADD45A, CLDN5, ATP1B3, C1QBP, ENPP2, and PARL). The CNN-based representation learning module outperformed conventional pipelines in classification. The framework is disease-agnostic and readily extends to other omics-driven biomedical applications involving uncertainty, heterogeneity, and limited samples.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a multi-stage soft computing framework for complex disease modeling and decision support, demonstrated on liver cirrhosis using single-cell transcriptomics. It combines identification of cellular subpopulations, high-dimensional weighted gene co-expression network analysis (hdWGCNA) to stabilize gene modules, restructuring of tabular features into 2D disease maps for CNN-based classification, and molecular docking for therapeutic evaluation. The central claims are the discovery of a disease-associated endothelial subpopulation and seven signature genes (HSPB1, GADD45A, CLDN5, ATP1B3, C1QBP, ENPP2, PARL), superior CNN performance over conventional pipelines, and the framework's disease-agnostic applicability to other omics settings with uncertainty and limited samples.
Significance. If the empirical validations and methodological details hold upon revision, the work could provide a practical, integrated pipeline for extracting interpretable biological signals from noisy, high-dimensional single-cell data while incorporating downstream therapeutic screening. The disease-agnostic framing and emphasis on robustness under sparsity are potentially valuable for the field, though the current lack of quantitative support limits immediate assessment of novelty relative to existing multi-omics frameworks.
major comments (3)
- [Abstract] Abstract: the claim that the CNN-based representation learning module outperformed conventional pipelines is unsupported by any quantitative metrics, error bars, validation splits, baseline comparisons, or statistical tests, leaving the central assertion of superiority without evidence.
- [Methods] Methods (hdWGCNA description): the assertion that hdWGCNA reliably stabilises gene modules under sparsity and noise lacks any reported stability metrics, hyperparameter settings, module preservation statistics, or ablation on synthetic sparse/noisy data, making it impossible to evaluate whether the seven signature genes follow from this step.
- [Methods] Methods (2D disease map construction): no explicit grid-construction rule, dimensionality choice, or ablation study is provided to demonstrate that converting tabular molecular features into 2D maps enables the CNN to capture non-linear interactions better than direct tabular models or standard ML baselines; without this, the reported classification gain cannot be attributed to the proposed transformation.
minor comments (2)
- [Abstract] The abstract lists the seven genes without accompanying importance scores, p-values, or cross-validation stability measures that would normally appear in a results summary.
- The manuscript would benefit from a dedicated reproducibility subsection listing all software versions, random seeds, and data preprocessing steps for the single-cell analysis pipeline.
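The reproducibility record requested in the last minor comment could be as small as a saved run manifest. The following is a minimal sketch under stated assumptions: `run_manifest` is a hypothetical helper, the package names are placeholders, and the manuscript's actual toolchain (which may well be R-based) is not known from this page.

```python
import json
import platform
import random
import sys
from importlib import metadata


def run_manifest(seed, packages=()):
    """Fix the random seed and collect interpreter, OS, and package versions."""
    random.seed(seed)
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "seed": seed,
        "packages": versions,
    }


# Placeholder package names; replace with whatever the pipeline really imports.
print(json.dumps(run_manifest(42, ["numpy", "scanpy"]), indent=2))
```

Writing this JSON next to every result file lets a reader match a reported metric to the exact seed and software versions that produced it.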
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments on our manuscript. We address each major comment point by point below. Where the comments correctly identify gaps in quantitative support or methodological transparency, we will revise the manuscript to incorporate the requested details, metrics, and ablations.
Point-by-point responses
- Referee: [Abstract] Abstract: the claim that the CNN-based representation learning module outperformed conventional pipelines is unsupported by any quantitative metrics, error bars, validation splits, baseline comparisons, or statistical tests, leaving the central assertion of superiority without evidence.
  Authors: We agree that the abstract claim requires explicit quantitative support to be fully substantiated. Although the full manuscript reports classification results, we will expand the results section to include comprehensive metrics (accuracy, precision, recall, F1-score), error bars from repeated runs or cross-validation, details on validation splits (e.g., stratified k-fold), direct comparisons to standard baselines (logistic regression, random forest, SVM, XGBoost), and statistical tests (e.g., McNemar's test or paired t-tests). The abstract will be updated to reference these additions. This will provide the necessary evidence for the superiority assertion.
  Revision: yes
- Referee: [Methods] Methods (hdWGCNA description): the assertion that hdWGCNA reliably stabilises gene modules under sparsity and noise lacks any reported stability metrics, hyperparameter settings, module preservation statistics, or ablation on synthetic sparse/noisy data, making it impossible to evaluate whether the seven signature genes follow from this step.
  Authors: We acknowledge that the current methods description lacks the requested quantitative validation for hdWGCNA stability. In the revision, we will specify all hyperparameter settings (softPower, minModuleSize, etc.), report module preservation statistics (Z-summary scores across bootstrap or perturbed samples), and add an ablation study on synthetic sparse/noisy datasets to demonstrate robustness. This will explicitly link the stabilized modules to the derivation of the seven signature genes (HSPB1, GADD45A, CLDN5, ATP1B3, C1QBP, ENPP2, PARL).
  Revision: yes
- Referee: [Methods] Methods (2D disease map construction): no explicit grid-construction rule, dimensionality choice, or ablation study is provided to demonstrate that converting tabular molecular features into 2D maps enables the CNN to capture non-linear interactions better than direct tabular models or standard ML baselines; without this, the reported classification gain cannot be attributed to the proposed transformation.
  Authors: We agree that the 2D disease map construction requires more explicit description and validation. We will revise the methods to detail the grid-construction rule (e.g., feature arrangement via correlation-based or clustering-driven mapping to preserve neighborhood structure), the choice of dimensionality (e.g., nearest square grid size based on feature count), and include an ablation study comparing the 2D map + CNN pipeline against direct tabular models (MLP, XGBoost) and standard ML baselines. This will allow attribution of any performance gains to the proposed transformation.
  Revision: yes
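The rebuttal gives a correlation-based arrangement and a nearest-square grid only as examples, so the grid-construction rule remains unspecified. The sketch below shows one possible reading: a greedy ordering that places strongly correlated features next to each other, followed by zero-padding to the smallest square that fits. The function names, the greedy rule, and the toy matrix are all assumptions for illustration, not the authors' method.

```python
from math import ceil, sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def correlation_order(samples):
    """Greedy ordering: start at feature 0, then repeatedly append the unplaced
    feature with the highest absolute correlation to the last placed one."""
    n_feat = len(samples[0])
    cols = [[row[j] for row in samples] for j in range(n_feat)]
    order, remaining = [0], set(range(1, n_feat))
    while remaining:
        last = cols[order[-1]]
        nxt = max(sorted(remaining), key=lambda j: abs(pearson(last, cols[j])))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def to_grid(sample, order, pad=0.0):
    """Lay one sample out row by row on the smallest square grid that fits."""
    side = ceil(sqrt(len(order)))
    flat = [sample[j] for j in order] + [pad] * (side * side - len(order))
    return [flat[r * side:(r + 1) * side] for r in range(side)]


# Toy matrix: 4 samples x 5 features (made-up numbers, not paper data).
samples = [
    [1.0, 5.0, 2.0, 9.0, 1.1],
    [2.0, 3.0, 4.0, 7.0, 2.1],
    [3.0, 4.0, 6.0, 2.0, 2.9],
    [4.0, 1.0, 8.0, 5.0, 4.2],
]
order = correlation_order(samples)
grid = to_grid(samples[0], order)
print(order)
print(grid)
```

An ablation of the kind the referee asks for would compare this ordering against a random permutation of the same features: if the CNN performs equally well on shuffled maps, the neighborhood structure is not what drives the gain.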
Circularity Check
No circularity; standard external methods applied without self-referential reduction
The manuscript describes a pipeline that applies externally defined techniques (single-cell profiling, hdWGCNA for module stabilization, tabular-to-2D map conversion for CNN, and molecular docking) to cirrhosis scRNA-seq data. No equations, parameter-fitting steps, or derivations are presented that equate outputs to inputs by construction. Claims of an endothelial subpopulation and seven signature genes follow from running these standard tools; the CNN superiority is reported as an empirical outcome rather than a fitted or renamed result. No self-citations serve as load-bearing uniqueness theorems or ansatzes. The derivation chain is therefore independent of the paper's own fitted values or prior author work.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: high-dimensional weighted gene co-expression network analysis (hdWGCNA) can stabilise gene modules under sparsity and noise.
- Domain assumption: restructuring tabular molecular features into two-dimensional disease maps enables convolutional neural networks to model non-linear feature interactions.
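The first assumption can be probed empirically by re-running module detection on resampled cells and measuring how much the gene set changes. The sketch below uses mean Jaccard overlap across bootstrap resamples as a deliberately simpler stand-in for the Z-summary preservation statistics mentioned in the rebuttal. `detect_module` is a hypothetical callable standing in for the whole hdWGCNA step, and the toy detector and data are invented for illustration.

```python
import random


def jaccard(a, b):
    """Jaccard index of two gene collections (1.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def module_stability(detect_module, cells, n_boot=50, seed=0):
    """Mean Jaccard overlap between the module found on all cells and the
    modules found on bootstrap resamples of the same cells."""
    rng = random.Random(seed)
    reference = detect_module(cells)
    scores = []
    for _ in range(n_boot):
        resample = [rng.choice(cells) for _ in cells]  # sample with replacement
        scores.append(jaccard(reference, detect_module(resample)))
    return sum(scores) / len(scores)


def toy_detector(cells):
    """Hypothetical detector: genes whose mean expression exceeds 1.0."""
    genes = cells[0].keys()
    return {g for g in genes if sum(c[g] for c in cells) / len(cells) > 1.0}


# Made-up expression values for three placeholder genes across four cells.
cells = [
    {"g1": 2.0, "g2": 0.1, "g3": 1.2},
    {"g1": 2.5, "g2": 0.0, "g3": 0.9},
    {"g1": 1.8, "g2": 0.3, "g3": 1.1},
    {"g1": 2.2, "g2": 0.2, "g3": 0.7},
]
stability = module_stability(toy_detector, cells, n_boot=20, seed=1)
print(round(stability, 3))
```

A score near 1 means the same genes are recovered regardless of which cells are drawn; a low score would suggest that a signature derived from the module depends on the particular sample.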