Co-design for Trustworthy AI: An Interpretable and Explainable Tool for Type 2 Diabetes Prediction Using Genomic Polygenic Risk Scores
Pith reviewed 2026-05-10 17:58 UTC · model grok-4.3
The pith
A visualization tool decomposes polygenic risk scores for type 2 diabetes into specific gene and SNP contributions using SHAP, developed through ethical co-design.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper establishes that the XPRS tool, built on Shapley Additive Explanations, decomposes polygenic risk scores for type 2 diabetes into granular gene-level and SNP contribution scores, delivering individual risk insights. This is paired with a co-design process using Z-inspection and HUDERIA frameworks to evaluate and address legal, medical, ethical, and technical trustworthiness issues during development and potential use, producing a set of lessons learned that serve as guidance for future explainability frameworks in clinical PRS applications.
What carries the argument
The XPRS visualization tool that applies SHAP to decompose PRS into gene-level and SNP contribution scores, integrated with Z-inspection and HUDERIA co-design methods for assessing trustworthiness across ethical, legal, and technical dimensions.
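To make the decomposition idea concrete, here is a minimal sketch, not the XPRS implementation. It assumes an additive PRS (a weighted sum of allele dosages) and treats SNPs as independent, in which case the SHAP value of SNP j reduces to beta_j * (x_j - E[x_j]). The SNP identifiers, gene names, weights, and cohort are all hypothetical toy values.

```python
import numpy as np

# Hypothetical toy inputs: SNP ids, SNP-to-gene mapping, and effect-size weights.
snp_ids = ["rs_a", "rs_b", "rs_c", "rs_d"]
snp_gene = {"rs_a": "GENE1", "rs_b": "GENE1", "rs_c": "GENE2", "rs_d": "GENE2"}
beta = np.array([0.10, -0.05, 0.20, 0.08])

# Allele dosages (0, 1, 2) for a small reference cohort; rows are individuals.
cohort = np.array([
    [0, 1, 2, 0],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
    [1, 2, 1, 1],
], dtype=float)


def prs(dosages):
    """Additive polygenic risk score: weighted sum of dosages."""
    return dosages @ beta


def snp_contributions(dosages, background):
    """Linear SHAP under an independence assumption: beta_j * (x_j - E[x_j])."""
    return beta * (dosages - background.mean(axis=0))


def gene_contributions(contribs):
    """Sum SNP-level contributions over the gene each SNP is mapped to."""
    totals = {}
    for snp, value in zip(snp_ids, contribs):
        gene = snp_gene[snp]
        totals[gene] = totals.get(gene, 0.0) + float(value)
    return totals


individual = cohort[0]
phi = snp_contributions(individual, cohort)   # per-SNP contributions
genes = gene_contributions(phi)               # per-gene contributions
baseline = prs(cohort).mean()                 # average score in the cohort

# Additivity: contributions sum to the individual's score minus the baseline.
print(phi.sum(), prs(individual) - baseline)
print(genes)
```

The additivity check at the end is the property that makes such a breakdown readable as "how far this person sits from the cohort average, and which SNPs or genes account for the gap". It says nothing about whether the weights themselves reflect causal biology.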
If this is right
- Clinicians gain granular views of which genetic factors drive an individual's type 2 diabetes risk, enabling more targeted screening and interventions.
- Developers of similar PRS models receive a practical reference for building explainability into genomic tools.
- Multidisciplinary teams obtain documented lessons on navigating ethical, legal, and robustness challenges when deploying AI in healthcare.
- The approach supplies a template for creating trustworthy AI systems in other polygenic disease contexts.
Where Pith is reading between the lines
- Wider adoption of such decomposable PRS tools could help overcome hesitation around using genetic risk scores in routine medical care.
- The same SHAP decomposition strategy might transfer to polygenic predictions for conditions like obesity or certain cancers.
- Real-world testing in clinical workflows would be needed to confirm whether the added explanations actually change patient or physician behavior.
Load-bearing premise
That SHAP-based breakdowns of polygenic risk scores produce explanations that are biologically accurate and clinically useful rather than artifacts of the model or data correlations.
What would settle it
An independent study showing that the specific gene and SNP contributions highlighted by XPRS for high-risk patients do not align with established biological pathways or functional variants linked to type 2 diabetes.
Original abstract
The polygenic risk scores (PRS) have emerged as an important methodology for quantifying genetic predisposition to complex traits and clinical disease. Significant progress has been made in applying PRS to conditions such as obesity, cancer, and type 2 diabetes (T2DM). Studies have demonstrated that PRS can effectively identify individuals at high risk, thereby enabling early screening, personalized treatment, and targeted interventions for diseases with a genetic predisposition. One current limitation of PRS, however, is the lack of interpretability tools. To address this problem for T2DM, researchers at the Graduate School of Data Science at the Seoul National University introduced eXplainable PRS (XPRS). This visualization tool decomposes PRSs into gene-level and single-nucleotide polymorphism (SNP) contribution scores via Shapley Additive Explanations (SHAP), providing granular insights into the specific genetic factors driving an individual's risk profile. We used a co-design approach to assess XPRS trustworthiness by considering legal, medical, ethical, and technical robustness during early design and potential clinical use. For that, we used Z-inspection, an ethically aligned Trustworthy AI co-design methodology, and piloted the Council of Europe's Human Rights, Democracy, and the Rule of Law Impact Assessment for AI Systems (HUDERIA) (Council of Europe (CAI) 2025). The findings of this use-case comprise a comprehensive set of ethical, legal, and technical lessons learned. These insights, identified by a multidisciplinary team of experts (ethics, legal, human rights, computer science, and medical), serve as a framework for designers to navigate future challenges with this and other AI systems. The findings also provide a useful reference for researchers developing explainability frameworks for PRS in diverse clinical contexts.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces eXplainable PRS (XPRS), a visualization tool that applies Shapley Additive Explanations (SHAP) to decompose polygenic risk scores (PRS) for Type 2 Diabetes (T2DM) into gene-level and SNP contribution scores. It describes a co-design process using Z-inspection and the HUDERIA framework to evaluate the tool's trustworthiness across legal, medical, ethical, and technical dimensions, and reports lessons learned from a multidisciplinary expert team.
Significance. If the SHAP decompositions can be shown to yield biologically grounded explanations rather than model artifacts, the work would offer a practical framework for interpretable genomic AI tools and a reusable co-design template for trustworthy AI in clinical contexts, addressing a recognized gap in PRS interpretability.
major comments (2)
- [Abstract] Abstract: the central claim that XPRS 'decomposes PRSs into gene-level and single-nucleotide polymorphism (SNP) contribution scores via SHAP, providing granular insights into the specific genetic factors driving an individual's risk profile' is presented without any performance metrics, validation against known T2DM loci or pathways, comparison to existing PRS tools, or empirical checks (e.g., eQTL overlap or perturbation tests) that would establish the attributions are biologically faithful rather than statistical artifacts or linkage-disequilibrium effects.
- [XPRS Tool and Co-design Assessment] Description of the XPRS tool and co-design process: the manuscript asserts that the Z-inspection and HUDERIA assessments identified relevant trustworthiness issues, yet supplies no concrete details on how SHAP outputs were evaluated for accuracy, stability, or clinical meaningfulness, leaving the reported ethical/legal/technical lessons dependent on an untested interpretability premise.
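One of the empirical checks mentioned in the first comment, a perturbation test, can be sketched as follows. This is an illustrative assumption about what such a test might look like, not a procedure taken from the paper: each feature is replaced by its background mean and the resulting score change is compared with the attribution. The model, weights, and data are hypothetical.

```python
import numpy as np


def perturbation_deltas(score_fn, x, background):
    """Score change when each feature in turn is replaced by its background mean."""
    means = background.mean(axis=0)
    base = score_fn(x)
    deltas = np.empty(len(x))
    for j in range(len(x)):
        x_pert = x.copy()
        x_pert[j] = means[j]
        deltas[j] = base - score_fn(x_pert)
    return deltas


# Hypothetical additive score with three SNP weights.
beta = np.array([0.3, -0.1, 0.2])


def score(dosages):
    return float(dosages @ beta)


# Hypothetical background cohort; every column has mean 1.
background = np.array([
    [0, 1, 2],
    [2, 1, 0],
    [1, 1, 1],
    [1, 1, 1],
], dtype=float)
x = np.array([2.0, 0.0, 1.0])

# Attributions under the linear, independent-feature assumption.
attributions = beta * (x - background.mean(axis=0))
deltas = perturbation_deltas(score, x, background)

# Largest disagreement between attribution and observed score change.
max_gap = float(np.max(np.abs(deltas - attributions)))
print(attributions, deltas, max_gap)
```

For an additive score the gap is zero by construction, so this check only confirms that the attributions are faithful to the model. Establishing biological fidelity would need external evidence of the kind the referee lists, such as overlap with known loci.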
minor comments (1)
- [Abstract] The citation 'Council of Europe (CAI) 2025' appears to reference a future document; clarify the exact reference and publication status.
Simulated Author's Rebuttal
We thank the referee for their detailed and constructive feedback, which highlights important considerations for strengthening the manuscript's claims. We appreciate the recognition of the work's potential contribution to interpretable genomic AI and trustworthy AI co-design. We address each major comment below and will revise the manuscript accordingly to clarify scope, temper claims, and add limitations without misrepresenting the current study.
Point-by-point responses
- Referee: [Abstract] Abstract: the central claim that XPRS 'decomposes PRSs into gene-level and single-nucleotide polymorphism (SNP) contribution scores via SHAP, providing granular insights into the specific genetic factors driving an individual's risk profile' is presented without any performance metrics, validation against known T2DM loci or pathways, comparison to existing PRS tools, or empirical checks (e.g., eQTL overlap or perturbation tests) that would establish the attributions are biologically faithful rather than statistical artifacts or linkage-disequilibrium effects.
Authors: We agree that the abstract phrasing risks overstating the tool's outputs as biologically faithful insights without supporting validation. The manuscript's core focus is the XPRS visualization tool and the multidisciplinary co-design process (via Z-inspection and HUDERIA) to derive ethical, legal, and technical lessons for trustworthy AI, rather than a technical validation study of SHAP attributions. We will revise the abstract to qualify the claim, emphasizing decomposition for visualization and user exploration instead of 'granular insights into specific genetic factors.' We will also add a limitations section explicitly noting the absence of performance metrics, locus validation, comparisons, or perturbation tests, and recommending these as directions for future work. This addresses the concern while preserving the paper's emphasis on the co-design framework.
Revision: yes
- Referee: [XPRS Tool and Co-design Assessment] Description of the XPRS tool and co-design process: the manuscript asserts that the Z-inspection and HUDERIA assessments identified relevant trustworthiness issues, yet supplies no concrete details on how SHAP outputs were evaluated for accuracy, stability, or clinical meaningfulness, leaving the reported ethical/legal/technical lessons dependent on an untested interpretability premise.
Authors: The co-design evaluation applied Z-inspection and HUDERIA to assess the tool's overall trustworthiness dimensions (legal, medical, ethical, technical) in a hypothetical clinical context, drawing on expert input from multiple disciplines. It did not include quantitative technical validation of SHAP (e.g., accuracy or stability metrics), as that falls outside the paper's scope of reporting lessons from the process itself. We will revise the relevant sections to provide more concrete examples from the expert discussions, such as specific concerns raised about SHAP's potential for misleading attributions due to linkage disequilibrium and how these informed the derived lessons. We will also clarify that the lessons are process-oriented rather than dependent on proven biological fidelity of the attributions.
Revision: partial
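The linkage-disequilibrium concern raised in this exchange can be illustrated with a small simulation. It is a sketch under stated assumptions, not an analysis from the paper: one SNP is causal, a second SNP merely tags it, and weights are taken from single-SNP (marginal) regressions with no clumping or LD adjustment. All allele frequencies, effect sizes, and the 90% copy rate are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000

# Hypothetical setup: SNP A is causal; SNP B only tags A through LD.
snp_a = rng.binomial(2, 0.3, size=n).astype(float)

# B copies A for about 90% of individuals, otherwise it is drawn independently.
copy_mask = rng.random(n) < 0.9
snp_b = np.where(copy_mask, snp_a, rng.binomial(2, 0.3, size=n)).astype(float)

# The phenotype depends on A only, plus noise.
phenotype = 0.5 * snp_a + rng.normal(0.0, 1.0, size=n)


def marginal_beta(x, y):
    """Single-SNP regression slope, as in a marginal association test."""
    return float(np.cov(x, y)[0, 1] / np.var(x, ddof=1))


beta_a = marginal_beta(snp_a, phenotype)
beta_b = marginal_beta(snp_b, phenotype)
ld_r = float(np.corrcoef(snp_a, snp_b)[0, 1])

print(f"LD correlation r = {ld_r:.2f}")
print(f"marginal weight, causal SNP A = {beta_a:.2f}")
print(f"marginal weight, tag SNP B    = {beta_b:.2f}")
```

Because the tag SNP inherits a clearly non-zero marginal weight, any per-SNP attribution computed from such weights would credit it even though it has no effect of its own. Real PRS pipelines typically apply clumping or LD-aware shrinkage to reduce this; how XPRS handles it is not stated in the text reviewed here.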
Circularity Check
No circularity: descriptive development of XPRS tool and co-design assessment
Full rationale
The manuscript presents the XPRS visualization tool that applies SHAP to decompose PRS into gene- and SNP-level contributions for T2DM risk, followed by a co-design trustworthiness evaluation using the external Z-inspection framework and HUDERIA impact assessment. No equations, fitted parameters, predictions, or derivation steps appear in the provided text. The central claims rest on the described implementation process and multidisciplinary lessons learned rather than any reduction of outputs to inputs by construction, self-citation chains, or renamed empirical patterns. The work is therefore self-contained with no load-bearing circular steps.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: SHAP values provide meaningful and trustworthy explanations for PRS models in genomic data.
- Domain assumption: Z-inspection and HUDERIA frameworks are adequate to assess all relevant trustworthiness aspects for this AI system.
invented entities (1)
- XPRS tool: no independent evidence
Reference graph
Works this paper leans on
-
[1]
The National Assembly of Korea adopted the AI Basic Act in December 2024.6 With this passage, Korea became the second country with an AI-specific law that encompassed a comprehensive regulatory scope. Since the government revealed initial drafts of the enforcement decree and guidelines for this Act in September 2025, the contents of many provisions will s...
-
[2]
consolidates 19 different AI bills introduced to the National Assembly since May 2024, after the 22nd session of the National Assembly began. 7 See (NIA Korea,
-
[3]
Discussion and validation of the identified points in the full WG1. Human Rights Law findings Positive contributions WG1 first stressed the importance of considering what potential positive contribution the XPRS AI system could have towards the achievement of human rights and not just any potential threats or human rights violations. Two human rights were...
-
[4]
[32], prepared by the Ministry of Science and ICT (MIST) and adopted in December 2020, were designed to introduce general principles applicable across different fields. They consist of three principles (must be considered in the process of developing and using AI for humanity) and ten requirements (to follow those three principles, key requirements that m...
-
[5]
further complicates the transparency and availability of such data. Additionally, the availability of information generated by XPRS may also prove a liability that requires proper management. From these perspectives, accountability falls well outside the clinician-patient relationship. While WG1 explored some policies in Korea, the EU, and other jurisdict...
-
[6]
Sun H, Saeedi P, Karuranga S, et al., IDF Diabetes Atlas: Global, regional and country-level diabetes prevalence estimates for 2021 and projections for
-
[7]
Diabetes Res Clin Pract. 2022;183:109119. doi:10.1016/j.diabres.2021.109119
-
[8]
Wilson PWF, Meigs JB, Sullivan L, Fox CS, Nathan DM, D’Agostino RB. Prediction of incident diabetes mellitus in middle-aged adults: the Framingham Offspring Study. Arch Intern Med. 2007;167(10):1068-1074. doi:10.1001/archinte.167.10.1068
-
[9]
Hippisley-Cox J, Coupland C. Development and validation of QDiabetes-2018 risk prediction algorithm to estimate future risk of type 2 diabetes: cohort study. BMJ. 2017;359:j5019. doi:10.1136/bmj.j5019
-
[10]
Lee H, et al., Prediction model for type 2 diabetes mellitus and its association with mortality using machine learning in three independent cohorts from South Korea, Japan, and the UK: a model development and validation study. eClinicalMedicine. 2025;80:103069. doi:10.1016/j.eclinm.2025.103069
-
[11]
Mahajan A, et al., Multi-ancestry genetic study of type 2 diabetes highlights the power of diverse populations for discovery and translation. Nat Genet. 2022;54(5):560-572. doi:10.1038/s41588-022-01058-3
-
[12]
Ge T, Chen CY, Ni Y, Feng YA, Smoller JW. Development and validation of a trans-ancestry polygenic risk score for type 2 diabetes in diverse populations. Genome Med. 2022;14(1):70. doi:10.1186/s13073-022-01074-2
-
[13]
Khera AV, et al., Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat Genet. 2018;50(9):1219-1224. doi:10.1038/s41588-018-0183-z
-
[14]
Liu W, Zhuang Z, Wang W, Huang T, Liu Z. An Improved Genome-Wide Polygenic Score Model for Predicting the Risk of Type 2 Diabetes. Front Genet. 2021;12:632385. doi:10.3389/fgene.2021.632385
-
[15]
Mars N, et al., Polygenic and clinical risk scores and their impact on age at onset and prediction of cardiometabolic diseases and common cancers. Nat Med. 2020;26:549-557. doi:10.1038/s41591-020-0800-0
-
[16]
Kim NY, et al., The clinical relevance of a polygenic risk score for type 2 diabetes mellitus in the Korean population. Sci Rep. 2024;14:5749. doi:10.1038/s41598-024-55313-0
-
[17]
Liu Y, Wang H, Gao T, Yan Y, Wang T, Zheng C, Zeng P. Influence and role of polygenic risk score in the development of 32 complex diseases. J Glob Health. 2025;15:04071. Available at: https://pmc.ncbi.nlm.nih.gov/articles/PMC11893022/
-
[18]
Abdellaoui A, Yengo L, Verweij KJH, Visscher PM. 15 years of GWAS discovery: Realizing the promise. Am J Hum Genet. 2023;110(2):179-194. doi:10.1016/j.ajhg.2022.12.011
-
[19]
Zicari RV, et al., Z-Inspection: A Process to Assess Trustworthy AI. IEEE Transactions on Technology and Society. 2021;2(2):83-97. doi:10.1109/TTS.2021.3066209.
-
[20]
Koch S, Schmidtke J, Krawczak M, Caliebe A. Clinical utility of polygenic risk scores: a critical 2023 appraisal. J Community Genet. 2023;14(5):471-487. doi:10.1007/s12687-023-00645-z
-
[21]
Kim NY, Lee S. XPRS: a tool for interpretable and explainable polygenic risk score. Bioinformatics. 2025;41(4):btaf143. doi:10.1093/bioinformatics/btaf143
-
[22]
Steinthorsdottir V, Thorleifsson G, Reynisdottir I, et al., A variant in CDKAL1 influences insulin response and risk of type 2 diabetes. Nat Genet. 2007;39(6):770-775. doi:10.1038/ng2054
-
[23]
Zicari RV, et al., How to Assess Trustworthy AI in Practice. arXiv:2206.09887 [cs.CY]
-
[24]
Robertson LJ, Abbas R, Alici G, Munoz A, Michael K. Engineering-Based Design Methodology for Embedding Ethics in Autonomous Robots. Proc. IEEE. 2019;107(3):582-599. doi:10.1109/JPROC.2018.2889678
- [25]
-
[26]
European Parliament and Council of the European Union. Regulation (EU) 2024/1689 laying down harmonised rules on artificial intelligence (AI Act). Official Journal of the European Union, L series, 12 July 2024.
-
[27]
Available at: https://rm.coe.int/cai-2024-16rev2-methodology-for-the-risk-and-impact-assessment-of-arti/1680b2a09f
-
[28]
Council of Europe Framework Convention on Artificial Intelligence and Human Rights, Democracy and the Rule of Law. Vilnius, 5.IX.2024. Available at: https://rm.coe.int/1680afae3c.
- [30]
-
[31]
United Nations General Assembly. International Covenant on Civil and Political Rights (ICCPR). Adopted 16 December 1966, entered into force 23 March
-
[32]
Petersen N. Human Dignity, International Protection. Max Planck Encyclopedia of Public International Law. Oxford University Press. Available at: https://opil.ouplaw.com/display/10.1093/law:epil/9780199231690/law-9780199231690-e809
-
[33]
Schwarzerova J, Hurta M, Barton V, et al., A perspective on genetic and polygenic risk scores: advances and limitations and overview of associated tools. Brief Bioinform. 2024;25(3):bbae240. doi:10.1093/bib/bbae240
-
[34]
Jayasinghe D, Eshetie S, Beckmann K, Benyamin B, Lee SH. Advancements and limitations in polygenic risk score methods for genomic prediction: a scoping review. Hum Genet. 2024;143(12):1401-1431. doi:10.1007/s00439-024-02716-8
-
[35]
Wang Y, Tsuo K, Kanai M, Neale BM, Martin AR. Challenges and Opportunities for Developing More Generalizable Polygenic Risk Scores. Annu Rev Biomed Data Sci. 2022;5:293-320. doi:10.1146/annurev-biodatasci-111721-074830
-
[36]
Campbell JP, Mathenge C, Cherwek H, et al., Artificial Intelligence to Reduce Ocular Health Disparities. Transl Vis Sci Technol. 2021;10(3):19. doi:10.1167/tvst.10.3.19
-
[37]
Elhussein A, Baymuradov U, NYGC ALS Consortium, et al., A framework for sharing of clinical and genetic data for precision medicine applications. Nat Med. 2024;30:3578-3589. doi:10.1038/s41591-024-03239-5
-
[38]
Matsuyama K, Kurihara C, Crawley FP, Kerpel-Fronius S. Utilization of genetic information for medicines development and equitable benefit sharing. Front Genet. 2023;14:1085864. doi:10.3389/fgene.2023.1085864
-
[39]
Sorani MD, Yue JK, Sharma S, Manley GT, Ferguson AR; TRACK TBI Investigators. Genetic data sharing and privacy. Neuroinformatics. 2015;13(1):1-6. doi:10.1007/s12021-014-9248-z
-
[40]
doi:10.1101/2023.02.21.23286110.
-
[41]
Sun J, Wang Y, Folkersen L, et al., Translating polygenic risk scores for clinical use by estimating the confidence bounds of risk prediction. Nat Commun. 2021;12:5276. doi:10.1038/s41467-021-25014-7
-
[42]
Misra A, Truong B, Urbut SM, et al., Instability of high polygenic risk classification and mitigation by integrative scoring. Nat Commun. 2025;16:1584. doi:10.1038/s41467-025-56945-0
-
[43]
Rahnasto J. Genetic data are not always personal, disaggregating the identifiability and sensitivity of genetic data. J Law Biosci. 2023;10(2):lsad029. doi:10.1093/jlb/lsad029
-
[44]
Amorim M, Silva S, Machado H, et al., Benefits and Risks of Sharing Genomic Data for Research. Int J Environ Res Public Health. 2022;19(14):8788. doi:10.3390/ijerph19148788
-
[45]
Rehm HL, Page AJH, Smith L, et al., GA4GH: International policies and standards for data sharing across genomic research and healthcare. Cell Genomics. 2021;1(2):100029. doi:10.1016/j.xgen.2021.100029
-
[46]
Evangelou E, Warren HR, Mosen-Ansorena D, et al., Genetic analysis of over 1 million people identifies 535 new loci associated with blood pressure traits. Nat Genet. 2018;50:1412-1425. doi:10.1038/s41588-018-0205-x
-
[47]
Kuchenbaecker KB, Hopper JL, Barnes DR, et al., Risks of Breast, Ovarian, and Contralateral Breast Cancer for BRCA1 and BRCA2 Mutation Carriers. JAMA. 2017;317(23):2402-2416. doi:10.1001/jama.2017.7112
-
[48]
J Atheroscler Thromb. 2023;30(5):558-586. doi:10.5551/jat.CR005
-
[49]
Deignan JL, Astbury C, Cutting GR, et al., CFTR variant testing: a technical standard of ACMG. Genet Med. 2020;22(8):1288-1295. doi:10.1038/s41436-020-0822-5
-
[50]
Omrani N, Rivieccio G, Fiore U, Schiavone F, Agreda SG. To trust or not to trust? An assessment of trust in AI-based systems. Technol Forecast Soc Change. 2022;181:121763
-
[51]
Alhazmi F. Understanding Physician Attitudes Toward AI in Clinical Decision-Making: Cross-Sectional Study. JMIR Form Res. 2025;9:e79730
-
[52]
Tamori H, Yamashina H, Mukai M, et al., Acceptance of the use of artificial intelligence in medicine among Japan’s doctors and the public. JMIR Hum Factors. 2022;9(1):e24680
-
[53]
Oh S, Kim JH, Choi SW, et al., Physician confidence in artificial intelligence: an online mobile survey. J Med Internet Res. 2019;21(3):e12422
-
[54]
Shin D. The effects of explainability and causability on perception, trust, and acceptance: Implications for explainable AI. Int J Hum Comput Stud. 2021;146:102551
-
[55]
Herzog C, Blank S, Stahl BC. Towards trustworthy medical AI ecosystems. AI Soc. 2025;40(4):2119-2139.
-
[56]
Donohue KE, Gooch C, Katz A, et al., Pitfalls and challenges in genetic test interpretation. Clin Genet. 2021;99(5):638-649. doi:10.1111/cge.13917
-
[57]
Campbell EG, Clarridge BR, Gokhale M, et al., Data Withholding in Academic Genetics. JAMA. 2002;287(4):473-480. doi:10.1001/jama.287.4.473
-
[58]
Inouye M, Abraham G, Nelson CP, et al., Genomic Risk Prediction of Coronary Artery Disease in 480,000 Adults. J Am Coll Cardiol. 2018;72(16):1883-1893. doi:10.1016/j.jacc.2018.07.079
-
[59]
Antoniou AC, Spurdle AB, Sinilnikova OM, et al., Common breast cancer-predisposition alleles are associated with breast cancer risk in BRCA1 and BRCA2 mutation carriers. Am J Hum Genet. 2008;82(4):937-948. doi:10.1016/j.ajhg.2008.02.008
-
[60]
Kim ES. Deep learning and principal-agent problems of algorithmic governance: The new materialism perspective. Technol Soc. 2020;63:101378
-
[61]
Fahed AC, Wang M, Homburger JR, et al., Polygenic background modifies penetrance of monogenic variants for tier 1 genomic conditions. Nat Commun. 2020;11:3635. doi:10.1038/s41467-020-17374-3
-
[62]
Fritzsche MC, Akyuz K, Cano Abadia M, et al., Ethical layering in AI-driven polygenic risk scores. Front Genet. 2023;14:1098439. doi:10.3389/fgene.2023.1098439.
-
[63]
Zicari (Z-lead) …………………………………………………………………………………
Ethics oversight and/or approval Has the AI system already undergone an ethical assessment or other approval? If not - why not? If so, was this internal/external, volunteer/regulated, and what was covered? Did they get a waiver? Was there a clearing, but it was very light or internal and not considered sufficient? 55 APPENDIX B LOG ……………………………………………………………...