PASTA-4-PHT: A Pipeline for Automated Security and Technical Audits for the Personal Health Train

Ahmet Polat; Alexander Neumann; Jan Pennekamp; Johannes Lohmöller; Karl Kindermann; Laurenz Neumann; Martin Görz; Maximilian Jugl; Sascha Welten; Stefan Decker

arxiv: 2412.01275 · v1 · submitted 2024-12-02 · 💻 cs.CR · cs.DC



Pith reviewed 2026-05-23 08:03 UTC · model grok-4.3

classification 💻 cs.CR cs.DC

keywords Personal Health Train, security audit, automated pipeline, vulnerability detection, DevSecOps, GDPR compliance, privacy-preserving data processing, data processing transparency


The pith

An automated pipeline audits Personal Health Train code for security vulnerabilities before it runs on hospital data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces PASTA-4-PHT, a pipeline that automates security and technical audits for Personal Health Train applications. These applications move analysis code to sensitive data locations such as hospitals, which creates risks because external code interactions with private data are hard to inspect. The pipeline follows DevSecOps principles with multiple detection phases and was tested by inserting known vulnerabilities into one PHT instance and by running it on five real-world PHTs already used in studies. The evaluation shows the pipeline flags the inserted issues and uncovers problems in the real cases. This approach supplies documentation that meets GDPR needs for data processing transparency and cuts down on manual review work.

Core claim

The authors designed PASTA-4-PHT as an automated pipeline that incorporates multiple phases to detect vulnerabilities in PHT code. When tested by introducing vulnerabilities into a PHT and by auditing five real-world PHTs used in studies, the pipeline successfully identified potential vulnerabilities, demonstrating its applicability to real-world scenarios and its role in enhancing security and transparency.

What carries the argument

The PASTA-4-PHT pipeline, which automates vulnerability detection across multiple phases inspired by DevSecOps principles for PHT environments.
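The idea of a multi-phase audit can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the phase names (`secret_scan`, `network_call_scan`), the `Finding` record, and the string-matching heuristics are all hypothetical and chosen only to show how independent phases can feed one combined report.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Finding:
    phase: str     # name of the phase that raised the finding
    severity: str  # e.g. "low", "medium", "high"
    message: str


# A phase takes the PHT source files (path -> content) and returns findings.
Phase = Callable[[Dict[str, str]], List[Finding]]


def secret_scan(files: Dict[str, str]) -> List[Finding]:
    """Hypothetical phase: flag lines that look like hard-coded credentials."""
    findings = []
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if "password" in line.lower() and "=" in line:
                findings.append(Finding(
                    "secret_scan", "high",
                    f"{path}:{lineno} possible hard-coded credential"))
    return findings


def network_call_scan(files: Dict[str, str]) -> List[Finding]:
    """Hypothetical phase: flag code that appears to send data outward."""
    findings = []
    for path, text in files.items():
        if "requests.post(" in text:
            findings.append(Finding(
                "network_call_scan", "medium",
                f"{path} performs an outbound network call"))
    return findings


def run_audit(files: Dict[str, str], phases: List[Phase]) -> List[Finding]:
    """Run every phase in order and collect all findings into one report."""
    report: List[Finding] = []
    for phase in phases:
        report.extend(phase(files))
    return report


# Toy input standing in for a PHT's analysis code (invented for this sketch).
demo_files = {"train.py": 'db_password = "hunter2"\nrequests.post(url, data=rows)\n'}
report = run_audit(demo_files, [secret_scan, network_call_scan])
for finding in report:
    print(finding.phase, finding.severity, finding.message)
```

Keeping each phase as a plain function with the same signature means a phase can be added or removed without touching the others, and the combined report can double as the audit record.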

If this is right

The pipeline reduces manual overhead when researchers audit PHT code for security risks.
It supplies documentation that supports GDPR requirements for data management and protection.
It functions as a decision-making tool for assessing and recording potential vulnerabilities in data-processing code.
It contributes to greater security and transparency of data processing activities inside the PHT framework.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same phased audit structure could be reused for other code-to-data frameworks that move analysis across institutional boundaries.
Embedding the pipeline inside continuous-integration systems would allow checks to run automatically each time a PHT application is updated.
The generated audit records could serve as evidence in formal regulatory reviews beyond the initial assessment step.

Load-bearing premise

The deliberately introduced vulnerabilities and the five real-world PHTs are representative of the security risks in typical PHT deployments.

What would settle it

Running the pipeline on a PHT that contains a real security vulnerability the pipeline misses would show the identification claim does not hold.

Original abstract

With the introduction of data protection regulations, the need for innovative privacy-preserving approaches to process and analyse sensitive data has become apparent. One approach is the Personal Health Train (PHT) that brings analysis code to the data and conducts the data processing at the data premises. However, despite its demonstrated success in various studies, the execution of external code in sensitive environments, such as hospitals, introduces new research challenges because the interactions of the code with sensitive data are often incomprehensible and lack transparency. These interactions raise concerns about potential effects on the data and increase the risk of data breaches. To address this issue, this work discusses a PHT-aligned security and audit pipeline inspired by DevSecOps principles. The automated pipeline incorporates multiple phases that detect vulnerabilities. To thoroughly study its versatility, we evaluate this pipeline in two ways. First, we deliberately introduce vulnerabilities into a PHT. Second, we apply our pipeline to five real-world PHTs, which have been utilised in real-world studies, to audit them for potential vulnerabilities. Our evaluation demonstrates that our designed pipeline successfully identifies potential vulnerabilities and can be applied to real-world studies. In compliance with the requirements of the GDPR for data management, documentation, and protection, our automated approach supports researchers in their data-intensive work and reduces manual overhead. It can be used as a decision-making tool to assess and document potential vulnerabilities in code for data processing. Ultimately, our work contributes to increased security and overall transparency of data processing activities within the PHT framework.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

PASTA-4-PHT applies DevSecOps to PHT security audits with real-study tests, but the evaluation gives no metrics or breakdown so effectiveness stays unproven.


This paper describes PASTA-4-PHT, a pipeline that runs automated security checks on code used in Personal Health Train setups. The goal is to catch vulnerabilities before external analysis code touches sensitive hospital data, while also helping with GDPR documentation. They adapt standard DevSecOps steps into phases that scan for issues in the PHT context, which is a reasonable domain fit since PHT moves code to the data rather than the other way around. Testing on one PHT with injected flaws plus five real PHTs from prior medical studies shows they tried to move beyond toy examples. That part is practical and addresses a clear operational need in health data work.

The main weakness is the evaluation itself. The abstract claims the pipeline successfully finds vulnerabilities and works on real studies, yet it gives no detection rates, no list of issues actually found in the five cases, no false-positive counts, and no criteria for why those particular PHTs or injected problems represent typical risks. Without that, the success statement is hard to judge.

The paper is aimed at teams already running or planning PHT deployments who need a repeatable audit process. It could save manual review time if the pipeline holds up. I would send it for peer review so referees can ask for the missing quantitative results and a clearer taxonomy of covered vulnerabilities.

Referee Report

2 major / 0 minor

Summary. The paper presents PASTA-4-PHT, a DevSecOps-inspired automated pipeline for security and technical audits tailored to Personal Health Train (PHT) environments. It incorporates multiple detection phases and evaluates the pipeline via two experiments: deliberate injection of vulnerabilities into one PHT, and application to five real-world PHTs previously used in actual studies. The central claim is that the pipeline successfully identifies potential vulnerabilities, is applicable to real-world studies, reduces manual overhead, and supports GDPR-compliant documentation of data-processing risks.

Significance. If the evaluation claims hold with quantitative metrics and representative cases, the work could provide a practical tool for improving transparency and security in privacy-preserving health-data analysis frameworks, lowering the barrier for researchers to audit external code execution in sensitive environments such as hospitals.

major comments (2)

[Abstract; Evaluation section] The claim that the pipeline 'successfully identifies potential vulnerabilities' rests on two evaluations, yet no quantitative results, detection metrics (e.g., precision, recall, false-positive rates), phase-by-phase breakdown, or error analysis are supplied. Without these, the data cannot be assessed as supporting the central claim.
[Abstract; Evaluation section] The representativeness of the deliberately introduced vulnerabilities and the five real-world PHTs is not established. No selection criteria, vulnerability taxonomy, or comparison to typical PHT deployments (hospital data stations, standard containers, orchestration risks) are provided, so success on these instances does not establish general applicability to real-world studies.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback. We address each major comment below and will revise the manuscript to strengthen the evaluation.


Referee: [Abstract; Evaluation section] The claim that the pipeline 'successfully identifies potential vulnerabilities' rests on two evaluations, yet no quantitative results, detection metrics (e.g., precision, recall, false-positive rates), phase-by-phase breakdown, or error analysis are supplied. Without these, the data cannot be assessed as supporting the central claim.

Authors: We agree that quantitative metrics would strengthen the assessment. The experiments are case studies demonstrating identification in specific instances rather than a controlled benchmark with full ground truth. In revision we will add a phase-by-phase breakdown, report the number of vulnerabilities detected in each experiment, and include an error analysis for the injected-vulnerability case where detection rates can be computed. We will explicitly note that precision and recall cannot be calculated for the real-world PHTs due to the absence of exhaustive ground truth and will discuss this limitation. (Revision: yes)

Referee: [Abstract; Evaluation section] The representativeness of the deliberately introduced vulnerabilities and the five real-world PHTs is not established. No selection criteria, vulnerability taxonomy, or comparison to typical PHT deployments (hospital data stations, standard containers, orchestration risks) are provided, so success on these instances does not establish general applicability to real-world studies.

Authors: The five real-world PHTs were drawn from previously published studies that used actual hospital data stations. The injected vulnerabilities were chosen to cover representative container and orchestration risks. We will revise the Evaluation section to include explicit selection criteria, reference a standard vulnerability taxonomy such as CWE, and add a comparison of our cases against typical PHT deployments including hospital environments and standard container setups. This will better support claims of applicability. (Revision: yes)
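The detection metrics requested by the referee are straightforward to compute for the injected-vulnerability experiment, because the set of injected flaws serves as ground truth. The sketch below shows the arithmetic under that assumption. The function name `detection_metrics` and the identifiers in the example are hypothetical, and the numbers are made up for illustration; they are not results from the paper.

```python
def detection_metrics(injected, flagged):
    """Compare deliberately injected vulnerabilities with what the pipeline flagged."""
    injected, flagged = set(injected), set(flagged)
    tp = len(injected & flagged)   # injected and flagged
    fp = len(flagged - injected)   # flagged but not injected
    fn = len(injected - flagged)   # injected but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}


# Made-up identifiers for illustration only; not results from the paper.
injected_ids = {"INJ-1", "INJ-2", "INJ-3", "INJ-4"}
flagged_ids = {"INJ-1", "INJ-2", "INJ-3", "OTHER-9"}
metrics = detection_metrics(injected_ids, flagged_ids)
print(metrics)
```

One caveat applies even in the injected case: a flagged item that was not injected may be a genuine pre-existing issue rather than a false alarm, so the false-positive count is only as reliable as the manual triage behind it. For the five real-world PHTs, where no exhaustive ground truth exists, recall cannot be computed this way at all.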

Circularity Check

0 steps flagged

No circularity; descriptive engineering pipeline with no derivations or fitted claims


The paper describes a security audit pipeline and reports empirical results from deliberately injected flaws plus five real-world PHT instances. No equations, parameters, uniqueness theorems, or self-citation chains appear in the provided text. The central claim is a direct statement about observed behavior on the chosen test cases rather than a derived prediction that reduces to its own inputs. Representativeness concerns are validity issues, not circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an applied engineering paper proposing a software pipeline; it has no free parameters, axioms, or invented entities in a mathematical sense.




[1] [1]

Wirth, F., Meurers, T., Johns, M. et al. Privacy-preserving data sharing infras- tructures for medical research: systematization and comparison. BMC Medical Informatics and Decision Making 21, 242 (2021). URL https://doi.org/10.1186/ s12911-021-01602-x

work page 2021

[2] [2]

& Saadi, M

Abouelmehdi, K., Beni-Hssane, A., Khaloufi, H. & Saadi, M. Big data security and privacy in healthcare: A review. Procedia Computer Science 27 113, 73–80 (2017). URL https://www.sciencedirect.com/science/article/pii/ S1877050917317015. The 8th International Conference on Emerging Ubiqui- tous Systems and Pervasive Networks (EUSPN 2017) / The 7th Internati...

work page 2017

[3] [3]

Gaye, A. et al. Datashield: taking the analysis to the data, not the data to the analysis. International Journal of Epidemiology 43, 1929–1944 (2014)

work page 1929

[4] [4]

Beyan, O. et al. Distributed Analytics on Sensitive Medical Data: The Personal Health Train. Data Intelligence 2, 96–107 (2020). URL https://direct.mit.edu/ dint/article/2/1-2/96-107/9997

work page 2020

[5] [5]

Choudhury, A., Janssen, E., Bongers, B. et al. Colorectal cancer health and care quality indicators in a federated setting using the personal health train. BMC Medical Informatics and Decision Making 24, 121 (2024). URL https: //doi.org/10.1186/s12911-024-02526-y

work page doi:10.1186/s12911-024-02526-y 2024

[6] [6]

Kim, J., Lim, M., Kim, K. et al. Continual learning framework for a multi- center study with an application to electrocardiogram. BMC Medical Infor- matics and Decision Making 24, 67 (2024). URL https://doi.org/10.1186/ s12911-024-02464-9

work page 2024

[7] [7]

Welten, S. et al. A Privacy-Preserving Distributed Analytics Platform for Health Care Data. Methods of Information in Medicine (2022)

work page 2022

[8] [8]

Budin-Ljøsne, I. et al. DataSHIELD: An Ethically Robust Solution to Multiple-Site Individual-Level Data Analysis. Public Health Genomics 18, 87–96 (2014). URL https://doi.org/10.1159/000368959. eprint: https://karger.com/phg/article-pdf/18/2/87/3426851/000368959.pdf

work page doi:10.1159/000368959 2014

[9] [9]

Welten, S. et al. DAMS: A Distributed Analytics Metadata Schema. Data Intelligence 3, 528–547 (2021). URL https://doi.org/10.1162/dint a 00100

work page doi:10.1162/dint 2021

[10] [10]

van Soest, J. et al. Using the personal health train for automated and privacy- preserving analytics on vertically partitioned data 247, 581–585 (2018)

work page 2018

[11] [11]

O., Ferreira Pires, L., Graciano Martinez, V., Rebelo Moreira, J

Bonino da Silva Santos, L. O., Ferreira Pires, L., Graciano Martinez, V., Rebelo Moreira, J. L. & Silva Souza Guizzardi, R. Personal health train archi- tecture with dynamic cloud staging. SN Computer Science 4, 14 (2022). URL https://doi.org/10.1007/s42979-022-01422-4

work page doi:10.1007/s42979-022-01422-4 2022

[12] [12]

de Arruda Botelho Herr, M. et al. Bringing the algorithms to the data – secure distributed medical analytics using the personal health train (pht-medic) (2022). 2212.03481. 28

work page arXiv 2022

[13] [13]

& Moore, G

Dempsey, K., Takamura, E., Eavy, P. & Moore, G. Automation Support for Security Control Assessments: Software Vulnerability Management. Tech. Rep. NIST Internal or Interagency Report (NISTIR) 8011 Vol. 4, National Institute of Standards and Technology (2020). URL https://csrc.nist.gov/pubs/ir/8011/v4/ final

work page 2020

[14] [14]

Holzinger, G

Cheng, L., Liu, F. & Yao, D. D. Enterprise data breach: causes, challenges, pre- vention, and future directions. WIREs Data Mining and Knowledge Discovery 7, e1211 (2017). URL https://wires.onlinelibrary.wiley.com/doi/abs/10.1002/widm. 1211

work page doi:10.1002/widm 2017

[15] [15]

& Zannone, N

Elahi, G., Yu, E. & Zannone, N. A vulnerability-centric requirements engineering framework: analyzing security attacks, countermeasures, and requirements based on vulnerabilities. Requirements Engineering 15, 41–62 (2010)

work page 2010

[16] [16]

Brilhante, M. d. F., Pestana, D., Pestana, P. & Rocha, M. L. Measuring the risk of vulnerabilities exploitation. AppliedMath 4, 20–54 (2024). URL https: //www.mdpi.com/2673-9909/4/1/2

work page 2024

[17] [17]

& Von dem Bussche, A

Voigt, P. & Von dem Bussche, A. The eu general data protection regulation (gdpr). A Practical Guide, 1st Ed., Cham: Springer International Publishing 10, 10–5555 (2017)

work page 2017

[18] [18]

Software source code identification: Use cases and identifier schemes for persistent software source code identification (2020)

Research Data Alliance/FORCE11 Software Source Code Identification WGet al. Software source code identification: Use cases and identifier schemes for persistent software source code identification (2020). October 2020

work page 2020

[19] [19]

Chue Hong, N. P. et al. FAIR Principles for Research Software version 1.0 (FAIR4RS Principles v1.0) (2022). URL https://doi.org/10.15497/RDA00068

work page doi:10.15497/rda00068 2022

[20] [20]

Zhang, C., Choudhury, A., Volmer, L. et al. Secure and private healthcare ana- lytics: A feasibility study of federated deep learning with personal health train. Research Square 1 (2023). URL https://doi.org/10.21203/rs.3.rs-3158418/v1. PREPRINT (Version 1) available at Research Square

work page doi:10.21203/rs.3.rs-3158418/v1 2023

[21] [21]

Shi, Z., Zhovannik, I., Traverso, A. et al. Distributed radiomics as a signature validation study using the personal health train infrastructure. Sci Data 6, 218 (2019). URL https://doi.org/10.1038/s41597-019-0241-0

work page doi:10.1038/s41597-019-0241-0 2019

[22] [22]

Beyan, O. et al. Distributed Analytics on Sensitive Medical Data: The Personal Health Train. Data Intelligence 2, 96–107 (2020). URL https://doi.org/10.1162/ dint a 00032

work page 2020

[23] [23]

sawelt/PASTA-4-PHT: New Release (2024)

KarlKindermann. sawelt/PASTA-4-PHT: New Release (2024). URL https://doi. org/10.5281/zenodo.11505228. 29

work page doi:10.5281/zenodo.11505228 2024

[24] [24]

& Serrano, N

Ebert, C., Gallardo, G., Hernantes, J. & Serrano, N. Devops. IEEE Software 33, 94–100 (2016)

work page 2016

[25] [25]

& Oivo, M

Lwakatare, L., Kuvaja, P. & Oivo, M. Lassenius, C., Dingsøyr, T. & Paasivaara, M. (eds) Dimensions of devops . (eds Lassenius, C., Dingsøyr, T. & Paasivaara, M.) Agile Processes in Software Engineering and Extreme Programming , Vol. 212 of Lecture Notes in Business Information Processing (Springer, Cham, 2015)

work page 2015

[26] [26]

& Colomo-Palacios, R

Myrbakken, H. & Colomo-Palacios, R. Mas, A., Mesquida, A., O’Connor, R., Rout, T. & Dorling, A. (eds) Devsecops: A multivocal literature review . (eds Mas, A., Mesquida, A., O’Connor, R., Rout, T. & Dorling, A.) Software Pro- cess Improvement and Capability Determination , Vol. 770 of Communications in Computer and Information Science (Springer, Cham, 2017)

work page 2017

[27] [27]

N., Zahedi, M., Babar, M

Rajapakse, R. N., Zahedi, M., Babar, M. A. & Shen, H. Challenges and solutions when adopting devsecops: A systematic review. Information and Software Tech- nology 141, 106700 (2022). URL https://www.sciencedirect.com/science/article/ pii/S0950584921001543

[28]

Combe, T., Martin, A. & Di Pietro, R. To docker or not to docker: A security perspective. IEEE Cloud Computing 3, 54–62 (2016)

[29]

Pan, Y. Interactive application security testing 558–561 (2019)

[30]

Mateo Tudela, F., Bermejo Higuera, J.-R., Bermejo Higuera, J., Sicilia Montalvo, J.-A. & Argyros, M. I. On combining static, dynamic and interactive analysis security testing tools to improve owasp top ten security vulnerability detection in web applications. Applied Sciences 10 (2020). URL https://www.mdpi.com/2076-3417/10/24/9119

[31]

Felderer, M. et al. Security Testing: A Survey (Elsevier, Cambridge, MA, USA, 2016)

[32]

Scarfone, K. & Mell, P. An analysis of cvss version 2 vulnerability scoring 516–525 (2009)

[33]

Wist, K., Helsem, M. & Gligoroski, D. Vulnerability analysis of 2500 docker hub images 307–327 (2021)

[34]

Stouffer, K. et al. Guide to industrial control systems (ics) security. NIST Special Publication 800-82r3, National Institute of Standards and Technology, Gaithersburg, MD (2023). URL https://doi.org/10.6028/NIST.SP.800-82r3

[35]

Yamaguchi, F., Lottmann, M. & Rieck, K. Generalized vulnerability extrapolation using abstract syntax trees 359–368 (2012). URL https://doi.org/10.1145/2420950.2421003

[36]

Dahlmanns, M., Sander, C., Decker, R. & Wehrle, K. Secrets revealed in container images: An internet-wide study on occurrence and impact 797–811 (2023). URL https://doi.org/10.1145/3579856.3590329

[37]

Ram, K. Git can facilitate greater reproducibility and increased transparency in science. Source Code for Biology and Medicine 8, 7 (2013). URL https://doi.org/10.1186/1751-0473-8-7

[38]

Neamtiu, I., Foster, J. S. & Hicks, M. Understanding source code evolution using abstract syntax tree matching. SIGSOFT Softw. Eng. Notes 30, 1–5 (2005). URL https://doi.org/10.1145/1082983.1083143

[39]

Shu, R., Gu, X. & Enck, W. A study of security vulnerabilities on docker hub 269–280 (2017). URL https://doi.org/10.1145/3029806.3029832

[40]

Welten, S., Weber, S., Holt, A., Beyan, O. & Decker, S. Will it run?—a proof of concept for smoke testing decentralized data analytics experiments. Frontiers in Medicine 10, 1305415 (2024). URL https://doi.org/10.3389/fmed.2023.1305415

[41]

Weber, S. PADME-PHT/playground: v 1.0.0 (2024). URL https://zenodo.org/doi/10.5281/zenodo.11184159

[42]

Welten, S. et al. Multi-institutional breast cancer detection using a secure on-boarding service for distributed analytics. Applied Sciences 12 (2022). URL https://www.mdpi.com/2076-3417/12/9/4336

[43]

Mou, Y. et al. Distributed skin lesion analysis across decentralised data sources. Studies in Health Technology and Informatics 281, 352–356 (2021)

[44]

Chassang, G. The impact of the eu general data protection regulation on scientific research. Ecancermedicalscience 11, 709 (2017)

[45]

Bieker, F., Friedewald, M., Hansen, M., Obersteller, H. & Rost, M. A process for data protection impact assessment under the european general data protection regulation 21–37 (2016)

[46]

Sirur, S., Nurse, J. R. & Webb, H. Are we there yet? understanding the challenges faced in complying with the general data protection regulation (gdpr) 88–95 (2018). URL https://doi.org/10.1145/3267357.3267368

[47]

Hasselbring, W., Carr, L., Hettrick, S., Packer, H. & Tiropanis, T. From FAIR research data toward FAIR and open research software. it - Information Technology 62, 39–47 (2020). URL https://www.degruyter.com/document/doi/10.1515/itit-2019-0040/html

[48]

Lamprecht, A.-L. et al. Towards FAIR principles for research software. Data Science 3, 37–59 (2020). URL https://content.iospress.com/articles/data-science/ds190026#ref001

[49]

Barker, M., Chue Hong, N., Katz, D. et al. Introducing the fair principles for research software. Sci Data 9, 622 (2022). URL https://doi.org/10.1038/s41597-022-01710-x

[50]

Tunde-Onadele, O., He, J., Dai, T. & Gu, X. A study on container vulnerability exploit detection 121–127 (2019)

[51]

Javed, O. & Toor, S. An evaluation of container security vulnerability detection tools 95–101 (2021). URL https://doi.org/10.1145/3481646.3481661
