Multi-Site Health Research Integrating Complementary Data Sources: A Scoping Review of Statistical Inference Methods for Vertically Partitioned Data
Pith reviewed 2026-05-13 19:13 UTC · model grok-4.3
The pith
Vertical methods for statistical inference on partitioned health data rarely achieve equivalence to centralized results with efficient communication and strong privacy protection.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The scope of existing approaches enabling statistical inference for vertically partitioned data is still relatively limited. Most existing methods do not concurrently achieve results equivalent to centralized analyses, high communication efficiency, and guaranteed protection of individual-level data.
What carries the argument
Systematic extraction of three properties across identified methods: comparability with pooled analysis, efficiency of communication schemes, and confidentiality safeguards.
If this is right
- Linear and logistic regression are the most common inference tasks addressed by vertical methods.
- Equivalence to pooled analyses is not systematically demonstrated across all proposed methods.
- Most methods require multiple communication rounds between participating parties.
- Only a minority of articles provide explicit privacy assessments even though nearly all describe their approach as privacy-preserving.
- Applications of these methods to real-world vertically partitioned health data remain limited.
Where Pith is reading between the lines
- New vertical methods should be designed from the start to satisfy equivalence, low communication cost, and verifiable privacy in one package.
- Standardized reporting requirements for privacy guarantees would make it easier to compare methods across studies.
- The scarcity of real-data applications suggests a need to test promising methods on actual multi-site health datasets to move from theory to practice.
Load-bearing premise
That the database searches and citation screening captured a representative sample of all vertical methods and that the extracted properties were reported consistently enough across papers to support the overall characterization.
What would settle it
Identification of many additional vertical methods that simultaneously deliver exact equivalence to pooled results, require only one communication round, and include formal privacy proofs would indicate the scope is broader than reported.
Figures
read the original abstract
To address the multidimensional nature of health-related questions, advances in health research often require integrating information from various data sources within statistical analyses. When complementary information pertaining to the same set of individuals are distributed across different institutions, vertical methods make it possible to obtain analysis results without sharing or pooling individual-level data. To guide stakeholders toward a transparent use of vertical methods, this study aims to (1) Identify existing vertical methods enabling statistical inference; and (2) Characterize the methodological properties of these methods and the current extent of their use with health data. We conducted a scoping review using four interdisciplinary databases. We then systematically extracted the characteristics of identified vertical methods with respect to comparability with the pooled analysis, efficiency of communication schemes and confidentiality. We additionally screened studies that cited included articles to identify applications on vertically partitioned real-world health data. Among 2887 articles initially screened, 30 were included in the review. Inference for the linear and the logistic regression framework were the most frequent statistical inference tasks undertaken in proposed methods. Equivalence with the pooled analyses was not systematically addressed and most methods required multiple communications between participating parties. Almost all articles described their approach as privacy-preserving, although a minority provided privacy assessments. The scope of existing approaches enabling statistical inference for vertically partitioned data is still relatively limited. Most existing methods do not concurrently achieve results equivalent to centralized analyses, high communication efficiency, and guaranteed protection of individual-level data.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript is a scoping review that screened 2887 articles across four databases and included 30 papers on statistical inference methods for vertically partitioned data. It focuses on methods for linear and logistic regression, extracts properties related to equivalence with pooled analyses, communication efficiency, and privacy protection, and concludes that the scope of existing approaches remains limited, with most methods failing to concurrently achieve equivalence to centralized results, high communication efficiency, and guaranteed individual-level data protection.
Significance. If the property extraction is shown to be consistent and representative, the review would usefully map the landscape of vertical methods for multi-site health research and identify key gaps, particularly the frequent trade-offs among statistical fidelity, communication overhead, and privacy guarantees. This could inform both method developers and practitioners seeking to integrate complementary data sources without pooling raw records.
major comments (2)
- [Abstract and Results] Abstract and Results: The abstract states that equivalence 'was not systematically addressed' yet concludes that most methods do not achieve results equivalent to centralized analyses. Clarify in the methods or results how non-equivalence was determined when papers did not address it (e.g., whether absence of reporting was coded as failure), and provide the explicit coding rubric or supplementary extraction table used for the 30 included studies.
- [Results] Results: Only a minority of articles provided privacy assessments, yet almost all described their approach as privacy-preserving and the headline finding states that most methods lack 'guaranteed protection.' Specify the criteria applied to evaluate privacy guarantees (self-description versus formal assessment) and how this was operationalized across heterogeneous reporting standards.
minor comments (1)
- [Methods] Methods: Expand the description of the citation screening process and any quality or risk-of-bias considerations applied to the included methods, even if formal bias tools were not used.
Simulated Author's Rebuttal
We thank the referee for these constructive comments, which identify opportunities to improve transparency in our methods and results. We provide point-by-point responses below and will revise the manuscript to include explicit documentation of our coding process.
read point-by-point responses
-
Referee: [Abstract and Results] Abstract and Results: The abstract states that equivalence 'was not systematically addressed' yet concludes that most methods do not achieve results equivalent to centralized analyses. Clarify in the methods or results how non-equivalence was determined when papers did not address it (e.g., whether absence of reporting was coded as failure), and provide the explicit coding rubric or supplementary extraction table used for the 30 included studies.
Authors: We thank the referee for noting this ambiguity. Equivalence was coded only when papers explicitly addressed it (via theoretical proofs of identical estimators or empirical demonstrations of numerical equivalence to pooled results). Papers that omitted any discussion of equivalence were coded strictly as 'not addressed' and were not counted as non-equivalent. The headline conclusion that most methods fail to achieve equivalence is drawn from the subset of papers that did evaluate it (where the majority fell short) together with the observation that unverified methods cannot be assumed equivalent. We will add a supplementary extraction table and a detailed coding rubric in the revised methods section that lists, for each of the 30 studies, the exact status recorded for equivalence, number of communication rounds, and privacy evaluation. revision: yes
-
Referee: [Results] Results: Only a minority of articles provided privacy assessments, yet almost all described their approach as privacy-preserving and the headline finding states that most methods lack 'guaranteed protection.' Specify the criteria applied to evaluate privacy guarantees (self-description versus formal assessment) and how this was operationalized across heterogeneous reporting standards.
Authors: We agree that the distinction requires explicit statement. 'Guaranteed protection' was operationalized as the presence of a formal assessment (differential privacy bounds, cryptographic security proofs, or empirical attack-resistance evaluations). Self-descriptions such as 'privacy-preserving' or 'secure' without accompanying formal analysis were recorded but did not qualify as guaranteed protection. This rule was applied uniformly by examining the methods and results sections of each paper. We will revise the methods section to state these criteria verbatim and will add illustrative examples in the results to show how heterogeneous reporting was classified. revision: yes
Circularity Check
Scoping review of vertical inference methods contains no derivations or predictions
full rationale
The paper is a descriptive scoping review that screens 2887 articles, includes 30, and extracts methodological properties (equivalence to pooled analysis, communication rounds, privacy assessments) from external literature. It performs no mathematical derivations, statistical predictions, parameter fitting, or modeling steps. No equations, ansatzes, or self-citation chains are used to justify central claims; the characterization follows directly from the screening and coding protocol applied to independent papers. The skeptic concern about heterogeneous reporting affects the strength of the synthesis but does not create circularity within the paper's own logic.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/RealityFromDistinction.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Among 2887 articles initially screened, 30 were included... Equivalence with the pooled analyses was not systematically addressed and most methods required multiple communications... Almost all articles described their approach as privacy-preserving, although a minority provided privacy assessments.
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Three categories of techniques were used... encryption-based methods, methods based on vertically separable quantities and methods based on secure multiparty computations (SMC).
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
S. Dash, S.K. Shakyawar, M. Sharma, S. Kaushik, Big data in healthcare: management, analysis and future prospects, J. Big Data 6 (2019) 54. https://doi.org/10.1186/s40537-019-0217-0
-
[2]
N.V . Chawla, D.A. Davis, Bringing Big Data to Personalized Healthcare: A Patient- Centered Framework, J. Gen. Intern. Med. 28 (2013) 660–665. https://doi.org/10.1007/s11606-013-2455-8
-
[3]
F.K. Dankar, A. Ptitsyn, S.K. Dankar, The development of large-scale de-identified biomedical databases in the age of genomics—principles and challenges, Hum. Genomics 12 (2018) 19. https://doi.org/10.1186/s40246-018-0147-5
-
[4]
https://nuage.recherche.usherbrooke.ca/en/ (accessed February 9, 2026)
Banques NuAge – Banques de données et d’échantillons biologiques de l’Étude longitudinale québécoise sur la nutrition comme déterminant d’un vieillissement réussi, (n.d.). https://nuage.recherche.usherbrooke.ca/en/ (accessed February 9, 2026)
work page 2026
-
[5]
B. Pfitzner, N. Steckhan, B. Arnrich, Federated Learning in a Medical Context: A Systematic Literature Review, ACM Trans. Internet Technol. 21 (2021) 1–31. https://doi.org/10.1145/3412357
-
[6]
F. Camirand Lemyre, S. Lévesque, M.-P. Domingue, K. Herrmann, J.-F. Ethier, Distributed Statistical Analyses: A Scoping Review and Examples of Operational Frameworks Adapted to Health Analytics, JMIR Med. Inform. 12 (2024) e53622. https://doi.org/10.2196/53622
-
[7]
J. Bohn, W. Eddings, S. Schneeweiss, Conducting privacy-preserving multivariable propensity score analysis when patient covariate information is stored in separate locations, Am. J. Epidemiol. 185 (2017) 501–510. https://doi.org/10.1093/aje/kww155
-
[8]
C. Chang, Z. Bu, Q. Long, CEDAR: communication efficient distributed analysis for regressions, Biometrics 79 (2023) 2357–2369. https://doi.org/10.1111/biom.13786
-
[9]
J. Tong, J.M. Reps, C. Luo, Y . Lu, L. Li, J.M. Ramirez-Anguita, M.T. Brand, S.L. DuVall, T. Falconer, A.M. Fuentes, Unlocking efficiency in real-world collaborative studies: a multi-site international study with one-shot lossless GLMM algorithm, Npj Digit. Med. 8 (2025) 457
work page 2025
- [10]
- [11]
-
[12]
R. Torkzadehmahani, R. Nasirigerdeh, D.B. Blumenthal, T. Kacprowski, M. List, J. Matschinske, J. Spaeth, N.K. Wenke, J. Baumbach, Privacy-Preserving Artificial Intelligence Techniques in Biomedicine, Methods Inf. Med. 61 (2022) e12–e27. https://doi.org/10.1055/s-0041-1740630
-
[13]
Multimodal biomedical AI.Nature Medicine, 28(9):1773–1784, 2022
J.N. Acosta, G.J. Falcone, P. Rajpurkar, E.J. Topol, Multimodal biomedical AI, Nat. Med. 28 (2022) 1773–1784. https://doi.org/10.1038/s41591-022-01981-2
-
[15]
Y . Li, X. Jiang, S. Wang, H. Xiong, L. Ohno-Machado, VERTIcal Grid lOgistic regression (VERTIGO), J. Am. Med. Inform. Assoc. 23 (2016) 570–579. https://doi.org/10.1093/jamia/ocv146
-
[16]
C. Sun, L. Ippel, A. Dekker, M. Dumontier, J. van Soest, A systematic review on privacy-preserving distributed data mining, Data Sci. 4 (2021) 121–150. https://doi.org/10.3233/DS-210036
-
[17]
Y . Liu, Y . Kang, T. Zou, Y . Pu, Y . He, X. Ye, Y . Ouyang, Y .-Q. Zhang, Q. Yang, Vertical Federated Learning: Concepts, Advances, and Challenges, IEEE Trans. Knowl. Data Eng. 36 (2024) 3615–3634. https://doi.org/10.1109/TKDE.2024.3352628
-
[18]
A. Khan, M. ten Thij, A. Wilbik, Vertical federated learning: a structured literature review, Knowl. Inf. Syst. 67 (2025) 3205–3243. https://doi.org/10.1007/s10115-025- 02356-y
-
[19]
H. Chen, H. Wang, Q. Long, D. Jin, Y . Li, Advancements in Federated Learning: Models, Methods, and Privacy, ACM Comput. Surv. (2024) 3664650. https://doi.org/10.1145/3664650
-
[20]
Z.L. Teo, L. Jin, N. Liu, S. Li, D. Miao, X. Zhang, W.Y . Ng, T.F. Tan, D.M. Lee, K.J. Chua, J. Heng, Y . Liu, R.S.M. Goh, D.S.W. Ting, Federated machine learning in healthcare: A systematic review on clinical applications and technical architecture, Cell Rep. Med. 5 (2024) 101419. https://doi.org/10.1016/j.xcrm.2024.101419
-
[21]
Shmueli, To Explain or to Predict?, Stat
G. Shmueli, To Explain or to Predict?, Stat. Sci. 25 (2010) 289–310
work page 2010
-
[22]
G. James, D. Witten, T. Hastie, R. Tibshirani, An Introduction to Statistical Learning, Springer New York, New York, NY , 2013. https://doi.org/10.1007/978-1-4614-7138- 7
-
[23]
C. Luo, Md.N. Islam, N.E. Sheils, J. Buresh, J. Reps, M.J. Schuemie, P.B. Ryan, M. Edmondson, R. Duan, J. Tong, A. Marks-Anglin, J. Bian, Z. Chen, T. Duarte-Salles, S. Fernández-Bertolín, T. Falconer, C. Kim, R.W. Park, S.R. Pfohl, N.H. Shah, A.E. Williams, H. Xu, Y . Zhou, E. Lautenbach, J.A. Doshi, R.M. Werner, D.A. Asch, Y . Chen, DLMM as a lossless on...
-
[24]
Q. Wu, J.M. Reps, L. Li, B. Zhang, Y . Lu, J. Tong, D. Zhang, T. Lumley, M.T. Brand, M. Van Zandt, T. Falconer, X. He, Y . Huang, H. Li, C. Yan, G. Tang, A.E. Williams, F. Wang, J. Bian, B. Malin, G. Hripcsak, M.J. Schuemie, Y . Lu, S. Drew, J. Zhou, D.A. Asch, Y . Chen, COLA-GLM: collaborative one-shot and lossless algorithms of generalized linear models...
-
[26]
M.-P. Domingue, J.-F. Ethier, J.-P. Morissette, S. Lévesque, A. Burgun, F. Camirand Lemyre, Revisiting VERTIGO and VERTIGO-CI: Identifying confidentiality breaches and introducing a statistically sound, efficient alternative, (2025). https://doi.org/10.21203/rs.3.rs-6933988/v1
-
[27]
M. Bak, V .I. Madai, L.A. Celi, G.A. Kaissis, R. Cornet, M. Maris, D. Rueckert, A. Buyx, S. McLennan, Federated learning is not a cure-all for data ethics, Nat. Mach. Intell. 6 (2024) 370–372. https://doi.org/10.1038/s42256-024-00813-x
-
[28]
Z. Munn, M.D.J. Peters, C. Stern, C. Tufanaru, A. McArthur, E. Aromataris, Systematic review or scoping review? Guidance for authors when choosing between a systematic or scoping review approach, BMC Med. Res. Methodol. 18 (2018) 143. https://doi.org/10.1186/s12874-018-0611-x
-
[29]
H. Arksey, L. O’Malley, Scoping studies: towards a methodological framework, Int. J. Soc. Res. Methodol. 8 (2005) 19–32. https://doi.org/10.1080/1364557032000119616
-
[30]
Tricco, Erin Lillie, Wasifa Zarin, Kelly K
A.C. Tricco, E. Lillie, W. Zarin, K.K. O’Brien, H. Colquhoun, D. Levac, D. Moher, M.D.J. Peters, T. Horsley, L. Weeks, S. Hempel, E.A. Akl, C. Chang, J. McGowan, L. Stewart, L. Hartling, A. Aldcroft, M.G. Wilson, C. Garritty, S. Lewin, C.M. Godfrey, M.T. Macdonald, E.V . Langlois, K. Soares-Weiser, J. Moriarty, T. Clifford, Ö. Tunçalp, S.E. Straus, PRISMA...
-
[31]
C. Wohlin, Guidelines for snowballing in systematic literature studies and a replication in software engineering, in: Proc. 18th Int. Conf. Eval. Assess. Softw. Eng., ACM, London England United Kingdom, 2014: pp. 1–10. https://doi.org/10.1145/2601248.2601268
-
[32]
A. Carrera-Rivera, W. Ochoa, F. Larrinaga, G. Lasa, How-to conduct a systematic literature review: A quick guide for computer science research, MethodsX 9 (2022) 101895. https://doi.org/10.1016/j.mex.2022.101895
-
[33]
S.E. Fienberg, A.F. Karr, Y . Nardi, A.B. Slavkovic, Secure Logistic Regression with Multi-Party Distributed Databases, Proc. 56th Sess. ISI (2007)
work page 2007
-
[35]
M.-C. Liu, N. Zhang, A cryptographic solution to privacy-preserving two-party sign test computation on vertically partitioned data, in: Adv. Mater. Res., 2012: pp. 1249–
work page 2012
-
[39]
J.P. Reiter, C.N. Kohnen, A.F. Karr, X. Lin, A.P. Sanil, Secure Regression for Vertically Partitioned, Partially Overlapping Data, Proc. Am. Stat. Assoc. (2004)
work page 2004
- [40]
-
[42]
H. Kikuchi, C. Hamanaga, H. Yasunaga, H. Matsui, H. Hashimoto, Privacy- Preserving Multiple Linear Regression of Vertically Partitioned Real Medical Datasets, in: 2017 IEEE 31st Int. Conf. Adv. Inf. Netw. Appl. AINA, 2017: pp. 1042–
work page 2017
-
[44]
A.F. Karr, X. Lin, A.P. Sanil, J.P. Reiter, Secure statistical analysis of distributed databases, in: Stat. Methods Counterterrorism Game Theory Model. Syndr. Surveill. Biom. Authentication, 2006: pp. 237–261. https://doi.org/10.1007/0-387-35209- 0_14
-
[45]
A.F. Karr, X. Lin, A.P. Sanil, J.P. Reiter, Privacy-preserving analysis of vertically partitioned data using secure matrix products, J. Off. Stat. 25 (2009) 125–138
work page 2009
-
[47]
S.E. Fienberg, Y . Nardi, A.B. Slavković, Valid statistical analysis for logistic regression with multiple sources, in: Lect. Notes Comput. Sci. Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinforma., 2009: pp. 82–94. https://doi.org/10.1007/978- 3-642-10233-2_8
-
[50]
J. Kim, W. Li, T. Bath, X. Jiang, L. Ohno-Machado, VERTIcal Grid lOgistic regression with Confidence Intervals (VERTIGO-CI), Proc. – AMIA Jt. Summits Transl. Sci. 2021 (2021) 355–364
work page 2021
-
[52]
E.C. Hector, P.X.-K. Song, Doubly distributed supervised learning and inference with high-dimensional correlated outcomes, J Mach Learn Res 21 (2020) Article-173
work page 2020
- [56]
-
[63]
C.C. Aggarwal, P.S. Yu, Privacy-Preserving Data Mining: Models and Algorithms, Springer, 2008
work page 2008
-
[64]
Q. Her, T. Kent, Y . Samizo, A. Slavkovic, Y . Vilk, S. Toh, Automatable Distributed Regression Analysis of Vertically Partitioned Data Facilitated by PopMedNet: Feasibility and Enhancement Study, JMIR Med. Inform. 9 (2021) e21459. https://doi.org/10.2196/21459
- [65]
-
[66]
X. Huang, H. Kikuchi, C.-I. Fan, Privacy Preserved Spectral Analysis Using IoT mHealth Biomedical Data for Stress Estimation, in: 2018 IEEE 32nd Int. Conf. Adv. Inf. Netw. Appl. AINA, 2018: pp. 793–800. https://doi.org/10.1109/AINA.2018.00118
-
[67]
Y . Li, D. Feng, Y . Sui, H. Li, Y . Song, T. Zhan, G. Cicconetti, M. Jin, H. Wang, I. Chan, X. Wang, Analyzing longitudinal binary data in clinical studies, Contemp. Clin. Trials 115 (2022) 106717. https://doi.org/10.1016/j.cct.2022.106717. Multi-Site Health Research Integrating Complementary Data Sources: A Scoping Review of Statistical Inference Method...
-
[68]
RESEARCH QUESTION ....................................................................................................... 1
-
[69]
METHODS .............................................................................................................................. 1 2.1. Keywords ....................................................................................................................... 1 2.1.1. Limits and restrictions ........................................................
-
[70]
continuous vs binary covariates)
RESEARCH QUESTION 1.1.What existing distributed methods allow conducting statistical inference procedures with vertically partitioned data? Regarding: Methods for different statistical models; Methods for various settings in terms of privacy; Methods for different data settings (e.g. continuous vs binary covariates). 1.2.What are the characteristics spe...
-
[71]
METHODS 2.1. Keywords In accordance with our previous review, while making sure we capture all potential existing methods, two categories of keywords were targeted, corresponding to the two themes in the research question. Distributed analyses: o Partitioned Partitioned Federated Distributed Aggregated Privacy-preserving Multiparty Multipl...
-
[72]
Vertically partitioned data This paper/study presents a solution to perform inferential statistics with vertically partitioned data. Examples of papers that would not meet the criteria: the method is 5 presented on horizontally partitioned data only; or the method requires pooling all line- level data
-
[73]
e.g., the focus is not on estimation and/or confidence intervals and/or hypothesis testing
Inferential Statistics The paper/study does not specifically address inferential statistics (Confidence intervals, Hypothesis testing or Asymptotic normality result). e.g., the focus is not on estimation and/or confidence intervals and/or hypothesis testing
-
[74]
e.g., the study is solely an application of a previously developed and presented method
Methodologi cal contribution The paper/study does not provide a new methodological contribution. e.g., the study is solely an application of a previously developed and presented method
-
[75]
Published study The paper/study has not been published
-
[76]
Language The full-text is not available in English or French. 2.4. Data-charting A data-charting form was collectively developed, and extraction will be addressed manually. Data will be independently extracted by two authors (MPD and SL) for all studies. In the case of opposite opinions from the two initial reviewers, the reviewers will discuss and, if ne...
work page 2015
-
[77]
Accurate Estimation of Structural Equation Models with Remote Partitioned Data
Snoke J, Brick T, Slavković A. Accurate Estimation of Structural Equation Models with Remote Partitioned Data. In: Privacy in Statistical Databases. Cham: Springer International Publishing; 2016. p. 190–209
work page 2016
-
[78]
VERTIcal Grid lOgistic regression with Confidence Intervals (VERTIGO-CI)
Kim J, Li W, Bath T, Jiang X, Ohno-Machado L. VERTIcal Grid lOgistic regression with Confidence Intervals (VERTIGO-CI). Proc – AMIA Jt Summits Transl Sci. 2021;2021:355–64
work page 2021
-
[79]
Dai W, Jiang X, Bonomi L, Li Y , Xiong H, Ohno-Machado L. VERTICOX: Vertically Distributed Cox Proportional Hazards Model Using the Alternating Direction Method of Multipliers. IEEE Trans Knowl Data Eng. 2022;34(2):996–1010. doi:10.1109/TKDE.2020.2989301
-
[80]
Advances in Water Resources 166, 104264
Imakura A, Tsunoda R, Kagawa R, Yamagata K, Sakurai T. DC-COX: Data collaboration Cox proportional hazards model for privacy-preserving survival analysis on multiple parties. J Biomed Inform. 2023;137. doi:10.1016/j.jbi.2022.104264
-
[81]
Valid statistical analysis for logistic regression with multiple sources
Fienberg SE, Nardi Y , Slavković AB. Valid statistical analysis for logistic regression with multiple sources. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2009. p. 82–94. doi:10.1007/978-3-642-10233-2_8
-
[82]
Secure Logistic Regression with Multi-Party Distributed Databases
Fienberg SE, Karr AF, Nardi Y , Slavkovic AB. Secure Logistic Regression with Multi-Party Distributed Databases. Proc 56th Sess ISI. 2007
work page 2007
-
[83]
Secure logistic regression of horizontally and vertically partitioned distributed databases
Slavkovic AB, Nardi Y , Tibbits MM. Secure logistic regression of horizontally and vertically partitioned distributed databases. In: Proceedings - IEEE International Conference on Data Mining, ICDM. 2007. p. 723–8. doi:10.1109/ICDMW.2007.114
-
[84]
Providing accurate models across private partitioned data: Secure maximum likelihood estimation
Snoke J, Brick TR, Slavković A, Hunte MD. Providing accurate models across private partitioned data: Secure maximum likelihood estimation. Ann Appl Stat. 2018;12(2):877–914. doi:10.1214/18-AOAS1171
-
[85]
Privacy-preserving analysis of vertically partitioned data using secure matrix products
Karr AF, Lin X, Sanil AP, Reiter JP. Privacy-preserving analysis of vertically partitioned data using secure matrix products. J Off Stat. 2009;25(1):125–38
work page 2009
-
[86]
Secure statistical analysis of distributed databases
Karr AF, Lin X, Sanil AP, Reiter JP. Secure statistical analysis of distributed databases. In: Statistical Methods in Counterterrorism: Game Theory, Modeling, Syndromic Surveillance, and Biometric Authentication. 2006. p. 237–61. doi:10.1007/0-387-35209-0_14
-
[87]
Secure Regression for Vertically Partitioned, Partially Overlapping Data
Reiter JP, Kohnen CN, Karr AF, Lin X, Sanil AP. Secure Regression for Vertically Partitioned, Partially Overlapping Data. Proc Am Stat Assoc. 2004
work page 2004
-
[88]
Privacy-preserving multivariate statistical analysis: Linear regression and classification
Du W, Han YS, Chen S. Privacy-preserving multivariate statistical analysis: Linear regression and classification. In: SIAM Proceedings Series. 2004. p. 222–33. doi:10.1137/1.9781611972740.21 6
-
[89]
Liu MC, Zhang N. A solution to privacy-preserving two-party sign test on vertically partitioned data (P22NSTv) using data disguising techniques. In: ICNIT 2010 - 2010 International Conference on Networking and Information Technology. 2010. p. 526–34. doi:10.1109/ICNIT.2010.5508458
-
[90]
Privacy-preserving cloud-based statistical analyses on sensitive categorical data
Ricci S, Domingo-Ferrer J, Sánchez D. Privacy-preserving cloud-based statistical analyses on sensitive categorical data. In: International Conference on Modeling Decisions for Artificial Intelligence. 2016. p. 227–38. doi:10.1007/978-3-319-45656-0_19
-
[91]
Privacy-Preserving Randomized Controlled Trials: A Protocol for Industry Scale Deployment
Movahedi M, Case BM, Honaker J, Knox A, Li L, Li YP, et al. Privacy-Preserving Randomized Controlled Trials: A Protocol for Industry Scale Deployment. In: Proceedings of the 2021 on Cloud Computing Security Workshop. Association for Computing Machinery; 2021. p. 59–69. doi:10.1145/3474123.3486764
-
[92]
Privacy-preserving hypothesis testing for the analysis of epidemiological medical data
Kikuchi H, Sato T, Sakuma J. Privacy-preserving hypothesis testing for the analysis of epidemiological medical data. In: Proceedings - International Conference on Advanced Information Networking and Applications, AINA. 2014. p. 359–65. doi:10.1109/AINA.2014.46
-
[93]
Privacy-Preserving Multiple Linear Regression of Vertically Partitioned Real Medical Datasets
Kikuchi H, Hamanaga C, Yasunaga H, Matsui H, Hashimoto H. Privacy-Preserving Multiple Linear Regression of Vertically Partitioned Real Medical Datasets. In: 2017 IEEE 31st International Conference on Advanced Information Networking and Applications (AINA). 2017. p. 1042–9. doi:10.1109/AINA.2017.52
-
[94]
Kikuchi H, Hashimoto H, H. Yasunaga, Saito T. Scalability of Privacy-Preserving Linear Regression in Epidemiological Studies. In: 2015 IEEE 29th International Conference on Advanced Information Networking and Applications. 2015. p. 510–4. doi:10.1109/AINA.2015.229
-
[95]
Efficient Privacy-Preserving Logistic Regression with Iteratively Re-weighted Least Squares
Kikuchi H, Yasunaga H, Matsui H, Fan CI. Efficient Privacy-Preserving Logistic Regression with Iteratively Re-weighted Least Squares. In: 2016 11th Asia Joint Conference on Information Security (AsiaJCIS). 2016. p. 48–54. doi:10.1109/AsiaJCIS.2016.21
-
[96]
Secure Multiple Linear Regression Based on Homomorphic Encryption
Hall R, Fienberg SE, Nardi Y . Secure Multiple Linear Regression Based on Homomorphic Encryption. 2011
work page 2011
-
[97]
Privacy-preserving cooperative statistical analysis
Du W, Atallah MJ. Privacy-preserving cooperative statistical analysis. In: Proceedings - Annual Computer Security Applications Conference, ACSAC. 2001. p. 102–10. doi:10.1109/ACSAC.2001.991526
-
[98]
Fast, Privacy Preserving Linear Regression over Distributed Datasets based on Pre-Distributed Data
Cock M de, Dowsley R, Nascimento ACA, Newman SC. Fast, Privacy Preserving Linear Regression over Distributed Datasets based on Pre-Distributed Data. In: Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security. Association for Computing Machinery; 2015. p. 3–14. doi:10.1145/2808769.2808774
-
[99]
Privacy-preserving statistical analysis by exact logistic regression
Duverle DA, Kawasaki S, Yamada Y , Sakuma J, Tsuda K. Privacy-preserving statistical analysis by exact logistic regression. In: Proceedings - 2015 IEEE Security and Privacy Workshops, SPW 2015. 2015. p. 7–16. doi:10.1109/SPW.2015.14 7
-
[100]
Kamphorst B, Rooijakkers T, Veugen T, Cellamare M, Knoors D. Accurate training of the Cox proportional hazards model on vertically-partitioned data while preserving privacy. BMC Med Inform Decis Mak. 2022;22(1):49. Located at: 35209883. doi:10.1186/s12911-022-01771-3
-
[101]
F. Wu, B. Xi. Differentially Private Causal Inference Under Hierarchical Design. In: 2023 IEEE International Conference on Data Mining Workshops (ICDMW). 2023. p. 1390–9. doi:10.1109/ICDMW60847.2023.00177
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.