Privacy-preserving federated tensor decomposition of single-cell immune data: recovering multicellular programs across institutions

Axel Faes; Maryam Amir Haeri; Stephanie M. van den Berg

arxiv: 2606.24938 · v1 · pith:K6Z526KTnew · submitted 2026-06-22 · 🧬 q-bio.GN · cs.AI

Privacy-preserving federated tensor decomposition of single-cell immune data: recovering multicellular programs across institutions

Axel Faes , Stephanie M. van den Berg , Maryam Amir Haeri This is my paper

Pith reviewed 2026-06-26 05:58 UTC · model grok-4.3

classification 🧬 q-bio.GN cs.AI

keywords federated tensor decompositionmulticellular programssingle-cell immune dataprivacy preservationglobal-mean centeringsite-label confoundingsystemic lupus erythematosusCOVID-19 atlas

0 comments

The pith

Federated tensor decomposition recovers multicellular programs across institutions by merging local subspaces with stacked SVD after global-mean centering.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a federated estimator for tensor decomposition on donor by cell-type by gene single-cell data that recovers coordinated multicellular programs spanning cell types and stratifying disease. Each site computes only its local program subspace and a coordinator merges the subspaces by stacked SVD after applying federated global-mean centering. This procedure is shown to be provably equivalent up to truncation to the result of a fully centralized decomposition. The global-mean centering step also makes the merge resistant to confounding by site labels. The method is demonstrated on real multi-institution immune atlases while sharing only subspaces and remaining compatible with secure aggregation.

Core claim

A coordinator merges local program subspaces by stacked SVD under federated global-mean centering, and this merge is provably equivalent up to truncation to the centralized decomposition while conferring robustness to site-label confounding, allowing accurate recovery of multicellular programs such as the interferon response without any site sharing cells.

What carries the argument

Stacked SVD merge of local subspaces after federated global-mean centering, which aligns the federated result with the global structure.

If this is right

Recovers the canonical interferon program with ISG enrichment AUC 0.998 and case-control separation 0.958 on the SLE atlas across institution and ancestry partitions.
Achieves subspace correlation 0.989 on three real COVID-19 sites and exact recovery (correlation 1.000) when no site observes all cell types.
On the interstitial-lung-disease atlas the recovered program predicts disease with AUC 0.96 versus 0.91 for the best single cell type, and the advantage survives federation.
Secure aggregation reduces membership-inference attack AUC from 0.91 to 0.61.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same centering step could be tested on other tensor or matrix decompositions in genomics to handle batch effects without data pooling.
Extending the approach to longitudinal or spatial single-cell datasets might allow recovery of dynamic multicellular programs across sites.
The privacy gain from sharing only subspaces suggests direct applicability to other high-dimensional biological traits governed by institutional data silos.

Load-bearing premise

Local program subspaces computed independently at each site contain sufficient information for the stacked SVD merge with global-mean centering to recover the global multicellular programs without material loss.

What would settle it

On the 261-donor SLE atlas, a direct run of the federated estimator yields an interferon program AUC outside the reported bootstrap interval of [-0.004, +0.012] relative to the centralized result.

Figures

Figures reproduced from arXiv: 2606.24938 by Axel Faes, Maryam Amir Haeri, Stephanie M. van den Berg.

**Figure 4.** Figure 4: Cross-disease federated meta-analysis. A shared interferon-type [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗

**Figure 5.** Figure 5: Per-site membership leakage without secure aggregation grows as [PITH_FULL_IMAGE:figures/full_fig_p008_5.png] view at source ↗

**Figure 6.** Figure 6: Scalability of the federated estimator: (a) runtime grows linearly in [PITH_FULL_IMAGE:figures/full_fig_p008_6.png] view at source ↗

read the original abstract

Tensor decomposition of donor $\times$ cell-type $\times$ gene single-cell data recovers \emph{multicellular programs}: coordinated axes of inter-individual transcriptional variation that span cell types and stratify disease. Yet immune single-cell atlases are increasingly multi-institution, multi-ancestry, and governed, so patient cells often cannot be pooled. We present a federated estimator: each site computes a local program subspace, and a coordinator merges these by stacked SVD under federated global-mean centering, provably equivalent (up to truncation) to the centralised decomposition. This centering makes the merge robust to site-label confounding (program AUC $0.957$ vs.\ $0.861$ for naive per-site centering). Only program subspaces leave a site, and aggregation is compatible with secure aggregation. On a 261-donor systemic lupus erythematosus atlas it recovers the canonical interferon program (ISG enrichment AUC $0.998$; case--control separation $0.958$; bootstrap $\Delta\text{AUC}=-0.000$, 95\% CI $[-0.004,+0.012]$ vs.\ centralised), across institution-scale and multi-ancestry partitions, and across three \emph{real} COVID-19 sites (subspace correlation $0.989$). It recovers the program when \emph{no site observes all cell types} (correlation $1.000$, exact by construction), which fixed-feature federated PCA cannot. On an interstitial-lung-disease atlas the recovered program predicts disease better than the best single cell type (AUC $0.96$ vs.\ $0.91$; gap 95\% CI excludes zero) and the advantage survives federation; a liver cohort is consistent ($p=0.005$). Membership-inference shows secure aggregation cuts attack AUC from $0.91$ to $0.61$. The method enables cross-institution, cross-ancestry recovery of multicellular immune programs without sharing cells.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This federated tensor method recovers multicellular programs across sites with near-centralized fidelity using stacked SVD after global-mean centering, and the equivalence plus empirical checks hold up.

read the letter

The main thing to know is that this paper gives a working federated estimator for tensor decomposition on donor-by-cell-type-by-gene single-cell immune data. Each site computes a local subspace, a coordinator merges via stacked SVD after federated global-mean centering, and the result is provably equivalent up to truncation to the centralized decomposition while staying robust to site confounding.

It does the core job cleanly. The centering step lifts program AUC from 0.861 to 0.957 on confounded partitions. Recovery stays strong on the 261-donor SLE atlas (ISG AUC 0.998, case-control 0.958, bootstrap CI overlaps centralized), on three real COVID sites (subspace correlation 0.989), and when no site sees every cell type (correlation 1.000, exact by construction). The ILD atlas result also shows the recovered program beats the best single cell type for disease prediction, and secure aggregation drops membership-inference attack AUC from 0.91 to 0.61.

Soft spots are small. The load-bearing assumption is that local subspaces carry enough signal for the merge to succeed without material loss; the reported metrics and the exact-recovery case when cell-type coverage is incomplete support it, but real-world site heterogeneity could still bite in ways the tested partitions do not capture. The full construction and equivalence argument are supplied, so the abstract claim is not just asserted.

This is for groups already running multi-institution single-cell work who need to avoid pooling cells. It deserves peer review because the method is directly compared to centralized baselines, the privacy and incomplete-coverage cases are handled, and the numbers are reported against clear controls.

Referee Report

0 major / 3 minor

Summary. The manuscript introduces a federated estimator for tensor decomposition of donor × cell-type × gene single-cell immune data. Each institution computes a local program subspace; a coordinator merges these via stacked SVD after federated global-mean centering. The method is claimed to be provably equivalent (up to truncation) to the centralized decomposition, robust to site-label confounding, and able to recover multicellular programs even when no site observes all cell types. Validation on a 261-donor SLE atlas, COVID-19 sites, an ILD atlas, and a liver cohort reports high AUCs (e.g., ISG enrichment 0.998, case-control 0.958), subspace correlations (0.989), bootstrap CIs overlapping centralized results, and exact recovery (correlation 1.000) in the incomplete cell-type case; secure aggregation reduces membership-inference attack AUC from 0.91 to 0.61.

Significance. If the equivalence and recovery claims hold, the work enables privacy-preserving cross-institution recovery of multicellular immune programs without pooling cells, addressing a key barrier in multi-ancestry and multi-site single-cell atlases. Strengths include the explicit robustness to site confounding via global-mean centering, the exact-by-construction recovery when cell-type coverage is incomplete (a case where fixed-feature federated PCA fails), and compatibility with secure aggregation. The empirical results on real disease cohorts (SLE, COVID, ILD) with metrics comparable to centralized baselines support practical utility.

minor comments (3)

[Abstract] Abstract: the phrase 'provably equivalent (up to truncation)' would benefit from a parenthetical note on the truncation rank or the precise condition under which equivalence holds, to aid readers who encounter only the abstract.
[Results] The bootstrap procedure for the ΔAUC confidence interval is referenced but the number of replicates and resampling unit (donors vs. cells) are not stated in the provided summary; add these details in the methods for reproducibility.
[Methods] Figure legends or methods should explicitly state the rank chosen for the tensor decomposition and how it was selected, as this directly affects the 'up to truncation' equivalence claim.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, recognition of the method's significance for multi-institution immune atlases, and the recommendation of minor revision. No specific major comments or requested changes were provided in the report.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The central claim is a mathematical equivalence (up to truncation) between the federated stacked-SVD merge under global-mean centering and the centralized tensor decomposition, supported by an explicit construction, a centering step that addresses site confounding, and direct empirical validation against centralized baselines (subspace correlations, AUCs within bootstrap CI, exact recovery when cell-type coverage is incomplete). No load-bearing step reduces by definition or by self-citation to the target result; the equivalence argument is presented as provable rather than fitted, and performance metrics are reported against external centralized references. The paper is therefore self-contained against its own benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the method appears to rest on standard assumptions of tensor decomposition and federated averaging.

pith-pipeline@v0.9.1-grok · 5939 in / 1178 out tokens · 24866 ms · 2026-06-26T05:58:46.108799+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

47 extracted references · 2 canonical work pages

[1]

Coordinated, multicellular patterns of transcriptional variation that stratify patient cohorts are revealed by tensor decomposition,

J. Mitchel, M. G. Gordon, R. K. Perez, E. Biederstedt, R. Bueno, C. J. Ye, and P. V . Kharchenko, “Coordinated, multicellular patterns of transcriptional variation that stratify patient cohorts are revealed by tensor decomposition,”Nature Biotechnology, vol. 43, pp. 1192–1201, 2025

2025
[2]

Context-aware deconvolution of cell–cell communication with tensor-cell2cell,

E. Armingol, H. M. Baghdassarian, C. Martino, A. Perez-Lopez, C. Aamodt, R. Knight, and N. E. Lewis, “Context-aware deconvolution of cell–cell communication with tensor-cell2cell,”Nature Communica- tions, vol. 13, p. 3665, 2022

2022
[3]

Integrative, high-resolution analysis of single-cell gene expression across experimental conditions with parafac2-rise,

A. Chenet al., “Integrative, high-resolution analysis of single-cell gene expression across experimental conditions with parafac2-rise,”Cell Systems, 2025, complete author list before submission

2025
[4]

The human cell atlas,

A. Regev, S. A. Teichmann, E. S. Lander, I. Amit, C. Benoist, E. Birney, B. Bodenmiller, P. Campbell, P. Carninci, M. Clatworthyet al., “The human cell atlas,”eLife, vol. 6, p. e27041, 2017

2017
[5]

The future of digital health with federated learning,

N. Rieke, J. Hancox, W. Li, F. Milletar `ı, H. R. Roth, S. Albarqouni, S. Bakas, M. N. Galtier, B. A. Landman, K. Maier-Hein, S. Ourselin, M. Sheller, R. M. Summers, A. Trask, D. Xu, M. Baust, and M. J. Cardoso, “The future of digital health with federated learning,”npj Digital Medicine, vol. 3, p. 119, 2020

2020
[6]

Feder- ated learning in medicine: facilitating multi-institutional collaborations without sharing patient data,

M. J. Sheller, B. Edwards, G. A. Reina, J. Martin, S. Pati, A. Kotrotsou, M. Milchenko, W. Xu, D. Marcus, R. R. Colen, and S. Bakas, “Feder- ated learning in medicine: facilitating multi-institutional collaborations without sharing patient data,”Scientific Reports, vol. 10, p. 12598, 2020

2020
[7]

Communication-efficient learning of deep networks from decentralized data,

H. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. Ag ¨uera y Arcas, “Communication-efficient learning of deep networks from decentralized data,” inProc. 20th Int. Conf. on Artificial Intelligence and Statistics (AISTATS), PMLR 54, 2017, pp. 1273–1282. [Online]. Available: https://proceedings.mlr.press/v54/mcmahan17a.html

2017
[8]

Advances and open problems in federated learning,

P. Kairouz, H. B. McMahan, B. Avent, A. Belletet al., “Advances and open problems in federated learning,”Foundations and Trends in Machine Learning, vol. 14, no. 1–2, pp. 1–210, 2021

2021
[9]

Fed- scgen: privacy-preserving federated batch effect correction of single-cell rna sequencing data,

M. Bakhtiari, S. Bonn, F. Theis, O. Zolotareva, and J. Baumbach, “Fed- scgen: privacy-preserving federated batch effect correction of single-cell rna sequencing data,”Genome Biology, 2025

2025
[10]

Privacy-preserving federated neural network learning for disease-associated cell classification,

S. Sav, J.-P. Bossuat, J. R. Troncoso-Pastoriza, M. Claassen, and J.- P. Hubaux, “Privacy-preserving federated neural network learning for disease-associated cell classification,”Patterns, vol. 3, no. 5, p. 100487, 2022

2022
[11]

Secure and federated quantitative trait loci mapping with privateqtl,

others, “Secure and federated quantitative trait loci mapping with privateqtl,”Cell Genomics, 2025, real (PMID 39947138, Cell Genomics 2025, PII S2666-979X(25)00025-4); complete author list + exact DOI at submission

2025
[12]

Practical secure aggregation for privacy-preserving machine learning,

K. Bonawitz, V . Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan, S. Patel, D. Ramage, A. Segal, and K. Seth, “Practical secure aggregation for privacy-preserving machine learning,” inProc. 2017 ACM SIGSAC Conf. on Computer and Communications Security (CCS), 2017, pp. 1175–1191

2017
[13]

Distributed estimation of principal eigenspaces,

J. Fan, D. Wang, K. Wang, and Z. Zhu, “Distributed estimation of principal eigenspaces,”Annals of Statistics, vol. 47, no. 6, pp. 3009– 3031, 2019

2019
[14]

Federated principal component analysis,

A. Grammenos, R. Mendoza-Smith, J. Crowcroft, and C. Mascolo, “Federated principal component analysis,” inAdvances in Neural Information Processing Systems (NeurIPS) 33, 2020. [Online]. Available: https://proceedings.neurips.cc/paper/2020/hash/ 47a658229eb2368a99f1d032c8848542-Abstract.html 9

2020
[15]

Membership inference attacks against machine learning models,

R. Shokri, M. Stronati, C. Song, and V . Shmatikov, “Membership inference attacks against machine learning models,” inProc. 2017 IEEE Symposium on Security and Privacy (S&P), 2017, pp. 3–18

2017
[16]

The algorithmic foundations of differential privacy,

C. Dwork and A. Roth, “The algorithmic foundations of differential privacy,”Foundations and Trends in Theoretical Computer Science, vol. 9, no. 3–4, pp. 211–487, 2014

2014
[17]

Mofa+: a statistical framework for comprehensive integration of multi-modal single-cell data,

R. Argelaguet, D. Arnol, D. Bredikhin, Y . Deloro, B. Velten, J. C. Mar- ioni, and O. Stegle, “Mofa+: a statistical framework for comprehensive integration of multi-modal single-cell data,”Genome Biology, vol. 21, p. 111, 2020

2020
[18]

DIALOGUE maps multicellular pro- grams in tissue from single-cell or spatial transcriptomics data,

L. Jerby-Arnon and A. Regev, “DIALOGUE maps multicellular pro- grams in tissue from single-cell or spatial transcriptomics data,”Nature Biotechnology, vol. 40, pp. 1467–1477, 2022

2022
[19]

Toward a privacy-preserving predictive foundation model of single-cell transcriptomics with federated learning and tabular modeling,

J. Wanget al., “Toward a privacy-preserving predictive foundation model of single-cell transcriptomics with federated learning and tabular modeling,”bioRxiv, 2025, preprint; complete/verify author list before submission

2025
[20]

The noisy power method: A meta algorithm with applications,

M. Hardt and E. Price, “The noisy power method: A meta algorithm with applications,” inAdvances in Neural Information Processing Systems (NeurIPS), vol. 27, 2014, pp. 2861–2869

2014
[21]

DP-PCA: Statistically optimal and differentially private PCA,

X. Liu, W. Kong, P. Jain, and S. Oh, “DP-PCA: Statistically optimal and differentially private PCA,” inAdvances in Neural Information Processing Systems (NeurIPS), 2022, arXiv:2205.13709

arXiv 2022
[22]

Analyze Gauss: Optimal bounds for privacy-preserving principal component analysis,

C. Dwork, K. Talwar, A. Thakurta, and L. Zhang, “Analyze Gauss: Optimal bounds for privacy-preserving principal component analysis,” inProc. 46th Annual ACM Symposium on Theory of Computing (STOC), 2014, pp. 11–20

2014
[23]

A near-optimal algorithm for differentially-private principal components,

K. Chaudhuri, A. D. Sarwate, and K. Sinha, “A near-optimal algorithm for differentially-private principal components,”Journal of Machine Learning Research, vol. 14, pp. 2905–2943, 2013. [Online]. Available: https://jmlr.org/papers/v14/chaudhuri13a.html

2013
[24]

Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays,

N. Homer, S. Szelinger, M. Redman, D. Duggan, W. Tembe, J. Muehling, J. V . Pearson, D. A. Stephan, S. F. Nelson, and D. W. Craig, “Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays,” PLoS Genetics, vol. 4, no. 8, p. e1000167, 2008

2008
[25]

Routes for breaching and protecting genetic privacy,

Y . Erlich and A. Narayanan, “Routes for breaching and protecting genetic privacy,”Nature Reviews Genetics, vol. 15, no. 6, pp. 409–421, 2014

2014
[26]

Private information leakage from single-cell count matrices,

C. R. Walker, X. Li, M. Chakravarthy, W. Lounsbery-Scaife, Y . A. Choi, R. Singh, and G. G ¨ursoy, “Private information leakage from single-cell count matrices,”Cell, 2024, pMID 39362221; DOI 10.1016/j.cell.2024.09.012

work page doi:10.1016/j.cell.2024.09.012 2024
[27]

muscat detects subpopulation- specific state transitions from multi-sample multi-condition single-cell transcriptomics data,

H. L. Crowell, C. Soneson, P.-L. Germain, D. Calini, L. Collin, C. Ra- poso, D. Malhotra, and M. D. Robinson, “muscat detects subpopulation- specific state transitions from multi-sample multi-condition single-cell transcriptomics data,”Nature Communications, vol. 11, p. 6077, 2020

2020
[28]

Confronting false dis- coveries in single-cell differential expression,

J. W. Squair, M. Gautier, C. Kathe, M. A. Anderson, N. D. James, T. H. Hutson, R. Hudelle, T. Qaiser, K. J. E. Matson, Q. Barraud, A. J. Levine, G. La Manno, M. A. Skinnider, and G. Courtine, “Confronting false dis- coveries in single-cell differential expression,”Nature Communications, vol. 12, p. 5692, 2021

2021
[29]

Tensor decompositions and applications,

T. G. Kolda and B. W. Bader, “Tensor decompositions and applications,” SIAM Review, vol. 51, no. 3, pp. 455–500, 2009

2009
[30]

A multilinear singular value decomposition,

L. De Lathauwer, B. De Moor, and J. Vandewalle, “A multilinear singular value decomposition,”SIAM Journal on Matrix Analysis and Applications, vol. 21, no. 4, pp. 1253–1278, 2000

2000
[31]

Federated machine learning: Concept and applications,

Q. Yang, Y . Liu, T. Chen, and Y . Tong, “Federated machine learning: Concept and applications,”ACM Transactions on Intelligent Systems and Technology, vol. 10, no. 2, pp. 12:1–12:19, 2019

2019
[32]

Fully homomorphic encryption using ideal lattices,

C. Gentry, “Fully homomorphic encryption using ideal lattices,” inProc. 41st Annual ACM Symposium on Theory of Computing (STOC), 2009, pp. 169–178

2009
[33]

Homomorphic encryption for arithmetic of approximate numbers,

J. H. Cheon, A. Kim, M. Kim, and Y . Song, “Homomorphic encryption for arithmetic of approximate numbers,” inAdvances in Cryptology – ASIACRYPT 2017, Part I, LNCS 10624, 2017, pp. 409–437

2017
[34]

SecureML: A system for scalable privacy- preserving machine learning,

P. Mohassel and Y . Zhang, “SecureML: A system for scalable privacy- preserving machine learning,” inProc. 2017 IEEE Symposium on Security and Privacy (S&P), 2017, pp. 19–38

2017
[35]

Membership inference attacks from first principles,

N. Carlini, S. Chien, M. Nasr, S. Song, A. Terzis, and F. Tram `er, “Membership inference attacks from first principles,” inProc. 2022 IEEE Symposium on Security and Privacy (S&P), 2022, pp. 1897–1914

2022
[36]

Single- cell rna-seq reveals cell type–specific molecular and genetic associations to lupus,

R. K. Perez, M. G. Gordon, M. Subramaniam, M. C. Kim, G. C. Hartoularos, S. Targ, Y . Sun, A. Ogorodnikov, R. Buenoet al., “Single- cell rna-seq reveals cell type–specific molecular and genetic associations to lupus,”Science, vol. 376, no. 6589, p. eabf1970, 2022

2022
[37]

Single-cell multi- omics analysis of the immune response in covid-19,

E. Stephenson, G. Reynolds, R. A. Bottinget al., “Single-cell multi- omics analysis of the immune response in covid-19,”Nature Medicine, vol. 27, pp. 904–916, 2021

2021
[38]

Cell-type-resolved genetic variation shapes inflammatory bowel disease risk,

others, “Cell-type-resolved genetic variation shapes inflammatory bowel disease risk,”Nature, 2026, iBDverse atlas (Wellcome Sanger); complete author list at submission

2026
[39]

Cell-type-specific and disease- associated expression quantitative trait loci in the human lung,

H. M. Natri, C. B. Del Azodi, L. Peter, C. J. Taylor, S. Chugh, R. Kendle, M.-i. Chung, D. K. Flaherty, B. K. Matlock, C. L. Calvi, T. S. Blackwell, L. B. Ware, M. Bacchetta, R. Walia, C. M. Shaver, J. A. Kropski, D. J. McCarthy, and N. E. Banovich, “Cell-type-specific and disease- associated expression quantitative trait loci in the human lung,”Nature Ge...

work page doi:10.1038/s41588-024-01702-0 2024
[40]

Single-cell, single-nucleus, and spatial transcriptomics characterization of the immunological landscape in the healthy and psc human liver,

T. S. Andrews, D. Nakib, C. T. Perciani, X. Z. Ma, L. Liu, E. Winter, D. Camat, S. W. Chung, P. Lumanto, J. Manuel, S. Mangroo, B. Hansen, B. Arpinder, C. Thoeni, B. Sayed, J. Feld, A. Gehring, A. Gulamhusein, G. M. Hirschfield, A. Ricciuto, G. D. Bader, I. D. McGilvray, and S. MacParland, “Single-cell, single-nucleus, and spatial transcriptomics characte...

2024
[41]

SCANPY: large-scale single- cell gene expression data analysis,

F. A. Wolf, P. Angerer, and F. J. Theis, “SCANPY: large-scale single- cell gene expression data analysis,”Genome Biology, vol. 19, p. 15, 2018

2018
[42]

Integrated analysis of multimodal single-cell data,

Y . Hao, S. Hao, E. Andersen-Nissen, W. M. Mauck III, S. Zheng, A. Butler, M. J. Lee, A. J. Wilk, C. Darby, M. Zageret al., “Integrated analysis of multimodal single-cell data,”Cell, vol. 184, no. 13, pp. 3573– 3587, 2021

2021
[43]

Calibrating noise to sensitivity in private data analysis,

C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in private data analysis,” inTheory of Cryptography (TCC), LNCS 3876, 2006, pp. 265–284

2006
[44]

Privately learning subspaces,

V . Singhal and T. Steinke, “Privately learning subspaces,” in Advances in Neural Information Processing Systems (NeurIPS) 34,
[45]

Available: https://proceedings.neurips.cc/paper/2021/ hash/09b69adcd7cbae914c6204984097d2da-Abstract.html

[Online]. Available: https://proceedings.neurips.cc/paper/2021/ hash/09b69adcd7cbae914c6204984097d2da-Abstract.html

2021
[46]

Concentrated differential privacy: Simplifica- tions, extensions, and lower bounds,

M. Bun and T. Steinke, “Concentrated differential privacy: Simplifica- tions, extensions, and lower bounds,” inTheory of Cryptography (TCC- B), LNCS 9985, 2016, pp. 635–658

2016
[47]

Deep learning with differential privacy,

M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Tal- war, and L. Zhang, “Deep learning with differential privacy,” inProc. 2016 ACM SIGSAC Conf. on Computer and Communications Security (CCS), 2016, pp. 308–318

2016

[1] [1]

Coordinated, multicellular patterns of transcriptional variation that stratify patient cohorts are revealed by tensor decomposition,

J. Mitchel, M. G. Gordon, R. K. Perez, E. Biederstedt, R. Bueno, C. J. Ye, and P. V . Kharchenko, “Coordinated, multicellular patterns of transcriptional variation that stratify patient cohorts are revealed by tensor decomposition,”Nature Biotechnology, vol. 43, pp. 1192–1201, 2025

2025

[2] [2]

Context-aware deconvolution of cell–cell communication with tensor-cell2cell,

E. Armingol, H. M. Baghdassarian, C. Martino, A. Perez-Lopez, C. Aamodt, R. Knight, and N. E. Lewis, “Context-aware deconvolution of cell–cell communication with tensor-cell2cell,”Nature Communica- tions, vol. 13, p. 3665, 2022

2022

[3] [3]

Integrative, high-resolution analysis of single-cell gene expression across experimental conditions with parafac2-rise,

A. Chenet al., “Integrative, high-resolution analysis of single-cell gene expression across experimental conditions with parafac2-rise,”Cell Systems, 2025, complete author list before submission

2025

[4] [4]

The human cell atlas,

A. Regev, S. A. Teichmann, E. S. Lander, I. Amit, C. Benoist, E. Birney, B. Bodenmiller, P. Campbell, P. Carninci, M. Clatworthyet al., “The human cell atlas,”eLife, vol. 6, p. e27041, 2017

2017

[5] [5]

The future of digital health with federated learning,

N. Rieke, J. Hancox, W. Li, F. Milletar `ı, H. R. Roth, S. Albarqouni, S. Bakas, M. N. Galtier, B. A. Landman, K. Maier-Hein, S. Ourselin, M. Sheller, R. M. Summers, A. Trask, D. Xu, M. Baust, and M. J. Cardoso, “The future of digital health with federated learning,”npj Digital Medicine, vol. 3, p. 119, 2020

2020

[6] [6]

Feder- ated learning in medicine: facilitating multi-institutional collaborations without sharing patient data,

M. J. Sheller, B. Edwards, G. A. Reina, J. Martin, S. Pati, A. Kotrotsou, M. Milchenko, W. Xu, D. Marcus, R. R. Colen, and S. Bakas, “Feder- ated learning in medicine: facilitating multi-institutional collaborations without sharing patient data,”Scientific Reports, vol. 10, p. 12598, 2020

2020

[7] [7]

Communication-efficient learning of deep networks from decentralized data,

H. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. Ag ¨uera y Arcas, “Communication-efficient learning of deep networks from decentralized data,” inProc. 20th Int. Conf. on Artificial Intelligence and Statistics (AISTATS), PMLR 54, 2017, pp. 1273–1282. [Online]. Available: https://proceedings.mlr.press/v54/mcmahan17a.html

2017

[8] [8]

Advances and open problems in federated learning,

P. Kairouz, H. B. McMahan, B. Avent, A. Belletet al., “Advances and open problems in federated learning,”Foundations and Trends in Machine Learning, vol. 14, no. 1–2, pp. 1–210, 2021

2021

[9] [9]

Fed- scgen: privacy-preserving federated batch effect correction of single-cell rna sequencing data,

M. Bakhtiari, S. Bonn, F. Theis, O. Zolotareva, and J. Baumbach, “Fed- scgen: privacy-preserving federated batch effect correction of single-cell rna sequencing data,”Genome Biology, 2025

2025

[10] [10]

Privacy-preserving federated neural network learning for disease-associated cell classification,

S. Sav, J.-P. Bossuat, J. R. Troncoso-Pastoriza, M. Claassen, and J.- P. Hubaux, “Privacy-preserving federated neural network learning for disease-associated cell classification,”Patterns, vol. 3, no. 5, p. 100487, 2022

2022

[11] [11]

Secure and federated quantitative trait loci mapping with privateqtl,

others, “Secure and federated quantitative trait loci mapping with privateqtl,”Cell Genomics, 2025, real (PMID 39947138, Cell Genomics 2025, PII S2666-979X(25)00025-4); complete author list + exact DOI at submission

2025

[12] [12]

Practical secure aggregation for privacy-preserving machine learning,

K. Bonawitz, V . Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan, S. Patel, D. Ramage, A. Segal, and K. Seth, “Practical secure aggregation for privacy-preserving machine learning,” inProc. 2017 ACM SIGSAC Conf. on Computer and Communications Security (CCS), 2017, pp. 1175–1191

2017

[13] [13]

Distributed estimation of principal eigenspaces,

J. Fan, D. Wang, K. Wang, and Z. Zhu, “Distributed estimation of principal eigenspaces,”Annals of Statistics, vol. 47, no. 6, pp. 3009– 3031, 2019

2019

[14] [14]

Federated principal component analysis,

A. Grammenos, R. Mendoza-Smith, J. Crowcroft, and C. Mascolo, “Federated principal component analysis,” inAdvances in Neural Information Processing Systems (NeurIPS) 33, 2020. [Online]. Available: https://proceedings.neurips.cc/paper/2020/hash/ 47a658229eb2368a99f1d032c8848542-Abstract.html 9

2020

[15] [15]

Membership inference attacks against machine learning models,

R. Shokri, M. Stronati, C. Song, and V . Shmatikov, “Membership inference attacks against machine learning models,” inProc. 2017 IEEE Symposium on Security and Privacy (S&P), 2017, pp. 3–18

2017

[16] [16]

The algorithmic foundations of differential privacy,

C. Dwork and A. Roth, “The algorithmic foundations of differential privacy,”Foundations and Trends in Theoretical Computer Science, vol. 9, no. 3–4, pp. 211–487, 2014

2014

[17] [17]

Mofa+: a statistical framework for comprehensive integration of multi-modal single-cell data,

R. Argelaguet, D. Arnol, D. Bredikhin, Y . Deloro, B. Velten, J. C. Mar- ioni, and O. Stegle, “Mofa+: a statistical framework for comprehensive integration of multi-modal single-cell data,”Genome Biology, vol. 21, p. 111, 2020

2020

[18] [18]

DIALOGUE maps multicellular pro- grams in tissue from single-cell or spatial transcriptomics data,

L. Jerby-Arnon and A. Regev, “DIALOGUE maps multicellular pro- grams in tissue from single-cell or spatial transcriptomics data,”Nature Biotechnology, vol. 40, pp. 1467–1477, 2022

2022

[19] [19]

Toward a privacy-preserving predictive foundation model of single-cell transcriptomics with federated learning and tabular modeling,

J. Wanget al., “Toward a privacy-preserving predictive foundation model of single-cell transcriptomics with federated learning and tabular modeling,”bioRxiv, 2025, preprint; complete/verify author list before submission

2025

[20] [20]

The noisy power method: A meta algorithm with applications,

M. Hardt and E. Price, “The noisy power method: A meta algorithm with applications,” inAdvances in Neural Information Processing Systems (NeurIPS), vol. 27, 2014, pp. 2861–2869

2014

[21] [21]

DP-PCA: Statistically optimal and differentially private PCA,

X. Liu, W. Kong, P. Jain, and S. Oh, “DP-PCA: Statistically optimal and differentially private PCA,” inAdvances in Neural Information Processing Systems (NeurIPS), 2022, arXiv:2205.13709

arXiv 2022

[22] [22]

Analyze Gauss: Optimal bounds for privacy-preserving principal component analysis,

C. Dwork, K. Talwar, A. Thakurta, and L. Zhang, “Analyze Gauss: Optimal bounds for privacy-preserving principal component analysis,” inProc. 46th Annual ACM Symposium on Theory of Computing (STOC), 2014, pp. 11–20

2014

[23] [23]

A near-optimal algorithm for differentially-private principal components,

K. Chaudhuri, A. D. Sarwate, and K. Sinha, “A near-optimal algorithm for differentially-private principal components,”Journal of Machine Learning Research, vol. 14, pp. 2905–2943, 2013. [Online]. Available: https://jmlr.org/papers/v14/chaudhuri13a.html

2013

[24] [24]

Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays,

N. Homer, S. Szelinger, M. Redman, D. Duggan, W. Tembe, J. Muehling, J. V . Pearson, D. A. Stephan, S. F. Nelson, and D. W. Craig, “Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays,” PLoS Genetics, vol. 4, no. 8, p. e1000167, 2008

2008

[25] [25]

Routes for breaching and protecting genetic privacy,

Y . Erlich and A. Narayanan, “Routes for breaching and protecting genetic privacy,”Nature Reviews Genetics, vol. 15, no. 6, pp. 409–421, 2014

2014

[26] [26]

Private information leakage from single-cell count matrices,

C. R. Walker, X. Li, M. Chakravarthy, W. Lounsbery-Scaife, Y . A. Choi, R. Singh, and G. G ¨ursoy, “Private information leakage from single-cell count matrices,”Cell, 2024, pMID 39362221; DOI 10.1016/j.cell.2024.09.012

work page doi:10.1016/j.cell.2024.09.012 2024

[27] [27]

muscat detects subpopulation- specific state transitions from multi-sample multi-condition single-cell transcriptomics data,

H. L. Crowell, C. Soneson, P.-L. Germain, D. Calini, L. Collin, C. Ra- poso, D. Malhotra, and M. D. Robinson, “muscat detects subpopulation- specific state transitions from multi-sample multi-condition single-cell transcriptomics data,”Nature Communications, vol. 11, p. 6077, 2020

2020

[28] [28]

Confronting false dis- coveries in single-cell differential expression,

J. W. Squair, M. Gautier, C. Kathe, M. A. Anderson, N. D. James, T. H. Hutson, R. Hudelle, T. Qaiser, K. J. E. Matson, Q. Barraud, A. J. Levine, G. La Manno, M. A. Skinnider, and G. Courtine, “Confronting false dis- coveries in single-cell differential expression,”Nature Communications, vol. 12, p. 5692, 2021

2021

[29] [29]

Tensor decompositions and applications,

T. G. Kolda and B. W. Bader, “Tensor decompositions and applications,” SIAM Review, vol. 51, no. 3, pp. 455–500, 2009

2009

[30] [30]

A multilinear singular value decomposition,

L. De Lathauwer, B. De Moor, and J. Vandewalle, “A multilinear singular value decomposition,”SIAM Journal on Matrix Analysis and Applications, vol. 21, no. 4, pp. 1253–1278, 2000

2000

[31] [31]

Federated machine learning: Concept and applications,

Q. Yang, Y . Liu, T. Chen, and Y . Tong, “Federated machine learning: Concept and applications,”ACM Transactions on Intelligent Systems and Technology, vol. 10, no. 2, pp. 12:1–12:19, 2019

2019

[32] [32]

Fully homomorphic encryption using ideal lattices,

C. Gentry, “Fully homomorphic encryption using ideal lattices,” inProc. 41st Annual ACM Symposium on Theory of Computing (STOC), 2009, pp. 169–178

2009

[33] [33]

Homomorphic encryption for arithmetic of approximate numbers,

J. H. Cheon, A. Kim, M. Kim, and Y . Song, “Homomorphic encryption for arithmetic of approximate numbers,” inAdvances in Cryptology – ASIACRYPT 2017, Part I, LNCS 10624, 2017, pp. 409–437

2017

[34] [34]

SecureML: A system for scalable privacy- preserving machine learning,

P. Mohassel and Y . Zhang, “SecureML: A system for scalable privacy- preserving machine learning,” inProc. 2017 IEEE Symposium on Security and Privacy (S&P), 2017, pp. 19–38

2017

[35] [35]

Membership inference attacks from first principles,

N. Carlini, S. Chien, M. Nasr, S. Song, A. Terzis, and F. Tram `er, “Membership inference attacks from first principles,” inProc. 2022 IEEE Symposium on Security and Privacy (S&P), 2022, pp. 1897–1914

2022

[36] [36]

Single- cell rna-seq reveals cell type–specific molecular and genetic associations to lupus,

R. K. Perez, M. G. Gordon, M. Subramaniam, M. C. Kim, G. C. Hartoularos, S. Targ, Y . Sun, A. Ogorodnikov, R. Buenoet al., “Single- cell rna-seq reveals cell type–specific molecular and genetic associations to lupus,”Science, vol. 376, no. 6589, p. eabf1970, 2022

2022

[37] [37]

Single-cell multi- omics analysis of the immune response in covid-19,

E. Stephenson, G. Reynolds, R. A. Bottinget al., “Single-cell multi- omics analysis of the immune response in covid-19,”Nature Medicine, vol. 27, pp. 904–916, 2021

2021

[38] [38]

Cell-type-resolved genetic variation shapes inflammatory bowel disease risk,

others, “Cell-type-resolved genetic variation shapes inflammatory bowel disease risk,”Nature, 2026, iBDverse atlas (Wellcome Sanger); complete author list at submission

2026

[39] [39]

Cell-type-specific and disease- associated expression quantitative trait loci in the human lung,

H. M. Natri, C. B. Del Azodi, L. Peter, C. J. Taylor, S. Chugh, R. Kendle, M.-i. Chung, D. K. Flaherty, B. K. Matlock, C. L. Calvi, T. S. Blackwell, L. B. Ware, M. Bacchetta, R. Walia, C. M. Shaver, J. A. Kropski, D. J. McCarthy, and N. E. Banovich, “Cell-type-specific and disease- associated expression quantitative trait loci in the human lung,”Nature Ge...

work page doi:10.1038/s41588-024-01702-0 2024

[40] [40]

Single-cell, single-nucleus, and spatial transcriptomics characterization of the immunological landscape in the healthy and psc human liver,

T. S. Andrews, D. Nakib, C. T. Perciani, X. Z. Ma, L. Liu, E. Winter, D. Camat, S. W. Chung, P. Lumanto, J. Manuel, S. Mangroo, B. Hansen, B. Arpinder, C. Thoeni, B. Sayed, J. Feld, A. Gehring, A. Gulamhusein, G. M. Hirschfield, A. Ricciuto, G. D. Bader, I. D. McGilvray, and S. MacParland, “Single-cell, single-nucleus, and spatial transcriptomics characte...

2024

[41] [41]

SCANPY: large-scale single- cell gene expression data analysis,

F. A. Wolf, P. Angerer, and F. J. Theis, “SCANPY: large-scale single- cell gene expression data analysis,”Genome Biology, vol. 19, p. 15, 2018

2018

[42] [42]

Integrated analysis of multimodal single-cell data,

Y . Hao, S. Hao, E. Andersen-Nissen, W. M. Mauck III, S. Zheng, A. Butler, M. J. Lee, A. J. Wilk, C. Darby, M. Zageret al., “Integrated analysis of multimodal single-cell data,”Cell, vol. 184, no. 13, pp. 3573– 3587, 2021

2021

[43] [43]

Calibrating noise to sensitivity in private data analysis,

C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in private data analysis,” inTheory of Cryptography (TCC), LNCS 3876, 2006, pp. 265–284

2006

[44] [44]

Privately learning subspaces,

V . Singhal and T. Steinke, “Privately learning subspaces,” in Advances in Neural Information Processing Systems (NeurIPS) 34,

[45] [45]

Available: https://proceedings.neurips.cc/paper/2021/ hash/09b69adcd7cbae914c6204984097d2da-Abstract.html

[Online]. Available: https://proceedings.neurips.cc/paper/2021/ hash/09b69adcd7cbae914c6204984097d2da-Abstract.html

2021

[46] [46]

Concentrated differential privacy: Simplifica- tions, extensions, and lower bounds,

M. Bun and T. Steinke, “Concentrated differential privacy: Simplifica- tions, extensions, and lower bounds,” inTheory of Cryptography (TCC- B), LNCS 9985, 2016, pp. 635–658

2016

[47] [47]

Deep learning with differential privacy,

M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Tal- war, and L. Zhang, “Deep learning with differential privacy,” inProc. 2016 ACM SIGSAC Conf. on Computer and Communications Security (CCS), 2016, pp. 308–318

2016