Data Collaboration Analysis with Orthonormal Basis Selection and Alignment

Akiko Yoshise; Keiyu Nosaka; Yamato Suetake; Yuichi Takano

arxiv: 2403.02780 · v9 · submitted 2024-03-05 · 💻 cs.LG · math.OC

Pith reviewed 2026-05-24 03:34 UTC · model grok-4.3

keywords: data collaboration · orthogonal Procrustes · basis alignment · privacy-preserving learning · multi-party machine learning · linear projections · orthonormal bases · change of basis

The pith

Enforcing orthonormal bases turns data collaboration alignment into a closed-form Orthogonal Procrustes solution that makes performance invariant to the target basis.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to align linear projections from multiple private datasets without sharing the secret bases by requiring both secret and target bases to be orthonormal. This change converts the alignment step into the Orthogonal Procrustes problem, which has an exact solution, and produces change-of-basis matrices that achieve orthogonal concordance. Under this concordance every party's representation matches the others up to one shared orthogonal transformation, so the accuracy of any downstream model becomes independent of which target basis was selected. The method keeps the original one-round communication pattern and privacy assumptions while cutting alignment cost from O(min{a(cl)^2,a^2cl}) to O(acl^2).

Core claim

By selecting orthonormal secret and target bases, the resulting change-of-basis matrices achieve orthogonal concordance: all parties' representations are aligned up to a shared orthogonal transform. This renders downstream performance invariant to the target basis. Alignment reduces to the Orthogonal Procrustes problem and admits a closed-form solution that lowers complexity from O(min{a(cl)^2,a^2cl}) to O(acl^2).
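As a rough illustration of the closed-form step named above, the following NumPy sketch solves a generic Orthogonal Procrustes problem: given matrices A and B, it finds the orthogonal G minimizing ||AG - B||_F from the SVD of A^T B. The function name `orthogonal_procrustes`, the matrix sizes, and the synthetic data are assumptions made for illustration; the paper's own matrices and notation may differ.

```python
import numpy as np

def orthogonal_procrustes(A, B):
    """Return the orthogonal matrix G that minimizes ||A @ G - B||_F."""
    # SVD of the cross-product matrix: A^T B = U S V^T
    U, _, Vt = np.linalg.svd(A.T @ B)
    # The minimizer is the orthogonal polar factor U V^T
    return U @ Vt

# Synthetic check (hypothetical data): B is A rotated by a known orthogonal matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 4))
G_true, _ = np.linalg.qr(rng.standard_normal((4, 4)))
B = A @ G_true

G = orthogonal_procrustes(A, B)
residual = np.linalg.norm(A @ G - B)  # close to zero when B is an exact rotation of A
```

No iteration or step size is involved: one small SVD of a square cross-product matrix yields the alignment, which is the sense in which the problem is closed-form.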

What carries the argument

Orthonormal Data Collaboration (ODC), which forces orthonormal secret and target bases so that alignment becomes the Orthogonal Procrustes problem and yields orthogonal concordance.

If this is right

Alignment cost drops from O(min{a(cl)^2,a^2cl}) to O(acl^2): the new bound is linear in a and in c, although still quadratic in l.
Empirical wall-clock speedups reach 100 times on standard benchmarks while accuracy stays equal or improves.
One-round communication and the original privacy assumptions of data collaboration remain intact.
Downstream model performance no longer depends on the particular choice of target basis.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The invariance property could let practitioners pick the numerically most stable orthonormal target basis without accuracy trade-offs.
The same orthonormal reduction might apply to other multi-party linear-projection schemes that currently solve alignment iteratively.
Because the method is a drop-in replacement, existing data-collaboration codebases can adopt it with minimal refactoring.

Load-bearing premise

Forcing orthonormality on the bases still spans the common subspace and leaves the original linear-projection semantics, information content, and privacy properties unchanged.
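One part of this premise is a standard linear-algebra fact and can be checked numerically: orthonormalizing a full-column-rank basis leaves its column span unchanged. The sketch below uses reduced QR as the orthonormalization (an assumed choice, not necessarily the paper's) on a random toy matrix `F`, and compares the orthogonal projectors onto the two spans. It says nothing about the privacy side of the premise.

```python
import numpy as np

rng = np.random.default_rng(2)
F = rng.standard_normal((8, 3))       # arbitrary full-column-rank basis (toy data)
Q, R = np.linalg.qr(F)                # reduced QR: Q is 8x3 with orthonormal columns

proj_F = F @ np.linalg.pinv(F)        # orthogonal projector onto span(F)
proj_Q = Q @ Q.T                      # orthogonal projector onto span(Q)
span_gap = np.max(np.abs(proj_F - proj_Q))  # near zero when the spans coincide
```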

What would settle it

An experiment in which the same downstream model is trained on ODC-aligned data using two different orthonormal target bases and accuracy differs by more than numerical precision, or a dataset where the closed-form Procrustes solution fails to produce exact orthogonal concordance.
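A toy version of the first check can be sketched as follows. It assumes an idealized setting in which every party's secret basis is an orthogonal rotation of one shared orthonormal basis `F`, and in which the target is the anchor data expressed in a rotated copy of `F`. All names, sizes, and the synthetic data are hypothetical, and the Gram matrix of the aligned data stands in for downstream accuracy (it is what distance- or kernel-based learners see).

```python
import numpy as np

def procrustes(A, B):
    # Orthogonal G minimizing ||A @ G - B||_F
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt

def random_orthogonal(n, rng):
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return Q

rng = np.random.default_rng(1)
m, l, r, n_parties = 10, 3, 40, 3     # toy sizes (hypothetical)

# Shared orthonormal basis F; each secret basis is F rotated by a private Q_i.
F, _ = np.linalg.qr(rng.standard_normal((m, l)))
secret = [F @ random_orthogonal(l, rng) for _ in range(n_parties)]

anchor = rng.standard_normal((r, m))  # anchor data visible to every party
private = [rng.standard_normal((20, m)) for _ in range(n_parties)]

def align(target_rotation):
    # Target: anchor data expressed in one particular orthonormal basis.
    target = anchor @ F @ target_rotation
    aligned = []
    for X, Fi in zip(private, secret):
        G = procrustes(anchor @ Fi, target)  # per-party change of basis
        aligned.append(X @ Fi @ G)
    return np.vstack(aligned)

Z1 = align(random_orthogonal(l, rng))  # first choice of target basis
Z2 = align(random_orthogonal(l, rng))  # second choice of target basis

# The two aligned datasets differ by one shared rotation, so their Gram matrices agree.
gram_gap = np.max(np.abs(Z1 @ Z1.T - Z2 @ Z2.T))
```

In this idealized setup `gram_gap` sits at floating-point noise. A gap well above that level on real data, where the bases need not span exactly the same subspace, would be the kind of evidence the paragraph above asks for.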

Figures

Figures reproduced from arXiv: 2403.02780 by Akiko Yoshise, Keiyu Nosaka, Yamato Suetake, Yuichi Takano.

**Figure 1.** Conceptual illustration of the Orthonormal Data Collaboration (ODC) framework. Each participating user independently projects their […]

**Figure 2.** Visual privacy verification using CelebA [24]. Original images (panel (a)) compared to images after orthonormal projections (panel (b)) […]

**Figure 3.** Absolute communication volume versus quantization bit-width

**Figure 4.** Heatmaps of the threshold number of FL rounds

**Figure 5.** Sensitivity curves for the break-even FL rounds

**Figure 6.** Wall-clock time with varying parameters […]

**Figure 7.** Illustration of extremely heterogeneous splitting applied to TDC datasets. Binary-labeled data are partitioned across four users, each of […]

**Figure 8.** Visual privacy analysis using CelebA.

Original abstract

Data Collaboration (DC) enables multiple parties to jointly train a model by sharing only linear projections of their private datasets. The core challenge in DC is to align the bases of these projections without revealing each party's secret basis. While existing theory suggests that any target basis spanning the common subspace should suffice, in practice, the choice of basis can substantially affect both accuracy and numerical stability. We introduce Orthonormal Data Collaboration (ODC), which enforces orthonormal secret and target bases, thereby reducing alignment to the classical Orthogonal Procrustes problem, which admits a closed-form solution. We prove that the resulting change-of-basis matrices achieve orthogonal concordance, aligning all parties' representations up to a shared orthogonal transform and rendering downstream performance invariant to the target basis. Computationally, ODC reduces the alignment complexity from O(min{a(cl)^2,a^2cl}) to O(acl^2), and empirical evaluations show up to 100 times speedups with equal or better accuracy across benchmarks. ODC preserves DC's one-round communication pattern and privacy assumptions, providing a simple and efficient drop-in improvement to existing DC pipelines.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

ODC forces orthonormal bases in data collaboration to reduce alignment to closed-form Procrustes with a claimed invariance proof and big speedups.

The main thing to know is that this paper forces both secret and target bases to be orthonormal in the data collaboration setup. That reduces the alignment step to the Orthogonal Procrustes problem, which has a closed-form solution, and comes with a proof that the resulting change-of-basis matrices give orthogonal concordance, so downstream performance becomes invariant to the target basis choice. They also cut the complexity and report up to 100x speedups on benchmarks with equal or better accuracy, all while keeping the original one-round communication and privacy model. That is the concrete advance over prior DC work.

The reduction itself is clean and uses a standard linear algebra tool, which is a plus for anyone implementing these pipelines. The efficiency numbers look useful if they hold up under the full experimental details. The central invariance claim is the part that matters most, and the abstract frames it as preserving the common subspace spanning property without new loss.

On the softer side, the abstract itself notes that basis choice affects accuracy and stability in practice even though theory says any spanning basis should work. It does not quantify those effects, nor any subtle impact of the orthonormality constraint. The proof of concordance and the error analysis would need the body to verify fully.

This is a targeted improvement for researchers already working on privacy-preserving collaborative learning methods. A reader focused on efficient multi-party alignment would find the construction and complexity analysis worth their time. It has enough of a new technique and reproducible-looking gains to deserve a serious referee rather than a desk reject.

Referee Report

0 major / 0 minor

Summary. The manuscript introduces Orthonormal Data Collaboration (ODC) as an enhancement to standard Data Collaboration (DC). In DC, parties share only linear projections of private data and must align bases without revealing secret bases. ODC enforces orthonormal secret and target bases, reducing the alignment step to the classical Orthogonal Procrustes problem (closed-form SVD solution). The central claim is a proof that the resulting change-of-basis matrices achieve orthogonal concordance: all parties' representations become aligned up to a shared orthogonal transform, rendering downstream performance invariant to target-basis choice. The method preserves the original one-round communication pattern and privacy model, reduces alignment complexity from O(min{a(cl)^2,a^2 cl}) to O(acl^2), and reports empirical speedups up to 100x with equal or better accuracy on benchmarks.

Significance. If the proof of orthogonal concordance holds, the result supplies a theoretically grounded, drop-in improvement that directly resolves the practical sensitivity of DC to basis selection while adding no communication or privacy overhead. The reduction to a standard, parameter-free problem (Orthogonal Procrustes) and the explicit complexity improvement are clear strengths; the empirical speedups and accuracy parity across benchmarks further support utility. The work credits the classical Procrustes literature and maintains the one-round privacy assumptions of prior DC papers.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive review and recommendation to accept. The referee's summary accurately reflects the contributions of Orthonormal Data Collaboration (ODC), including the reduction of alignment to the Orthogonal Procrustes problem, the orthogonal concordance property, complexity reduction, and preservation of the original privacy model.

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained

The paper proves that orthonormal secret and target bases reduce alignment to the classical Orthogonal Procrustes problem (an external, standard result in linear algebra) and yield change-of-basis matrices achieving orthogonal concordance. No equations or claims in the provided material reduce the result by construction to a fitted parameter, self-citation chain, or renamed input. The appeal to Procrustes is not load-bearing self-citation, and the one-round privacy model plus spanning-property preservation are stated without internal reduction to the target claim. The derivation is therefore independent of its own outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, domain axioms, or invented entities are identifiable. The method relies on standard linear-algebra facts about orthonormal bases and the Orthogonal Procrustes problem.

pith-pipeline@v0.9.0 · 5732 in / 1125 out tokens · 24845 ms · 2026-05-24T03:34:40.837353+00:00 · methodology

Reference graph

Works this paper leans on

52 extracted references · 52 canonical work pages · 1 internal anchor

[1] P. Rosati, P. Deeney, M. Cummins, L. van der Werff, T. Lynn, Social media and stock price reaction to data breach announcements: Evidence from US listed companies, Research in International Business and Finance 47 (2019) 458–469.

[2] B. McMahan, E. Moore, D. Ramage, S. Hampson, B. A. y Arcas, Communication-efficient learning of deep networks from decentralized data, in: Artificial Intelligence and Statistics, PMLR, 2017, pp. 1273–1282.

[3] C. Dwork, Differential privacy: A survey of results, in: International Conference on Theory and Applications of Models of Computation, Springer, 2008, pp. 1–19.

[4] K. Wei, J. Li, M. Ding, C. Ma, H. H. Yang, F. Farokhi, S. Jin, T. Q. Quek, H. V. Poor, Federated learning with differential privacy: Algorithms and performance analysis, IEEE Transactions on Information Forensics and Security 15 (2020) 3454–3469.

[5] R. Xu, N. Baracaldo, J. Joshi, Privacy-preserving machine learning: Methods, challenges and directions, arXiv preprint arXiv:2108.04417 (2021).

[6] A. Imakura, T. Sakurai, Data collaboration analysis framework using centralization of individual intermediate representations for distributed data sets, ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering 6 (2) (2020) 04020018.

[7] A. Imakura, X. Ye, T. Sakurai, Collaborative data analysis: Non-model sharing-type machine learning for distributed data, in: Knowledge Management and Acquisition for Intelligent Systems: 17th Pacific Rim Knowledge Acquisition Workshop, PKAW 2020, Yokohama, Japan, January 7–8, 2021, Proceedings 17, Springer, 2021, pp. 14–29.

[8] A. Imakura, A. Bogdanova, T. Yamazoe, K. Omote, T. Sakurai, Accuracy and privacy evaluations of collaborative data analysis, Proceedings of the AAAI Conference on Artificial Intelligence (2021).

[9] A. Imakura, T. Sakurai, Y. Okada, T. Fujii, T. Sakamoto, H. Abe, Non-readily identifiable data collaboration analysis for multiple datasets including personal information, Information Fusion 98 (2023) 101826.

[10] H. Yamashiro, K. Omote, A. Imakura, T. Sakurai, Toward the application of differential privacy to data collaboration, IEEE Access PP (2024) 1–1. doi:10.1109/ACCESS.2024.3396146.

[11] A. Imakura, T. Sakurai, FedDCL: a federated data collaboration learning as a hybrid-type privacy-preserving framework based on federated learning and data collaboration, arXiv preprint arXiv:2409.18356 (2024).

[12] Y. Kawakami, Y. Takano, A. Imakura, New solutions based on the generalized eigenvalue problem for the data collaboration analysis, arXiv preprint arXiv:2404.14164 (2024).

[13] K. Nosaka, A. Yoshise, Creating collaborative data representations using matrix manifold optimal computation and automated hyperparameter tuning, in: 2023 IEEE 3rd International Conference on Electronic Communications, Internet of Things and Big Data (ICEIB), IEEE, 2023, pp. 180–185.

[14] P. H. Schönemann, A generalized solution of the orthogonal Procrustes problem, Psychometrika 31 (1) (1966) 1–10.

[15] R. Penrose, A generalized inverse for matrices, Proceedings of the Cambridge Philosophical Society 51 (1955) 406–413.

[16] A. Mizoguchi, A. Imakura, T. Sakurai, Application of data collaboration analysis to distributed data with misaligned features, Informatics in Medicine Unlocked 32 (2022) 101013.

[17] A. Mizoguchi, A. Bogdanova, A. Imakura, T. Sakurai, Data collaboration analysis applied to compound datasets and the introduction of projection data to non-iid settings (2023).

[18] T. Nakayama, Y. Kawamata, A. Toyoda, A. Imakura, R. Kagawa, M. Sanuki, R. Tsunoda, K. Yamagata, T. Sakurai, Y. Okada, Data collaboration for causal inference from limited medical testing and medication data (2025). arXiv:2501.06511. URL https://arxiv.org/abs/2501.06511

[19] Y. Kawamata, R. Motai, Y. Okada, A. Imakura, T. Sakurai, Collaborative causal inference on distributed data, Expert Systems with Applications 244 (2024) 123024. doi:10.1016/j.eswa.2023.123024. URL https://www.sciencedirect.com/science/article/pii/S0957417423035261

[20] A. Bogdanova, A. Imakura, T. Sakurai, DC-SHAP method for consistent explainability in privacy-preserving distributed machine learning, Human-Centric Intelligent Systems 3 (3) (2023) 197–210.

[21] A. Imakura, R. Tsunoda, R. Kagawa, K. Yamagata, T. Sakurai, DC-COX: Data collaboration Cox proportional hazards model for privacy-preserving survival analysis on multiple parties, Journal of Biomedical Informatics 137 (2023) 104264.

[22] A. Imakura, H. Inaba, Y. Okada, T. Sakurai, Interpretable collaborative data analysis on distributed data, Expert Systems with Applications 177 (2021) 114891. doi:10.1016/j.eswa.2021.114891. URL https://www.sciencedirect.com/science/article/pii/S0957417421003328

[23] T. Yanagi, S. Ikeda, N. Sukegawa, Y. Takano, Privacy-preserving recommender system using the data collaboration analysis for distributed datasets, arXiv preprint arXiv:2406.01603 (2024).

[24] Z. Liu, P. Luo, X. Wang, X. Tang, Deep learning face attributes in the wild, in: Proceedings of International Conference on Computer Vision (ICCV), 2015.

[25] H. Nguyen, D. Zhuang, P.-Y. Wu, M. Chang, AutoGAN-based dimension reduction for privacy preservation, Neurocomputing 384 (2020) 94–103.

[26] J. V. Haxby, J. S. Guntupalli, A. C. Connolly, Y. O. Halchenko, B. R. Conroy, M. I. Gobbini, M. Hanke, P. J. Ramadge, A common, high-dimensional model of the representational space in human ventral temporal cortex, Neuron 72 (2) (2011) 404–416. doi:10.1016/j.neuron.2011.08.026

[27] A. Lorbert, P. J. Ramadge, Kernel hyperalignment, in: Advances in Neural Information Processing Systems 25, 2012, pp. 1799–1807.

[28] S. Ling, Near-optimal bounds for generalized orthogonal Procrustes problem via generalized power method, Applied and Computational Harmonic Analysis 66 (2023) 62–100.

[29] F. Nie, L. Tian, X. Li, Multiview clustering via adaptively weighted Procrustes, in: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2018, pp. 2022–2030. doi:10.1145/3219819.3220049

[30] X. Dong, D. Wu, F. Nie, R. Wang, X. Li, Multi-view clustering with adaptive Procrustes on Grassmann manifold, Information Sciences 609 (2022) 855–875. doi:10.1016/j.ins.2022.07.089

[31] C. Wang, S. Mahadevan, Manifold alignment using Procrustes analysis, in: Proceedings of the 25th International Conference on Machine Learning, 2008, pp. 1120–1127.

[32] E. Grave, A. Joulin, Q. Berthet, Unsupervised alignment of embeddings with Wasserstein Procrustes, in: Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, Vol. 89 of Proceedings of Machine Learning Research, 2019, pp. 1880–1890.

[33] X. Peng, G. Chen, C. Lin, M. Stevenson, Highly efficient knowledge graph embedding learning with orthogonal Procrustes analysis, in: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021, pp. 2364–2375.

[34] R. Iakymchuk, D. Defour, C. Collange, S. Graillat, Reproducible and accurate matrix multiplication, in: International Symposium on Scientific Computing, Computer Arithmetic, and Validated Numerics, Springer, 2015, pp. 126–137.

[35] P.-G. Martinsson, G. Quintana-Ortí, N. Heavner, R. Van De Geijn, Householder QR factorization with randomization for column pivoting (HQRRP), SIAM Journal on Scientific Computing 39 (2) (2017) C96–C115.

[36] L. Wang, G. Libert, P. Manneback, Kalman filter algorithm based on singular value decomposition, in: [1992] Proceedings of the 31st IEEE Conference on Decision and Control, IEEE, 1992, pp. 1224–1229.

[37] R. Mahfoudhi, A fast triangular matrix inversion, in: Proceedings of the World Congress on Engineering, Vol. 1, 2012.

[38] K. Chen, L. Liu, Geometric data perturbation for privacy preserving outsourced data mining, Knowledge and Information Systems 29 (3) (2011) 657–695.

[39] K. Huang, T. Fu, W. Gao, Y. Zhao, Y. Roohani, J. Leskovec, C. W. Coley, C. Xiao, J. Sun, M. Zitnik, Therapeutics data commons: Machine learning datasets and tasks for drug discovery and development, arXiv preprint arXiv:2102.09548 (2021).

[40] B. Becker, R. Kohavi, Adult, UCI Machine Learning Repository, DOI: https://doi.org/10.24432/C5XW20 (1996).

[41] T. J. Pollard, A. E. Johnson, J. D. Raffa, L. A. Celi, R. G. Mark, O. Badawi, The eICU collaborative research database, a freely available multi-center database for critical care research, Scientific Data 5 (1) (2018) 1–13.

[42] B. Balle, Y.-X. Wang, Improving the Gaussian mechanism for differential privacy: Analytical calibration and optimal denoising, in: International Conference on Machine Learning, PMLR, 2018, pp. 394–403.

[43] C. Xu, F. Cheng, L. Chen, Z. Du, W. Li, G. Liu, P. W. Lee, Y. Tang, In silico prediction of chemical Ames mutagenicity, Journal of Chemical Information and Modeling 52 (11) (2012) 2840–2847.

[44] A. Mayr, G. Klambauer, T. Unterthiner, S. Hochreiter, DeepTox: toxicity prediction using deep learning, Frontiers in Environmental Science 3 (2016) 80.

[45] Z. Wu, B. Ramsundar, E. N. Feinberg, J. Gomes, C. Geniesse, A. S. Pappu, K. Leswing, V. Pande, MoleculeNet: a benchmark for molecular machine learning, Chemical Science 9 (2) (2018) 513–530.

[46] H. Veith, N. Southall, R. Huang, T. James, D. Fayne, N. Artemenko, M. Shen, J. Inglese, C. P. Austin, D. G. Lloyd, et al., Comprehensive characterization of cytochrome P450 isozyme selectivity across chemical libraries, Nature Biotechnology 27 (11) (2009) 1050–1055.

[47] F. Schroff, D. Kalenichenko, J. Philbin, FaceNet: A unified embedding for face recognition and clustering, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 815–823. doi:10.1109/CVPR.2015.7298682

[48] C. Szegedy, S. Ioffe, V. Vanhoucke, A. A. Alemi, Inception-v4, Inception-ResNet and the impact of residual connections on learning, in: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI), 2017, pp. 4278–4284.

[49] Q. Cao, L. Shen, W. Xie, O. M. Parkhi, A. Zisserman, VGGFace2: A dataset for recognising faces across pose and age, in: 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG), 2018, pp. 67–74. doi:10.1109/FG.2018.00020

[50] L. Deng, The MNIST database of handwritten digit images for machine learning research [best of the web], IEEE Signal Processing Magazine 29 (6) (2012) 141–142.

[51] H. Xiao, K. Rasul, R. Vollgraf, Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms, arXiv preprint arXiv:1708.07747 (2017).

[52] Y. Wang, J. Xiao, T. O. Suzek, J. Zhang, J. Wang, S. H. Bryant, PubChem: a public information system for analyzing bioactivities of small molecules, Nucleic Acids Research 37 (suppl_2) (2009) W623–W633.

[1] [1]

Rosati, P

P. Rosati, P. Deeney, M. Cummins, L. van der Werff, T. Lynn, Social media and stock price reaction to data breach announcements: Evidence from us listed companies, Research in International Business and Finance 47 (2019) 458–469

work page 2019

[2] [2]

McMahan, E

B. McMahan, E. Moore, D. Ramage, S. Hampson, B. A. y Arcas, Communication-efficient learning of deep networks from decentralized data, in: Artificial Intelligence and Statistics, PMLR, 2017, pp. 1273–1282

work page 2017

[3] [3]

Dwork, Differential privacy: A survey of results, in: International Conference on Theory and Applications of Models of Computation, Springer, 2008, pp

C. Dwork, Differential privacy: A survey of results, in: International Conference on Theory and Applications of Models of Computation, Springer, 2008, pp. 1–19

work page 2008

[4] [4]

K. Wei, J. Li, M. Ding, C. Ma, H. H. Yang, F. Farokhi, S. Jin, T. Q. Quek, H. V . Poor, Federated learning with differential privacy: Algorithms and performance analysis, IEEE transactions on information forensics and security 15 (2020) 3454–3469

work page 2020

[5] [5]

R. Xu, N. Baracaldo, J. Joshi, Privacy-preserving machine learning: Methods, challenges and directions, arXiv preprint arXiv:2108.04417 (2021)

work page arXiv 2021

[6] [6]

Imakura, T

A. Imakura, T. Sakurai, Data collaboration analysis framework using centralization of individual intermediate representations for distributed data sets, ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering 6 (2) (2020) 04020018

work page 2020

[7] [7]

Imakura, X

A. Imakura, X. Ye, T. Sakurai, Collaborative data analysis: Non-model sharing-type machine learning for dis- tributed data, in: Knowledge Management and Acquisition for Intelligent Systems: 17th Pacific Rim Knowledge Acquisition Workshop, PKAW 2020, Yokohama, Japan, January 7–8, 2021, Proceedings 17, Springer, 2021, pp. 14–29

work page 2020

[8] [8]

Imakura, A

A. Imakura, A. Bogdanova, T. Yamazoe, K. Omote, T. Sakurai, Accuracy and privacy evaluations of collabora- tive data analysis, Proceedings of the AAAI Conference on Artificial Intelligence (2021)

work page 2021

[9] [9]

Imakura, T

A. Imakura, T. Sakurai, Y . Okada, T. Fujii, T. Sakamoto, H. Abe, Non-readily identifiable data collaboration analysis for multiple datasets including personal information, Information Fusion 98 (2023) 101826

work page 2023

[10] [10]

Yamashiro, K

H. Yamashiro, K. Omote, A. Imakura, T. Sakurai, Toward the application of differential privacy to data collabo- ration, IEEE Access PP (2024) 1–1.doi:10.1109/ACCESS.2024.3396146. 41

work page doi:10.1109/access.2024.3396146 2024

[11] [11]

Imakura, T

A. Imakura, T. Sakurai, Feddcl: a federated data collaboration learning as a hybrid-type privacy-preserving framework based on federated learning and data collaboration, arXiv preprint arXiv:2409.18356 (2024)

work page arXiv 2024

[12] [12]

Kawakami, Y

Y . Kawakami, Y . Takano, A. Imakura, New solutions based on the generalized eigenvalue problem for the data collaboration analysis, arXiv preprint arXiv:2404.14164 (2024)

work page arXiv 2024

[13] [13]

Nosaka, A

K. Nosaka, A. Yoshise, Creating collaborative data representations using matrix manifold optimal computation and automated hyperparameter tuning, in: 2023 IEEE 3rd International Conference on Electronic Communica- tions, Internet of Things and Big Data (ICEIB), IEEE, 2023, pp. 180–185

work page 2023

[14] [14]

P. H. Schönemann, A generalized solution of the orthogonal procrustes problem, Psychometrika 31 (1) (1966) 1–10

work page 1966

[15] [15]

Penrose, A generalized inverse for matrices, Proceedings of the Cambridge Philosophical Society 51 (1955) 406–413

R. Penrose, A generalized inverse for matrices, Proceedings of the Cambridge Philosophical Society 51 (1955) 406–413

work page 1955

[16] [16]

Mizoguchi, A

A. Mizoguchi, A. Imakura, T. Sakurai, Application of data collaboration analysis to distributed data with mis- aligned features, Informatics in Medicine Unlocked 32 (2022) 101013

work page 2022

[17] [17]

Mizoguchi, A

A. Mizoguchi, A. Bogdanova, A. Imakura, T. Sakurai, Data collaboration analysis applied to compound datasets and the introduction of projection data to non-iid settings (2023)

work page 2023

[18] [18]

Nakayama, Y

T. Nakayama, Y . Kawamata, A. Toyoda, A. Imakura, R. Kagawa, M. Sanuki, R. Tsunoda, K. Yamagata, T. Saku- rai, Y . Okada, Data collaboration for causal inference from limited medical testing and medication data (2025). arXiv:2501.06511. URLhttps://arxiv.org/abs/2501.06511

work page arXiv 2025

[19] [19]

Kawamata, R

Y . Kawamata, R. Motai, Y . Okada, A. Imakura, T. Sakurai, Collaborative causal inference on distributed data, Expert Systems with Applications 244 (2024) 123024.doi:https://doi.org/10.1016/j.eswa.2023. 123024. URLhttps://www.sciencedirect.com/science/article/pii/S0957417423035261

work page doi:10.1016/j.eswa.2023 2024

[20] [20]

Bogdanova, A

A. Bogdanova, A. Imakura, T. Sakurai, Dc-shap method for consistent explainability in privacy-preserving dis- tributed machine learning, Human-Centric Intelligent Systems 3 (3) (2023) 197–210

work page 2023

[21] [21]

Imakura, R

A. Imakura, R. Tsunoda, R. Kagawa, K. Yamagata, T. Sakurai, Dc-cox: Data collaboration cox proportional hazards model for privacy-preserving survival analysis on multiple parties, Journal of Biomedical Informatics 137 (2023) 104264

work page 2023

[22] [22]

Imakura, H

A. Imakura, H. Inaba, Y . Okada, T. Sakurai, Interpretable collaborative data analysis on distributed data, Expert Systems with Applications 177 (2021) 114891.doi:https://doi.org/10.1016/j.eswa.2021.114891. URLhttps://www.sciencedirect.com/science/article/pii/S0957417421003328

work page doi:10.1016/j.eswa.2021.114891 2021

[23] [23]

Yanagi, S

T. Yanagi, S. Ikeda, N. Sukegawa, Y . Takano, Privacy-preserving recommender system using the data collabora- tion analysis for distributed datasets, arXiv preprint arXiv:2406.01603 (2024)

work page arXiv 2024

[24] [24]

Z. Liu, P. Luo, X. Wang, X. Tang, Deep learning face attributes in the wild, in: Proceedings of International Conference on Computer Vision (ICCV), 2015

work page 2015

[25] [25]

Nguyen, D

H. Nguyen, D. Zhuang, P.-Y . Wu, M. Chang, Autogan-based dimension reduction for privacy preservation, Neurocomputing 384 (2020) 94–103

work page 2020

[26] [26]

J. V . Haxby, J. S. Guntupalli, A. C. Connolly, Y . O. Halchenko, B. R. Conroy, M. I. Gobbini, M. Hanke, P. J. Ramadge, A common, high-dimensional model of the representational space in human ventral temporal cortex, Neuron 72 (2) (2011) 404–416.doi:10.1016/j.neuron.2011.08.026

work page doi:10.1016/j.neuron.2011.08.026 2011

[27] [27]

Lorbert, P

A. Lorbert, P. J. Ramadge, Kernel hyperalignment, in: Advances in Neural Information Processing Systems 25, 2012, pp. 1799–1807. 42

work page 2012

[28]

S. Ling, Near-optimal bounds for generalized orthogonal procrustes problem via generalized power method, Applied and Computational Harmonic Analysis 66 (2023) 62–100.

[29]

F. Nie, L. Tian, X. Li, Multiview clustering via adaptively weighted procrustes, in: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2018, pp. 2022–2030. doi:10.1145/3219819.3220049

[30]

X. Dong, D. Wu, F. Nie, R. Wang, X. Li, Multi-view clustering with adaptive procrustes on grassmann manifold, Information Sciences 609 (2022) 855–875. doi:10.1016/j.ins.2022.07.089

[31]

C. Wang, S. Mahadevan, Manifold alignment using procrustes analysis, in: Proceedings of the 25th International Conference on Machine Learning, 2008, pp. 1120–1127.

[32]

E. Grave, A. Joulin, Q. Berthet, Unsupervised alignment of embeddings with wasserstein procrustes, in: Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, Vol. 89 of Proceedings of Machine Learning Research, 2019, pp. 1880–1890.

[33]

X. Peng, G. Chen, C. Lin, M. Stevenson, Highly efficient knowledge graph embedding learning with orthogonal procrustes analysis, in: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021, pp. 2364–2375.

[34]

R. Iakymchuk, D. Defour, C. Collange, S. Graillat, Reproducible and accurate matrix multiplication, in: International Symposium on Scientific Computing, Computer Arithmetic, and Validated Numerics, Springer, 2015, pp. 126–137.

[35]

P.-G. Martinsson, G. Quintana-Ortí, N. Heavner, R. Van De Geijn, Householder qr factorization with randomization for column pivoting (hqrrp), SIAM Journal on Scientific Computing 39 (2) (2017) C96–C115.

[36]

L. Wang, G. Libert, P. Manneback, Kalman filter algorithm based on singular value decomposition, in: [1992] Proceedings of the 31st IEEE Conference on Decision and Control, IEEE, 1992, pp. 1224–1229.

[37]

R. Mahfoudhi, A fast triangular matrix inversion, in: Proceedings of the World Congress on Engineering, Vol. 1, 2012.

[38]

K. Chen, L. Liu, Geometric data perturbation for privacy preserving outsourced data mining, Knowledge and information systems 29 (3) (2011) 657–695.

[39]

K. Huang, T. Fu, W. Gao, Y. Zhao, Y. Roohani, J. Leskovec, C. W. Coley, C. Xiao, J. Sun, M. Zitnik, Therapeutics data commons: Machine learning datasets and tasks for drug discovery and development, arXiv preprint arXiv:2102.09548 (2021).

[40]

B. Becker, R. Kohavi, Adult, UCI Machine Learning Repository, DOI: https://doi.org/10.24432/C5XW20 (1996).

[41]

T. J. Pollard, A. E. Johnson, J. D. Raffa, L. A. Celi, R. G. Mark, O. Badawi, The eicu collaborative research database, a freely available multi-center database for critical care research, Scientific data 5 (1) (2018) 1–13.

[42]

B. Balle, Y.-X. Wang, Improving the gaussian mechanism for differential privacy: Analytical calibration and optimal denoising, in: International Conference on Machine Learning, PMLR, 2018, pp. 394–403.

[43]

C. Xu, F. Cheng, L. Chen, Z. Du, W. Li, G. Liu, P. W. Lee, Y. Tang, In silico prediction of chemical ames mutagenicity, Journal of chemical information and modeling 52 (11) (2012) 2840–2847.

[44]

A. Mayr, G. Klambauer, T. Unterthiner, S. Hochreiter, Deeptox: toxicity prediction using deep learning, Frontiers in Environmental Science 3 (2016) 80.

[45]

Z. Wu, B. Ramsundar, E. N. Feinberg, J. Gomes, C. Geniesse, A. S. Pappu, K. Leswing, V. Pande, Moleculenet: a benchmark for molecular machine learning, Chemical science 9 (2) (2018) 513–530.

[46]

H. Veith, N. Southall, R. Huang, T. James, D. Fayne, N. Artemenko, M. Shen, J. Inglese, C. P. Austin, D. G. Lloyd, et al., Comprehensive characterization of cytochrome p450 isozyme selectivity across chemical libraries, Nature biotechnology 27 (11) (2009) 1050–1055.

[47]

F. Schroff, D. Kalenichenko, J. Philbin, Facenet: A unified embedding for face recognition and clustering, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 815–823. doi:10.1109/CVPR.2015.7298682

[48]

C. Szegedy, S. Ioffe, V. Vanhoucke, A. A. Alemi, Inception-v4, inception-resnet and the impact of residual connections on learning, in: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI), 2017, pp. 4278–4284.

[49]

Q. Cao, L. Shen, W. Xie, O. M. Parkhi, A. Zisserman, Vggface2: A dataset for recognising faces across pose and age, in: 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG), 2018, pp. 67–74. doi:10.1109/FG.2018.00020

[50]

L. Deng, The mnist database of handwritten digit images for machine learning research [best of the web], IEEE signal processing magazine 29 (6) (2012) 141–142.

[51]

H. Xiao, K. Rasul, R. Vollgraf, Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms, arXiv preprint arXiv:1708.07747 (2017).

[52]

Y. Wang, J. Xiao, T. O. Suzek, J. Zhang, J. Wang, S. H. Bryant, Pubchem: a public information system for analyzing bioactivities of small molecules, Nucleic acids research 37 (suppl_2) (2009) W623–W633.