Recognition: 2 theorem links · Lean Theorem
Private Vertical Federated Inference for Time-Series
Pith reviewed 2026-05-12 01:01 UTC · model grok-4.3
The pith
A hybrid public-private model head enables secure vertical federated inference on large time-series transformers by limiting MPC to a small private section.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
PPHH-VFL splits the model head into a public plaintext head trained with adversarial objectives to hide sensitive patterns in embeddings and a lightweight private head secured by MPC that carries only the minimal information flow needed for accurate predictions. This combination lets the bulk of inference run in plaintext while the secure component remains small enough to be efficient.
What carries the argument
The Public/Private Hybrid Head-VFL architecture, which partitions the final layers into an adversarially trained public component and a compact MPC-protected private component to balance leakage protection with utility.
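The economics of the split can be sketched with toy two-party additive secret sharing, the standard building block behind MPC frameworks such as CrypTen: the bulk of the head runs as ordinary plaintext matrix products, and only the compact private head ever touches secret-shared data. All shapes and weights below are illustrative, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def share(x, rng):
    """Additively secret-share x between two parties (toy 2-party MPC)."""
    r = rng.standard_normal(x.shape)
    return x - r, r                     # party A holds x - r, party B holds r

# Public head: the bulk of inference runs cheaply in plaintext.
emb = rng.standard_normal((4, 16))      # adversarially trained public embedding
W_pub = rng.standard_normal((16, 8))
pub_out = emb @ W_pub                   # plaintext matmul, no MPC cost

# Private head: only this small linear map operates on secret-shared data,
# so MPC traffic scales with the tiny head, not the full backbone.
W_priv = rng.standard_normal((8, 2))
a, b = share(pub_out, rng)
out = a @ W_priv + b @ W_priv           # linear ops commute with additive shares

assert np.allclose(out, pub_out @ W_priv)
```

In a real protocol the private head's weights and nonlinearities would also be protected (e.g. via Beaver-triple multiplication), which this linear toy omits; the point is only that the shared portion, and hence the communication, stays small.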
If this is right
- Inference runs up to 44.4 times faster in wide-area networks than a full VFL plus MPC baseline.
- Communication per batch falls from 1.7 GB to 19 MB, a 91-fold reduction.
- Classification accuracy rises by 2.5 percentage points and regression RMSE improves by 40.7 percent relative to the baseline.
- The method scales to transformer models with 86 million parameters where end-to-end MPC does not.
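As a sanity check, the communication figure can be recomputed from the per-batch values quoted in the abstract; the small gap to the reported 91.2x comes from the rounding of the GB/MB values themselves.

```python
# Per-batch communication figures quoted from the abstract (decimal units assumed).
baseline_bytes = 1.7e9   # VFL+MPC baseline: 1.7 GB per batch
pphh_bytes = 19e6        # PPHH-VFL: 19 MB per batch
reduction = baseline_bytes / pphh_bytes
print(f"{reduction:.1f}x")  # prints "89.5x", consistent with ~91x given rounding
```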
Where Pith is reading between the lines
- The same head-splitting pattern could be applied to other modalities such as images or text if the public embeddings can be similarly protected.
- Reducing the MPC portion might make hybrid designs attractive even when full MPC is feasible but expensive.
- Dynamic choice of which layers stay public versus private could further tune the privacy-utility trade-off on new datasets.
Load-bearing premise
Adversarial training on the public embeddings sufficiently hides sensitive time-series features without letting the small private MPC head leak information or add too much overhead.
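This premise can be made concrete with a minimal sketch in the style of adversarial representation learning: the encoder producing the public embedding descends the task loss while ascending an adversary's loss on a sensitive attribute (a gradient-reversal objective). Everything below, the synthetic data, the 1-D embedding, and the single-weight adversary, is a hypothetical toy, not the paper's training setup.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((64, 8))
y_task = (X[:, 0] > 0).astype(float)    # downstream label to predict
y_sens = (X[:, 1] > 0).astype(float)    # sensitive attribute to hide

w_enc = rng.standard_normal(8) * 0.1    # encoder producing a 1-D "embedding"
w_adv = rng.standard_normal(1) * 0.1    # adversary reading that embedding
lam, lr = 1.0, 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(200):
    z = X @ w_enc                       # public embedding
    p_task = sigmoid(z)                 # task prediction from the embedding
    p_adv = sigmoid(w_adv[0] * z)       # adversary's guess at the attribute
    # Encoder: descend the task loss, ASCEND the adversary loss (reversal).
    g_task = X.T @ (p_task - y_task) / len(X)
    g_adv_enc = X.T @ ((p_adv - y_sens) * w_adv[0]) / len(X)
    w_enc -= lr * (g_task - lam * g_adv_enc)
    # Adversary: descend its own loss to remain a strong attacker.
    w_adv[0] -= lr * float(z @ (p_adv - y_sens)) / len(X)
```

Whether this kind of min-max objective actually hides the temporal structure of real time-series embeddings, rather than just a chosen attribute on toy data, is exactly what the load-bearing premise asserts and what the referee asks to see quantified.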
What would settle it
An experiment showing that an adversary can recover private input statistics or original time-series values from the public embeddings after adversarial training, or that the hybrid head's accuracy drops below the VFL+MPC baseline on held-out data.
Original abstract
Institutions may benefit from collaborative inference on time-series data. In settings where privacy is necessary, multi-party computation (MPC) is a straightforward approach to providing strong guarantees, yet it remains prohibitively expensive and scales poorly with modern transformer architectures. Vertical Federated Learning (VFL) offers efficiency but suffers from privacy leakage at the embedding level, and securing the entire VFL model head via MPC remains prohibitively slow and communication-heavy for larger models. To enable practical, secure inference at scale, we propose "Public/Private Hybrid Head-VFL" (PPHH-VFL). This hybrid architecture splits the model head into an efficient plaintext public head and a secure, lightweight MPC private head. By applying adversarial training to the public embeddings, we mitigate privacy leakage; concurrently, the small private head securely preserves the flow of sensitive information needed for high downstream utility. Empirical evaluations on models ranging up to 86 million parameters demonstrate that PPHH-VFL accelerates inference by up to six orders of magnitude compared to end-to-end MPC. Compared to a standard VFL+MPC baseline, our approach scales significantly better, achieving a speedup of up to 44.4x in WAN and a 91.2x reduction in communication costs (dropping from 1.7 GB to 19 MB per batch), while simultaneously improving downstream classification accuracy by 2.50% and regression RMSE by 40.7%.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Public/Private Hybrid Head-VFL (PPHH-VFL), a hybrid architecture for private vertical federated inference on time-series data. It splits the model head into an efficient plaintext public head (with adversarial training on embeddings to mitigate leakage) and a lightweight MPC private head to preserve the sensitive information flow needed for utility. Empirical evaluations on models up to 86 million parameters claim up to six orders of magnitude speedup over end-to-end MPC, a 44.4x WAN speedup and a 91.2x communication reduction (1.7 GB to 19 MB per batch) versus a VFL+MPC baseline, plus a 2.50% gain in classification accuracy and a 40.7% improvement in regression RMSE.
Significance. If the empirical results hold under detailed scrutiny and the privacy mitigation is substantiated, the hybrid design could enable practical secure inference for large-scale transformers in vertical federated settings, addressing the scalability limits of pure MPC while improving on standard VFL leakage. The reported model scale (86M parameters) and concrete communication reductions represent a notable strength for applied systems work.
major comments (2)
- [Experimental Evaluation] Experimental sections: The abstract reports quantitative speedups, communication savings, and accuracy improvements (2.50% classification, 40.7% RMSE), but provides no details on datasets, model architectures beyond parameter count, training procedures, number of runs, statistical significance, or the precise method used to measure privacy leakage. This absence is load-bearing for evaluating the central performance claims.
- [Method] PPHH-VFL architecture and adversarial training description: The claim that adversarial training on public embeddings sufficiently mitigates privacy leakage while the small MPC head preserves utility is central to the privacy-utility tradeoff, yet no formal privacy bounds, explicit threat model, or analysis of leakage risks from time-series temporal correlations are provided. This leaves the mitigation both heuristic and unquantified.
minor comments (2)
- [Abstract] Abstract: The 'up to six orders of magnitude' speedup and the 44.4x/91.2x figures should explicitly state the network conditions (LAN vs. WAN) and batch sizes under which they were measured for clarity.
- [Throughout] Notation: Ensure consistent use of terms such as 'public head' and 'private head' across sections to avoid ambiguity in the hybrid split description.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment below and will revise the paper accordingly to enhance clarity, reproducibility, and the description of our privacy approach.
Point-by-point responses
Referee: [Experimental Evaluation] Experimental sections: The abstract reports quantitative speedups, communication savings, and accuracy improvements (2.50% classification, 40.7% RMSE), but provides no details on datasets, model architectures beyond parameter count, training procedures, number of runs, statistical significance, or the precise method used to measure privacy leakage. This absence is load-bearing for evaluating the central performance claims.
Authors: We agree that the experimental section requires more explicit details to support reproducibility and evaluation of the claims. In the revised manuscript, we will expand this section to specify the datasets for classification and regression tasks, full model architecture details (including transformer configurations for models up to 86M parameters), training procedures with hyperparameters, the number of runs with statistical significance measures (e.g., standard deviations), and the exact privacy leakage measurement method (adversarial attack success rates on embeddings). These additions will substantiate the reported speedups (up to 6 orders of magnitude), communication reductions (1.7 GB to 19 MB), and accuracy gains. revision: yes
Referee: [Method] PPHH-VFL architecture and adversarial training description: The claim that adversarial training on public embeddings sufficiently mitigates privacy leakage while the small MPC head preserves utility is central to the privacy-utility tradeoff, yet no formal privacy bounds, explicit threat model, or analysis of leakage risks from time-series temporal correlations are provided. This leaves the mitigation both heuristic and unquantified.
Authors: We agree that an explicit threat model and discussion of time-series leakage risks would strengthen the presentation. We will add a dedicated subsection describing the semi-honest threat model and how adversarial training on public embeddings reduces leakage while the MPC head handles sensitive flows for utility. Although we cannot provide formal privacy bounds (our focus is practical efficiency via empirical mitigation, common in this area), we will include additional empirical leakage measurements and explicitly note the heuristic nature as a limitation, along with analysis of temporal correlation risks. revision: partial
Circularity Check
No circularity: claims rest on direct empirical measurements of the hybrid architecture
Full rationale
The paper presents an empirical architecture (PPHH-VFL) whose performance claims—speedups up to 6 orders of magnitude, 44.4x WAN improvement, 91.2x communication reduction, and accuracy gains—are reported as measured outcomes on models up to 86M parameters. No derivation chain, equations, or first-principles results are invoked that reduce by construction to fitted inputs or self-citations. The adversarial-training step is presented as a heuristic mitigation rather than a formally derived bound, but this does not create circularity because the utility and efficiency numbers are not algebraically forced by the same quantities used to define the split. The work is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- [domain assumption] Adversarial training on public embeddings sufficiently hides sensitive information without degrading utility
- [domain assumption] The lightweight private MPC head can be made small enough to remain practical while carrying all necessary sensitive flow
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean : reality_from_one_distinction (unclear)
  Unclear relation between the paper passage and the cited Recognition theorem.
  Passage: "We propose Public/Private Hybrid Head-VFL (PPHH-VFL). This hybrid architecture splits the model head into an efficient plaintext public head and a secure, lightweight MPC private head. By applying adversarial training to the public embeddings..."
- IndisputableMonolith/Cost/FunctionalEquation.lean : washburn_uniqueness_aczel (unclear)
  Unclear relation between the paper passage and the cited Recognition theorem.
  Passage: "Empirical evaluations on models ranging up to 86 million parameters demonstrate that PPHH-VFL accelerates inference by up to six orders of magnitude..."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Fei Tang, Shikai Liang, Guowei Ling, and Jinyong Shan. IHVFL: a privacy-enhanced intention-hiding vertical federated learning framework for medical data. Cybersecurity, 6(1):37, 2023.
- [2] Aditya Shankar, Jérémie Decouchant, Dimitra Gkorou, Rihan Hai, and Lydia Chen. Share secrets for privacy: Confidential forecasting with vertical federated learning. In International Conference on Availability, Reliability and Security, pages 358–379. Springer, 2025.
- [3] Michael Ben-Or, Shafi Goldwasser, and Avi Wigderson. Completeness theorems for non-cryptographic fault-tolerant distributed computation. In Providing Sound Foundations for Cryptography: On the Work of Shafi Goldwasser and Silvio Micali, pages 351–371. 2019.
- [4] Daniel Demmler, Thomas Schneider, and Michael Zohner. ABY - A Framework for Efficient Mixed-Protocol Secure Two-Party Computation. In Proceedings 2015 Network and Distributed System Security Symposium, San Diego, CA, 2015. Internet Society.
- [5] Brian Knott, Shobha Venkataraman, Awni Hannun, Shubho Sengupta, Mark Ibrahim, and Laurens van der Maaten. CrypTen: Secure Multi-Party Computation Meets Machine Learning. In Advances in Neural Information Processing Systems, volume 34, pages 4961–4973. Curran Associates, Inc., 2021.
- [6] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017.
- [7] Dacheng Li, Rulin Shao, Hongyi Wang, Han Guo, Eric P Xing, and Hao Zhang. MPCFormer: fast, performant and private transformer inference with MPC. arXiv preprint arXiv:2211.01452, 2022.
- [8] Yongqin Wang, G Edward Suh, Wenjie Xiong, Benjamin Lefaudeux, Brian Knott, Murali Annavaram, and Hsien-Hsin S Lee. Characterization of MPC-based private inference for transformer-based models. In 2022 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pages 187–197. IEEE, 2022.
- [9] Yuanchao Ding, Hua Guo, Yewei Guan, Weixin Liu, Jiarong Huo, Zhenyu Guan, and Xiyong Zhang. East: Efficient and accurate secure transformer framework for inference. arXiv preprint arXiv:2308.09923, 2023.
- [10] Karthik Garimella, Nandan Kumar Jha, and Brandon Reagen. Sisyphus: A Cautionary Tale of Using Low-Degree Polynomial Activations in Privacy-Preserving Deep Learning, November 2021.
- [11] Siam Umar Hussain, Mojan Javaheripi, Mohammad Samragh, and Farinaz Koushanfar. COINN: Crypto/ML Codesign for Oblivious Inference via Neural Networks. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, CCS ’21, pages 3266–3281, New York, NY, USA, November 2021. Association for Computing Machinery.
- [12] Pratyush Mishra, Ryan Lehmkuhl, Akshayaram Srinivasan, Wenting Zheng, and Raluca Ada Popa. Delphi: A Cryptographic Inference Service for Neural Networks. In 29th USENIX Security Symposium (USENIX Security 20), pages 2505–2522, 2020.
- [13] Zahra Ghodsi, Akshaj Kumar Veldanda, Brandon Reagen, and Siddharth Garg. CryptoNAS. Advances in Neural Information Processing Systems, 33:16961–16971, 2020.
- [14] Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas. Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics, pages 1273–1282. PMLR, 2017.
- [15] Qiang Yang, Yang Liu, Tianjian Chen, and Yongxin Tong. Federated machine learning: Concept and applications. ACM Transactions on Intelligent Systems and Technology (TIST), 10(2):1–19, 2019.
- [16] Xicun Yang, JunePyo Jung, Jialiang Lu, Keun-Woo Lim, and Leonardo Linguaglossa. MVFL: Multivariate vertical federated learning for time-series forecasting. In 2025 21st International Conference on Network and Service Management (CNSM), pages 1–7. IEEE, 2025.
- [17] John Morris, Volodymyr Kuleshov, Vitaly Shmatikov, and Alexander M Rush. Text embeddings reveal (almost) as much as text. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 12448–12460, 2023.
- [18] Jiankai Sun, Xin Yang, Yuanshun Yao, and Chong Wang. Label leakage and protection from forward embedding in vertical federated learning. arXiv preprint arXiv:2203.01451, 2022.
- [19] Max Friedrich, Arne Köhn, Gregor Wiedemann, and Chris Biemann. Adversarial learning of privacy-preserving text representations for de-identification of medical records. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 5829–5839, 2019.
- [20] Jinyuan Jia and Neil Zhenqiang Gong. AttriGuard: A practical defense against attribute inference attacks via adversarial machine learning. In 27th USENIX Security Symposium (USENIX Security 18), pages 513–529, 2018.
- [21] Congzheng Song and Vitaly Shmatikov. Overlearning reveals sensitive attributes. arXiv preprint arXiv:1905.11742, 2019.
- [22] Yuting Zhan, Hamed Haddadi, and Afra Mashhadi. Privacy-aware adversarial network in human mobility prediction. arXiv preprint arXiv:2208.05009, 2022.
- [23] Chong Huang, Peter Kairouz, Xiao Chen, Lalitha Sankar, and Ram Rajagopal. Generative adversarial privacy. arXiv preprint arXiv:1807.05306, 2018.
- [24] Yang Liu, Yan Kang, Tianyuan Zou, Yanhong Pu, Yuanqin He, Xiaozhou Ye, Ye Ouyang, Ya-Qin Zhang, and Qiang Yang. Vertical federated learning: Concepts, advances, and challenges. IEEE Transactions on Knowledge and Data Engineering, 36(7):3615–3634, 2024.
- [25] Yuncheng Wu, Naili Xing, Gang Chen, Tien Tuan Anh Dinh, Zhaojing Luo, Beng Chin Ooi, Xiaokui Xiao, and Meihui Zhang. Falcon: A privacy-preserving and interpretable vertical federated learning system. Proceedings of the VLDB Endowment, 16(10):2471–2484, 2023.
- [26] Haoran Li, Mingshi Xu, and Yangqiu Song. Sentence embedding leaks more information than you expect: Generative embedding inversion attack to recover the whole sentence. In Findings of the Association for Computational Linguistics: ACL 2023, pages 14022–14040, 2023.
- [27] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial networks. Communications of the ACM, 63(11):139–144, 2020.
- [28] John S Bridle. Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition. In Neurocomputing: Algorithms, Architectures and Applications, pages 227–236. Springer, 1990.
- [29] Sicong Liu, Junzhao Du, Anshumali Shrivastava, and Lin Zhong. Privacy adversarial network: representation learning for mobile data privacy. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 3(4):1–18, 2019.
- [30] Dimitris Tsipras, Shibani Santurkar, Logan Engstrom, Alexander Turner, and Aleksander Madry. Robustness may be at odds with accuracy. arXiv preprint arXiv:1805.12152, 2018.
- [31] Florian Knauer and Will Cukierski. Rossmann Store Sales. https://kaggle.com/competitions/rossmann-store-sales, 2015. Kaggle.
- [32] Enes Altinisik, Hassan Sajjad, Husrev Sencar, Safa Messaoud, and Sanjay Chawla. Impact of adversarial training on robustness and generalizability of language models. In Findings of the Association for Computational Linguistics: ACL 2023, pages 7828–7840, 2023.
- [33] Bashir Sadeghi, Runyi Yu, and Vishnu Boddeti. On the global optima of kernelized adversarial representation learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 7971–7979, 2019.
- [34] Stephen Hardy, Wilko Henecka, Hamish Ivey-Law, Richard Nock, Giorgio Patrini, Guillaume Smith, and Brian Thorne. Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption. arXiv preprint arXiv:1711.10677, 2017.
- [35] Kewei Cheng, Tao Fan, Yilun Jin, Yang Liu, Tianjian Chen, Dimitrios Papadopoulos, and Qiang Yang. SecureBoost: A lossless federated learning framework. IEEE Intelligent Systems, 36(6):87–98, 2021.
- [36] Yang Yan, Guozheng Yang, Yan Gao, Cheng Zang, Jiajun Chen, and Qiang Wang. Multi-participant vertical federated learning based time series prediction. In Proceedings of the 8th International Conference on Computing and Artificial Intelligence, pages 165–171, 2022.
- [37] Pathum Chamikara Mahawaga Arachchige, Peter Bertok, Ibrahim Khalil, Dongxi Liu, Seyit Camtepe, and Mohammed Atiquzzaman. Local differential privacy for deep learning. IEEE Internet of Things Journal, 7(7):5827–5842, 2019.
- [38] Yuanming Cao, Chengqi Li, and Wenbo He. LDP-slicing: Local differential privacy for images via randomized bit-plane slicing. arXiv preprint arXiv:2603.03711, 2026.
- [39] Oluwaseyi Feyisetan and Shiva Kasiviswanathan. Private release of text embedding vectors. In Proceedings of the First Workshop on Trustworthy Natural Language Processing, pages 15–27, 2021.