What Does the Server See? Understanding Privacy Leakage from Large Language Models in Split Inference

Cen Chen; Fuyi Wang; Mingyuan Fan; Yu Liu

arxiv: 2605.23158 · v1 · pith:46FPUWXJnew · submitted 2026-05-22 · 💻 cs.CR · cs.CL · cs.LG

Pith reviewed 2026-05-25 04:39 UTC · model grok-4.3

classification 💻 cs.CR · cs.CL · cs.LG

keywords split inference · privacy leakage · large language models · reconstruction attack · activation inversion · perturbation defense · LLM security

The pith

Split inference for LLMs allows servers to reconstruct client inputs from activations despite common defenses.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes that partitioning large language models between client and server for split inference does not fully protect the client's private input, as the server receives intermediate activations that can be inverted. The authors introduce ActInv to solve an activation matching problem and recover the original input with high fidelity. Common defenses such as adding Gaussian noise or sparsifying activations fail to block this reconstruction in their tests. They define the Perturbation Amplification Factor to measure each layer's resistance and show that leakage risk differs sharply across layers. From these observations they design PriPert, a defense that chooses perturbation directions to increase reconstruction error while tracking utility and cost.
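To make the setting concrete, here is a minimal sketch of split inference with toy elementwise layers. The names `make_layer`, `LAYERS`, and `split_inference` are hypothetical and chosen for illustration; a real deployment would cut a transformer between blocks, but the data flow is the same: the client runs the first layers and only the intermediate activation reaches the server.

```python
# Toy split inference. All names and layer definitions are hypothetical;
# they stand in for the client-side and server-side parts of a real model.

def make_layer(scale, shift):
    """Return a toy layer: an elementwise affine map followed by ReLU."""
    return lambda h: [max(0.0, scale * v + shift) for v in h]

LAYERS = [make_layer(1.0, 0.5), make_layer(2.0, -1.0), make_layer(0.5, 0.25)]

def run(layers, h):
    """Apply a sequence of layers to a hidden state."""
    for layer in layers:
        h = layer(h)
    return h

def split_inference(x, cut):
    """Client runs LAYERS[:cut]; the server sees only the activation."""
    activation = run(LAYERS[:cut], x)        # computed on the client
    output = run(LAYERS[cut:], activation)   # computed on the server
    return activation, output

x = [0.2, -0.4, 1.0]
activation, output = split_inference(x, cut=1)
```

The split does not change the model's output; what changes is which party sees which tensor. The privacy question the paper raises is how much of `x` the server can infer from `activation`.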

Core claim

ActInv reconstructs the client's input by solving an intermediate activation matching problem, yielding high-fidelity results even when Gaussian noise injection or activation sparsification is present. The Perturbation Amplification Factor quantifies that privacy vulnerability is not uniform across layers. PriPert improves protection by calibrating perturbation directions during backpropagation to maximize reconstruction error.
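The two baseline defenses named here can be sketched in a few lines. This is an assumption about their usual form, not the paper's code: noise is taken to be i.i.d. Gaussian per entry, and sparsification is taken to mean zeroing the smallest-magnitude fraction of entries. The function names are hypothetical, and the ratio of 0.5 simply mirrors the value quoted in the Figure 2 caption.

```python
import random

def add_gaussian_noise(z, sigma, rng):
    """Add i.i.d. N(0, sigma^2) noise to each activation entry."""
    return [v + rng.gauss(0.0, sigma) for v in z]

def sparsify(z, ratio):
    """Zero out the `ratio` fraction of entries with the smallest magnitude.

    Assumption: magnitude-based pruning; the paper may define it differently.
    """
    n_drop = int(len(z) * ratio)
    order = sorted(range(len(z)), key=lambda i: abs(z[i]))
    dropped = set(order[:n_drop])
    return [0.0 if i in dropped else v for i, v in enumerate(z)]

z = [0.1, -2.0, 0.5, 3.0]
noisy = add_gaussian_noise(z, sigma=0.1, rng=random.Random(0))
sparse = sparsify(z, ratio=0.5)
```

Both transforms are applied by the client before transmission, so the server only ever observes `noisy` or `sparse`, never `z` itself.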

What carries the argument

ActInv, an attack that formulates reconstruction as an intermediate activation matching problem solved from server-received activations.
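The idea of activation matching can be illustrated with a deliberately small toy. This is not ActInv: the client "model" is a single linear layer, the optimizer is plain gradient descent, and the final step projects the recovered vector onto the nearest token embedding. `EMBED`, `W`, and every function name are hypothetical. The sketch assumes, as the paper's threat model does, that the server knows the client-side weights.

```python
# Toy activation matching (NOT the authors' ActInv implementation).
# Assumption: the server knows EMBED and W, i.e. a white-box setting.

EMBED = {  # hypothetical 3-d token embeddings
    "the": [1.0, 0.0, 0.0],
    "cat": [0.0, 1.0, 0.0],
    "sat": [0.0, 0.0, 1.0],
    "mat": [1.0, 1.0, 0.0],
}
W = [[1.0, 0.5, 0.0],   # hypothetical client-side linear layer
     [0.0, 1.0, 0.5],
     [0.5, 0.0, 1.0]]

def client_layer(x):
    """Client-side computation: z = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def invert_activation(z, steps=300, lr=0.1):
    """Minimize ||W x - z||^2 over x by gradient descent."""
    x = [0.0, 0.0, 0.0]
    for _ in range(steps):
        r = [a - b for a, b in zip(client_layer(x), z)]  # residual W x - z
        grad = [2.0 * sum(W[i][j] * r[i] for i in range(3)) for j in range(3)]
        x = [xj - lr * g for xj, g in zip(x, grad)]
    return x

def nearest_token(x):
    """Project a continuous vector onto the closest token embedding."""
    return min(EMBED, key=lambda t: sum((a - b) ** 2 for a, b in zip(EMBED[t], x)))

def reconstruct(activations):
    return [nearest_token(invert_activation(z)) for z in activations]

secret = ["the", "cat", "sat", "mat"]
observed = [client_layer(EMBED[t]) for t in secret]
recovered = reconstruct(observed)

# A small fixed perturbation on every activation (a stand-in for light noise).
offset = [0.05, -0.05, 0.05]
perturbed = [[v + o for v, o in zip(z, offset)] for z in observed]
recovered_perturbed = reconstruct(perturbed)
```

In this toy, the projection onto discrete tokens absorbs the small perturbation, which hints at why light noise alone may not stop reconstruction. It says nothing about the noise levels or models evaluated in the paper.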

If this is right

High-fidelity input reconstruction remains possible against Gaussian noise and sparsification defenses.
Some layers exhibit natural resistance to reconstruction while others are highly susceptible.
Calibrating perturbation directions during backpropagation measurably raises reconstruction error.
PriPert maintains acceptable utility and overhead while strengthening privacy in split setups.
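Why the direction of a perturbation can matter as much as its size shows up even in a diagonal toy layer. The sketch below is not PriPert; it assumes an attacker who inverts a known linear layer exactly, and it uses projected gradient ascent over unit-norm perturbations as a stand-in for "calibrating the direction". `W_DIAG` and all function names are hypothetical.

```python
import math

W_DIAG = [4.0, 1.0, 0.25]  # hypothetical diagonal client layer: z_i = w_i * x_i

def reconstruction_error(delta):
    """Input-space error of an attacker who inverts the layer: ||W^{-1} delta||."""
    return math.sqrt(sum((d / w) ** 2 for d, w in zip(delta, W_DIAG)))

def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def calibrate_direction(steps=50, lr=0.1):
    """Projected gradient ascent on the squared error over unit-norm perturbations."""
    delta = normalize([1.0, 1.0, 1.0])
    for _ in range(steps):
        # Gradient of sum((d_i / w_i)^2) with respect to d_i.
        grad = [2.0 * d / (w * w) for d, w in zip(delta, W_DIAG)]
        delta = normalize([d + lr * g for d, g in zip(delta, grad)])
    return delta

uniform = normalize([1.0, 1.0, 1.0])      # perturbation with no preferred direction
calibrated = calibrate_direction()        # perturbation steered toward high error
err_uniform = reconstruction_error(uniform)
err_calibrated = reconstruction_error(calibrated)
```

Both perturbations have the same norm, so they cost the same "budget" in activation space, yet the calibrated one lands on the coordinate the layer scales down the most and so produces the larger error after inversion.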

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Choosing the split point at layers with higher PAF values could reduce leakage with no added client cost.
If the server must be assumed to know the architecture, stronger client-side input protections become necessary.
The non-uniform layer vulnerability suggests testing split inference on models where the cut is chosen adaptively per input.
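The first extension above amounts to a small selection rule. A sketch, assuming per-layer PAF scores have already been measured and that the client can only afford a limited number of layers; the scores below are placeholders for illustration and are not values from the paper, and `choose_cut` is a hypothetical helper.

```python
def choose_cut(paf_by_layer, max_client_layers):
    """Pick the cut layer with the highest score among layers the client can afford."""
    candidates = {k: v for k, v in paf_by_layer.items() if k <= max_client_layers}
    if not candidates:
        raise ValueError("no layer fits the client budget")
    return max(candidates, key=candidates.get)

# Placeholder scores for illustration only; these are NOT values from the paper.
paf_by_layer = {1: 0.8, 2: 1.9, 3: 1.2, 4: 2.4}
cut = choose_cut(paf_by_layer, max_client_layers=3)
```

The budget argument is an added assumption: a deeper cut means more computation on the client, so the best-scoring layer overall may not be the one a constrained device can reach.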

Load-bearing premise

The server knows the client model architecture and can solve the activation matching problem without protections beyond the tested perturbations.

What would settle it

An experiment showing that ActInv produces only low-fidelity reconstructions when the server lacks the client architecture or when the client applies architecture modifications unknown to the server.
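The shape of such an experiment can be shown on a toy. The sketch below compares a server that simulates the true client layer with one that guesses the wrong layer; matching is done by exhaustive search over a three-token vocabulary rather than by optimization. Everything here (`EMBED`, `true_layer`, `guessed_layer`, `match`) is hypothetical and only illustrates the dependence on architecture knowledge.

```python
EMBED = {"a": (1.0, 0.0, 0.0), "b": (0.0, 1.0, 0.0), "c": (0.0, 0.0, 1.0)}

def true_layer(x):
    """The client's actual layer: a cyclic shift of coordinates."""
    return (x[2], x[0], x[1])

def guessed_layer(x):
    """A wrong guess by a server that does not know the architecture."""
    return x

def match(z, layer):
    """Return the token whose simulated activation is closest to z."""
    def dist(tok):
        return sum((p - q) ** 2 for p, q in zip(layer(EMBED[tok]), z))
    return min(EMBED, key=dist)

secret = ["a", "c", "b", "a"]
observed = [true_layer(EMBED[t]) for t in secret]

informed = [match(z, true_layer) for z in observed]       # white-box server
uninformed = [match(z, guessed_layer) for z in observed]  # wrong architecture
accuracy = sum(s == u for s, u in zip(secret, uninformed)) / len(secret)
```

With the true layer the search recovers every token; with the wrong one it recovers none in this example. Whether real architecture obfuscation degrades an attack this sharply is exactly what the proposed experiment would have to measure.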

Figures

Figures reproduced from arXiv: 2605.23158 by Cen Chen, Fuyi Wang, Mingyuan Fan, Yu Liu.

**Figure 2.** The ActInv’s Precision, Recall, and ROUGE-L scores evolve over 2000 optimization iterations in AlpacaEval. We use a sparsification ratio of 0.5 …

**Figure 3.** The common components of a single block within …

**Figure 4.** Comparison of different layers’ sensitivity in Qwen3-0.6B. The expected PAF values capture the average amplification across …

**Figure 5.** Comparison of different layers’ sensitivity in Falcon3-1B.

**Figure 6.** Evaluation rubric used by the judge model.

**Figure 7.** Comparison of different layers’ sensitivity in Llama …
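The Figure 2 caption names Precision, Recall, and ROUGE-L as reconstruction metrics. A minimal sketch of how such scores are commonly computed on whitespace tokens follows; the paper's exact tokenization and definitions are not given here, so bag-of-tokens overlap for precision and recall and the LCS-based F1 for ROUGE-L are assumptions, and the example sentences are invented.

```python
from collections import Counter

def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def token_precision_recall(reference, candidate):
    """Bag-of-tokens overlap, counted with multiplicity."""
    overlap = sum((Counter(reference) & Counter(candidate)).values())
    precision = overlap / len(candidate) if candidate else 0.0
    recall = overlap / len(reference) if reference else 0.0
    return precision, recall

def rouge_l_f1(reference, candidate):
    """F1 of LCS-based precision and recall (ROUGE-L with equal weighting)."""
    lcs = lcs_length(reference, candidate)
    if lcs == 0:
        return 0.0
    p, r = lcs / len(candidate), lcs / len(reference)
    return 2 * p * r / (p + r)

# Invented example: a true prompt and an imperfect reconstruction.
reference = "please summarize my medical report".split()
candidate = "please summarize the medical report now".split()
precision, recall = token_precision_recall(reference, candidate)
rouge_l = rouge_l_f1(reference, candidate)
```

Precision penalizes extra tokens in the reconstruction, recall penalizes missing ones, and ROUGE-L additionally rewards getting the token order right.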

Original abstract

The deployment of large language models (LLMs) on resource-constrained devices remains challenging, spurring interest in split inference, where models are partitioned between client and server to reduce computational burden and enhance privacy by transmitting only intermediate activations. However, the privacy-preserving capabilities of split inference, particularly in the context of LLMs, have not been exhaustively investigated. To fill this gap, we introduce ActInv, which solves an intermediate activation matching problem to reconstruct the client's input. Extensive evaluations demonstrate that ActInv achieves high-fidelity reconstructions, even in the presence of common perturbation-based defenses such as Gaussian noise injection and activation sparsification. To systematically understand this vulnerability, we develop Perturbation Amplification Factor (PAF), a metric for quantifying a layer's inherent resistance to reconstruction. Our analysis reveals that privacy vulnerability is not uniform across layers, with some layers being highly susceptible to leakage while others offer natural resistance. Furthermore, we demonstrate that defense effectiveness can be significantly improved by calibrating perturbation directions to maximize reconstruction error during backpropagation. Building on these insights, we design PriPert and conduct comprehensive evaluations, covering privacy, utility, and computational overhead, to demonstrate its effectiveness.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

ActInv reconstructs LLM inputs from split activations even under noise or sparsification, but only when the server knows the exact client architecture and cut layer.

The main takeaway is that split inference for LLMs leaks client inputs through intermediate activations, and ActInv recovers them at high fidelity despite Gaussian noise or activation sparsification. The paper also gives PAF as a way to score which layers resist reconstruction better and introduces PriPert to pick perturbation directions that raise reconstruction error during backprop. Those pieces are the actual new bits: the attack formulation tailored to LLM splits, the layer-wise metric, and the calibrated defense rather than generic noise addition. The evaluations check privacy, utility, and overhead together, which is the right scope for deployment questions.

The central limitation is the white-box precondition. The matching optimization needs the server to know the client model architecture, layer sizes, activation functions, and split point; without that the attack cannot even be set up. If clients hide or randomize the architecture, the reported fidelity numbers no longer apply. The abstract claims extensive tests but supplies no numbers, error bars, or dataset details, so the practical strength of the results is still unclear.

This work is aimed at people building or auditing split inference for edge LLMs. It is grounded enough in a real deployment pattern to go to referees, even though the architecture-knowledge assumption needs explicit discussion and the quantitative claims need more visible support.

Referee Report

2 major / 2 minor

Summary. The paper claims that split inference for LLMs is vulnerable to input reconstruction by the server via ActInv, which solves an intermediate activation matching optimization problem to achieve high-fidelity recovery even under perturbation defenses such as Gaussian noise and sparsification. It introduces the Perturbation Amplification Factor (PAF) metric to quantify per-layer resistance to reconstruction and proposes PriPert, a defense that calibrates perturbation directions via backpropagation to increase reconstruction error, with evaluations addressing privacy, utility, and computational overhead.

Significance. If the empirical results hold under the stated assumptions, the work identifies concrete privacy limitations of split inference for LLMs and supplies both an analysis tool (PAF) and a practical defense (PriPert). This contributes to privacy-preserving ML by showing that activation transmission alone does not suffice for strong privacy and by offering layer-specific insights that could guide split-point selection.

major comments (2)

[Method section (ActInv definition)] The ActInv formulation (described after the abstract and in the method section) requires the server to possess exact white-box knowledge of the client model architecture, layer dimensions, activation functions, and precise split point in order to instantiate the activation matching objective. This precondition is load-bearing for all reported fidelity claims; without it the optimization cannot be set up, yet the manuscript provides no black-box variant or evaluation under architecture obfuscation.
[Evaluations section] Section on evaluations: the abstract asserts 'extensive evaluations' and effectiveness against listed defenses, but the provided description contains no quantitative reconstruction metrics, error bars, dataset sizes, ablation studies, or statistical tests. These details are required to substantiate the central claim that ActInv succeeds even under Gaussian noise and sparsification.

minor comments (2)

[Abstract and introduction] Define the acronyms ActInv, PAF, and PriPert at first use in the main text rather than only in the abstract.
[Method section] Clarify the precise mathematical formulation of the activation matching loss and the backpropagation used for PriPert calibration.
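As a point of reference for the second minor comment, activation-matching attacks are often written in the generic form below. This is an assumption about the general shape of such an objective, not the loss stated in the paper. The symbols are introduced here for illustration: $f_{\theta}$ is the client-side part of the model, $x$ the true input, $\eta$ any perturbation the client adds as a defense, $\tilde{z}$ the activation the server observes, and $\hat{x}$ the server's reconstruction.

```latex
\hat{x} \;=\; \arg\min_{x'} \; \bigl\| f_{\theta}(x') - \tilde{z} \bigr\|_2^2,
\qquad
\tilde{z} \;=\; f_{\theta}(x) + \eta .
```

Written this way, the white-box requirement is visible in the objective itself: the server must be able to evaluate $f_{\theta}$ on candidate inputs $x'$, which presupposes knowledge of the client architecture and weights.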

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address each major comment below, providing clarifications on the threat model and committing to improvements in the presentation of results.

Referee: [Method section (ActInv definition)] The ActInv formulation (described after the abstract and in the method section) requires the server to possess exact white-box knowledge of the client model architecture, layer dimensions, activation functions, and precise split point in order to instantiate the activation matching objective. This precondition is load-bearing for all reported fidelity claims; without it the optimization cannot be set up, yet the manuscript provides no black-box variant or evaluation under architecture obfuscation.

Authors: ActInv is defined under the standard white-box threat model for split inference, in which the server knows the client model architecture, dimensions, activation functions, and split point because these are fixed at deployment time and typically public in such systems. This matches the assumptions in prior split-learning privacy analyses. We will revise the method section to state this assumption explicitly and discuss its scope, but we do not claim the attack applies under architecture obfuscation.
revision: partial

Referee: [Evaluations section] Section on evaluations: the abstract asserts 'extensive evaluations' and effectiveness against listed defenses, but the provided description contains no quantitative reconstruction metrics, error bars, dataset sizes, ablation studies, or statistical tests. These details are required to substantiate the central claim that ActInv succeeds even under Gaussian noise and sparsification.

Authors: The evaluations section reports quantitative reconstruction metrics (MSE, PSNR, cosine similarity), results across multiple datasets with explicit sizes, ablations on noise variance and sparsity ratios, and comparisons against the listed defenses. To strengthen the presentation we will add error bars, dataset-size tables, additional ablation figures, and statistical significance tests in the revision.
revision: yes

Circularity Check

0 steps flagged

No circularity: purely empirical attack/defense constructions with no self-referential derivations

full rationale

The paper introduces ActInv as an optimization-based reconstruction attack, defines PAF as a new metric, and evaluates PriPert as a defense. All claims rest on experimental results under stated assumptions (white-box architecture knowledge). No equations, predictions, or first-principles results are presented that reduce by construction to fitted inputs, self-citations, or renamed patterns. The derivation chain is self-contained as direct empirical measurement.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 3 invented entities

Only abstract available; no explicit free parameters, axioms, or invented physical entities are described. The work introduces algorithmic methods rather than new physical postulates.

axioms (1)

domain assumption The server receives intermediate activations computed by the client-side portion of the split model.
Core premise of the split-inference threat model stated in the abstract.

invented entities (3)

ActInv · no independent evidence
purpose: Algorithm to reconstruct client input by solving an activation matching problem.
New method introduced to demonstrate leakage.

PAF · no independent evidence
purpose: Metric quantifying a layer's inherent resistance to input reconstruction.
New metric proposed to analyze vulnerability.

PriPert · no independent evidence
purpose: Defense that selects perturbation directions to maximize reconstruction error.
New defense design based on the analysis.

pith-pipeline@v0.9.0 · 5746 in / 1282 out tokens · 53063 ms · 2026-05-25T04:39:33.639468+00:00 · methodology
