
arxiv: 1907.06034 · v1 · pith:LUQXUDCVnew · submitted 2019-07-13 · 💻 cs.CR · cs.LG

Towards Characterizing and Limiting Information Exposure in DNN Layers

Pith reviewed 2026-05-24 22:13 UTC · model grok-4.3

classification 💻 cs.CR cs.LG
keywords DNN layers · information exposure · generalization error · membership inference · trusted execution environment · sensitive information · privacy protection

The pith

Last layers of a DNN encode more sensitive information from the training data than the first layers

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a framework based on generalization error to measure the amount of sensitive information memorized in each layer of a pre-trained DNN. It shows that, when examined individually, the last layers hold more information from the training data than the first layers. This matters for DNNs running on phones and other user devices because memorized details can be extracted through attacks. The same model architecture shows similar per-layer exposure patterns across different training datasets. The work also tests an approach that shields the most exposed layers inside a Trusted Execution Environment (TEE) to reduce leakage risks.

Core claim

When considered individually, the last layers encode a larger amount of information from the training data compared to the first layers. Neurons in convolutional layers can expose more sensitive information than those in fully connected layers, while the same DNN architecture trained on different datasets exhibits similar exposure per layer. An architecture is evaluated that protects the most sensitive layers within the memory limits of a Trusted Execution Environment against white-box membership inference attacks without incurring significant computational overhead.

What carries the argument

Framework that measures sensitive information memorized in each DNN layer using generalization error as the indicator
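
For reference, the generalization error in its standard textbook form is the gap between the expected loss on unseen data and the average loss on the training set. The notation below is an assumption made for illustration; the paper's exact per-layer formulation is not reproduced on this page.

```latex
% Assumed notation (not from the paper): f is the trained model, \ell a loss,
% S = \{(x_i, y_i)\}_{i=1}^{n} the training set, \mathcal{D} the data distribution.
G(f) \;=\; \mathbb{E}_{(x,y)\sim\mathcal{D}}\big[\ell(f(x), y)\big]
      \;-\; \frac{1}{n}\sum_{i=1}^{n} \ell\big(f(x_i), y_i\big)
```

In practice the expectation is estimated on a held-out test set, so a larger gap indicates that the model fits its training samples more closely than it fits unseen data.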

If this is right

  • Last layers should receive priority when allocating protection resources such as secure hardware
  • Convolutional layers generally require more attention for exposure control than fully connected layers
  • Exposure levels per layer remain consistent for a given architecture regardless of the training dataset
  • Shielding only the highest-exposure layers inside a TEE can limit membership inference without large performance costs
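
The last point can be illustrated with a small selection routine. This is a minimal sketch, not the paper's architecture: the function name, the exposure scores, the per-layer memory sizes, and the budget are all hypothetical, and the rule (take layers in decreasing order of exposure and stop at the first one that does not fit) is an assumed simplification.

```python
def select_layers_for_tee(exposure, memory_kib, budget_kib):
    """Pick the most exposed layers, in decreasing order of exposure,
    until the next layer would exceed the TEE memory budget.

    All inputs are hypothetical illustrations, not values from the paper.
    """
    chosen = []
    used = 0
    for layer in sorted(exposure, key=exposure.get, reverse=True):
        if used + memory_kib[layer] > budget_kib:
            break  # stop at the first layer that does not fit
        chosen.append(layer)
        used += memory_kib[layer]
    return sorted(chosen), used


# Hypothetical 7-layer model: exposure rises toward the last layers.
exposure = {1: 0.10, 2: 0.15, 3: 0.30, 4: 0.45, 5: 0.60, 6: 0.80, 7: 0.95}
memory_kib = {1: 40, 2: 300, 3: 600, 4: 1200, 5: 2400, 6: 512, 7: 20}

protected, used_kib = select_layers_for_tee(exposure, memory_kib, budget_kib=1024)
print(protected, used_kib)
```

With these made-up numbers the routine shields the last two layers and leaves the rest outside the TEE, which is the shape of trade-off the bullet describes.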

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Model designers could use the measure to decide which layers to prune or retrain for lower exposure before deployment
  • The approach might extend to auditing third-party models for privacy risks on user devices
  • Layer-wise exposure data could guide hybrid on-device and cloud inference designs that keep sensitive parts local

Load-bearing premise

Generalization error serves as a reliable proxy for the quantity of sensitive information memorized in individual DNN layers

What would settle it

An experiment that finds no correlation between a layer's generalization error and the success rate of membership inference attacks targeting that layer would falsify the measurement approach
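
Such a check could be run as a rank correlation between the two per-layer series. The sketch below computes Spearman's rank correlation using only the standard library; the per-layer numbers are hypothetical placeholders, not results from the paper, and the helper names are chosen for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def spearman(xs, ys):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical per-layer measurements (layers 1..7), NOT taken from the paper.
gen_error_by_layer = [0.02, 0.03, 0.05, 0.04, 0.08, 0.11, 0.15]
attack_accuracy_by_layer = [0.51, 0.52, 0.55, 0.54, 0.60, 0.66, 0.71]

rho = spearman(gen_error_by_layer, attack_accuracy_by_layer)
print(f"Spearman rho = {rho:.2f}")
```

A coefficient near zero on real measurements would be the falsifying outcome described above, while a value near 1 would support using generalization error as the proxy.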

Figures

Figures reproduced from arXiv: 1907.06034 by Ali Shahin Shamsabadi, Andrea Cavallaro, Fan Mo, Hamed Haddadi, Kleomenis Katevas.

Figure 1: The proposed framework for measuring the risk of exposing sensitive information in a deep neural network
Figure 2: Generalization errors of Ms and Mb trained on half of the training set, S, of (a) MNIST, (b) Fashion-MNIST and (c) CIFAR-10 for fine-tuning each target layer. Error bars represent 95% confidence intervals.
[Plot residue: x-axis "Layer" (1 to 7); y-axis "Risk of sensitive information exposure" (0.00 to 1.00); legend "Dataset": MNIST, Fashion-MNIST, CIFAR-10]
Figure 3: The risk of sensitive information exposure of VGG
Figure 5: Using a TEE to protect the most sensitive layers
Figure 6: Execution time, memory usage and power usage
Original abstract

Pre-trained Deep Neural Network (DNN) models are increasingly used in smartphones and other user devices to enable prediction services, leading to potential disclosures of (sensitive) information from training data captured inside these models. Based on the concept of generalization error, we propose a framework to measure the amount of sensitive information memorized in each layer of a DNN. Our results show that, when considered individually, the last layers encode a larger amount of information from the training data compared to the first layers. We find that, while the neuron of convolutional layers can expose more (sensitive) information than that of fully connected layers, the same DNN architecture trained with different datasets has similar exposure per layer. We evaluate an architecture to protect the most sensitive layers within the memory limits of Trusted Execution Environment (TEE) against potential white-box membership inference attacks without the significant computational overhead.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a framework based on the concept of generalization error to measure the amount of sensitive information memorized in each layer of a pre-trained DNN. The authors report that, when considered individually, the last layers encode a larger amount of information from the training data compared to the first layers; neurons in convolutional layers can expose more sensitive information than those in fully connected layers; and the same DNN architecture trained with different datasets exhibits similar exposure per layer. They evaluate an architecture that protects the most sensitive layers within TEE memory limits against white-box membership inference attacks without significant computational overhead.

Significance. If the per-layer measurement is shown to be reliable, the results could inform selective protection strategies for DNNs on edge devices, particularly by identifying layers for TEE isolation. The observation of consistent per-layer exposure across datasets for a fixed architecture offers a potentially reusable design insight. The TEE-based protection evaluation provides a concrete, practical demonstration.

major comments (2)
  1. [Abstract and §3] The framework measures per-layer sensitive information via generalization error. Generalization error is defined on the full model output; the manuscript does not detail the layer-wise construction (e.g., ablation, per-layer loss, or intermediate mapping) nor demonstrate that the resulting scores are monotonic or specific to sensitive information rather than general memorization.
  2. [Membership-inference evaluation section] The attack evaluation is performed only on the protected architecture. It is not used to calibrate or cross-validate the per-layer exposure scores that underpin the central claim that last layers encode larger amounts of information; therefore the headline layer-ordering result lacks direct empirical confirmation from attack success rates.
minor comments (1)
  1. [Abstract] The abstract states directional findings but supplies no methodological details, error bars, dataset descriptions, or validation steps, which impedes assessment of whether the evidence supports the stated claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and indicate where revisions will be made.

Point-by-point responses
  1. Referee: [Abstract and §3] The framework measures per-layer sensitive information via generalization error. Generalization error is defined on the full model output; the manuscript does not detail the layer-wise construction (e.g., ablation, per-layer loss, or intermediate mapping) nor demonstrate that the resulting scores are monotonic or specific to sensitive information rather than general memorization.

    Authors: We agree that the layer-wise construction needs explicit elaboration. In the revision we will expand §3 with the precise procedure: per-layer generalization error is obtained by freezing preceding layers, attaching a lightweight linear probe to the target layer's activations, and computing the probe's test error on held-out data. We will also add an ablation comparing scores on training versus non-training data with matched statistics to support specificity to sensitive information, and report empirical monotonicity trends across layers. revision: yes

  2. Referee: [Membership-inference evaluation section] The attack evaluation is performed only on the protected architecture. It is not used to calibrate or cross-validate the per-layer exposure scores that underpin the central claim that last layers encode larger amounts of information; therefore the headline layer-ordering result lacks direct empirical confirmation from attack success rates.

    Authors: The per-layer ordering is derived directly from the generalization-error framework, which is intended as an attack-independent characterization. The membership-inference experiments evaluate only the downstream TEE protection strategy once the high-exposure layers have been identified. While correlating attack success with the exposure scores could offer supplementary evidence, it is not required to substantiate the framework's ordering result. We therefore do not plan to alter the evaluation structure. revision: no
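
The probing procedure outlined in the first response (frozen preceding layers, a lightweight linear probe on the target layer's activations, error measured on held-out data) can be sketched as follows. This is an illustration under loud assumptions: the activations are tiny hand-written vectors standing in for the output of frozen layers, and a perceptron is swapped in as the linear probe. None of the names or numbers come from the paper or the rebuttal.

```python
def predict(w, b, x):
    """Linear probe decision: class 1 if w.x + b > 0, else class 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else 0


def train_perceptron(features, labels, epochs=20, lr=1.0):
    """Fit a perceptron (an assumed stand-in for the 'linear probe')."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            err = y - predict(w, b, x)
            if err != 0:
                w = [wi + lr * err * xi for wi, xi in zip(w, x)]
                b += lr * err
    return w, b


def error_rate(w, b, features, labels):
    """Fraction of samples the probe misclassifies."""
    wrong = sum(1 for x, y in zip(features, labels) if predict(w, b, x) != y)
    return wrong / len(labels)


# Hypothetical activations of one target layer, as produced by frozen
# preceding layers. Only the probe is trained; the layers are never updated.
train_acts = [(2.0, 1.0), (1.5, 2.0), (-1.0, -1.5), (-2.0, -0.5)]
train_labels = [1, 1, 0, 0]
heldout_acts = [(1.0, 1.0), (-1.0, -1.0), (0.5, -1.0), (-0.5, 1.0)]
heldout_labels = [1, 0, 0, 1]

w, b = train_perceptron(train_acts, train_labels)
train_err = error_rate(w, b, train_acts, train_labels)
heldout_err = error_rate(w, b, heldout_acts, heldout_labels)
print(f"train error {train_err:.2f}, held-out error {heldout_err:.2f}")
```

Repeating this for each layer would give one held-out error per layer, which is the kind of score the response proposes to report; whether that score tracks sensitive information specifically is the open question the referee raises.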

Circularity Check

0 steps flagged

No circularity: framework applies external generalization error concept

The derivation applies the established external concept of generalization error to construct a per-layer measurement framework. No step reduces by definition to its own output, no fitted parameter is relabeled as a prediction, and no load-bearing premise rests on self-citation chains or imported uniqueness theorems. The claim that later layers encode more information follows from applying this independent proxy rather than from any self-referential construction or renaming of known results.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Only the abstract was available, so the ledger is limited to the core assumption stated in the framework description; no free parameters, invented entities, or additional axioms could be identified.

axioms (1)
  • Domain assumption: generalization error can be used to measure the amount of sensitive information memorized in each DNN layer.
    Explicitly invoked as the basis for the proposed framework in the abstract.

pith-pipeline@v0.9.0 · 5684 in / 1069 out tokens · 16918 ms · 2026-05-24T22:13:02.974564+00:00 · methodology
