Recognition: 2 Lean theorem links
From Measurement to Mitigation: Quantifying and Reducing Identity Leakage in Image Representation Encoders with Linear Subspace Removal
Pith reviewed 2026-05-10 18:45 UTC · model grok-4.3
The pith
A linear projection removes identity information from visual embeddings like CLIP while preserving retrieval utility.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Frozen visual embeddings exhibit measurable identity leakage under calibrated benchmarks, yet a one-shot linear projector denoted ISP that removes an estimated identity subspace drives linear access to near-chance while retaining high non-biometric utility for visual search and retrieval, with the mitigation transferring across datasets with minor degradation.
What carries the argument
The identity sanitization projection (ISP), a one-shot linear projector that removes an estimated identity subspace from the embedding space while preserving the complementary space needed for utility tasks.
Load-bearing premise
The identity information present in the embedding space can be accurately captured and removed by a single estimated linear subspace without destroying task-relevant non-identity features.
What would settle it
An experiment showing that a non-linear attack recovers identity from ISP-processed embeddings at well above chance rates, or that utility on visual retrieval tasks falls substantially below the unprojected baseline.
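The mechanism under review can be sketched in a few lines. This is a minimal, hypothetical implementation (variable names and the mean-difference estimator are assumptions, not the paper's exact recipe): estimate identity directions from centered per-identity means, take a thin SVD, and project the top-r directions out.

```python
import numpy as np

def fit_isp(Z, ids, r):
    """Fit a sanitizing projector P = I - U_r U_r^T from labeled embeddings.

    Z: (n, d) embeddings; ids: (n,) identity labels; r: subspace rank.
    Identity directions are taken as centered per-identity means
    (Delta mu_i = mu_i - mu_C), stacked and factored with a thin SVD.
    """
    mu_c = Z.mean(axis=0)                               # global centroid mu_C
    M = np.stack([Z[ids == i].mean(axis=0) - mu_c       # centered means
                  for i in np.unique(ids)])
    U, _, _ = np.linalg.svd(M.T, full_matrices=False)   # thin SVD
    U_r = U[:, :r]                                      # top-r identity directions
    return np.eye(Z.shape[1]) - U_r @ U_r.T             # orthogonal-complement projector

def sanitize(P, z):
    """z_tilde = P z / ||P z||_2, re-normalized for cosine retrieval."""
    pz = P @ z
    return pz / np.linalg.norm(pz)
```

Because P is a fixed d×d matrix, sanitization is a single matrix multiply at inference time, which is what makes the one-shot claim plausible.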
Original abstract
Frozen visual embeddings (e.g., CLIP, DINOv2/v3, SSCD) power retrieval and integrity systems, yet their use on face-containing data is constrained by unmeasured identity leakage and a lack of deployable mitigations. We take an attacker-aware view and contribute: (i) a benchmark of visual embeddings that reports open-set verification at low false-accept rates, a calibrated diffusion-based template inversion check, and face-context attribution with equal-area perturbations; and (ii) propose a one-shot linear projector that removes an estimated identity subspace while preserving the complementary space needed for utility, which for brevity we denote as the identity sanitization projection ISP. Across CelebA-20 and VGGFace2, we show that these encoders are robust under open-set linear probes, with CLIP exhibiting relatively higher leakage than DINOv2/v3 and SSCD, robust to template inversion, and are context-dominant. In addition, we show that ISP drives linear access to near-chance while retaining high non-biometric utility, and transfers across datasets with minor degradation. Our results establish the first attacker-calibrated facial privacy audit of non-FR encoders and demonstrate that linear subspace removal achieves strong privacy guarantees while preserving utility for visual search and retrieval.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims to quantify identity leakage in non-face-recognition visual encoders (CLIP, DINOv2/v3, SSCD) via an attacker-calibrated benchmark on CelebA-20 and VGGFace2, including open-set verification at low false-accept rates, diffusion-based template inversion, and equal-area face-context attribution. It proposes the Identity Sanitization Projection (ISP), a data-driven linear projector that removes an estimated identity subspace while retaining the orthogonal complement for utility tasks, showing leakage reduced to near-chance levels under linear probes, preserved non-biometric utility, and minor degradation under cross-dataset transfer.
Significance. If the empirical results hold, the work is significant for delivering the first attacker-aware privacy audit of general visual encoders on face data and a practical linear mitigation that trades off privacy against utility in retrieval systems. The cross-encoder and cross-dataset consistency, combined with the simplicity of subspace removal, provides a reproducible template for privacy-preserving embeddings; the inclusion of template inversion as a non-linear check is a positive step toward falsifiability.
major comments (3)
- [§3] §3 (ISP Construction): The subspace estimation step (presumably via differences or labeled pairs in the embedding space) is described operationally but lacks explicit equations for the projection matrix, rank selection, and any regularization; this is load-bearing because the central mitigation claim that ISP removes 'essentially all' identity signal while preserving utility rests on the quality and stability of this estimate.
- [§4.3] §4.3 and §5 (Attack Evaluation): Near-chance linear probe results and template-inversion robustness are reported, yet the assumption that identity variance is fully captured by a single linear subspace is not stress-tested against additional non-linear or attribute-entangled attacks (e.g., pose- or expression-conditioned probes); residual leakage under such attacks would directly undermine the 'strong privacy guarantees' conclusion.
- [§4.1] Table 1 / §4.1 (Verification Metrics): Exact false-accept rate thresholds, number of trials, and full statistical controls (confidence intervals, multiple random seeds) for the open-set verification and cross-dataset transfer results are only summarized; without these, the comparative leakage rankings (CLIP vs. DINOv2) and the 'minor degradation' claim cannot be fully assessed.
minor comments (2)
- [Figures] Figure captions and axis labels for utility plots should explicitly name the downstream tasks (e.g., visual search, retrieval) and include error bars for cross-dataset results to aid interpretation.
- [Abstract] The abstract's phrasing 'one-shot linear projector' could be qualified to note that the subspace is estimated from the target dataset, avoiding any implication of zero data dependence.
Simulated Author's Rebuttal
We thank the referee for their constructive and insightful comments on our manuscript. We have carefully considered each point and provide detailed responses below, along with indications of revisions to be made in the updated version.
Point-by-point responses
-
Referee: [§3] §3 (ISP Construction): The subspace estimation step (presumably via differences or labeled pairs in the embedding space) is described operationally but lacks explicit equations for the projection matrix, rank selection, and any regularization; this is load-bearing because the central mitigation claim that ISP removes 'essentially all' identity signal while preserving utility rests on the quality and stability of this estimate.
Authors: We agree that the description of the ISP construction in §3 would benefit from greater mathematical precision. In the revised manuscript, we will add explicit equations detailing the subspace estimation from labeled identity pairs (using mean differences or PCA on differences), the formulation of the projection matrix as the orthogonal complement to the estimated identity subspace, the criterion for rank selection (e.g., cumulative explained variance threshold), and any regularization (such as ridge if applied). This will ensure the method is fully specified and reproducible, directly addressing the concern regarding the stability and quality of the identity subspace estimate. revision: yes
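The cumulative-explained-variance criterion mentioned in the response could look like the following sketch (the 0.95 threshold is illustrative, not a value from the paper):

```python
import numpy as np

def select_rank(singular_values, threshold=0.95):
    """Smallest rank r whose top-r directions carry at least `threshold`
    of the identity-subspace energy (cumulative squared singular values).
    The 0.95 default is an illustrative choice, not the paper's value."""
    energy = np.asarray(singular_values, dtype=float) ** 2
    frac = np.cumsum(energy) / energy.sum()
    return int(np.searchsorted(frac, threshold) + 1)
```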
-
Referee: [§4.3] §4.3 and §5 (Attack Evaluation): Near-chance linear probe results and template-inversion robustness are reported, yet the assumption that identity variance is fully captured by a single linear subspace is not stress-tested against additional non-linear or attribute-entangled attacks (e.g., pose- or expression-conditioned probes); residual leakage under such attacks would directly undermine the 'strong privacy guarantees' conclusion.
Authors: We appreciate this important point regarding the scope of our attack evaluation. Our current evaluation includes linear probes and a non-linear diffusion-based template inversion attack, which provides a check beyond linear methods. However, we acknowledge that the linear subspace assumption may not capture all possible identity leakage under attribute-entangled non-linear attacks. In the revision, we will expand §5 to include a limitations discussion on this assumption and add results from additional probes, such as those conditioned on pose or expression where data allows, or clarify the 'strong privacy guarantees' to specify they hold under the evaluated linear and inversion attacks. We believe this will strengthen the manuscript without altering the core contributions. revision: partial
-
Referee: [§4.1] Table 1 / §4.1 (Verification Metrics): Exact false-accept rate thresholds, number of trials, and full statistical controls (confidence intervals, multiple random seeds) for the open-set verification and cross-dataset transfer results are only summarized; without these, the comparative leakage rankings (CLIP vs. DINOv2) and the 'minor degradation' claim cannot be fully assessed.
Authors: We thank the referee for highlighting the need for more detailed reporting of experimental controls. In the revised manuscript, we will update Table 1 and the corresponding sections to explicitly state the false-accept rate thresholds (e.g., 10^{-4}, 10^{-3}), the number of verification trials, and include 95% confidence intervals along with the number of random seeds used for all experiments, including cross-dataset transfer. This will enable a complete assessment of the results and the statistical robustness of the leakage rankings and utility preservation claims. revision: yes
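A sketch of the seed-level confidence intervals promised above (normal approximation over per-seed TAR values; the paper may use a t-interval instead):

```python
import math
import statistics

def mean_ci95(per_seed_values):
    """Mean and 95% half-width across seeds (normal approximation).

    One value per seed, so the variance reflects seed-to-seed resampling
    of the per-identity splits rather than correlated individual pairs."""
    m = statistics.mean(per_seed_values)
    if len(per_seed_values) < 2:
        return m, 0.0
    half = 1.96 * statistics.stdev(per_seed_values) / math.sqrt(len(per_seed_values))
    return m, half
```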
Circularity Check
No significant circularity; the results are empirical evaluations of a data-driven projector.
Full rationale
The paper's contributions consist of empirical benchmarks measuring identity leakage via open-set verification, template inversion, and attribution methods, followed by an operationally defined linear projector (ISP) estimated from data on one dataset and validated for privacy-utility tradeoffs on held-out tasks and cross-dataset transfers. No derivation or claim reduces to its inputs by construction, as effectiveness is demonstrated through separate evaluations rather than tautological fitting; the central results rest on measurements and transfers without load-bearing self-citations or self-definitional steps.
Axiom & Free-Parameter Ledger
free parameters (1)
- identity subspace estimation parameters
axioms (1)
- domain assumption: Identity information resides in a linear subspace of the embedding that can be estimated and subtracted without destroying complementary utility features.
invented entities (1)
-
Identity Sanitization Projection (ISP): no independent evidence
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · tag: unclear
The relation between the paper passage and the cited Recognition theorem is unclear.
Paper passage: "We propose Identity Sanitization Projection (ISP), a one-shot linear projector that removes an estimated identity subspace... form centered means Δμ_i = μ_i − μ_C, ... thin SVD M = U Σ V^⊤ and retain the top-r left singular vectors U_r ... P = I − U_r U_r^⊤, z̃ = P z / ‖P z‖₂"
-
IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear
The relation between the paper passage and the cited Recognition theorem is unclear.
Paper passage: "Under a standard homoscedastic model... the Bayes-optimal linear discriminants live in the Fisher/Mahalanobis geometry induced by Σ_w^{-1}"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] H. O. Shahreza and S. Marcel. Template Inversion Attack against Face Recognition Systems using 3D Face Reconstruction. ICCV, 2023.
- [2] A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, et al. Learning transferable visual models from natural language supervision. In Proc. ICML, 2021.
- [3] M. Oquab, T. Darcet, T. Moutakanni, H. V. Vo, M. Szafraniec, V. Khalidov, P. Fernandez, D. Haziza, F. Massa, A. El-Nouby, et al. DINOv2: Learning robust visual features without supervision. arXiv:2304.07193, 2023.
- [4] O. Siméoni, H. V. Vo, M. Seitzer, F. Baldassarre, M. Oquab, C. Jose, V. Khalidov, M. Szafraniec, S. Yi, M. Ramamonjisoa, et al. DINOv3. arXiv:2508.10104, 2025.
- [5] S. Caldarella, M. Mancini, E. Ricci, and R. Aljundi. The phantom menace: Unmasking privacy leakages in vision-language models. arXiv:2408.01228, 2024.
- [6] M. Fredrikson, S. Jha, and T. Ristenpart. Model inversion attacks that exploit confidence information and basic countermeasures. In Proc. ACM CCS, 2015.
- [7] S. Ravfogel, Y. Elazar, H. Gonen, M. Twiton, and Y. Goldberg. Null it out: Guarding protected attributes by iterative nullspace projection. In Proc. ACL, 2020.
- [8] C. Jia, Y. Yang, Y. Xia, Y.-T. Chen, Z. Parekh, H. Pham, Q. Le, Y.-H. Sung, Z. Li, and T. Duerig. Scaling up visual and vision-language representation learning with noisy text supervision. In Proc. ICML, 2021.
- [9] A. Mahendran and A. Vedaldi. Understanding deep image representations by inverting them. In Proc. CVPR, 2015.
- [10] P. Grother, M. Ngan, and K. Hanaoka. Ongoing face recognition vendor test (FRVT): Part 2. NIST Interagency/Internal Report (NISTIR), 2019.
- [11] ISO/IEC 19795-1:2021. Information technology, Biometric performance testing and reporting, Part 1: Principles and framework. International Organization for Standardization, 2021.
- [12] M. D. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. In Proc. ECCV, 2014.
- [13] R. Fong and A. Vedaldi. Interpretable explanations of black boxes by meaningful perturbation. In Proc. ICCV, 2017.
- [14] V. Petsiuk, A. Das, and K. Saenko. RISE: Randomized input sampling for explanation of black-box models. In Proc. BMVC, 2018.
- [15] H. Chefer, S. Gur, and L. Wolf. Transformer interpretability beyond attention rollout. In Proc. CVPR, 2021.
- [16] R. Geirhos, P. Rubisch, C. Michaelis, M. Bethge, F. A. Wichmann, and W. Brendel. ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. In Proc. ICLR, 2019.
- [17]
- [18] E. Pizzi, S. Dutta Roy, S. N. Ravindra, P. Goyal, and M. Douze. A self-supervised descriptor for image copy detection. In Proc. CVPR, 2022.
- [19] D. Mery. True black-box explanation in facial analysis (MinPlus). In CVPR Workshops, 2022.
- [20] S. Adhikari, R. Kumar, K. Mopuri, and R. Pachamuthu. Lost in Context: The Influence of Context on Feature Attribution Methods for Object Recognition. In ICVGIP, 2024.
- [21]
- [22]
- [23] S. Ravfogel, M. Twiton, Y. Goldberg, and R. Cotterell. Linear Adversarial Concept Erasure. In Proc. ICML, 2022.
- [24]
- [25] N. Belrose, D. Schneider-Joseph, S. Ravfogel, R. Cotterell, E. Raff, and S. Biderman. LEACE: Perfect Linear Concept Erasure with Least Distortion. In Proc. NeurIPS, 2023.
- [26]
- [27]
- [28] L. Struppek, D. Hintersdorf, A. D. A. Correia, A. Adler, and K. Kersting. Plug & Play Attacks: Towards Robust and Flexible Model Inversion Attacks. arXiv:2201.12179, 2022.
- [29] F. P. Papantoniou, A. Lattas, S. Moschoglou, J. Deng, B. Kainz, and S. Zafeiriou. Arc2Face: A Foundation Model for ID-Consistent Human Faces. ECCV, 2024.
- [30] H. Wang, S. Wang, C.-S. Lu, and I. Echizen. DiffMI: Breaking Face Recognition Privacy via Diffusion-Driven Training-Free Model Inversion. arXiv:2504.18015, 2025.
- [31] M. Kansy, A. Rael, G. Mignone, J. Naruniec, C. Schroers, M. Gross, and R. M. Weber. Controllable Inversion of Black-Box Face Recognition Models via Diffusion. In Proc. ICCV Workshops (AMFG), 2023.
- [32] D. Hintersdorf, L. Struppek, M. Brack, F. Friedrich, P. Schramowski, and K. Kersting. Does CLIP Know my Face? Journal of Artificial Intelligence Research, 80:1033–1062, 2024.
- [33] T. Fel, B. Wang, M. A. Lepori, M. Kowal, A. Lee, R. Balestriero, S. Joseph, E. Lubana, T. Konkle, D. Ba, and M. Wattenberg. Into the Rabbit Hull: From Task-Relevant Concepts in DINO to Minkowski Geometry. arXiv:2510.08638, 2025.
- [34]
- [35] Q. Cao, L. Shen, W. Xie, O. M. Parkhi, and A. Zisserman. VGGFace2: A dataset for recognising faces across pose and age. In Proc. FG, 2018.
- [36] Z. Liu, P. Luo, X. Wang, and X. Tang. Deep Learning Face Attributes in the Wild. In Proc. ICCV, 2015.
- [37] S. Kim, Y. K. Tan, B. Jeong, S. Mondal, K. M. M. Aung, and J. H. Seo. Scores Tell Everything about Bob: Non-adaptive Face Reconstruction on Face Recognition Systems. In Proc. IEEE S&P, 2024.
- [38] Y. G. Jung, J. Park, X. Dong, H. Park, A. B. J. Teoh, and O. Camps. Face Reconstruction Transfer Attack as Out-of-Distribution Generalization. In Proc. ECCV, 2024.
- [39] H. Wu, J. Singh, S. Tian, L. Zheng, and K. W. Bowyer. Vec2Face: Scaling Face Dataset Generation with Loosely Constrained Vectors. In Proc. ICLR, 2025.

Appendix excerpts
A. Few-Shot Verification: Full Protocol. This section provides complete implementation-ready details for the open-set verification experiments in Sec. 4.1, including dataset curation, pair construction...
Threshold calibration
- Compute all mated and impostor similarity scores on A_val.
- When the impostor count permits (|P_val^impostor| ≥ 10,000), select τ such that FAR(τ) ≈ 10^-4.
- When the impostor count is insufficient for stable 10^-4 estimation (e.g., LFW-20), fall back to a fixed-head partial AUC criterion at FAR ∈ [0, 10^-3] and still return a single τ.
- A separate τ is calibrated for each (dataset, encoder, projection method, k) combination.
- Once set on validation identities, τ is held fixed for all test resamplings: the five seeds that reshuffle S_i/Q_i per identity do not each get their own threshold. This ensures that no information from test identities influences the operating point. We report TAR on test along with the achieved FAR (as both a rate and raw count of false accepts over total i...
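The τ-selection step above can be sketched as an order-statistic pick on validation impostor scores (illustrative; the paper's tie-breaking at FAR ≈ 10^-4 is not specified):

```python
import numpy as np

def calibrate_tau(impostor_scores, target_far=1e-4):
    """Pick tau on validation impostor scores so that the empirical FAR
    (fraction of impostor scores strictly above tau) is at most target_far."""
    s = np.sort(np.asarray(impostor_scores, dtype=float))
    n = len(s)
    k = int(np.floor(target_far * n))   # impostor accepts allowed above tau
    tau = s[n - k - 1]                  # the (k+1)-th largest score
    far = float(np.mean(s > tau))
    return tau, far
```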
Few-shot verification protocol
- For each identity i, randomly split B_i into query set Q_i and support set S_i with |S_i| = k.
- Train the probe (Ridge or MLP) on support embeddings from training identities only.
- Calibrate τ on validation identities (as above).
- Evaluate on test identities using the fixed τ.
- Repeat with 5 random seeds for the S_i/Q_i assignment; report mean TAR and 95% identity-aware confidence intervals. "Identity-aware" refers to the fact that variance is computed across seeds (which reshuffle per-identity splits) rather than across individual pairs, avoiding pseudo-replication from correlated pairs within the same identity. A.7. LFW-20 Sanity Che...
Cross-dataset transfer protocol
- Fit ISP on dataset A (e.g., CelebA-20, 320 train identities).
- Apply P_A to embeddings from dataset B (e.g., VGGFace2-20).
- Calibrate the threshold on the dataset B validation set.
- Evaluate on the dataset B test set.
We evaluate a full 2×2 transfer matrix:
- Within-dataset: P_A → A, P_B → B
- Cross-dataset: P_A → B, P_B → A
Principal Angle Analysis. To quantify subspace alignment between projections fitted on different datasets, we compute principal angles between the identity subspaces U_A and U_B:
cos(θ_i) = σ_i(U_A^⊤ U_B)    (14)
where σ_i are the singular values of U_A^⊤ U_B.
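Eq. (14) reduces to one SVD; a sketch assuming U_A and U_B have orthonormal columns (a hypothetical helper, not the paper's code):

```python
import numpy as np

def principal_angles(U_A, U_B):
    """Principal angles (radians) between the subspaces spanned by the
    orthonormal columns of U_A and U_B: cos(theta_i) = sigma_i(U_A^T U_B)."""
    sigma = np.linalg.svd(U_A.T @ U_B, compute_uv=False)
    return np.arccos(np.clip(sigma, -1.0, 1.0))
```

Identical subspaces give all-zero angles; orthogonal subspaces give π/2, so small angles indicate that the identity subspaces fitted on the two datasets align.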
Template inversion setup
- Build a local ArcFace embedding space model (iresnet50, 512-dim).
- Pre-generate 99 δ-orthogonal face images (orthogonal face set; OFS).
- For each target embedding z_tgt: compute 99 cosine similarity scores against the OFS images, solve z ≈ A†s via pseudo-inverse, where s is the score vector and A encodes the OFS geometry, then decode x̂ = NbNet(z) via the NbNet inverse model.
Key parameters:
- Local model: ArcFace iresnet50 (512-dim).
- Inverse model: NbNet (512-dim embedding → 128×128 face).
- OFS ...
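The score-to-embedding solve is ordinary least squares via the pseudo-inverse; a shape-level sketch (the OFS matrix A and the NbNet decoder come from the cited attack and are not reproduced here):

```python
import numpy as np

def recover_embedding(A, s):
    """Least-squares estimate of a target embedding from similarity scores.

    A: (99, d) matrix encoding the orthogonal face set (OFS) geometry;
    s: (99,) vector of similarity scores against the target.
    Solving z ~= A^+ s inverts the scoring map in the least-squares sense."""
    return np.linalg.pinv(A) @ s
```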
Face-context cropping
- Center crop: face center = ((x1+x2)/2, (y1+y2)/2). Crop window: [c_x − s/2, c_y − s/2, c_x + s/2, c_y + s/2].
- Handle boundaries: if the crop extends beyond the image, pad with mean color. Track the padding ratio: r_pad = (Σ_sides pad) / (4s).
- Quality control: reject if r_pad > 0.15 (excessive artificial background).
- Resize: crop region to 224×224 (or 288×288 for SSCD).
Results: From 24,000 VGGFace2 images, 23,728 retained (272 rejected for padding). Achieved FCR: mean = 0.330, std = 0.015.
Face Mask Generation. We fit an ellipse to the 5-point landmarks (left eye, right eye, nose, left mouth, right mouth):
- Center: midpoint of (eye center, mouth center).
- Width axis: ...
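The crop-window and padding-ratio arithmetic above, as a small sketch (hypothetical helpers; per-side padding is measured against the image bounds):

```python
def crop_window(x1, y1, x2, y2, s):
    """Square crop of side s centered on the face box (x1, y1, x2, y2):
    [c_x - s/2, c_y - s/2, c_x + s/2, c_y + s/2]."""
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    return (cx - s / 2.0, cy - s / 2.0, cx + s / 2.0, cy + s / 2.0)

def padding_ratio(window, width, height):
    """r_pad = (sum of out-of-image extents over the four sides) / (4s);
    crops with r_pad > 0.15 would be rejected as excessive padding."""
    left, top, right, bottom = window
    s = right - left
    pad = (max(0.0, -left) + max(0.0, -top)
           + max(0.0, right - width) + max(0.0, bottom - height))
    return pad / (4.0 * s)
```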
Why a fixed linear projector
- Auditable: rank r directly controls the privacy-utility trade-off and enables energy diagnostics.
- Lightweight: a single SVD with no adversary training or hyperparameter grids.
- Model-agnostic and exportable: a fixed P that plugs into any retrieval pipeline.
- Deployment-friendly: stable, deterministic, no retraining, sub-millisecond latency.
In our evaluations, this simple construction is sufficient to drive linear identity accessibility near chance while retaining most utility, and its fixed projector generalizes across datasets; these are properties that alternatives must match to be practical at scale. We emphas...
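The "plugs into any retrieval pipeline" property amounts to one matrix multiply plus re-normalization at query time; a sketch assuming an exported projector P:

```python
import numpy as np

def apply_projector(P, Z):
    """Sanitize a batch of row-vector embeddings with an exported projector P,
    then re-normalize each row for cosine-similarity retrieval.
    Assumes no row lies entirely inside the removed subspace."""
    Zp = Z @ P.T                                          # apply the fixed projector
    return Zp / np.linalg.norm(Zp, axis=1, keepdims=True) # unit rows for cosine search
```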
discussion (0)