pith. machine review for the scientific record.

arxiv: 2603.14222 · v2 · submitted 2026-03-15 · 💻 cs.CR · cs.AI

Recognition: no theorem link

Membership Inference for Contrastive Pre-training Models with Text-only PII Queries

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 12:05 UTC · model grok-4.3

classification 💻 cs.CR cs.AI
keywords membership inference · contrastive pre-training · CLIP · CLAP · privacy auditing · text-only queries · multimodal memorization · personally identifiable information

The pith

Text-only queries can detect if contrastive models like CLIP memorized private data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that membership inference attacks on multimodal contrastive pre-training models do not require access to images, audio, or other paired biometric inputs. Instead, text queries alone guide a latent inversion process inside the model to produce two signals: how closely the inversion aligns with the text, and how consistent those alignments are across random restarts. These signals are compared against a simple baseline of random gibberish text to flag likely training-set members. This matters because it removes the need to feed sensitive data into the model during an audit, lowering both computational cost and privacy risk while still achieving strong detection across CLIP and CLAP variants.

Core claim

Multimodal memorization within these foundational encoders can be accurately inferred using exclusively the text modality. The Unimodal Membership Inference Detector performs text-guided cross-modal latent inversion, extracts complementary similarity and variability statistics, constructs a lightweight non-member reference from synthetic gibberish, and decides membership via an ensemble of unsupervised anomaly detectors.

What carries the argument

Unimodal Membership Inference Detector (UMID), which uses text-guided cross-modal latent inversion to extract similarity (alignment to the queried text) and variability (consistency across randomized inversions) signals for comparison against a synthetic-gibberish reference.
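A minimal sketch of that inversion loop, assuming a differentiable PyTorch stand-in `image_encoder` for the target model's non-text tower (a hypothetical interface; the abstract does not specify the optimizer, the step budget, or whether inversion runs in input or latent space, and the restart/step counts below are illustrative):

```python
import torch
import torch.nn.functional as F

def umid_signals(text_emb, image_encoder, input_shape,
                 n_restarts=8, steps=200, lr=0.05, device="cpu"):
    """Text-guided cross-modal latent inversion (sketch).

    For each random restart, optimize a synthetic non-text input so that the
    target encoder's embedding aligns with the queried text embedding, then
    summarize the restarts by their mean alignment (similarity signal) and
    spread across restarts (variability signal).
    """
    text_emb = F.normalize(text_emb.detach().to(device), dim=-1)
    finals = []
    for _ in range(n_restarts):
        x = torch.randn(1, *input_shape, device=device, requires_grad=True)
        opt = torch.optim.Adam([x], lr=lr)
        for _ in range(steps):
            z = F.normalize(image_encoder(x), dim=-1)
            loss = -(z * text_emb).sum()  # maximize cosine similarity
            opt.zero_grad()
            loss.backward()
            opt.step()
        with torch.no_grad():
            z = F.normalize(image_encoder(x), dim=-1)
            finals.append(float((z * text_emb).sum()))
    finals = torch.tensor(finals)
    return finals.mean().item(), finals.var(unbiased=False).item()
```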

If this is right

  • Audits become feasible at sub-second cost per query without exposing biometric data.
  • Shadow-model training is avoided, removing the main computational barrier for large backbones.
  • The same framework applies to both vision-language and audio-language contrastive models.
  • Auditing complies with constraints that prohibit feeding private inputs to the target model.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • One modality can leak detectable traces of cross-modal training-set membership.
  • Third parties could run audits on deployed models without ever handling the original private data.
  • The inversion technique might generalize to other cross-modal architectures that share a joint latent space.

Load-bearing premise

Inverting text into the cross-modal latent space produces signals that reliably distinguish training members from non-members even without any paired biometric input.

What would settle it

If similarity and variability statistics from the inversion process are statistically identical for known training-set texts and known non-member texts, the detection decisions would collapse to chance level.
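A sketch of that decisive check, assuming the experimenter holds ground-truth member and non-member text sets (labels used only for evaluation, never by UMID itself): if the two distributions of inversion statistics are statistically identical, the two-sample test finds no gap and the AUC collapses toward 0.5.

```python
import numpy as np
from scipy.stats import ks_2samp
from sklearn.metrics import roc_auc_score

def separation_check(member_stats, nonmember_stats):
    """Kolmogorov-Smirnov two-sample test plus the AUC that thresholding the
    statistics would achieve; identical distributions give a large p-value
    and AUC near 0.5 (chance-level detection)."""
    ks = ks_2samp(member_stats, nonmember_stats)
    labels = np.r_[np.ones(len(member_stats)), np.zeros(len(nonmember_stats))]
    scores = np.r_[member_stats, nonmember_stats]
    return ks.pvalue, roc_auc_score(labels, scores)
```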

Figures

Figures reproduced from arXiv: 2603.14222 by Haoxuan Ma, Hongyi Zhang, Jian Zhao, Ruoxi Cheng, Tianle Zhang, Xuelong Li, Yiyan Huang, Yizhong Ding.

Figure 1. Overview of the UMID auditing framework and the resulting distributional gap. The UMID method enables text-only membership inference (a) and …

Figure 2. Pipeline of UMID. We employ an optimizer guided by the target model to align non-text embeddings with PII text embeddings, maximizing their cosine similarity. By analyzing similarity and variability features of these optimized samples relative to the synthetic gibberish baseline, an anomaly detection system identifies abnormal patterns to infer the membership of the input text.

Figure 3. Detection accuracy for the CLIP model (ResNet-50) under various parameters.

Figure 4. Detection accuracy for the CLAP model (LibriSpeech) under various parameters.

Figure 5. Empirical validation of geometric separation. (a) Convergence of similarity …
Original abstract

Contrastive pretraining models such as CLIP and CLAP serve as the ubiquitous perceptual backbones for modern multimodal large models, yet their reliance on web-scale data raises growing concerns about memorizing Personally Identifiable Information (PII). Auditing such models via membership inference is challenging in practice: shadow-model MIAs are computationally prohibitive for large multimodal backbones, and existing multimodal auditing methods typically require querying the target with paired biometric inputs, thereby directly exposing sensitive biometric information to the target model. To bypass this critical limitation, we demonstrate a highly desirable capability for privacy auditing: multimodal memorization within these foundational encoders can be accurately inferred using exclusively the text modality. We propose Unimodal Membership Inference Detector (UMID), a text-only auditing framework that performs text-guided cross-modal latent inversion and extracts two complementary signals, similarity (alignment to the queried text) and variability (consistency across randomized inversions). UMID compares these statistics to a lightweight non-member reference constructed from synthetic gibberish and makes decisions via an ensemble of unsupervised anomaly detectors. Comprehensive experiments across diverse CLIP and CLAP architectures demonstrate that UMID significantly improves the effectiveness and efficiency over prior MIAs, delivering strong detection performance with sub-second auditing cost using solely text queries, completely circumventing the need for biometric inputs and complying with strict privacy constraints.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes UMID, a text-only membership inference framework for contrastive pre-training models such as CLIP and CLAP. It performs text-guided cross-modal latent inversion on PII text queries to extract similarity (alignment) and variability (consistency across randomizations) statistics, then flags anomalies relative to a lightweight synthetic-gibberish non-member reference via an ensemble of unsupervised anomaly detectors. The central claim is that this yields strong detection of multimodal memorization using exclusively text queries, with sub-second cost and without exposing biometric inputs.

Significance. If the core claim holds under proper controls, the work offers a practical, privacy-compliant auditing tool for web-scale multimodal encoders that avoids the computational cost of shadow models and the exposure risks of paired biometric queries. This could meaningfully advance membership-inference methodology in the multimodal setting.

major comments (2)
  1. [Method (text-guided inversion and reference construction)] The decision procedure depends on the synthetic-gibberish reference producing a distribution that is reliably separable from both members and real non-member natural text. No evidence is provided that real unseen natural-language descriptions (e.g., public-figure captions never seen in pre-training) yield similarity/variability statistics closer to the gibberish reference than to training-set members; if they do not, the anomaly detector will systematically misclassify non-members.
  2. [Abstract and Experiments] Abstract and experimental claims of 'strong detection performance' and 'significant improvement over prior MIAs' are stated without any reported AUC, TPR@FPR, or baseline numbers, nor any ablation on the anomaly-ensemble hyperparameters. Because the central claim is empirical, the absence of these quantitative results in the provided abstract leaves the effectiveness unverified.
minor comments (2)
  1. [Method] Notation for the two extracted signals (similarity and variability) should be defined with explicit equations rather than descriptive phrases to allow precise reproduction.
  2. [Decision procedure] The paper should clarify whether the unsupervised anomaly detectors are applied per-query or across a batch, and report any sensitivity to the choice of detector family (see the sketch after this list).
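To make the reference construction and decision procedure concrete, here is a per-query sketch using scikit-learn detector families consistent with the anomaly-detection works the paper cites (isolation forest, one-class SVM, local outlier factor); the actual detector choices, features, and voting rule are assumptions here, not confirmed by the abstract:

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

def fit_reference(gibberish_feats):
    """Fit unsupervised detectors on the (similarity, variability) statistics
    extracted from synthetic gibberish queries, shape (n_queries, 2)."""
    return [
        IsolationForest(random_state=0).fit(gibberish_feats),
        OneClassSVM(nu=0.1).fit(gibberish_feats),
        LocalOutlierFactor(novelty=True).fit(gibberish_feats),
    ]

def flag_member(query_feat, detectors, min_votes=2):
    """Per-query decision: a text is flagged as a likely training member when
    a majority of detectors call its statistics anomalous relative to the
    gibberish baseline (scikit-learn's predict returns -1 for outliers)."""
    x = np.asarray(query_feat, dtype=float).reshape(1, -1)
    votes = sum(int(d.predict(x)[0] == -1) for d in detectors)
    return votes >= min_votes
```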

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our work. We address each major comment below and describe the planned revisions.

Point-by-point responses
  1. Referee: [Method (text-guided inversion and reference construction)] The decision procedure depends on the synthetic-gibberish reference producing a distribution that is reliably separable from both members and real non-member natural text. No evidence is provided that real unseen natural-language descriptions (e.g., public-figure captions never seen in pre-training) yield similarity/variability statistics closer to the gibberish reference than to training-set members; if they do not, the anomaly detector will systematically misclassify non-members.

    Authors: We agree this is an important validation point for the reference construction. Our current experiments demonstrate effective separation using the gibberish reference against the evaluated non-member sets, but we acknowledge the value of explicitly comparing real unseen natural-language texts (e.g., public captions). In the revision we will add a new analysis subsection with quantitative comparisons of similarity and variability statistics for such real non-member texts versus both members and the gibberish reference, confirming the anomaly detector's behavior. revision: yes

  2. Referee: [Abstract and Experiments] Abstract and experimental claims of 'strong detection performance' and 'significant improvement over prior MIAs' are stated without any reported AUC, TPR@FPR, or baseline numbers, nor any ablation on the anomaly-ensemble hyperparameters. Because the central claim is empirical, the absence of these quantitative results in the provided abstract leaves the effectiveness unverified.

    Authors: We agree that the abstract should contain the key quantitative metrics to support the empirical claims. The full manuscript already includes AUC, TPR@FPR, baseline comparisons, and hyperparameter ablations in the experiments section. We will revise the abstract to report these specific results and ensure the ablation study is more prominently referenced. revision: yes
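For context on what such reporting involves: once per-query scores and ground-truth membership labels exist, the requested metrics take a few lines to compute. A minimal sketch; the 1% FPR target echoes common MIA reporting practice (see reference [58] in the graph below) and is not a number from this paper.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

def mia_metrics(scores, labels, fpr_target=0.01):
    """AUC plus TPR at a fixed low FPR, the low-false-positive regime
    emphasized in the membership-inference literature."""
    fpr, tpr, _ = roc_curve(labels, scores)
    return auc(fpr, tpr), float(np.interp(fpr_target, fpr, tpr))
```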

Circularity Check

0 steps flagged

No circularity: unsupervised anomaly detection on extracted statistics is independent of labeled membership data

full rationale

The paper's core procedure (text-guided cross-modal inversion to obtain similarity/variability statistics, followed by comparison to a fixed synthetic-gibberish reference via unsupervised anomaly detectors) does not fit any parameters to member/non-member labels and then rename those fits as predictions. No self-citation chain is invoked to justify uniqueness or to smuggle in an ansatz. The method is fully specified by the described extraction and detection steps without reducing to its own inputs by construction. The distributional concern raised by the skeptic (gibberish vs. real non-member text) is a question of empirical validity, not circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the method implicitly assumes the existence of a stable cross-modal latent space that preserves membership information.

pith-pipeline@v0.9.0 · 5553 in / 1046 out tokens · 42942 ms · 2026-05-15T12:05:40.616680+00:00 · methodology


Reference graph

Works this paper leans on

69 extracted references · 69 canonical work pages · 2 internal anchors

  1. [1]

    Multimodal contrastive training for visual representation learning,

    X. Yuan, Z. Lin, J. Kuen, J. Zhang, Y. Wang, M. Maire, A. Kale, and B. Faieta, “Multimodal contrastive training for visual representation learning,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 6995–7004

  2. [2]

    Learning transferable visual models from natural language supervision,

    A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark et al., “Learning transferable visual models from natural language supervision,” in International Conference on Machine Learning. PMLR, 2021, pp. 8748–8763

  3. [3]

    Clap learning audio concepts from natural language supervision,

    B. Elizalde, S. Deshmukh, M. Al Ismail, and H. Wang, “Clap learning audio concepts from natural language supervision,” in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2023, pp. 1–5

  4. [4]

    Ecoalign: An economically rational framework for efficient lvlm alignment,

    R. Cheng, H. Ma, T. Ma, and H. Zhang, “Ecoalign: An economically rational framework for efficient lvlm alignment,” arXiv preprint arXiv:2511.11301, 2025

  5. [5]

    The pii problem: Privacy and a new concept of personally identifiable information,

    P. M. Schwartz and D. J. Solove, “The pii problem: Privacy and a new concept of personally identifiable information,” NYU Law Review, vol. 86, p. 1814, 2011

  6. [6]

    When better features mean greater risks: The performance-privacy trade-off in contrastive learning,

    R. Sun, H. Hu, W. Luo, Z. Zhang, Y. Zhang, H. Yuan, and L. Y. Zhang, “When better features mean greater risks: The performance-privacy trade-off in contrastive learning,” in Proceedings of the 20th ACM Asia Conference on Computer and Communications Security, 2025, pp. 488–500

  7. [7]

    Defending pre-trained language models as few-shot learners against backdoor attacks,

    Z. Xi, T. Du, C. Li, R. Pang, S. Ji, J. Chen, F. Ma, and T. Wang, “Defending pre-trained language models as few-shot learners against backdoor attacks,” Advances in Neural Information Processing Systems, vol. 36, 2024

  8. [8]

    Defenses to membership inference attacks: A survey,

    L. Hu, A. Yan, H. Yan, J. Li, T. Huang, Y. Zhang, C. Dong, and C. Yang, “Defenses to membership inference attacks: A survey,” ACM Computing Surveys, vol. 56, no. 4, pp. 1–34, 2023

  9. [9]

    Selfprompt: Autonomously evaluating llm robustness via domain-constrained knowledge guidelines and refined adversarial prompts,

    A. Pei, Z. Yang, S. Zhu, R. Cheng, and J. Jia, “Selfprompt: Autonomously evaluating llm robustness via domain-constrained knowledge guidelines and refined adversarial prompts,” in Proceedings of the 31st International Conference on Computational Linguistics, 2025, pp. 6840–6854

  10. [10]

    Strata-sword: A hierarchical safety evaluation towards llms based on reasoning complexity of jailbreak instructions,

    S. Zhao, R. Duan, J. Liu, X. Jia, F. Wang, C. Wei, R. Cheng, Y. Xie, C. Liu, Q. Guo et al., “Strata-sword: A hierarchical safety evaluation towards llms based on reasoning complexity of jailbreak instructions,” arXiv preprint arXiv:2509.01444, 2025

  11. [11]

    Privacy-enhanced federated learning against attribute inference attack for speech emotion recognition,

    H. Zhao, H. Chen, Y. Xiao, and Z. Zhang, “Privacy-enhanced federated learning against attribute inference attack for speech emotion recognition,” in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023, pp. 1–5

  12. [12]

    Tuni: A textual unimodal detector for identity inference in clip models,

    S. Li, R. Cheng, and X. Jia, “Tuni: A textual unimodal detector for identity inference in clip models,” in Proceedings of the Sixth Workshop on Privacy in Natural Language Processing, 2025, pp. 1–13

  13. [13]

    Membership inference attacks against machine learning models,

    R. Shokri, M. Stronati, C. Song, and V. Shmatikov, “Membership inference attacks against machine learning models,” in 2017 IEEE Symposium on Security and Privacy (SP). IEEE, 2017, pp. 3–18

  14. [14]

    Students parrot their teachers: Membership inference on model distillation,

    M. Jagielski, M. Nasr, K. Lee, C. A. Choquette-Choo, N. Carlini, and F. Tramer, “Students parrot their teachers: Membership inference on model distillation,” Advances in Neural Information Processing Systems, vol. 36, 2024

  15. [15]

    Practical membership inference attacks against large-scale multi-modal models: A pilot study,

    M. Ko, M. Jin, C. Wang et al., “Practical membership inference attacks against large-scale multi-modal models: A pilot study,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 4871–4881

  16. [16]

    Membership inference attack using self influence functions,

    G. Cohen and R. Giryes, “Membership inference attack using self influence functions,” in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024, pp. 4892–4901

  17. [17]

    Membership inference attacks with token-level deduplication on korean language models,

    M. G. Oh, L. H. Park, J. Kim, J. Park, and T. Kwon, “Membership inference attacks with token-level deduplication on korean language models,” IEEE Access, vol. 11, pp. 10,207–10,217, 2023

  18. [18]

    M4i: Multi-modal models membership inference,

    P. Hu, Z. Wang, R. Sun, H. Wang, and M. Xue, “M4i: Multi-modal models membership inference,” Advances in Neural Information Processing Systems, vol. 35, pp. 1867–1882, 2022

  19. [19]

    Multimodal unlearnable examples: Protecting data against multimodal contrastive learning,

    X. Liu, X. Jia, Y. Xun, S. Liang, and X. Cao, “Multimodal unlearnable examples: Protecting data against multimodal contrastive learning,” in Proceedings of the 32nd ACM International Conference on Multimedia, 2024, pp. 8024–8033

  20. [20]

    A closer look at the explainability of contrastive language-image pre-training,

    Y. Li, H. Wang, Y. Duan, J. Zhang, and X. Li, “A closer look at the explainability of contrastive language-image pre-training,” Pattern Recognition, vol. 162, p. 111409, 2025

  21. [21]

    Supervised contrastive learning,

    P. Khosla, P. Teterwak, C. Wang, A. Sarna, Y. Tian, P. Isola, A. Maschinot, C. Liu, and D. Krishnan, “Supervised contrastive learning,” Advances in Neural Information Processing Systems, vol. 33, pp. 18,661–18,673, 2020

  22. [22]

    Collap: Contrastive long-form language-audio pretraining with musical temporal structure augmentation,

    J. Wu, W. Li, Z. Novack, A. Namburi, C. Chen, and J. McAuley, “Collap: Contrastive long-form language-audio pretraining with musical temporal structure augmentation,” in ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2025, pp. 1–5

  23. [23]

    Construction safety inspection with contrastive language-image pre-training (clip) image captioning and attention,

    W.-L. Tsai, P.-L. Le, W.-F. Ho, N.-W. Chi, J. J. Lin, S. Tang, and S.-H. Hsieh, “Construction safety inspection with contrastive language-image pre-training (clip) image captioning and attention,” Automation in Construction, vol. 169, p. 105863, 2025

  24. [24]

    Supervised contrastive pre-training models for mammography screening,

    Z. Cao, Z. Deng, Z. Yang, J. Ma, and L. Ma, “Supervised contrastive pre-training models for mammography screening,” Journal of Big Data, vol. 12, no. 1, p. 24, 2025

  25. [25]

    Contrastive pretraining improves deep learning classification of endocardial electrograms in a preclinical model,

    B. Hunt, E. Kwan, J. Bergquist, J. Brundage, B. Orkild, J. Dong, E. Paccione, K. Yazaki, R. S. MacLeod, D. J. Dosdall et al., “Contrastive pretraining improves deep learning classification of endocardial electrograms in a preclinical model,” Heart Rhythm O2, vol. 6, no. 4, pp. 473–480, 2025

  26. [26]

    Audiotime: A temporally-aligned audio-text benchmark dataset,

    Z. Xie, X. Xu, Z. Wu, and M. Wu, “Audiotime: A temporally-aligned audio-text benchmark dataset,” in ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2025, pp. 1–5

  27. [27]

    Mixed differential privacy in computer vision,

    A. Golatkar, A. Achille, Y.-X. Wang, A. Roth, M. Kearns, and S. Soatto, “Mixed differential privacy in computer vision,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 8376–8386

  28. [28]

    Advancing object detection in transportation with multimodal large language models (mllms): A comprehensive review and empirical testing,

    H. I. Ashqar, A. Jaber, T. I. Alhadidi, and M. Elhenawy, “Advancing object detection in transportation with multimodal large language models (mllms): A comprehensive review and empirical testing,” Computation, vol. 13, no. 6, p. 133, 2025

  29. [29]

    Pbi-attack: Prior-guided bimodal interactive black-box jailbreak attack for toxicity maximization,

    R. Cheng, Y. Ding, S. Cao, R. Duan, X. Jia, S. Yuan, S. Qin, Z. Wang, and X. Jia, “Pbi-attack: Prior-guided bimodal interactive black-box jailbreak attack for toxicity maximization,” in Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025, pp. 609–628

  30. [30]

    Steering the Verifiability of Multimodal AI Hallucinations

    J. Pang, R. Cheng, Z. Ye, X. Ma, Z. Wu, X. Huang, and Y.-G. Jiang, “Steering the verifiability of multimodal ai hallucinations,” arXiv preprint arXiv:2604.06714, 2026

  31. [31]

    Pixclip: Achieving fine-grained visual language understanding via any-granularity pixel-text alignment learning,

    Y. Xiao, Y. Chen, H. Ma, J. Hong, C. Li, L. Wu, H. Guo, and J. Wang, “Pixclip: Achieving fine-grained visual language understanding via any-granularity pixel-text alignment learning,” arXiv preprint arXiv:2511.04601, 2025

  32. [32]

    Protecting privacy in multimodal large language models with mllmu-bench,

    Z. Liu, G. Dou, M. Jia, Z. Tan, Q. Zeng, Y. Yuan, and M. Jiang, “Protecting privacy in multimodal large language models with mllmu-bench,” in Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2025, pp. 4105–4135

  33. [33]

    Privacy-preserving personalized federated prompt learning for multimodal large language models,

    L. Tran, W. Sun, S. Patterson, and A. Milanova, “Privacy-preserving personalized federated prompt learning for multimodal large language models,” in The Thirteenth International Conference on Learning Representations, 2025. [Online]. Available: https://openreview.net/forum?id=Equ277PBN0

  34. [34]

    Propile: Probing privacy leakage in large language models,

    S. Kim, S. Yun, H. Lee et al., “Propile: Probing privacy leakage in large language models,” in Advances in Neural Information Processing Systems, vol. 36, 2024

  35. [35]

    I never willingly consented to this! investigate pii leakage via sso logins,

    T.-H. Pham, Q.-H. Vo, H. Dao, and K. Fukuda, “I never willingly consented to this! investigate pii leakage via sso logins,” IEEE Transactions on Privacy, 2025

  36. [36]

    Membership inference attacks as privacy tools: Reliability, disparity and ensemble,

    Z. Wang, C. Zhang, Y. Chen, N. Baracaldo, S. R. Kadhe, and L. Yu, “Membership inference attacks as privacy tools: Reliability, disparity and ensemble,” in Proceedings of the 2025 ACM SIGSAC Conference on Computer and Communications Security, 2025, pp. 1724–1738

  37. [37]

    Reinforcement learning from multi-role debates as feedback for bias mitigation in llms,

    R. Cheng, H. Ma, S. Cao, J. Li, A. Pei, Z. Wang, P. Ji, H. Wang, and J. Huo, “Reinforcement learning from multi-role debates as feedback for bias mitigation in llms,” arXiv preprint arXiv:2404.10160, 2024

  38. [38]

    Oyster-i: Beyond refusal–constructive safety alignment for responsible language models,

    R. Duan, J. Liu, X. Jia, S. Zhao, R. Cheng, F. Wang, C. Wei, Y. Xie, C. Liu, D. Li et al., “Oyster-i: Beyond refusal–constructive safety alignment for responsible language models,” arXiv preprint arXiv:2509.01909, 2025

  39. [39]

    Agr: Age group fairness reward for bias mitigation in llms,

    S. Cao, R. Cheng, and Z. Wang, “Agr: Age group fairness reward for bias mitigation in llms,” in ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2025, pp. 1–5

  40. [40]

    Inverse reinforcement learning with dynamic reward scaling for llm alignment,

    R. Cheng, H. Ma, W. Wang, R. Duan, J. Liu, X. Jia, S. Qin, X. Cao, Y. Liu, and X. Jia, “Inverse reinforcement learning with dynamic reward scaling for llm alignment,” arXiv preprint arXiv:2503.18991, 2025

  41. [41]

    Use the spear as a shield: An adversarial example based privacy-preserving technique against membership inference attacks,

    M. Xue, C. Yuan, C. He, Y. Wu, Z. Wu, Y. Zhang, Z. Liu, and W. Liu, “Use the spear as a shield: An adversarial example based privacy-preserving technique against membership inference attacks,” IEEE Transactions on Emerging Topics in Computing, vol. 11, no. 1, pp. 153–169, 2023

  42. [42]

    Does clip know my face?

    D. Hintersdorf, L. Struppek, M. Brack, F. Friedrich, P. Schramowski, and K. Kersting, “Does clip know my face?” Journal of Artificial Intelligence Research, vol. 80, pp. 1033–1062, 2024

  43. [43]

    Variance-based membership inference attacks against large-scale image captioning models,

    D. Samira, E. Habler, Y. Elovici, and A. Shabtai, “Variance-based membership inference attacks against large-scale image captioning models,” in Proceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 9210–9219

  44. [44]

    Range membership inference attacks,

    J. Tao and R. Shokri, “Range membership inference attacks,” in 2025 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML). IEEE, 2025, pp. 346–361

  45. [45]

    Deepface: Closing the gap to human-level performance in face verification,

    Y. Taigman, M. Yang, M. Ranzato, and L. Wolf, “Deepface: Closing the gap to human-level performance in face verification,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 1701–1708

  46. [46]

    The megaface benchmark: 1 million faces for recognition at scale,

    I. Kemelmacher-Shlizerman, S. M. Seitz, D. Miller, and E. Brossard, “The megaface benchmark: 1 million faces for recognition at scale,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 4873–4882

  47. [47]

    Laion-5b: An open large-scale dataset for training next generation image-text models,

    C. Schuhmann, R. Beaumont, R. Vencu, C. Gordon, R. Wightman, M. Cherti, T. Coombes, A. Katta, C. Mullis, M. Wortsman et al., “Laion-5b: An open large-scale dataset for training next generation image-text models,” Advances in Neural Information Processing Systems, vol. 35, pp. 25,278–25,294, 2022

  48. [48]

    Conceptual 12m: Pushing web-scale image-text pre-training to recognize long-tail visual concepts,

    S. Changpinyo, P. Sharma, N. Ding, and R. Soricut, “Conceptual 12m: Pushing web-scale image-text pre-training to recognize long-tail visual concepts,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 3558–3568

  49. [49]

    Librispeech: An asr corpus based on public domain audio books,

    V. Panayotov, G. Chen, D. Povey, and S. Khudanpur, “Librispeech: An asr corpus based on public domain audio books,” in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015, pp. 5206–5210

  50. [51]

    Common Voice: A massively-multilingual speech corpus

    [Online]. Available: http://arxiv.org/abs/1912.06670

  51. [52]

    Deep residual learning for image recognition,

    K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778

  52. [53]

    Detecting affect states using vgg16, resnet50 and se-resnet50 networks,

    D. Theckedath and R. Sedamkar, “Detecting affect states using vgg16, resnet50 and se-resnet50 networks,” SN Computer Science, vol. 1, no. 2, p. 79, 2020

  53. [54]

    Hts-at: A hierarchical token-semantic audio transformer for sound classification and detection,

    K. Chen, X. Du, B. Zhu, Z. Ma, T. Berg-Kirkpatrick, and S. Dubnov, “Hts-at: A hierarchical token-semantic audio transformer for sound classification and detection,” in ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2022, pp. 646–650

  54. [55]

    Swin transformer: Hierarchical vision transformer using shifted windows,

    Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, “Swin transformer: Hierarchical vision transformer using shifted windows,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 10,012–10,022

  55. [56]

    RoBERTa: A Robustly Optimized BERT Pretraining Approach

    Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, “Roberta: A robustly optimized bert pretraining approach,” arXiv preprint arXiv:1907.11692, 2019

  56. [57]

    Lightface: A hybrid deep face recognition framework,

    S. I. Serengil and A. Ozpinar, “Lightface: A hybrid deep face recognition framework,” in 2020 Innovations in Intelligent Systems and Applications Conference (ASYU). IEEE, 2020, pp. 23–27. [Online]. Available: https://doi.org/10.1109/ASYU50717.2020.9259802

  57. [58]

    Membership inference attacks from first principles,

    N. Carlini, S. Chien, M. Nasr, S. Song, A. Terzis, and F. Tramer, “Membership inference attacks from first principles,” in 2022 IEEE Symposium on Security and Privacy (SP). IEEE, 2022, pp. 1897–1914

  58. [59]

    The audio auditor: User-level membership inference in internet of things voice services,

    Y. Miao, M. Xue, C. Chen, L. Pan, J. Zhang, B. Z. H. Zhao, D. Kaafar, and Y. Xiang, “The audio auditor: User-level membership inference in internet of things voice services,” Proceedings on Privacy Enhancing Technologies, vol. 1, pp. 209–228, 2021

  59. [60]

    Exploring features for membership inference in asr model auditing,

    F. Teixeira, K. Pizzi, R. Olivier, A. Abad, B. Raj, and I. Trancoso, “Exploring features for membership inference in asr model auditing,” Computer Speech & Language, p. 101812, 2025

  60. [61]

    Slmia-sr: Speaker-level membership inference attacks against speaker recognition systems,

    G. Chen, Y. Zhang, and F. Song, “Slmia-sr: Speaker-level membership inference attacks against speaker recognition systems,” in Proceedings of the 31st Annual Network and Distributed System Security (NDSS) Symposium, 2024

  61. [62]

    Outlier detection using isolation forest and local outlier factor,

    Z. Cheng, C. Zou, and J. Dong, “Outlier detection using isolation forest and local outlier factor,” in Proceedings of the Conference on Research in Adaptive and Convergent Systems, 2019, pp. 161–168

  62. [63]

    Isolation forest,

    F. T. Liu, K. M. Ting, and Z.-H. Zhou, “Isolation forest,” in 2008 Eighth IEEE International Conference on Data Mining. IEEE, 2008, pp. 413–422

  63. [64]

    Improving one-class svm for anomaly detection,

    K.-L. Li, H.-K. Huang, S.-F. Tian, and W. Xu, “Improving one-class svm for anomaly detection,” in Proceedings of the 2003 International Conference on Machine Learning and Cybernetics (IEEE Cat. No. 03EX693), vol. 5. IEEE, 2003, pp. 3077–3081

  64. [65]

    One-class classification: taxonomy of study and review of techniques,

    S. S. Khan and M. G. Madden, “One-class classification: taxonomy of study and review of techniques,” The Knowledge Engineering Review, vol. 29, no. 3, pp. 345–374, 2014

  65. [66]

    Autoencoder-based network anomaly detection,

    Z. Chen, C. K. Yeo, B. S. Lee, and C. T. Lau, “Autoencoder-based network anomaly detection,” in 2018 Wireless Telecommunications Symposium (WTS). IEEE, 2018, pp. 1–5

Internal anchors [67]–[70] (appendix fragments)

    Table V and Table VI present examples of randomly generated gibberish and covert gibberish that mimics authentic names, respectively.

    Geometric separation of member and non-member statistics:

    Member: $S_\infty(t_{\mathrm{in}}) \ge \gamma_{\mathrm{in}} - 2\delta^\star$ and $D^2_\infty(t_{\mathrm{in}}) \le 2\delta^\star + 3\rho_d\delta^\star \approx 0$.

    Non-member: $|S_\infty(t_{\mathrm{out}})| \le O(d^{-1/2}) \approx 0$ and $D^2_\infty(t_{\mathrm{out}}) \ge 1 - \tfrac{1}{M} - \rho_d \approx 1$.

    Proof (member). $S_\infty = \sum_y p_y v_{\mathrm{in}}^\top \mu_y = p_{y^\star} v_{\mathrm{in}}^\top \mu_{y^\star} + \sum_{y \ne y^\star} p_y v_{\mathrm{in}}^\top \mu_y$. Using $p_{y^\star} \ge 1 - \delta^\star$, $v_{\mathrm{in}}^\top \mu_{y^\star} \ge \gamma_{\mathrm{in}}$, and the trivial bound $|v^\top \mu| \le 1$, we get $S_\infty \ge (1 - \delta^\star)\gamma_{\mathrm{in}} - \delta^\star \approx \gamma_{\mathrm{in}}$. For dispersion, $D^2_\infty = 1 - \lVert p_{y^\star}\mu_{y^\star} + \sum_{y \ne y^\star} p_y \mu_y \rVert^2$; the cross-terms are bounded by $\rho_d$, and the dominant term is $1 - p_{y^\star}^2 \approx 1 - (1 - \delta^\star)^2 \approx 2\delta^\star$.

    Proof (non-member). $S_\infty = v_{\mathrm{out}}^\top m(t_{\mathrm{out}})$. Since $v_{\mathrm{out}}$ is isotropic and independent of $m(t_{\mathrm{out}})$, $S_\infty$ concentrates around $0$ at rate $d^{-1/2}$ (Assumption A.2). For dispersion, $\lVert m(t_{\mathrm{out}}) \rVert_2^2 = \lVert \sum_y p_y \mu_y \rVert_2^2 = \sum_y p_y^2 + \sum_{y \ne z} p_y p_z \mu_y^\top \mu_z$; with $p_y \approx 1/M$, $\sum_y p_y^2 \approx 1/M$, and the cross-terms are bounded by $\rho_d$ …

    The decision thresholds are the midpoints $s_{\mathrm{thr}} = \tfrac{1}{2}\big(S_\infty(t_{\mathrm{in}}) + S_\infty(t_{\mathrm{out}})\big)$ and $d^2_{\mathrm{thr}} = \tfrac{1}{2}\big(D^2_\infty(t_{\mathrm{in}}) + D^2_\infty(t_{\mathrm{out}})\big)$. (2) Concentration of empirical statistics: the empirical statistics must concentrate around their population means within a radius smaller than $\Gamma/2$. First, consider the optimization localization; let $E_{\mathrm{opt}}(t)$ be the event th…
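A toy numerical check of this separation, with every value ($d$, $M$, $\delta^\star$, $\gamma_{\mathrm{in}}$) chosen for illustration rather than taken from the paper: member queries should land near $(\gamma_{\mathrm{in}}, 2\delta^\star)$ in the $(S_\infty, D^2_\infty)$ plane and non-members near $(0, 1 - 1/M)$, leaving a wide margin for the midpoint thresholds.

```python
import numpy as np

rng = np.random.default_rng(0)
d, M, delta, gamma = 512, 50, 0.05, 0.8  # illustrative values, not from the paper

# Near-orthogonal unit prototypes mu_y: random directions in high dimension.
mu = rng.standard_normal((M, d))
mu /= np.linalg.norm(mu, axis=1, keepdims=True)

def stats(p, v):
    """S_inf = v . m and D^2_inf = 1 - ||m||^2 for the mixture m = sum_y p_y mu_y."""
    m = p @ mu
    return float(v @ m), float(1.0 - m @ m)

# Member query: mixture mass concentrated on one prototype (p_{y*} = 1 - delta),
# with the inversion direction v_in aligned to that prototype at level gamma.
p_in = np.full(M, delta / (M - 1))
p_in[0] = 1.0 - delta
noise = rng.standard_normal(d)
noise -= (noise @ mu[0]) * mu[0]  # orthogonalize the noise against mu_{y*}
v_in = gamma * mu[0] + np.sqrt(1 - gamma**2) * noise / np.linalg.norm(noise)

# Non-member query: uniform mixture, isotropic (independent) inversion direction.
p_out = np.full(M, 1.0 / M)
v_out = rng.standard_normal(d)
v_out /= np.linalg.norm(v_out)

print("member     (S, D^2):", stats(p_in, v_in))    # ~ (gamma, 2*delta)
print("non-member (S, D^2):", stats(p_out, v_out))  # ~ (0, 1 - 1/M)
```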