
arxiv: 2606.06946 · v1 · pith:GCQGTOZInew · submitted 2026-06-05 · 💻 cs.CL · cs.AI

Auditing Training Data in Domain-adapted LLMs: LoRA-MINT

Pith reviewed 2026-06-27 21:51 UTC · model grok-4.3

keywords: membership inference · LoRA · large language models · perplexity · training data auditing · fine-tuning · NLP

The pith

LoRA-MINT infers whether a sample was used in the LoRA fine-tuning of an LLM from perplexity differences, reaching precision of 0.77 to 0.92.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces LoRA-MINT as a membership inference method for LLMs adapted to specific NLP tasks with Low-Rank Adaptation. It uses the relationship between model perplexity on a sample and whether that sample appeared in the adaptation training set. This matters for auditing whether particular data contributed to a deployed model, which bears on intellectual property and sensitive data handling. Experiments across four models and three benchmark datasets show the approach outperforms prior baselines. The method is described as applicable beyond LoRA to other domain-adapted models.

Core claim

LoRA-MINT establishes that membership status in the training data of a LoRA-adapted LLM can be recovered from perplexity signals, delivering precision between 0.77 and 0.92 on standard NLP benchmarks and exceeding state-of-the-art alternatives.

What carries the argument

The perplexity-membership relationship that serves as the detection signal after LoRA adaptation.
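
For readers unfamiliar with the signal, here is a minimal sketch of how perplexity is conventionally computed from per-token log-probabilities: the exponential of the mean negative log-likelihood. The log-probability values are hypothetical, and the assumption that training members tend to score lower perplexity is the usual intuition rather than something this page confirms about the paper's scoring rule.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)

# Hypothetical natural-log probabilities a model might assign to each token.
confident_sample = [-0.1, -0.2, -0.1, -0.3]   # model predicts these tokens well
uncertain_sample = [-1.2, -0.9, -1.5, -1.1]   # model predicts these tokens poorly

ppl_confident = perplexity(confident_sample)
ppl_uncertain = perplexity(uncertain_sample)

# Assumption: a lower perplexity is read as weak evidence of membership.
print(ppl_confident < ppl_uncertain)
```

In practice the log-probabilities would come from the LoRA-adapted model itself; how LoRA-MINT turns the resulting score into a decision is not specified on this page.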

If this is right

  • Auditing tools become feasible for checking training data in LoRA-adapted LLMs.
  • Data exposure estimates improve for fine-tuned models used in NLP tasks.
  • Transparency around intellectual property and sensitive data in adapted models increases.
  • The framework supports scalable checks across multiple models and datasets.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar perplexity signals might be tested on full fine-tuning or other adaptation techniques.
  • Regulators could require disclosure of training sources if such inference methods prove reliable.
  • Model providers might deliberately alter training to weaken these signals and protect data.

Load-bearing premise

Perplexity differences continue to mark membership status even after LoRA fine-tuning and are not erased by task-specific effects or dataset traits.

What would settle it

A controlled test on a LoRA-adapted model with fully known training and non-training samples that yields precision no higher than chance would show the perplexity signal does not work.
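
A rough sketch of that check, assuming ground-truth membership labels are available: compute the detector's precision and compare it with the precision a random guesser would be expected to reach, which equals the share of members in the test pool. The labels and decisions below are hypothetical illustrations, not data from the paper.

```python
def precision(predicted_member, is_member):
    """Fraction of samples flagged as members that truly are members."""
    true_pos = sum(p and t for p, t in zip(predicted_member, is_member))
    flagged = sum(predicted_member)
    return true_pos / flagged if flagged else 0.0

def chance_precision(is_member):
    """Expected precision of a random guesser: the member base rate."""
    return sum(is_member) / len(is_member)

# Hypothetical ground truth and detector decisions for eight samples.
is_member        = [True, True, True, True, False, False, False, False]
predicted_member = [True, True, True, False, True, False, False, False]

p = precision(predicted_member, is_member)
baseline = chance_precision(is_member)

# The perplexity signal is only informative if p clearly exceeds baseline.
print(p, baseline)
```

With a balanced pool the chance level is 0.5, so the reported 0.77 to 0.92 range is only meaningful relative to the member share actually used in the paper's test sets.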

Figures

Figures reproduced from arXiv: 2606.06946 by Aythami Morales, Daniel DeAlcala, Francisco Jurado, Gonzalo Mancera, Julian Fierrez, Ruben Tolosana.

Figure 1: The objective of LoRA-MINT is to determine whether a given sample … view at source ↗

Figure 2: Filtered distribution S′ of synthetic in-domain samples, obtained by removing extremes below θ_low and above θ_high. view at source ↗
Surrounding paper text captured with the caption (truncated in the source): … constructed so as to avoid inclusion in D_train. Using the set of perplexities {PPL(s_j)}, j = 1, …, M, of the synthetic samples, we denote the empirical mean and standard deviation of the reference distribution as μ and σ. To reduce the influence of extreme values that are far from typical in-domain b…

Figure 3: Overview of LoRA-MINT. The base LLM is fine-tuned with LoRA using the training set … view at source ↗
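
The filtering step described in the Figure 2 caption can be sketched as follows. This is only an illustration: the perplexity values are made up, the cut-offs are placeholders (the page does not say how the paper chooses θ_low and θ_high), and the use of the sample standard deviation is an assumption.

```python
import statistics

def filter_reference(ppls, theta_low, theta_high):
    """Keep only perplexities inside [theta_low, theta_high]."""
    return [p for p in ppls if theta_low <= p <= theta_high]

# Hypothetical perplexities of M synthetic in-domain samples.
synthetic_ppls = [1.2, 8.5, 9.1, 9.8, 10.4, 11.0, 11.7, 55.0]

mu = statistics.mean(synthetic_ppls)       # empirical mean of the reference set
sigma = statistics.stdev(synthetic_ppls)   # sample standard deviation (assumed)

# Placeholder cut-offs, chosen by hand for this example only.
theta_low, theta_high = 5.0, 20.0
filtered = filter_reference(synthetic_ppls, theta_low, theta_high)

print(len(synthetic_ppls), len(filtered))
```

The two outliers (1.2 and 55.0) are dropped, leaving a reference set that better reflects typical in-domain perplexities.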
Original abstract

We present LoRA-MINT, a new methodology for Membership Inference Test (MINT) applied to recent Large Language Models (LLMs) fine-tuned for specific Natural Language Processing (NLP) tasks through Low-Rank Adaptation (LoRA). The primary goal is to assess whether individual samples were part of the training data of these adapted models, providing a useful auditing tool for the management of intellectual property and sensitive data. Our analysis explores the relationship between model perplexity and membership status, providing a systematic framework for estimating data exposure in fine-tuned LLMs. We conducted experiments on four models and three benchmark datasets, obtaining precision values in determining if given data were used for training ranging from 0.77 to 0.92, which outperform state-of-the-art baselines and demonstrate the robustness and generality of the proposed method. In general, our findings underscore the potential of LoRA-MINT as an effective and scalable framework for auditing LLMs, improving transparency, and fostering the ethical and responsible deployment of AI and NLP technologies. For the sake of concreteness and current relevance, our discussion and experiments are centered on LoRA-adjusted LLMs, but note that most of the presented methodology is easily applicable for auditing training data given any other technique for adapting LLMs or, more generally, any other domain-adapted AI models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes LoRA-MINT, a membership inference method for LoRA-adapted LLMs that relies on perplexity differences to determine whether individual samples were part of the fine-tuning data. It reports experiments across four models and three benchmark datasets, claiming precision values of 0.77–0.92 that outperform state-of-the-art baselines, and positions the approach as a scalable auditing tool for intellectual property and data exposure in domain-adapted models.

Significance. If the perplexity signal can be shown to isolate membership status rather than task-domain effects after LoRA adaptation, the method would offer a practical, low-overhead auditing technique for fine-tuned LLMs. The work highlights an important application area (auditing adapted models) but currently provides limited evidence that the reported performance reflects membership inference rather than domain detection.

major comments (2)
  1. [Abstract / Experiments] Abstract and Experiments section: the reported precision range (0.77–0.92) is presented without any description of how non-member samples were drawn (same task/domain vs. out-of-domain), any statistical significance tests, or exact baseline implementations. This directly affects the central claim that perplexity differences reliably indicate membership after LoRA adaptation rather than task-domain shift.
  2. [Abstract] Abstract: the weakest assumption—that perplexity remains a membership signal post-LoRA rather than being dominated by whether a sample belongs to the fine-tuning task distribution—is not tested or controlled for, making the outperformance claim load-bearing on an unverified premise.
minor comments (2)
  1. [Abstract] The abstract states results on 'four models and three benchmark datasets' but provides no table or section reference listing the specific models, datasets, or exact precision per configuration.
  2. [Abstract] Notation for the membership inference threshold and how it is chosen is not defined in the provided abstract, complicating reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. The comments highlight important aspects of experimental clarity and the need to isolate membership signals from domain effects. We address each major comment below and indicate where revisions will be made to improve the manuscript.

  1. Referee: [Abstract / Experiments] Abstract and Experiments section: the reported precision range (0.77–0.92) is presented without any description of how non-member samples were drawn (same task/domain vs. out-of-domain), any statistical significance tests, or exact baseline implementations. This directly affects the central claim that perplexity differences reliably indicate membership after LoRA adaptation rather than task-domain shift.

    Authors: We agree that these details are essential for interpreting the results and will strengthen the central claim. In the manuscript, non-member samples are drawn from the same benchmark datasets used for fine-tuning but are held out from the training split, ensuring they are from the identical task and domain distribution. We will revise the abstract to briefly note this sampling approach and expand the Experiments section to include: (1) explicit description of non-member selection, (2) statistical significance testing on the reported precision values, and (3) precise implementation details or references for the baselines. These additions will make the controls transparent and support that the signal targets membership within the domain. revision: yes

  2. Referee: [Abstract] Abstract: the weakest assumption—that perplexity remains a membership signal post-LoRA rather than being dominated by whether a sample belongs to the fine-tuning task distribution—is not tested or controlled for, making the outperformance claim load-bearing on an unverified premise.

    Authors: This concern is well-taken, as distinguishing membership from domain/task effects is fundamental. Our design samples both members and non-members from the same task-specific benchmark datasets, which provides a control for domain shift. However, we acknowledge that an explicit test of the assumption (e.g., via additional in-domain vs. out-of-domain comparisons) is not currently present. We will add a new analysis subsection that directly examines whether perplexity differences persist after controlling for domain, and we will incorporate any necessary supporting experiments or clarifications in the revision. revision: partial
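
The significance testing mentioned in the first response could take several forms. One common option, sketched here as an assumption and not as the authors' procedure, is a percentile bootstrap confidence interval for precision. The labels and decisions are hypothetical, and `bootstrap_precision_ci` is an illustrative helper name.

```python
import random

def precision(pred, truth):
    """Fraction of flagged samples that are true members."""
    flagged = [t for p, t in zip(pred, truth) if p]
    return sum(flagged) / len(flagged) if flagged else 0.0

def bootstrap_precision_ci(pred, truth, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for precision."""
    rng = random.Random(seed)
    pairs = list(zip(pred, truth))
    stats = []
    for _ in range(n_resamples):
        # Resample (decision, label) pairs with replacement.
        sample = [rng.choice(pairs) for _ in pairs]
        p, t = zip(*sample)
        stats.append(precision(p, t))
    stats.sort()
    lo = stats[int((alpha / 2) * n_resamples)]
    hi = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Hypothetical pool: 50 members and 50 held-out non-members from the same dataset.
truth = [True] * 50 + [False] * 50
pred = [True] * 40 + [False] * 10 + [True] * 10 + [False] * 40

point = precision(pred, truth)
ci_low, ci_high = bootstrap_precision_ci(pred, truth)
print(point, ci_low, ci_high)
```

An interval whose lower end stays above the member base rate (0.5 in this balanced example) would support the claim that the detector beats chance on in-domain non-members.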

Circularity Check

0 steps flagged

No significant circularity; the core signal, perplexity, is defined independently of membership labels

The paper's central method applies standard perplexity (a quantity defined independently of membership labels) to detect training data exposure after LoRA adaptation. Reported precisions (0.77-0.92) are empirical results on benchmarks, not quantities forced by fitting parameters to the same data or by self-citation chains. No equations or steps reduce the claimed prediction to a tautology or to a fitted input renamed as output. The derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. The approach implicitly assumes perplexity is a sufficient statistic for membership after LoRA adaptation.
