E-MIA: Exam-Style Black-Box Membership Inference Attacks against RAG Systems
Pith reviewed 2026-05-09 19:23 UTC · model grok-4.3
The pith
Building an exam from a document's verifiable facts creates a clear signal for whether it belongs in a RAG corpus.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
E-MIA converts verifiable hard evidence in the target document, such as fine-grained details, proper nouns, definitional statements, metadata cues, and causal or constraint relations, into an exam with four objectively gradable question types: fill-in-the-blank, short answer, multiple choice, and true/false. The aggregated exam score across multiple evidence-targeted questions serves as the membership signal.
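For concreteness, here is a minimal sketch of the attack loop the abstract describes. It is an editorial illustration, not the authors' code: `query_rag`, the substring grader, and the 0.6 decision threshold are all assumptions standing in for details the paper does not publish here.

```python
# Minimal sketch of the E-MIA exam pipeline (illustrative assumptions only).
from dataclasses import dataclass

@dataclass
class ExamQuestion:
    kind: str      # "FB" (fill-in-the-blank), "SC" (short answer),
                   # "MC" (multiple choice), or "TF" (true/false)
    prompt: str    # natural-sounding question posed to the RAG system
    answer: str    # gold answer taken verbatim from the candidate document

def grade(question: ExamQuestion, response: str) -> float:
    """Objective grading: 1.0 if the gold answer appears in the response."""
    return 1.0 if question.answer.lower() in response.lower() else 0.0

def exam_score(questions: list[ExamQuestion], query_rag) -> float:
    """Aggregated exam score used as the membership signal."""
    marks = [grade(q, query_rag(q.prompt)) for q in questions]
    return sum(marks) / len(marks)

def infer_membership(questions, query_rag, threshold: float = 0.6) -> bool:
    """Score above a calibrated threshold => document judged a member.
    The 0.6 default is illustrative, not a value from the paper."""
    return exam_score(questions, query_rag) >= threshold
```

Exact-substring matching is the crudest objective grader; the paper's MC and T/F formats would instead grade by matching the selected option or truth value.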
What carries the argument
An exam constructed from four objective question types drawn from the document's hard evidence, whose total score functions as the membership indicator.
If this is right
- Member and non-member documents exhibit improved separability in score distributions under stringent evaluation conditions.
- Queries remain natural and stealthy, reducing the chance of refusal or detection compared with explicit confirmation probes.
- Attack performance varies with the composition of question types and the overall length of the exam.
- Successful inference reveals corpus coverage and the presence of sensitive topics without direct access.
Where Pith is reading between the lines
- RAG operators may need to limit how specifically responses echo document facts even when retrieval occurs.
- The same evidence-to-exam pattern could apply to membership testing in other retrieval-based or knowledge-augmented models.
- Adding controlled noise or paraphrasing to retrieved content might reduce the reliability of score-based separation.
Load-bearing premise
Verifiable hard evidence in the target document can be reliably converted into objectively gradable questions whose answer patterns differ enough between member and non-member cases to produce stable thresholds.
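A minimal sketch of how such a threshold could be made stable, assuming the attacker can score shadow documents with known membership status (the calibration procedure below is our assumption, not one stated in the abstract):

```python
import numpy as np

def calibrate_threshold(member_scores, nonmember_scores):
    """Choose the exam-score cutoff maximizing Youden's J (TPR - FPR)
    over shadow documents with known membership status."""
    m = np.asarray(member_scores)
    n = np.asarray(nonmember_scores)
    best_t, best_j = 0.5, -1.0
    for t in np.unique(np.concatenate([m, n])):
        j = (m >= t).mean() - (n >= t).mean()
        if j > best_j:
            best_j, best_t = j, t
    return best_t

# Toy demo on synthetic scores (not data from the paper): well-separated
# distributions yield a cutoff between the two modes.
rng = np.random.default_rng(0)
print(calibrate_threshold(rng.normal(0.85, 0.08, 100).clip(0, 1),
                          rng.normal(0.35, 0.12, 100).clip(0, 1)))
```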
What would settle it
A test on a controlled RAG system in which the same collection of evidence-derived questions produces statistically overlapping accuracy scores for documents known to be inside versus outside the corpus; such overlap would falsify the claimed separability.
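One hedged way to run that settling test: compare the two score samples with a Mann-Whitney U test, whose statistic directly yields the empirical AUC. All names below are illustrative.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def separability(member_scores, nonmember_scores):
    """Mann-Whitney U test on the two exam-score samples. The U statistic
    normalized by n_m * n_n is the empirical AUC; AUC near 0.5 with a
    non-significant p-value is the falsifying overlap described above."""
    u, p = mannwhitneyu(member_scores, nonmember_scores,
                        alternative="greater")
    auc = u / (len(member_scores) * len(nonmember_scores))
    return auc, p
```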
Original abstract
Retrieval-Augmented Generation (RAG) equips large language models (LLMs) with external evidence by retrieving documents at inference time, but it also turns the retrieval corpus into a sensitive asset. Under a black-box setting, an adversary given a candidate document can infer whether it has been ingested into the RAG knowledge base (i.e., document-level membership inference) solely from query-response interactions, thereby leaking corpus coverage and the existence of sensitive topics. Existing RAG MIA methods either rely on soft signals such as semantic similarity, which often yield overlapping member/non-member score distributions and unstable thresholds, or employ explicit confirmation probes whose intent is conspicuous and thus prone to refusal and detection. We propose E-MIA, which converts verifiable hard evidence in the target document (e.g., fine-grained details, proper nouns/technical terms, definitional statements, metadata cues, and causal/constraint relations) into an exam with four objectively gradable question types (FB/SC/MC/T/F), and uses the aggregated exam score across multiple evidence-targeted questions as the membership signal. Experiments across multiple datasets and diverse RAG configurations demonstrate that E-MIA improves member/non-member separability in stringent settings while preserving natural, stealthy queries, and we further analyze the impact of question composition and exam length on attack effectiveness.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes E-MIA, a black-box membership inference attack on RAG systems. It extracts verifiable hard evidence (fine-grained details, proper nouns, definitional statements, causal relations) from a candidate document and converts it into an exam consisting of four objectively gradable question types (FB, SC, MC, T/F). The aggregated exam score from the RAG LLM's responses is used as the membership signal to distinguish whether the document is in the retrieval corpus. Experiments across multiple datasets and RAG configurations are claimed to show improved member/non-member separability relative to prior soft-signal or explicit-probe methods while preserving natural, stealthy queries; the paper also analyzes the effects of question composition and exam length.
Significance. If the central empirical claims hold with rigorous quantification, E-MIA would constitute a meaningful advance in black-box privacy attacks against RAG pipelines by replacing unstable similarity-based signals with structured, gradable probes. This would strengthen the case that retrieval corpora constitute a sensitive asset and could guide the design of query-monitoring or answer-consistency defenses.
major comments (3)
- [§3] §3 (Method), paragraph on question generation: the central claim that verifiable hard evidence can be reliably turned into questions whose answer patterns differ stably between member and non-member cases is load-bearing, yet the manuscript supplies no ablation on conversion success rate, retrieval failure modes, or cases where the base LLM answers correctly from parametric knowledge alone. Without such data the membership signal's reliability remains unverified.
- [§5] §5 (Experiments): the abstract asserts 'improved member/non-member separability' and 'stable thresholds' but the experimental reporting must include concrete metrics (AUC or separation distance with error bars), dataset sizes, number of documents per split, and statistical tests against baselines; the current description leaves the magnitude and robustness of the improvement unquantified.
- [§4.2] §4.2 (Analysis of question composition and exam length): the paper states it analyzes impact on attack effectiveness, yet no table or figure reports how performance varies with the proportion of each question type or with exam length; this omission weakens the practical takeaway on optimal exam design.
minor comments (2)
- [Abstract] Notation for the four question types (FB/SC/MC/T/F) is introduced without an explicit table mapping abbreviations to full names on first use.
- [§2] The manuscript should cite the specific prior RAG MIA works it compares against (semantic-similarity and explicit-confirmation baselines) with full references in the related-work section.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and will revise the manuscript to incorporate additional analyses, metrics, and visualizations as suggested.
Point-by-point responses
-
Referee: [§3] §3 (Method), paragraph on question generation: the central claim that verifiable hard evidence can be reliably turned into questions whose answer patterns differ stably between member and non-member cases is load-bearing, yet the manuscript supplies no ablation on conversion success rate, retrieval failure modes, or cases where the base LLM answers correctly from parametric knowledge alone. Without such data the membership signal's reliability remains unverified.
Authors: We agree that an explicit ablation on question generation reliability would strengthen the central claim. In the revised version we will add a new subsection (or appendix) reporting: (i) conversion success rate measured via manual verification on a random sample of 200 generated questions across datasets, (ii) observed retrieval failure modes with their frequencies, and (iii) the fraction of cases in which the base LLM answers correctly from parametric knowledge alone (measured by comparing RAG responses against a no-retrieval baseline). These results will directly quantify the stability of the membership signal. Revision: yes.
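As an editorial sketch of the control proposed in item (iii), one can pose each exam question both to the full RAG system and to the same LLM with retrieval disabled; `query_llm_only`, `query_rag`, and `grade` below are hypothetical interfaces, not the paper's code.

```python
# Hedged sketch: discard questions the bare LLM already answers correctly,
# since they carry no membership signal about the retrieval corpus.
def parametric_leak_rate(questions, query_llm_only, grade):
    """Fraction of exam questions answerable with retrieval disabled."""
    leaked = [q for q in questions
              if grade(q, query_llm_only(q.prompt)) == 1.0]
    return len(leaked) / len(questions), leaked

def filtered_exam_score(questions, query_rag, query_llm_only, grade):
    """Exam score over only the questions the bare LLM could not answer."""
    _, leaked = parametric_leak_rate(questions, query_llm_only, grade)
    kept = [q for q in questions if q not in leaked]
    if not kept:
        return None  # every question is answerable without retrieval
    return sum(grade(q, query_rag(q.prompt)) for q in kept) / len(kept)
```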
-
Referee: [§5] §5 (Experiments): the abstract asserts 'improved member/non-member separability' and 'stable thresholds' but the experimental reporting must include concrete metrics (AUC or separation distance with error bars), dataset sizes, number of documents per split, and statistical tests against baselines; the current description leaves the magnitude and robustness of the improvement unquantified.
Authors: We acknowledge that the current experimental section relies on qualitative descriptions and should be augmented with precise quantitative evidence. The revision will expand §5 to report: AUC (and separation distance) with error bars over at least five random seeds, exact dataset sizes and member/non-member split counts for every experiment, and statistical significance tests (paired t-tests or Wilcoxon signed-rank tests) against all baselines. These additions will make the magnitude and robustness of the claimed improvements fully verifiable. Revision: yes.
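The promised reporting format is mechanical to produce once per-seed AUCs exist; a hedged sketch (function name is ours, inputs would come from the expanded experiments):

```python
import numpy as np
from scipy.stats import wilcoxon

def report_separability(emia_auc_per_seed, baseline_auc_per_seed):
    """Summarize paired per-seed AUCs: mean +/- sample std, plus a
    one-sided Wilcoxon signed-rank test of E-MIA over the baseline.
    Inputs are equal-length arrays, one AUC per random seed."""
    e = np.asarray(emia_auc_per_seed)
    b = np.asarray(baseline_auc_per_seed)
    stat, p = wilcoxon(e, b, alternative="greater")
    return {
        "emia": f"{e.mean():.3f} +/- {e.std(ddof=1):.3f}",
        "baseline": f"{b.mean():.3f} +/- {b.std(ddof=1):.3f}",
        "wilcoxon_p": p,
    }
```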
-
Referee: [§4.2] §4.2 (Analysis of question composition and exam length): the paper states it analyzes impact on attack effectiveness, yet no table or figure reports how performance varies with the proportion of each question type or with exam length; this omission weakens the practical takeaway on optimal exam design.
Authors: While §4.2 contains textual discussion of these factors, we agree that the absence of summarized tabular and visual results limits practical utility. We will add (i) a table reporting AUC for different mixtures of FB/SC/MC/TF questions (e.g., 25/25/25/25, 40/20/20/20, etc.) and (ii) a line plot of AUC versus exam length (from 4 to 20 questions). These will be placed in §4.2 and will directly support recommendations for optimal exam design. Revision: yes.
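A hypothetical harness for the promised exam-length sweep; `question_bank`, the two RAG callables, and `grade` are stand-ins for the paper's setup, and the AUC is computed empirically from the resulting score samples.

```python
import random

def auc_from_scores(member_scores, nonmember_scores):
    """Empirical AUC: probability a member exam outscores a non-member."""
    pairs = len(member_scores) * len(nonmember_scores)
    wins = sum(m > n for m in member_scores for n in nonmember_scores)
    ties = sum(m == n for m in member_scores for n in nonmember_scores)
    return (wins + 0.5 * ties) / pairs

def sweep_exam_length(question_bank, member_rag, nonmember_rag, grade,
                      lengths=range(4, 21, 4), trials=20, seed=0):
    """Subsample exams of each length and measure score separability."""
    rng = random.Random(seed)
    results = {}
    for length in lengths:
        m_scores, n_scores = [], []
        for _ in range(trials):
            exam = rng.sample(question_bank, length)
            m_scores.append(sum(grade(q, member_rag(q.prompt))
                                for q in exam) / length)
            n_scores.append(sum(grade(q, nonmember_rag(q.prompt))
                                for q in exam) / length)
        results[length] = auc_from_scores(m_scores, n_scores)
    return results
```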
Circularity Check
No circularity: empirical attack validated by experiments
Full rationale
The paper describes a black-box attack that converts document evidence into four types of exam questions and aggregates scores as a membership signal. No equations, parameter fits, or derivations appear in the abstract or method outline. The central claim is supported by experimental results on multiple datasets and RAG configurations rather than any self-definitional reduction, fitted-input prediction, or load-bearing self-citation. The conversion step is presented as a practical heuristic whose effectiveness is measured externally, not assumed by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: RAG systems retrieve and condition generation on documents from the indexed corpus.