Trustworthiness in Retrieval-Augmented Generation Systems: A Survey
Pith reviewed 2026-05-23 21:06 UTC · model grok-4.3
The pith
The Trust-RAG Compass framework assesses the trustworthiness of RAG systems along six dimensions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose a unified framework, Trust-RAG Compass, that assesses the trustworthiness of RAG systems across six key dimensions: factuality, robustness, fairness, transparency, accountability, and privacy. Within this framework, we provide a thorough review of the existing literature along each dimension. Furthermore, we introduce an evaluation benchmark, TRC Bench, regarding the six dimensions and conduct comprehensive evaluations for a variety of proprietary and open-source models. Our results shed light on the performance gaps between different types of LLMs across varying dimensions of trustworthiness.
What carries the argument
Trust-RAG Compass framework, which structures trustworthiness assessment into the six dimensions and supports literature review plus benchmarking via TRC Bench.
If this is right
- Literature on RAG trustworthiness can be organized along the six dimensions for structured analysis.
- TRC Bench enables direct comparison of models on trustworthiness metrics.
- Performance gaps between proprietary and open-source LLMs show up across the six trustworthiness dimensions.
- Key challenges identified can guide targeted improvements in RAG development.
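As a rough illustration of how a per-dimension comparison could be tabulated, the sketch below computes the gap in mean score between two groups of models on each of the six dimensions. The model names, group labels, and all scores are invented placeholders, not TRC Bench results, and the helper `group_gaps` is hypothetical; the benchmark's actual metrics and aggregation are not described on this page.

```python
from statistics import mean

DIMENSIONS = ("factuality", "robustness", "fairness",
              "transparency", "accountability", "privacy")

# Placeholder scores in [0, 1], invented for illustration only.
scores = {
    "proprietary_model_a": {"factuality": 0.75, "robustness": 0.5, "fairness": 0.5,
                            "transparency": 0.25, "accountability": 0.5, "privacy": 0.75},
    "open_model_b": {"factuality": 0.5, "robustness": 0.5, "fairness": 0.25,
                     "transparency": 0.5, "accountability": 0.5, "privacy": 0.25},
}
# Hypothetical group assignment for each model.
groups = {"proprietary_model_a": "proprietary", "open_model_b": "open_source"}


def group_gaps(scores, groups):
    """Per dimension: mean proprietary score minus mean open-source score."""
    gaps = {}
    for dim in DIMENSIONS:
        by_group = {"proprietary": [], "open_source": []}
        for model, dims in scores.items():
            by_group[groups[model]].append(dims[dim])
        gaps[dim] = mean(by_group["proprietary"]) - mean(by_group["open_source"])
    return gaps


gaps = group_gaps(scores, groups)
for dim in DIMENSIONS:
    print(f"{dim:15s} {gaps[dim]:+.2f}")
```

A positive value means the proprietary group scores higher on that dimension under these placeholder numbers. Reporting the gap per dimension, rather than as one aggregate, keeps the dimension-level differences visible.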
Where Pith is reading between the lines
- Developers could apply the benchmark to diagnose and fix weaknesses in specific dimensions for their RAG deployments.
- The framework structure might transfer to trustworthiness assessment in non-RAG LLM applications.
- If new RAG risks appear, the dimensions could be revisited or expanded in follow-up work.
Load-bearing premise
The six dimensions comprehensively and without overlap capture all aspects of trustworthiness in RAG systems.
What would settle it
An empirical study that identifies a significant trustworthiness failure mode in RAG systems not covered by any of the six dimensions.
Original abstract
Retrieval-Augmented Generation (RAG) has quickly grown into a pivotal paradigm in the development of Large Language Models (LLMs). Although existing research mainly emphasizes accuracy and efficiency, the trustworthiness of RAG systems remains insufficiently explored. RAG can improve LLM reliability by grounding responses in external and up-to-date knowledge, reducing hallucinations. However, unreliable retrieval or improper knowledge utilization may still lead to undesirable outputs. To address these concerns, we propose a unified framework, Trust-RAG Compass, that assesses the trustworthiness of RAG systems across six key dimensions: factuality, robustness, fairness, transparency, accountability, and privacy. Within this framework, we provide a thorough review of the existing literature along each dimension. Furthermore, we introduce an evaluation benchmark, TRC Bench (Trust-RAG Compass Benchmark), regarding the six dimensions and conduct comprehensive evaluations for a variety of proprietary and open-source models. Our results shed light on the performance gaps between different types of LLMs across varying dimensions of trustworthiness. Finally, we identify key challenges and promising directions for future research based on our findings. Through this work, we aim to provide a structured foundation for subsequent investigations and practical guidance for developing trustworthy RAG systems in real-world scenarios.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Trust-RAG Compass, a unified framework for assessing trustworthiness of Retrieval-Augmented Generation (RAG) systems across six dimensions (factuality, robustness, fairness, transparency, accountability, privacy). It reviews the literature organized by these dimensions, introduces the TRC Bench evaluation benchmark covering the six dimensions, evaluates a range of proprietary and open-source LLMs on the benchmark, reports performance gaps, and outlines key challenges and future directions.
Significance. A well-justified taxonomy and benchmark could supply a needed organizing structure for trustworthiness research in RAG, moving beyond accuracy-focused evaluations and enabling systematic comparisons across model types.
major comments (1)
- [Framework introduction / §3] The manuscript presents the six dimensions of Trust-RAG Compass (factuality, robustness, fairness, transparency, accountability, privacy) as given in the abstract and framework definition without deriving them from a systematic enumeration of RAG failure modes or comparing the partition against plausible alternatives (e.g., addition of security or calibration). Because this choice structures the entire literature review and the construction of TRC Bench, the absence of explicit justification or validation is load-bearing for the central claim.
minor comments (2)
- [Framework section] Ensure that the definition of each dimension in the framework section is accompanied by a short list of concrete RAG-specific failure examples so readers can map the taxonomy to observed behaviors.
- [TRC Bench section] In the benchmark description, clarify how the six evaluation subsets were constructed and whether any overlap or redundancy between dimensions was measured.
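One simple way the overlap question in the last comment could be checked is sketched below: pairwise Jaccard overlap between the item sets of the evaluation subsets. This is only one possible redundancy measure, not something the paper is known to report. The subset contents and item IDs are hypothetical placeholders, and only three of the six dimensions are shown to keep the example short.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two collections of item IDs (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(subsets):
    """Map each unordered pair of dimension names to the Jaccard overlap of their items."""
    return {(x, y): jaccard(subsets[x], subsets[y])
            for x, y in combinations(sorted(subsets), 2)}


# Hypothetical item IDs per dimension subset, for illustration only.
subsets = {
    "factuality": {"q1", "q2", "q3"},
    "robustness": {"q3", "q4"},
    "privacy": {"q5"},
}

overlap = pairwise_overlap(subsets)
for pair, value in overlap.items():
    print(pair, round(value, 2))
```

Item-level overlap only catches shared test items. Redundancy in what the dimensions measure would need a separate check, for example the correlation of per-model scores between dimensions.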
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the justification of the Trust-RAG Compass framework. We address the single major comment below and will incorporate revisions accordingly.
Referee: [Framework introduction / §3] The manuscript presents the six dimensions of Trust-RAG Compass (factuality, robustness, fairness, transparency, accountability, privacy) as given in the abstract and framework definition without deriving them from a systematic enumeration of RAG failure modes or comparing the partition against plausible alternatives (e.g., addition of security or calibration). Because this choice structures the entire literature review and the construction of TRC Bench, the absence of explicit justification or validation is load-bearing for the central claim.
Authors: We agree that an explicit derivation from RAG failure modes would strengthen the framework's foundation. In the revised manuscript we will expand §3 with a new subsection that first enumerates representative RAG failure modes drawn from the surveyed literature (hallucinations and retrieval errors for factuality; adversarial retrieval attacks and distribution shifts for robustness; biased retrieval results for fairness; opaque retrieval-generation pipelines for transparency; lack of audit trails for accountability; and leakage of private retrieved content for privacy). We will then map each dimension to these modes and briefly compare the resulting partition against alternatives, noting that security concerns are largely subsumed under robustness and privacy while calibration issues fall under factuality. This addition will directly support the literature organization and TRC Bench construction without changing the six dimensions themselves. revision: yes
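The mapping the authors describe can be written down as a small lookup table, which also makes the coverage question concrete. The sketch below encodes the failure modes named in the response and reports any observed failure mode that no dimension covers. The identifier spellings, the helper names, and the `placeholder_new_mode` entry are assumptions for illustration; none of them come from the paper.

```python
# Failure modes and their dimensions, as listed in the authors' response.
# Identifier spellings are illustrative, not the paper's terminology.
FAILURE_MODE_TO_DIMENSIONS = {
    "hallucination": {"factuality"},
    "retrieval_error": {"factuality"},
    "adversarial_retrieval_attack": {"robustness"},
    "distribution_shift": {"robustness"},
    "biased_retrieval": {"fairness"},
    "opaque_pipeline": {"transparency"},
    "missing_audit_trail": {"accountability"},
    "private_content_leakage": {"privacy"},
}


def covered_dimensions():
    """Return every dimension that at least one listed failure mode maps to."""
    return set().union(*FAILURE_MODE_TO_DIMENSIONS.values())


def uncovered(observed_modes):
    """Return observed failure modes that have no entry in the mapping."""
    return sorted(m for m in observed_modes if m not in FAILURE_MODE_TO_DIMENSIONS)


# "placeholder_new_mode" stands in for a failure mode found in some future study.
observed = ["hallucination", "private_content_leakage", "placeholder_new_mode"]
print(covered_dimensions())
print(uncovered(observed))
```

A non-empty result from `uncovered` is the kind of evidence that would count against the premise that the six dimensions cover every trustworthiness failure.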
Circularity Check
No circularity: framework organizes literature review without self-referential reduction
The paper proposes Trust-RAG Compass as an organizing framework for a survey of existing RAG trustworthiness literature, listing the six dimensions directly in the abstract and stating that the review proceeds 'within this framework.' No equations, fitted parameters, or self-citations are invoked to derive the dimensions; the structure is presented as a proposed taxonomy drawn from the reviewed works rather than reducing to any input by construction. The central claims (literature organization and TRC Bench) therefore remain independent of the framework definition itself.
Forward citations
Cited by 4 Pith papers
- Why Retrieval-Augmented Generation Fails: A Graph Perspective
  Attribution graphs reveal that RAG failures arise from shallow fragmented evidence flow in LLMs, enabling topology-based detection and targeted interventions that reinforce question-guided routing.
- When AI Persuades: Adversarial Explanation Attacks on Human Trust in AI-Assisted Decision Making
  Adversarial explanation attacks preserve nearly all human trust in wrong AI outputs by using persuasive framing, shown in a study varying reasoning, evidence, style, and format with over 200 participants.
- Search-o1: Agentic Search-Enhanced Large Reasoning Models
  Search-o1 integrates agentic retrieval-augmented generation and a Reason-in-Documents module into large reasoning models to dynamically supply missing knowledge and improve performance on complex science, math, coding...
- ALDEN: Boosting Private Data Extraction from Retrieval-Augmented Generation Systems via Active Learning and Distribution Estimation
  ALDEN boosts private data extraction rates from RAG systems by combining active learning for query diversification with dynamic estimation of the underlying knowledge-base topic distribution.