arXiv: 2404.10981 · v2 · pith:UOIA2PEFnew · submitted 2024-04-17 · 💻 cs.IR · cs.AI · cs.CL

A Survey on Retrieval-Augmented Text Generation for Large Language Models

Pith reviewed 2026-05-24 02:13 UTC · model grok-4.3

classification 💻 cs.IR · cs.AI · cs.CL
keywords retrieval-augmented generation · large language models · RAG · survey · text generation · information retrieval · evaluation methods · future directions

The pith

Retrieval-augmented generation addresses static limits in large language models by dynamically incorporating external information through four organized stages.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This survey argues that retrieval-augmented generation merges retrieval techniques with deep learning to let large language models pull in current external data instead of relying only on fixed training. It groups the entire process into four categories viewed from the retrieval side: pre-retrieval, retrieval, post-retrieval, and generation. The paper traces how the approach has developed by examining key studies, presents ways to evaluate these systems, and points to open challenges and next steps. A reader would care because the structure shows a practical route to making model outputs more accurate and up to date without retraining from scratch.

Core claim

The paper states that retrieval-augmented generation merges retrieval methods with deep learning to overcome the static limitations of large language models by enabling dynamic integration of up-to-date external information. Organized into the four categories of pre-retrieval, retrieval, post-retrieval, and generation from the retrieval viewpoint, the framework consolidates research, clarifies technological details, introduces evaluation methods, and outlines future directions to broaden the adaptability of large language models.

What carries the argument

The four-category framework (pre-retrieval, retrieval, post-retrieval, generation) that structures RAG components and performance factors from the retrieval perspective.
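
The four stages can be pictured as a chain of functions. The sketch below is a minimal, hypothetical illustration in Python, not code from the survey: the toy corpus, the term-overlap scoring, and every function name are assumptions chosen for brevity, and the generation step only assembles a prompt instead of calling a model.

```python
from collections import Counter

# Hypothetical toy corpus; a real system would index external documents.
CORPUS = {
    "d1": "RAG retrieves external documents before generation",
    "d2": "Large language models have static training data",
    "d3": "Reranking filters retrieved passages after retrieval",
}


def pre_retrieval(query: str) -> list[str]:
    """Pre-retrieval: normalize the query into lowercase terms."""
    return query.lower().split()


def retrieval(terms: list[str], k: int = 2) -> list[str]:
    """Retrieval: rank documents by term overlap and return the top-k ids."""
    scores = {}
    for doc_id, text in CORPUS.items():
        counts = Counter(text.lower().split())
        scores[doc_id] = sum(counts[t] for t in terms)
    # Sort by descending score, breaking ties by document id.
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return ranked[:k]


def post_retrieval(doc_ids: list[str], terms: list[str]) -> list[str]:
    """Post-retrieval: drop candidates that share no term with the query."""
    term_set = set(terms)
    return [d for d in doc_ids if term_set & set(CORPUS[d].lower().split())]


def generation(query: str, doc_ids: list[str]) -> str:
    """Generation: stand-in for an LLM call; only builds the grounded prompt."""
    context = "\n".join(CORPUS[d] for d in doc_ids)
    return f"Context:\n{context}\n\nQuestion: {query}"


def rag_pipeline(query: str) -> str:
    terms = pre_retrieval(query)
    candidates = retrieval(terms)
    kept = post_retrieval(candidates, terms)
    return generation(query, kept)


if __name__ == "__main__":
    print(rag_pipeline("static training data of language models"))
```

Keeping each stage as its own function mirrors the point of the categorization: a stage can be swapped or measured without touching the others.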

If this is right

  • RAG supplies a cost-effective route to more accurate and reliable text generation by grounding outputs in real-world data.
  • The category breakdown makes it possible to isolate and improve individual influences on overall system quality.
  • Standardized evaluation methods can be applied to compare different RAG designs against shared challenges.
  • Future work can target the gaps the survey identifies to expand large language model uses.
  • The progression analysis shows how retrieval integration has evolved to handle more complex scenarios.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Researchers could test whether reordering or merging any of the four stages yields measurable gains on standard benchmarks.
  • The same staging might apply to non-text domains such as code or image generation where retrieval also matters.
  • Developers could use the categories to diagnose why a particular RAG system underperforms and focus fixes on one stage.
  • The evaluation challenges noted could lead to new metrics that measure retrieval quality separately from generation quality.
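
As one way to picture the last point, retrieval and generation can be scored by separate functions. This is a hedged sketch, not an evaluation method taken from the survey: recall@k and exact match are standard metrics used here as stand-ins, and the function names and sample ids are hypothetical.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Retrieval-side metric: share of relevant ids found in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def exact_match(prediction: str, reference: str) -> bool:
    """Generation-side metric: case- and whitespace-insensitive string equality."""
    return prediction.strip().lower() == reference.strip().lower()


if __name__ == "__main__":
    # Hypothetical ranked ids and gold labels for a single query.
    ranked = ["d2", "d1", "d3"]
    gold = {"d2", "d3"}
    print(recall_at_k(ranked, gold, k=2))
    print(exact_match(" Paris ", "paris"))
```

Reporting the two numbers side by side helps show whether a wrong answer comes from missing evidence or from the generator misusing evidence it was given.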

Load-bearing premise

The four categories fully cover the main influences on RAG performance and the chosen studies adequately represent how the field has advanced.

What would settle it

A new RAG technique whose key performance drivers fall outside all four categories, or a review that omits multiple high-impact recent papers, would show the framework is incomplete.

Figures

Figures reproduced from arXiv: 2404.10981 by Jimmy Huang, Yizheng Huang.

Figure 1: An example of RAG benefits ChatGPT resolves questions that cannot be answered beyond the scope
Figure 2: The unified RAG core concepts with basic workflow.
Figure 3: Taxonomy tree of RAG’s core techniques
Figure 4: An example of a typical RAG framework with iterative retrieval strategy.
Figure 5: Retriever and generator experiment results sourced from eRAG [119] and BERGEN [114].
Original abstract

Retrieval-Augmented Generation (RAG) merges retrieval methods with deep learning advancements to address the static limitations of large language models (LLMs) by enabling the dynamic integration of up-to-date external information. This methodology, focusing primarily on the text domain, provides a cost-effective solution to the generation of plausible but possibly incorrect responses by LLMs, thereby enhancing the accuracy and reliability of their outputs through the use of real-world data. As RAG grows in complexity and incorporates multiple concepts that can influence its performance, this paper organizes the RAG paradigm into four categories: pre-retrieval, retrieval, post-retrieval, and generation, offering a detailed perspective from the retrieval viewpoint. It outlines RAG's evolution and discusses the field's progression through the analysis of significant studies. Additionally, the paper introduces evaluation methods for RAG, addressing the challenges faced and proposing future research directions. By offering an organized framework and categorization, the study aims to consolidate existing research on RAG, clarify its technological underpinnings, and highlight its potential to broaden the adaptability and applications of LLMs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper is a survey on Retrieval-Augmented Generation (RAG) for text-based large language models. It claims that RAG addresses the static knowledge limitations of LLMs by dynamically integrating up-to-date external information via retrieval methods, and organizes the RAG paradigm into four categories (pre-retrieval, retrieval, post-retrieval, and generation) from a retrieval viewpoint. The survey reviews the evolution of RAG through analysis of significant studies, introduces evaluation methods, discusses challenges, and proposes future research directions.

Significance. If the four-category organizational framework accurately reflects the cited literature without major omissions, the survey could serve as a useful reference for consolidating RAG research, clarifying technological components, and guiding future work on improving LLM adaptability and reliability. As a purely organizational synthesis without new derivations or empirical claims, its value lies in the clarity and coverage of the proposed lens rather than in novel technical results.

major comments (1)
  1. [Abstract] The central claim that the four-category framework (pre-retrieval, retrieval, post-retrieval, generation) offers a comprehensive perspective on factors influencing RAG performance is presented without explicit justification, comparison to alternative schemes (e.g., by architecture or task), or discussion of potential omissions such as multi-stage interactions or fine-tuning synergies; this justification is load-bearing for the survey's organizational contribution.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our survey. We address the major comment below and will revise the manuscript accordingly to strengthen the presentation of our organizational framework.

Point-by-point responses
  1. Referee: [Abstract] The central claim that the four-category framework (pre-retrieval, retrieval, post-retrieval, generation) offers a comprehensive perspective on factors influencing RAG performance is presented without explicit justification, comparison to alternative schemes (e.g., by architecture or task), or discussion of potential omissions such as multi-stage interactions or fine-tuning synergies; this justification is load-bearing for the survey's organizational contribution.

    Authors: We agree that the abstract would benefit from explicit justification of the four-category framework. This structure is motivated by the sequential stages inherent to the retrieval process itself, enabling a systematic analysis of performance factors from a retrieval-centric viewpoint as stated in the manuscript. In the revised version, we will expand the abstract and add a short subsection early in the introduction to (1) justify the choice by linking each category to key retrieval-influenced components, (2) briefly contrast it with alternatives such as architecture-based or task-based taxonomies, and (3) acknowledge cross-stage interactions and synergies with fine-tuning while noting that the survey's primary lens remains retrieval-oriented. These additions will make the framework's rationale clearer without altering the core categorization. revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

Full rationale

This is a survey paper whose central contribution is an organizational lens that groups existing RAG literature into four categories (pre-retrieval, retrieval, post-retrieval, generation) from the retrieval viewpoint. No original derivations, equations, parameter fits, predictions, or uniqueness theorems are advanced; the text synthesizes and cites prior studies without any step that reduces by construction to a self-definition, fitted input renamed as prediction, or self-citation chain. The framework is explicitly presented as a viewpoint and synthesis rather than a falsifiable model whose assumptions could be internally circular.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

As a survey the paper introduces no free parameters, mathematical axioms, or new postulated entities; its contribution is the proposed taxonomy of prior work.

pith-pipeline@v0.9.0 · 5714 in / 994 out tokens · 26732 ms · 2026-05-24T02:13:37.345783+00:00 · methodology

Forward citations

Cited by 9 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory

    cs.CL 2025-11 unverdicted novelty 7.0

    Evo-Memory is a new benchmark for self-evolving memory in LLM agents across task streams, with baseline ExpRAG and proposed ReMem method that integrates reasoning, actions, and memory updates for continual improvement.

  2. EHRAG: Bridging Semantic Gaps in Lightweight GraphRAG via Hybrid Hypergraph Construction and Retrieval

    cs.AI 2026-04 unverdicted novelty 6.0

    EHRAG constructs structural hyperedges from sentence co-occurrence and semantic hyperedges from entity embedding clusters, then applies hybrid diffusion plus topic-aware PPR to retrieve top-k documents, outperforming ...

  3. LFRAG: Layout-oriented Fine-grained Retrieval-Augmented Generation on Multimodal Document Understanding

    cs.IR 2026-04 unverdicted novelty 6.0

    LFRAG advances multimodal RAG to block-level retrieval with layout segmentation and cross-attention fusion, reporting SOTA retrieval, 7.20% higher answer accuracy, and 73.07% lower token consumption on the new LFDocQA...

  4. ClusterRAG: Cluster-Based Collaborative Filtering for Personalized Retrieval-Augmented Generation

    cs.IR 2026-04 unverdicted novelty 6.0

    ClusterRAG applies density-based clustering to user profiles for collaborative retrieval in personalized RAG and reports best performance on LaMP tasks by combining target and similar-user profiles.

  5. Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory

    cs.CL 2025-11 unverdicted novelty 6.0

    Evo-Memory is a new streaming benchmark and evaluation framework for self-evolving memory in LLM agents, unifying over ten memory modules and introducing the ReMem pipeline for continual improvement on multi-turn and ...

  6. In-depth Analysis of Graph-based RAG in a Unified Framework

    cs.IR 2025-03 unverdicted novelty 6.0

    A unified framework and large-scale comparison of graph-based RAG methods on QA tasks yields new high-performing variants obtained by recombining existing components.

  7. ArchRAG: Attributed Community-based Hierarchical Retrieval-Augmented Generation

    cs.IR 2025-02 unverdicted novelty 6.0

    ArchRAG proposes attributed-community hierarchical indexing and LLM clustering to improve accuracy and lower token usage in graph-based retrieval-augmented generation.

  8. A Graph-Enhanced Defense Framework for Explainable Fake News Detection with LLM

    cs.CL 2026-04 unverdicted novelty 5.0

    G-Defense builds claim-centered graphs from sub-claims, applies RAG for evidence and competing explanations, then uses graph inference to detect fake news veracity and generate intuitive explanation graphs, claiming S...

  9. Agentic Reasoning for Large Language Models

    cs.AI 2026-01 unverdicted novelty 4.0

    The survey structures agentic reasoning for LLMs into foundational, self-evolving, and collective multi-agent layers while distinguishing in-context orchestration from post-training optimization and reviewing applicat...

Reference graph

Works this paper leans on

175 extracted references · 175 canonical work pages · cited by 8 Pith papers · 16 internal anchors

  1. [1]

    OpenAI Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, and etc. 2023. GPT-4 Technical Report. arXiv (2023)

  2. [2]

    Qingyao Ai, Keping Bi, Jiafeng Guo, and W. Bruce Croft. 2018. Learning a Deep Listwise Context Model for Ranking Refinement. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval. ACM

  3. [3]

    Sunil Arya, David M. Mount, Nathan S. Netanyahu, Ruth Silverman, and Angela Y. Wu. 1998. An Optimal Algorithm for Approximate Nearest Neighbor Searching Fixed Dimensions. J. ACM 45, 6 (1998), 891–923. https://doi.org/10.1145/293347.293348

  4. [4]

    Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi. 2024. Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection. In The Twelfth International Conference on Learning Representations, Vol. abs/2310.11511

  5. [5]

    Moshe Berchansky, Peter Izsak, Avi Caciularu, Ido Dagan, and Moshe Wasserblat. 2023. Optimizing Retrieval-augmented Reader Models via Token Elimination. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 1506–1524

  6. [6]

    Michele Bevilacqua, Giuseppe Ottaviano, Patrick S. H. Lewis, Scott Yih, Sebastian Riedel, and Fabio Petroni. 2022. Autoregressive Search Engines: Generating Substrings as Document Identifiers. In Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, Novemb...

  7. [7]

    Michele Bevilacqua, Giuseppe Ottaviano, Patrick S. H. Lewis, Scott Yih, Sebastian Riedel, and Fabio Petroni. 2022. Autoregressive Search Engines: Generating Substrings as Document Identifiers. In Conference on Neural Information Processing Systems (NeurIPS)

  8. [8]

    Sidney Black, Stella Biderman, Eric Hallahan, Quentin Anthony, Leo Gao, Laurence Golding, Horace He, Connor Leahy, Kyle McDonell, Jason Phang, Michael Pieler, Usvsn Sai Prashanth, Shivanshu Purohit, Laria Reynolds, Jonathan Tow, Ben Wang, and Samuel Weinbach. 2022. GPT-NeoX-20B: An Open-Source Autoregressive Language Model. In Proceedings of BigScience Ep...

  9. [9]

    Rae, Erich Elsen, and Laurent Sifre

    Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann, Trevor Cai, Eliza Rutherford, Katie Millican, George van den Driessche, Jean-Baptiste Lespiau, Bogdan Damoc, Aidan Clark, Diego de Las Casas, Aurelia Guy, Jacob Menick, Roman Ring, Tom Hennigan, Saffron Huang, Loren Maggiore, Chris Jones, Albin Cassirer, Andy Brock, Michela Paganini, Geoffrey Irving, Ori...

  10. [10]

    Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin...

  11. [11]

    Jannis Bulian, Christian Buck, Wojciech Gajewski, Benjamin Börschinger, and Tal Schuster. 2022. Tomayto, Tomahto. Beyond Token-level Answer Equivalence for Question Answering Evaluation. In Conference on Empirical Methods in Natural Language Processing (EMNLP). 291–305

  12. [12]

    Chi-Min Chan, Chunpu Xu, Ruibin Yuan, Hongyin Luo, Wei Xue, Yike Guo, and Jie Fu. 2024. RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation. arXiv abs/2404.00610 (2024)

  13. [13]

    Howard Chen, Ramakanth Pasunuru, Jason Weston, and Asli Celikyilmaz. 2023. Walking Down the Memory Maze: Beyond Context Limit through Interactive Reading. arXiv abs/2310.05029 (2023)

  14. [14]

    Jiawei Chen, Hongyu Lin, Xianpei Han, and Le Sun. 2024. Benchmarking Large Language Models in Retrieval-Augmented Generation. Proceedings of the AAAI Conference on Artificial Intelligence 38, 16 (2024), 17754–17762

  15. [15]

    Jiangui Chen, Ruqing Zhang, Jiafeng Guo, Yixing Fan, and Xueqi Cheng. 2022. GERE: Generative Evidence Retrieval for Fact Verification. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in ...

  16. [16]

    Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Pondé de Oliveira Pinto, Jared Kaplan, Harrison Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavar...

  17. [17]

    Wenhu Chen, Hexiang Hu, Xi Chen, Pat Verga, and William Cohen. 2022. MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)

  18. [18]

    Wenhu Chen, Hexiang Hu, Chitwan Saharia, and William W. Cohen. 2023. Re-Imagen: Retrieval-Augmented Text-to-Image Generator. In International Conference on Learning Representations (ICLR)

  19. [19]

    Zhihong Chen, Feng Jiang, Junying Chen, Tiannan Wang, Fei Yu, Guiming Chen, Hongbo Zhang, Juhao Liang, Chen Zhang, Zhiyi Zhang, Jianquan Li, Xiang Wan, Benyou Wang, and Haizhou Li. 2023. Phoenix: Democratizing ChatGPT across Languages. arXiv abs/2304.10453 (2023)

  20. [20]

    Daixuan Cheng, Shaohan Huang, Junyu Bi, Yuefeng Zhan, Jianfeng Liu, Yujing Wang, Hao Sun, Furu Wei, Weiwei Deng, and Qi Zhang. 2023. UPRISE: Universal Prompt Retrieval for Improving Zero-Shot Evaluation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 12318–12337

  21. [21]

    Xin Cheng, Di Luo, Xiuying Chen, Lemao Liu, Dongyan Zhao, and Rui Yan. 2023. Lift Yourself Up: Retrieval-augmented Text Generation with Self-Memory. In Conference on Neural Information Processing Systems (NeurIPS)

  22. [22]

    Wei-Lin Chiang, Zhuohan Li, Zi Lin, Ying Sheng, Zhanghao Wu, Hao Zhang, Lianmin Zheng, Siyuan Zhuang, Yonghao Zhuang, Joseph E. Gonzalez, Ion Stoica, and Eric P. Xing. 2023. Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality. https://lmsys.org/blog/2023-03-30-vicuna/

  23. [23]

    Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradb...

  24. [24]

    Scaling Instruction-Finetuned Language Models

    Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Yunxuan Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Alex Castro-Ros, Marie Pellat, Kevin Robinson, Dasha Valter, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao, Yanping H...

  25. [25]

    Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, and Veselin Stoyanov. 2020. Unsupervised Cross-lingual Representation Learning at Scale. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational...

  26. [26]

    Florin Cuconasu, Giovanni Trappolini, Federico Siciliano, Simone Filice, Cesare Campagnano, Yoelle Maarek, Nicola Tonellotto, and Fabrizio Silvestri. 2024. The Power of Noise: Redefining Retrieval for RAG Systems. In Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), Vol. abs/2401.14887

  27. [27]

    Zhuyun Dai, Vincent Y. Zhao, Ji Ma, Yi Luan, Jianmo Ni, Jing Lu, Anton Bakalov, Kelvin Guu, Keith B. Hall, and Ming-Wei Chang. 2023. Promptagator: Few-shot Dense Retrieval From 8 Examples. In International Conference on Learning Representations (ICLR)

  28. [28]

    Mayur Datar, Nicole Immorlica, Piotr Indyk, and Vahab S. Mirrokni. 2004. Locality-sensitive hashing scheme based on p-stable distributions. In International Symposium on Computational Geometry (SoCG). 253–262.

  29. [29]

    Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the Conference of the North. Association for Computational Linguistics, 4171–4186

  30. [30]

    Emily Dinan, Stephen Roller, Kurt Shuster, Angela Fan, Michael Auli, and Jason Weston. 2019. Wizard of Wikipedia: Knowledge-Powered Conversational Agents. In International Conference on Learning Representations (ICLR)

  31. [31]

    Zhengxiao Du, Yujie Qian, Xiao Liu, Ming Ding, Jiezhong Qiu, Zhilin Yang, and Jie Tang. 2022. GLM: General Language Model Pretraining with Autoregressive Blank Infilling. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics

  32. [32]

    Hady ElSahar, Pavlos Vougiouklis, Arslen Remaci, Christophe Gravier, Jonathon S. Hare, Frédérique Laforest, and Elena Simperl. 2018. T-REx: A Large Scale Alignment of Natural Language with Knowledge Base Triples. In International Conference on Language Resources and Evaluation (LREC)

  33. [33]

    Shahul ES, Jithin James, Luis Espinosa Anke, and Steven Schockaert. 2023. RAGAs: Automated Evaluation of Retrieval Augmented Generation. Conference of the European Chapter of the Association for Computational Linguistics abs/2309.15217 (2023)

  34. [34]

    Zhangyin Feng, Xiaocheng Feng, Dezhi Zhao, Maojin Yang, and Bing Qin. 2024. Retrieval-Generation Synergy Augmented Large Language Models. In IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. abs/2310.05149. IEEE

  35. [35]

    Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, Charles Foster, Jason Phang, Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, and Connor Leahy. 2021. The Pile: An 800GB Dataset of Diverse Text for Language Modeling. arXiv abs/2101.00027 (2021)

  36. [36]

    Luyu Gao, Xueguang Ma, Jimmy Lin, and Jamie Callan. 2023. Precise Zero-Shot Dense Retrieval without Relevance Labels. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, 1762–1777

  37. [37]

    Tianyu Gao, Xingcheng Yao, and Danqi Chen. 2021. SimCSE: Simple Contrastive Learning of Sentence Embeddings. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 6894–6910

  38. [38]

    Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, Jiawei Sun, Meng Wang, and Haofen Wang. 2023. Retrieval-Augmented Generation for Large Language Models: A Survey. arXiv abs/2312.10997 (2023)

  39. [39]

    Yunfan Gao, Yun Xiong, Meng Wang, and Haofen Wang. 2024. Modular RAG: Transforming RAG Systems into LEGO-like Reconfigurable Frameworks. arXiv (2024)

  40. [40]

    Michael Glass, Gaetano Rossiello, Md Faisal Mahbub Chowdhury, Ankita Naik, Pengshan Cai, and Alfio Gliozzo. 2022. Re2G: Retrieve, Rerank, Generate. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 2701–2715

  41. [41]

    Simon Gottschalk and Elena Demidova. 2018. EventKG: A Multilingual Event-Centric Temporal Knowledge Graph. Springer International Publishing. 272–287 pages

  42. [42]

    Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, and Ming-Wei Chang. 2020. Retrieval Augmented Language Model Pre-Training. In International Conference on Machine Learning (ICML). 3929–3938

  43. [43]

    William L. Hamilton. 2020. Graph representation learning. Springer International Publishing

  44. [44]

    Micheline Hancock-Beaulieu, Mike Gatford, Xiangji Huang, Stephen E. Robertson, Steve Walker, and P. W. Williams

  45. [45]

    Okapi at TREC-5. In Proceedings of The Fifth Text REtrieval Conference, TREC 1996, Gaithersburg, Maryland, USA, November 20-22, 1996 (NIST Special Publication, Vol. 500-238), Ellen M. Voorhees and Donna K. Harman (Eds.). National Institute of Standards and Technology (NIST). http://trec.nist.gov/pubs/trec5/papers/city.procpaper.ps.gz

  46. [46]

    Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, and Jacob Steinhardt. 2021. Measuring Massive Multitask Language Understanding. In International Conference on Learning Representations (ICLR)

  47. [47]

    Enrique Herrera-Viedma, Gabriella Pasi, Antonio G. López-Herrera, and Carlos Porcel. 2006. Evaluating the information quality of Web sites: A methodology based on fuzzy computing with words. Journal of the American Society for Information Science and Technology 57, 4 (2006), 538–549

  48. [48]

    Sebastian Hofstätter, Jiecao Chen, Karthik Raman, and Hamed Zamani. 2023. Fid-light: Efficient and effective retrieval-augmented text generation. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1437–1447

  49. [49]

    Yucheng Hu and Yuxing Lu. 2024. RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing. arXiv abs/2404.19543 (2024)

  50. [50]

    Ziniu Hu, Ahmet Iscen, Chen Sun, Zirui Wang, Kai-Wei Chang, Yizhou Sun, Cordelia Schmid, David A. Ross, and Alireza Fathi. 2023. Reveal: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 23369–23379.

  51. [51]

    Jie Huang, Hanyin Shao, Kevin Chen-Chuan Chang, Jinjun Xiong, and Wen-mei Hwu. 2022. Understanding Jargon: Combining Extraction and Generation for Definition Modeling. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics

  52. [52]

    Jimmy Xiangji Huang, Jun Miao, and Ben He. 2013. High performance query expansion using adaptive co-training. Inf. Process. Manag. 49, 2 (2013), 441–453. https://doi.org/10.1016/J.IPM.2012.08.002

  53. [53]

    Qiushi Huang, Shuai Fu, Xubo Liu, Wenwu Wang, Tom Ko, Yu Zhang, and Lilian H. Y. Tang. 2023. Learning Retrieval Augmentation for Personalized Dialogue Generation. In Conference on Empirical Methods in Natural Language Processing (EMNLP). 2523–2540

  54. [54]

    Wenyu Huang, Mirella Lapata, Pavlos Vougiouklis, Nikos Papasarantopoulos, and Jeff Z Pan. 2023. Retrieval Augmented Generation with Rich Answer Encoding. Proc. of IJCNLP-AACL 2023 (2023)

  55. [55]

    Xiangji Huang and Qinmin Hu. 2009. A bayesian learning approach to promoting diversity in ranking for biomedical information retrieval. In Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2009, Boston, MA, USA, July 19-23, 2009 , James Allan, Javed A. Aslam, Mark Sanderson, Che...

  56. [56]

    Yizheng Huang and Jimmy Huang. 2024. Exploring ChatGPT for Next-generation Information Retrieval: Opportunities and Challenges. CoRR abs/2402.11203 (2024). https://doi.org/10.48550/ARXIV.2402.11203 arXiv:2402.11203

  57. [57]

    Yizheng Huang and Jimmy X. Huang. 2023. Diversified Prior Knowledge Enhanced General Language Model for Biomedical Information Retrieval. In ECAI 2023 - 26th European Conference on Artificial Intelligence, September 30 - October 4, 2023, Kraków, Poland - Including 12th Conference on Prestigious Applications of Intelligent Systems (PAIS 2023) (Frontiers in...

  58. [58]

    Gautier Izacard, Mathilde Caron, Lucas Hosseini, Sebastian Riedel, Piotr Bojanowski, Armand Joulin, and Edouard Grave. 2022. Unsupervised Dense Information Retrieval with Contrastive Learning. Transactions on Machine Learning Research (TMLR) 2022 (2022)

  59. [59]

    Gautier Izacard and Edouard Grave. 2021. Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Association for Computational Linguistics, 874–880

  60. [60]

    Gautier Izacard, Patrick S. H. Lewis, Maria Lomeli, Lucas Hosseini, Fabio Petroni, Timo Schick, Jane Dwivedi-Yu, Armand Joulin, Sebastian Riedel, and Edouard Grave. 2023. Atlas: Few-shot Learning with Retrieval Augmented Language Models. Journal of Machine Learning Research (JMLR) 24 (2023), 251:1–251:43

  61. [61]

    Israt Jahan, Md. Tahmid Rahman Laskar, Chun Peng, and Jimmy Xiangji Huang. 2023. Evaluation of ChatGPT on Biomedical Tasks: A Zero-Shot Comparison with Fine-Tuned Generative Transformers. CoRR abs/2306.04504 (2023). https://doi.org/10.48550/ARXIV.2306.04504 arXiv:2306.04504

  62. [62]

    Bernard J. Jansen, Danielle L. Booth, and Amanda Spink. 2009. Patterns of query reformulation during Web searching. J. Assoc. Inf. Sci. Technol. 60, 7 (2009), 1358–1371. https://doi.org/10.1002/ASI.21071

  63. [63]

    H Jégou, M Douze, and C Schmid. 2011. Product Quantization for Nearest Neighbor Search. IEEE Transactions on Pattern Analysis and Machine Intelligence 33, 1 (2011), 117–128

  64. [64]

    Wenqi Jiang, Marco Zeller, Roger Waleffe, Torsten Hoefler, and Gustavo Alonso. 2023. Chameleon: a Heterogeneous and Disaggregated Accelerator System for Retrieval-Augmented Language Models. arXiv abs/2310.09949 (2023)

  65. [65]

    Wenqi Jiang, Shuai Zhang, Boran Han, Jie Wang, Bernie Wang, and Tim Kraska. 2024. PipeRAG: Fast Retrieval-Augmented Generation via Algorithm-System Co-design. arXiv abs/2403.05676 (2024)

  66. [66]

    Zhengbao Jiang, Frank F. Xu, Luyu Gao, Zhiqing Sun, Qian Liu, Jane Dwivedi-Yu, Yiming Yang, Jamie Callan, and Graham Neubig. 2023. Active Retrieval Augmented Generation. In Conference on Empirical Methods in Natural Language Processing (EMNLP). 7969–7992

  67. [67]

    Di Jin, Eileen Pan, Nassim Oufattole, Wei-Hung Weng, Hanyi Fang, and Peter Szolovits. 2021. What Disease Does This Patient Have? A Large-Scale Open Domain Question Answering Dataset from Medical Exams. Applied Sciences 11, 14 (2021), 6421

  68. [68]

    Qiao Jin, Bhuwan Dhingra, Zhengping Liu, William W. Cohen, and Xinghua Lu. 2019. PubMedQA: A Dataset for Biomedical Research Question Answering. In Conference on Empirical Methods in Natural Language Processing (EMNLP). 2567–2577

  69. [69]

    Jeff Johnson, Matthijs Douze, and Hervé Jégou. 2021. Billion-Scale Similarity Search with GPUs. IEEE Transactions on Big Data 7, 3 (2021), 535–547. https://doi.org/10.1109/TBDATA.2019.2921572

  70. [70]

    Mandar Joshi, Eunsol Choi, Daniel Weld, and Luke Zettlemoyer. 2017. TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, 1601–1611

  71. [71]

    Minki Kang, Jin Myung Kwak, Jinheon Baek, and Sung Ju Hwang. 2023. Knowledge Graph-Augmented Language Models for Knowledge-Grounded Dialogue Generation. arXiv abs/2305.18846 (2023)

  72. [72]

    Vladimir Karpukhin, Barlas Oguz, Sewon Min, Patrick S. H. Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih. 2020. Dense Passage Retrieval for Open-Domain Question Answering. In Conference on Empirical Methods in Natural Language Processing (EMNLP). 6769–6781

  73. [73]

    Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, and Mike Lewis. 2020. Generalization through Memorization: Nearest Neighbor Language Models. In International Conference on Learning Representations (ICLR)

  74. [74]

    Omar Khattab, Keshav Santhanam, Xiang Lisa Li, David Hall, Percy Liang, Christopher Potts, and Matei Zaharia. 2022. Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP. arXiv abs/2212.14024 (2022)

  76. [76]

    Omar Khattab and Matei Zaharia. 2020. ColBERT - Efficient and Effective Passage Search via Contextualized Late Interaction over BERT. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 39–48

  77. [77]

    Sanghoon Kim, Dahyun Kim, Chanjun Park, Wonsung Lee, Wonho Song, Yunsu Kim, Hyeonwoo Kim, Yungi Kim, Hyeonju Lee, Jihoo Kim, Changbae Ahn, Seonghoon Yang, Sukyung Lee, Hyunbyung Park, Gyoungjin Gim, Mikyoung Cha, Hwalsuk Lee, and Sunghun Kim. 2024. SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling. In Proceedings of t...

  78. [78]

    Tom Kwiatkowski, Jennimaria Palomaki, Olivia Redfield, Michael Collins, Ankur Parikh, Chris Alberti, Danielle Epstein, Illia Polosukhin, Jacob Devlin, Kenton Lee, Kristina Toutanova, Llion Jones, Matthew Kelcey, Ming-Wei Chang, Andrew M. Dai, Jakob Uszkoreit, Quoc Le, and Slav Petrov. 2019. Natural Questions: A Benchmark for Question Answering Research. T...

  79. [79]

    Md. Tahmid Rahman Laskar, M. Saiful Bari, Mizanur Rahman, Md Amran Hossen Bhuiyan, Shafiq Joty, and Jimmy Xiangji Huang. 2023. A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets. CoRR abs/2305.18486 (2023). https://doi.org/10.48550/ARXIV.2305.18486 arXiv:2305.18486

  80. [80]

    Md. Tahmid Rahman Laskar, Enamul Hoque, and Jimmy X. Huang. 2020. Query Focused Abstractive Summarization via Incorporating Query Relevance and Transfer Learning with Transformer Models. In Advances in Artificial Intelligence - 33rd Canadian Conference on Artificial Intelligence, Canadian AI 2020, Ottawa, ON, Canada, May 13-15, 2020, Proceedings (Lecture ...

Showing first 80 references.