pith. machine review for the scientific record.

arxiv: 2402.19473 · v6 · submitted 2024-02-29 · 💻 cs.CV

Recognition: 2 theorem links · Lean Theorem

Retrieval-Augmented Generation for AI-Generated Content: A Survey

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 13:27 UTC · model grok-4.3

classification 💻 cs.CV
keywords Retrieval-Augmented Generation · AI-Generated Content · RAG · AIGC · information retrieval · generative models · survey

The pith

RAG integrates retrieval into AI-generated content systems, pulling relevant data at generation time to improve accuracy and robustness.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This survey reviews how retrieval-augmented generation is combined with AI-generated content systems to tackle problems such as stale knowledge, long-tail distributions, data leakage, and high training costs. It organizes existing work by the way retrievers feed information into generators, then covers practical enhancements, cross-modal applications, benchmarks, current limits, and open research questions. A reader would care because the added retrieval step lets generative models draw on external stores instead of relying solely on parameters learned during training.
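The retrieval step described above can be sketched in a few lines. This is an illustrative toy only, not code from the survey: a bag-of-words cosine similarity stands in for a learned retriever, and the prompt assembly shows the common query-side augmentation pattern (retrieve, then prepend context before generation).

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding: a bag-of-words term-frequency vector."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = lambda v: math.sqrt(sum(x * x for x in v.values()))
    denom = norm(a) * norm(b)
    return dot / denom if denom else 0.0

def retrieve(query: str, store: list[str], k: int = 2) -> list[str]:
    """Rank every document in the external store against the query."""
    q = embed(query)
    return sorted(store, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, store: list[str]) -> str:
    """Query-side augmentation: prepend retrieved context to the prompt."""
    context = "\n".join(retrieve(query, store))
    return f"Context:\n{context}\n\nQuestion: {query}"

store = [
    "The Eiffel Tower is 330 metres tall.",
    "Mount Everest is 8849 metres tall.",
    "Paris is the capital of France.",
]
prompt = build_prompt("How tall is the Eiffel Tower?", store)
```

The point of the pattern is visible even at this scale: the generator (whatever model consumes `prompt`) never needs the fact in its parameters, only in the store.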

Core claim

RAG introduces the information retrieval process, which enhances the generation process by retrieving relevant objects from available data stores, leading to higher accuracy and better robustness.

What carries the argument

Classification of RAG foundations according to how the retriever augments the generator, creating a unified view of augmentation methods across different retriever and generator pairs.

If this is right

  • RAG directly mitigates AIGC issues of knowledge updating, long-tail data, and leakage by pulling fresh objects at generation time.
  • Additional enhancement methods make RAG systems easier to engineer and deploy in practice.
  • RAG applications already span multiple modalities and concrete tasks, providing templates for new uses.
  • Existing benchmarks allow systematic measurement of RAG performance and identification of remaining gaps.
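The first bullet's knowledge-updating claim can be made concrete with a small sketch. Everything here is hypothetical (the `DocumentStore` class and its documents are invented for illustration); the mechanism it shows is the general one the survey describes: a new fact becomes retrievable the moment it is indexed, with no retraining.

```python
import re
from collections import Counter

def tokens(text: str) -> Counter:
    return Counter(re.findall(r"\w+", text.lower()))

class DocumentStore:
    """Hypothetical external store: updating knowledge means appending a
    document, never retraining model parameters."""
    def __init__(self):
        self.docs: list[str] = []

    def add(self, doc: str) -> None:
        self.docs.append(doc)

    def best_match(self, query: str) -> str:
        q = tokens(query)
        overlap = lambda d: sum((q & tokens(d)).values())  # multiset intersection
        return max(self.docs, key=overlap, default="")

store = DocumentStore()
store.add("Version 1.0 was released in 2020.")
stale = store.best_match("What is the latest stable release?")

# A new fact arrives: just index it; the generator is untouched.
store.add("The latest stable release is version 2.3.")
fresh = store.best_match("The latest stable release?")
```

Before the update the only retrievable answer mentions version 1.0; after one `add` call the fresher document wins retrieval, which is the whole case for RAG against stale parametric knowledge.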

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • RAG could cut training and inference costs for large models by shifting knowledge storage to external retrieval rather than parameter growth.
  • The same retrieval grounding may reduce hallucinations in generative outputs by anchoring responses to retrieved evidence.
  • Dynamic or real-time RAG variants could be tested in live content pipelines where data stores update continuously.
  • The survey's taxonomy might prompt hybrid retriever-generator designs that combine multiple augmentation styles not yet catalogued.
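One concrete shape the last bullet's hybrid designs could take, offered here as an editorial sketch rather than anything catalogued in the survey, is score fusion: a lexical signal (exact term overlap) combined with a semantic one (here a toy synonym map standing in for an embedding model), weighted by a single parameter `alpha`.

```python
import math
import re
from collections import Counter

def tf(text: str) -> Counter:
    return Counter(re.findall(r"\w+", text.lower()))

def lexical_score(query: str, doc: str) -> float:
    """Exact term overlap with a mild length penalty (BM25 in spirit only)."""
    overlap = sum((tf(query) & tf(doc)).values())
    return overlap / (1 + math.log(1 + sum(tf(doc).values())))

def dense_score(query: str, doc: str, synonyms: dict) -> float:
    """Stand-in for an embedding model: overlap after collapsing synonyms
    onto a shared token, so paraphrases can still match."""
    norm = lambda text: Counter(synonyms.get(t, t) for t in tf(text).elements())
    return float(sum((norm(query) & norm(doc)).values()))

def hybrid_rank(query, docs, synonyms, alpha=0.5):
    """Fuse both signals; alpha balances lexical vs. semantic evidence."""
    fused = lambda d: (alpha * lexical_score(query, d)
                       + (1 - alpha) * dense_score(query, d, synonyms))
    return sorted(docs, key=fused, reverse=True)

docs = [
    "Cheap flights to Rome leave daily.",
    "Inexpensive airfare to Rome is easy to find.",
]
synonyms = {"cheap": "inexpensive", "flights": "airfare"}
ranked = hybrid_rank("cheap flights to Rome", docs, synonyms)
```

The second document scores near zero lexically yet matches well semantically; fusion lets both documents compete on the evidence each retriever sees, which is the appeal of combining augmentation styles.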

Load-bearing premise

The collected literature and proposed classification of augmentation methodologies comprehensively represent the space of RAG-AIGC integrations without significant omissions or overlaps.

What would settle it

Publication of a major RAG-AIGC technique that cannot be placed into any of the survey's augmentation categories would show the classification is incomplete.

read the original abstract

Advancements in model algorithms, the growth of foundational models, and access to high-quality datasets have propelled the evolution of Artificial Intelligence Generated Content (AIGC). Despite its notable successes, AIGC still faces hurdles such as updating knowledge, handling long-tail data, mitigating data leakage, and managing high training and inference costs. Retrieval-Augmented Generation (RAG) has recently emerged as a paradigm to address such challenges. In particular, RAG introduces the information retrieval process, which enhances the generation process by retrieving relevant objects from available data stores, leading to higher accuracy and better robustness. In this paper, we comprehensively review existing efforts that integrate RAG technique into AIGC scenarios. We first classify RAG foundations according to how the retriever augments the generator, distilling the fundamental abstractions of the augmentation methodologies for various retrievers and generators. This unified perspective encompasses all RAG scenarios, illuminating advancements and pivotal technologies that help with potential future progress. We also summarize additional enhancements methods for RAG, facilitating effective engineering and implementation of RAG systems. Then from another view, we survey on practical applications of RAG across different modalities and tasks, offering valuable references for researchers and practitioners. Furthermore, we introduce the benchmarks for RAG, discuss the limitations of current RAG systems, and suggest potential directions for future research. Github: https://github.com/PKU-DAIR/RAG-Survey.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated author's rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript is a survey on integrating Retrieval-Augmented Generation (RAG) with AI-Generated Content (AIGC). It classifies RAG foundations by how retrievers augment generators, distills core augmentation abstractions, summarizes enhancement methods for RAG systems, surveys applications across modalities and tasks, introduces benchmarks, discusses limitations of current RAG systems, and suggests future research directions.

Significance. If the taxonomy and coverage hold, the survey supplies a unified framework for RAG-AIGC work that directly addresses documented AIGC challenges such as knowledge updating, long-tail data, and inference cost. The explicit classification of augmentation methodologies and the inclusion of benchmarks plus future directions make it a practical reference for both researchers and implementers.

minor comments (2)
  1. [Abstract and taxonomy section] The abstract states that the classification 'encompasses all RAG scenarios' but does not list the exact set of retriever-generator pairs examined; adding a short table or enumerated list in the taxonomy section would improve verifiability.
  2. [Abstract] The GitHub link is provided but the manuscript does not indicate whether the repository contains the full reference list, taxonomy diagram source, or benchmark tables; clarifying this would aid reproducibility.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of the manuscript, recognition of its unified framework for RAG-AIGC integration, and recommendation to accept. We are pleased that the taxonomy, coverage of enhancements, applications, benchmarks, and future directions are viewed as providing a practical reference for researchers and implementers.

Circularity Check

0 steps flagged

No significant circularity: survey of external literature only

full rationale

The paper is a descriptive survey that classifies and summarizes existing RAG-AIGC work from external sources. It presents no derivations, equations, fitted parameters, predictions, or theoretical claims that reduce to self-citations or internal definitions. The taxonomy and benchmarks are drawn from cited literature without load-bearing self-referential steps. All content is externally grounded, satisfying the self-contained criterion for score 0.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

As a survey the paper introduces no new free parameters, axioms, or invented entities; it synthesizes prior literature on RAG and AIGC.

pith-pipeline@v0.9.0 · 5580 in / 1015 out tokens · 39941 ms · 2026-05-15T13:27:39.327825+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 20 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. OptiVerse: A Comprehensive Benchmark towards Optimization Problem Solving

    cs.CL 2026-04 unverdicted novelty 7.0

    OptiVerse is a new benchmark spanning neglected optimization domains that shows LLMs suffer sharp accuracy drops on hard problems due to modeling and logic errors, with a Dual-View Auditor Agent proposed to improve pe...

  2. A-MAR: Agent-based Multimodal Art Retrieval for Fine-Grained Artwork Understanding

    cs.AI 2026-04 unverdicted novelty 7.0

    A-MAR decomposes art queries into reasoning plans to condition retrieval, leading to improved explanation quality and multi-step reasoning on art benchmarks compared to baselines.

  3. RNSG: A Range-Aware Graph Index for Efficient Range-Filtered Approximate Nearest Neighbor Search

    cs.DB 2026-03 unverdicted novelty 7.0

    RNSG approximates the range-aware relative neighborhood graph (RRNG) to enable high-performance range-filtered ANN queries with one compact index instead of many.

  4. Chain of Evidence: Pixel-Level Visual Attribution for Iterative Retrieval-Augmented Generation

    cs.CV 2026-05 unverdicted novelty 6.0

    CoE applies vision-language models directly to document screenshots to deliver pixel-level bounding-box attribution for evidence in iterative retrieval-augmented generation, outperforming text baselines on visual-layo...

  5. Efficient Rationale-based Retrieval: On-policy Distillation from Generative Rerankers based on JEPA

    cs.IR 2026-04 unverdicted novelty 6.0

    Rabtriever distills a generative reranker into an efficient bi-encoder using on-policy JEPA to achieve near-reranker accuracy with linear complexity on rationale-based retrieval.

  6. Efficient Rationale-based Retrieval: On-policy Distillation from Generative Rerankers based on JEPA

    cs.IR 2026-04 unverdicted novelty 6.0

    Rabtriever distills a generative reranker into an efficient independent encoder using JEPA and auxiliary reverse KL loss to achieve linear complexity and strong performance on rationale-based retrieval tasks.

  7. Dual-Cluster Memory Agent: Resolving Multi-Paradigm Ambiguity in Optimization Problem Solving

    cs.CL 2026-04 unverdicted novelty 6.0

    DCM-Agent improves LLM performance on multi-paradigm optimization problems by 11-21% via dual-cluster memory construction and dynamic inference guidance.

  8. EHRAG: Bridging Semantic Gaps in Lightweight GraphRAG via Hybrid Hypergraph Construction and Retrieval

    cs.AI 2026-04 unverdicted novelty 6.0

    EHRAG constructs structural hyperedges from sentence co-occurrence and semantic hyperedges from entity embedding clusters, then applies hybrid diffusion plus topic-aware PPR to retrieve top-k documents, outperforming ...

  9. EvoRAG: Making Knowledge Graph-based RAG Automatically Evolve through Feedback-driven Backpropagation

    cs.DB 2026-04 unverdicted novelty 6.0

    EvoRAG adds a feedback-driven backpropagation step that attributes response quality to individual knowledge-graph triplets and updates the graph to raise reasoning accuracy by 7.34 percent over prior KG-RAG methods.

  10. Transforming External Knowledge into Triplets for Enhanced Retrieval in RAG of LLMs

    cs.CL 2026-04 unverdicted novelty 6.0

    Tri-RAG turns external knowledge into Condition-Proof-Conclusion triplets and retrieves via the Condition anchor to improve efficiency and quality in LLM RAG.

  11. Search-o1: Agentic Search-Enhanced Large Reasoning Models

    cs.AI 2025-01 unverdicted novelty 6.0

    Search-o1 integrates agentic retrieval-augmented generation and a Reason-in-Documents module into large reasoning models to dynamically supply missing knowledge and improve performance on complex science, math, coding...

  12. EHR-RAGp: Retrieval-Augmented Prototype-Guided Foundation Model for Electronic Health Records

    cs.IR 2026-05 unverdicted novelty 5.0

    EHR-RAGp is a retrieval-augmented EHR foundation model that employs prototype-guided retrieval to dynamically integrate relevant historical patient context, outperforming prior models on clinical prediction tasks.

  13. Adaptive Query Routing: A Tier-Based Framework for Hybrid Retrieval Across Financial, Legal, and Medical Documents

    cs.IR 2026-04 conditional novelty 5.0

    Tree reasoning outperforms vector search on complex document queries but a hybrid approach balances results across tiers, with validation showing an 11.7-point gap on real finance documents.

  14. Beyond Factual Grounding: The Case for Opinion-Aware Retrieval-Augmented Generation

    cs.AI 2026-04 unverdicted novelty 5.0

    Opinion-aware RAG with LLM opinion extraction and entity-linked graphs improves retrieval diversity by 26-42% over factual baselines on e-commerce forum data.

  15. Plasma GraphRAG: Physics-Grounded Parameter Selection for Gyrokinetic Simulations

    physics.plasm-ph 2026-04 unverdicted novelty 5.0

    Plasma GraphRAG automates physics-grounded parameter selection for gyrokinetic simulations via a domain-specific knowledge graph and LLMs, reporting over 10% better quality and up to 25% fewer hallucinations than stan...

  16. MemOS: A Memory OS for AI System

    cs.CL 2025-07 unverdicted novelty 5.0

    MemOS introduces a unified memory management framework for LLMs using MemCubes to handle and evolve different memory types for improved controllability and evolvability.

  17. Assessment of RAG and Fine-Tuning for Industrial Question-Answering-Applications

    cs.CL 2026-05 unverdicted novelty 4.0

    RAG is more effective and cost-efficient than fine-tuning for industrial QA adaptation on automotive datasets.

  18. From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review

    cs.AI 2025-04 accept novelty 4.0

    A survey consolidating benchmarks, agent frameworks, real-world applications, and protocols for LLM-based autonomous agents into a proposed taxonomy with recommendations for future research.

  19. LightRAG: Simple and Fast Retrieval-Augmented Generation

    cs.IR 2024-10 unverdicted novelty 4.0

    LightRAG builds graph structures into RAG indexing and retrieval with dual-level search and incremental updates to improve accuracy and speed.

  20. Enhancing Large Language Models with Retrieval Augmented Generation for Software Testing and Inspection Automation

    cs.SE 2026-04 unverdicted novelty 3.0

    RAG-enhanced LLMs show generally positive effects on automated test generation and code inspection by supplying supplementary context that reduces hallucinations.

Reference graph

Works this paper leans on

298 extracted references · 298 canonical work pages · cited by 19 Pith papers · 14 internal anchors

  1. [1] T. B. Brown, B. Mann et al., “Language models are few-shot learners,” in NeurIPS, 2020
  2. [2] M. Chen, J. Tworek et al., “Evaluating large language models trained on code,” arXiv:2107.03374, 2021
  3. [3] OpenAI, “GPT-4 technical report,” arXiv:2303.08774, 2023
  4. [4] H. Touvron, T. Lavril et al., “LLaMA: Open and efficient foundation language models,” arXiv:2302.13971, 2023
  5. [5] H. Touvron, L. Martin et al., “Llama 2: Open foundation and fine-tuned chat models,” arXiv:2307.09288, 2023
  6. [6] B. Rozière, J. Gehring et al., “Code Llama: Open foundation models for code,” arXiv:2308.12950, 2023
  7. [7] A. Ramesh, M. Pavlov, G. Goh et al., “Zero-shot text-to-image generation,” in ICML, 2021
  8. [8] A. Ramesh, P. Dhariwal, A. Nichol et al., “Hierarchical text-conditional image generation with CLIP latents,” arXiv:2204.06125, 2022
  9. [9] J. Betker, G. Goh, L. Jing et al., “Improving image generation with better captions,” Computer Science, vol. 2, no. 3, p. 8, 2023
  10. [10] R. Rombach, A. Blattmann, D. Lorenz et al., “High-resolution image synthesis with latent diffusion models,” in IEEE/CVF, 2022
  11. [11] OpenAI, “Video generation models as world simulators,” https://openai.com/research/video-generation-models-as-world-simulators, 2024
  12. [12] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Comput., vol. 9, no. 8, pp. 1735–1780, 1997
  13. [13] A. Vaswani, N. Shazeer, N. Parmar et al., “Attention is all you need,” in NeurIPS, 2017
  14. [14] I. Goodfellow, J. Pouget-Abadie, M. Mirza et al., “Generative adversarial networks,” CACM, vol. 63, no. 11, pp. 139–144, 2020
  15. [15] J. Devlin, M. Chang et al., “BERT: Pre-training of deep bidirectional transformers for language understanding,” in NAACL-HLT, 2019
  16. [16] C. Raffel, N. Shazeer, A. Roberts et al., “Exploring the limits of transfer learning with a unified text-to-text transformer,” JMLR, vol. 21, pp. 140:1–140:67, 2020
  17. [17] W. Fedus, B. Zoph, and N. Shazeer, “Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity,” JMLR, vol. 23, no. 120, pp. 1–39, 2022
  18. [18] J. Kaplan, S. McCandlish, T. Henighan et al., “Scaling laws for neural language models,” 2020
  19. [19] S. E. Robertson and H. Zaragoza, “The probabilistic relevance framework: BM25 and beyond,” FTIR, vol. 3, no. 4, pp. 333–389, 2009
  20. [20] V. Karpukhin, B. Oguz, S. Min et al., “Dense passage retrieval for open-domain question answering,” in EMNLP, 2020
  21. [21] J. Johnson, M. Douze, and H. Jégou, “Billion-scale similarity search with GPUs,” IEEE Trans. Big Data, vol. 7, no. 3, pp. 535–547, 2021
  22. [22] Q. Chen, B. Zhao, H. Wang et al., “SPANN: Highly-efficient billion-scale approximate nearest neighborhood search,” in NeurIPS, 2021
  23. [23] R. Datta, D. Joshi, J. Li et al., “Image retrieval: Ideas, influences, and trends of the new age,” CSUR, vol. 40, no. 2, pp. 5:1–5:60, 2008
  24. [24] A. Radford, J. W. Kim, C. Hallacy et al., “Learning transferable visual models from natural language supervision,” in ICML, 2021
  25. [25] Z. Feng, D. Guo et al., “CodeBERT: A pre-trained model for programming and natural languages,” in EMNLP Findings, 2020
  26. [26] Y. Wu, K. Chen, T. Zhang et al., “Large-scale contrastive language-audio pretraining with feature fusion and keyword-to-caption augmentation,” in ICASSP, 2023
  27. [28] N. Carlini, F. Tramèr et al., “Extracting training data from large language models,” in USENIX, 2021
  28. [29] M. Kang, N. M. Gürel et al., “C-RAG: Certified generation risks for retrieval-augmented language models,” arXiv:2402.03181, 2024
  29. [30] G. Izacard, P. Lewis, M. Lomeli et al., “Atlas: Few-shot learning with retrieval augmented language models,” arXiv:2208.03299, 2022
  30. [31] Y. Wu, M. N. Rabe, D. Hutchins, and C. Szegedy, “Memorizing transformers,” in ICLR, 2022
  31. [32] Z. He, Z. Zhong, T. Cai et al., “REST: Retrieval-based speculative decoding,” arXiv:2311.08252, 2023
  32. [33] K. Guu, K. Lee, Z. Tung et al., “REALM: Retrieval-augmented language model pre-training,” in ICML, 2020
  33. [34] P. S. H. Lewis, E. Perez, A. Piktus et al., “Retrieval-augmented generation for knowledge-intensive NLP tasks,” in NeurIPS, 2020
  34. [35] G. Izacard and E. Grave, “Leveraging passage retrieval with generative models for open domain question answering,” in EACL, 2021
  35. [36] S. Borgeaud, A. Mensch et al., “Improving language models by retrieving from trillions of tokens,” in ICML, 2022
  36. [37] U. Khandelwal, O. Levy, D. Jurafsky et al., “Generalization through memorization: Nearest neighbor language models,” in ICLR, 2020
  37. [38] J. He, G. Neubig, and T. Berg-Kirkpatrick, “Efficient nearest neighbor language models,” in EMNLP, 2021
  38. [39] zilliztech, “GPTCache,” 2023. Available: https://github.com/zilliztech/GPTCache
  39. [40] M. R. Parvez, W. U. Ahmad et al., “Retrieval augmented code generation and summarization,” in EMNLP Findings, 2021
  40. [41] W. U. Ahmad, S. Chakraborty, B. Ray et al., “Unified pre-training for program understanding and generation,” in NAACL-HLT, 2021
  41. [42] S. Zhou, U. Alon, F. F. Xu et al., “DocPrompting: Generating code by retrieving the docs,” in ICLR, 2023
  42. [43] Y. Koizumi, Y. Ohishi et al., “Audio captioning using pre-trained large-scale language model guided by audio-based similar caption retrieval,” arXiv:2012.07331, 2020
  43. [44] R. Huang, J. Huang, D. Yang et al., “Make-An-Audio: Text-to-audio generation with prompt-enhanced diffusion models,” in ICML, 2023
  44. [45] H.-Y. Tseng, H.-Y. Lee et al., “RetrieveGAN: Image synthesis via differentiable patch retrieval,” in ECCV, 2020
  45. [46] S. Sarto, M. Cornia, L. Baraldi, and R. Cucchiara, “Retrieval-augmented transformer for image captioning,” in CBMI, 2022
  46. [47] R. Ramos, B. Martins et al., “SmallCap: Lightweight image captioning prompted with retrieval augmentation,” in CVPR, 2023
  47. [48] J. Chen, Y. Pan, Y. Li et al., “Retrieval augmented convolutional encoder-decoder networks for video captioning,” TOMCCAP, vol. 19, no. 1s, pp. 48:1–48:24, 2023
  48. [49] J. Xu, Y. Huang, J. Hou et al., “Retrieval-augmented egocentric video captioning,” arXiv:2401.00789, 2024
  49. [50] J. Seo, S. Hong et al., “Retrieval-augmented score distillation for text-to-3D generation,” arXiv:2402.02972, 2024
  50. [51] M. Zhang, X. Guo, L. Pan et al., “ReMoDiffuse: Retrieval-augmented motion diffusion model,” in ICCV, 2023
  51. [52] X. Hu, X. Wu, Y. Shu, and Y. Qu, “Logical form generation via multi-task learning for complex question answering over knowledge bases,” in COLING, 2022
  52. [53] X. Huang, J. Kim, and B. Zou, “Unseen entity handling in complex question answering over knowledge base via language generation,” in EMNLP Findings, 2021
  53. [54] R. Das, M. Zaheer, D. Thai et al., “Case-based reasoning for natural language queries over knowledge bases,” in EMNLP, 2021
  54. [55] Z. Wang, W. Nie, Z. Qiao et al., “Retrieval-based controllable molecule generation,” in ICLR, 2022
  55. [56] Q. Jin, Y. Yang, Q. Chen, and Z. Lu, “GeneGPT: Augmenting large language models with domain tools for improved access to biomedical information,” Bioinformatics, vol. 40, no. 2, p. btae075, 2024
  56. [57] H. Li, Y. Su, D. Cai et al., “A survey on retrieval-augmented text generation,” arXiv:2202.01110, 2022
  57. [58] A. Asai, S. Min, Z. Zhong, and D. Chen, “ACL 2023 tutorial: Retrieval-based language models and applications,” ACL, 2023
  58. [59] Y. Gao, Y. Xiong et al., “Retrieval-augmented generation for large language models: A survey,” arXiv:2312.10997, 2023
  59. [60] R. Zhao, H. Chen et al., “Retrieving multimodal information for augmented generation: A survey,” in EMNLP, 2023
  60. [61] Y. Ding, W. Fan et al., “A survey on RAG meets LLMs: Towards retrieval-augmented large language models,” arXiv:2405.06211, 2024
  61. [62] J. Chen, H. Guo, K. Yi et al., “VisualGPT: Data-efficient adaptation of pretrained language models for image captioning,” in CVPR, 2022
  62. [63] Y. Tay, M. Dehghani, D. Bahri, and D. Metzler, “Efficient transformers: A survey,” CSUR, vol. 55, no. 6, pp. 109:1–109:28, 2023
  63. [64] G. V. Houdt et al., “A review on the long short-term memory model,” Artif. Intell. Rev., vol. 53, no. 8, pp. 5929–5955, 2020
  64. [65] L. Yang, Z. Zhang et al., “Diffusion models: A comprehensive survey of methods and applications,” CSUR, vol. 56, no. 4, pp. 1–39, 2023
  65. [66] J. Gui, Z. Sun, Y. Wen et al., “A review on generative adversarial networks: Algorithms, theory, and applications,” TKDE, vol. 35, no. 4, pp. 3313–3332, 2023
  66. [67] S. E. Robertson and S. Walker, “On relevance weights with little relevance information,” in SIGIR, 1997
  67. [68] J. D. Lafferty and C. Zhai, “Document language models, query models, and risk minimization for information retrieval,” in SIGIR, 2001
  68. [69] S. Hershey, S. Chaudhuri et al., “CNN architectures for large-scale audio classification,” in ICASSP, 2017
  69. [70] J. Dong, X. Li, C. Xu et al., “Dual encoding for zero-example video retrieval,” in CVPR, 2019
  70. [71] L. Xiong, C. Xiong, Y. Li et al., “Approximate nearest neighbor negative contrastive learning for dense text retrieval,” in ICLR, 2021
  71. [72] J. L. Bentley, “Multidimensional binary search trees used for associative searching,” CACM, vol. 18, no. 9, pp. 509–517, 1975
  72. [73] W. Li, C. Feng, D. Lian et al., “Learning balanced tree indexes for large-scale vector retrieval,” in SIGKDD, 2023
  73. [74] M. Datar, N. Immorlica, P. Indyk et al., “Locality-sensitive hashing scheme based on p-stable distributions,” in SCG, 2004
  74. [75] Y. A. Malkov and D. A. Yashunin, “Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs,” TPAMI, vol. 42, no. 4, pp. 824–836, 2018
  75. [76] S. Jayaram Subramanya, F. Devvrit et al., “DiskANN: Fast accurate billion-point nearest neighbor search on a single node,” in NeurIPS, 2019
  76. [77] Y. Wang, Y. Hou, H. Wang et al., “A neural corpus indexer for document retrieval,” in NeurIPS, 2022
  77. [78] H. Zhang, Y. Wang, Q. Chen et al., “Model-enhanced vector index,” in NeurIPS, 2023
  78. [79] S. A. Hayati, R. Olivier, P. Avvaru et al., “Retrieval-based neural code generation,” in EMNLP, 2018
  79. [80] J. Zhang, X. Wang, H. Zhang et al., “Retrieval-based neural source code summarization,” in ICSE, 2020
  80. [81] G. Poesia, A. Polozov, V. Le et al., “Synchromesh: Reliable code generation from pre-trained language models,” in ICLR, 2022

Showing first 80 references.