Recognition: 1 theorem link · Lean Theorem
A Survey on the Memory Mechanism of Large Language Model based Agents
Pith reviewed 2026-05-15 07:15 UTC · model grok-4.3
The pith
Memory mechanisms let LLM-based agents handle long-term interactions by storing and retrieving information beyond single prompts.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A systematic review of memory in LLM-based agents reveals common design patterns across previously isolated proposals, yielding a holistic framework that clarifies how memory supports sustained interaction and self-improvement while exposing the limits of current approaches.
What carries the argument
The memory module, which maintains state across interactions to enable agents to remember past experiences, plan over time, and adapt in dynamic environments.
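The store-then-retrieve loop this module implements can be sketched in a few lines. This is a minimal illustration, not the survey's design: the class and method names (`MemoryModule`, `write`, `retrieve`) and the keyword-overlap scoring are assumptions chosen for brevity; real systems typically use embedding similarity.

```python
# Minimal sketch of an agent memory module: write observations
# during interaction, retrieve the most relevant ones later.
# Keyword overlap stands in for embedding similarity.
from dataclasses import dataclass, field

@dataclass
class MemoryModule:
    entries: list = field(default_factory=list)  # (text, step) pairs

    def write(self, text: str, step: int) -> None:
        """Store an observation made at a given interaction step."""
        self.entries.append((text, step))

    def retrieve(self, query: str, k: int = 2) -> list:
        """Return the k stored texts sharing the most words with the query."""
        q = set(query.lower().split())
        scored = sorted(
            self.entries,
            key=lambda e: len(q & set(e[0].lower().split())),
            reverse=True,
        )
        return [text for text, _ in scored[:k]]

mem = MemoryModule()
mem.write("user prefers concise answers", step=1)
mem.write("user is planning a trip to Kyoto", step=2)
mem.write("weather API returned an error", step=3)
print(mem.retrieve("trip to Kyoto"))
```

The point of the sketch is the interface, not the scoring: anything written at step t remains retrievable at step t+n, which is exactly the state persistence a bare prompt lacks.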
Load-bearing premise
The papers chosen for review represent the main ideas in the field, and the proposed categories capture genuinely reusable design patterns that will usefully shape later work.
What would settle it
A new memory mechanism that works well in practice yet fits none of the survey's categories, or tests showing that following the abstracted patterns does not improve agent performance on long-horizon tasks.
Original abstract
Large language model (LLM) based agents have recently attracted much attention from the research and industry communities. Compared with original LLMs, LLM-based agents are featured in their self-evolving capability, which is the basis for solving real-world problems that need long-term and complex agent-environment interactions. The key component to support agent-environment interactions is the memory of the agents. While previous studies have proposed many promising memory mechanisms, they are scattered in different papers, and there lacks a systematical review to summarize and compare these works from a holistic perspective, failing to abstract common and effective designing patterns for inspiring future studies. To bridge this gap, in this paper, we propose a comprehensive survey on the memory mechanism of LLM-based agents. In specific, we first discuss ''what is'' and ''why do we need'' the memory in LLM-based agents. Then, we systematically review previous studies on how to design and evaluate the memory module. In addition, we also present many agent applications, where the memory module plays an important role. At last, we analyze the limitations of existing work and show important future directions. To keep up with the latest advances in this field, we create a repository at \url{https://github.com/nuster1128/LLM_Agent_Memory_Survey}.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a survey on memory mechanisms for LLM-based agents, claiming to be the first systematic review. It defines memory and its necessity, reviews prior work on design and evaluation, discusses applications where memory is key, analyzes limitations, and proposes future directions, with an accompanying GitHub repository for updates.
Significance. If the coverage and taxonomy hold, the survey would consolidate scattered literature on a critical component for long-term agent-environment interactions and abstract common design patterns to guide future work. The public GitHub repository for ongoing updates is a clear strength supporting reproducibility and timeliness.
Major comments (2)
- [Introduction] The claim to provide the 'first systematic review' and to 'abstract common and effective designing patterns' is load-bearing but unsupported without an explicit literature search methodology (databases, keywords, date range, inclusion/exclusion criteria). This omission prevents verification of representativeness and risks the taxonomy being incomplete or non-reproducible.
- [Design review section] (likely §3): The proposed categorization of memory mechanisms must include a transparent mapping from individual reviewed papers to each category, with justification for boundaries between types; without this, the abstraction of 'common designing patterns' cannot be evaluated as systematic rather than ad hoc.
Minor comments (2)
- [Abstract] The GitHub URL is referenced but should be written out fully in the abstract for immediate accessibility.
- [Limitations and future directions] Recommendations for future work would be strengthened by tying each suggestion directly to a specific gap identified in the reviewed literature rather than remaining high-level.
Simulated Author's Rebuttal
We thank the referee for the positive evaluation and the recommendation of minor revision. The comments highlight important areas for improving the transparency and reproducibility of our survey, which we will address in the revised manuscript.
Point-by-point responses
-
Referee: [Introduction] The claim to provide the 'first systematic review' and to 'abstract common and effective designing patterns' is load-bearing but unsupported without an explicit literature search methodology (databases, keywords, date range, inclusion/exclusion criteria). This omission prevents verification of representativeness and risks the taxonomy being incomplete or non-reproducible.
Authors: We agree that an explicit literature search methodology is required to substantiate the claim of a systematic review. In the revised version, we will insert a dedicated subsection (Section 2.1) that details the search protocol: databases (arXiv, Google Scholar, ACL Anthology), keywords and Boolean strings (e.g., “LLM-based agent” AND “memory mechanism”), date range (January 2022–March 2024), and inclusion/exclusion criteria (peer-reviewed or preprint papers that propose or evaluate memory modules for LLM agents). This addition will allow readers to assess coverage and reproducibility. revision: yes
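The promised search protocol is mechanical enough to sketch directly. The candidate records below are hypothetical placeholders; only the criteria (date range January 2022–March 2024 and memory/agent keywords) come from the response above.

```python
# Sketch of the inclusion criteria from the proposed Section 2.1:
# keep candidates inside the stated date range whose titles match
# the memory-for-LLM-agents keywords. Records are hypothetical.
from datetime import date

DATE_FROM, DATE_TO = date(2022, 1, 1), date(2024, 3, 31)
KEYWORDS = ("memory mechanism", "llm-based agent", "memory module")

def include(title: str, published: date) -> bool:
    """Apply the date-range and keyword inclusion criteria."""
    in_range = DATE_FROM <= published <= DATE_TO
    on_topic = any(k in title.lower() for k in KEYWORDS)
    return in_range and on_topic

candidates = [
    ("MemoryBank-style long-term memory module for agents", date(2023, 5, 20)),
    ("Image diffusion models at scale", date(2023, 7, 1)),
    ("Early memory mechanism study", date(2021, 6, 1)),
]
kept = [t for t, d in candidates if include(t, d)]
print(kept)  # only the first candidate passes both filters
```

Writing the criteria as a predicate like this is what makes the review auditable: any reader can rerun the filter over the same candidate pool and check the paper set.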
-
Referee: [Design review section] Section on design review (likely §3): The proposed categorization of memory mechanisms must include a transparent mapping from individual reviewed papers to each category, with justification for boundaries between types; without this, the abstraction of 'common designing patterns' cannot be evaluated as systematic rather than ad-hoc.
Authors: We accept that a transparent mapping is necessary for the taxonomy to be evaluated as systematic. We will add a new table (Table 1 in Section 3) that enumerates every reviewed paper, assigns it to one or more memory categories, and provides a short justification for the assignment together with explicit boundary criteria (e.g., persistence duration, retrieval mechanism, update frequency). An accompanying appendix will list the full references for cross-checking. revision: yes
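The mapping table the authors promise can be represented as a record per paper along the three boundary axes named in the response (persistence duration, retrieval mechanism, update frequency). The paper names and axis values below are hypothetical placeholders, not rows from the survey.

```python
# Sketch of the promised Table 1: each reviewed paper is tagged
# along the three boundary axes from the rebuttal. Two papers
# share a category iff they agree on all three axes.
from dataclasses import dataclass

@dataclass(frozen=True)
class MemoryCategory:
    persistence: str  # e.g. "session" vs "long-term"
    retrieval: str    # e.g. "embedding", "keyword", "none"
    update: str       # e.g. "per-turn", "periodic"

TAXONOMY = {
    "Paper A": MemoryCategory("long-term", "embedding", "per-turn"),
    "Paper B": MemoryCategory("session", "none", "per-turn"),
}

def same_category(p1: str, p2: str) -> bool:
    """Boundary check: category equality means agreement on every axis."""
    return TAXONOMY[p1] == TAXONOMY[p2]

print(same_category("Paper A", "Paper B"))  # False: they differ on two axes
```

Making the axes explicit like this is what turns the taxonomy from an ad hoc grouping into a testable one: a disputed assignment reduces to a disagreement about a specific axis value.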
Circularity Check
No significant circularity
Full rationale
The paper is a literature survey on memory mechanisms for LLM-based agents. It defines the topic, reviews design and evaluation approaches from prior work, discusses applications, and outlines limitations and future directions without any equations, derivations, fitted parameters, or predictive claims. No steps reduce by construction to self-citations, definitions, or inputs; the survey structure and GitHub repository are independent of the reviewed content. The central claim of providing a holistic summary is externally verifiable against the cited papers and does not rely on internal circular reasoning.
Axiom & Free-Parameter Ledger
Forward citations
Cited by 21 Pith papers
-
MEME: Multi-entity & Evolving Memory Evaluation
All tested LLM memory systems fail at dependency reasoning in multi-entity evolving scenarios, with only an expensive file-based setup showing partial recovery.
-
Goal-Oriented Reasoning for RAG-based Memory in Conversational Agentic LLM Systems
Goal-Mem improves RAG memory retrieval in agentic LLMs by explicit goal decomposition and backward chaining via Natural Language Logic, outperforming nine baselines on multi-hop and implicit inference tasks.
-
Remember the Decision, Not the Description: A Rate-Distortion Framework for Agent Memory
Memory for long-horizon agents should preserve distinctions that affect decisions under a fixed budget, not descriptive features, yielding an exact forgetting boundary and a new online learner DeMem with regret guarantees.
-
Learning How and What to Memorize: Cognition-Inspired Two-Stage Optimization for Evolving Memory
MemCoE learns memory organization guidelines via contrastive feedback and then trains a guideline-aligned RL policy for memory updates, yielding consistent gains on personalization benchmarks.
-
AEL: Agent Evolving Learning for Open-Ended Environments
AEL uses a fast-timescale bandit for memory policy selection and slow-timescale LLM reflection for causal insights, achieving a Sharpe ratio of 2.13 on a 208-episode portfolio benchmark while showing that added mechan...
-
Automated Design of Agentic Systems
Meta Agent Search uses a meta-agent to iteratively program novel agentic systems in code, producing agents that outperform state-of-the-art hand-designed ones across coding, science, and math while transferring across...
-
Grounded Continuation: A Linear-Time Runtime Verifier for LLM Conversations
A hybrid LLM-symbolic verifier maintains a dependency graph over conversation turns classified into eight formal update operations, enabling linear-time groundedness checks and precise retraction propagation with a co...
-
CHAL: Council of Hierarchical Agentic Language
CHAL is a multi-agent dialectic system that performs structured belief optimization over defeasible domains using Bayesian-inspired graph representations and configurable meta-cognitive value system hyperparameters.
-
SkillLens: Adaptive Multi-Granularity Skill Reuse for Cost-Efficient LLM Agents
SkillLens organizes skills into policies-strategies-procedures-primitives layers, retrieves via degree-corrected random walk, and uses a verifier for local adaptation, yielding up to 6.31 pp gains on MuLocbench and ra...
-
HiGMem: A Hierarchical and LLM-Guided Memory System for Long-Term Conversational Agents
HiGMem combines hierarchical event-turn memory with LLM-guided selection to retrieve concise relevant evidence from long dialogues, improving F1 scores and cutting retrieved turns by an order of magnitude on the LoCoM...
-
TSUBASA: Improving Long-Horizon Personalization via Evolving Memory and Self-Learning with Context Distillation
TSUBASA improves long-horizon personalization in LLMs via dynamic memory evolution for writing and context-distillation self-learning for reading, outperforming Mem0 and Memory-R1 on Qwen-3 benchmarks while reducing t...
-
From Agent Loops to Deterministic Graphs: Execution Lineage for Reproducible AI-Native Work
Execution lineage models AI-native work as a DAG of computations with explicit dependencies, achieving perfect state preservation in controlled update tasks where loop-based agents introduce churn and contamination.
-
MemReranker: Reasoning-Aware Reranking for Agent Memory Retrieval
MemReranker applies multi-stage distillation to Qwen3-Reranker to produce reasoning-aware rerankers that outperform baselines on memory tasks with temporal and causal constraints.
-
GRAVITY: Architecture-Agnostic Structured Anchoring for Long-Horizon Conversational Memory
GRAVITY adds structured relational, temporal, and thematic memory anchors to conversational LLMs at generation time, delivering 7.5-10.1% average gains in LLM-judge accuracy across five host systems on LongMemEval and LoCoMo.
-
From Coarse to Fine: Self-Adaptive Hierarchical Planning for LLM Agents
AdaPlan-H enables LLM agents to generate self-adaptive hierarchical plans that adjust detail level to task difficulty, improving success rates in multi-step tasks.
-
Transferable Expertise for Autonomous Agents via Real-World Case-Based Learning
A case-based learning framework extracts reusable knowledge from past tasks to improve LLM agents' structured performance on complex real-world tasks, outperforming standard prompting baselines especially as task comp...
-
MemReranker: Reasoning-Aware Reranking for Agent Memory Retrieval
MemReranker applies multi-teacher pairwise distillation, BCE pointwise training, and InfoNCE contrastive learning on mixed general and memory-specific dialogue data to produce efficient rerankers that improve calibrat...
-
Memory as Metabolism: A Design for Companion Knowledge Systems
This paper designs a companion knowledge system with TRIAGE, DECAY, CONTEXTUALIZE, CONSOLIDATE, and AUDIT operations plus memory gravity and minority-hypothesis retention to give contradictory evidence a path to updat...
-
A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Super Intelligence
The paper delivers the first systematic review of self-evolving agents, structured around what components evolve, when adaptation occurs, and how it is implemented.
-
From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review
A survey consolidating benchmarks, agent frameworks, real-world applications, and protocols for LLM-based autonomous agents into a proposed taxonomy with recommendations for future research.
-
Multi-Agent Collaboration Mechanisms: A Survey of LLMs
The survey organizes LLM-based multi-agent collaboration mechanisms into a framework with dimensions of actors, types, structures, strategies, and coordination protocols, reviews applications across domains, and ident...
Reference graph
Works this paper leans on
-
[1]
ChatDev: Communicative Agents for Software Development
Chen Qian, Xin Cong, Cheng Yang, Weize Chen, Yusheng Su, Juyuan Xu, Zhiyuan Liu, and Maosong Sun. Communicative agents for software development. arXiv preprint arXiv:2307.07924, 2023
-
[2]
S3: Social-network simulation system with large language model-empowered agents
Chen Gao, Xiaochong Lan, Zhihong Lu, Jinzhu Mao, Jinghua Piao, Huandong Wang, Depeng Jin, and Yong Li. S3: Social-network simulation system with large language model-empowered agents. arXiv preprint arXiv:2307.14984, 2023
-
[3]
A Survey on Large Language Model based Autonomous Agents
Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, Jiakai Tang, Xu Chen, Yankai Lin, et al. A survey on large language model based autonomous agents. arXiv preprint arXiv:2308.11432, 2023
-
[4]
The Rise and Potential of Large Language Model Based Agents: A Survey
Zhiheng Xi, Wenxiang Chen, Xin Guo, Wei He, Yiwen Ding, Boyang Hong, Ming Zhang, Junzhe Wang, Senjie Jin, Enyu Zhou, et al. The rise and potential of large language model based agents: A survey. arXiv preprint arXiv:2309.07864, 2023
-
[5]
Reflexion: Language agents with verbal reinforcement learning
Noah Shinn, Federico Cassano, Ashwin Gopinath, Karthik R Narasimhan, and Shunyu Yao. Reflexion: Language agents with verbal reinforcement learning. In Thirty-seventh Conference on Neural Information Processing Systems, 2023
-
[6]
Memorybank: Enhancing large language models with long-term memory
Wanjun Zhong, Lianghong Guo, Qiqi Gao, and Yanlin Wang. Memorybank: Enhancing large language models with long-term memory. arXiv preprint arXiv:2305.10250, 2023
-
[7]
Ret-llm: Towards a general read-write memory for large language models
Ali Modarressi, Ayyoob Imani, Mohsen Fayyaz, and Hinrich Schütze. Ret-llm: Towards a general read-write memory for large language models. arXiv preprint arXiv:2305.14322, 2023
-
[8]
Instruction tuning for large language models: A survey
Shengyu Zhang, Linfeng Dong, Xiaoya Li, Sen Zhang, Xiaofei Sun, Shuhe Wang, Jiwei Li, Runyi Hu, Tianwei Zhang, Fei Wu, et al. Instruction tuning for large language models: A survey. arXiv preprint arXiv:2308.10792, 2023
-
[9]
Large language model alignment: A survey
Tianhao Shen, Renren Jin, Yufei Huang, Chuang Liu, Weilong Dong, Zishan Guo, Xinwei Wu, Yan Liu, and Deyi Xiong. Large language model alignment: A survey. arXiv preprint arXiv:2309.15025, 2023
-
[10]
Aligning large language models with human: A survey
Yufei Wang, Wanjun Zhong, Liangyou Li, Fei Mi, Xingshan Zeng, Wenyong Huang, Lifeng Shang, Xin Jiang, and Qun Liu. Aligning large language models with human: A survey. arXiv preprint arXiv:2307.12966, 2023
-
[11]
Trustworthy llms: a survey and guideline for evaluating large language models’ alignment
Yang Liu, Yuanshun Yao, Jean-Francois Ton, Xiaoying Zhang, Ruocheng Guo, Hao Cheng, Yegor Klochkov, Muhammad Faaiz Taufiq, and Hang Li. Trustworthy llms: a survey and guideline for evaluating large language models’ alignment. arXiv preprint arXiv:2308.05374, 2023
-
[12]
Retrieval-Augmented Generation for Large Language Models: A Survey
Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, Jiawei Sun, and Haofen Wang. Retrieval-augmented generation for large language models: A survey. arXiv preprint arXiv:2312.10997, 2023
-
[13]
Knowledge editing for large language models: A survey
Song Wang, Yaochen Zhu, Haochen Liu, Zaiyi Zheng, Chen Chen, et al. Knowledge editing for large language models: A survey. arXiv preprint arXiv:2310.16218, 2023
-
[14]
Editing Large Language Models: Problems, Methods, and Opportunities
Yunzhi Yao, Peng Wang, Bozhong Tian, Siyuan Cheng, Zhoubo Li, Shumin Deng, Huajun Chen, and Ningyu Zhang. Editing large language models: Problems, methods, and opportunities. arXiv preprint arXiv:2305.13172, 2023
-
[15]
Easyedit: An easy-to-use knowledge editing framework for large language models
Peng Wang, Ningyu Zhang, Xin Xie, Yunzhi Yao, Bozhong Tian, Mengru Wang, Zekun Xi, Siyuan Cheng, Kangwei Liu, Guozhou Zheng, et al. Easyedit: An easy-to-use knowledge editing framework for large language models. arXiv preprint arXiv:2308.07269, 2023
-
[16]
Trends in integration of knowledge and large language models: A survey and taxonomy of methods, benchmarks, and applications
Zhangyin Feng, Weitao Ma, Weijiang Yu, Lei Huang, Haotian Wang, Qianglong Chen, Weihua Peng, Xiaocheng Feng, Bing Qin, et al. Trends in integration of knowledge and large language models: A survey and taxonomy of methods, benchmarks, and applications. arXiv preprint arXiv:2311.05876, 2023
-
[17]
A Comprehensive Study of Knowledge Editing for Large Language Models
Ningyu Zhang, Yunzhi Yao, Bozhong Tian, Peng Wang, Shumin Deng, Mengru Wang, Zekun Xi, Shengyu Mao, Jintian Zhang, Yuansheng Ni, et al. A comprehensive study of knowledge editing for large language models. arXiv preprint arXiv:2401.01286, 2024
-
[18]
Tool learning with foundation models
Yujia Qin, Shengding Hu, Yankai Lin, Weize Chen, Ning Ding, Ganqu Cui, Zheni Zeng, Yufei Huang, Chaojun Xiao, Chi Han, et al. Tool learning with foundation models. arXiv preprint arXiv:2304.08354, 2023
-
[19]
Advancing transformer architecture in long-context large language models: A comprehensive survey
Yunpeng Huang, Jingwei Xu, Zixu Jiang, Junyu Lai, Zenan Li, Yuan Yao, Taolue Chen, Lijuan Yang, Zhou Xin, and Xiaoxing Ma. Advancing transformer architecture in long-context large language models: A comprehensive survey. arXiv preprint arXiv:2311.12351, 2023
-
[20]
Beyond the limits: A survey of techniques to extend the context length in large language models
Xindi Wang, Mahsa Salmani, Parsa Omidi, Xiangyu Ren, Mehdi Rezagholizadeh, and Armaghan Eshaghi. Beyond the limits: A survey of techniques to extend the context length in large language models. arXiv preprint arXiv:2402.02244, 2024
-
[21]
The what, why, and how of context length extension techniques in large language models–a detailed survey
Saurav Pawar, SM Tonmoy, SM Zaman, Vinija Jain, Aman Chadha, and Amitava Das. The what, why, and how of context length extension techniques in large language models–a detailed survey. arXiv preprint arXiv:2401.07872, 2024
-
[22]
Multimodal large language models: A survey
Jiayang Wu, Wensheng Gan, Zefeng Chen, Shicheng Wan, and S Yu Philip. Multimodal large language models: A survey. In 2023 IEEE International Conference on Big Data (BigData), pages 2247–2256. IEEE, 2023
-
[23]
How to bridge the gap between modalities: A comprehensive survey on multimodal large language model
Shezheng Song, Xiaopeng Li, and Shasha Li. How to bridge the gap between modalities: A comprehensive survey on multimodal large language model. arXiv preprint arXiv:2311.07594, 2023
-
[24]
The (r) evolution of multimodal large language models: A survey
Davide Caffagni, Federico Cocchi, Luca Barsellotti, Nicholas Moratelli, Sara Sarto, Lorenzo Baraldi, Marcella Cornia, and Rita Cucchiara. The (r) evolution of multimodal large language models: A survey. arXiv preprint arXiv:2402.12451, 2024
-
[25]
A survey on multimodal large language models
Shukang Yin, Chaoyou Fu, Sirui Zhao, Ke Li, Xing Sun, Tong Xu, and Enhong Chen. A survey on multimodal large language models. arXiv preprint arXiv:2306.13549, 2023
-
[26]
Beyond efficiency: A systematic survey of resource-efficient large language models
Guangji Bai, Zheng Chai, Chen Ling, Shiyu Wang, Jiaying Lu, Nan Zhang, Tingwei Shi, Ziyang Yu, Mengdan Zhu, Yifei Zhang, et al. Beyond efficiency: A systematic survey of resource-efficient large language models. arXiv preprint arXiv:2401.00625, 2024
-
[27]
Efficient large language models: A survey
Zhongwei Wan, Xin Wang, Che Liu, Samiul Alam, Yu Zheng, Zhongnan Qu, Shen Yan, Yi Zhu, Quanlu Zhang, Mosharaf Chowdhury, et al. Efficient large language models: A survey. arXiv preprint arXiv:2312.03863, 1, 2023
-
[28]
Towards efficient generative large language model serving: A survey from algorithms to systems
Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Hongyi Jin, Tianqi Chen, and Zhihao Jia. Towards efficient generative large language model serving: A survey from algorithms to systems. arXiv preprint arXiv:2312.15234, 2023
-
[29]
Parameter-efficient fine-tuning methods for pretrained language models: A critical review and assessment
Lingling Xu, Haoran Xie, Si-Zhao Joe Qin, Xiaohui Tao, and Fu Lee Wang. Parameter-efficient fine-tuning methods for pretrained language models: A critical review and assessment. arXiv preprint arXiv:2312.12148, 2023
-
[30]
A survey on model compression for large language models
Xunyu Zhu, Jian Li, Yong Liu, Can Ma, and Weiping Wang. A survey on model compression for large language models. arXiv preprint arXiv:2308.07633, 2023
-
[31]
A survey on model compression and acceleration for pretrained language models
Canwen Xu and Julian McAuley. A survey on model compression and acceleration for pretrained language models. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 10566–10575, 2023
-
[32]
Model compression and efficient inference for large language models: A survey
Wenxiao Wang, Wei Chen, Yicong Luo, Yongliu Long, Zhengkai Lin, Liye Zhang, Binbin Lin, Deng Cai, and Xiaofei He. Model compression and efficient inference for large language models: A survey. arXiv preprint arXiv:2402.09748, 2024
-
[33]
A comprehensive survey of compression algorithms for language models
Seungcheol Park, Jaehyeon Choi, Sojin Lee, and U Kang. A comprehensive survey of compression algorithms for language models. arXiv preprint arXiv:2401.15347, 2024
-
[34]
A survey on evaluation of large language models
Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu, Linyi Yang, Kaijie Zhu, Hao Chen, Xiaoyuan Yi, Cunxiang Wang, Yidong Wang, et al. A survey on evaluation of large language models. ACM Transactions on Intelligent Systems and Technology, 2023
-
[35]
Evaluating large language models: A comprehensive survey
Zishan Guo, Renren Jin, Chuang Liu, Yufei Huang, Dan Shi, Linhao Yu, Yan Liu, Jiaxuan Li, Bojian Xiong, Deyi Xiong, et al. Evaluating large language models: A comprehensive survey. arXiv preprint arXiv:2310.19736, 2023
-
[36]
Harnessing the power of llms in practice: A survey on chatgpt and beyond
Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, and Xia Hu. Harnessing the power of llms in practice: A survey on chatgpt and beyond. arXiv preprint arXiv:2304.13712, 2023
-
[37]
Large language models for information retrieval: A survey
Yutao Zhu, Huaying Yuan, Shuting Wang, Jiongnan Liu, Wenhan Liu, Chenlong Deng, Zhicheng Dou, and Ji-Rong Wen. Large language models for information retrieval: A survey. arXiv preprint arXiv:2308.07107, 2023
-
[38]
Large language models for generative information extraction: A survey
Derong Xu, Wei Chen, Wenjun Peng, Chao Zhang, Tong Xu, Xiangyu Zhao, Xian Wu, Yefeng Zheng, and Enhong Chen. Large language models for generative information extraction: A survey. arXiv preprint arXiv:2312.17617, 2023
-
[39]
Large language models for software engineering: Survey and open problems
Angela Fan, Beliz Gokkaya, Mark Harman, Mitya Lyubarskiy, Shubho Sengupta, Shin Yoo, and Jie M Zhang. Large language models for software engineering: Survey and open problems. arXiv preprint arXiv:2310.03533, 2023
-
[40]
Software testing with large language models: Survey, landscape, and vision
Junjie Wang, Yuchao Huang, Chunyang Chen, Zhe Liu, Song Wang, and Qing Wang. Software testing with large language models: Survey, landscape, and vision. IEEE Transactions on Software Engineering, 2024
-
[41]
A survey of large language models for code: Evolution, benchmarking, and future trends
Zibin Zheng, Kaiwen Ning, Yanlin Wang, Jingwen Zhang, Dewu Zheng, Mingxi Ye, and Jiachi Chen. A survey of large language models for code: Evolution, benchmarking, and future trends. arXiv preprint arXiv:2311.10372, 2023
-
[42]
Large language models for robotics: A survey
Fanlong Zeng, Wensheng Gan, Yongheng Wang, Ning Liu, and Philip S Yu. Large language models for robotics: A survey. arXiv preprint arXiv:2311.07226, 2023
-
[43]
A survey on multimodal large language models for autonomous driving
Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Yang Zhou, Kaizhao Liang, Jintai Chen, Juanwu Lu, Zichong Yang, Kuei-Da Liao, et al. A survey on multimodal large language models for autonomous driving. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 958–979, 2024
-
[44]
A survey of large language models for autonomous driving
Zhenjie Yang, Xiaosong Jia, Hongyang Li, and Junchi Yan. A survey of large language models for autonomous driving. arXiv preprint arXiv:2311.01043, 2023
-
[45]
A survey of large language models for healthcare: from data, technology, and applications to accountability and ethics
Kai He, Rui Mao, Qika Lin, Yucheng Ruan, Xiang Lan, Mengling Feng, and Erik Cambria. A survey of large language models for healthcare: from data, technology, and applications to accountability and ethics. arXiv preprint arXiv:2310.05694, 2023
-
[46]
A survey of large language models in medicine: Progress, application, and challenge
Hongjian Zhou, Boyang Gu, Xinyu Zou, Yiru Li, Sam S Chen, Peilin Zhou, Junling Liu, Yining Hua, Chengfeng Mao, Xian Wu, et al. A survey of large language models in medicine: Progress, application, and challenge. arXiv preprint arXiv:2311.05112, 2023
-
[47]
Pre-trained language models in biomedical domain: A systematic survey
Benyou Wang, Qianqian Xie, Jiahuan Pei, Zhihong Chen, Prayag Tiwari, Zhao Li, and Jie Fu. Pre-trained language models in biomedical domain: A systematic survey. ACM Computing Surveys, 56(3):1–52, 2023
-
[48]
Large language models in finance: A survey
Yinheng Li, Shaofei Wang, Han Ding, and Hang Chen. Large language models in finance: A survey. In Proceedings of the Fourth ACM International Conference on AI in Finance, pages 374–382, 2023
-
[49]
Towards a psychological generalist ai: A survey of current applications of large language models and future prospects
Tianyu He, Guanghui Fu, Yijing Yu, Fan Wang, Jianqiang Li, Qing Zhao, Changwei Song, Hongzhi Qi, Dan Luo, Huijing Zou, et al. Towards a psychological generalist ai: A survey of current applications of large language models and future prospects. arXiv preprint arXiv:2312.04578, 2023
-
[50]
Large language models for generative recommendation: A survey and visionary discussions
Lei Li, Yongfeng Zhang, Dugang Liu, and Li Chen. Large language models for generative recommendation: A survey and visionary discussions. arXiv preprint arXiv:2309.01157, 2023
-
[51]
How can recommender systems benefit from large language models: A survey
Jianghao Lin, Xinyi Dai, Yunjia Xi, Weiwen Liu, Bo Chen, Xiangyang Li, Chenxu Zhu, Huifeng Guo, Yong Yu, Ruiming Tang, et al. How can recommender systems benefit from large language models: A survey. arXiv preprint arXiv:2306.05817, 2023
-
[52]
Generative recommendation: Towards next-generation recommender paradigm
Wenjie Wang, Xinyu Lin, Fuli Feng, Xiangnan He, and Tat-Seng Chua. Generative recommendation: Towards next-generation recommender paradigm. arXiv preprint arXiv:2304.03516, 2023
-
[53]
Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models
Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, et al. Siren’s song in the ai ocean: a survey on hallucination in large language models. arXiv preprint arXiv:2309.01219, 2023
-
[54]
A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions
Lei Huang, Weijiang Yu, Weitao Ma, Weihong Zhong, Zhangyin Feng, Haotian Wang, Qianglong Chen, Weihua Peng, Xiaocheng Feng, Bing Qin, et al. A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. arXiv preprint arXiv:2311.05232, 2023
-
[55]
A survey of hallucination in large foundation models
Vipula Rawte, Amit Sheth, and Amitava Das. A survey of hallucination in large foundation models. arXiv preprint arXiv:2309.05922, 2023
-
[56]
Cognitive mirage: A review of hallucinations in large language models
Hongbin Ye, Tong Liu, Aijia Zhang, Wei Hua, and Weiqiang Jia. Cognitive mirage: A review of hallucinations in large language models. arXiv preprint arXiv:2309.06794, 2023
-
[57]
Survey of hallucination in natural language generation
Ziwei Ji, Nayeon Lee, Rita Frieske, Tiezheng Yu, Dan Su, Yan Xu, Etsuko Ishii, Ye Jin Bang, Andrea Madotto, and Pascale Fung. Survey of hallucination in natural language generation. ACM Computing Surveys, 55(12):1–38, 2023
-
[58]
A comprehensive survey of hallucination mitigation techniques in large language models
SM Tonmoy, SM Zaman, Vinija Jain, Anku Rani, Vipula Rawte, Aman Chadha, and Amitava Das. A comprehensive survey of hallucination mitigation techniques in large language models. arXiv preprint arXiv:2401.01313, 2024
-
[59]
A survey on large language model hallucination via a creativity perspective
Xuhui Jiang, Yuxing Tian, Fengrui Hua, Chengjin Xu, Yuanzhuo Wang, and Jian Guo. A survey on large language model hallucination via a creativity perspective. arXiv preprint arXiv:2402.06647, 2024
-
[60]
Bias and fairness in large language models: A survey
Isabel O Gallegos, Ryan A Rossi, Joe Barrow, Md Mehrab Tanjim, Sungchul Kim, Franck Dernoncourt, Tong Yu, Ruiyi Zhang, and Nesreen K Ahmed. Bias and fairness in large language models: A survey. arXiv preprint arXiv:2309.00770, 2023
-
[61]
Gender bias and stereotypes in large language models
Hadas Kotek, Rikker Dockum, and David Sun. Gender bias and stereotypes in large language models. In Proceedings of The ACM Collective Intelligence Conference, pages 12–24, 2023
-
[62]
A survey on fairness in large language models
Yingji Li, Mengnan Du, Rui Song, Xin Wang, and Ying Wang. A survey on fairness in large language models. arXiv preprint arXiv:2308.10149, 2023
-
[63]
Explainability for large language models: A survey
Haiyan Zhao, Hanjie Chen, Fan Yang, Ninghao Liu, Huiqi Deng, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, and Mengnan Du. Explainability for large language models: A survey. ACM Transactions on Intelligent Systems and Technology, 2023
-
[64]
A survey on large language model (llm) security and privacy: The good, the bad, and the ugly
Yifan Yao, Jinhao Duan, Kaidi Xu, Yuanfang Cai, Eric Sun, and Yue Zhang. A survey on large language model (llm) security and privacy: The good, the bad, and the ugly. arXiv preprint arXiv:2312.02003, 1, 2023
[65]
Survey of vulnerabilities in large language models revealed by adversarial attacks
Erfan Shayegani, Md Abdullah Al Mamun, Yu Fu, Pedram Zaree, Yue Dong, and Nael Abu-Ghazaleh. Survey of vulnerabilities in large language models revealed by adversarial attacks. arXiv preprint arXiv:2310.10844, 2023
[66]
Privacy issues in large language models: A survey
Seth Neel and Peter Chang. Privacy issues in large language models: A survey. arXiv preprint arXiv:2312.06717, 2023
[67]
Identifying and mitigating privacy risks stemming from language models: A survey
Victoria Smith, Ali Shahin Shamsabadi, Carolyn Ashurst, and Adrian Weller. Identifying and mitigating privacy risks stemming from language models: A survey. arXiv preprint arXiv:2310.01424, 2023
[68]
Attacks, defenses and evaluations for llm conversation safety: A survey
Zhichen Dong, Zhanhui Zhou, Chao Yang, Jing Shao, and Yu Qiao. Attacks, defenses and evaluations for llm conversation safety: A survey. arXiv preprint arXiv:2402.09283, 2024
[69]
Security and privacy challenges of large language models: A survey
Badhan Chandra Das, M Hadi Amini, and Yanzhao Wu. Security and privacy challenges of large language models: A survey. arXiv preprint arXiv:2402.00888, 2024
[70]
A Survey of Large Language Models
Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong, et al. A survey of large language models. arXiv preprint arXiv:2303.18223, 2023
[71]
A survey on large language models: Applications, challenges, limitations, and practical usage
Muhammad Usman Hadi, Rizwan Qureshi, Abbas Shah, Muhammad Irfan, Anas Zafar, Muhammad Bilal Shaikh, Naveed Akhtar, Jia Wu, Seyedali Mirjalili, et al. A survey on large language models: Applications, challenges, limitations, and practical usage. Authorea Preprints, 2023
[72]
Recent advances in natural language processing via large pre-trained language models: A survey
Bonan Min, Hayley Ross, Elior Sulem, Amir Pouran Ben Veyseh, Thien Huu Nguyen, Oscar Sainz, Eneko Agirre, Ilana Heintz, and Dan Roth. Recent advances in natural language processing via large pre-trained language models: A survey. ACM Computing Surveys, 56(2):1–40, 2023
[73]
Augmented language models: a survey
Grégoire Mialon, Roberto Dessì, Maria Lomeli, Christoforos Nalmpantis, Ram Pasunuru, Roberta Raileanu, Baptiste Rozière, Timo Schick, Jane Dwivedi-Yu, Asli Celikyilmaz, et al. Augmented language models: a survey. arXiv preprint arXiv:2302.07842, 2023
[74]
Understanding the planning of LLM agents: A survey
Xu Huang, Weiwen Liu, Xiaolong Chen, Xingmei Wang, Hao Wang, Defu Lian, Yasheng Wang, Ruiming Tang, and Enhong Chen. Understanding the planning of llm agents: A survey. arXiv preprint arXiv:2402.02716, 2024
[75]
Large Language Model based Multi-Agents: A Survey of Progress and Challenges
Taicheng Guo, Xiuying Chen, Yaqi Wang, Ruidi Chang, Shichao Pei, Nitesh V Chawla, Olaf Wiest, and Xiangliang Zhang. Large language model based multi-agents: A survey of progress and challenges. arXiv preprint arXiv:2402.01680, 2024
[76]
Personal llm agents: Insights and survey about the capability, efficiency and security
Yuanchun Li, Hao Wen, Weijun Wang, Xiangyu Li, Yizhen Yuan, Guohong Liu, Jiacheng Liu, Wenxing Xu, Xiang Wang, Yi Sun, et al. Personal llm agents: Insights and survey about the capability, efficiency and security. arXiv preprint arXiv:2401.05459, 2024
[77]
An in-depth survey of large language model-based artificial intelligence agents
Pengyu Zhao, Zijian Jin, and Ning Cheng. An in-depth survey of large language model-based artificial intelligence agents. arXiv preprint arXiv:2309.14365, 2023
[78]
Exploring large language model based intelligent agents: Definitions, methods, and prospects
Yuheng Cheng, Ceyao Zhang, Zhengwen Zhang, Xiangrui Meng, Sirui Hong, Wenhao Li, Zihao Wang, Zekai Wang, Feng Yin, Junhua Zhao, et al. Exploring large language model based intelligent agents: Definitions, methods, and prospects. arXiv preprint arXiv:2401.03428, 2024
[79]
Agent ai: Surveying the horizons of multimodal interaction
Zane Durante, Qiuyuan Huang, Naoki Wake, Ran Gong, Jae Sung Park, Bidipta Sarkar, Rohan Taori, Yusuke Noda, Demetri Terzopoulos, Yejin Choi, et al. Agent ai: Surveying the horizons of multimodal interaction. arXiv preprint arXiv:2401.03568, 2024
[80]
Llm as os (llmao), agents as apps: Envisioning aios, agents and the aios-agent ecosystem
Yingqiang Ge, Yujie Ren, Wenyue Hua, Shuyuan Xu, Juntao Tan, and Yongfeng Zhang. Llm as os (llmao), agents as apps: Envisioning aios, agents and the aios-agent ecosystem. arXiv preprint arXiv:2312.03815, 2023