eMEM: A Hybrid Spatio-Temporal Memory System For Embodied Agents

A. Haroon Rasheed (MIND); Maria Kabtoul (CHROMA)

arxiv: 2606.03374 · v2 · pith:76KDGAEGnew · submitted 2026-06-02 · 💻 cs.RO

Pith reviewed 2026-06-28 09:39 UTC · model grok-4.3

classification 💻 cs.RO

keywords embodied memory · hybrid memory system · spatio-temporal indexing · memory consolidation · agent memory architecture · graph-based memory · cognitive benchmarks · RAG comparison

The pith

A hybrid graph-based memory system unifies semantic, spatial, and temporal indexes for embodied agents.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces eMEM as a memory architecture that treats memory as a single graph backed by separate indexes for structured data, semantic similarity, and spatial location. It argues that a tiered consolidation process turning raw observations into summaries is needed so agents can handle long delays and avoid common retrieval errors. The evaluation uses tasks drawn from cognitive psychology to measure aspects such as context-specific recall and resistance to false associations. If the approach holds, agents would gain reliable access to location-linked facts and cross-layer information without relying on external stores.

Core claim

eMEM uses a multi-index architecture consisting of structured storage, approximate nearest-neighbour semantic search, and spatial indexing, all unified behind one graph model, together with a tiered consolidation pipeline that compresses raw perceptual observations into summaries; this design yields strong results on probes for context-dependent retrieval and lure rejection while maintaining retention across long simulated delays.

What carries the argument

The multi-index architecture (structured storage, semantic vector search, and spatial indexing) unified behind a single graph model, together with a tiered consolidation pipeline that transforms raw perceptual observations into compressed summaries.
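
To make the "several indexes behind one graph" idea concrete, here is a minimal sketch, not the authors' implementation. It uses only the Python standard library: SQLite holds the nodes and edges, a brute-force cosine scan stands in for the hnswlib index, and a SQL range scan stands in for the R-tree. The class name `TinyHybridStore`, its methods, and the table layout are all hypothetical.

```python
from __future__ import annotations

import math
import sqlite3
from dataclasses import dataclass


@dataclass
class Hit:
    node_id: int
    text: str
    score: float


def _unit(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class TinyHybridStore:
    """Toy hybrid store (hypothetical): one SQLite graph plus two side indexes."""

    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE nodes (id INTEGER PRIMARY KEY, kind TEXT, text TEXT,"
            " x REAL, y REAL, ts REAL)"
        )
        self.db.execute("CREATE TABLE edges (src INTEGER, dst INTEGER, kind TEXT)")
        # Stand-in for an approximate nearest-neighbour index: id -> unit vector.
        self.vectors: dict[int, list[float]] = {}

    def add_node(self, kind, text, vec, x, y, ts) -> int:
        cur = self.db.execute(
            "INSERT INTO nodes (kind, text, x, y, ts) VALUES (?, ?, ?, ?, ?)",
            (kind, text, x, y, ts),
        )
        self.vectors[cur.lastrowid] = _unit(vec)
        return cur.lastrowid

    def add_edge(self, src, dst, kind) -> None:
        self.db.execute("INSERT INTO edges VALUES (?, ?, ?)", (src, dst, kind))

    def neighbors(self, src, kind) -> list[int]:
        rows = self.db.execute(
            "SELECT dst FROM edges WHERE src = ? AND kind = ?", (src, kind)
        )
        return [r[0] for r in rows]

    def in_box(self, xmin, ymin, xmax, ymax) -> set[int]:
        # Stand-in for an R-tree window query: a plain SQL range scan.
        rows = self.db.execute(
            "SELECT id FROM nodes WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ?",
            (xmin, xmax, ymin, ymax),
        )
        return {r[0] for r in rows}

    def search(self, query_vec, k=3, box=None) -> list[Hit]:
        """Semantic search, optionally restricted to a spatial bounding box."""
        q = _unit(query_vec)
        allowed = self.in_box(*box) if box else set(self.vectors)
        scored = sorted(
            ((sum(a * b for a, b in zip(q, self.vectors[i])), i) for i in allowed),
            reverse=True,
        )
        hits = []
        for score, i in scored[:k]:
            (text,) = self.db.execute(
                "SELECT text FROM nodes WHERE id = ?", (i,)
            ).fetchone()
            hits.append(Hit(i, text, score))
        return hits


store = TinyHybridStore()
kitchen = store.add_node("observation", "red mug on kitchen counter", [1.0, 0.0], 1.0, 1.0, 0.0)
office = store.add_node("observation", "red mug on office desk", [0.9, 0.1], 8.0, 8.0, 60.0)
store.add_edge(office, kitchen, "FOLLOWS")
print(store.search([1.0, 0.0], k=1))                           # best match anywhere
print(store.search([1.0, 0.0], k=1, box=(5.0, 5.0, 10.0, 10.0)))  # best match in one region
```

The point of the sketch is the routing: the same query vector returns a different top hit once the spatial filter is applied, which is the kind of location-linked lookup a single flat vector index cannot express directly.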

If this is right

Ten recall primitives, including concept-to-location resolution and cross-layer recall, become available as direct operations for LLM tool calling.
The system runs fully embedded and in-process with the agent.
Retention of room-unique items stays at ceiling level across simulated delays from one hour to one year.
Multi-layer storage improves context-dependent retrieval while consolidation improves rejection of false associations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The consolidation step could be applied to other sensor streams beyond vision in embodied settings.
The graph unification might reduce the need for separate external memory services in deployed agents.
Psychology-derived tasks could serve as a diagnostic layer for memory components in other agent designs.

Load-bearing premise

Performance on the eight cognitive-psychology paradigms in simulated environments accurately measures the memory needs of embodied agents in real physical settings.

What would settle it

Testing the same set of probes on physical robots moving through actual rooms and checking whether the advantage over a flat retrieval baseline disappears.

Figures

Figures reproduced from arXiv: 2606.03374 by A. Haroon Rasheed (MIND), Maria Kabtoul (CHROMA).

**Figure 1.** eMEM system overview. The SPATIOTEMPORALMEMORY facade orchestrates a working-memory buffer, a retrieval tool surface, and a consolidation engine, all backed by a hybrid MEMORYSTORE combining SQLite, HNSW, and R-tree indices.

**Figure 2.** Data model. Four node types (observation, episode, gist, entity) connected by six edge types (BELONGS_TO, FOLLOWS, SUBTASK_OF, SUMMARIZES, OBSERVED_IN, COOCCURS_WITH). Gists summarise …

**Figure 3.** Memory tiers. Observations are encoded into a working buffer, flushed to short-term storage where they …

**Figure 4.** Consolidation. (a) At episode end, observations are temporally chunked (default gap: 30 min) into sessions; …
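
The temporal chunking step named in the Figure 4 caption can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the function name `chunk_by_gap` is hypothetical, and only the 30-minute default gap comes from the caption.

```python
from datetime import datetime, timedelta


def chunk_by_gap(timestamps, max_gap=timedelta(minutes=30)):
    """Group timestamps into sessions, starting a new session when the gap
    to the previous observation exceeds max_gap."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= max_gap:
            sessions[-1].append(ts)  # close enough: same session
        else:
            sessions.append([ts])    # first item, or gap too large: new session
    return sessions


t0 = datetime(2026, 1, 1, 9, 0)
stamps = [
    t0,
    t0 + timedelta(minutes=10),
    t0 + timedelta(minutes=35),
    t0 + timedelta(hours=2),
    t0 + timedelta(hours=2, minutes=5),
]
sessions = chunk_by_gap(stamps)
print([len(s) for s in sessions])  # [3, 2]
```

In this sketch a gap of exactly 30 minutes keeps the observation in the current session; the caption does not say which side of the boundary the authors chose.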

Original abstract

We present eMEM (Embodied Memory), a hybrid graph-based memory system for embodied agents operating in physical environments. Current agent memory architectures, such as Generative Agents, MemGPT, and A-MEM, treat memory as text streams or knowledge graphs, but embodied agents require memory that is simultaneously searchable by meaning, space, and time. eMEM fills this gap with a multi-index architecture (SQLite for structured storage, hnswlib for approximate nearest-neighbour semantic search, and an R-tree for spatial queries) unified behind a single graph model. A tiered consolidation pipeline transforms raw perceptual observations into compressed summaries, mirroring hippocampal-neocortical consolidation in biological systems. Ten agent-facing recall tools expose memory retrieval primitives, including concept-to-location resolution and cross-layer recall, as first-class operations for LLM tool calling. The system is fully embedded and runs in-process alongside the agent. In addition, we introduce eMEM-Bench v1, a benchmark we construct over ProcTHOR-10K scenes for embodied memory evaluation. The benchmark is organised explicitly around eight cognitive-psychology paradigms (DRM lures, pattern separation, pattern completion, source monitoring, context-dependent retrieval, long-horizon interference, serial position, and a foil-augmented retention curve), each chosen so that the result is interpretable against the broader memory-systems literature in humans and prior agent-memory systems; this is a level of diagnosis that surface-task benchmarks like LoCoMo or OpenEQA cannot provide. eMEM scores 80.8 weighted mean over 988 probes, with a flat retention curve at ceiling from 1 h to 1 yr of simulated delay on room-unique items. We show that a pure RAG baseline (the flat_rag ablation) loses 30 pt on context-dependent retrieval and 29 pt on DRM lure rejection, isolating the contribution of multi-layer storage and consolidation respectively. We release both the system and the benchmark code.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

Private letter to a colleague

eMEM ships a runnable hybrid memory stack and a diagnostic benchmark, but its performance numbers come from clean simulation that leaves real-world transfer untested.

The paper's main contribution is a working system that puts SQLite, hnswlib, and an R-tree behind one graph interface, plus a tiered consolidation step and ten explicit recall tools that agents can call. It also releases both the code and eMEM-Bench, which organizes 988 probes around eight named cognitive paradigms instead of surface tasks.

That setup is useful. The ablations show the multi-index design adding 30 points on context-dependent retrieval and the consolidation layer adding 29 points on DRM lure rejection over a flat RAG baseline. The flat retention curve from one hour to one year on room-unique items is a clear result in the reported setting. Releasing the implementation lets others check the details directly.

The soft spot is the evaluation. Everything runs inside ProcTHOR-10K with perfect perception and synthetic timestamps. The stress-test note is on target: real environments bring sensor noise, partial views, and motor-driven state changes that the benchmark does not include. It is not obvious the 80.8 weighted mean or the ceiling retention would survive those conditions, so the claimed margins over prior systems remain provisional.

This is for people building memory modules for embodied agents who need something they can drop in and call from an LLM. The benchmark gives more interpretable diagnostics than most current agent evals. It is worth sending to referees because the architecture is explicit, the code is out, and the gap it targets is real, even if the simulation-to-reality step still needs work.

Referee Report

2 major / 3 minor

Summary. The paper presents eMEM, a hybrid graph-based memory system for embodied agents that uses a multi-index architecture (SQLite for structured data, hnswlib for semantic search, R-tree for spatial queries) unified by a graph model, along with a tiered consolidation pipeline inspired by hippocampal-neocortical processes. It introduces eMEM-Bench, a benchmark over ProcTHOR-10K scenes organized around eight cognitive-psychology paradigms (DRM lures, pattern separation, etc.), reporting an 80.8 weighted mean score over 988 probes, a flat retention curve at ceiling from 1 h to 1 yr on room-unique items, and ablation results showing that a pure RAG baseline loses 30 pt on context-dependent retrieval and 29 pt on DRM lure rejection.

Significance. If the empirical results and benchmark hold, the work supplies a more diagnostic evaluation framework for agent memory that aligns with human memory-systems literature, while demonstrating concrete gains from multi-layer storage and consolidation over text-stream or flat RAG approaches; the release of code and benchmark supports reproducibility.

major comments (2)

[Benchmark construction (abstract and methods)] The headline performance claims (80.8 weighted mean, flat retention 1h–1yr) and ablation margins rest on eMEM-Bench as a valid proxy for embodied-agent memory demands, yet the benchmark is constructed exclusively in discrete ProcTHOR-10K scenes with perfect perceptual access and synthetic timestamps; the manuscript provides no analysis or experiments addressing how sensor noise, partial observability, motor-induced state changes, or continuous event streams would affect retrieval and consolidation behavior.
[Ablation experiments] The ablation isolating multi-index storage and consolidation (flat_rag loses 30pt on context-dependent retrieval and 29pt on DRM lure rejection) is load-bearing for the architectural contribution claim, but the manuscript does not detail how the ablation controls for confounding factors such as index size, consolidation parameters, or query formulation across the 988 probes.

minor comments (3)

[Results] The weighting scheme used to compute the 80.8 weighted mean across the eight paradigms and 988 probes is not specified, making it difficult to interpret the aggregate score.
[System description] The ten agent-facing recall tools are described at a high level; the manuscript would benefit from explicit pseudocode or interface signatures for the concept-to-location resolution and cross-layer recall primitives.
[Results] The manuscript should include error bars, per-paradigm breakdowns, or statistical tests for the reported ablation differences to strengthen the quantitative claims.
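
The first minor comment matters because an aggregate can move by several points depending on the weights. The sketch below shows the arithmetic with invented placeholder numbers; the paradigm names, scores, and probe counts are hypothetical and are not taken from the paper.

```python
def weighted_mean(scores, weights):
    """Weighted mean of per-paradigm scores; both arguments are dicts keyed by paradigm."""
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in scores) / total


# Placeholder values for illustration only (not results from the paper).
scores = {"paradigm_a": 90.0, "paradigm_b": 60.0}
probe_counts = {"paradigm_a": 300, "paradigm_b": 100}
equal_weights = {name: 1 for name in scores}

print(weighted_mean(scores, probe_counts))   # 82.5, weighted by probe count
print(weighted_mean(scores, equal_weights))  # 75.0, each paradigm counted once
```

With the same per-paradigm scores, the two schemes differ by 7.5 points here, which is why a reported weighted mean is hard to interpret until the weights are stated.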

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and indicate the revisions we will make to improve clarity and completeness.

Referee: [Benchmark construction (abstract and methods)] The headline performance claims (80.8 weighted mean, flat retention 1h–1yr) and ablation margins rest on eMEM-Bench as a valid proxy for embodied-agent memory demands, yet the benchmark is constructed exclusively in discrete ProcTHOR-10K scenes with perfect perceptual access and synthetic timestamps; the manuscript provides no analysis or experiments addressing how sensor noise, partial observability, motor-induced state changes, or continuous event streams would affect retrieval and consolidation behavior.

Authors: We agree that eMEM-Bench uses idealized discrete scenes with perfect access and synthetic timestamps. This controlled setup was chosen to enable direct mapping to cognitive-psychology paradigms and isolate memory-system effects. We will add a new limitations subsection discussing how sensor noise, partial observability, and continuous streams could affect performance, along with suggested extensions of the benchmark to those regimes. No new experiments are feasible within the current scope, but the discussion will be added. revision: yes
Referee: [Ablation experiments] The ablation isolating multi-index storage and consolidation (flat_rag loses 30pt on context-dependent retrieval and 29pt on DRM lure rejection) is load-bearing for the architectural contribution claim, but the manuscript does not detail how the ablation controls for confounding factors such as index size, consolidation parameters, or query formulation across the 988 probes.

Authors: The flat_rag baseline was constructed by using a single hnswlib index with the identical embedding model and query templates as eMEM's semantic component, while disabling all consolidation and multi-index routing. Index capacity was matched to the total size of eMEM's three stores, and the same 988 probe queries were executed verbatim. We will expand the methods and appendix with explicit parameter tables, pseudocode for the ablation configuration, and verification steps confirming that only the architectural differences were varied. revision: yes
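
One way to make the promised "verification steps" checkable is to express both conditions as configurations and assert that they differ only in architectural switches. The following is a hypothetical sketch: the field names and placeholder values are invented for illustration and do not come from the released code.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class MemoryConfig:
    # All field names and defaults are placeholders, not the paper's settings.
    embedding_model: str = "shared-embedder"
    query_template_set: str = "probes-v1"
    multi_index_routing: bool = True
    consolidation: bool = True


# The only switches the ablation is allowed to change.
ARCHITECTURAL_FIELDS = {"multi_index_routing", "consolidation"}


def changed_fields(a, b):
    """Names of the fields whose values differ between two configs."""
    return {f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)}


full = MemoryConfig()
flat_rag = replace(full, multi_index_routing=False, consolidation=False)

diff = changed_fields(full, flat_rag)
assert diff <= ARCHITECTURAL_FIELDS, f"non-architectural fields changed: {diff}"
print(sorted(diff))  # ['consolidation', 'multi_index_routing']
```

A check of this kind would fail loudly if, for example, the baseline were run with a different embedding model or query template set, which is the confound the referee raises.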

Circularity Check

0 steps flagged

No circularity: empirical system and benchmark results

The paper presents a hybrid memory architecture and eMEM-Bench without any mathematical derivations, parameter fitting, or predictive claims that reduce to inputs by construction. Performance numbers (80.8 weighted mean, ablation deltas) are reported from direct evaluation on 988 probes in ProcTHOR-10K scenes; no equations, self-citations, or ansatzes are invoked to derive these quantities from prior fitted values or author theorems. The benchmark construction and consolidation pipeline are described as engineering choices motivated by cognitive literature, not as outputs forced by self-referential definitions. This is the common case of a self-contained empirical contribution.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

This is an engineering system paper rather than a theoretical derivation. The main assumptions are the effectiveness of unifying three standard indexes behind a graph and the utility of the biological consolidation analogy; no free parameters or new entities are introduced in the abstract.

axioms (2)

Domain assumption: A graph model can usefully unify SQLite, hnswlib, and R-tree indexes for agent memory queries. Stated in the description of the multi-index architecture.
Ad hoc to paper: Tiered consolidation that mirrors hippocampal-neocortical processes will improve agent memory performance. The pipeline is presented as mirroring biology and evaluated via the benchmark.


[1] [1]

Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology (UIST) , year =

Generative Agents: Interactive Simulacra of Human Behavior , author =. Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology (UIST) , year =. 2304.03442 , archivePrefix =

Pith/arXiv arXiv

[2] [2]

and Stoica, Ion and Gonzalez, Joseph E

Packer, Charles and Wooders, Sarah and Lin, Kevin and Fang, Vivian and Patil, Shishir G. and Stoica, Ion and Gonzalez, Joseph E. , year =. 2310.08560 , archivePrefix =

Pith/arXiv arXiv

[3] [3]

2024 , eprint =

Zhong, Wanjun and Guo, Lianghong and Gao, Qiqi and Ye, He and Wang, Yanlin , booktitle =. 2024 , eprint =

2024

[4] [4]

Advances in Neural Information Processing Systems (NeurIPS) , year =

Reflexion: Language Agents with Verbal Reinforcement Learning , author =. Advances in Neural Information Processing Systems (NeurIPS) , year =. 2303.11366 , archivePrefix =

Pith/arXiv arXiv

[5] [5]

Transactions on Machine Learning Research (TMLR) , year =

Voyager: An Open-Ended Embodied Agent with Large Language Models , author =. Transactions on Machine Learning Research (TMLR) , year =. 2305.16291 , archivePrefix =

Pith/arXiv arXiv

[6] [6]

2023 , eprint =

Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory , author =. 2023 , eprint =

2023

[7] [7]

2025 , eprint =

Xu, Wujiang and Liang, Zujie and Mei, Kai and Gao, Hang and Tan, Juntao and Zhang, Yongfeng , booktitle =. 2025 , eprint =

2025

[8] [8]

2504.19413 , archivePrefix =

Chhikara, Prateek and Khant, Dev and others , year =. 2504.19413 , archivePrefix =

Pith/arXiv arXiv

[9] [9]

2501.13956 , archivePrefix =

Rasmussen, Preston and others , year =. 2501.13956 , archivePrefix =

Pith/arXiv arXiv

[10] [10]

ACM Transactions on Information Systems , year =

A Survey on the Memory Mechanism of Large Language Model-based Agents , author =. ACM Transactions on Information Systems , year =. 2404.13501 , archivePrefix =

Pith/arXiv arXiv

[11] [11]

Memory in the Age of

Hu, Yujia and others , year =. Memory in the Age of. 2512.13564 , archivePrefix =

Pith/arXiv arXiv

[12] [12]

Hydra: A Real-time Spatial Perception System for

Hughes, Nathan and Chang, Yun and Carlone, Luca , booktitle =. Hydra: A Real-time Spatial Perception System for. 2022 , eprint =

2022

[13] [13]

International Journal of Robotics Research (IJRR) , year =

Foundations of Spatial Perception for Robotics: Hierarchical Representations and Real-Time Systems , author =. International Journal of Robotics Research (IJRR) , year =

[14] Qiao Gu, Ali Kuwajerwala, et al.

[15] Krishan Rana, Jesse Haviland, Sourav Garg, Jad Abou-Chakra, Ian Reid, and Niko Suenderhauf.

[16] Matthew Booker, Gregory Byrd, Brendan Kemp, Adam Schmidt, and Christopher Rivera. arXiv:2410.23968.

[17] Visual Language Maps for Robot Navigation. IEEE International Conference on Robotics and Automation (ICRA). arXiv:2210.05714.

[18] Peiqi Liu, Yaswanth Orru, Jay Vakil, Chris Paxton, Nur Muhammad Mahi Shafiullah, and Lerrel Pinto. arXiv:2401.12202.

[19] Sonia Raychaudhuri and Angel X. Chang. Semantic Mapping in Indoor Embodied. 2025.

[20] Why There Are Complementary Learning Systems in the Hippocampus and Neocortex: Insights from the Successes and Failures of Connectionist Models of Learning and Memory. Psychological Review.

[21] What Learning Systems do Intelligent Agents Need? Complementary Learning Systems Theory Updated. Trends in Cognitive Sciences.

[22] The hippocampus as a spatial map: Preliminary evidence from unit activity in the freely-moving rat. Brain Research.

[23] Microstructure of a spatial map in the entorhinal cortex. Nature.

[24] Episodic and Semantic Memory. In Organization of Memory.

[25] Elements of Episodic Memory.

[26] The memory function of sleep. Nature Reviews Neuroscience.

[27] System consolidation of memory during sleep. Psychological Research.

[28] Fear memories require protein synthesis in the amygdala for reconsolidation after retrieval. Nature.

[29] Context-dependent memory in two natural environments: On land and underwater. British Journal of Psychology.

[30] Encoding specificity and retrieval processes in episodic memory. Psychological Review.

[31] A generative model of memory construction and consolidation. Nature Human Behaviour.

[32] Place cells may simply be memory cells: Memory compression leads to spatial tuning and history dependence. Proceedings of the National Academy of Sciences (PNAS).

[33] Memory consolidation from a reinforcement learning perspective. Frontiers in Computational Neuroscience.

[34] Interoceptive inference, emotion, and the embodied self. Trends in Cognitive Sciences.

[35] Extending predictive processing to the body: Emotion as interoceptive inference. Behavioral and Brain Sciences.

[36] Functions of Interoception: From Energy Regulation to Experience of the Self. Trends in Neurosciences.

[37] Asaf Maimon, Ido Wald, Mihai Pomarlan, Sen Zhang, Daniel Beßler, Robert Nolte, and K

[38] Vasiliki Vouloutsi et al. Towards a Synthetic Tutor Assistant: The

[39] Fabian Peller-Konrad, Rainer Kartmann, Christian R. G. Dreher, Andre Meixner, Fabian Reister, Markus Grotz, and Tamim Asfour. A memory system of a robot cognitive architecture and its implementation in

[40] Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. 2023.

[41] Xiaojian Ma, Silong Yong, Zilong Zheng, Qing Li, Yitao Liang, Song-Chun Zhu, and Siyuan Huang. 2023.

[42] Adyasha Maharana, Dong-Ho Lee, Sergey Tulyakov, Mohit Bansal, Francesco Barbieri, and Yuwei Fang. Evaluating Very Long-Term Conversational Memory of. 2024.

[43] Arjun Majumdar, Anurag Ajay, Xiaohan Zhang, Pranav Putta, Sriram Yenamandra, Mikael Henaff, Sneha Silwal, Paul Mcvay, Oleksandr Maksymets, Sergio Arnaud, Karmesh Yadav, Qiyang Li, Ben Newman, Mohit Sharma, Vincent Berges, Shiqi Zhang, Pulkit Agrawal, Yonatan Bisk, Dhruv Batra, Kal...

[44] Karmesh Yadav, Yusuf Ali, Gunshi Gupta, Yarin Gal, and Zsolt Kira. arXiv:2506.15635.

[45] Shuo Wang et al. Explore with Long-term Memory: A Benchmark and Multimodal. arXiv:2601.10744.

[46] Eric Kolve, Roozbeh Mottaghi, Winson Han, Eli VanderBilt, Luca Weihs, Alvaro Herrasti, Daniel Gordon, Yuke Zhu, Abhinav Gupta, and Ali Farhadi. arXiv:1712.05474.

[47] The Nature of Explanation.

[48] The free-energy principle: a unified brain theory? Nature Reviews Neuroscience.

[49] The Markov blankets of life: autonomy, active inference and the free energy principle. Journal of The Royal Society Interface.

[50] World Models. 2018.

[51] A Path Towards Autonomous Machine Intelligence (Version 0.9.2). 2022.

[52] Prediction and memory: A predictive coding account. Progress in Neurobiology.

[53] Surfing Uncertainty: Prediction, Action, and the Embodied Mind.

[54] What Is a Cognitive Map? Organizing Knowledge for Flexible Behavior. Neuron.

[55] The hippocampus as a predictive map. Nature Neuroscience.

[56] The Tolman-Eichenbaum Machine: Unifying Space and Relational Memory through Generalization in the Hippocampal Formation. Cell.

[57] World model learning and inference. Neural Networks.

[58] World Models and Predictive Coding for Cognitive and Developmental Robotics: Frontiers and Challenges. Advanced Robotics. arXiv:2301.05832.

[59] Mastering diverse control tasks through world models. Nature, 2025.

[60] Jim. Advances in Neural Information Processing Systems (NeurIPS). arXiv:2405.14831.

[61] 2024.

[62] Training Sparse Mixture of Experts Text Embedding Models. 2025.

[63] Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020.

[64] A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD), 1996.

[65] Nils Reimers and Iryna Gurevych. 2019.

[66] Iro Armeni, Zhi-Yang He, JunYoung Gwak, Amir R. Zamir, Martin Fischer, Jitendra Malik, and Silvio Savarese.

[67] Antoni Rosinol, Andrew Violette, Marcus Abate, Nathan Hughes, Yun Chang, Jingnan Shi, Arjun Gupta, and Luca Carlone.

[68] Matt Deitke, Eli VanderBilt, Alvaro Herrasti, Luca Weihs, Jordi Salvador, Kiana Ehsani, Winson Han, Eric Kolve, Ali Farhadi, Aniruddha Kembhavi, and Roozbeh Mottaghi. 2022.

[69] Henry L. Roediger and Kathleen B. McDermott. Creating false memories. 1995.

[70] Pattern separation in the hippocampus. Trends in Neurosciences, 2011.

[71] Source monitoring. Psychological Bulletin, 1993.

[72] Hermann Ebbinghaus.

[73] The serial position effect of free recall. Journal of Experimental Psychology, 1962.

[74] Prospective. 1996.

[75] Clone-structured graph representations enable flexible learning and vicarious evaluation of cognitive maps. Nature Communications, 2021.

[76] Rajkumar Vasudeva Raju, J. Swaroop Guntupalli, Guangyao Zhou, Carter Wendelken, and L. Space is a latent sequence. Science Advances, 2024.

[77] Marcus K. Benna and Stefano Fusi. Place cells may simply be memory cells: Memory compression leads to spatial tuning and history dependence. Proceedings of the National Academy of Sciences (PNAS), 118(51), 2021.

[78] Matthew Booker, Gregory Byrd, Brendan Kemp, Adam Schmidt, and Christopher Rivera. EmbodiedRAG: Dynamic 3D scene graph retrieval for efficient and scalable robot task planning, 2024.

[79] Jan Born and Ines Wilhelm. System consolidation of memory during sleep. Psychological Research, 76:192–203, 2012.

[80] Prateek Chhikara, Dev Khant, et al. Mem0: Building production-ready AI agents with scalable long-term memory, 2025.