arxiv: 2604.12034 · v1 · submitted 2026-04-13 · 💻 cs.AI

Memory as Metabolism: A Design for Companion Knowledge Systems

Stefan Miteski

Pith reviewed 2026-05-10 15:42 UTC · model grok-4.3

keywords: companion knowledge systems · LLM memory · personal knowledge wikis · entrenchment · evidence accumulation · memory operations · knowledge governance · epistemic failures

The pith

Personal LLM knowledge wikis need five metabolic operations to let accumulated contradictory evidence update entrenched dominant interpretations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that personal memory systems built on the LLM wiki pattern should function as companion systems whose job is to mirror the user's working vocabulary and context continuity while compensating for the epistemic failure of entrenchment. It proposes five operations—TRIAGE, DECAY, CONTEXTUALIZE, CONSOLIDATE, and AUDIT—supported by memory gravity and minority-hypothesis retention as the mechanism that creates a structural path for contradictory evidence to build pressure across multiple cycles and eventually revise centrality-protected interpretations. A sympathetic reader would care because without such a path, single-user wikis ossify around initial views and suppress new evidence, reducing their value as long-term companions. The design supplies normative obligations, time-structured rules, and conformance invariants targeted at this specific failure mode of entrenchment under user-coupled drift.

Core claim

The paper claims that memory in companion knowledge systems should operate like metabolism by applying TRIAGE to classify inputs, DECAY to manage retention over time, CONTEXTUALIZE to embed relational links, CONSOLIDATE to integrate stable structures, and AUDIT to review for drift, all reinforced by memory gravity that pulls toward central elements and retention of minority hypotheses. This combination produces a multi-cycle buffer pressure mechanism so that accumulated contradictory evidence gains a structural route to updating a dominant interpretation that would otherwise remain protected by centrality, a failure mode no existing benchmark is designed to detect.
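
The paper, as summarized here, does not specify data structures for this mechanism, so the following Python sketch is purely illustrative. It assumes a hypothetical `Interpretation` record with a numeric `centrality` score, a buffer of retained contradictions, and a revision rule in which pressure (here simply the number of retained items) must exceed the centrality score. None of these names or rules come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Interpretation:
    """Hypothetical record for a dominant interpretation (not from the paper)."""
    claim: str
    centrality: float  # assumed protection score: higher means harder to revise
    buffer: list = field(default_factory=list)  # retained contradicting evidence


def run_cycle(interp: Interpretation, contradictions: list) -> bool:
    """Run one toy cycle and report whether revision is triggered.

    Assumption: pressure is the number of retained contradictions, and
    revision triggers once pressure exceeds the centrality score.
    """
    interp.buffer.extend(contradictions)  # minority-hypothesis retention
    pressure = len(interp.buffer)
    return pressure > interp.centrality


dominant = Interpretation(claim="A causes B", centrality=3.0)
# One contradicting observation arrives per cycle.
results = [run_cycle(dominant, [f"counterexample {i}"]) for i in range(1, 5)]
print(results)  # [False, False, False, True]
```

In this toy setting no single contradiction is enough to dislodge the claim, but because nothing is discarded between cycles, the fourth one tips it. An implementation that cleared the buffer every cycle would never reach the threshold.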

What carries the argument

The five operations (TRIAGE, DECAY, CONTEXTUALIZE, CONSOLIDATE, and AUDIT), combined with memory gravity and minority-hypothesis retention, generate accumulating buffer pressure that can revise centrality-protected interpretations.

If this is right

Contradictory evidence can accumulate across cycles without immediate suppression because minority hypotheses are retained.
Dominant interpretations become revisable once multi-cycle buffer pressure reaches a threshold set by the operations.
The system supplies a governance profile with time-structured procedural rules and testable conformance invariants for single-agent memory.
Personal wikis can maintain continuity with user vocabulary and structure while actively countering epistemic ossification.
Partial safety at the single-agent level follows from reduced suppression of new evidence, though the paper states this does not solve broader agent governance questions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The buffer-pressure idea could be adapted to multi-agent memory settings to reduce collective entrenchment, though the paper restricts itself to single-user cases.
Explicit accumulation mechanics might inspire new evaluation benchmarks that measure whether evidence actually forces interpretation updates rather than just retrieval accuracy.
Treating memory as metabolism suggests parallels with homeostatic control in other computational systems, where decay and audit steps prevent runaway stability.
If the conformance invariants prove workable, they could serve as a template for governance rules in other persistent LLM artifacts beyond personal wikis.

Load-bearing premise

That the five named operations together with memory gravity and minority-hypothesis retention can be realized in existing LLM wiki architectures and will produce the claimed structural path for evidence accumulation.

What would settle it

A controlled multi-cycle test that injects streams of contradictory evidence into a wiki built with the five operations and checks whether the dominant interpretation updates only after buffer pressure accumulates or remains unchanged despite the operations running.
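
One way to score such a test is to log each cycle and classify the resulting trace. The sketch below assumes a hypothetical `Step` record whose `pressure` field is an observable proxy for accumulated contradicting evidence, and a known `threshold`; both are illustrative assumptions, since the summary does not say how pressure would be measured.

```python
from typing import NamedTuple, Optional


class Step(NamedTuple):
    cycle: int
    pressure: float  # hypothetical observable: accumulated contradicting evidence
    dominant: str    # interpretation the wiki currently treats as dominant


def first_cycle(trace, predicate) -> Optional[int]:
    """Return the first cycle whose step satisfies predicate, else None."""
    return next((s.cycle for s in trace if predicate(s)), None)


def classify(trace, threshold: float) -> str:
    """Classify a multi-cycle trace against the accumulation prediction.

    Assumes the first step still shows the original dominant interpretation.
    """
    initial = trace[0].dominant
    crossed = first_cycle(trace, lambda s: s.pressure >= threshold)
    updated = first_cycle(trace, lambda s: s.dominant != initial)
    if updated is None:
        # Pressure built up but nothing changed: the failure mode at issue.
        return "entrenched" if crossed is not None else "inconclusive"
    if crossed is None or updated < crossed:
        return "premature update"
    return "consistent with prediction"


trace = [Step(1, 1.0, "H1"), Step(2, 2.0, "H1"), Step(3, 3.0, "H2")]
print(classify(trace, threshold=3.0))  # consistent with prediction
```

An "entrenched" verdict would count against the design, while "premature update" would suggest the system is simply overwriting on the first contradiction rather than accumulating pressure.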

Original abstract

Retrieval-Augmented Generation remains the dominant pattern for giving LLMs persistent memory, but a visible cluster of personal wiki-style memory architectures emerged in April 2026 -- design proposals from Karpathy, MemPalace, and LLM Wiki v2 that compile knowledge into an interlinked artifact for long-term use by a single user. They sit alongside production memory systems that the major labs have shipped for over a year, and an active academic lineage including MemGPT, Generative Agents, Mem0, Zep, A-Mem, MemMachine, SleepGate, and Second Me. Within a 2026 landscape of emerging governance frameworks for agent context and memory -- including Context Cartography and MemOS -- this paper proposes a companion-specific governance profile: a set of normative obligations, a time-structured procedural rule, and testable conformance invariants for the specific failure mode of entrenchment under user-coupled drift in single-user knowledge wikis built on the LLM wiki pattern. The design principle is that personal LLM memory is a companion system: its job is to mirror the user on operational dimensions (working vocabulary, load-bearing structure, continuity of context) and compensate on epistemic failure modes (entrenchment, suppression of contradicting evidence, Kuhnian ossification). Five operations implement this split -- TRIAGE, DECAY, CONTEXTUALIZE, CONSOLIDATE, AUDIT -- supported by memory gravity and minority-hypothesis retention. The sharpest prediction: accumulated contradictory evidence should have a structural path to updating a centrality-protected dominant interpretation through multi-cycle buffer pressure accumulation, a failure mode no existing benchmark captures. The safety story at the single-agent level is partial, and the paper is explicit about what it does and does not solve.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

A clear design sketch for handling entrenchment in personal LLM wikis, but the key pressure mechanism stays too abstract to check.

The paper's main contribution is a governance profile for companion-style LLM memory systems. It takes the recent cluster of wiki-style architectures and adds five operations plus supporting rules aimed at letting contradictory evidence gradually shift dominant interpretations instead of letting them ossify under user drift. That target is stated plainly and tied to a single-agent safety angle that existing systems do not explicitly address.

The synthesis of prior work (MemGPT, Generative Agents, the new wiki proposals) is useful for context, and the split between mirroring the user and compensating for epistemic failure modes is a reasonable organizing principle. The prediction about multi-cycle buffer pressure is the sharpest part of the piece and names a failure mode that current benchmarks miss. The author is also upfront about scope, which keeps the claims grounded.

The soft spot is that the operations remain at the level of named functions without the data structures, priority rules, or interaction invariants needed to guarantee the claimed accumulation dynamic. Nothing in the description shows why TRIAGE plus DECAY and the rest would force pressure on centrality-protected items rather than permit implementations that simply ignore minority hypotheses. This gap is internal to the proposal and makes the central claim hard to evaluate or implement as stated.

The work is aimed at researchers and builders working on long-term personal agent memory, especially those already thinking about governance layers. It is not a finished system or an empirical result, but the framing is honest enough that a serious referee could push the author on the missing mechanics and still find value in the overall direction. I would send it to peer review.

Referee Report

3 major / 3 minor

Summary. The paper proposes a companion-specific governance profile for single-user LLM wiki-style memory systems to address entrenchment under user-coupled drift. It defines five operations (TRIAGE, DECAY, CONTEXTUALIZE, CONSOLIDATE, AUDIT) supported by memory gravity and minority-hypothesis retention, with the central claim that their combination yields a structural path allowing accumulated contradictory evidence to update centrality-protected dominant interpretations via multi-cycle buffer pressure accumulation—a failure mode not captured by existing benchmarks.

Significance. If the proposed operations can be realized with the claimed dynamics, the design would supply normative obligations and testable conformance invariants for epistemic failure modes in personal knowledge systems, extending beyond current RAG and wiki patterns (e.g., MemGPT, Generative Agents) by explicitly compensating for Kuhnian ossification in user-coupled settings. The emphasis on falsifiable predictions and partial safety scoping is a strength for a design paper.

major comments (3)

[§3] §3 (Design Principle and Operations): The claim that TRIAGE, DECAY, CONTEXTUALIZE, CONSOLIDATE, and AUDIT plus memory gravity and minority-hypothesis retention produce multi-cycle buffer pressure accumulation is stated at the level of intended outcome; no data structures, priority functions, update rules, or interaction invariants are supplied that would guarantee pressure on centrality-protected interpretations rather than permitting insulated implementations.
[§4] §4 (Sharpest Prediction): The prediction that accumulated contradictory evidence has a structural path to updating dominant interpretations is presented as a direct consequence of the design but lacks a concrete derivation, parameter-free mechanism, or proposed benchmark that would allow independent verification or falsification of the accumulation dynamic.
[§2] §2 (Related Work and Landscape): While the paper positions the proposal against MemGPT, Zep, and emerging governance frameworks like MemOS, it does not specify how the five operations differ mechanically from existing decay or consolidation heuristics in those systems, leaving the novelty of the pressure-accumulation path underspecified.

minor comments (3)

[Abstract and §1] The abstract and introduction use 'memory gravity' and 'minority-hypothesis retention' without initial formal definitions; a dedicated notation subsection would improve readability.
[Figure 1 or §3.3] Figure 1 (if present) or the procedural rule diagram would benefit from explicit arrows showing buffer pressure flow across cycles to match the textual description.
[Safety Story] The safety story section could add a short table contrasting solved vs. unsolved failure modes for clarity.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments on our design paper. We address each major point below, agreeing where additional detail is needed and outlining the revisions to make the proposal more concrete and verifiable.

Referee: [§3] §3 (Design Principle and Operations): The claim that TRIAGE, DECAY, CONTEXTUALIZE, CONSOLIDATE, and AUDIT plus memory gravity and minority-hypothesis retention produce multi-cycle buffer pressure accumulation is stated at the level of intended outcome; no data structures, priority functions, update rules, or interaction invariants are supplied that would guarantee pressure on centrality-protected interpretations rather than permitting insulated implementations.

Authors: The manuscript is intentionally positioned at the level of design principles and normative obligations rather than a full implementation specification. However, we agree that to support the claim of guaranteed pressure accumulation, additional structure is required. In the revised version, we will expand §3 with pseudocode outlines for the operations, explicit priority functions incorporating memory gravity (e.g., decay rate modulated by centrality and contradiction count), and interaction invariants such as 'minority hypotheses must be retained for at least N cycles before consolidation' and 'buffer pressure threshold triggers AUDIT'. This will prevent insulated implementations by enforcing the accumulation dynamic.

Revision: yes

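
The priority function and invariants named in this response could take many forms. The sketch below is one hypothetical reading, not the authors' specification: the functional form of `decay_rate`, the constant `k`, and all function names are assumptions chosen only to show centrality slowing decay while contradictions erode that protection.

```python
import math


def decay_rate(base: float, centrality: float, contradictions: int, k: float = 0.5) -> float:
    """Hypothetical gravity-modulated decay rate.

    Centrality slows decay (memory gravity); each retained contradiction
    speeds it back up by a factor controlled by k.
    """
    return base * (1.0 + k * contradictions) / (1.0 + centrality)


def decay(weight: float, rate: float, cycles: int = 1) -> float:
    """Exponential decay of a retention weight over a number of cycles."""
    return weight * math.exp(-rate * cycles)


def may_consolidate(age_in_cycles: int, min_retention: int) -> bool:
    """Invariant: a minority hypothesis is held at least min_retention cycles."""
    return age_in_cycles >= min_retention


def audit_due(pressure: float, threshold: float) -> bool:
    """Invariant: buffer pressure at or above the threshold triggers AUDIT."""
    return pressure >= threshold


# A central entry (centrality 4) decays slowly until contradictions pile up.
print(decay_rate(0.1, centrality=4, contradictions=0))
print(decay_rate(0.1, centrality=4, contradictions=8))
```

Under these assumed numbers, eight retained contradictions cancel the fivefold protection that centrality 4 provides, which is the kind of coupling an insulated implementation would lack.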
Referee: [§4] §4 (Sharpest Prediction): The prediction that accumulated contradictory evidence has a structural path to updating dominant interpretations is presented as a direct consequence of the design but lacks a concrete derivation, parameter-free mechanism, or proposed benchmark that would allow independent verification or falsification of the accumulation dynamic.

Authors: We acknowledge that while the prediction follows from the described interactions, a more explicit derivation and falsifiable mechanism would strengthen the paper. We will revise §4 to include a step-by-step derivation showing how repeated TRIAGE and DECAY cycles build buffer pressure until it overcomes centrality protection via CONSOLIDATE and AUDIT. Additionally, we propose a parameter-free benchmark: a simulated environment with a dominant hypothesis and injected contradictions, measuring the number of operation cycles until the dominant interpretation updates, with the prediction that the design reduces this cycle count compared to baseline decay-only systems.

Revision: partial

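
The cycle-count comparison proposed here can be illustrated with a toy simulation. This is a sketch under stated assumptions rather than the proposed benchmark: one unit-weight contradiction arrives per cycle, and a hypothetical `retention` factor stands in for the difference between a design that keeps minority evidence (1.0) and a decay-only baseline (below 1.0). The explicit `threshold` and `retention` parameters exist only to make the comparison runnable.

```python
from typing import Optional


def cycles_to_update(retention: float, threshold: float, max_cycles: int = 100) -> Optional[int]:
    """Count cycles until accumulated pressure reaches the threshold.

    Assumptions (not from the paper): one contradiction of unit weight arrives
    per cycle, and previously buffered pressure is multiplied by `retention`
    each cycle (1.0 = fully retained, below 1.0 = decays away).
    """
    pressure = 0.0
    for cycle in range(1, max_cycles + 1):
        pressure = pressure * retention + 1.0
        if pressure >= threshold:
            return cycle
    return None  # dominant interpretation never updates within max_cycles


print(cycles_to_update(retention=1.0, threshold=5.0))  # 5
print(cycles_to_update(retention=0.9, threshold=5.0))  # 7
print(cycles_to_update(retention=0.5, threshold=5.0))  # None
```

With strong decay the pressure converges to a ceiling below the threshold (here 2.0), so the dominant interpretation is never revised no matter how long the contradictions keep arriving.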
Referee: [§2] §2 (Related Work and Landscape): While the paper positions the proposal against MemGPT, Zep, and emerging governance frameworks like MemOS, it does not specify how the five operations differ mechanically from existing decay or consolidation heuristics in those systems, leaving the novelty of the pressure-accumulation path underspecified.

Authors: We will enhance §2 with a dedicated comparison subsection. This will detail mechanical differences, such as: our DECAY is not a simple time-based decay but weighted by memory gravity and paired with minority-hypothesis retention to ensure contradictions are not discarded; CONSOLIDATE is conditioned on AUDIT results to force re-evaluation of dominant structures, unlike the heuristic consolidation in MemGPT or Zep. The novelty lies in the explicit multi-cycle pressure accumulation path for Kuhnian ossification, which is not a design goal in the referenced systems.

Revision: yes

Circularity Check

0 steps flagged

No circularity: design proposal with stated goals, not a derivation reducing to inputs

The paper presents a conceptual design for companion knowledge systems, naming five operations (TRIAGE, DECAY, CONTEXTUALIZE, CONSOLIDATE, AUDIT) plus supporting mechanisms and stating that their combination should enable multi-cycle buffer pressure on entrenched interpretations. This is framed as a design principle and intended outcome rather than a mathematical derivation or fitted prediction from prior equations. No self-citations, uniqueness theorems, or ansatzes from the authors' prior work are invoked as load-bearing justifications in the abstract or described structure. The 'sharpest prediction' is explicitly the design's target behavior, not an independent result claimed to follow from external premises. Since the manuscript supplies no equations, parameter fits, or self-referential reductions that would make the central claim equivalent to its own inputs by construction, the proposal remains self-contained as a normative design sketch without circularity in its reasoning chain.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The design rests on several domain assumptions about LLM memory failure modes and introduces new conceptual entities without external evidence or prior literature grounding.

axioms (2)

domain assumption: Personal LLM memory is a companion system whose job is to mirror the user on operational dimensions and compensate on epistemic failure modes.
Stated as the core design principle in the abstract.
domain assumption: Entrenchment under user-coupled drift is the primary failure mode to address in single-user knowledge wikis.
Assumed without citation or data as the target problem.

invented entities (2)

memory gravity (no independent evidence)
purpose: Mechanism to support the five operations in maintaining the mirror-and-compensate split.
New term introduced to implement the proposed design.
minority-hypothesis retention (no independent evidence)
purpose: Mechanism to prevent suppression of contradicting evidence.
New term introduced to implement the proposed design.

pith-pipeline@v0.9.0 · 5602 in / 1583 out tokens · 42872 ms · 2026-05-10T15:42:42.042304+00:00 · methodology

Reference graph

Works this paper leans on

53 extracted references · 29 canonical work pages · 13 internal anchors

[1] Anderson, J. R., & Lebiere, C. (1998). The Atomic Components of Thought. Lawrence Erlbaum Associates.
[2] Bonawitz, K., et al. (2019). Towards federated learning at scale: A system design. In Proceedings of MLSys 2019. arXiv:1902.01046.
[3] Brusilovsky, P. (2001). Adaptive hypermedia. User Modeling and User-Adapted Interaction, 11, 87–110.
[4] Chhikara, P., Khant, D., Aryan, S., Singh, T., & Yadav, D. (2025). Mem0: Building production-ready AI agents with scalable long-term memory. arXiv:2504.19413.
[5] Dewey, J. (1938). Logic: The Theory of Inquiry. Henry Holt and Company.
[6] Doyle, J. (1979). A truth maintenance system. Artificial Intelligence, 12(3), 231–272.
[7] Ebbinghaus, H. (1885). Über das Gedächtnis. Duncker & Humblot.
[8] Fang, J., Deng, X., Xu, H., Jiang, Z., Tang, Y., Xu, Z., Deng, S., Yao, Y., Wang, M., Qiao, S., Chen, H., & Zhang, N. (2026). LightMem: Lightweight and efficient memory-augmented generation. ICLR 2026. arXiv:2510.18866.
[9] Ford, N., Parsons, R., & Kua, P. (2017). Building Evolutionary Architectures: Support Constant Change. O’Reilly Media.
[10] Gärdenfors, P., & Makinson, D. (1988). Revisions of knowledge systems using epistemic entrenchment. In Proceedings TARK ’88, 83–95.
[11] Goel, R. (2026). LLM Wiki v2 [GitHub gist]. https://gist.github.com/rohitg00/2067ab416f7bbe447c1977edaaa681e2
[12] Hirsch, J. E. (2005). An index to quantify an individual’s scientific research output. PNAS, 102(46), 16569–16572.
[13] Hu, Y., Liu, S., Yue, Y., Zhang, G., et al. (2025). Memory in the Age of AI Agents. arXiv:2512.13564.
[14] Izacard, G., Lewis, P., Lomeli, M., Hosseini, L., Petroni, F., Schick, T., Dwivedi-Yu, J., Joulin, A., Riedel, S., & Grave, E. (2022). Few-shot learning with retrieval augmented language models. arXiv:2208.03299.
[15] James, W. (1907). Pragmatism: A New Name for Some Old Ways of Thinking. Longmans, Green, and Co.
[16] Jia, Z., Li, J., Kang, Y., Wang, Y., Wu, T., Wang, Q., Wang, X., Zhang, S., Shen, J., Li, Q., Qi, S., Liang, Y., He, D., Zheng, Z., & Zhu, S.-C. (2025). The AI Hippocampus: How far are we from human memory? TMLR. arXiv:2601.09113.
[17] Jovovich, M., & Sigman, B. (2026). MemPalace v3.0.0 [GitHub repository]. https://github.com/milla-jovovich/mempalace/releases/tag/v3.0.0
[18] Ju, H., Zhou, D., Blevins, A. S., Lydon-Staley, D. M., Kaplan, J., Tuma, J. R., & Bassett, D. S. (2020). The network structure of scientific revolutions. arXiv:2010.08381.
[19] Ju, H., Zhou, D., Blevins, A. S., Lydon-Staley, D. M., Kaplan, J., Tuma, J. R., & Bassett, D. S. (2022). Historical growth of concept networks in Wikipedia. Collective Intelligence, 1(2).
[20] Karpathy, A. (2026). LLM Wiki: A pattern for building personal knowledge bases using LLMs [GitHub gist]. https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
[21] Kuhn, T. S. (1962). The Structure of Scientific Revolutions. University of Chicago Press.
[22] Lee, I. Y., Yang, C., & Berg-Kirkpatrick, T. (2025). Optical Context Compression Is Just (Bad) Autoencoding. arXiv:2512.03643.
[23] Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., Küttler, H., Lewis, M., Yih, W., Rocktäschel, T., Riedel, S., & Kiela, D. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. NeurIPS, 33, 9459–9474.
[24] Li, Z., Xi, C., Li, C., Chen, D., Chen, B., Song, S., Niu, S., Wang, H., et al. (2025). MemOS: A Memory OS for AI System. arXiv:2507.03724.
[25] Liu, F., & Qiu, H. (2025). Context Cascade Compression: Exploring the Upper Limits of Text Compression. arXiv:2511.15244.
[26] Mani, I. (2001). Automatic Summarization. John Benjamins.
[27] McClelland, J. L., McNaughton, B. L., & O’Reilly, R. C. (1995). Why there are complementary learning systems in the hippocampus and neocortex. Psychological Review, 102(3), 419–457.
[28] Men, X., Xu, M., Zhang, Q., Wang, B., Lin, H., Lu, Y., Han, X., & Chen, W. (2024). ShortGPT: Layers in large language models are more redundant than you expect. arXiv:2403.03853.
[29] Nenkova, A., & McKeown, K. (2011). Automatic summarization. Foundations and Trends in Information Retrieval, 5(2–3), 103–233.
[30] Ong, K. T., Kim, N., Gwak, M., Chae, H., Kwon, T., Jo, Y., Hwang, S., Lee, D., & Yeo, J. (2025). Towards lifelong dialogue agents via timeline-based memory management. In Proceedings of NAACL 2025. arXiv:2406.10996.
[31] Packer, C., Wooders, S., Lin, K., Fang, V., Patil, S. G., Stoica, I., & Gonzalez, J. E. (2023). MemGPT: Towards LLMs as operating systems. arXiv:2310.08560.
[32] Page, L., Brin, S., Motwani, R., & Winograd, T. (1999). The PageRank citation ranking: Bringing order to the web. Stanford InfoLab.
[33] Park, J. S., O’Brien, J. C., Cai, C. J., Morris, M. R., Liang, P., & Bernstein, M. S. (2023). Generative agents: Interactive simulacra of human behavior. In Proceedings of UIST 2023. arXiv:2304.03442.
[34] Peirce, C. S. (1878). How to make our ideas clear. Popular Science Monthly, 12, 286–302.
[35] Planck, M. (1950). Scientific Autobiography and Other Papers. Williams & Norgate.
[36] Qian, C., Parisi, A., Bouleau, C., Tsai, V., Lebreton, M., & Dixon, L. (2025). To mask or to mirror: Human-AI alignment in collective reasoning. In Proceedings of EMNLP 2025. arXiv:2510.01924.
[37] Shi, W., Gao, M., Xu, Z., Feng, S., Xu, W., Shi, P., Zettlemoyer, L., & Tsvetkov, Y. (2024). LongMemEval: Benchmarking chat assistants on long-term interactive memory. arXiv:2410.10813.
[38] Khemani, S. (2025). Reverse-engineering ChatGPT’s memory architecture [community analysis; not official OpenAI documentation]. https://www.shloked.com/writing/chatgpt-memory-bitter-lesson (archived: https://web.archive.org/web/20260413152757/https://www.shloked.com/writing/chatgpt-memory-bitter-lesson)
[39] Tononi, G., & Cirelli, C. (2014). Sleep and the price of plasticity. Neuron, 81(1), 12–34.
[40] Tulving, E. (1972). Episodic and semantic memory. In Organization of Memory, Academic Press.
[41] Wang, S., Yu, E., Love, O., Zhang, T., Wong, T., Scargall, S., & Fan, C. (2026). MemMachine: A Ground-Truth-Preserving Memory System for Personalized AI Agents. arXiv:2604.04853.
[42] Wei, H., Sun, Y., & Li, Y. (2025). DeepSeek-OCR: Contexts Optical Compression. arXiv:2510.18234.
[43] Wei, J., Ying, X., Gao, T., Bao, F., Tao, F., & Shang, J. (2025). AI-native memory 2.0: Second Me. arXiv:2503.08102.
[44] Wu, Y., Liang, S., Zhang, C., Wang, Y., Zhang, Y., Guo, H., Tang, R., & Liu, Y. (2025). From Human Memory to AI Memory: A Survey on Memory Mechanisms in the Era of LLMs. arXiv:2504.15965.
[45] Wu, Z., & Gartner, G. (2026). Context Cartography: Toward Structured Governance of Contextual Space in Large Language Model Systems. arXiv:2603.20578.
[46] Xie, Y. (2026). Learning to forget: Sleep-inspired memory consolidation for resolving proactive interference in large language models. arXiv:2603.14517.
[47] Xu, W., Liang, Z., Mei, K., Gao, H., Tan, J., & Zhang, Y. (2025). A-MEM: Agentic Memory for LLM Agents. arXiv:2502.12110.
[48] You, Z., Yuan, J., & Cai, J. (2026). D-Mem: A dual-process memory system for LLM agents. arXiv:2603.18631.
[49] Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8(3), 338–353.
[50] Zep AI. (2025). Zep: A temporal knowledge graph architecture for agent memory. arXiv:2501.13956.
[51] Zhang, Z., Bo, X., Ma, C., Li, R., Chen, X., Dai, Q., Zhu, J., Dong, Z., & Wen, J.-R. (2024). A Survey on the Memory Mechanism of Large Language Model based Agents. arXiv:2404.13501.
[52] Zhong, W., Guo, L., Gao, Q., Ye, H., & Wang, Y. (2023). MemoryBank: Enhancing large language models with long-term memory. arXiv:2305.10250.
[53] Zhou, C., Chai, H., Chen, W., Guo, Z., Shan, R., Song, Y., Xu, T., Yang, Y., Yu, A., Zhang, W., Zheng, C., Zhu, J., Zheng, Z., Zhang, Z., Lou, X., Zhang, C., Fu, Z., Wang, J., Liu, W., Lin, J., & Zhang, W. (2026). Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering. arXiv:2604.08224. doi:10.5281/zenodo.19501651