To Know is to Construct: Schema-Constrained Generation for Agent Memory
Pith reviewed 2026-05-10 01:01 UTC · model grok-4.3
The pith
Agent memory improves when LLM generation is constrained to valid keys from a dynamic cognitive schema, rather than left to open-ended retrieval or free generation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SCG-MEM reformulates memory access as Schema-Constrained Generation. By maintaining a dynamic Cognitive Schema, the system strictly constrains LLM decoding to generate only valid memory entry keys. This provides a formal guarantee against structural hallucinations. Memory updates occur via assimilation of inputs into existing schemas and accommodation to expand schemas with novel concepts, while an Associative Graph enables multi-hop reasoning through activation propagation.
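As a concrete picture of the update rule, here is a minimal sketch in which an embedding-similarity threshold decides between assimilation and accommodation. The threshold tau, the embed callable, the cosine helper, and the schema layout are illustrative assumptions, not details taken from the paper.

```python
import math
from dataclasses import dataclass, field

def cosine(a, b):
    # Plain cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

@dataclass
class CognitiveSchema:
    entries: dict = field(default_factory=dict)     # key -> grounded inputs
    embeddings: dict = field(default_factory=dict)  # key -> representative vector

    def update(self, candidate_key, content, embed, tau=0.8):
        vec = embed(content)
        best = max(self.embeddings,
                   key=lambda k: cosine(vec, self.embeddings[k]), default=None)
        if best is not None and cosine(vec, self.embeddings[best]) >= tau:
            # Assimilation: the input fits an existing schema entry.
            self.entries[best].append(content)
            return best
        # Accommodation: a novel concept expands the schema with a new key.
        self.entries[candidate_key] = [content]
        self.embeddings[candidate_key] = vec
        return candidate_key

# Toy usage with a stub embedding (illustrative only).
schema = CognitiveSchema()
schema.update("user:birthday", "My birthday is in May",
              embed=lambda t: [float(len(t)), 1.0])
```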
What carries the argument
Schema-Constrained Generation using a dynamic Cognitive Schema that restricts decoding to only valid existing memory entry keys.
If this is right
- Constrained decoding eliminates structural hallucinations by construction because only existing keys can be produced.
- Memory stays consistent over long interactions through assimilation of familiar inputs and accommodation to add new schema elements.
- The Associative Graph allows activation to spread across related entries for multi-hop inference without separate retrieval steps (a minimal sketch follows this list).
- Performance gains appear across all task categories on the LoCoMo long-context memory benchmark compared with retrieval baselines.
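The graph point above can be pictured as a few lines of spreading activation. The decay factor, hop count, and edge-weight encoding below are illustrative assumptions; the review does not specify the paper's exact propagation rule.

```python
from collections import defaultdict

def spread_activation(graph, seeds, hops=2, decay=0.5):
    # graph: {key: [(neighbor_key, edge_weight), ...]}
    # seeds: {key: initial_activation}; returns accumulated activation per key.
    activation = defaultdict(float, seeds)
    frontier = dict(seeds)
    for _ in range(hops):
        nxt = defaultdict(float)
        for key, act in frontier.items():
            for nbr, weight in graph.get(key, []):
                nxt[nbr] += act * weight * decay
        for k, v in nxt.items():
            activation[k] += v
        frontier = nxt
    return dict(activation)

# Entries activated by the generated key pull in related entries, giving
# multi-hop recall without a separate retrieval pass (hypothetical keys).
graph = {"trip:paris_2024": [("user:favorite_food", 0.7)],
         "user:favorite_food": [("user:allergy", 0.9)]}
print(spread_activation(graph, {"trip:paris_2024": 1.0}))
```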
Where Pith is reading between the lines
- The same constrained-generation idea could be applied to other agent outputs such as plan steps or tool calls to reduce invalid actions.
- This memory design may scale to agents that maintain separate schemas for different domains without requiring full retraining.
- If schemas can be learned from scratch rather than initialized, the method might extend to open-ended lifelong learning settings.
Load-bearing premise
Memory recall works best as a generative process bounded by the agent's maintained cognitive schemas, rather than as free retrieval or unconstrained generation.
What would settle it
An experiment in which an unconstrained generative memory model or a pure retrieval model achieves higher accuracy and fewer lookup failures than SCG-MEM on the LoCoMo benchmark.
Original abstract
Constructivist epistemology argues that knowledge is actively constructed rather than passively copied. Despite the generative nature of Large Language Models (LLMs), most existing agent memory systems are still based on dense retrieval. However, dense retrieval heavily relies on semantic overlap or entity matching within sentences. Consequently, embeddings often fail to distinguish instances that are semantically similar but contextually distinct, introducing substantial noise by retrieving context-mismatched entries. Conversely, directly employing open-ended generation for memory access risks "Structural Hallucination" where the model generates memory keys that do not exist in the memory, leading to lookup failures. Inspired by this epistemology, we posit that memory is fundamentally organized by cognitive schemas, and valid recall must be a generative process performed within these schematic structures. To realize this, we propose SCG-MEM, a schema-constrained generative memory architecture. SCG-MEM reformulates memory access as Schema-Constrained Generation. By maintaining a dynamic Cognitive Schema, we strictly constrain LLM decoding to generate only valid memory entry keys, providing a formal guarantee against structural hallucinations. To support long-term adaptation, we model memory updates via assimilation (grounding inputs into existing schemas) and accommodation (expanding schemas with novel concepts). Furthermore, we construct an Associative Graph to enable multi-hop reasoning through activation propagation. Experiments on the LoCoMo benchmark show that SCG-MEM substantially improves performance across all categories over retrieval-based baselines.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes SCG-MEM, a schema-constrained generative memory architecture for LLM agents. Drawing from constructivist epistemology, it argues that memory is organized by cognitive schemas and that recall should be a generative process within those structures. SCG-MEM maintains a dynamic Cognitive Schema to strictly constrain LLM decoding to valid memory entry keys (providing a claimed formal guarantee against structural hallucinations), handles updates via assimilation and accommodation, and uses an Associative Graph for multi-hop reasoning via activation propagation. Experiments on the LoCoMo benchmark are reported to show substantial improvements over retrieval-based baselines across all categories.
Significance. If the decoding constraint is implemented as a hard, enforceable mechanism and the empirical gains are shown to be robust with proper controls, this work could offer a principled shift from dense retrieval to constrained generative memory in agents, potentially reducing noise from semantic mismatches while supporting long-term adaptation. The constructivist framing and combination of schema constraints with graph-based propagation represent a novel direction worth exploring in agent memory design.
major comments (2)
- [Abstract and §3] Abstract and §3 (Schema-Constrained Generation): The central claim that the dynamic Cognitive Schema 'strictly constrain[s] LLM decoding to generate only valid memory entry keys, providing a formal guarantee against structural hallucinations' is load-bearing but unsupported by any description of the enforcement technique. If the implementation relies on prompting, fine-tuning, or post-hoc filtering rather than a hard restriction (e.g., logit masking over a closed key set, trie-based prefix filtering, or CFG-guided generation), the guarantee reduces to a statistical claim and structural hallucinations remain possible.
- [§4] §4 (Experiments): The manuscript asserts that SCG-MEM 'substantially improves performance across all categories' on LoCoMo over retrieval-based baselines, yet supplies no quantitative metrics, baseline descriptions, implementation details, ablation studies, or statistical tests. Without these, it is impossible to verify whether any gains follow from the schema constraints or from unstated factors such as prompt engineering or model scale.
minor comments (2)
- [§2 and §3] The definitions of 'Cognitive Schema' and 'Associative Graph' are introduced conceptually but lack formal specifications, update rules, or pseudocode, which would aid reproducibility.
- [§3] Notation for assimilation, accommodation, and activation propagation could be made more precise with equations or algorithms to distinguish the approach from standard retrieval or prompting methods.
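To make the minor comments concrete: one hypothetical formalization, not taken from the paper, writes assimilation/accommodation as a thresholded nearest-schema rule and activation propagation as repeated multiplication by the weighted adjacency matrix.

```latex
% Hypothetical notation, for illustration only.
% e(x): input embedding; S: current schema keys; \tau: novelty threshold.
s^{\ast} = \arg\max_{s \in S} \operatorname{sim}\big(e(x), e(s)\big), \quad
\begin{cases}
\text{assimilate } x \text{ into } s^{\ast} & \text{if } \operatorname{sim}\big(e(x), e(s^{\ast})\big) \ge \tau, \\
S \leftarrow S \cup \{s_{\mathrm{new}}\} \ \text{(accommodate)} & \text{otherwise.}
\end{cases}

% Activation propagation over the Associative Graph, with weighted
% adjacency matrix A and decay \gamma, seeded on the generated key:
a^{(t+1)} = \gamma \, A^{\top} a^{(t)}, \qquad
a^{(0)} = \mathbf{1}_{\{\text{generated key}\}}.
```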
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and will revise the manuscript to provide the requested clarifications and details.
Point-by-point responses
Referee: [Abstract and §3] Abstract and §3 (Schema-Constrained Generation): The central claim that the dynamic Cognitive Schema 'strictly constrain[s] LLM decoding to generate only valid memory entry keys, providing a formal guarantee against structural hallucinations' is load-bearing but unsupported by any description of the enforcement technique. If the implementation relies on prompting, fine-tuning, or post-hoc filtering rather than a hard restriction (e.g., logit masking over a closed key set, trie-based prefix filtering, or CFG-guided generation), the guarantee reduces to a statistical claim and structural hallucinations remain possible.
Authors: We acknowledge that the description of the enforcement mechanism in the current manuscript is insufficient to fully substantiate the claim of a formal guarantee. In the revised version, we will expand §3 to detail the implementation: the Cognitive Schema is represented as a dynamic trie of valid memory entry keys, and decoding is constrained via logit masking at each step to permit only tokens that prefix a valid key in the trie. This is a hard, enforceable restriction rather than prompting or post-filtering. We will include pseudocode and a formal description of the masking procedure to clarify this point. Revision: yes.
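To illustrate how such a hard restriction can be realized, here is a minimal sketch using the prefix_allowed_tokens_fn hook in Hugging Face Transformers, which masks logits so only trie-consistent tokens survive each step. The model choice (gpt2), the example keys, the prompt, and the EOS-termination convention are illustrative assumptions, not the authors' implementation.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical memory-entry keys standing in for the Cognitive Schema.
valid_keys = ["user:birthday", "user:favorite_food", "trip:paris_2024"]

# Token-level trie: each root-to-leaf path spells one valid key.
trie = {}
for key in valid_keys:
    node = trie
    for tid in tok.encode(key):
        node = node.setdefault(tid, {})
    node[tok.eos_token_id] = {}  # allow termination once a key is complete

prompt = "Memory key for the user's birthday question: "
prompt_len = len(tok.encode(prompt))

def allowed_tokens(batch_id, input_ids):
    # Walk the trie along the tokens generated so far; only continuations
    # that prefix some valid key survive, so no invalid key can be emitted.
    node = trie
    for tid in input_ids[prompt_len:].tolist():
        node = node.get(tid, {})
    return list(node.keys()) or [tok.eos_token_id]

out = model.generate(
    **tok(prompt, return_tensors="pt"),
    prefix_allowed_tokens_fn=allowed_tokens,
    max_new_tokens=16,
    pad_token_id=tok.eos_token_id,
)
print(tok.decode(out[0][prompt_len:], skip_special_tokens=True))
```

Because the hook is applied as a logits processor, greedy, sampling, and beam decoding all pass through the same mask, so only schema keys can be produced by construction.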
Referee: [§4] §4 (Experiments): The manuscript asserts that SCG-MEM 'substantially improves performance across all categories' on LoCoMo over retrieval-based baselines, yet supplies no quantitative metrics, baseline descriptions, implementation details, ablation studies, or statistical tests. Without these, it is impossible to verify whether any gains follow from the schema constraints or from unstated factors such as prompt engineering or model scale.
Authors: We agree that the experimental reporting in the current draft is incomplete. The revised manuscript will include a full experimental section with quantitative metrics on the LoCoMo benchmark (exact scores per category), detailed descriptions of the retrieval baselines and their implementations, ablation studies on the schema constraint and associative graph components, implementation details (LLM backbone, hyperparameters, and computational setup), and statistical significance tests. These additions will enable verification of the source of the reported gains. Revision: yes.
Circularity Check
No circularity: claims rest on an external benchmark and an explicit architectural design rather than on self-referential fits or definitions.
Full rationale
The paper's central claims are (1) an architectural proposal that reformulates memory access as schema-constrained generation and (2) an empirical performance lift on the external LoCoMo benchmark. The 'formal guarantee' is presented as a direct consequence of maintaining a dynamic schema that enumerates only valid keys and then restricting decoding to that set; this is definitional of the method rather than a derived prediction that collapses back to fitted inputs. No equations are shown that equate a model output to a parameter fitted on the same data, no self-citation is invoked as a uniqueness theorem, and the assimilation/accommodation and associative-graph components are described as additional mechanisms rather than tautological restatements. The empirical claims are therefore checked against an external benchmark rather than against the method's own definitions.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: knowledge is actively constructed rather than passively copied (constructivist epistemology).
invented entities (2)
- Cognitive Schema: no independent evidence
- Associative Graph: no independent evidence
Reference graph
Works this paper leans on
- [1] Jianlv Chen, Shitao Xiao, Peitian Zhang, Kun Luo, Defu Lian, and Zheng Liu. BGE M3-Embedding: Multi-lingual, multi-functionality, multi-granularity text embeddings through self-knowledge distillation. arXiv preprint arXiv:2402.03216, 2024.
- [2] Darren Edge, Ha Trinh, Newman Cheng, Joshua Bradley, Alex Chao, Apurva Mody, Steven Truitt, Dasha Metropolitansky, Robert Osazuwa Ness, and Jonathan Larson. From local to global: A Graph RAG approach to query-focused summarization. arXiv preprint arXiv:2404.16130, 2024.
- [3] Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, et al. The Llama 3 herd of models. arXiv preprint arXiv:2407.21783, 2024.
- [4] Chris Hokamp and Qun Liu. Lexically constrained decoding for sequence generation using grid beam search. arXiv preprint arXiv:1704.07138, 2017.
- [5] Lei Huang, Weijiang Yu, Weitao Ma, Weihong Zhong, Zhangyin Feng, Haotian Wang, Qianglong Chen, Weihua Peng, Xiaocheng Feng, Bing Qin, et al. A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. ACM Transactions on Information Systems, 43(2):1–55, 2025.
- [6] Binyuan Hui, Jian Yang, Zeyu Cui, Jiaxi Yang, Dayiheng Liu, Lei Zhang, Tianyu Liu, Jiajun Zhang, Bowen Yu, Keming Lu, et al. Qwen2.5-Coder technical report. arXiv preprint arXiv:2409.12186, 2024.
- [7] Kuang-Huei Lee, Xinyun Chen, Hiroki Furuta, John Canny, and Ian Fischer. A human-inspired reading agent with gist memory of very long contexts. arXiv preprint arXiv:2402.09727, 2024.
- [8] Rui Li, Zeyu Zhang, Xiaohe Bo, Zihang Tian, Xu Chen, Quanyu Dai, Zhenhua Dong, and Ruiming Tang. CAM: A constructivist view of agentic memory for LLM-based reading comprehension. arXiv preprint arXiv:2510.05520, 2025.
- [9] Linhao Luo, Zicheng Zhao, Gholamreza Haffari, Yuan-Fang Li, Chen Gong, and Shirui Pan. Graph-constrained reasoning: Faithful reasoning on knowledge graphs with large language models. arXiv preprint arXiv:2410.13080, 2024.
- [10] Adyasha Maharana, Dong-Ho Lee, Sergey Tulyakov, Mohit Bansal, Francesco Barbieri, and Yuwei Fang. Evaluating very long-term conversational memory of LLM agents. arXiv preprint arXiv:2402.17753, 2024.
- [11] Charles Packer, Vivian Fang, Shishir G. Patil, Kevin Lin, Sarah Wooders, and Joseph E. Gonzalez. MemGPT: Towards LLMs as operating systems, 2023.
- [12] Jean Piaget. The Origins of Intelligence in Children. International Universities Press, 1952.
- [13] Jean Piaget. Genetic Epistemology. Columbia University Press, New York, 1970.
- [14] Gabriel Poesia, Oleksandr Polozov, Vu Le, Ashish Tiwari, Gustavo Soares, Christopher Meek, and Sumit Gulwani. Synchromesh: Reliable code generation from pre-trained language models. arXiv preprint arXiv:2201.11227, 2022.
- [15] Matt Post and David Vilar. Fast lexically constrained decoding with dynamic beam allocation for neural machine translation. arXiv preprint arXiv:1804.06609, 2018.
- [16] Alireza Rezazadeh, Zichao Li, Wei Wei, and Yujia Bao. From isolated conversations to hierarchical schemas: Dynamic tree memory representation for LLMs. arXiv preprint arXiv:2410.14052, 2024.
- [17] Parth Sarthi, Salman Abdullah, Aditi Tuli, Shubh Khanna, Anna Goldie, and Christopher D. Manning. RAPTOR: Recursive abstractive processing for tree-organized retrieval. In The Twelfth International Conference on Learning Representations, 2024.
- [18] Torsten Scholak, Nathan Schucher, and Dzmitry Bahdanau. PICARD: Parsing incrementally for constrained auto-regressive decoding from language models. arXiv preprint arXiv:2109.05093, 2021.
- [19] Liyan Xu, Zhenlin Su, Mo Yu, Jiangnan Li, Fandong Meng, and Jie Zhou. Dense retrievers can fail on simple queries: Revealing the granularity dilemma of embeddings. arXiv preprint arXiv:2506.08592, 2025.
- [20] Wujiang Xu, Zujie Liang, Kai Mei, Hang Gao, Juntao Tan, and Yongfeng Zhang. A-MEM: Agentic memory for LLM agents. arXiv preprint arXiv:2502.12110, 2025.
- [21] Wanjun Zhong, Lianghong Guo, Qiqi Gao, He Ye, and Yanlin Wang. MemoryBank: Enhancing large language models with long-term memory. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 19724–19731, 2024.