To Know is to Construct: Schema-Constrained Generation for Agent Memory
Pith reviewed 2026-05-10 01:01 UTC · model grok-4.3
The pith
Agent memory improves when LLM generation is constrained to valid keys from a dynamic cognitive schema, rather than left to open-ended retrieval or free generation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SCG-MEM reformulates memory access as Schema-Constrained Generation. By maintaining a dynamic Cognitive Schema, the system strictly constrains LLM decoding to generate only valid memory entry keys. This provides a formal guarantee against structural hallucinations. Memory updates occur via assimilation of inputs into existing schemas and accommodation to expand schemas with novel concepts, while an Associative Graph enables multi-hop reasoning through activation propagation.
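As a concrete picture of the update rule, here is a minimal sketch in which an embedding-similarity threshold decides between assimilation and accommodation. The threshold tau, the embed callable, the cosine helper, and the schema layout are illustrative assumptions, not details taken from the paper.

```python
import math
from dataclasses import dataclass, field

def cosine(a, b):
    # Plain cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

@dataclass
class CognitiveSchema:
    entries: dict = field(default_factory=dict)     # key -> grounded inputs
    embeddings: dict = field(default_factory=dict)  # key -> representative vector

    def update(self, candidate_key, content, embed, tau=0.8):
        vec = embed(content)
        best = max(self.embeddings,
                   key=lambda k: cosine(vec, self.embeddings[k]), default=None)
        if best is not None and cosine(vec, self.embeddings[best]) >= tau:
            # Assimilation: the input fits an existing schema entry.
            self.entries[best].append(content)
            return best
        # Accommodation: a novel concept expands the schema with a new key.
        self.entries[candidate_key] = [content]
        self.embeddings[candidate_key] = vec
        return candidate_key

# Toy usage with a stub embedding (illustrative only).
schema = CognitiveSchema()
schema.update("user:birthday", "My birthday is in May",
              embed=lambda t: [float(len(t)), 1.0])
```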
What carries the argument
Schema-Constrained Generation using a dynamic Cognitive Schema that restricts decoding to only valid existing memory entry keys.
If this is right
- Constrained decoding eliminates structural hallucinations by construction because only existing keys can be produced.
- Memory stays consistent over long interactions through assimilation of familiar inputs and accommodation to add new schema elements.
- The Associative Graph allows activation to spread across related entries for multi-hop inference without separate retrieval steps (a minimal sketch follows this list).
- Performance gains appear across all task categories on the LoCoMo long-context memory benchmark compared with retrieval baselines.
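The graph point above can be pictured as a few lines of spreading activation. The decay factor, hop count, and edge-weight encoding below are illustrative assumptions; the review does not specify the paper's exact propagation rule.

```python
from collections import defaultdict

def spread_activation(graph, seeds, hops=2, decay=0.5):
    # graph: {key: [(neighbor_key, edge_weight), ...]}
    # seeds: {key: initial_activation}; returns accumulated activation per key.
    activation = defaultdict(float, seeds)
    frontier = dict(seeds)
    for _ in range(hops):
        nxt = defaultdict(float)
        for key, act in frontier.items():
            for nbr, weight in graph.get(key, []):
                nxt[nbr] += act * weight * decay
        for k, v in nxt.items():
            activation[k] += v
        frontier = nxt
    return dict(activation)

# Entries activated by the generated key pull in related entries, giving
# multi-hop recall without a separate retrieval pass (hypothetical keys).
graph = {"trip:paris_2024": [("user:favorite_food", 0.7)],
         "user:favorite_food": [("user:allergy", 0.9)]}
print(spread_activation(graph, {"trip:paris_2024": 1.0}))
```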
Where Pith is reading between the lines
- The same constrained-generation idea could be applied to other agent outputs such as plan steps or tool calls to reduce invalid actions.
- This memory design may scale to agents that maintain separate schemas for different domains without requiring full retraining.
- If schemas can be learned from scratch rather than initialized, the method might extend to open-ended lifelong learning settings.
Load-bearing premise
Memory recall works best as a generative process bounded by the agent's maintained cognitive schemas, rather than as free retrieval or unconstrained generation.
What would settle it
An experiment in which an unconstrained generative memory model or a pure retrieval model achieves higher accuracy and fewer lookup failures than SCG-MEM on the LoCoMo benchmark.
Original abstract
Constructivist epistemology argues that knowledge is actively constructed rather than passively copied. Despite the generative nature of Large Language Models (LLMs), most existing agent memory systems are still based on dense retrieval. However, dense retrieval heavily relies on semantic overlap or entity matching within sentences. Consequently, embeddings often fail to distinguish instances that are semantically similar but contextually distinct, introducing substantial noise by retrieving context-mismatched entries. Conversely, directly employing open-ended generation for memory access risks "Structural Hallucination" where the model generates memory keys that do not exist in the memory, leading to lookup failures. Inspired by this epistemology, we posit that memory is fundamentally organized by cognitive schemas, and valid recall must be a generative process performed within these schematic structures. To realize this, we propose SCG-MEM, a schema-constrained generative memory architecture. SCG-MEM reformulates memory access as Schema-Constrained Generation. By maintaining a dynamic Cognitive Schema, we strictly constrain LLM decoding to generate only valid memory entry keys, providing a formal guarantee against structural hallucinations. To support long-term adaptation, we model memory updates via assimilation (grounding inputs into existing schemas) and accommodation (expanding schemas with novel concepts). Furthermore, we construct an Associative Graph to enable multi-hop reasoning through activation propagation. Experiments on the LoCoMo benchmark show that SCG-MEM substantially improves performance across all categories over retrieval-based baselines.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes SCG-MEM, a schema-constrained generative memory architecture for LLM agents. Drawing from constructivist epistemology, it argues that memory is organized by cognitive schemas and that recall should be a generative process within those structures. SCG-MEM maintains a dynamic Cognitive Schema to strictly constrain LLM decoding to valid memory entry keys (providing a claimed formal guarantee against structural hallucinations), handles updates via assimilation and accommodation, and uses an Associative Graph for multi-hop reasoning via activation propagation. Experiments on the LoCoMo benchmark are reported to show substantial improvements over retrieval-based baselines across all categories.
Significance. If the decoding constraint is implemented as a hard, enforceable mechanism and the empirical gains are shown to be robust with proper controls, this work could offer a principled shift from dense retrieval to constrained generative memory in agents, potentially reducing noise from semantic mismatches while supporting long-term adaptation. The constructivist framing and combination of schema constraints with graph-based propagation represent a novel direction worth exploring in agent memory design.
major comments (2)
- [Abstract and §3] Abstract and §3 (Schema-Constrained Generation): The central claim that the dynamic Cognitive Schema 'strictly constrain[s] LLM decoding to generate only valid memory entry keys, providing a formal guarantee against structural hallucinations' is load-bearing but unsupported by any description of the enforcement technique. If the implementation relies on prompting, fine-tuning, or post-hoc filtering rather than a hard restriction (e.g., logit masking over a closed key set, trie-based prefix filtering, or CFG-guided generation), the guarantee reduces to a statistical claim and structural hallucinations remain possible.
- [§4] §4 (Experiments): The manuscript asserts that SCG-MEM 'substantially improves performance across all categories' on LoCoMo over retrieval-based baselines, yet supplies no quantitative metrics, baseline descriptions, implementation details, ablation studies, or statistical tests. Without these, it is impossible to verify whether any gains follow from the schema constraints or from unstated factors such as prompt engineering or model scale.
minor comments (2)
- [§2 and §3] The definitions of 'Cognitive Schema' and 'Associative Graph' are introduced conceptually but lack formal specifications, update rules, or pseudocode, which would aid reproducibility.
- [§3] Notation for assimilation, accommodation, and activation propagation could be made more precise with equations or algorithms to distinguish the approach from standard retrieval or prompting methods.
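To make the minor comments concrete: one hypothetical formalization, not taken from the paper, writes assimilation/accommodation as a thresholded nearest-schema rule and activation propagation as repeated multiplication by the weighted adjacency matrix.

```latex
% Hypothetical notation, for illustration only.
% e(x): input embedding; S: current schema keys; \tau: novelty threshold.
s^{\ast} = \arg\max_{s \in S} \operatorname{sim}\big(e(x), e(s)\big), \quad
\begin{cases}
\text{assimilate } x \text{ into } s^{\ast} & \text{if } \operatorname{sim}\big(e(x), e(s^{\ast})\big) \ge \tau, \\
S \leftarrow S \cup \{s_{\mathrm{new}}\} \ \text{(accommodate)} & \text{otherwise.}
\end{cases}

% Activation propagation over the Associative Graph, with weighted
% adjacency matrix A and decay \gamma, seeded on the generated key:
a^{(t+1)} = \gamma \, A^{\top} a^{(t)}, \qquad
a^{(0)} = \mathbf{1}_{\{\text{generated key}\}}.
```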
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and will revise the manuscript to provide the requested clarifications and details.
Point-by-point responses
Referee: [Abstract and §3] Abstract and §3 (Schema-Constrained Generation): The central claim that the dynamic Cognitive Schema 'strictly constrain[s] LLM decoding to generate only valid memory entry keys, providing a formal guarantee against structural hallucinations' is load-bearing but unsupported by any description of the enforcement technique. If the implementation relies on prompting, fine-tuning, or post-hoc filtering rather than a hard restriction (e.g., logit masking over a closed key set, trie-based prefix filtering, or CFG-guided generation), the guarantee reduces to a statistical claim and structural hallucinations remain possible.
Authors: We acknowledge that the description of the enforcement mechanism in the current manuscript is insufficient to fully substantiate the claim of a formal guarantee. In the revised version, we will expand §3 to detail the implementation: the Cognitive Schema is represented as a dynamic trie of valid memory entry keys, and decoding is constrained via logit masking at each step to permit only tokens that prefix a valid key in the trie. This is a hard, enforceable restriction rather than prompting or post-filtering. We will include pseudocode and a formal description of the masking procedure to clarify this point. Revision: yes.
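To illustrate how such a hard restriction can be realized, here is a minimal sketch using the prefix_allowed_tokens_fn hook in Hugging Face Transformers, which masks logits so only trie-consistent tokens survive each step. The model choice (gpt2), the example keys, the prompt, and the EOS-termination convention are illustrative assumptions, not the authors' implementation.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical memory-entry keys standing in for the Cognitive Schema.
valid_keys = ["user:birthday", "user:favorite_food", "trip:paris_2024"]

# Token-level trie: each root-to-leaf path spells one valid key.
trie = {}
for key in valid_keys:
    node = trie
    for tid in tok.encode(key):
        node = node.setdefault(tid, {})
    node[tok.eos_token_id] = {}  # allow termination once a key is complete

prompt = "Memory key for the user's birthday question: "
prompt_len = len(tok.encode(prompt))

def allowed_tokens(batch_id, input_ids):
    # Walk the trie along the tokens generated so far; only continuations
    # that prefix some valid key survive, so no invalid key can be emitted.
    node = trie
    for tid in input_ids[prompt_len:].tolist():
        node = node.get(tid, {})
    return list(node.keys()) or [tok.eos_token_id]

out = model.generate(
    **tok(prompt, return_tensors="pt"),
    prefix_allowed_tokens_fn=allowed_tokens,
    max_new_tokens=16,
    pad_token_id=tok.eos_token_id,
)
print(tok.decode(out[0][prompt_len:], skip_special_tokens=True))
```

Because the hook is applied as a logits processor, greedy, sampling, and beam decoding all pass through the same mask, so only schema keys can be produced by construction.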
Referee: [§4] §4 (Experiments): The manuscript asserts that SCG-MEM 'substantially improves performance across all categories' on LoCoMo over retrieval-based baselines, yet supplies no quantitative metrics, baseline descriptions, implementation details, ablation studies, or statistical tests. Without these, it is impossible to verify whether any gains follow from the schema constraints or from unstated factors such as prompt engineering or model scale.
Authors: We agree that the experimental reporting in the current draft is incomplete. The revised manuscript will include a full experimental section with quantitative metrics on the LoCoMo benchmark (exact scores per category), detailed descriptions of the retrieval baselines and their implementations, ablation studies on the schema constraint and associative graph components, implementation details (LLM backbone, hyperparameters, and computational setup), and statistical significance tests. These additions will enable verification of the source of the reported gains. Revision: yes.
Circularity Check
No circularity: claims rest on an external benchmark and an explicit architectural design rather than on self-referential fits or definitions.
Full rationale
The paper's central claims are (1) an architectural proposal that reformulates memory access as schema-constrained generation and (2) an empirical performance lift on the external LoCoMo benchmark. The 'formal guarantee' is presented as a direct consequence of maintaining a dynamic schema that enumerates only valid keys and then restricting decoding to that set; this is definitional of the method rather than a derived prediction that collapses back to fitted inputs. No equations are shown that equate a model output to a parameter fitted on the same data, no self-citation is invoked as a uniqueness theorem, and the assimilation/accommodation and associative-graph components are described as additional mechanisms rather than tautological restatements. The empirical claims are therefore checked against an external benchmark rather than against the method's own definitions.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: knowledge is actively constructed rather than passively copied (constructivist epistemology).
invented entities (2)
- Cognitive Schema: no independent evidence
- Associative Graph: no independent evidence
Reference graph
Works this paper leans on
- [1] Jianlv Chen, Shitao Xiao, Peitian Zhang, Kun Luo, Defu Lian, and Zheng Liu. BGE M3-Embedding: Multi-lingual, multi-functionality, multi-granularity text embeddings through self-knowledge distillation. arXiv preprint arXiv:2402.03216, 2024.
- [2] Darren Edge, Ha Trinh, Newman Cheng, Joshua Bradley, Alex Chao, Apurva Mody, Steven Truitt, Dasha Metropolitansky, Robert Osazuwa Ness, and Jonathan Larson. From local to global: A Graph RAG approach to query-focused summarization. arXiv preprint arXiv:2404.16130, 2024.
- [3] Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, et al. The Llama 3 herd of models. arXiv preprint arXiv:2407.21783, 2024.
- [4] Chris Hokamp and Qun Liu. Lexically constrained decoding for sequence generation using grid beam search. arXiv preprint arXiv:1704.07138, 2017.
- [5] Lei Huang, Weijiang Yu, Weitao Ma, Weihong Zhong, Zhangyin Feng, Haotian Wang, Qianglong Chen, Weihua Peng, Xiaocheng Feng, Bing Qin, et al. A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. ACM Transactions on Information Systems, 43(2):1–55, 2025.
- [6] Binyuan Hui, Jian Yang, Zeyu Cui, Jiaxi Yang, Dayiheng Liu, Lei Zhang, Tianyu Liu, Jiajun Zhang, Bowen Yu, Keming Lu, et al. Qwen2.5-Coder technical report. arXiv preprint arXiv:2409.12186, 2024.
- [7] Kuang-Huei Lee, Xinyun Chen, Hiroki Furuta, John Canny, and Ian Fischer. A human-inspired reading agent with gist memory of very long contexts. arXiv preprint arXiv:2402.09727, 2024.
- [8] Rui Li, Zeyu Zhang, Xiaohe Bo, Zihang Tian, Xu Chen, Quanyu Dai, Zhenhua Dong, and Ruiming Tang. CAM: A constructivist view of agentic memory for LLM-based reading comprehension. arXiv preprint arXiv:2510.05520, 2025.
- [9] Linhao Luo, Zicheng Zhao, Gholamreza Haffari, Yuan-Fang Li, Chen Gong, and Shirui Pan. Graph-constrained reasoning: Faithful reasoning on knowledge graphs with large language models. arXiv preprint arXiv:2410.13080, 2024.
- [10] Adyasha Maharana, Dong-Ho Lee, Sergey Tulyakov, Mohit Bansal, Francesco Barbieri, and Yuwei Fang. Evaluating very long-term conversational memory of LLM agents. arXiv preprint arXiv:2402.17753, 2024.
- [11] Charles Packer, Vivian Fang, Shishir G. Patil, Kevin Lin, Sarah Wooders, and Joseph E. Gonzalez. MemGPT: Towards LLMs as operating systems, 2023.
- [12] Jean Piaget. The Origins of Intelligence in Children. International Universities Press, 1952.
- [13] Jean Piaget. Genetic Epistemology. Columbia University Press, New York, 1970.
- [14] Gabriel Poesia, Oleksandr Polozov, Vu Le, Ashish Tiwari, Gustavo Soares, Christopher Meek, and Sumit Gulwani. Synchromesh: Reliable code generation from pre-trained language models. arXiv preprint arXiv:2201.11227, 2022.
- [15] Matt Post and David Vilar. Fast lexically constrained decoding with dynamic beam allocation for neural machine translation. arXiv preprint arXiv:1804.06609, 2018.
- [16] Alireza Rezazadeh, Zichao Li, Wei Wei, and Yujia Bao. From isolated conversations to hierarchical schemas: Dynamic tree memory representation for LLMs. arXiv preprint arXiv:2410.14052, 2024.
- [17] Parth Sarthi, Salman Abdullah, Aditi Tuli, Shubh Khanna, Anna Goldie, and Christopher D. Manning. RAPTOR: Recursive abstractive processing for tree-organized retrieval. In The Twelfth International Conference on Learning Representations, 2024.
- [18] Torsten Scholak, Nathan Schucher, and Dzmitry Bahdanau. PICARD: Parsing incrementally for constrained auto-regressive decoding from language models. arXiv preprint arXiv:2109.05093, 2021.
- [19] Liyan Xu, Zhenlin Su, Mo Yu, Jiangnan Li, Fandong Meng, and Jie Zhou. Dense retrievers can fail on simple queries: Revealing the granularity dilemma of embeddings. arXiv preprint arXiv:2506.08592, 2025.
- [20] Wujiang Xu, Zujie Liang, Kai Mei, Hang Gao, Juntao Tan, and Yongfeng Zhang. A-MEM: Agentic memory for LLM agents. arXiv preprint arXiv:2502.12110, 2025.
- [21] Wanjun Zhong, Lianghong Guo, Qiqi Gao, He Ye, and Yanlin Wang. MemoryBank: Enhancing large language models with long-term memory. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 19724–19731, 2024.