Learning How and What to Memorize: Cognition-Inspired Two-Stage Optimization for Evolving Memory
Pith reviewed 2026-05-09 19:07 UTC · model grok-4.3
The pith
Cognition-inspired two-stage optimization first learns a memory guideline, then a guideline-following update policy, for evolving LLM personalization.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce MemCoE, a cognition-inspired two-stage optimization framework that learns how memory should be organized and what information to update. In the first stage, Memory Guideline Induction optimizes a global guideline via contrastive feedback interpreted as textual gradients; in the second stage, Guideline-Aligned Memory Policy Optimization uses the induced guideline to define structured process rewards and performs multi-turn RL to learn a guideline-following memory evolution policy. We evaluate on three personalization memory benchmarks, covering explicit/implicit preference and different sizes and noise, and observe consistent improvements over strong baselines with favorable robustness, transferability, and efficiency.
What carries the argument
MemCoE's two-stage framework: Memory Guideline Induction, which derives a global memory organization rule from contrastive textual feedback, followed by Guideline-Aligned Memory Policy Optimization, which converts the guideline into process rewards to guide RL-based learning of memory update actions.
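The two-stage structure can be sketched in code. This is a hypothetical, heavily simplified illustration inferred from the abstract: the function names, the string-based "textual gradient," and the keyword-overlap reward are all assumptions standing in for LLM calls the paper would actually use, not the authors' implementation.

```python
# Hypothetical sketch of MemCoE's two stages. All names and the toy
# reward proxy are assumptions; real systems would query an LLM here.

def textual_gradient(guideline: str, good: str, bad: str) -> str:
    """Stage 1 stub: turn one contrastive pair into a textual critique.
    A real system would prompt an LLM; we return a fixed edit hint."""
    return f"Prefer updates like {good!r} over {bad!r}."

def induce_guideline(pairs, init="Store stable user preferences."):
    """Memory Guideline Induction: fold contrastive feedback into a
    single global guideline string."""
    guideline = init
    for good, bad in pairs:
        guideline += " " + textual_gradient(guideline, good, bad)
    return guideline

def process_reward(guideline: str, action: str) -> float:
    """Stage 2 stub: dense reward for how well a memory-update action
    follows the frozen guideline (word overlap as a toy proxy)."""
    keys = set(guideline.lower().split())
    toks = set(action.lower().split())
    return len(keys & toks) / max(len(toks), 1)

# One-way flow: the guideline is induced once, then frozen for RL.
pairs = [("keep the user's dietary preference", "drop the preference")]
g = induce_guideline(pairs)
r = process_reward(g, "keep the user's dietary preference in memory")
```

The key structural point is that `induce_guideline` runs to completion before any reward is computed, so the RL stage only ever reads the guideline.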
Load-bearing premise
That contrastive feedback interpreted as textual gradients can reliably induce an optimal global memory guideline, and that this guideline then supplies sufficiently informative process rewards to stabilize multi-turn RL for the memory evolution policy.
What would settle it
An ablation experiment in which the guideline induction stage is removed or replaced with random guidelines, resulting in no performance gain or greater instability during RL training on the same benchmarks, would falsify the claim that the two-stage separation is necessary for the observed improvements.
Original abstract
Large language model (LLM) agents require long-term user memory for consistent personalization, but limited context windows hinder tracking evolving preferences over long interactions. Existing memory systems mainly rely on static, hand-crafted update rules; although reinforcement learning (RL)-based agents learn memory updates, sparse outcome rewards provide weak supervision, resulting in unstable long-horizon optimization. Drawing on memory schema theory and the functional division between prefrontal regions and hippocampus regions, we introduce MemCoE, a cognition-inspired two-stage optimization framework that learns how memory should be organized and what information to update. In the first stage, we propose Memory Guideline Induction to optimize a global guideline via contrastive feedback interpreted as textual gradients; in the second stage, Guideline-Aligned Memory Policy Optimization uses the induced guideline to define structured process rewards and performs multi-turn RL to learn a guideline-following memory evolution policy. We evaluate on three personalization memory benchmarks, covering explicit/implicit preference and different sizes and noise, and observe consistent improvements over strong baselines with favorable robustness, transferability, and efficiency.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes MemCoE, a cognition-inspired two-stage optimization framework for evolving memory in LLM agents. Stage 1 induces a global memory guideline from contrastive feedback interpreted as textual gradients. Stage 2 uses the guideline to supply structured process rewards for multi-turn RL, training a guideline-following memory evolution policy. The approach is evaluated on three personalization memory benchmarks covering explicit/implicit preferences and varying sizes/noise levels, with claims of consistent improvements over strong baselines along with favorable robustness, transferability, and efficiency.
Significance. If the two-stage separation reliably converts contrastive feedback into stable, dense process rewards that resolve the sparse-outcome instability identified in prior RL memory work, the framework could advance long-horizon memory management for personalized LLM agents by providing both interpretability (via the explicit guideline) and empirical gains. The cognitive analogy and explicit decoupling of guideline induction from policy optimization are conceptually attractive strengths.
major comments (3)
- [Abstract] Abstract: the central claims of 'consistent improvements over strong baselines with favorable robustness, transferability, and efficiency' are stated without any quantitative metrics, error bars, baseline descriptions, or ablation results, preventing assessment of effect sizes or whether the two-stage procedure actually outperforms direct RL or static rules.
- [Method] Description of the two-stage framework: the guideline produced by contrastive feedback in stage 1 is used directly to define the process rewards in stage 2, creating a circular dependency in which the RL optimization target is shaped by the same learned component without an independent external benchmark or validation that the guideline is optimal or general.
- [Experiments] Evaluation section: no ablation, convergence argument, or analysis is supplied demonstrating that the induced guideline measurably reduces reward sparsity or variance in the multi-turn RL stage, despite the abstract explicitly identifying sparse outcome rewards as the source of instability in prior work.
minor comments (1)
- The term 'textual gradients' is used without a formal definition or worked example showing how contrastive pairs are converted into an update rule for the guideline.
Simulated Author's Rebuttal
We thank the referee for the constructive comments and positive recognition of the conceptual contributions. We address each major comment point by point below, clarifying our approach where needed and committing to revisions that strengthen the presentation without altering the core claims.
read point-by-point responses
Referee: [Abstract] Abstract: the central claims of 'consistent improvements over strong baselines with favorable robustness, transferability, and efficiency' are stated without any quantitative metrics, error bars, baseline descriptions, or ablation results, preventing assessment of effect sizes or whether the two-stage procedure actually outperforms direct RL or static rules.
Authors: We agree that the abstract would benefit from greater specificity to convey effect sizes. In the revised version we will insert concise quantitative highlights drawn from the experimental results (e.g., average accuracy gains across the three benchmarks and reference to statistical robustness), while remaining within the abstract length constraint. Baseline families and the two-stage versus direct-RL comparison will be mentioned at a high level. revision: yes
Referee: [Method] Description of the two-stage framework: the guideline produced by contrastive feedback in stage 1 is used directly to define the process rewards in stage 2, creating a circular dependency in which the RL optimization target is shaped by the same learned component without an independent external benchmark or validation that the guideline is optimal or general.
Authors: The two stages are strictly sequential and non-circular. Stage 1 performs an independent optimization of the guideline using only contrastive textual feedback; once induced, the guideline is frozen and supplied as a fixed reward-shaping function to Stage 2. The RL policy is then optimized against this fixed external signal. We will add explicit wording and a diagram annotation in the method section to emphasize the one-way information flow and note that the guideline itself can be inspected or validated on held-out data independently of the RL stage. revision: partial
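The one-way information flow the rebuttal describes can be made concrete with a closure: once Stage 1 ends, the guideline is captured in a fixed reward function that Stage 2 can read but never modify. This is a minimal sketch under assumed names and a toy word-overlap reward, not the paper's interface.

```python
# Sketch of the non-circular flow claimed in the rebuttal: the induced
# guideline is frozen before RL begins. Names and reward are assumptions.

def freeze_reward_fn(guideline: str):
    """Capture the Stage-1 guideline in a closure. The RL stage receives
    only this callable and cannot rewrite the guideline text."""
    frozen = guideline  # Stage 2 only reads this value
    def reward(action: str) -> float:
        # toy proxy: fraction of guideline words the action mentions
        g = set(frozen.lower().split())
        a = set(action.lower().split())
        return len(g & a) / max(len(g), 1)
    return reward

reward_fn = freeze_reward_fn("retain stable preferences and discard noise")
r1 = reward_fn("retain stable preferences about music")  # guideline-aligned
r2 = reward_fn("store everything verbatim")              # guideline-violating
```

Because `frozen` is never reassigned, any circularity would have to enter through the Stage-1 data itself, which is the part the referee's comment asks to validate on held-out data.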
Referee: [Experiments] Evaluation section: no ablation, convergence argument, or analysis is supplied demonstrating that the induced guideline measurably reduces reward sparsity or variance in the multi-turn RL stage, despite the abstract explicitly identifying sparse outcome rewards as the source of instability in prior work.
Authors: We acknowledge that a direct quantitative demonstration of reduced reward sparsity or variance would strengthen the link to the motivating problem. Although overall performance gains are reported, we did not include a dedicated ablation isolating this mechanism. In the revision we will add a short analysis (main text or appendix) comparing reward variance and convergence behavior between the guideline-aligned RL and a direct-outcome-reward baseline, using the same multi-turn setup. revision: yes
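The variance comparison the authors commit to can be illustrated with a toy simulation: a sparse outcome signal concentrates all reward mass at the final turn, while a dense per-turn process signal spreads bounded reward across the episode. This is purely illustrative under assumed reward ranges, not the paper's experiment.

```python
# Toy illustration of why dense process rewards can have lower per-turn
# variance than sparse outcome rewards. Ranges are assumptions.
import random
import statistics

random.seed(0)
TURNS = 20

def sparse_outcome_rewards():
    """Outcome-only signal: zero every turn, +/-1 only at episode end."""
    return [0.0] * (TURNS - 1) + [random.choice([-1.0, 1.0])]

def dense_process_rewards():
    """Guideline-shaped signal: small bounded reward at every turn."""
    return [random.uniform(0.0, 0.2) for _ in range(TURNS)]

sparse_var = statistics.pvariance(sparse_outcome_rewards())
dense_var = statistics.pvariance(dense_process_rewards())
```

With rewards bounded in [0, 0.2], the dense signal's per-turn variance can never exceed 0.01, whereas the sparse signal's single terminal spike yields a variance near 0.05 here, which is the kind of gap the proposed analysis would measure on real training runs.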
Circularity Check
No significant circularity: two-stage framework validated on external benchmarks
full rationale
The paper proposes MemCoE as a two-stage process where stage 1 induces a global guideline from contrastive feedback and stage 2 uses it to shape process rewards for RL-based policy learning. No equations, definitions, or self-citations in the provided text reduce the final performance claims to the inputs by construction. The central results rest on empirical improvements over baselines across three independent personalization memory benchmarks with varying preference types, sizes, and noise levels, providing external falsifiability outside the fitted guideline itself.
Axiom & Free-Parameter Ledger
free parameters (2)
- guideline induction parameters
- RL process reward scaling
axioms (1)
- domain assumption: memory schema theory and the functional division between prefrontal and hippocampal regions
invented entities (1)
- MemCoE two-stage framework (no independent evidence)