MasFACT: Continual Multi-Agent Topology Learning via Geometry-Aware Posterior Transfer
Pith reviewed 2026-05-20 14:31 UTC · model grok-4.3
The pith
MasFACT transfers historical agent collaboration patterns as priors to prevent topology forgetting when multi-agent LLM systems face streams of evolving tasks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors claim that topology forgetting arises from cross-task misalignment in agent-level functional semantics and relational communication structures, and that this can be addressed by transferring historical topology priors across task-specific agent spaces through Fused Gromov-Wasserstein optimal transport followed by PAC-Bayes-guided conservative posterior adaptation that balances task-specific plasticity with structural stability.
What carries the argument
Fused Gromov-Wasserstein optimal transport that aligns and transfers topology priors between different task-specific agent spaces, combined with PAC-Bayes-guided conservative posterior adaptation to retain stability while allowing new-task learning.
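For orientation, the standard Fused Gromov-Wasserstein objective from the optimal-transport literature can be written as below. This is the textbook definition, not necessarily the exact cost used in the paper; the symbols are assumptions introduced here: a_i and b_j are agent feature vectors in the source and target tasks, C_1 and C_2 are the two tasks' relational (structure) matrices, h and g are node weight histograms, and α in [0, 1] trades the feature term against the structure term.

```latex
\mathrm{FGW}_{q,\alpha}(\mu,\nu)
  = \min_{\pi \in \Pi(h, g)}
    \sum_{i,j,k,l}
    \Big[ (1-\alpha)\, d(a_i, b_j)^{q}
        + \alpha\, \big| C_1(i,k) - C_2(j,l) \big|^{q} \Big]
    \, \pi_{ij}\, \pi_{kl}
```

As α approaches 0 the objective reduces to a Wasserstein cost on features alone, and as α approaches 1 it reduces to a pure Gromov-Wasserstein cost on structure, which is why the relative weighting matters for any claim about preserved communication structure.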
If this is right
- Average accuracy improves across class-, domain-, and task-level continual learning settings.
- Topology forgetting is reduced compared to strong topology generation and replay-based baselines.
- The method integrates directly with existing MAS topology generators without requiring changes to their core design.
Where Pith is reading between the lines
- The same prior-transfer idea could apply to continual learning in single-agent or non-LLM systems where structural knowledge must persist across task shifts.
- Longer task sequences might expose limits on how many historical priors can be maintained before adaptation costs rise.
- Measuring the geometric distance between task agent spaces before and after transfer could serve as a diagnostic for when the method succeeds or fails.
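The diagnostic in the last bullet could be prototyped roughly as follows. This is a minimal sketch, assuming squared loss and a coupling that has already been computed elsewhere; it only evaluates the fused cost of a given coupling and does not solve the transport problem. The names `fgw_cost`, `M`, `C1`, `C2`, `T`, and `alpha` are hypothetical and do not come from the paper.

```python
import numpy as np


def fgw_cost(M, C1, C2, T, alpha=0.5):
    """Fused Gromov-Wasserstein cost of a GIVEN coupling T (squared loss).

    M  : (n, m) feature cost between source and target agents (assumed input)
    C1 : (n, n) source structure matrix, e.g. adjacency (assumed input)
    C2 : (m, m) target structure matrix (assumed input)
    T  : (n, m) coupling between the two agent sets (assumed input)
    """
    M, C1, C2, T = (np.asarray(x, dtype=float) for x in (M, C1, C2, T))
    # Feature term: expected feature cost under the coupling.
    feature_term = float(np.sum(M * T))
    # diff[i, j, k, l] = C1[i, k] - C2[j, l]
    diff = C1[:, None, :, None] - C2[None, :, None, :]
    # Structure term: expected squared relational distortion under T x T.
    structure_term = float(np.einsum("ijkl,ij,kl->", diff ** 2, T, T))
    return (1.0 - alpha) * feature_term + alpha * structure_term


# Toy two-agent example with made-up values.
M_toy = np.array([[0.0, 1.0], [1.0, 0.0]])
C_toy = np.array([[0.0, 1.0], [1.0, 0.0]])
T_identity = np.diag([0.5, 0.5])
cost_identity = fgw_cost(M_toy, C_toy, C_toy, T_identity)
```

Tracking this value before and after transfer, with the coupling held fixed, is one way to see whether the feature term or the structure term drives a change.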
Load-bearing premise
The approach assumes that useful past collaboration structures can be aligned and transferred as priors to new tasks despite shifts in agent functions and communication relations.
What would settle it
A direct comparison on a multi-task sequence where MasFACT is applied versus a standard topology generator, checking whether accuracy on the first task remains higher or forgetting metrics are lower; if no improvement appears, the central claim would be falsified.
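The comparison described above can be scored with the usual continual-learning bookkeeping. The sketch below assumes an accuracy matrix `R` in which `R[t][i]` is the accuracy on task `i` after training through task `t`; the definitions of average accuracy and forgetting are the common ones from the continual-learning literature and may differ from the paper's exact metrics. The function names and the numbers are hypothetical.

```python
import numpy as np


def average_accuracy(R):
    """Mean accuracy over all tasks after the final task has been learned."""
    R = np.asarray(R, dtype=float)
    return float(R[-1].mean())


def average_forgetting(R):
    """Mean drop from each earlier task's best past accuracy to its final accuracy."""
    R = np.asarray(R, dtype=float)
    num_tasks = R.shape[0]
    if num_tasks < 2:
        return 0.0  # nothing can be forgotten with a single task
    drops = [R[i:num_tasks - 1, i].max() - R[-1, i] for i in range(num_tasks - 1)]
    return float(np.mean(drops))


# Made-up three-task run; entries above the diagonal are unused placeholders.
R_example = [
    [0.80, 0.00, 0.00],
    [0.70, 0.75, 0.00],
    [0.60, 0.70, 0.90],
]
acc = average_accuracy(R_example)
forgetting = average_forgetting(R_example)
```

Running the same bookkeeping for MasFACT and for the plain topology generator on one task sequence gives the two numbers the falsification test calls for.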
Original abstract
Multi-agent systems (MAS) powered by large language models (LLMs) have emerged as a powerful paradigm for complex problem solving, where performance critically depends on the underlying inter-agent communication topology. However, existing topology generation methods mainly optimize for isolated tasks, while real-world deployments involve streams of evolving tasks, requiring previously effective collaboration patterns to be retained and reused rather than rediscovered or overwritten. We identify a previously underexplored failure mode, topology forgetting, in which adapting to new tasks shifts the topology generator away from communication structures required by earlier tasks. This issue stems from cross-task misalignment in both agent-level functional semantics and relational communication structures. To address this challenge, we propose MasFACT, a geometry-aware posterior transfer framework that preserves and reuses historical collaboration knowledge as transferable topology priors. We transfer these priors across task-specific agent spaces through Fused Gromov-Wasserstein optimal transport and perform PAC-Bayes-guided conservative posterior adaptation to balance task-specific plasticity with structural stability. Experiments across class-, domain-, and task-level continual settings demonstrate that MasFACT consistently improves average accuracy while reducing topology forgetting compared to strong topology generation and replay-based baselines, and can be seamlessly integrated with different MAS topology generators.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that existing MAS topology generators suffer from 'topology forgetting' when adapting to streams of evolving tasks due to cross-task misalignment in agent functional semantics and relational structures. It proposes MasFACT, which transfers historical collaboration priors across task-specific agent spaces via Fused Gromov-Wasserstein optimal transport and applies PAC-Bayes-guided conservative posterior adaptation to balance plasticity and stability. Experiments across class-, domain-, and task-level continual settings reportedly show consistent gains in average accuracy, reduced topology forgetting versus topology generation and replay baselines, and seamless integration with multiple MAS topology generators.
Significance. If the empirical claims and the structural-preservation properties of the OT transfer hold, the work would be moderately significant for continual learning in LLM-based multi-agent systems, as it targets an underexplored failure mode and reuses established optimal-transport and PAC-Bayes machinery in a geometry-aware posterior-transfer setting. The reported compatibility with arbitrary topology generators is a practical strength.
Major comments (2)
- [§3] §3 (Fused Gromov-Wasserstein transfer): The central claim that the fused OT cost (node features + edge relations) produces a transport plan preserving relational communication structures required by prior tasks is load-bearing for the reduction in topology forgetting. The manuscript does not provide a concrete verification (e.g., a structure-preservation metric or ablation on the relative weighting of feature vs. relational terms in the fused distance) that the alignment does not distort collaboration patterns when functional-semantic embeddings dominate. This directly addresses the skeptic's concern and must be strengthened for the retention argument to be convincing.
- [§5] §5 (Experiments): The reported improvements in average accuracy and topology-forgetting reduction are presented without error bars, statistical significance tests, or explicit rules for data exclusion and hyper-parameter selection across the class-/domain-/task-level settings. Because the soundness assessment is currently low, these details are required to substantiate that gains arise from structural retention rather than increased plasticity alone.
Minor comments (1)
- [§3] Notation for the PAC-Bayes posterior adaptation and the precise definition of the fused Gromov-Wasserstein cost should be introduced earlier and used consistently to improve readability.
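One form the structure-preservation metric requested in the first major comment could take is an edge-retention score. The sketch below is an illustration under stated assumptions, not a metric from the paper: each source agent is hard-assigned to the target agent that receives most of its transport mass, and the score is the fraction of source communication edges whose images are also edges in the target topology. The names `edge_retention`, `A_src`, `A_tgt`, and `T` are hypothetical.

```python
import numpy as np


def edge_retention(A_src, A_tgt, T):
    """Fraction of source edges that land on target edges under a hard matching.

    A_src : (n, n) source adjacency matrix, nonzero means an edge (assumed input)
    A_tgt : (m, m) target adjacency matrix (assumed input)
    T     : (n, m) transport plan between the agent sets (assumed input)
    """
    A_src = np.asarray(A_src)
    A_tgt = np.asarray(A_tgt)
    # Hard assignment: each source agent goes to its highest-mass target agent.
    match = np.argmax(np.asarray(T, dtype=float), axis=1)
    src_i, src_k = np.nonzero(A_src)
    if src_i.size == 0:
        return 1.0  # no source edges, so nothing can be lost
    kept = A_tgt[match[src_i], match[src_k]] != 0
    return float(kept.mean())


# Made-up three-agent chain 0 -> 1 -> 2, transferred to an identical chain.
chain = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])
retention_identity = edge_retention(chain, chain, np.eye(3) / 3)
```

Reporting such a score while sweeping the feature-versus-structure weight of the fused distance would show directly whether feature-dominated alignments distort the transferred topology.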
Simulated Author's Rebuttal
We thank the referee for their insightful comments, which have helped us identify areas for improvement in the manuscript. We address each major comment below and outline the revisions we plan to make.
Point-by-point responses
- Referee: [§3] §3 (Fused Gromov-Wasserstein transfer): The central claim that the fused OT cost (node features + edge relations) produces a transport plan preserving relational communication structures required by prior tasks is load-bearing for the reduction in topology forgetting. The manuscript does not provide a concrete verification (e.g., a structure-preservation metric or ablation on the relative weighting of feature vs. relational terms in the fused distance) that the alignment does not distort collaboration patterns when functional-semantic embeddings dominate. This directly addresses the skeptic's concern and must be strengthened for the retention argument to be convincing.
  Authors: We agree that additional verification of the structure-preservation properties is necessary to strengthen our claims. In the revised version, we will add an ablation study varying the relative weighting between the feature and relational components in the fused Gromov-Wasserstein distance. Furthermore, we will introduce quantitative metrics to assess structure preservation, such as the similarity in communication patterns or the retention of key relational edges across tasks. These additions will provide concrete evidence that the transport plan maintains the required collaboration structures even when semantic embeddings are prominent.
  Revision: yes
- Referee: [§5] §5 (Experiments): The reported improvements in average accuracy and topology-forgetting reduction are presented without error bars, statistical significance tests, or explicit rules for data exclusion and hyper-parameter selection across the class-/domain-/task-level settings. Because the soundness assessment is currently low, these details are required to substantiate that gains arise from structural retention rather than increased plasticity alone.
  Authors: We recognize the need for greater statistical rigor and transparency in our experimental evaluation. We will revise the experimental section to include error bars computed over multiple independent runs with different random seeds. We will also conduct and report statistical significance tests comparing MasFACT against the baselines. Additionally, we will provide explicit details on our hyper-parameter selection methodology and any criteria used for data exclusion or inclusion in the continual learning benchmarks. These changes should clarify that the improvements stem from the proposed structural retention mechanisms.
  Revision: yes
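The error bars and significance tests promised in this response could be produced along the following lines. This is a sketch using only the Python standard library, with an exact paired sign-flip permutation test swapped in as a stand-in; the authors do not say which test they will use. The per-seed accuracies are invented for illustration, and the function names are hypothetical.

```python
import itertools
import statistics


def mean_and_std(values):
    """Mean and sample standard deviation across seeds, for error bars."""
    return statistics.mean(values), statistics.stdev(values)


def paired_sign_flip_pvalue(a, b):
    """Exact two-sided sign-flip permutation test on paired per-seed differences."""
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    # Enumerate every assignment of signs to the paired differences.
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / total


# Made-up accuracies over five seeds, for illustration only.
method_acc = [0.72, 0.74, 0.71, 0.73, 0.75]
baseline_acc = [0.68, 0.70, 0.69, 0.70, 0.71]
method_mean, method_std = mean_and_std(method_acc)
p_value = paired_sign_flip_pvalue(method_acc, baseline_acc)
```

With only five seeds the smallest achievable two-sided p-value is 2/32, so a handful of runs cannot by itself support a claim at the 0.05 level under this test; more seeds or a parametric paired test would be needed.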
Circularity Check
No significant circularity; derivation relies on established external techniques
Full rationale
The paper's central framework applies Fused Gromov-Wasserstein optimal transport and PAC-Bayes posterior adaptation to transfer topology priors across tasks. Both are standard methods, established independently in prior literature by other authors. No derivation step reduces a claimed prediction or result to a quantity defined by the target itself, nor does any load-bearing premise collapse to a self-citation chain or a fitted input renamed as an output. The abstract and method description treat the OT alignment and conservative adaptation as imported tools whose properties are not redefined within the paper, so the empirical claims about reduced topology forgetting remain testable against baselines rather than tautological.
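For reference, one commonly quoted form of the PAC-Bayes bound (McAllester's theorem in Maurer's refinement, valid for n ≥ 8) is given below. It is stated for a loss bounded in [0, 1], a prior P chosen independently of the n i.i.d. samples, and any posterior Q, and it holds with probability at least 1 - δ. How the paper instantiates P, Q, and the loss is not specified in this review, so the mapping to topology priors is an assumption.

```latex
\mathbb{E}_{h \sim Q}\big[ L(h) \big]
  \;\le\;
  \mathbb{E}_{h \sim Q}\big[ \hat{L}_n(h) \big]
  + \sqrt{ \frac{ \mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta} }{ 2n } }
```

Read this way, a "conservative" adaptation would keep KL(Q ‖ P) small relative to the transferred prior P, trading some empirical fit on the new task for stability.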