Recognition: unknown
Improving the Efficiency of Language Agent Teams with Adaptive Task Graphs
Pith reviewed 2026-05-08 03:40 UTC · model grok-4.3
The pith
LLM agent teams maintain a shared evolving task graph to reduce token use, time, and conflicts while matching accuracy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In LATTE, a team of agents collaboratively constructs and maintains a shared, evolving coordination graph that encodes sub-task dependencies, individual agent assignments, and the current state of sub-task progress. This protocol maintains consistency while empowering agents to dynamically allocate work, adapt coordination, and discover new tasks. Across multiple collaborative tasks and a variety of base models, the approach reduces token usage, wall-clock time, communication, and coordination failures such as file conflicts and redundant outputs, while matching or exceeding the accuracy of standard designs including MetaGPT, decentralized teams, top-down Leader-Worker hierarchies, and static decompositions.
What carries the argument
The shared evolving coordination graph that records sub-task dependencies, agent assignments, and progress states so agents can update it together under partial information.
If this is right
- Agents can discover new tasks and reallocate work dynamically instead of following a fixed plan.
- Coordination failures such as file conflicts and redundant outputs become less frequent.
- Overall token consumption and wall-clock time decrease across varied base models and tasks.
- Final accuracy stays at or above the level of fixed hierarchies and unstructured teams.
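The first two bullets can be made concrete with a toy sketch. The action names (discover, release, assign) echo the `<discover_task>`, `<release_task>`, and `<assign_task>` actions quoted from the paper's prompts, but the Python API here is invented for illustration.

```python
# Toy shared task list; the actions mirror the paper's prompt vocabulary,
# but this function-call API is an assumption made for this sketch.
tasks = {
    "task-1": {"state": "in_progress", "owner": "Dev1"},
}

def discover_task(task_id, description):
    # Any agent may add work that was not in the original plan.
    tasks[task_id] = {"state": "pending", "owner": None, "desc": description}

def release_task(task_id):
    # Straggler mitigation: return a stuck task to the pending pool.
    tasks[task_id].update(state="pending", owner=None)

def assign_task(task_id, agent):
    tasks[task_id].update(state="assigned", owner=agent)

discover_task("task-2", "fix failing import in tests")
release_task("task-1")          # Dev1 seems stuck; free the task
assign_task("task-1", "Dev2")   # reassign it
print(tasks["task-1"]["owner"], tasks["task-2"]["state"])  # Dev2 pending
```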
Where Pith is reading between the lines
- Similar shared-state mechanisms could help agent teams that must incorporate external tools or data sources mid-task.
- The approach may scale best on longer projects where the cost of early miscoordination grows large.
- Explicit dependency tracking might reduce the need for verbose natural-language messages between agents.
Load-bearing premise
Language model agents can reliably collaborate to build and maintain the shared task graph without introducing new inconsistencies or prohibitive maintenance overhead.
What would settle it
Run LATTE on a multi-step collaborative coding task and check whether the maintained graph ever allows two agents to edit the same file at once or repeat the same sub-task output.
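A minimal version of that check, sketched in Python: log which agent edited which file over which rounds, then scan for overlapping spans on the same file. The event-log format is an assumption for this sketch, not something the paper specifies.

```python
from collections import defaultdict

# Hypothetical event log: (agent, file, start_round, end_round) edit spans
# recorded while the team runs; the format is illustrative.
edits = [
    ("Dev1", "math_utils.py", 1, 3),
    ("Dev2", "tests.py",      2, 4),
    ("Dev2", "math_utils.py", 5, 6),  # after Dev1 finished: no conflict
]

def file_conflicts(edits):
    """Pairs of agents whose edit spans on the same file overlap in time."""
    by_file = defaultdict(list)
    for agent, path, start, end in edits:
        by_file[path].append((agent, start, end))
    conflicts = []
    for path, spans in by_file.items():
        for i in range(len(spans)):
            for j in range(i + 1, len(spans)):
                (a1, s1, e1), (a2, s2, e2) = spans[i], spans[j]
                if a1 != a2 and s1 <= e2 and s2 <= e1:  # intervals overlap
                    conflicts.append((path, a1, a2))
    return conflicts

print(file_conflicts(edits))  # [] — access to math_utils.py was serialized
```

An empty result across many runs would support the claim that the graph prevents concurrent edits; any non-empty result pinpoints exactly which file and agents the protocol failed to serialize.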
read the original abstract
Large language models (LLMs) are increasingly deployed in teams, yet existing coordination approaches often occupy two extremes. Highly structured methods rely on fixed roles, pipelines, or task decompositions assigned a priori. In contrast, fully unstructured teams enable adaptability and exploration but suffer from inefficiencies such as error propagation, inter-agent conflicts, and wasted resources (measured in time, tokens, or file operations). We introduce Language Agent Teams for Task Evolution (LATTE), a framework for coordinating LLM teams inspired by distributed systems, where processors must operate under partial observability and communication constraints. In LATTE, a team of agents collaboratively construct and maintain a shared, evolving coordination graph which encodes sub-task dependencies, individual agent assignment, and the current state of sub-task progress. This protocol maintains consistency while empowering agents to dynamically allocate work, adapt coordination, and discover new tasks. Across multiple collaborative tasks and a variety of base models, we demonstrate how LATTE reduces token usage, wall-clock time, communication, and coordination failures (e.g. file conflicts and redundant outputs) while matching or exceeding the accuracy of standard designs including MetaGPT, decentralized teams, top-down Leader-Worker hierarchies, and static decompositions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces LATTE (Language Agent Teams for Task Evolution), a coordination framework for LLM-based agent teams. Agents collaboratively build and maintain a shared, evolving task graph encoding sub-task dependencies, assignments, and progress states. Inspired by distributed-systems principles for operating under partial observability, the protocol aims to enable dynamic work allocation and adaptation. Empirical results across collaborative tasks and base models are reported to show that LATTE reduces token usage, wall-clock time, communication volume, and coordination failures (e.g., file conflicts, redundant outputs) while matching or exceeding the accuracy of baselines including MetaGPT, fully decentralized teams, Leader-Worker hierarchies, and static decompositions.
Significance. If the empirical claims hold under rigorous controls, LATTE offers a practical middle path between rigid a-priori structures and unstructured teams, potentially improving scalability and resource efficiency in multi-agent LLM deployments. The distributed-systems analogy and focus on measurable coordination failures provide a concrete, falsifiable protocol that could influence subsequent work on agent orchestration.
major comments (2)
- [Abstract / Experimental Evaluation] The central efficiency claims (reductions in tokens, time, communication, and failures) rest on experimental demonstrations, yet the abstract and framing provide no quantitative results, error bars, statistical tests, or controls. This leaves the magnitude and reliability of the reported improvements unassessable from the given material; the full manuscript must supply detailed tables, ablation studies, and significance testing to support the efficiency-accuracy tradeoff.
- [LATTE Protocol Description] The protocol's correctness hinges on agents reliably constructing, updating, and maintaining a consistent shared task graph under partial observability. The manuscript should provide a precise description (with pseudocode or state-transition rules) of conflict resolution, consistency guarantees, and overhead measurements for graph maintenance; without this, the weakest assumption identified in the review cannot be evaluated.
minor comments (2)
- [Introduction / Framework Overview] Define the precise data structure of the adaptive task graph (nodes, edges, state fields) with an early figure or formal notation to aid readability.
- [Experimental Setup] Clarify how baseline implementations (MetaGPT, Leader-Worker, etc.) were reproduced or adapted to ensure fair comparison of communication and failure metrics.
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive review. We address the major comments point-by-point below and will revise the manuscript to incorporate the suggested improvements.
read point-by-point responses
-
Referee: [Abstract / Experimental Evaluation] The central efficiency claims (reductions in tokens, time, communication, and failures) rest on experimental demonstrations, yet the abstract and framing provide no quantitative results, error bars, statistical tests, or controls. This leaves the magnitude and reliability of the reported improvements unassessable from the given material; the full manuscript must supply detailed tables, ablation studies, and significance testing to support the efficiency-accuracy tradeoff.
Authors: We agree that the abstract should include quantitative highlights to make the efficiency claims more concrete and assessable. The full manuscript provides detailed tables, ablation studies, and comparisons across tasks and models in the experimental section. In the revision, we will update the abstract to report key quantitative results such as average reductions in token usage and time, and we will ensure that error bars and statistical significance are explicitly discussed and visualized in the main text. revision: yes
-
Referee: [LATTE Protocol Description] The protocol's correctness hinges on agents reliably constructing, updating, and maintaining a consistent shared task graph under partial observability. The manuscript should provide a precise description (with pseudocode or state-transition rules) of conflict resolution, consistency guarantees, and overhead measurements for graph maintenance; without this, the weakest assumption identified in the review cannot be evaluated.
Authors: We acknowledge the need for a more precise and formal description of the LATTE protocol to allow evaluation of its correctness under partial observability. The manuscript currently describes the protocol in natural language with examples of graph evolution. To strengthen this, we will include pseudocode for the core procedures of task graph construction, update, and conflict resolution, along with a discussion of consistency mechanisms drawn from distributed systems principles. We will also add experimental measurements of the computational overhead for maintaining the shared graph. revision: yes
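One standard consistency mechanism the promised pseudocode could take as a starting point is optimistic concurrency control: each agent reads a versioned snapshot of the graph, and a write is rejected if the graph changed since that read. This is a sketch of that familiar distributed-systems idea, not a claim about LATTE's actual protocol.

```python
class SharedGraph:
    """Version-checked updates (optimistic concurrency control).
    A plausible consistency mechanism, not necessarily LATTE's actual one."""
    def __init__(self):
        self.version = 0
        self.tasks = {}  # task_id -> {"state": ..., "owner": ...}

    def snapshot(self):
        # Agents act on a versioned copy of the graph (partial observability).
        return self.version, dict(self.tasks)

    def try_update(self, seen_version, task_id, **changes):
        # Reject stale writes: the agent must have seen the latest graph.
        if seen_version != self.version:
            return False  # conflict: re-read the graph and retry
        self.tasks.setdefault(task_id, {}).update(changes)
        self.version += 1
        return True

g = SharedGraph()
v, _ = g.snapshot()
assert g.try_update(v, "task-1", state="assigned", owner="Dev1")
# A second agent acting on the now-stale snapshot is rejected:
assert not g.try_update(v, "task-1", state="assigned", owner="Dev2")
print(g.tasks["task-1"]["owner"])  # Dev1 keeps the assignment
```

Under this scheme, double assignment of a task requires two agents to win the same version check, which cannot happen; the overhead is one retry per rejected write.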
Circularity Check
No significant circularity; empirical framework with external validation
full rationale
The paper introduces LATTE as a practical coordination protocol for LLM agent teams, drawing inspiration from distributed systems concepts like partial observability but presenting it as an implemented design rather than a mathematical derivation. No equations, fitted parameters, or predictions appear that reduce by construction to inputs. Claims of efficiency gains (reduced tokens, time, conflicts) rest on empirical comparisons to external baselines (MetaGPT, decentralized teams, Leader-Worker hierarchies, static decompositions) across multiple tasks and models. The graph maintenance protocol is described as a design choice to handle consistency under partial views, not derived from self-cited uniqueness theorems or ansatzes. This is a self-contained empirical systems contribution with no load-bearing circular steps.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption LLM agents can collaboratively construct and maintain a consistent shared task graph without introducing new coordination failures
invented entities (1)
-
Adaptive Task Graph
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Executing task graphs using work-stealing
Kunal Agrawal, Charles E Leiserson, and Jim Sukha. Executing task graphs using work-stealing. In 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), pages 1–12. IEEE, 2010
2010
-
[2]
How we built our multi-agent research system
Anthropic. How we built our multi-agent research system. https://www.anthropic.com/engineering/multi-agent-research-system, June 2025. Anthropic Engineering Blog
2025
-
[3]
Frédéric Berdoz, Leonardo Rugli, and Roger Wattenhofer. Can AI agents agree? arXiv preprint arXiv:2603.01213, 2026
-
[4]
Maciej Besta, Nils Blach, Ales Kubicek, Robert Gerstenberger, Michal Podstawski, Lukas Gianinazzi, Joanna Gajda, Tomasz Lehmann, Hubert Niewiadomski, Piotr Nyczyk, et al. Graph of thoughts: Solving elaborate problems with Large Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 38(16):17682–17690, 2024. doi: 10.1609/aaai.v38i16.29720
-
[5]
Social agents: Collective intelligence improves LLM predictions
Aanisha Bhattacharyya, Abhilekh Borah, Yaman Kumar Singla, Rajiv Ratn Shah, Changyou Chen, and Balaji Krishnamurthy. Social agents: Collective intelligence improves LLM predictions. In The Fourteenth International Conference on Learning Representations, 2026
2026
-
[6]
The Mythical Man-Month: Essays on Software Engineering
Frederick P. Brooks. The Mythical Man-Month: Essays on Software Engineering. Addison-Wesley, Reading, MA, 1975. ISBN 0-201-00650-2
1975
-
[7]
Marcelo Cataldo, James D. Herbsleb, and Kathleen M. Carley. Socio-technical congruence: a framework for assessing the impact of technical and work dependencies on software development productivity. In Proceedings of the Second ACM-IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), pages 2–11, Kaiserslautern, Germany, 2...
-
[8]
Why Do Multi-Agent LLM Systems Fail?
Mert Cemri, Melissa Z Pan, Shuyi Yang, Lakshya A Agrawal, Bhavya Chopra, Rishabh Tiwari, Kurt Keutzer, Aditya Parameswaran, Dan Klein, Kannan Ramchandran, et al. Why do multi-agent LLM systems fail? arXiv preprint arXiv:2503.13657, 2025
2025
-
[9]
Melvin E. Conway. How do committees invent? Datamation, 14(4):28–31, April 1968
1968
-
[10]
Jeffrey Dean and Luiz André Barroso. The tail at scale. Communications of the ACM, 56(2):74–80, 2013. doi: 10.1145/2408776.2408794
-
[11]
MapReduce: Simplified data processing on large clusters
Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified data processing on large clusters. Communications of the ACM, 51(1):107–113, January 2008. doi: 10.1145/1327452.1327492
-
[12]
Hierarchical reinforcement learning with the MAXQ value function decomposition
Thomas G Dietterich. Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13:227–303, 2000
2000
-
[13]
A survey on in-context learning
Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Jingyuan Ma, Rui Li, Heming Xia, Jingjing Xu, Zhiyong Wu, Baobao Chang, et al. A survey on in-context learning. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 1107–1128, 2024
2024
-
[14]
Improving factuality and reasoning in language models through multiagent debate
Yilun Du, Shuang Li, Antonio Torralba, Joshua B Tenenbaum, and Igor Mordatch. Improving factuality and reasoning in language models through multiagent debate. In Forty-first International Conference on Machine Learning, 2024
2024
-
[15]
Gleiph Ghiotto, Leonardo Murta, Márcio Barros, and André van der Hoek. On the nature of merge conflicts: A study of 2,731 open source Java projects hosted by GitHub. IEEE Transactions on Software Engineering, 46(8):892–915, 2020. doi: 10.1109/TSE.2018.2871083
-
[16]
Planning with abstract Markov decision processes
Nakul Gopalan, Michael Littman, James MacGlashan, Shawn Squire, Stefanie Tellex, John Winder, and Lawson Wong. Planning with abstract Markov decision processes. In Proceedings of the International Conference on Automated Planning and Scheduling, volume 27, pages 480–488, 2017
2017
-
[17]
Doing more with less: Meta-reasoning and meta-learning in humans and machines
Thomas L Griffiths, Frederick Callaway, Michael B Chang, Erin Grant, Paul M Krueger, and Falk Lieder. Doing more with less: Meta-reasoning and meta-learning in humans and machines. Current Opinion in Behavioral Sciences, 29:24–30, 2019
2019
-
[18]
W. K. Hastings. Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57(1):97–109, 1970. doi: 10.1093/biomet/57.1.97
-
[19]
Selecting computa- tions: Theory and applications
Nicholas Hay, Stuart Russell, David Tolpin, and Solomon Eyal Shimony. Selecting computa- tions: Theory and applications. InProceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, pages 346–355, 2012
2012
-
[20]
The Secret of Our Success: How Culture is Driving Human Evolution, Domesticating Our Species, and Making Us Smarter
Joseph Henrich. The Secret of Our Success: How Culture is Driving Human Evolution, Domesticating Our Species, and Making Us Smarter. Princeton University Press, Princeton, NJ, 2015
-
[21]
James D. Herbsleb and Audris Mockus. An empirical study of speed and communication in globally distributed software development. IEEE Transactions on Software Engineering, 29(6):481–494, 2003. doi: 10.1109/TSE.2003.1205177
-
[22]
MetaGPT: Meta programming for a multi-agent collaborative framework
Sirui Hong, Mingchen Zhuge, Jonathan Chen, Xiawu Zheng, Yuheng Cheng, Jinlin Wang, Ceyao Zhang, Zili Wang, Steven Ka Shing Yau, Zijuan Lin, et al. MetaGPT: Meta programming for a multi-agent collaborative framework. In The Twelfth International Conference on Learning Representations, 2023
2023
-
[23]
Jen-tse Huang, Jiaxu Zhou, Tailin Jin, Xuhui Zhou, Zixi Chen, Wenxuan Wang, Youliang Yuan, Michael R Lyu, and Maarten Sap. On the resilience of LLM-based multi-agent collaboration with faulty agents. arXiv preprint arXiv:2408.00989, 2024
-
[24]
Byzantine-robust decentralized coordination of LLM agents
Yongrae Jo and Chanik Park. Byzantine-robust decentralized coordination of LLM agents. arXiv preprint arXiv:2507.14928, 2025
-
[25]
A concurrent dynamic task graph
Theodore Johnson. A concurrent dynamic task graph. In 1993 International Conference on Parallel Processing (ICPP'93), volume 2, pages 223–230. IEEE, 1993
1993
-
[26]
Towards a Science of Scaling Agent Systems
Yubin Kim, Ken Gu, Chanwoo Park, Chunjong Park, Samuel Schmidgall, A Ali Heydari, Yao Yan, Zhihan Zhang, Yuchen Zhuang, Mark Malhotra, et al. Towards a science of scaling agent systems. arXiv preprint arXiv:2512.08296, 2025
2025
-
[27]
Metareasoning structures, problems, and modes for multiagent systems: A survey
Samuel T Langlois, Oghenetekevwe Akoroda, Estefany Carrillo, Jeffrey W Herrmann, Shapour Azarm, Huan Xu, and Michael Otte. Metareasoning structures, problems, and modes for multiagent systems: A survey. IEEE Access, 8:183080–183089, 2020
2020
-
[28]
Agent-oriented planning in multi-agent systems
Ao Li, Yuexiang Xie, Songze Li, Fugee Tsung, Bolin Ding, and Yaliang Li. Agent-oriented planning in multi-agent systems. In The Thirteenth International Conference on Learning Representations, 2025
2025
-
[29]
Junyou Li, Qin Zhang, Yangbin Yu, Qiang Fu, and Deheng Ye. More agents is all you need. arXiv preprint arXiv:2402.05120, 2024
-
[30]
Lost in the middle: How language models use long contexts
Nelson F Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, and Percy Liang. Lost in the middle: How language models use long contexts. Transactions of the Association for Computational Linguistics, 12:157–173, 2024
2024
-
[31]
Learning Decentralized LLM Collaboration with Multi-Agent Actor Critic
Shuo Liu, Tianle Chen, Ryan Amiri, and Christopher Amato. Learning decentralized LLM collaboration with multi-agent actor critic. arXiv preprint arXiv:2601.21972, 2026
2026
-
[32]
Alan MacCormack, Carliss Baldwin, and John Rusnak. Exploring the duality between product and organizational architectures: A test of the "mirroring" hypothesis. Research Policy, 41(8):1309–1324, 2012. doi: 10.1016/j.respol.2012.04.011
-
[33]
Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. Pregel: A system for large-scale graph processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data (SIGMOD ’10), pages 135–146, Indianapolis, Indiana, USA, 2010. Association for Computing Machinery....
-
[34]
Learning scheduling algorithms for data processing clusters
Hongzi Mao, Malte Schwarzkopf, Shaileshh Bojja Venkatakrishnan, Zili Meng, and Mohammad Alizadeh. Learning scheduling algorithms for data processing clusters. In Proceedings of the ACM Special Interest Group on Data Communication (SIGCOMM), pages 270–288. ACM, 2019. doi: 10.1145/3341302.3342080
2019
-
[36]
Language model teams as distributed systems
Elizabeth Mieczkowski, Katherine M Collins, Ilia Sucholutsky, Natalia Vélez, and Thomas L Griffiths. Language model teams as distributed systems. arXiv preprint arXiv:2603.12229, 2026
-
[37]
Ray: A distributed framework for emerging AI applications
Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, Melih Elibol, Zongheng Yang, William Paul, Michael I Jordan, and Ion Stoica. Ray: A distributed framework for emerging AI applications. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), pages 561–577. USENIX Association, 2018
2018
-
[38]
Multi-agent teams hold experts back
Aneesh Pappu, Batu El, Hancheng Cao, Carmelo di Nolfo, Yanchao Sun, Meng Cao, and James Zou. Multi-agent teams hold experts back. arXiv preprint arXiv:2602.01011, 2026
-
[39]
Generative agents: Interactive simulacra of human behavior
Joon Sung Park, Joseph C O’Brien, Carrie J Cai, Meredith Ringel Morris, Percy Liang, and Michael S Bernstein. Generative agents: Interactive simulacra of human behavior. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, 2023. doi: 10.1145/3586183.3606763
-
[40]
Constantine D. Polychronopoulos and David J. Kuck. Guided self-scheduling: A practical scheduling scheme for parallel supercomputers. IEEE Transactions on Computers, C-36(12):1425–1439, December 1987. doi: 10.1109/TC.1987.5009495
-
[41]
Chatdev: Communicative agents for software development
Chen Qian, Wei Liu, Hongzhang Liu, Nuo Chen, Yufan Dang, Jiahao Li, Cheng Yang, Weize Chen, Yusheng Su, Xin Cong, et al. Chatdev: Communicative agents for software development. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, pages 15174–15186, 2024
2024
-
[42]
A framework for meta-level control in multi-agent systems
Anita Raja and Victor Lesser. A framework for meta-level control in multi-agent systems. Autonomous Agents and Multi-Agent Systems, 15(2):147–196, 2007
2007
-
[43]
Emergent Coordination in Multi-Agent Language Models
Christoph Riedl. Emergent coordination in multi-agent language models. arXiv preprint arXiv:2510.05174, 2025
2025
-
[44]
Michael Rizvi-Martel, Satwik Bhattamishra, Neil Rathi, Guillaume Rabusseau, and Michael Hahn. Benefits and limitations of communication in multi-agent reasoning. arXiv preprint arXiv:2510.13903, 2025
-
[45]
Principles of metareasoning
Stuart Russell and Eric Wefald. Principles of metareasoning. Artificial Intelligence, 49(1-3):361–395, 1991
1991
-
[46]
H. Sackman, W. J. Erikson, and E. E. Grant. Exploratory experimental studies comparing online and offline programming performance. Communications of the ACM, 11(1):3–11, 1968. doi: 10.1145/362851.362858
-
[47]
Natalie Shapira, Chris Wendler, Avery Yen, Gabriele Sarti, Koyena Pal, Olivia Floody, Adam Belfki, Alex Loftus, Aditya Ratan Jannali, Nikhil Prakash, et al. Agents of chaos. arXiv preprint arXiv:2602.20021, 2026
2026
-
[48]
HuggingGPT: Solving AI tasks with ChatGPT and its friends in Hugging Face
Yongliang Shen, Kaitao Song, Xu Tan, Dongsheng Li, Weiming Lu, and Yueting Zhuang. HuggingGPT: Solving AI tasks with ChatGPT and its friends in Hugging Face. In Advances in Neural Information Processing Systems 36 (NeurIPS 2023). Curran Associates, Inc., 2023
2023
-
[49]
Multiagent metareasoning through organizational design
Jason Sleight and Edmund Durfee. Multiagent metareasoning through organizational design. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 28, 2014
2014
-
[50]
The virtual lab of AI agents designs new SARS-CoV-2 nanobodies
Kyle Swanson, Wesley Wu, Nash L Bulaong, John E Pak, and James Zou. The virtual lab of AI agents designs new SARS-CoV-2 nanobodies. Nature, 646(8085):716–723, 2025
2025
-
[51]
Understanding and sharing intentions: The origins of cultural cognition
Michael Tomasello, Malinda Carpenter, Josep Call, Tanya Behne, and Henrike Moll. Understanding and sharing intentions: The origins of cultural cognition. Behavioral and Brain Sciences, 28(5):675–691, 2005
2005
-
[52]
Performance-effective and low-complexity task scheduling for heterogeneous computing
Haluk Topcuoglu, Salim Hariri, and Min-You Wu. Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Transactions on Parallel and Distributed Systems, 13(3):260–274, 2002
2002
-
[53]
Distributed Systems
Maarten Van Steen and Andrew S Tanenbaum. Distributed Systems. distributed-systems.net, 2023
2023
-
[54]
Chain-of-thought prompting elicits reasoning in large language models
Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V Le, Denny Zhou, et al. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems, 35:24824–24837, 2022
2022
-
[55]
Strengthening the case for pair programming
Laurie Williams, Robert R Kessler, Ward Cunningham, and Ron Jeffries. Strengthening the case for pair programming. IEEE Software, 17(4):19–25, 2000
2000
-
[56]
Task scheduling in distributed computing systems with a genetic algorithm
Sung-Ho Woo, Sung-Bong Yang, Shin-Dug Kim, and Tack-Don Han. Task scheduling in distributed computing systems with a genetic algorithm. In Proceedings High Performance Computing on the Information Superhighway (HPC Asia '97), pages 301–305. IEEE, 1997
1997
-
[57]
Autogen: Enabling next-gen LLM applications via multi-agent conversations
Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, et al. Autogen: Enabling next-gen LLM applications via multi-agent conversations. In First Conference on Language Modeling, 2024
2024
-
[58]
Yingxuan Yang, Chengrui Qu, Muning Wen, Laixi Shi, Ying Wen, Weinan Zhang, Adam Wierman, and Shangding Gu. Understanding agent scaling in LLM-based multi-agent systems via diversity. arXiv preprint arXiv:2602.03794, 2026
-
[59]
Tree of thoughts: Deliberate problem solving with large language models
Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Tom Griffiths, Yuan Cao, and Karthik Narasimhan. Tree of thoughts: Deliberate problem solving with large language models. Advances in Neural Information Processing Systems, 36:11809–11822, 2023
2023
-
[60]
Guibin Zhang, Yanwei Yue, Zhixun Li, Sukwon Yun, Guancheng Wan, Kun Wang, Dawei Cheng, Jeffrey Xu Yu, and Tianlong Chen. Cut the crap: An economical communication pipeline for LLM-based multi-agent systems. arXiv preprint arXiv:2410.02506, 2024
-
[61]
Position: Science is collaborative—LLM for science should be too
Terry Jingchen Zhang, Wenyuan Jiang, Yongjin Yang, Sirui Lu, Bernhard Schölkopf, and Zhijing Jin. Position: Science is collaborative—LLM for science should be too. In ICLR 2026 Workshop on Foundation Models for Science: Real-World Impact, 2026. Oral
2026
-
[62]
Chain of agents: Large language models collaborating on long-context tasks
Yusen Zhang, Ruoxi Sun, Yanfei Chen, Tomas Pfister, Rui Zhang, and Sercan Arik. Chain of agents: Large language models collaborating on long-context tasks. Advances in Neural Information Processing Systems, 37:132208–132237, 2024
2024
-
[63]
Least-to-most prompting enables complex reasoning in large language models
Denny Zhou, Nathanael Schärli, Le Hou, Jason Wei, Nathan Scales, Xuezhi Wang, Dale Schuurmans, Claire Cui, Olivier Bousquet, Quoc V Le, and Ed H Chi. Least-to-most prompting enables complex reasoning in large language models. In The Eleventh International Conference on Learning Representations, 2023
2023
discussion (0)