OrchestrXR: A Multi-Agent System for Idea-to-Prototype XR Study Authoring
Pith reviewed 2026-07-03 07:16 UTC · model grok-4.3
The pith
OrchestrXR uses multi-agent orchestration to convert XR study ideas into Unity prototypes while preserving researcher intent across design, scene, and interaction stages.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
OrchestrXR is a multi-agent workflow for early-stage XR study authoring that supports a controllable process across study design, scene generation, and interaction generation through structured schemas, multi-agent orchestration, and interactive human-agent interfaces, yielding Unity-based prototypes from a researcher's idea, as suggested by a user study with 12 XR researchers showing effective support and strong intent preservation across stages.
What carries the argument
Multi-agent orchestration that coordinates study design, scene generation, and interaction generation using structured schemas and interactive human-agent interfaces to output Unity prototypes.
If this is right
- XR researchers can iterate from idea to runnable prototype without manual handoff between separate design and coding tools.
- Intent from the initial study concept remains consistent through the three authoring stages.
- The workflow reduces fragmentation when specifying experimental tasks, 3D scenes, and interactive logic together.
- Early-stage XR experiments become more accessible for researchers who lack deep Unity programming experience.
Where Pith is reading between the lines
- The staged orchestration could transfer to authoring tools for AR training scenarios or VR therapy applications beyond academic studies.
- Adding quantitative intent-matching metrics in future evaluations would strengthen validation of the preservation claim.
- Similar multi-agent schemas might streamline idea-to-prototype pipelines in other HCI domains such as wearable interface design.
Load-bearing premise
That a user study with 12 XR researchers provides sufficient evidence of the workflow's effectiveness and intent preservation, even without detailed study design or quantitative metrics.
What would settle it
A replication study or larger evaluation where generated prototypes show frequent loss of original research intent or fail to produce functional Unity scenes would disprove the effectiveness claim.
Figures
read the original abstract
Extended Reality (XR) has become an important interaction paradigm in Human-Computer Interaction (HCI). XR studies are used to investigate interaction, perception, and user behavior in immersive environments, and typically involve experimental tasks, 3D scenes, and interactive logic. However, turning an initial XR study idea into a runnable prototype remains fragmented across study design, scene construction, and interaction implementation. We present OrchestrXR, a multi-agent human-AI workflow for early-stage idea-to-prototype XR study authoring. Rather than treating XR study creation as one-shot generation, OrchestrXR supports a controllable workflow across study design, scene generation, and interaction generation through structured schemas, multi-agent orchestration, and interactive human-agent interfaces, producing a Unity-based prototype from a researcher's idea. A user study with 12 XR researchers suggests that OrchestrXR provides effective support for early-stage XR study authoring with strong intent preservation across stages.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents OrchestrXR, a multi-agent human-AI workflow for early-stage XR study authoring that transforms researcher ideas into Unity prototypes via structured stages of study design, scene generation, and interaction generation. It uses schemas, multi-agent orchestration, and interactive interfaces to maintain controllability rather than one-shot generation. The central claim is that the system provides effective support for this process with strong intent preservation, as suggested by a user study involving 12 XR researchers.
Significance. If the evaluation were robust, the work could contribute to HCI by addressing the fragmentation in XR prototyping tools through a controllable multi-stage multi-agent approach. The idea of using orchestration to preserve intent across design-to-implementation stages has potential applicability beyond XR. However, the current lack of detailed results limits assessment of its actual significance or advantage over existing methods.
major comments (1)
- [User Study section] User Study section: The claim of effective support and strong intent preservation across stages is evidenced solely by a user study with 12 XR researchers. No information is provided on study protocol, participant tasks, comparison conditions or baselines, measurement instruments for intent preservation, quantitative metrics (e.g., success rates, Likert scores), or statistical tests. This is load-bearing for the central claim, as the data cannot be evaluated for support of the stated conclusions or distinguished from subjective preference.
minor comments (1)
- [Abstract] Abstract: The summary of the user study omits any methodology overview or quantitative outcomes, which reduces the ability to quickly assess the strength of the reported evidence.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the user study. We agree that the current manuscript omits critical details needed to evaluate the claims of effective support and intent preservation, and we will revise the User Study section to address this.
read point-by-point responses
-
Referee: [User Study section] User Study section: The claim of effective support and strong intent preservation across stages is evidenced solely by a user study with 12 XR researchers. No information is provided on study protocol, participant tasks, comparison conditions or baselines, measurement instruments for intent preservation, quantitative metrics (e.g., success rates, Likert scores), or statistical tests. This is load-bearing for the central claim, as the data cannot be evaluated for support of the stated conclusions or distinguished from subjective preference.
Authors: We acknowledge that the User Study section as written provides insufficient detail for independent evaluation of the results. The study was exploratory and primarily qualitative, focusing on researcher feedback regarding workflow usability and intent preservation through post-session interviews and observations rather than controlled quantitative comparisons. In the revised manuscript we will expand the section to report: (1) full study protocol including recruitment criteria, session structure, and think-aloud procedures; (2) specific participant tasks (authoring three distinct XR study ideas of varying complexity); (3) the absence of a formal baseline condition, with rationale that the evaluation targeted the multi-stage orchestration workflow itself; (4) measurement instruments consisting of structured interview questions and 7-point Likert items on perceived support and intent preservation; (5) any quantitative observations collected (e.g., prototype completion rates, iteration counts); and (6) thematic analysis approach used in place of statistical tests. These additions will allow readers to assess the strength of evidence supporting our claims. revision: yes
Circularity Check
No significant circularity detected
full rationale
The paper's central claim of effective support for XR study authoring with intent preservation is supported by an external user study involving 12 XR researchers rather than any self-referential metrics, fitted parameters, or derivations that reduce to the system's own outputs by construction. No equations, ansatzes, or load-bearing self-citations are present that would create circularity; the evaluation chain relies on independent participant input and is therefore self-contained.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Why Do LLM-based Web Agents Fail? A Hierarchical Planning Perspective
Mohamed Aghzal, Gregory J. Stein, and Ziyu Yao. 2026. Why Do LLM-based Web Agents Fail? A Hierarchical Planning Perspective.arXiv preprint arXiv:2603.14248 (2026). https://arxiv.org/abs/2603.14248
work page internal anchor Pith review Pith/arXiv arXiv 2026
-
[2]
Narges Ashtari, Andrea Bunt, Joanna McGrenere, Michael Nebeling, and Parmit K Chilana. 2020. Creating augmented and virtual reality applications: Current practices, challenges, and opportunities. InProceedings of the 2020 CHI conference on human factors in computing systems. 1–13
2020
-
[3]
Adam O Bebko and Nikolaus F Troje. 2020. bmlTUX: Design and control of exper- iments in virtual reality and beyond.i-Perception11, 4 (2020), 2041669520938400
2020
-
[4]
John Brooke et al. 1996. SUS-A quick and dirty usability scale.Usability evaluation in industry189, 194 (1996), 4–7
1996
-
[5]
Jack Brookes, Matthew Warburton, Mshari Alghadier, Mark Mon-Williams, and Faisal Mushtaq. 2020. Studying human behavior with virtual reality: The Unity Experiment Framework.Behavior research methods52, 2 (2020), 455–463
2020
-
[6]
Alessandro Carcangiu, Marco Manca, Jacopo Mereu, Carmen Santoro, Ludovica Simeoli, and Lucio Davide Spano. 2025. Tell-XR: Conversational End-User De- velopment of XR Automations. InHuman-Computer Interaction – INTERACT
2025
-
[7]
doi:10.1007/978-3-032-04999-5_35
Springer. doi:10.1007/978-3-032-04999-5_35
-
[8]
Patrick Carlson, Anicia Peters, Stephen B Gilbert, Judy M Vance, and Andy Luse
-
[9]
Virtual training: Learning transfer of assembly tasks.IEEE transactions on visualization and computer graphics21, 6 (2015), 770–782
2015
- [10]
-
[11]
Weize Chen, Yusheng Su, Jingwei Zuo, Cheng Yang, Chenfei Yuan, Chi-Min Chan, Heyang Yu, Yaxi Lu, Yi-Hsin Hung, Chen Qian, et al. 2023. Agentverse: Facilitating multi-agent collaboration and exploring emergent behaviors. InThe Twelfth International Conference on Learning Representations
2023
-
[12]
Tor-Salve Dalsgaard, Jarrod Knibbe, and Joanna Bergström. 2021. Modeling pointing for 3D target selection in VR. InProceedings of the 27th ACM symposium on virtual reality software and technology. 1–10
2021
-
[13]
Fernanda De La Torre, Cathy Mengying Fang, Han Huang, Andrzej Banburski- Fahey, Judith Amores Fernandez, and Jaron Lanier. 2024. Llmr: Real-time prompt- ing of interactive worlds using large language models. InProceedings of the 2024 CHI Conference on Human Factors in Computing Systems. 1–22
2024
-
[14]
Xiang Deng, Yu Gu, Boyuan Zheng, Shijie Chen, Sam Stevens, Boshi Wang, Huan Sun, and Yu Su. 2023. Mind2web: Towards a generalist agent for the web. Advances in Neural Information Processing Systems36 (2023), 28091–28114
2023
-
[15]
Ruofei Du, Benjamin Hersh, David Li, Nels Numan, Xun Qian, Yanhe Chen, Zhongyi Zhou, Xingyue Chen, Jiahao Ren, Robert Timothy Bettridge, et al. 2026. Vibe Coding XR: Accelerating AI+ XR Prototyping with XR Blocks and Gemini. arXiv preprint arXiv:2603.24591(2026)
work page internal anchor Pith review Pith/arXiv arXiv 2026
-
[16]
Yilun Du, Shuang Li, Antonio Torralba, Joshua B Tenenbaum, and Igor Mordatch
-
[17]
InForty-first international conference on machine learning
Improving factuality and reasoning in language models through multiagent debate. InForty-first international conference on machine learning
-
[18]
Jonathan Ehret, Andrea Bönsch, Janina Fels, Sabine J Schlittmeier, and Torsten W Kuhlen. 2024. StudyFramework: Comfortably setting up and conducting factorial- design studies using the unreal engine. In2024 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW). IEEE, 442–449
2024
-
[19]
Epic Games. 2026. Unreal Engine: The Most Powerful Real-Time 3D Creation Tool. https://www.unrealengine.com/. Accessed: 2026-03-31
2026
-
[20]
Andrew Estornell and Yang Liu. 2024. Multi-llm debate: Framework, principals, and interventions.Advances in Neural Information Processing Systems37 (2024), 28938–28964
2024
-
[21]
A Fourney, G Bansal, H Mozannar, C Tan, E Salinas, Zhu Erkang, F Niedt- ner, G Proebsting, G Bassman, J Gerrits, et al . [n. d.]. Magentic-one: A gen- eralist multi-agent system for solving complex tasks, 2024.URL https://arxiv. org/abs/2411.04468([n. d.])
work page internal anchor Pith review Pith/arXiv arXiv 2024
- [22]
-
[23]
Gege Gao, Weiyang Liu, Anpei Chen, Andreas Geiger, and Bernhard Schölkopf
-
[24]
InPro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Graphdreamer: Compositional 3d scene synthesis from scene graphs. InPro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 21295–21304
-
[25]
Alireza Ghafarollahi and Markus J Buehler. 2024. ProtAgents: protein discovery via large language model multi-agent collaborations combining physics and machine learning.Digital Discovery3, 7 (2024), 1389–1409
2024
-
[26]
Juraj Gottweis, Wei-Hung Weng, Alexander Daryin, Tao Tu, Anil Palepu, Petar Sirkovic, Artiom Myaskovsky, Felix Weissenberger, Keran Rong, Ryutaro Tanno, et al. 2025. Towards an AI co-scientist.arXiv preprint arXiv:2502.18864(2025)
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[27]
Judith Hartfill, Jenny Gabel, Lucie Kruse, Susanne Schmidt, Kevin Riebandt, Simone Kühn, and Frank Steinicke. 2021. Analysis of detection thresholds for hand redirection during mid-air interactions in virtual reality. InProceedings of the 27th ACM Symposium on Virtual Reality Software and Technology. 1–10
2021
-
[28]
Yichen He, Guanhua Huang, Peiyuan Feng, Yuan Lin, Yuchen Zhang, Hang Li, et al. 2025. Pasa: An llm agent for comprehensive academic paper search. InProceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 11663–11679
2025
-
[29]
Sirui Hong, Mingchen Zhuge, Jonathan Chen, Xiawu Zheng, Yuheng Cheng, Jinlin Wang, Ceyao Zhang, Zili Wang, Steven Ka Shing Yau, Zijuan Lin, et al
-
[30]
InThe twelfth international conference on learning representations
MetaGPT: Meta programming for a multi-agent collaborative framework. InThe twelfth international conference on learning representations
- [31]
-
[32]
Ziniu Hu, Ahmet Iscen, Aashi Jain, Thomas Kipf, Yisong Yue, David A Ross, Cordelia Schmid, and Alireza Fathi. 2024. Scenecraft: An llm agent for synthesiz- ing 3d scenes as blender code. InForty-first International Conference on Machine Learning
2024
-
[33]
Sebastian Hubenschmid, Jonathan Wieland, Daniel Immanuel Fink, Andrea Batch, Johannes Zagermann, Niklas Elmqvist, and Harald Reiterer. 2022. Relive: Bridging in-situ and ex-situ visual analytics for analyzing mixed reality user studies. InProceedings of the 2022 CHI Conference on Human Factors in Computing Systems. 1–20
2022
-
[34]
Tal Ifargan, Lukas Hafner, Maor Kern, Ori Alcalay, and Roy Kishony. 2025. Au- tonomous LLM-driven research—from data to human-verifiable research papers. NEJM AI2, 1 (2025), AIoa2400555
2025
-
[35]
Charles Javerliat, Sophie Villenave, Pierre Raimbaud, and Guillaume Lavoué. 2024. Plume: Record, replay, analyze and share user behavior in 6dof xr experiences. IEEE Transactions on Visualization and Computer Graphics30, 5 (2024), 2087– 2097
2024
-
[36]
Hyeonsu B Kang, Tongshuang Wu, Joseph Chee Chang, and Aniket Kittur. 2023. Synergi: A mixed-initiative system for scholarly synthesis and sensemaking. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology. 1–19
2023
-
[37]
Veronika Krauß, Alexander Boden, Leif Oppermann, and René Reiners. 2021. Current practices, challenges, and design implications for collaborative ar/vr application development. InProceedings of the 2021 CHI Conference on Human Factors in Computing Systems. 1–15
2021
-
[38]
Radha Kumaran, You-Jin Kim, Anne E Milner, Tom Bullock, Barry Giesbrecht, and Tobias Höllerer. 2023. The impact of navigation aids on search performance and object recall in wide-area augmented reality. InProceedings of the 2023 CHI conference on human factors in computing systems. 1–17
2023
- [39]
-
[40]
Cheryl Lee, Chunqiu Steven Xia, Longji Yang, Jen-tse Huang, Zhouruixing Zhu, Lingming Zhang, and Michael R Lyu. 2025. Unidebugger: Hierarchical multi- agent framework for unified software debugging. InProceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 18248–18277
2025
-
[41]
Jaewook Lee, Filippo Aleotti, Diego Mazala, Guillermo Garcia-Hernando, Sara Vicente, Oliver James Johnston, Isabel Kraus-Liang, Jakub Powierza, Donghoon Shin, Jon E Froehlich, et al. 2025. Imaginatear: Ai-assisted in-situ authoring in augmented reality. InProceedings of the 38th Annual ACM Symposium on User Interface Software and Technology. 1–21
2025
-
[42]
Jaewook Lee, Raahul Natarrajan, Sebastian S Rodriguez, Payod Panda, and Eyal Ofek. 2022. Remotelab: A vr remote study toolkit. InProceedings of the 35th Annual ACM Symposium on User Interface Software and Technology. 1–9
2022
- [43]
- [44]
-
[45]
Yunxuan Li, Yibing Du, Jiageng Zhang, Le Hou, Peter Grabowski, Yeqing Li, and Eugene Ie. 2024. Improving multi-agent debate with sparse communication topology. InFindings of the Association for Computational Linguistics: EMNLP Shuqi Liao, Chenfei Zhu, Karthik Ramani, and Voicu Popescu
2024
-
[46]
Tian Liang, Zhiwei He, Wenxiang Jiao, Xing Wang, Yan Wang, Rui Wang, Yujiu Yang, Shuming Shi, and Zhaopeng Tu. 2024. Encouraging divergent thinking in large language models through multi-agent debate. InProceedings of the 2024 conference on empirical methods in natural language processing. 17889–17904
2024
-
[47]
Nelson F Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, and Percy Liang. 2024. Lost in the middle: How language models use long contexts.Transactions of the association for computational linguistics12 (2024), 157–173
2024
-
[48]
Chris Lu, Cong Lu, Robert Tjarko Lange, Jakob Foerster, Jeff Clune, and David Ha
-
[49]
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
The ai scientist: Towards fully automated open-ended scientific discovery. arXiv preprint arXiv:2408.06292(2024)
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[50]
Donghyeok Ma, Hanbee Jang, Joon Hyub Lee, and Seok-Hyung Bae. 2025. Gar- den of Papers: Finding, Reading, and Organizing Research Papers in a Visual, Integrated, and Flexible Workspace. InProceedings of the 38th Annual ACM Symposium on User Interface Software and Technology. 1–15
2025
-
[51]
Jacopo Mereu, Valentino Artizzu, Alessandro Carcangiu, Lucio Davide Spano, Ludovica Simeoli, Andrea Mattioli, Marco Manca, Carmen Santoro, and Fabio Paternò. 2024. Empowering end-user in creating extended reality content with a conversational chatbot. InInternational Symposium on Engineering Interactive Computer Systems. Springer, 126–137
2024
-
[52]
Paul Milgram and Fumio Kishino. 1994. A taxonomy of mixed reality visual displays.IEICE Transactions on Information and Systems77, 12 (1994), 1321–1329
1994
-
[53]
Model Context Protocol. 2025. Model Context Protocol Specification. https: //modelcontextprotocol.io/specification/2025-11-25. Accessed: 2026-03-29
2025
-
[54]
Michael Nebeling, Maximilian Speicher, Xizi Wang, Shwetha Rajaram, Brian D Hall, Zijian Xie, Alexander RE Raistrick, Michelle Aebersold, Edward G Happ, Jiayin Wang, et al. 2020. MRAT: The mixed reality analytics toolkit. InProceedings of the 2020 CHI Conference on human factors in computing systems. 1–12
2020
-
[55]
Cassandra Overney, Belén Saldías, Dimitra Dimitrakopoulou, and Deb Roy. 2024. Sensemate: An accessible and beginner-friendly human-ai platform for qualita- tive data analysis. InProceedings of the 29th International Conference on Intelligent User Interfaces. 922–939
2024
-
[56]
Xueni Pan and Antonia F de C Hamilton. 2018. Why and how to use virtual reality to study human social interaction: The challenges of exploring a new research landscape.British Journal of Psychology109, 3 (2018), 395–417
2018
-
[57]
Stéven Picard, Ningyuan Sun, and Jean Botev. 2024. XR MUSE: An Open-Source Unity Framework for Extended Reality-Based Networked Multi-User Studies. Virtual Worlds3, 4 (2024), 404–417. doi:10.3390/virtualworlds3040022
-
[58]
Chen Qian, Wei Liu, Hongzhang Liu, Nuo Chen, Yufan Dang, Jiahao Li, Cheng Yang, Weize Chen, Yusheng Su, Xin Cong, et al. 2024. Chatdev: Communicative agents for software development. InProceedings of the 62nd annual meeting of the association for computational linguistics (volume 1: Long papers). 15174–15186
2024
-
[59]
Yujia Qin, Yining Ye, Junjie Fang, Haoming Wang, Shihao Liang, Shizuo Tian, Junda Zhang, Jiahao Li, Yunxin Li, Shijue Huang, et al. 2025. Ui-tars: Pioneering automated gui interaction with native agents.arXiv preprint arXiv:2501.12326 (2025)
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[60]
Jack Ratcliffe, Francesco Soave, Nick Bryan-Kinns, Laurissa Tokarchuk, and Ildar Farkhatdinov. 2021. Extended reality (XR) remote research: A survey of drawbacks and opportunities. InProceedings of the 2021 CHI conference on human factors in computing systems. 1–13
2021
-
[61]
Samuel Schmidgall, Yusheng Su, Ze Wang, Ximeng Sun, Jialian Wu, Xiaodong Yu, Jiang Liu, Michael Moor, Zicheng Liu, and Emad Barsoum. 2025. Agent laboratory: Using llm agents as research assistants.Findings of the Association for Computational Linguistics: EMNLP 2025(2025), 5977–6043
2025
- [62]
-
[63]
Maximilian Speicher, Brian D Hall, and Michael Nebeling. 2019. What is mixed reality?. InProceedings of the 2019 CHI conference on human factors in computing systems. 1–15
2019
-
[64]
Anthony Steed, Lisa Izzouzi, Klara Brandstätter, Sebastian Friston, Ben Congdon, Otto Olkkonen, Daniele Giunchi, Nels Numan, and David Swapp. 2022. Ubiq-exp: A toolkit to build and run remote and distributed mixed reality experiments. Frontiers in Virtual Reality3 (2022), 912078
2022
-
[65]
Helen Stefanidi, Asterios Leonidis, Maria Korozi, and George Papagiannakis
-
[66]
In2022 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct)
The ARgus Designer: Supporting experts while conducting user studies of AR/MR applications. In2022 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct). IEEE, 885–890
-
[67]
Wei Tao, Yucheng Zhou, Yanlin Wang, Wenqiang Zhang, Hongyu Zhang, and Yu Cheng. 2024. Magis: Llm-based multi-agent framework for github issue resolution.Advances in Neural Information Processing Systems37 (2024), 51963– 51993
2024
-
[68]
Unity Technologies. 2026. Unity Engine: 2D & 3D Development Platform. https: //unity.com/products/unity-engine. Accessed: 2026-03-31
2026
-
[69]
Xingbo Wang, Samantha L Huey, Rui Sheng, Saurabh Mehta, and Fei Wang
-
[70]
Scidasynth: Interactive structured knowledge extraction and synthesis from scientific literature with large language model.arXiv e-prints(2024), arXiv– 2404
2024
-
[71]
Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, et al. 2024. Autogen: Enabling next-gen LLM applications via multi-agent conversations. InFirst conference on language modeling
2024
-
[72]
Liwenhan Xie, Chengbo Zheng, Haijun Xia, Huamin Qu, and Chen Zhu-Tian
-
[73]
InProceedings of the 37th Annual ACM Symposium on User Interface Software and Technology
Waitgpt: Monitoring and steering conversational llm agent in data anal- ysis with on-the-fly code visualization. InProceedings of the 37th Annual ACM Symposium on User Interface Software and Technology. 1–14
-
[74]
Yutaro Yamada, Robert Tjarko Lange, Cong Lu, Shengran Hu, Chris Lu, Jakob Foerster, Jeff Clune, and David Ha. 2025. The ai scientist-v2: Workshop-level auto- mated scientific discovery via agentic tree search.arXiv preprint arXiv:2504.08066 (2025)
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[75]
John Yang, Carlos E Jimenez, Alexander Wettig, Kilian Lieret, Shunyu Yao, Karthik Narasimhan, and Ofir Press. 2024. Swe-agent: Agent-computer interfaces enable automated software engineering.Advances in Neural Information Processing Systems37 (2024), 50528–50652
2024
-
[76]
Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik R Narasimhan, and Yuan Cao. 2022. React: Synergizing reasoning and acting in language models. InThe eleventh international conference on learning representations
2022
- [77]
-
[78]
Chengbo Zheng, Yuanhao Zhang, Zeyu Huang, Chuhan Shi, Minrui Xu, and Xiaojuan Ma. 2024. Disciplink: Unfolding interdisciplinary information seeking process via human-ai co-exploration. InProceedings of the 37th Annual ACM Symposium on User Interface Software and Technology. 1–20
2024
-
[79]
Shuyan Zhou, Frank F Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Tianyue Ou, Yonatan Bisk, Daniel Fried, et al. 2023. Webarena: A realistic web environment for building autonomous agents.arXiv preprint arXiv:2307.13854(2023)
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[80]
Chenfei Zhu, Shao-Kang Hsia, Xiyun Hu, Ziyi Liu, Jingyu Shi, and Karthik Ramani. 2025. agentAR: Creating Augmented Reality Applications with Tool- Augmented LLM-based Autonomous Agents. InProceedings of the 38th Annual ACM Symposium on User Interface Software and Technology. doi:10.1145/3746059. 3747676 OrchestrXR: A Multi-Agent System for Idea-to-Prototy...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.