arxiv: 2505.16120 · v2 · submitted 2025-05-22 · 💻 cs.AI

LLM-Powered AI Agent Systems and Their Applications in Industry

Pith reviewed 2026-05-22 14:01 UTC · model grok-4.3

classification 💻 cs.AI
keywords LLM-powered agents · AI agent systems · industry applications · multi-modal LLMs · agent challenges · software-based agents · hybrid agent systems · agent evolution

The pith

LLM-powered agents deliver flexibility and cross-domain reasoning that rule-based systems lack, supporting uses from customer service to healthcare.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This review traces agent systems from rigid pre-LLM designs to current architectures built around large language models. It shows how these systems gain the ability to process text, images, audio, and tabular data for adaptive behavior in real settings. The authors group the systems into software-based, physical, and adaptive hybrid types, then map them to concrete industry uses. They also list practical drawbacks such as response delays and security gaps and suggest fixes. A reader would care because the shift could change how organizations automate varied, language-heavy tasks without heavy custom coding.

Core claim

The paper argues that, unlike traditional rule-based agents with limited task scope, LLM-powered agents offer greater flexibility, cross-domain reasoning, and natural language interaction. With multi-modal LLMs, these systems process diverse data types, including text, images, audio, and structured tabular data, to enable richer real-world behavior. The review categorizes current systems into software-based, physical, and adaptive hybrid types; surveys applications in customer service, software development, manufacturing automation, personalized education, financial trading, and healthcare; and examines challenges such as high inference latency, output uncertainty, lack of evaluation metrics, and security vulnerabilities.

What carries the argument

The categorization of agent systems into software-based, physical, and adaptive hybrid systems, which organizes the current landscape and shows how each type supports adaptive real-world tasks.

If this is right

  • Customer service can shift to natural-language interactions without scripted responses.
  • Software development gains agents that reason across code and requirements.
  • Manufacturing automation incorporates multi-modal data for adaptive control.
  • Personalized education and financial trading receive more responsive agent support.
  • Healthcare applications benefit from agents that handle mixed data sources.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Industry teams may begin testing hybrid agents that combine software logic with physical sensors for specific workflows.
  • Reducing latency through optimized inference could open real-time uses in trading and control systems.
  • Mitigations for output uncertainty and security vulnerabilities might become standard requirements before deployment in regulated sectors.
  • The same categorization lens could help compare agents across new domains like logistics or legal review.

Load-bearing premise

The three-way split into software-based, physical, and adaptive hybrid systems is assumed to cover existing agent designs without major omissions or the need for different groupings.

What would settle it

A survey or deployment record showing a common agent system that fits none of the three categories or performs no better than rule-based agents in flexibility and cross-domain tasks.
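The coverage test described above can be made concrete with a small sketch. The three category names come from the paper; everything else is an assumption made for illustration: the `AgentSystem` fields, the rule that a hybrid must both act digitally and physically and adapt from feedback (loosely based on the Figure 1 caption), and the example systems. It is not the authors' classification procedure.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class AgentCategory(Enum):
    SOFTWARE_BASED = "software-based"
    PHYSICAL = "physical"
    ADAPTIVE_HYBRID = "adaptive hybrid"


@dataclass(frozen=True)
class AgentSystem:
    # All three flags are hypothetical features chosen for this sketch.
    name: str
    acts_digitally: bool        # works through software tools, APIs, or documents
    acts_physically: bool       # senses or actuates hardware
    learns_from_feedback: bool  # adapts based on the outcomes of its own actions


def categorize(system: AgentSystem) -> Optional[AgentCategory]:
    """Return the matching category, or None when the system fits none."""
    if system.acts_digitally and system.acts_physically:
        # Assumed rule: a hybrid must also adapt from feedback.
        if system.learns_from_feedback:
            return AgentCategory.ADAPTIVE_HYBRID
        return None
    if system.acts_physically:
        return AgentCategory.PHYSICAL
    if system.acts_digitally:
        return AgentCategory.SOFTWARE_BASED
    return None


def uncovered(systems: List[AgentSystem]) -> List[str]:
    """Names of systems the taxonomy fails to place (candidate counterexamples)."""
    return [s.name for s in systems if categorize(s) is None]


if __name__ == "__main__":
    fleet = [
        AgentSystem("support chatbot", True, False, False),
        AgentSystem("robot arm", False, True, False),
        AgentSystem("adaptive line controller", True, True, True),
        AgentSystem("scripted sensor bridge", True, True, False),
    ]
    print(uncovered(fleet))
```

Under these assumed rules, a system that spans software and hardware but never adapts falls outside all three categories. That is the kind of boundary case a real survey or deployment record would need to surface to challenge the premise.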

Figures

Figures reproduced from arXiv: 2505.16120 by Guannan Liang, Qianqian Tong.

Figure 1: LLM-Powered AI Agent System. By combining the capabilities of software-based and physical agents, hybrid agents emerge as a powerful class of systems that enable seamless integration with the real world. Adaptive and Hybrid Agents (Real-World Integration) operate in a feedback-driven environment, continuously learning from both digital and physical interactions by processing multi-modal data such as text,…
Figure 2: Architecture of LLM-Powered Agent System.
Original abstract

The emergence of Large Language Models (LLMs) has reshaped agent systems. Unlike traditional rule-based agents with limited task scope, LLM-powered agents offer greater flexibility, cross-domain reasoning, and natural language interaction. Moreover, with the integration of multi-modal LLMs, current agent systems are highly capable of processing diverse data modalities, including text, images, audio, and structured tabular data, enabling richer and more adaptive real-world behavior. This paper comprehensively examines the evolution of agent systems from the pre-LLM era to current LLM-powered architectures. We categorize agent systems into software-based, physical, and adaptive hybrid systems, highlighting applications across customer service, software development, manufacturing automation, personalized education, financial trading, and healthcare. We further discuss the primary challenges posed by LLM-powered agents, including high inference latency, output uncertainty, lack of evaluation metrics, and security vulnerabilities, and propose potential solutions to mitigate these concerns.
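Two of the challenges named in the abstract, inference latency and output uncertainty, can be measured with very little code. The sketch below is an illustration, not a method taken from the paper. It treats the agent as any callable from prompt to text, times each call, and uses the share of samples that agree with the most common answer as a rough uncertainty proxy. The function names and the stub agent are hypothetical.

```python
import itertools
import time
from collections import Counter
from typing import Callable, List, Sequence, Tuple


def sample_with_latency(
    agent: Callable[[str], str], prompt: str, n: int
) -> Tuple[List[str], List[float]]:
    """Call the agent n times and record each answer and its wall-clock latency."""
    answers: List[str] = []
    latencies: List[float] = []
    for _ in range(n):
        start = time.perf_counter()
        answers.append(agent(prompt))
        latencies.append(time.perf_counter() - start)
    return answers, latencies


def agreement_rate(answers: Sequence[str]) -> float:
    """Fraction of samples equal to the most common answer (1.0 means fully consistent)."""
    if not answers:
        raise ValueError("need at least one answer")
    normalized = [a.strip().lower() for a in answers]
    top_count = Counter(normalized).most_common(1)[0][1]
    return top_count / len(normalized)


def make_stub_agent(replies: Sequence[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a real LLM call; cycles through canned replies."""
    cycle = itertools.cycle(replies)
    return lambda prompt: next(cycle)


if __name__ == "__main__":
    agent = make_stub_agent(["Approve", "approve ", "Reject", "approve"])
    answers, latencies = sample_with_latency(agent, "Approve this refund?", n=4)
    print(f"agreement: {agreement_rate(answers):.2f}")
    print(f"worst latency: {max(latencies):.6f} s")
```

Exact-match agreement only suits short, closed-form answers. Free-form outputs would need a semantic comparison, which this sketch leaves out.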

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper surveys the evolution of agent systems from pre-LLM rule-based approaches to current LLM-powered architectures. It emphasizes advantages in flexibility, cross-domain reasoning, natural language interaction, and multi-modal data processing. The manuscript categorizes agent systems into software-based, physical, and adaptive hybrid systems, reviews applications in customer service, software development, manufacturing automation, personalized education, financial trading, and healthcare, and outlines challenges including high inference latency, output uncertainty, lack of evaluation metrics, and security vulnerabilities while suggesting potential solutions.

Significance. As a descriptive survey without new empirical results, formal proofs, or falsifiable predictions, the paper's value lies in its organizational framework and synthesis of existing literature. If the taxonomy is well-motivated and the review of applications and challenges is balanced and up-to-date, it could help researchers and practitioners navigate the field. The manuscript receives credit for attempting to structure a fast-moving area through a three-way categorization lens rather than claiming exhaustiveness.

minor comments (3)
  1. [Categorization section] The categorization into software-based, physical, and adaptive hybrid systems is presented without explicit discussion of boundary cases or comparison to alternative taxonomies in the literature; adding a short subsection justifying the framework would improve clarity.
  2. [Applications section] Applications are listed at a high level; including one or two concrete, cited examples per domain (e.g., a specific manufacturing automation case) would make the claims more tangible without altering the survey nature.
  3. [Challenges and solutions] The proposed solutions to challenges such as inference latency and security vulnerabilities should be tied to specific references or ongoing work rather than left as high-level suggestions.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of the manuscript and for recommending minor revision. We appreciate the recognition that the paper's value lies in its organizational framework and synthesis of the literature on LLM-powered agent systems. The three-way categorization into software-based, physical, and adaptive hybrid systems is intended to help structure this fast-moving area, and we are glad this approach was noted favorably.

Circularity Check

0 steps flagged

No significant circularity in descriptive survey

The paper is a review article that summarizes the evolution of agent systems, describes LLM-powered architectures, proposes a high-level categorization into software-based, physical, and adaptive hybrid systems as an organizational framework, and lists applications and challenges. No equations, derivations, predictions, or fitted parameters exist. The taxonomy is presented as a review lens rather than a claim derived from or reducing to its own inputs. External literature is cited without load-bearing self-citation chains that would make central claims equivalent to unverified prior work by the same authors. The content remains self-contained against external benchmarks as a descriptive survey.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a survey paper containing no free parameters, mathematical axioms, or invented entities. All content rests on summaries of previously published work in the LLM and agent literature.

pith-pipeline@v0.9.0 · 5678 in / 987 out tokens · 36340 ms · 2026-05-22T14:01:09.143703+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. LLM-Based Multi-Agent Systems for Code Generation: A Multi-Vocal Literature Review

    cs.SE 2026-02 unverdicted novelty 3.0

    A review of 114 studies classifies motivations into nine categories, analyzes common models and benchmarks, synthesizes challenges into six categories with 26 subcategories and solutions, and identifies six future res...

Reference graph

Works this paper leans on

122 extracted references · 122 canonical work pages · cited by 1 Pith paper · 14 internal anchors
