CAMEL: Communicative agents for "mind" exploration of large scale language model society, 2023
3 Pith papers cite this work.
citing papers explorer
- ExCyTIn-Bench: Evaluating LLM agents on Cyber Threat Investigation
  ExCyTIn-Bench is the first benchmark of 7542 questions from Microsoft Sentinel threat investigation graphs, where the best LLM agent achieves a reward of 0.606.
- Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security
  This survey discusses key components and challenges for Personal LLM Agents and reviews solutions for their capability, efficiency, and security.
- Multi-Agent Collaboration: Harnessing the Power of Intelligent LLM Agents
  The paper introduces a collaborative multi-agent framework for LLMs and applies it conceptually to existing models like Auto-GPT, BabyAGI, and Gorilla through case studies in domains such as courtroom simulations and software development.