OS-SPEAR is a new evaluation toolkit that tests 22 OS agents and identifies trade-offs between efficiency and safety or robustness.
Llm-augmented agent-based modelling for social simula- tions: Challenges and opportunities
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 5roles
background 2representative citing papers
WhatIf provides an interactive platform for real-time exploration of LLM-driven social simulations, enabling policymakers to iteratively test plans, reflect on assumptions, and uncover vulnerabilities in emergency preparedness scenarios.
LLMs assigned high or low status personas in multi-turn dialogues exhibit socio-cognitive effects including language coordination, pronoun patterns, persuasion success, and compliance with unsafe requests.
FSTS automates multi-agent social experiment design via LLM script generation across three phases, with tests indicating reproduction of real-world outcomes.
The survey organizes LLM-based multi-agent collaboration mechanisms into a framework with dimensions of actors, types, structures, strategies, and coordination protocols, reviews applications across domains, and identifies challenges for future research.
citing papers explorer
-
OS-SPEAR: A Toolkit for the Safety, Performance,Efficiency, and Robustness Analysis of OS Agents
OS-SPEAR is a new evaluation toolkit that tests 22 OS agents and identifies trade-offs between efficiency and safety or robustness.
-
WhatIf: Interactive Exploration of LLM-Powered Social Simulations for Policy Reasoning
WhatIf provides an interactive platform for real-time exploration of LLM-driven social simulations, enabling policymakers to iteratively test plans, reflect on assumptions, and uncover vulnerabilities in emergency preparedness scenarios.
-
Do LLM Agents Mirror Socio-Cognitive Effects in Power-Asymmetric Conversations?
LLMs assigned high or low status personas in multi-turn dialogues exhibit socio-cognitive effects including language coordination, pronoun patterns, persuasion success, and compliance with unsafe requests.
-
From Script to Stage: Automating Experimental Design for Social Simulations with LLMs
FSTS automates multi-agent social experiment design via LLM script generation across three phases, with tests indicating reproduction of real-world outcomes.
-
Multi-Agent Collaboration Mechanisms: A Survey of LLMs
The survey organizes LLM-based multi-agent collaboration mechanisms into a framework with dimensions of actors, types, structures, strategies, and coordination protocols, reviews applications across domains, and identifies challenges for future research.