LLM-Based Agent Society Investigation: Collaboration and Confrontation in Avalon Gameplay
3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: unverdicted (3)
citing papers explorer
- Beyond Single Plots: A Benchmark for Question Answering on Multi-Charts
PolyChartQA is a new mid-scale dataset for multi-chart question answering that reveals a 27.4% accuracy drop for multimodal models on human-authored questions compared to AI-generated ones, plus a modest gain from a proposed prompting method.
- Who Plays Which Role When? Communication Role Dynamics for Peer Recognition and Team Performance Prediction
A theory-grounded taxonomy of eight communication roles enables scalable annotation via LLMs and outperforms baselines when predicting peer recognition in student teams and performance improvement on a public deliberation dataset.
- Prompt Governance? On Governing Technologies Governed by Natural Language
Literature on system prompts for AI shows fragmented and contradictory claims that complicate policy efforts to use them as reliable governance mechanisms.