PolyChartQA is a new mid-scale dataset for multi-chart question answering that reveals a 27.4% accuracy drop for multimodal models on human-authored questions compared to AI-generated ones, plus a modest gain from a proposed prompting method.
LLM -Based Agent Society Investigation: Collaboration and Confrontation in Avalon Gameplay
3 Pith papers cite this work. Polarity classification is still indexing.
3
Pith papers citing it
years
2026 3verdicts
UNVERDICTED 3representative citing papers
A theory-grounded taxonomy of eight communication roles enables scalable annotation via LLMs and outperforms baselines when predicting peer recognition in student teams and performance improvement on a public deliberation dataset.
Literature on system prompts for AI shows fragmented and contradictory claims that complicate policy efforts to use them as reliable governance mechanisms.
citing papers explorer
-
Beyond Single Plots: A Benchmark for Question Answering on Multi-Charts
PolyChartQA is a new mid-scale dataset for multi-chart question answering that reveals a 27.4% accuracy drop for multimodal models on human-authored questions compared to AI-generated ones, plus a modest gain from a proposed prompting method.