Open-weight LLMs match or exceed commercial API performance on 9 of 34 political science classification tasks, with average F1 differences under 0.02 and clearest API advantages on complex multi-label tasks.
Clinton, Cassy Dorff, Brenton Kenkel, and Jennifer M
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers explorer
- Open-Weight LLMs Are Often Competitive with Commercial APIs for Political Science Text Classification
Open-weight LLMs match or exceed commercial API performance on 9 of 34 political science classification tasks, with average F1 differences under 0.02 and clearest API advantages on complex multi-label tasks.
- Cross-Cultural Simulation of Citizen Emotional Responses to Bureaucratic Red Tape Using LLM Agents
LLM agents display limited alignment with human emotional responses to red tape across cultures, performing worse in Eastern contexts, while cultural prompting offers little improvement.