LLM annotations for social science tasks vary substantially with prompt wording in interpretive cases but become more stable when majority voting is applied across multiple equivalent prompts.
In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CHI '25)
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CY (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
What Is Actually Being Annotated? Inter-Prompt Reliability as a Measurement Problem in LLM-Based Social Science Labeling
LLM annotations for social science tasks vary substantially with prompt wording in interpretive cases but become more stable when majority voting is applied across multiple equivalent prompts.