LLM judges exhibit high stability under neutral re-evaluation but substantial reversibility under targeted post-decision challenges, quantified via a new Evaluation Robustness Score (ERS).
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
background 2
citation-polarity summary
years
2026 2verdicts
UNVERDICTED 2roles
background 1polarities
background 1representative citing papers
Case studies with blind UK residents and people from Kerala and Tamil Nadu demonstrate that community input at the systematization stage produces culturally grounded definitions of appropriateness for text-to-image model outputs.
citing papers explorer
-
Stability vs. Manipulability: Evaluating Robustness Under Post-Decision Interaction in LLM Judges
LLM judges exhibit high stability under neutral re-evaluation but substantial reversibility under targeted post-decision challenges, quantified via a new Evaluation Robustness Score (ERS).