MultEval supports collaborative creation of LLM-as-a-judge criteria by surfacing disagreements via consensus-building methods, allowing iterative revisions with examples and history, and keeping transparent how human judgments become automated rules.
Title resolution pending
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.HC 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
MultEval: Supporting Collaborative Alignment for LLM-as-a-Judge Evaluation Criteria
MultEval supports collaborative creation of LLM-as-a-judge criteria by surfacing disagreements via consensus-building methods, allowing iterative revisions with examples and history, and keeping transparent how human judgments become automated rules.