WIMPE factorizes reference answers into weighted context-bound points and applies alignment (WPA) and conflict penalty (PCP) metrics, yielding higher human correlation than prior rubric or checklist methods across 10 generative tasks.
Title resolution pending
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Judge Like Human Examiners: A Weighted Importance Multi-Point Evaluation Framework for Generative Tasks with Long-form Answers