Fine-tuned LLMs produce critiques that improve human detection of errors in summaries, with larger models showing better self-critique and refinement capabilities.
In general, it is most interesting to critique things that are explanation-like, as opposed to short answers with no explanation (e.g
Self-critiquing models for assisting human evaluators