Fine-tuned LLMs produce critiques that improve human detection of errors in summaries, with larger models showing better self-critique and refinement capabilities.
Essentially, we show the human a slate of critiques (like we did in Section 3.4)
1 Pith paper cites this work. Polarity classification is still indexing.
Pith papers citing it: 1
Fields: cs.CL (1)
Years: 2022 (1)
Verdicts: CONDITIONAL (1)
Representative citing papers
Self-critiquing models for assisting human evaluators