For example, we could search for critiques (see Section D) ... Recall that in Section 5 we found a negative CD gap for the Addition, Alphabetize, and RACE synthetic tasks

More generally, if we do not control for compute

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Self-critiquing models for assisting human evaluators

cs.CL · 2022-06-12 · conditional · novelty 6.0

Fine-tuned LLMs produce critiques that improve human detection of errors in summaries, with larger models showing better self-critique and refinement capabilities.

citing papers explorer

Showing 1 of 1 citing paper.

Self-critiquing models for assisting human evaluators
cs.CL · 2022-06-12 · conditional · none · ref 14
Fine-tuned LLMs produce critiques that improve human detection of errors in summaries, with larger models showing better self-critique and refinement capabilities.
