Introduces RevCI benchmark and IMPACT multi-agent framework for evidence-level contradiction detection and graded intensity scoring in peer reviews, distilled into efficient TIDE model.
A Dataset of Peer Reviews ( P eer R ead): Collection, Insights and NLP Applications
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.CL 3years
2026 3verdicts
UNVERDICTED 3roles
dataset 1polarities
use dataset 1representative citing papers
Peer review reports in AI conferences have grown longer and more standardized after LLMs, with increased emphasis on surface-level clarity and summaries at the expense of deeper critiques on originality and replicability.
Numerical scores predict ICLR acceptance at 91% accuracy while review text reaches only 81%, because politeness makes rejected papers' reviews contain more positive than negative words.
citing papers explorer
-
When Reviews Disagree: Fine-Grained Contradiction Analysis in Scientific Peer Reviews
Introduces RevCI benchmark and IMPACT multi-agent framework for evidence-level contradiction detection and graded intensity scoring in peer reviews, distilled into efficient TIDE model.
-
Impact of large language models on peer review opinions from a fine-grained perspective: Evidence from top conference proceedings in AI
Peer review reports in AI conferences have grown longer and more standardized after LLMs, with increased emphasis on surface-level clarity and summaries at the expense of deeper critiques on originality and replicability.
-
Decoupling Scores and Text: The Politeness Principle in Peer Review
Numerical scores predict ICLR acceptance at 91% accuracy while review text reaches only 81%, because politeness makes rejected papers' reviews contain more positive than negative words.