Can large language models provide useful feedback on research papers? A large-scale empirical analysis
3 Pith papers cite this work.
citing papers explorer
- AgentReview: Exploring Peer Review Dynamics with LLM Agents
  AgentReview is the first LLM-based simulation framework for peer review that quantifies a 37.1% decision variation attributable to reviewer biases.
- FactReview: Evidence-Grounded Reviews with Literature Positioning and Execution-Based Claim Verification
  FactReview extracts claims from ML papers, positions them via literature retrieval, and verifies them through code execution, labeling each as Supported, Partially supported, or In conflict, as shown in a CompGCN case study.
- Jagged AI in Scientific Peer Review: Evidence from POMP Data Analysis
  AI peer reviewers for POMP analyses show jagged performance: strong on technical error detection and invalid inference but weak on interpretive errors, narrative coherence, and domain-informed critique.