ARA uses LLMs to build workflow graphs that link sources, methods, and outputs in papers, then scores reproducibility. It reaches ~61% accuracy on 213 ReScience C articles and outperforms prior approaches on ReproBench and GoldStandardDB.
AI-driven review systems: evaluating LLMs in scalable and bias-aware academic reviews
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
background 2
citation-polarity summary
verdicts
UNVERDICTED 3
roles
background 2
polarities
background 2
representative citing papers
A survey synthesizing LLM methods for peer review critique generation and score prediction, including taxonomies, benchmark limitations, domain biases, and robustness risks such as prompt injection.
The paper proposes a four-role framework for LLMs in scientific innovation and reviews methods, benchmarks, and limitations across Assistant, Collaborator, Scientist, and Evaluator roles.