Position: The ML Community Must Build an AI-Augmented Peer-Review Ecosystem

Jing Yang; Markus Wulfmeier; Mihaela van der Schaar; Qiyao Wei; Samuel Holt

arxiv: 2506.08134 · v4 · pith:27MMMTCTnew · submitted 2025-06-09 · 💻 cs.AI · cs.CY

Qiyao Wei, Samuel Holt, Jing Yang, Markus Wulfmeier, Mihaela van der Schaar

keywords: review, peer, ai-assisted, ai-augmented, authors, build, community, ecosystem

Peer review, the bedrock of scientific advancement in machine learning (ML), is strained by a crisis of scale. Exponential growth in manuscript submissions to premier ML venues such as NeurIPS, ICML, and ICLR is outpacing the finite capacity of qualified reviewers, leading to concerns about review quality, consistency, and reviewer fatigue. This position paper argues that AI-assisted peer review must become an urgent research and infrastructure priority. We advocate for a comprehensive AI-augmented ecosystem, leveraging Large Language Models (LLMs) not as replacements for human judgment, but as sophisticated collaborators for authors, reviewers, and Area Chairs (ACs). We propose specific roles for AI in enhancing factual verification, guiding reviewer performance, assisting authors in quality improvement, and supporting ACs in decision-making. Crucially, we contend that the development of such systems hinges on access to more granular, structured, and ethically-sourced peer review process data. We outline a research agenda, including illustrative experiments, to develop and validate these AI assistants, and discuss significant technical and ethical challenges. We call upon the ML community to proactively build this AI-assisted future, ensuring the continued integrity and scalability of scientific validation, while maintaining high standards of peer review.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

ARA: Agentic Reproducibility Assessment For Scalable Support Of Scientific Peer-Review
cs.DL 2026-05 unverdicted novelty 6.0

ARA extracts workflow graphs from papers and scores reproducibility, reaching 61% accuracy on 213 ReScience C articles and outperforming priors on ReproBench and GoldStandardDB.
ARA: Agentic Reproducibility Assessment For Scalable Support Of Scientific Peer-Review
cs.DL 2026-05 unverdicted novelty 6.0

ARA uses LLMs to build workflow graphs linking sources, methods, and outputs in papers, then scores reproducibility, reaching ~61% accuracy on 213 ReScience C articles and outperforming priors on ReproBench and GoldSt...
Toward an Engineering of Science: Rebalancing Generation and Verification in the Age of AI
cs.CY 2026-05 unverdicted novelty 5.0

AI lowers the cost of generating plausible scientific artifacts without lowering verification costs, so the paper proposes blueprints as typed graph components that decompose claims, evidence, and assumptions to enabl...