AI reviews for all 22,977 AAAI-26 papers were preferred by authors and PC members over human reviews on accuracy and suggestions and outperformed baselines at spotting weaknesses.
Russo Latona, M
6 Pith papers cite this work. Polarity classification is still indexing.
Citation roles: background 2
Citation polarities: background 2
citing papers explorer
- AI-Assisted Peer Review at Scale: The AAAI-26 AI Review Pilot
AI reviews for all 22,977 AAAI-26 papers were preferred by authors and PC members over human reviews on accuracy and suggestions and outperformed baselines at spotting weaknesses.
- LLMs learn scientific taste from institutional traces across the social sciences
Fine-tuned LLMs trained on social science publication records outperform experts and frontier models at judging which research pitches deserve attention.
- ChatGPT: Excellent Paper! Accept It. Editor: Imposter Found! Review Rejected
Authors show prompt injection attacks that jailbreak LLM paper reviewers for biased acceptance and propose embedding triggers to detect when reviews are LLM-generated rather than human.
- SafeReview: Defending LLM-based Review Systems Against Adversarial Hidden Prompts
SafeReview trains a Generator to create adversarial prompts and a Defender to detect them via co-evolution with an IR-GAN-inspired loss, claiming better resilience than static defenses for LLM-based peer review.
- Impact of large language models on peer review opinions from a fine-grained perspective: Evidence from top conference proceedings in AI
Peer review reports in AI conferences have grown longer and more standardized after LLMs, with increased emphasis on surface-level clarity and summaries at the expense of deeper critiques on originality and replicability.
- AI for Auto-Research: Roadmap & User Guide
The paper delivers a stage-by-stage roadmap for AI in research, showing reliable assistance in retrieval and tool tasks but fragility in novelty and judgment, advocating human-governed collaboration.