Title resolution pending
5 Pith papers cite this work. Polarity classification is still indexing.
citation summary
years: 2026 (5)
verdicts: UNVERDICTED (5)
roles: method (1)
polarities: use method (1)
citing papers explorer
- CoCoReviewBench: A Completeness- and Correctness-Oriented Benchmark for AI Reviewers
  CoCoReviewBench curates 3,900 ICLR and NeurIPS papers into category-specific subsets with discussion-based annotations to evaluate AI reviewers on completeness and correctness rather than human review overlap.
- PeerPrism: Peer Evaluation Expertise vs Review-writing AI
  PeerPrism benchmark demonstrates that state-of-the-art LLM detectors conflate surface text style with intellectual contribution and fail on hybrid human-AI peer reviews.
- Expand More, Shrink Less: Shaping Effective-Rank Dynamics for Dense Scaling in Recommendation
  RankElastor mitigates embedding collapse via spectrum-robust token mixing and GLU-based P-FFNs, yielding better performance and scaling on industrial recommendation datasets.
- SID-Coord: Coordinating Semantic IDs for ID-based Ranking in Short-Video Search
  SID-Coord coordinates semantic IDs with hashed item IDs via attention fusion, adaptive gating, and interest alignment, yielding +0.664% long-play rate and +0.369% playback duration gains in production search ranking.
- Peerispect: Claim Verification in Scientific Peer Reviews
  Peerispect extracts claims from peer reviews, retrieves evidence from the manuscript, and verifies them via NLI in a modular pipeline with a visual interface.