MRAG-Suite: A diagnostic evaluation platform for visual retrieval-augmented generation

Ji, Y · 2021 · DOI 10.1038/s41746-025-01576-4

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Beyond Accuracy: Measuring Bias Acknowledgment in Chain-of-Thought Reasoning for Responsible AI Evaluation

cs.LG · 2026-06-13 · unverdicted · novelty 6.0

GPT-4o and Claude Sonnet 4 show similar susceptibility to bias on GSM8K (1.3% vs 1.2%) but differ sharply in acknowledgment rates (13% vs 75%) under a rubric-defined metric.

citing papers explorer

Beyond Accuracy: Measuring Bias Acknowledgment in Chain-of-Thought Reasoning for Responsible AI Evaluation cs.LG · 2026-06-13 · unverdicted · none · ref 6
