OlympiadBench is a new bilingual multimodal dataset of 8476 competition problems on which GPT-4V scores 17.97 percent overall and 10.74 percent on physics.
Mainly observed for problems with a simple answer, such as when a variable takes 0 as its value.
OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems