We additionally gave detailed comments on several example questions for each part (not shown here), highlighting positive and negative aspects of the examples.

$30 bonus for each question answered correctly A

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

GPQA: A Graduate-Level Google-Proof Q&A Benchmark

cs.AI · 2023-11-20 · accept · novelty 7.0

GPQA is a new graduate-level benchmark where PhD experts score 65% (74% after corrections), skilled non-experts score 34% with web access, and GPT-4 scores 39%, intended to enable realistic tests of human supervision over superhuman AI.

citing papers explorer

GPQA: A Graduate-Level Google-Proof Q&A Benchmark cs.AI · 2023-11-20 · accept · none · ref 13
