pith. sign in

If you haven’t settled on an answer choice in that time, we encourage you to spend as long as you need to be confident in your selection

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

fields

cs.AI 1

years

2023 1

verdicts

ACCEPT 1

representative citing papers

GPQA: A Graduate-Level Google-Proof Q&A Benchmark

cs.AI · 2023-11-20 · accept · novelty 7.0

GPQA is a new graduate-level benchmark where PhD experts score 65% (74% after corrections), skilled non-experts score 34% with web access, and GPT-4 scores 39%, intended to enable realistic tests of human supervision over superhuman AI.

citing papers explorer

Showing 1 of 1 citing paper.

  • GPQA: A Graduate-Level Google-Proof Q&A Benchmark cs.AI · 2023-11-20 · accept · none · ref 44

    GPQA is a new graduate-level benchmark where PhD experts score 65% (74% after corrections), skilled non-experts score 34% with web access, and GPT-4 scores 39%, intended to enable realistic tests of human supervision over superhuman AI.