pith. machine review for the scientific record.

Given their answer choice, explanation, and feedback, decide whether you agree with their feedback and wish to incorporate it

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

fields: cs.AI (1)

years: 2023 (1)

verdicts: ACCEPT (1)

representative citing papers

GPQA: A Graduate-Level Google-Proof Q&A Benchmark

cs.AI · 2023-11-20 · accept · novelty 7.0

GPQA is a new graduate-level benchmark where PhD experts score 65% (74% after corrections), skilled non-experts score 34% with web access, and GPT-4 scores 39%, intended to enable realistic tests of human supervision over superhuman AI.

citing papers explorer

Showing 1 of 1 citing paper.

  • GPQA: A Graduate-Level Google-Proof Q&A Benchmark cs.AI · 2023-11-20 · accept · none · ref 38
