If any part is missing or unclear (e.g., if the answer does not match any of the listed options), the question should be deemed invalid.

The question, options, and answer must be fully defined.
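
The rule above can be sketched as a small validity check. This is a minimal illustration, not the method of any cited work: the `MultipleChoiceItem` record and its field names are hypothetical, and only the mechanical parts of the rule (missing fields, answer not among the options) are covered. Whether a part is "unclear" still needs human or model judgment.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class MultipleChoiceItem:
    # Hypothetical record layout; the source does not specify field names.
    question: Optional[str]
    options: Optional[Sequence[str]]
    answer: Optional[str]


def is_valid(item: MultipleChoiceItem) -> bool:
    """Return True only if question, options, and answer are all defined
    and the answer matches one of the listed options."""
    # The question must be present and non-blank.
    if not item.question or not item.question.strip():
        return False
    # There must be at least one option, and no option may be blank.
    if not item.options or any(not opt or not opt.strip() for opt in item.options):
        return False
    # The answer must be present and non-blank.
    if not item.answer or not item.answer.strip():
        return False
    # The answer must match one of the listed options (ignoring outer whitespace).
    return item.answer.strip() in {opt.strip() for opt in item.options}
```

An item whose answer is absent from its options, such as `MultipleChoiceItem("2 + 2 = ?", ["3", "4"], "5")`, would be deemed invalid under this sketch.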

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

cs.CL · 2025-02-20 · conditional · novelty 6.0

SuperGPQA is a new benchmark that tests LLMs on graduate questions from 285 disciplines after human-LLM filtering, with current best models scoring 61.82 percent.

citing papers explorer

Showing 1 of 1 citing paper.

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

cs.CL · 2025-02-20 · conditional · none · ref 32

SuperGPQA is a new benchmark that tests LLMs on graduate questions from 285 disciplines after human-LLM filtering, with current best models scoring 61.82 percent.