pith. sign in

Title resolution pending

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

fields

cs.CL 2

years

2023 2

verdicts

UNVERDICTED 2

representative citing papers

GAIA: a benchmark for General AI Assistants

cs.CL · 2023-11-21 · unverdicted · novelty 7.0

GAIA benchmark shows humans at 92% accuracy on simple real-world questions far outperform current AI systems at 15%, proposing this gap as a key milestone for general AI.

citing papers explorer

Showing 2 of 2 citing papers.

  • GAIA: a benchmark for General AI Assistants cs.CL · 2023-11-21 · unverdicted · none · ref 41

    GAIA benchmark shows humans at 92% accuracy on simple real-world questions far outperform current AI systems at 15%, proposing this gap as a key milestone for general AI.

  • Chain-of-Verification Reduces Hallucination in Large Language Models cs.CL · 2023-09-20 · unverdicted · none · ref 100

    Chain-of-Verification reduces hallucinations in large language models by drafting responses, planning independent verification questions, answering them separately, and generating a final verified output.