HaluEval: A large-scale hallucination evaluation benchmark for large language models

17 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 2

citation-polarity summary
roles: background 2
polarities: background 2

representative citing papers

Design and Report Benchmarks for Knowledge Work

cs.AI · 2026-05-22 · unverdicted · novelty 6.0

Proposes a three-step benchmark design method (define work activity, specify tested setting, score work product) derived from work studies and O*NET, demonstrated via three case analyses.

A Survey of Hallucination in Large Foundation Models

cs.AI · 2023-09-12 · accept · novelty 3.0

A survey classifying hallucination phenomena specific to large foundation models, establishing evaluation criteria, examining mitigation strategies, and discussing future directions.
