The method aggregates multiple hallucination evaluation scores via conformal p-values to enable calibrated detection with controlled false alarm rates across LLMs and datasets.
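The summary above does not say how the scores are aggregated, so the following is only a minimal sketch under stated assumptions. It assumes that each evaluator produces a score where higher means "more likely hallucinated", that calibration scores come from outputs known to be non-hallucinated, and that the per-evaluator p-values are merged with a Bonferroni correction. The function names `conformal_p_value`, `bonferroni_combine`, and `detect_hallucination` are hypothetical and not taken from the cited work.

```python
import numpy as np


def conformal_p_value(calib_scores, test_score):
    """Conformal p-value of one test score against null calibration scores.

    Assumes calib_scores were computed on non-hallucinated outputs and that
    larger scores indicate stronger evidence of hallucination.
    """
    calib = np.asarray(calib_scores, dtype=float)
    # Fraction of calibration scores at least as extreme as the test score,
    # with the +1 terms that give finite-sample validity.
    return float((1.0 + np.sum(calib >= test_score)) / (calib.size + 1.0))


def bonferroni_combine(p_values):
    """Combine p-values with a Bonferroni correction (an assumed choice).

    Valid under arbitrary dependence between evaluators, but conservative.
    """
    p = np.asarray(p_values, dtype=float)
    return float(min(1.0, p.size * p.min()))


def detect_hallucination(calib_by_scorer, test_scores, alpha=0.1):
    """Flag a test output when the combined p-value is at most alpha."""
    p_values = [
        conformal_p_value(calib, score)
        for calib, score in zip(calib_by_scorer, test_scores)
    ]
    combined = bonferroni_combine(p_values)
    return combined <= alpha, combined


# Toy data: two evaluators, 99 calibration scores each (0.00, 0.01, ..., 0.98).
calib = np.arange(99) / 100.0
flagged, p_combined = detect_hallucination([calib, calib], [2.0, 0.5], alpha=0.1)
```

If the calibration and test scores are exchangeable, flagging at level `alpha` keeps the false alarm rate on non-hallucinated outputs at or below `alpha`. Other merging rules could replace the Bonferroni step; which one the cited method uses is not stated here.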
Addressing uncertainty in LLMs to enhance reliability in generative AI
2 Pith papers cite this work.