How to steer LLM latents for hallucination detection? In International Conference on Machine Learning, 2025

Seongheon Park, Xuefeng Du, Min-Hsuan Yeh, Haobo Wang, Yixuan Li · 2025

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Harnessing Reasoning Trajectories for Hallucination Detection via Answer-agreement Representation Shaping

cs.LG · 2026-01-24 · unverdicted · novelty 6.0

ARS shapes reasoning trace representations by clustering states that produce consistent answers and separating those that produce inconsistent ones via latent perturbations, improving plug-and-play hallucination detection without human annotations.

citing papers explorer

Showing 1 of 1 citing paper.

Harnessing Reasoning Trajectories for Hallucination Detection via Answer-agreement Representation Shaping

cs.LG · 2026-01-24 · unverdicted · none · ref 27
