Truthfulqa: Measuring how models mimic human falsehoods

Stephanie Lin, Jacob Hilton, Owain Evans · 2022

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

browse 3 citing papers

citation-role summary

dataset 1

citation-polarity summary

use dataset 1

representative citing papers

Dynamic Chunking for Diffusion Language Models

cs.CL · 2026-05-15 · unverdicted · novelty 7.0

DCDM replaces positional blocks with learnable semantic chunks via differentiable Chunking Attention, yielding consistent gains over block and unstructured diffusion baselines up to 1.5B parameters.

Harnessing Reasoning Trajectories for Hallucination Detection via Answer-agreement Representation Shaping

cs.LG · 2026-01-24 · unverdicted · novelty 6.0

ARS shapes reasoning trace representations by clustering states that produce consistent answers and separating those that produce inconsistent ones via latent perturbations, improving plug-and-play hallucination detection without human annotations.

Rethinking Local Learning: A Cheaper and Faster Recipe for LLM Post-Training

cs.CL · 2026-05-06 · unverdicted · novelty 5.0 · 2 refs

LoPT splits LLM post-training at the midpoint with task loss on the second half and feature reconstruction on the first half to reduce cost and interference.

citing papers explorer

Showing 3 of 3 citing papers after filters.

Dynamic Chunking for Diffusion Language Models cs.CL · 2026-05-15 · unverdicted · none · ref 25
DCDM replaces positional blocks with learnable semantic chunks via differentiable Chunking Attention, yielding consistent gains over block and unstructured diffusion baselines up to 1.5B parameters.
Harnessing Reasoning Trajectories for Hallucination Detection via Answer-agreement Representation Shaping cs.LG · 2026-01-24 · unverdicted · none · ref 21
ARS shapes reasoning trace representations by clustering states that produce consistent answers and separating those that produce inconsistent ones via latent perturbations, improving plug-and-play hallucination detection without human annotations.
Rethinking Local Learning: A Cheaper and Faster Recipe for LLM Post-Training cs.CL · 2026-05-06 · unverdicted · none · ref 16 · 2 links
LoPT splits LLM post-training at the midpoint with task loss on the second half and feature reconstruction on the first half to reduce cost and interference.

Truthfulqa: Measuring how models mimic human falsehoods

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer