Fine-tuning Gemma 3 4B on unfiltered self-consistency targets produces a binary verbal correctness discriminator with AUROC 0.774 on TriviaQA, outperforming logit entropy after a modal-filtered pre-registration failed.
1 Pith paper cites this work.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: CONDITIONAL (1)
Representative citing papers
Distilling Self-Consistency into Verbal Confidence: A Pre-Registered Negative Result and Post-Hoc Rescue on Gemma 3 4B