Frontier LLMs exhibit premature closure by selecting answers at high rates on medical tasks where the correct choice was removed and on open-ended queries, with safety prompting reducing but not eliminating the behavior.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CL 2years
2026 2verdicts
UNVERDICTED 2representative citing papers
The Cylindrical Representation Hypothesis (CRH) models LLM representations as a central axis for concept activation surrounded by a normal plane containing sensitive sectors that determine steering sensitivity and introduce intrinsic uncertainty.
citing papers explorer
-
Quantifying and Mitigating Premature Closure in Frontier LLMs
Frontier LLMs exhibit premature closure by selecting answers at high rates on medical tasks where the correct choice was removed and on open-ended queries, with safety prompting reducing but not eliminating the behavior.
-
The Cylindrical Representation Hypothesis for Language Model Steering
The Cylindrical Representation Hypothesis (CRH) models LLM representations as a central axis for concept activation surrounded by a normal plane containing sensitive sectors that determine steering sensitivity and introduce intrinsic uncertainty.