In the coherent condition, a systematic rule is applied consistently across all problems of the same type (e.g., for distribution: a(b+c) = ab+c instead of ab+ac).
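The distinction between a correct expansion and the coherent (systematic) error quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the representation of a(bx + c) as a (coefficient, constant) pair, and the way the random error is produced are all hypothetical and are not taken from the cited paper.

```python
import random


def distribute_correct(a, b, c):
    """Expand a(bx + c) correctly. Returns (coefficient of x, constant)."""
    return a * b, a * c


def distribute_coherent_error(a, b, c):
    """Apply the systematic wrong rule a(b+c) = ab + c.

    The same mistake is made every time: the constant term is
    left unmultiplied by a.
    """
    return a * b, c


def distribute_random_error(a, b, c, rng):
    """Hypothetical random error (an assumption, not the paper's procedure).

    The correct constant is shifted by a nonzero offset, so the result
    is plausible but follows no consistent rule across problems.
    """
    coeff, const = distribute_correct(a, b, c)
    offset = rng.choice([-3, -2, -1, 1, 2, 3])
    return coeff, const + offset


if __name__ == "__main__":
    rng = random.Random(0)
    a, b, c = 3, 2, 5  # illustrative values for 3(2x + 5)
    print("correct: ", distribute_correct(a, b, c))
    print("coherent:", distribute_coherent_error(a, b, c))
    print("random:  ", distribute_random_error(a, b, c, rng))
```

With the illustrative values above, the correct expansion yields 6x + 15, while the coherent rule always yields 6x + 5. The random variant differs from the correct answer by an offset that changes from call to call.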

= 6x + 15
Step 2: 6x + 15 - 4x = 2x + 15
Answer: 2x + 15

The corresponding random-error version replaces a derivation step with a plausible but incorrect computation (e · 1978

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Truth as a Compression Artifact in Language Model Training

cs.CL · 2026-03-12 · unverdicted · novelty 6.0

Controlled experiments show language models extract correct answers from contradictory data only when errors are structurally incoherent, supporting the hypothesis that gradient descent selects the most compressible answer cluster.

citing papers explorer

Showing 1 of 1 citing paper.

Truth as a Compression Artifact in Language Model Training

cs.CL · 2026-03-12 · unverdicted · none · ref 11

