Reasoning accuracy in latent CoT depends on mutual information fidelity between latent trajectories and explicit steps, with generative reconstruction preserving capacity better than geometric compression.
What Happened in LLMs Layers when Trained for Fast vs
2 Pith papers cite this work.