Formalizing Latent Thoughts: Four Axioms of Thought Representation in LLMs

Fahd Seddik; Fatemeh Fard

arxiv: 2606.27378 · v1 · pith:4SA7ZHTRnew · submitted 2026-05-07 · 💻 cs.CL · cs.LG

Formalizing Latent Thoughts: Four Axioms of Thought Representation in LLMs

Fahd Seddik , Fatemeh Fard This is my paper

Pith reviewed 2026-06-30 23:48 UTC · model grok-4.3

classification 💻 cs.CL cs.LG

keywords latent representationsLLMsaxiomatic evaluationreasoning tasksrepresentation qualitycausalityseparabilitystability

0 comments

The pith

Latent thought representations in LLMs fail to satisfy four axioms of causality, minimality, separability, and stability simultaneously.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper sets out four axioms that any useful latent thought representation in an LLM should obey and supplies a way to measure each one directly from the internal activations. The measurements do not rely on how well the model answers questions on a benchmark. When the measures are applied to many open models across 23 reasoning tasks, no representation meets every axiom at once. The representations can separate broad task categories but cannot tell two questions of the same category apart, and they add almost no information that is not already present in the input embedding. The same pattern appears in dense models, reasoning-distilled models, and reinforcement-learned models alike.

Core claim

No candidate satisfies all four axioms simultaneously; the representations distinguish task type reliably but cannot distinguish between two questions within the same task, and they encode little information beyond what is already present in the input embedding, with the failure consistent across model families.

What carries the argument

Four functional axioms (Causality, Minimality, Separability, Stability) equipped with quantitative measures computed directly on the latent representation.

Load-bearing premise

The quantitative measures defined for each axiom can be computed directly on the representation and are independent of downstream benchmark scores.

What would settle it

A representation extracted from some LLM that scores high on all four quantitative axiom measures across the tested tasks.

Figures

Figures reproduced from arXiv: 2606.27378 by Fahd Seddik, Fatemeh Fard.

**Figure 2.** Figure 2: (a) Discriminator accuracy on across- and within-task pairs, one point per (LLM, candidate). [PITH_FULL_IMAGE:figures/full_fig_p008_2.png] view at source ↗

**Figure 3.** Figure 3: DCS versus the semantic equivalence threshold [PITH_FULL_IMAGE:figures/full_fig_p024_3.png] view at source ↗

**Figure 4.** Figure 4: Distributional Consistency Score (DCS) across source LLMs at [PITH_FULL_IMAGE:figures/full_fig_p027_4.png] view at source ↗

**Figure 5.** Figure 5: Per-beam KL CDFs at the 50-token averaging window on Llama-3.3-70B, with one panel per representation family (anchor candidates, soft thinking, soft thinking with Gumbel noise, latent thinking). Within each thinking family every step count is shown. than a fixed projection-induced shift. The discriminator-projection numbers are retained in [PITH_FULL_IMAGE:figures/full_fig_p028_5.png] view at source ↗

**Figure 6.** Figure 6: Intraclass correlation ICC = σ 2 between/(σ 2 between + σ 2 within) of per-beam causality KL on Llama-3.3-70B, with bars per averaging window. Values above 0.5 indicate that the per-problem mean carries most of the dispersion, validating the cluster bootstrap that resamples problems and keeps within-problem beams glued together (Section D.1). Chain-rule decomposition. The general chain rule for mutual info… view at source ↗

**Figure 7.** Figure 7: Pearson r between per-example causality KL and input length (top) or output length (bottom), in characters, on Llama-3.1-8B-Instruct. Candidates follow the order of [PITH_FULL_IMAGE:figures/full_fig_p030_7.png] view at source ↗

**Figure 8.** Figure 8: Mean causality KL on Llama-3.1-8B-Instruct across [PITH_FULL_IMAGE:figures/full_fig_p031_8.png] view at source ↗

**Figure 9.** Figure 9: Mean causality KL on Llama-3.3-70B as the substituted length [PITH_FULL_IMAGE:figures/full_fig_p032_9.png] view at source ↗

**Figure 10.** Figure 10: Each candidate placed on the (PR, k-NN purity) plane, one panel per LLM. Solid lines trace thinking-family trajectories as the step count grows from 1 to 128. Candidates further to the upper right are preferable: higher PR indicates a within-task subspace spread across more directions, and higher k-NN purity indicates that nearest neighbours are drawn from the same task [PITH_FULL_IMAGE:figures/full_fig_… view at source ↗

**Figure 11.** Figure 11: Within-task discriminator accuracy versus BBEH pass@1 for candidates averaged across [PITH_FULL_IMAGE:figures/full_fig_p035_11.png] view at source ↗

**Figure 12.** Figure 12: Output-length distribution per source LLM, in characters, pooled across every generation. [PITH_FULL_IMAGE:figures/full_fig_p037_12.png] view at source ↗

**Figure 13.** Figure 13: Median output length per BBEH task, in characters, with one bar per source LLM. [PITH_FULL_IMAGE:figures/full_fig_p038_13.png] view at source ↗

**Figure 14.** Figure 14: Discriminator-based DCS. The top row shows the score under [PITH_FULL_IMAGE:figures/full_fig_p043_14.png] view at source ↗

read the original abstract

We introduce an axiomatic evaluation framework for latent thought representations in LLMs, comprising metrics that are independent of downstream benchmark scores and reveal representational failures that benchmark accuracy masks. Existing evaluations conflate representation quality with model capacity. Therefore, failures cannot be attributed to the representation rather than to the model that processes it. We formalize four functional axioms (Causality, Minimality, Separability, and Stability) and define a quantitative measure for each, computed directly on the representation independently of downstream accuracy. We audit open-weight LLMs across 23 reasoning tasks (e.g., Spatial Reasoning, Factual QA). We find that no candidate satisfies all four axioms simultaneously, that the representations distinguish task type reliably but cannot distinguish between two questions within the same task, and that the representations encode little information beyond what is already present in the input embedding. The failure is consistent across dense, reasoning-distilled, and RL-trained model families, indicating that the gap is structural rather than a property of model size or training procedure.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The four-axiom framework is a clean attempt to isolate representation quality, but the independence of the metrics from downstream scores cannot be checked without the actual definitions and extraction procedures.

read the letter

The paper's core move is to define four axioms for latent thought representations—Causality, Minimality, Separability, and Stability—and claim that quantitative measures for each can be computed directly on the hidden states without reference to task accuracy. It then audits a range of models on 23 reasoning tasks and reports that none satisfy all four, that task-type separation works but within-task discrimination does not, and that the representations add little beyond the input embeddings. That pattern holds across model families.

What is new is the specific combination of these four axioms plus the empirical claim that the failure is structural. The consistent result across dense, distilled, and RL-trained models is worth noting if the measurements really are independent.

The main limitation is that the abstract supplies no formulas, no extraction code, and no controls showing the metrics stay free of performance signals. Without those, the attribution of failure to the representation itself rather than model capacity remains unverified. The stress-test concern lands directly here.

This is aimed at interpretability researchers who already work with representation probing and want a more structured checklist. A reader who cares about evaluation methodology beyond benchmarks could get something useful from the full definitions and the per-axiom breakdowns.

It deserves peer review so the metric definitions and any ablation checks can be examined in detail.

Referee Report

2 major / 1 minor

Summary. The paper introduces an axiomatic evaluation framework for latent thought representations in LLMs, comprising four axioms (Causality, Minimality, Separability, and Stability) with quantitative measures computed directly on the representations and claimed to be independent of downstream benchmark scores. Auditing open-weight LLMs across 23 reasoning tasks, the authors report that no candidate satisfies all four axioms simultaneously, that representations distinguish task type reliably but cannot distinguish between two questions within the same task, and that representations encode little information beyond the input embedding. The failure pattern is consistent across dense, reasoning-distilled, and RL-trained model families, indicating a structural gap.

Significance. If the metrics are verifiably independent of downstream accuracy and the experimental controls are adequate, the work would offer a useful new lens for diagnosing representational limitations that standard benchmarks obscure. The cross-family consistency strengthens the structural-gap interpretation and could usefully redirect attention from scale to representation design.

major comments (2)

[Abstract and §3 (Axiom Definitions)] The central attribution of failures to the representations (rather than model capacity) rests on the claim that the four quantitative measures are computed directly on the representation and independent of downstream accuracy. Without the explicit extraction procedures, formulas, or controls for Causality, Minimality, Separability, and Stability, it is impossible to confirm this independence; the abstract states the claim but provides no equations or pseudocode.
[§4 (Experimental Results)] The claim that representations 'distinguish task type reliably but cannot distinguish between two questions within the same task' is load-bearing for the separability axiom and the overall conclusion. Specific tables or figures reporting inter-task vs. intra-task separability scores (with statistical tests) are required to substantiate this distinction.

minor comments (1)

[Abstract] The abstract lists example tasks ('Spatial Reasoning, Factual QA') but does not enumerate all 23 tasks or provide a reference; a table or appendix listing them would aid reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful comments, which help clarify the presentation of our axiomatic framework. We address the two major comments point by point below.

read point-by-point responses

Referee: [Abstract and §3 (Axiom Definitions)] The central attribution of failures to the representations (rather than model capacity) rests on the claim that the four quantitative measures are computed directly on the representation and independent of downstream accuracy. Without the explicit extraction procedures, formulas, or controls for Causality, Minimality, Separability, and Stability, it is impossible to confirm this independence; the abstract states the claim but provides no equations.

Authors: Section 3 of the manuscript provides the formal definitions of the four axioms along with the quantitative measures and extraction procedures for each. These measures are designed to be computed solely from the latent representations (e.g., via vector operations and statistical properties) without any dependence on task labels or accuracy metrics. We acknowledge that the abstract does not include equations, as is conventional, but we will revise the manuscript to include a short paragraph in the introduction or a new appendix that explicitly lists the formulas and independence arguments to make this clearer. This will allow readers to verify the independence directly. revision: partial
Referee: [§4 (Experimental Results)] The claim that representations 'distinguish task type reliably but cannot distinguish between two questions within the same task' is load-bearing for the separability axiom and the overall conclusion. Specific tables or figures reporting inter-task vs. intra-task separability scores (with statistical tests) are required to substantiate this distinction.

Authors: We agree that more granular evidence is needed for this claim. While §4 reports overall separability results across tasks, the revision will include a new table (or extended figure) that breaks down inter-task separability scores versus intra-task scores for each model family, accompanied by statistical significance tests (e.g., paired t-tests between inter- and intra-task distributions). This will directly support the distinction and strengthen the separability axiom analysis. revision: yes

Circularity Check

0 steps flagged

Axiom metrics presented as direct computations with no reduction shown

full rationale

The paper defines four axioms (Causality, Minimality, Separability, Stability) and states that quantitative measures for each are computed directly on the representation independently of downstream accuracy. No equations, extraction procedures, or self-citations appear in the provided text that would reduce any measure to a fitted parameter, task performance signal, or prior result by construction. The attribution of failures to the representation itself rests on this direct-computation claim, which is presented without internal circularity. This is the most common honest finding when no load-bearing step reduces to its inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework rests on the four introduced axioms as domain assumptions about what constitutes adequate thought representation; no free parameters or invented entities are described in the abstract.

axioms (2)

domain assumption Metrics for the axioms can be computed directly on the representation independently of downstream accuracy
Central premise stated in the abstract to separate representation quality from model capacity.
domain assumption Failures in the axioms can be attributed to the representation rather than model capacity
Motivating assumption for creating the new evaluation framework.

pith-pipeline@v0.9.1-grok · 5706 in / 1420 out tokens · 32460 ms · 2026-06-30T23:48:39.997656+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

106 extracted references · 48 canonical work pages · 15 internal anchors

[1]

Afzal, F

A. Afzal, F. Matthes, G. Chechik, and Y . Ziser. Knowing before saying: LLM represen- tations encode information about chain-of-thought success before completion. In W. Che, J. Nabende, E. Shutova, and M. T. Pilehvar, editors,Findings of the Association for Computa- tional Linguistics: ACL 2025, pages 12791–12806, Vienna, Austria, July 2025. Association f...

work page doi:10.18653/v1/2025.findings-acl.662 2025
[2]

Alabi, M

J. Alabi, M. Mosbach, M. Eyal, D. Klakow, and M. Geva. The hidden space of transformer language adapters. In L.-W. Ku, A. Martins, and V . Srikumar, editors,Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6588–6607, Bangkok, Thailand, Aug. 2024. Association for Computational Linguistic...

work page doi:10.18653/v1/2024.acl-long.356 2024
[3]

Ameisen, J

E. Ameisen, J. Lindsey, A. Pearce, W. Gurnee, N. L. Turner, B. Chen, C. Citro, D. Abra- hams, S. Carter, B. Hosmer, J. Marcus, M. Sklar, A. Templeton, T. Bricken, C. McDougall, H. Cunningham, T. Henighan, A. Jermyn, A. Jones, A. Persic, Z. Qi, T. Ben Thompson, S. Zimmerman, K. Rivoire, T. Conerly, C. Olah, and J. Batson. Circuit Tracing: Reveal- ing Compu...

2025
[4]

A. E. Assadi, I. Chung, R. Solomatin, N. Muennighoff, and K. Enevoldsen. HUME: Measuring the human-model performance gap in text embedding tasks. InThe Fourteenth International Conference on Learning Representations, 2026. URL https://openreview.net/forum? id=rcmfu1ydAf

2026
[5]

Babakhin, R

Y . Babakhin, R. Osmulski, R. Ak, G. Moreira, M. Xu, B. Schifferer, B. Liu, and E. Oldridge. Llama-embed-nemotron-8b: A universal text embedding model for multilingual and cross- lingual tasks, 2025. URLhttps://arxiv.org/abs/2511.07025

work page arXiv 2025
[6]

Bandarkar, B

L. Bandarkar, B. Muller, P. Yuvraj, R. Hou, N. Singhal, H. Lv, and B. Liu. Layer swapping for zero-shot cross-lingual transfer in large language models. InThe Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum? id=vQhn4wrQ6j

2025
[7]

Barak, B

B. Barak, B. L. Edelman, S. Goel, S. Kakade, E. Malach, and C. Zhang. Hidden Progress in Deep Learning: SGD Learns Parities Near the Computa- tional Limit. InAdvances in Neural Information Processing Systems, volume 35,
[8]

URL https://proceedings.neurips.cc/paper_files/paper/2022/hash/ 884baf65392170763b27c914087bde01-Abstract-Conference.html

2022
[9]

Barber and F

D. Barber and F. Agakov. The IM algorithm: a variational approach to information maximization. InProceedings of the 17th International Conference on Neural Information Processing Systems, NIPS’03, pages 201–208, Cambridge, MA, USA, 2003. MIT Press

2003
[10]

N. Butt, A. Kwiatkowski, I. Labiad, J. Kempe, and Y . Ollivier. Soft Tokens, Hard Truths. In The Fourteenth International Conference on Learning Representations, 2026. URL https: //openreview.net/forum?id=9JjKTp8Jmy

2026
[11]

Z. Cai, X. Zhu, Y . Dong, Y . He, and S. Arora. T2MLR: Transformer with Temporal Middle- Layer Recurrence. InLIT Workshop @ ICLR 2026, 2026. URL https://openreview.net/ forum?id=fQbk1EQWBO

2026
[12]

Emerging Properties in Self-Supervised Vision Transformers

M. Caron, H. Touvron, I. Misra, H. Jégou, J. Mairal, P. Bojanowski, and A. Joulin. Emerg- ing Properties in Self-Supervised Vision Transformers. InProceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 9650–9660, 2021. URL https://arxiv.org/abs/2104.14294

work page internal anchor Pith review Pith/arXiv arXiv 2021
[13]

I. V . M. Cencerrado, A. P. Masdemont, A. G. Hawthorne, D. D. Africa, and L. Pacchiardi. No answer needed: Predicting llm answer accuracy from question-only linear probes, 2026. URL https://arxiv.org/abs/2509.10625. 10

work page arXiv 2026
[14]

X. Chen, A. Zhao, H. Xia, X. Lu, H. Wang, Y . Chen, W. Zhang, J. Wang, W. Li, and X. Shen. Reasoning beyond language: A comprehensive survey on latent chain-of-thought reasoning,
[15]

URLhttps://arxiv.org/abs/2505.16782

work page arXiv
[16]

Chételat, J

D. Chételat, J. Cotnareanu, R. Thompson, Y . Zhang, and M. Coates. InnerThoughts: Dis- entangling Representations and Predictions in Large Language Models. In Y . Li, S. Mandt, S. Agrawal, and E. Khan, editors,Proceedings of The 28th International Conference on Arti- ficial Intelligence and Statistics, volume 258 ofProceedings of Machine Learning Research...

2025
[17]

Conklin, T

H. Conklin, T. Hosking, T. Yi-Chern, J. D. Cohen, S.-J. Leslie, T. L. Griffiths, M. Bartolo, and S. Goldfarb-Tarrant. Learning is Forgetting; LLM Training As Lossy Compression. In The Fourteenth International Conference on Learning Representations, 2026. URL https: //openreview.net/forum?id=tvDlQj0GZB

2026
[18]

Y . Cui, Z. Dai, B. He, Z. Shi, H. Liu, R. Sun, Z. Liu, Y . Xing, J. Tang, and B. Dumoulin. How Do Latent Reasoning Methods Perform Under Weak and Strong Supervision? InLIT Workshop @ ICLR 2026, 2026. URLhttps://arxiv.org/abs/2602.22441

work page arXiv 2026
[19]

DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning

DeepSeek-AI. DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning. Nature, 645(8081):633–638, 2025. doi: 10.1038/s41586-025-09422-z. URL https://www. nature.com/articles/s41586-025-09422-z

work page doi:10.1038/s41586-025-09422-z 2025
[20]

J. Deng, L. Pang, Z. Wei, S. Xu, Z. Duan, K. Xu, Y . Song, H. Shen, and X. Cheng. Latent Reasoning in LLMs as a V ocabulary-Space Superposition, 2025. URLhttps://arxiv.org/ abs/2510.15522

work page arXiv 2025
[21]

Are Latent Reasoning Models Easily Interpretable?

C. Dilgren and S. Wiegreffe. Are Latent Reasoning Models Easily Interpretable? InLIT Workshop @ ICLR 2026, 2026. URLhttps://arxiv.org/abs/2604.04902

work page internal anchor Pith review Pith/arXiv arXiv 2026
[22]

Dragunov, T

N. Dragunov, T. Rahmatullaev, E. Goncharova, A. Kuznetsov, and A. Razzhigaev. SONAR- LLM: Autoregressive Transformer that Thinks in Sentence Embeddings and Speaks in Tokens,
[23]

URLhttps://arxiv.org/abs/2508.05305

work page internal anchor Pith review Pith/arXiv arXiv
[24]

C. Du, K. Fu, B. Wen, Y . Sun, J. Peng, W. Wei, Y . Gao, S. Wang, C. Zhang, J. Li, S. Qiu, L. Chang, and H. He. Human-like object concept representations emerge naturally in mul- timodal large language models.Nature Machine Intelligence, 7(6):860–875, June 2025. ISSN 2522-5839. doi: 10.1038/s42256-025-01049-z. URL http://dx.doi.org/10.1038/ s42256-025-01049-z

work page doi:10.1038/s42256-025-01049-z 2025
[25]

Duquenne, H

P.-A. Duquenne, H. Schwenk, and B. Sagot. SONAR: Sentence-Level Multimodal and Language-Agnostic Representations, 2023. URLhttps://arxiv.org/abs/2308.11466

work page arXiv 2023
[27]

Fadeeva, M

E. Fadeeva, M. Goloburda, A. Rubashevskii, R. Vashurin, A. Shelmanov, P. Nakov, M. Sachan, and M. Panov. Don’t Throw Away Your Beams: Improving Consistency-based Uncertain- ties in LLMs via Beam Search. InThe Fourteenth International Conference on Learning Representations, 2026. URLhttps://openreview.net/forum?id=igcQRiVlgu

2026
[28]

Test of time: A benchmark for evaluating llms on temporal reasoning, 2024

B. Fatemi, M. Kazemi, A. Tsitsulin, K. Malkan, J. Yim, J. Palowitch, S. Seo, J. Halcrow, and B. Perozzi. Test of Time: A benchmark for evaluating LLMs on temporal reasoning.arXiv preprint arXiv:2406.09170, 2024

work page arXiv 2024
[29]

J. Feng, S. Russell, and J. Steinhardt. Monitoring Latent World States in Language Models with Propositional Probes. InThe Thirteenth International Conference on Learning Representations,
[30]

URLhttps://openreview.net/forum?id=0yvZm2AjUr. 11
[31]

S. Feng, G. Fang, X. Ma, and X. Wang. Efficient reasoning models: A survey.Transactions on Machine Learning Research, 2025. ISSN 2835-8856. URL https://openreview.net/ forum?id=sySqlxj8EB

2025
[32]

Godey, É

N. Godey, É. de la Clergerie, and B. Sagot. Anisotropy Is Inherent to Self-Attention in Transformers. InProceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL) (Volume 1: Long Papers), pages 35–48, 2024. URL https://arxiv.org/abs/2401.12143

work page arXiv 2024
[33]

Goyal, Z

S. Goyal, Z. Ji, A. S. Rawat, A. K. Menon, S. Kumar, and V . Nagarajan. Think before you speak: Training language models with pause tokens. InThe Twelfth International Conference on Learning Representations, 2024. URL https://openreview.net/forum?id=ph04CRkPdC

2024
[34]

The Llama 3 Herd of Models

A. Grattafiori, A. Dubey, A. Jauhri, A. Pandey, A. Kadian, A. Al-Dahle, A. Letman, A. Mathur, A. Schelten, A. Vaughan, et al. The Llama 3 Herd of Models, 2024. URL https://arxiv. org/abs/2407.21783

work page internal anchor Pith review Pith/arXiv arXiv 2024
[35]

S. Hao, S. Sukhbaatar, D. Su, X. Li, Z. Hu, J. Weston, and Y . Tian. Training large language models to reason in a continuous latent space.arXiv preprint arXiv:2412.06769, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[36]

J. He, J. Liu, C. Y . Liu, R. Yan, C. Wang, P. Cheng, X. Zhang, F. Zhang, J. Xu, W. Shen, S. Li, L. Zeng, T. Wei, C. Cheng, B. An, Y . Liu, and Y . Zhou. Skywork Open Reasoner 1 Technical Report.arXiv preprint arXiv:2505.22312, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[37]

J. He, J. Liu, C. Y . Liu, R. Yan, C. Wang, P. Cheng, X. Zhang, F. Zhang, J. Xu, W. Shen, S. Li, L. Zeng, T. Wei, C. Cheng, Y . Liu, and Y . Zhou. Sky- work Open Reasoner Series. https://capricious-hydrogen-41c.notion.site/ Skywork-Open-Reaonser-Series-1d0bc9ae823a80459b46c149e4f51680 , 2025. No- tion Blog

2025
[38]

Helff, R

L. Helff, R. Härle, W. Stammer, F. Friedrich, M. Brack, A. Wüst, H. Shindo, P. Schramowski, and K. Kersting. Activationreasoning: Logical reasoning in latent activation spaces. In The Fourteenth International Conference on Learning Representations, 2026. URL https: //openreview.net/forum?id=gGJh5AZTG7

2026
[39]

Herrmann, R

V . Herrmann, R. Csordás, and J. Schmidhuber. Measuring In-Context Computation Complexity via Hidden State Prediction. InForty-second International Conference on Machine Learning,
[40]

URLhttps://openreview.net/forum?id=X21P8etjWL
[41]

understanding

J. Hessel, A. Marasovi´c, J. D. Hwang, L. Lee, J. Da, R. Zellers, R. Mankoff, and Y . Choi. Do androids laugh at electric sheep? Humor “understanding” benchmarks from the New Yorker caption contest.arXiv preprint arXiv:2209.06293, 2022

work page arXiv 2022
[42]

Holtzman, J

A. Holtzman, J. Buys, L. Du, M. Forbes, and Y . Choi. The curious case of neural text degeneration. InInternational Conference on Learning Representations, 2020. URL https: //openreview.net/forum?id=rygGQyrFvH

2020
[43]

Less is More: Recursive Reasoning with Tiny Networks

A. Jolicoeur-Martineau. Less is More: Recursive Reasoning with Tiny Networks, 2025. URL https://arxiv.org/abs/2510.04871

work page internal anchor Pith review Pith/arXiv arXiv 2025
[44]

Kazemi, H

M. Kazemi, H. Alvari, A. Anand, J. Wu, X. Chen, and R. Soricut. GeomVerse: A systematic evaluation of large models for geometric reasoning.arXiv preprint arXiv:2312.12241, 2023

work page arXiv 2023
[45]

Kazemi, Q

M. Kazemi, Q. Yuan, D. Bhatia, N. Kim, X. Xu, V . Imbrasaite, and D. Ramachandran. BoardgameQA: A dataset for natural language reasoning with contradictory information.Ad- vances in Neural Information Processing Systems, 36, 2024

2024
[46]

Kazemi, B

M. Kazemi, B. Fatemi, H. Bansal, J. Palowitch, C. Anastasiou, S. V . Mehta, L. K. Jain, V . Agli- etti, D. Jindal, P. Chen, N. Dikkala, G. Tyen, X. Liu, U. Shalit, S. Chiappa, K. Olszewska, Y . Tay, V . Q. Tran, Q. V . Le, and O. Firat. BIG-bench extra hard. In W. Che, J. Nabende, E. Shutova, and M. T. Pilehvar, editors,Proceedings of the 63rd Annual Meet...
[47]

Fang, J., Jiang, H., Wang, K., Ma, Y ., Shi, J., Wang, X., He, X., and Chua, T

Association for Computational Linguistics. ISBN 979-8-89176-251-0. doi: 10.18653/v1/ 2025.acl-long.1285. URLhttps://aclanthology.org/2025.acl-long.1285/. 12

work page doi:10.18653/v1/ 2025
[48]

Kıcıman, R

E. Kıcıman, R. Ness, A. Sharma, and C. Tan. Causal reasoning and large language models: Opening a new frontier for causality.arXiv preprint arXiv:2305.00050, 2023

work page arXiv 2023
[49]

Koishekenov, A

Y . Koishekenov, A. Lipani, and N. Cancedda. Encode, Think, Decode: Scaling test-time reasoning with recursive latent thoughts, 2025. URL https://arxiv.org/abs/2510.07358

work page arXiv 2025
[50]

L. Kuhn, Y . Gal, and S. Farquhar. Semantic uncertainty: Linguistic invariances for uncertainty estimation in natural language generation. InThe Eleventh International Conference on Learning Representations, 2023. URLhttps://openreview.net/forum?id=VD-AYtP0dve

2023
[51]

Y . Li, J. Chen, F. Wu, J. Yu, H. Qi, W. Xuan, H. Zhao, P. Nie, D. Jin, and X. Tang. Learning Multi-step Reasoning via Persistent Latent State Propagation. InLIT Workshop @ ICLR 2026,

2026
[52]

URLhttps://openreview.net/forum?id=Dcv4B1UCuW
[53]

Z. Li, X. Bai, K. Chen, Y . Li, J. Yang, C. Lin, and M. Zhang. Dynamics Within Latent Chain- of-Thought: An Empirical Study of Causal Structure. InLIT Workshop @ ICLR 2026, 2026. URLhttps://arxiv.org/abs/2602.08783

work page internal anchor Pith review Pith/arXiv arXiv 2026
[54]

Litwin-Kumar, K

A. Litwin-Kumar, K. D. Harris, R. Axel, H. Sompolinsky, and L. F. Abbott. Optimal Degrees of Synaptic Connectivity.Neuron, 93(5):1153–1164.e7, 2017. doi: 10.1016/j.neuron.2017.01.030

work page doi:10.1016/j.neuron.2017.01.030 2017
[55]

LLMs Encode Their Failures: Predicting Success from Pre-Generation Activations

W. Lugoloobi, T. Foster, W. Bankes, and C. Russell. LLMs Encode Their Failures: Predicting Success from Pre-Generation Activations. InLIT Workshop @ ICLR 2026, 2026. URL https://arxiv.org/abs/2602.09924

work page internal anchor Pith review Pith/arXiv arXiv 2026
[56]

F. V . Massoli, A. Kuzmin, and A. Behboodi. Reasoning as Compression: Unifying Budget Forcing via the Conditional Information Bottleneck. InThe 1st Workshop on Scaling Post- training for LLMs, 2026. URLhttps://openreview.net/forum?id=98sbP0T8ck

2026
[57]

Mondorf and B

P. Mondorf and B. Plank. Beyond accuracy: Evaluating the reasoning behavior of large language models – a survey. InFirst Conference on Language Modeling (COLM), 2024. URL https://openreview.net/forum?id=Lmjgl2n11u

2024
[58]

MTEB: Massive Text Embedding Benchmark

N. Muennighoff, N. Tazi, L. Magne, and N. Reimers. MTEB: Massive Text Embedding Benchmark, 2023. URLhttps://arxiv.org/abs/2210.07316

work page internal anchor Pith review Pith/arXiv arXiv 2023
[59]

A. Nie, Y . Zhang, A. S. Amdekar, C. Piech, T. B. Hashimoto, and T. Gerstenberg. MoCa: Measuring human-language model alignment on causal and moral judgment tasks.Advances in Neural Information Processing Systems, 36, 2024

2024
[60]

gpt-oss-120b & gpt-oss-20b Model Card

OpenAI. gpt-oss-120b & gpt-oss-20b Model Card, 2025. URL https://arxiv.org/abs/ 2508.10925

work page internal anchor Pith review Pith/arXiv arXiv 2025
[61]

K. Park, Y . J. Choe, and V . Veitch. The Linear Representation Hypothesis and the Geometry of Large Language Models. InProceedings of the 41st International Conference on Machine Learning, volume 235 ofProceedings of Machine Learning Research, pages 39643–39666. PMLR, 2024. URLhttps://proceedings.mlr.press/v235/park24c.html

2024
[62]

Enforcing Logical Invariance in Large Language Models via Symmetry Pair Training

Prasanth. Enforcing Logical Invariance in Large Language Models via Symmetry Pair Training. InICLR 2026 Workshop on Logical Reasoning of Large Language Models, 2026. URL https://openreview.net/forum?id=aZFS8rc6Bf

2026
[63]

Recanatesi, M

S. Recanatesi, M. Farrell, M. Advani, T. Moore, G. Lajoie, and E. Shea-Brown. Dimensionality compression and expansion in Deep Neural Networks, 2019. URL https://arxiv.org/abs/ 1906.00443

work page arXiv 2019
[64]

Rizvi-Martel and M

M. Rizvi-Martel and M. Mosbach. The Illusion of Superposition in Latent CoT via Soft Thinking. InLIT Workshop @ ICLR 2026, 2026. URL https://openreview.net/forum? id=FvPx9Nzvnw

2026
[65]

Sahoo, A

S. Sahoo, A. Chadha, V . Jain, and D. Chaudhary. When Shallow Wins: Silent Failures and the Depth-Accuracy Paradox in Latent Reasoning. InLIT Workshop @ ICLR 2026, 2026. URL https://arxiv.org/abs/2603.03475. 13

work page arXiv 2026
[66]

Salhan, E

S. Salhan, E. Zhou, and P. Buttery. Do Monolingual Language Models Learn Cross-Lingual Universal Conceptual Representations? InICLR 2026 Workshop on Unifying Concept Repre- sentation Learning, 2026. URLhttps://openreview.net/forum?id=frKa6ujOyE

2026
[67]

Sánchez, B

E. Sánchez, B. Alastruey, C. Ropers, P. Stenetorp, M. Artetxe, and M. R. Costa-jussà. Linguini: A benchmark for language-agnostic linguistic reasoning.arXiv preprint arXiv:2409.12126, 2024

work page arXiv 2024
[68]

K. Shah, N. Dikkala, X. Wang, and R. Panigrahy. Causal language modeling can elicit search and reasoning capabilities on logic puzzles.arXiv preprint arXiv:2409.10502, 2024

work page arXiv 2024
[69]

Shani, L

C. Shani, L. Soffer, D. Jurafsky, Y . LeCun, and R. Shwartz-Ziv. From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning, 2025. URL https://arxiv.org/abs/ 2505.17117

work page arXiv 2025
[70]

Z. Shen, H. Yan, L. Zhang, Z. Hu, Y . Du, and Y . He. CODI: Compressing chain-of-thought into continuous space via self-distillation. In C. Christodoulopoulos, T. Chakraborty, C. Rose, and V . Peng, editors,Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 677–693, Suzhou, China, Nov. 2025. Association for Compu...

work page doi:10.18653/v1/2025.emnlp-main.36 2025
[71]

Sheshanarayana, R

D. Sheshanarayana, R. S. Pal, M. Sinha, and T. Dasgupta. Thinking in Latents: Adaptive Anchor Refinement for Implicit Reasoning in LLMs. InLIT Workshop @ ICLR 2026, 2026. URLhttps://arxiv.org/abs/2603.15051

work page arXiv 2026
[72]

Skean, M

O. Skean, M. R. Arefin, D. Zhao, N. N. Patel, J. Naghiyev, Y . LeCun, and R. Shwartz-Ziv. Layer by layer: Uncovering hidden representations in language models. InForty-second International Conference on Machine Learning, 2025. URL https://openreview.net/ forum?id=WGXb7UdvTX

2025
[73]

Sui, Y .-N

Y . Sui, Y .-N. Chuang, G. Wang, J. Zhang, T. Zhang, J. Yuan, H. Liu, A. Wen, S. Zhong, N. Zou, H. Chen, and X. Hu. Stop overthinking: A survey on efficient reasoning for large language models.Transactions on Machine Learning Research, 2025. ISSN 2835-8856. URL https://openreview.net/forum?id=HvoG8SxggZ

2025
[74]

Q. Sun, M. Pickett, A. K. Nain, and L. Jones. Transformer Layers as Painters. InProceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 25219–25227, 2025. doi: 10.1609/aaai.v39i24.34708. URL https://ojs.aaai.org/index.php/AAAI/article/ view/34708

work page doi:10.1609/aaai.v39i24.34708 2025
[75]

L. team, L. Barrault, P.-A. Duquenne, M. Elbayad, A. Kozhevnikov, B. Alastruey, P. Andrews, M. Coria, G. Couairon, M. R. Costa-jussà, D. Dale, H. Elsahar, K. Heffernan, J. M. Janeiro, T. Tran, C. Ropers, E. Sánchez, R. S. Roman, A. Mourachko, S. Saleem, and H. Schwenk. Large Concept Models: Language Modeling in a Sentence Representation Space, 2024. URL h...

work page arXiv 2024
[76]

G. Tyen, H. Mansoor, P. Chen, T. Mak, and V . C˘arbune. LLMs cannot find reasoning errors, but can correct them!arXiv preprint arXiv:2311.08516, 2023

work page arXiv 2023
[77]

Wang and F

W. Wang and F. Reid. Tiny Recursive Reasoning with Mamba-2 Attention Hybrid. InLIT Workshop @ ICLR 2026, 2026. URLhttps://arxiv.org/abs/2602.12078

work page arXiv 2026
[78]

Wendler, V

C. Wendler, V . Veselovsky, G. Monea, and R. West. Do llamas work in english? on the latent language of multilingual transformers. In L.-W. Ku, A. Martins, and V . Srikumar, ed- itors,Proceedings of the 62nd Annual Meeting of the Association for Computational Lin- guistics (Volume 1: Long Papers), pages 15366–15394, Bangkok, Thailand, Aug. 2024. Associati...

work page doi:10.18653/v1/2024.acl-long.820 2024
[79]

LiveBench: A Challenging, Contamination-Limited LLM Benchmark

C. White, S. Dooley, M. Roberts, A. Pal, B. Feuer, S. Jain, R. Shwartz-Ziv, N. Jain, K. Saifullah, S. Naidu, et al. LiveBench: A challenging, contamination-free LLM benchmark.arXiv preprint arXiv:2406.19314, 2024. 14

work page internal anchor Pith review Pith/arXiv arXiv 2024
[80]

J. Wu, J. Lu, Z. Ren, G. Hu, Z. Wu, D. Dai, and H. Wu. LLMs are Single-threaded Reasoners: Demystifying the Working Mechanism of Soft Thinking. InThe Fourteenth International Conference on Learning Representations, 2026. URL https://openreview.net/forum? id=ASLuOoP78o

2026
[81]

Z. Wu, Y . Xiong, S. X. Yu, and D. Lin. Unsupervised Feature Learning via Non-Parametric Instance Discrimination. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3733–3742, 2018

2018

Showing first 80 references.

[1] [1]

Afzal, F

A. Afzal, F. Matthes, G. Chechik, and Y . Ziser. Knowing before saying: LLM represen- tations encode information about chain-of-thought success before completion. In W. Che, J. Nabende, E. Shutova, and M. T. Pilehvar, editors,Findings of the Association for Computa- tional Linguistics: ACL 2025, pages 12791–12806, Vienna, Austria, July 2025. Association f...

work page doi:10.18653/v1/2025.findings-acl.662 2025

[2] [2]

Alabi, M

J. Alabi, M. Mosbach, M. Eyal, D. Klakow, and M. Geva. The hidden space of transformer language adapters. In L.-W. Ku, A. Martins, and V . Srikumar, editors,Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6588–6607, Bangkok, Thailand, Aug. 2024. Association for Computational Linguistic...

work page doi:10.18653/v1/2024.acl-long.356 2024

[3] [3]

Ameisen, J

E. Ameisen, J. Lindsey, A. Pearce, W. Gurnee, N. L. Turner, B. Chen, C. Citro, D. Abra- hams, S. Carter, B. Hosmer, J. Marcus, M. Sklar, A. Templeton, T. Bricken, C. McDougall, H. Cunningham, T. Henighan, A. Jermyn, A. Jones, A. Persic, Z. Qi, T. Ben Thompson, S. Zimmerman, K. Rivoire, T. Conerly, C. Olah, and J. Batson. Circuit Tracing: Reveal- ing Compu...

2025

[4] [4]

A. E. Assadi, I. Chung, R. Solomatin, N. Muennighoff, and K. Enevoldsen. HUME: Measuring the human-model performance gap in text embedding tasks. InThe Fourteenth International Conference on Learning Representations, 2026. URL https://openreview.net/forum? id=rcmfu1ydAf

2026

[5] [5]

Babakhin, R

Y . Babakhin, R. Osmulski, R. Ak, G. Moreira, M. Xu, B. Schifferer, B. Liu, and E. Oldridge. Llama-embed-nemotron-8b: A universal text embedding model for multilingual and cross- lingual tasks, 2025. URLhttps://arxiv.org/abs/2511.07025

work page arXiv 2025

[6] [6]

Bandarkar, B

L. Bandarkar, B. Muller, P. Yuvraj, R. Hou, N. Singhal, H. Lv, and B. Liu. Layer swapping for zero-shot cross-lingual transfer in large language models. InThe Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum? id=vQhn4wrQ6j

2025

[7] [7]

Barak, B

B. Barak, B. L. Edelman, S. Goel, S. Kakade, E. Malach, and C. Zhang. Hidden Progress in Deep Learning: SGD Learns Parities Near the Computa- tional Limit. InAdvances in Neural Information Processing Systems, volume 35,

[8] [8]

URL https://proceedings.neurips.cc/paper_files/paper/2022/hash/ 884baf65392170763b27c914087bde01-Abstract-Conference.html

2022

[9] [9]

Barber and F

D. Barber and F. Agakov. The IM algorithm: a variational approach to information maximization. InProceedings of the 17th International Conference on Neural Information Processing Systems, NIPS’03, pages 201–208, Cambridge, MA, USA, 2003. MIT Press

2003

[10] [10]

N. Butt, A. Kwiatkowski, I. Labiad, J. Kempe, and Y . Ollivier. Soft Tokens, Hard Truths. In The Fourteenth International Conference on Learning Representations, 2026. URL https: //openreview.net/forum?id=9JjKTp8Jmy

2026

[11] [11]

Z. Cai, X. Zhu, Y . Dong, Y . He, and S. Arora. T2MLR: Transformer with Temporal Middle- Layer Recurrence. InLIT Workshop @ ICLR 2026, 2026. URL https://openreview.net/ forum?id=fQbk1EQWBO

2026

[12] [12]

Emerging Properties in Self-Supervised Vision Transformers

M. Caron, H. Touvron, I. Misra, H. Jégou, J. Mairal, P. Bojanowski, and A. Joulin. Emerg- ing Properties in Self-Supervised Vision Transformers. InProceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 9650–9660, 2021. URL https://arxiv.org/abs/2104.14294

work page internal anchor Pith review Pith/arXiv arXiv 2021

[13] [13]

I. V . M. Cencerrado, A. P. Masdemont, A. G. Hawthorne, D. D. Africa, and L. Pacchiardi. No answer needed: Predicting llm answer accuracy from question-only linear probes, 2026. URL https://arxiv.org/abs/2509.10625. 10

work page arXiv 2026

[14] [14]

X. Chen, A. Zhao, H. Xia, X. Lu, H. Wang, Y . Chen, W. Zhang, J. Wang, W. Li, and X. Shen. Reasoning beyond language: A comprehensive survey on latent chain-of-thought reasoning,

[15] [15]

URLhttps://arxiv.org/abs/2505.16782

work page arXiv

[16] [16]

Chételat, J

D. Chételat, J. Cotnareanu, R. Thompson, Y . Zhang, and M. Coates. InnerThoughts: Dis- entangling Representations and Predictions in Large Language Models. In Y . Li, S. Mandt, S. Agrawal, and E. Khan, editors,Proceedings of The 28th International Conference on Arti- ficial Intelligence and Statistics, volume 258 ofProceedings of Machine Learning Research...

2025

[17] [17]

Conklin, T

H. Conklin, T. Hosking, T. Yi-Chern, J. D. Cohen, S.-J. Leslie, T. L. Griffiths, M. Bartolo, and S. Goldfarb-Tarrant. Learning is Forgetting; LLM Training As Lossy Compression. In The Fourteenth International Conference on Learning Representations, 2026. URL https: //openreview.net/forum?id=tvDlQj0GZB

2026

[18] [18]

Y . Cui, Z. Dai, B. He, Z. Shi, H. Liu, R. Sun, Z. Liu, Y . Xing, J. Tang, and B. Dumoulin. How Do Latent Reasoning Methods Perform Under Weak and Strong Supervision? InLIT Workshop @ ICLR 2026, 2026. URLhttps://arxiv.org/abs/2602.22441

work page arXiv 2026

[19] [19]

DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning

DeepSeek-AI. DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning. Nature, 645(8081):633–638, 2025. doi: 10.1038/s41586-025-09422-z. URL https://www. nature.com/articles/s41586-025-09422-z

work page doi:10.1038/s41586-025-09422-z 2025

[20] [20]

J. Deng, L. Pang, Z. Wei, S. Xu, Z. Duan, K. Xu, Y . Song, H. Shen, and X. Cheng. Latent Reasoning in LLMs as a V ocabulary-Space Superposition, 2025. URLhttps://arxiv.org/ abs/2510.15522

work page arXiv 2025

[21] [21]

Are Latent Reasoning Models Easily Interpretable?

C. Dilgren and S. Wiegreffe. Are Latent Reasoning Models Easily Interpretable? InLIT Workshop @ ICLR 2026, 2026. URLhttps://arxiv.org/abs/2604.04902

work page internal anchor Pith review Pith/arXiv arXiv 2026

[22] [22]

Dragunov, T

N. Dragunov, T. Rahmatullaev, E. Goncharova, A. Kuznetsov, and A. Razzhigaev. SONAR- LLM: Autoregressive Transformer that Thinks in Sentence Embeddings and Speaks in Tokens,

[23] [23]

URLhttps://arxiv.org/abs/2508.05305

work page internal anchor Pith review Pith/arXiv arXiv

[24] [24]

C. Du, K. Fu, B. Wen, Y . Sun, J. Peng, W. Wei, Y . Gao, S. Wang, C. Zhang, J. Li, S. Qiu, L. Chang, and H. He. Human-like object concept representations emerge naturally in mul- timodal large language models.Nature Machine Intelligence, 7(6):860–875, June 2025. ISSN 2522-5839. doi: 10.1038/s42256-025-01049-z. URL http://dx.doi.org/10.1038/ s42256-025-01049-z

work page doi:10.1038/s42256-025-01049-z 2025

[25] [25]

Duquenne, H

P.-A. Duquenne, H. Schwenk, and B. Sagot. SONAR: Sentence-Level Multimodal and Language-Agnostic Representations, 2023. URLhttps://arxiv.org/abs/2308.11466

work page arXiv 2023

[26] [27]

Fadeeva, M

E. Fadeeva, M. Goloburda, A. Rubashevskii, R. Vashurin, A. Shelmanov, P. Nakov, M. Sachan, and M. Panov. Don’t Throw Away Your Beams: Improving Consistency-based Uncertain- ties in LLMs via Beam Search. InThe Fourteenth International Conference on Learning Representations, 2026. URLhttps://openreview.net/forum?id=igcQRiVlgu

2026

[27] [28]

Test of time: A benchmark for evaluating llms on temporal reasoning, 2024

B. Fatemi, M. Kazemi, A. Tsitsulin, K. Malkan, J. Yim, J. Palowitch, S. Seo, J. Halcrow, and B. Perozzi. Test of Time: A benchmark for evaluating LLMs on temporal reasoning.arXiv preprint arXiv:2406.09170, 2024

work page arXiv 2024

[28] [29]

J. Feng, S. Russell, and J. Steinhardt. Monitoring Latent World States in Language Models with Propositional Probes. InThe Thirteenth International Conference on Learning Representations,

[29] [30]

URLhttps://openreview.net/forum?id=0yvZm2AjUr. 11

[30] [31]

S. Feng, G. Fang, X. Ma, and X. Wang. Efficient reasoning models: A survey.Transactions on Machine Learning Research, 2025. ISSN 2835-8856. URL https://openreview.net/ forum?id=sySqlxj8EB

2025

[31] [32]

Godey, É

N. Godey, É. de la Clergerie, and B. Sagot. Anisotropy Is Inherent to Self-Attention in Transformers. InProceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL) (Volume 1: Long Papers), pages 35–48, 2024. URL https://arxiv.org/abs/2401.12143

work page arXiv 2024

[32] [33]

Goyal, Z

S. Goyal, Z. Ji, A. S. Rawat, A. K. Menon, S. Kumar, and V . Nagarajan. Think before you speak: Training language models with pause tokens. InThe Twelfth International Conference on Learning Representations, 2024. URL https://openreview.net/forum?id=ph04CRkPdC

2024

[33] [34]

The Llama 3 Herd of Models

A. Grattafiori, A. Dubey, A. Jauhri, A. Pandey, A. Kadian, A. Al-Dahle, A. Letman, A. Mathur, A. Schelten, A. Vaughan, et al. The Llama 3 Herd of Models, 2024. URL https://arxiv. org/abs/2407.21783

work page internal anchor Pith review Pith/arXiv arXiv 2024

[34] [35]

S. Hao, S. Sukhbaatar, D. Su, X. Li, Z. Hu, J. Weston, and Y . Tian. Training large language models to reason in a continuous latent space.arXiv preprint arXiv:2412.06769, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024

[35] [36]

J. He, J. Liu, C. Y . Liu, R. Yan, C. Wang, P. Cheng, X. Zhang, F. Zhang, J. Xu, W. Shen, S. Li, L. Zeng, T. Wei, C. Cheng, B. An, Y . Liu, and Y . Zhou. Skywork Open Reasoner 1 Technical Report.arXiv preprint arXiv:2505.22312, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025

[36] [37]

J. He, J. Liu, C. Y . Liu, R. Yan, C. Wang, P. Cheng, X. Zhang, F. Zhang, J. Xu, W. Shen, S. Li, L. Zeng, T. Wei, C. Cheng, Y . Liu, and Y . Zhou. Sky- work Open Reasoner Series. https://capricious-hydrogen-41c.notion.site/ Skywork-Open-Reaonser-Series-1d0bc9ae823a80459b46c149e4f51680 , 2025. No- tion Blog

2025

[37] [38]

Helff, R

L. Helff, R. Härle, W. Stammer, F. Friedrich, M. Brack, A. Wüst, H. Shindo, P. Schramowski, and K. Kersting. Activationreasoning: Logical reasoning in latent activation spaces. In The Fourteenth International Conference on Learning Representations, 2026. URL https: //openreview.net/forum?id=gGJh5AZTG7

2026

[38] [39]

Herrmann, R

V . Herrmann, R. Csordás, and J. Schmidhuber. Measuring In-Context Computation Complexity via Hidden State Prediction. InForty-second International Conference on Machine Learning,

[39] [40]

URLhttps://openreview.net/forum?id=X21P8etjWL

[40] [41]

understanding

J. Hessel, A. Marasovi´c, J. D. Hwang, L. Lee, J. Da, R. Zellers, R. Mankoff, and Y . Choi. Do androids laugh at electric sheep? Humor “understanding” benchmarks from the New Yorker caption contest.arXiv preprint arXiv:2209.06293, 2022

work page arXiv 2022

[41] [42]

Holtzman, J

A. Holtzman, J. Buys, L. Du, M. Forbes, and Y . Choi. The curious case of neural text degeneration. InInternational Conference on Learning Representations, 2020. URL https: //openreview.net/forum?id=rygGQyrFvH

2020

[42] [43]

Less is More: Recursive Reasoning with Tiny Networks

A. Jolicoeur-Martineau. Less is More: Recursive Reasoning with Tiny Networks, 2025. URL https://arxiv.org/abs/2510.04871

work page internal anchor Pith review Pith/arXiv arXiv 2025

[43] [44]

Kazemi, H

M. Kazemi, H. Alvari, A. Anand, J. Wu, X. Chen, and R. Soricut. GeomVerse: A systematic evaluation of large models for geometric reasoning.arXiv preprint arXiv:2312.12241, 2023

work page arXiv 2023

[44] [45]

Kazemi, Q

M. Kazemi, Q. Yuan, D. Bhatia, N. Kim, X. Xu, V . Imbrasaite, and D. Ramachandran. BoardgameQA: A dataset for natural language reasoning with contradictory information.Ad- vances in Neural Information Processing Systems, 36, 2024

2024

[45] [46]

Kazemi, B

M. Kazemi, B. Fatemi, H. Bansal, J. Palowitch, C. Anastasiou, S. V . Mehta, L. K. Jain, V . Agli- etti, D. Jindal, P. Chen, N. Dikkala, G. Tyen, X. Liu, U. Shalit, S. Chiappa, K. Olszewska, Y . Tay, V . Q. Tran, Q. V . Le, and O. Firat. BIG-bench extra hard. In W. Che, J. Nabende, E. Shutova, and M. T. Pilehvar, editors,Proceedings of the 63rd Annual Meet...

[46] [47]

Fang, J., Jiang, H., Wang, K., Ma, Y ., Shi, J., Wang, X., He, X., and Chua, T

Association for Computational Linguistics. ISBN 979-8-89176-251-0. doi: 10.18653/v1/ 2025.acl-long.1285. URLhttps://aclanthology.org/2025.acl-long.1285/. 12

work page doi:10.18653/v1/ 2025

[47] [48]

Kıcıman, R

E. Kıcıman, R. Ness, A. Sharma, and C. Tan. Causal reasoning and large language models: Opening a new frontier for causality.arXiv preprint arXiv:2305.00050, 2023

work page arXiv 2023

[48] [49]

Koishekenov, A

Y . Koishekenov, A. Lipani, and N. Cancedda. Encode, Think, Decode: Scaling test-time reasoning with recursive latent thoughts, 2025. URL https://arxiv.org/abs/2510.07358

work page arXiv 2025

[49] [50]

L. Kuhn, Y . Gal, and S. Farquhar. Semantic uncertainty: Linguistic invariances for uncertainty estimation in natural language generation. InThe Eleventh International Conference on Learning Representations, 2023. URLhttps://openreview.net/forum?id=VD-AYtP0dve

2023

[50] [51]

Y . Li, J. Chen, F. Wu, J. Yu, H. Qi, W. Xuan, H. Zhao, P. Nie, D. Jin, and X. Tang. Learning Multi-step Reasoning via Persistent Latent State Propagation. InLIT Workshop @ ICLR 2026,

2026

[51] [52]

URLhttps://openreview.net/forum?id=Dcv4B1UCuW

[52] [53]

Z. Li, X. Bai, K. Chen, Y . Li, J. Yang, C. Lin, and M. Zhang. Dynamics Within Latent Chain- of-Thought: An Empirical Study of Causal Structure. InLIT Workshop @ ICLR 2026, 2026. URLhttps://arxiv.org/abs/2602.08783

work page internal anchor Pith review Pith/arXiv arXiv 2026

[53] [54]

Litwin-Kumar, K

A. Litwin-Kumar, K. D. Harris, R. Axel, H. Sompolinsky, and L. F. Abbott. Optimal Degrees of Synaptic Connectivity.Neuron, 93(5):1153–1164.e7, 2017. doi: 10.1016/j.neuron.2017.01.030

work page doi:10.1016/j.neuron.2017.01.030 2017

[54] [55]

LLMs Encode Their Failures: Predicting Success from Pre-Generation Activations

W. Lugoloobi, T. Foster, W. Bankes, and C. Russell. LLMs Encode Their Failures: Predicting Success from Pre-Generation Activations. InLIT Workshop @ ICLR 2026, 2026. URL https://arxiv.org/abs/2602.09924

work page internal anchor Pith review Pith/arXiv arXiv 2026

[55] [56]

F. V . Massoli, A. Kuzmin, and A. Behboodi. Reasoning as Compression: Unifying Budget Forcing via the Conditional Information Bottleneck. InThe 1st Workshop on Scaling Post- training for LLMs, 2026. URLhttps://openreview.net/forum?id=98sbP0T8ck

2026

[56] [57]

Mondorf and B

P. Mondorf and B. Plank. Beyond accuracy: Evaluating the reasoning behavior of large language models – a survey. InFirst Conference on Language Modeling (COLM), 2024. URL https://openreview.net/forum?id=Lmjgl2n11u

2024

[57] [58]

MTEB: Massive Text Embedding Benchmark

N. Muennighoff, N. Tazi, L. Magne, and N. Reimers. MTEB: Massive Text Embedding Benchmark, 2023. URLhttps://arxiv.org/abs/2210.07316

work page internal anchor Pith review Pith/arXiv arXiv 2023

[58] [59]

A. Nie, Y . Zhang, A. S. Amdekar, C. Piech, T. B. Hashimoto, and T. Gerstenberg. MoCa: Measuring human-language model alignment on causal and moral judgment tasks.Advances in Neural Information Processing Systems, 36, 2024

2024

[59] [60]

gpt-oss-120b & gpt-oss-20b Model Card

OpenAI. gpt-oss-120b & gpt-oss-20b Model Card, 2025. URL https://arxiv.org/abs/ 2508.10925

work page internal anchor Pith review Pith/arXiv arXiv 2025

[60] [61]

K. Park, Y . J. Choe, and V . Veitch. The Linear Representation Hypothesis and the Geometry of Large Language Models. InProceedings of the 41st International Conference on Machine Learning, volume 235 ofProceedings of Machine Learning Research, pages 39643–39666. PMLR, 2024. URLhttps://proceedings.mlr.press/v235/park24c.html

2024

[61] [62]

Enforcing Logical Invariance in Large Language Models via Symmetry Pair Training

Prasanth. Enforcing Logical Invariance in Large Language Models via Symmetry Pair Training. InICLR 2026 Workshop on Logical Reasoning of Large Language Models, 2026. URL https://openreview.net/forum?id=aZFS8rc6Bf

2026

[62] [63]

Recanatesi, M

S. Recanatesi, M. Farrell, M. Advani, T. Moore, G. Lajoie, and E. Shea-Brown. Dimensionality compression and expansion in Deep Neural Networks, 2019. URL https://arxiv.org/abs/ 1906.00443

work page arXiv 2019

[63] [64]

Rizvi-Martel and M

M. Rizvi-Martel and M. Mosbach. The Illusion of Superposition in Latent CoT via Soft Thinking. InLIT Workshop @ ICLR 2026, 2026. URL https://openreview.net/forum? id=FvPx9Nzvnw

2026

[64] [65]

Sahoo, A

S. Sahoo, A. Chadha, V . Jain, and D. Chaudhary. When Shallow Wins: Silent Failures and the Depth-Accuracy Paradox in Latent Reasoning. InLIT Workshop @ ICLR 2026, 2026. URL https://arxiv.org/abs/2603.03475. 13

work page arXiv 2026

[65] [66]

Salhan, E

S. Salhan, E. Zhou, and P. Buttery. Do Monolingual Language Models Learn Cross-Lingual Universal Conceptual Representations? InICLR 2026 Workshop on Unifying Concept Repre- sentation Learning, 2026. URLhttps://openreview.net/forum?id=frKa6ujOyE

2026

[66] [67]

Sánchez, B

E. Sánchez, B. Alastruey, C. Ropers, P. Stenetorp, M. Artetxe, and M. R. Costa-jussà. Linguini: A benchmark for language-agnostic linguistic reasoning.arXiv preprint arXiv:2409.12126, 2024

work page arXiv 2024

[67] [68]

K. Shah, N. Dikkala, X. Wang, and R. Panigrahy. Causal language modeling can elicit search and reasoning capabilities on logic puzzles.arXiv preprint arXiv:2409.10502, 2024

work page arXiv 2024

[68] [69]

Shani, L

C. Shani, L. Soffer, D. Jurafsky, Y . LeCun, and R. Shwartz-Ziv. From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning, 2025. URL https://arxiv.org/abs/ 2505.17117

work page arXiv 2025

[69] [70]

Z. Shen, H. Yan, L. Zhang, Z. Hu, Y . Du, and Y . He. CODI: Compressing chain-of-thought into continuous space via self-distillation. In C. Christodoulopoulos, T. Chakraborty, C. Rose, and V . Peng, editors,Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 677–693, Suzhou, China, Nov. 2025. Association for Compu...

work page doi:10.18653/v1/2025.emnlp-main.36 2025

[70] [71]

Sheshanarayana, R

D. Sheshanarayana, R. S. Pal, M. Sinha, and T. Dasgupta. Thinking in Latents: Adaptive Anchor Refinement for Implicit Reasoning in LLMs. InLIT Workshop @ ICLR 2026, 2026. URLhttps://arxiv.org/abs/2603.15051

work page arXiv 2026

[71] [72]

Skean, M

O. Skean, M. R. Arefin, D. Zhao, N. N. Patel, J. Naghiyev, Y . LeCun, and R. Shwartz-Ziv. Layer by layer: Uncovering hidden representations in language models. InForty-second International Conference on Machine Learning, 2025. URL https://openreview.net/ forum?id=WGXb7UdvTX

2025

[72] [73]

Sui, Y .-N

Y . Sui, Y .-N. Chuang, G. Wang, J. Zhang, T. Zhang, J. Yuan, H. Liu, A. Wen, S. Zhong, N. Zou, H. Chen, and X. Hu. Stop overthinking: A survey on efficient reasoning for large language models.Transactions on Machine Learning Research, 2025. ISSN 2835-8856. URL https://openreview.net/forum?id=HvoG8SxggZ

2025

[73] [74]

Q. Sun, M. Pickett, A. K. Nain, and L. Jones. Transformer Layers as Painters. InProceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 25219–25227, 2025. doi: 10.1609/aaai.v39i24.34708. URL https://ojs.aaai.org/index.php/AAAI/article/ view/34708

work page doi:10.1609/aaai.v39i24.34708 2025

[74] [75]

L. team, L. Barrault, P.-A. Duquenne, M. Elbayad, A. Kozhevnikov, B. Alastruey, P. Andrews, M. Coria, G. Couairon, M. R. Costa-jussà, D. Dale, H. Elsahar, K. Heffernan, J. M. Janeiro, T. Tran, C. Ropers, E. Sánchez, R. S. Roman, A. Mourachko, S. Saleem, and H. Schwenk. Large Concept Models: Language Modeling in a Sentence Representation Space, 2024. URL h...

work page arXiv 2024

[75] [76]

G. Tyen, H. Mansoor, P. Chen, T. Mak, and V . C˘arbune. LLMs cannot find reasoning errors, but can correct them!arXiv preprint arXiv:2311.08516, 2023

work page arXiv 2023

[76] [77]

Wang and F

W. Wang and F. Reid. Tiny Recursive Reasoning with Mamba-2 Attention Hybrid. InLIT Workshop @ ICLR 2026, 2026. URLhttps://arxiv.org/abs/2602.12078

work page arXiv 2026

[77] [78]

Wendler, V

C. Wendler, V . Veselovsky, G. Monea, and R. West. Do llamas work in english? on the latent language of multilingual transformers. In L.-W. Ku, A. Martins, and V . Srikumar, ed- itors,Proceedings of the 62nd Annual Meeting of the Association for Computational Lin- guistics (Volume 1: Long Papers), pages 15366–15394, Bangkok, Thailand, Aug. 2024. Associati...

work page doi:10.18653/v1/2024.acl-long.820 2024

[78] [79]

LiveBench: A Challenging, Contamination-Limited LLM Benchmark

C. White, S. Dooley, M. Roberts, A. Pal, B. Feuer, S. Jain, R. Shwartz-Ziv, N. Jain, K. Saifullah, S. Naidu, et al. LiveBench: A challenging, contamination-free LLM benchmark.arXiv preprint arXiv:2406.19314, 2024. 14

work page internal anchor Pith review Pith/arXiv arXiv 2024

[79] [80]

J. Wu, J. Lu, Z. Ren, G. Hu, Z. Wu, D. Dai, and H. Wu. LLMs are Single-threaded Reasoners: Demystifying the Working Mechanism of Soft Thinking. InThe Fourteenth International Conference on Learning Representations, 2026. URL https://openreview.net/forum? id=ASLuOoP78o

2026

[80] [81]

Z. Wu, Y . Xiong, S. X. Yu, and D. Lin. Unsupervised Feature Learning via Non-Parametric Instance Discrimination. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3733–3742, 2018

2018