A Navigable Manifold of Hypothesized Consciousness-Spectrum States in Language Model Representations

Sophie Zhao

arxiv: 2606.09894 · v1 · pith:X3N7AVQ7new · submitted 2026-06-04 · 💻 cs.LG · cs.CL

Pith reviewed 2026-06-28 03:22 UTC · model grok-4.3

classification 💻 cs.LG cs.CL

keywords: consciousness spectrum, language model embeddings, manifold structure, geometric organization, navigability, transformer representations, clustering, trajectory analysis

The pith

Language model embeddings form a structured navigable manifold aligned with a hypothesized consciousness spectrum.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates whether transformer embedding spaces encode a human-interpretable consciousness spectrum ranging from reactive self-focused patterns to integrative coherent ones. Sentences tied to similar states cluster into locally coherent regions that together form a globally organized manifold. Higher-level and lower-level regions show convexity-like stability while intermediate areas create a transition corridor. Both utility-guided and purely geometric greedy paths move consistently from lower to higher regions, indicating that navigability is an intrinsic feature of the space rather than imposed by external signals.

Core claim

Embeddings exhibit a globally organized geometry aligned with this spectrum: sentences associated with similar states cluster into locally coherent regions, forming a structured manifold. In particular, higher-level and lower-level regions exhibit convexity-like stability, while intermediate regions form a transition corridor. Dynamically, both utility-guided and geometry-only greedy trajectories consistently traverse from lower- to higher-level regions, passing through intermediate tiers, indicating that navigability is an intrinsic property of the representation space, guided but not dictated by a global directional signal.
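
As a rough illustration of what a geometry-only greedy walk could look like, the sketch below builds a k-nearest-neighbor graph over toy 2-D points and repeatedly steps to the closest unvisited neighbor, never consulting tier labels. The step rule, the function names (`knn_graph`, `greedy_walk`), and the toy data are assumptions made for illustration; the paper's actual procedure is not described in the material available here.

```python
import math

def knn_graph(points, k):
    """Return {i: [indices of the k nearest points]} under Euclidean distance."""
    graph = {}
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        graph[i] = [j for _, j in dists[:k]]
    return graph

def greedy_walk(points, graph, start, max_steps=50):
    """Step to the nearest unvisited graph neighbor until none remain.

    Tier labels are never consulted, so the walk is driven by geometry alone.
    """
    path, visited = [start], {start}
    current = start
    for _ in range(max_steps):
        candidates = [j for j in graph[current] if j not in visited]
        if not candidates:
            break
        current = min(candidates, key=lambda j: math.dist(points[current], points[j]))
        visited.add(current)
        path.append(current)
    return path

# Toy data (hypothetical): nine points on a line. The tier labels are attached
# only so the walk can be inspected afterwards; the walk itself ignores them.
points = [(float(x), 0.0) for x in range(9)]
tiers = [0, 0, 0, 1, 1, 1, 2, 2, 2]

path = greedy_walk(points, knn_graph(points, k=2), start=0)
tier_sequence = [tiers[i] for i in path]
```

On a line the walk is trivially monotone. Whether a label-free walk trends from lower to higher tiers on real sentence embeddings, and from which starting points, is the empirical question the claim rests on.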

What carries the argument

The consciousness-spectrum manifold in embedding space, where similar-state sentences form coherent clusters with stable poles at the extremes and a navigable transition corridor in between.

If this is right

Embedding spaces encode structured and navigable geometry aligned with the hypothesized taxonomy.
Navigability from lower- to higher-level states holds for both guided and geometry-only trajectories.
Higher- and lower-level regions exhibit stability while intermediate regions act as a transition corridor.
Representation-level geometry offers a perspective for analyzing and guiding model behavior.
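
The stability item in the list above could be probed with a simple proxy: for every pair of points in a tier, check whether their Euclidean midpoint lies closest to a point of the same tier. This is a hypothetical stand-in, not the paper's convexity measure, and the function names and toy coordinates are assumptions.

```python
import math

def nearest_label(point, points, labels):
    """Label of the data point closest to `point`."""
    idx = min(range(len(points)), key=lambda i: math.dist(point, points[i]))
    return labels[idx]

def midpoint_convexity(points, labels, tier):
    """Share of same-tier pairs whose Euclidean midpoint lies nearest to that tier."""
    members = [p for p, l in zip(points, labels) if l == tier]
    hits = total = 0
    for a in range(len(members)):
        for b in range(a + 1, len(members)):
            mid = tuple((x + y) / 2 for x, y in zip(members[a], members[b]))
            hits += nearest_label(mid, points, labels) == tier
            total += 1
    return hits / total if total else float("nan")

# Toy layout (deliberately constructed): tier 0 is a compact square, while
# tier 1 is arranged around it, so straight lines between tier-1 points
# pass close to tier 0.
points = [(0, 0), (0.5, 0), (0, 0.5), (0.5, 0.5),
          (-2, 0.25), (0.25, 2.5), (2.5, 0.25), (0.25, -2)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

convexity = {t: midpoint_convexity(points, labels, t) for t in (0, 1)}
```

In this constructed layout the compact tier scores 1.0 and the surrounding tier scores 0.0. Real embeddings would fall somewhere in between, and the score would depend on how densely each tier is sampled.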

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Steering generations along manifold paths could shift model outputs toward higher states without external reward signals.
The same manifold structure might appear when applying the spectrum to other model architectures or modalities.
Varying the sentence generation process or embedding model could test whether the geometry is robust or tied to specific training data patterns.

Load-bearing premise

The hypothesized consciousness-spectrum taxonomy can be translated into natural-language sentences whose embeddings will reveal an intrinsic geometric structure rather than one created by the choice of labels or clustering method.

What would settle it

Finding that random or label-shuffled sentence sets produce the same clustering into stable poles, transition corridors, and upward trajectories would show the structure is not specific to the spectrum.
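
A label-shuffle control of this kind might be sketched as follows. The statistic (mean between-tier distance minus mean within-tier distance), the helper names, and the toy 2-D points are assumptions chosen for illustration, not the paper's method.

```python
import math
import random

def separation(points, labels):
    """Mean between-label distance minus mean within-label distance."""
    within, between = [], []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            (within if labels[i] == labels[j] else between).append(d)
    return sum(between) / len(between) - sum(within) / len(within)

def permutation_p_value(points, labels, n_perm=500, seed=0):
    """Compare the observed separation with separations under shuffled labels."""
    rng = random.Random(seed)
    observed = separation(points, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if separation(points, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)

# Toy data (hypothetical): two well-separated Gaussian blobs standing in for
# embeddings of a "low" tier and a "high" tier.
rng = random.Random(1)
low = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(10)]
high = [(rng.gauss(3, 0.3), rng.gauss(3, 0.3)) for _ in range(10)]
points, labels = low + high, [0] * 10 + [1] * 10

observed, p_value = permutation_p_value(points, labels)
```

A small p-value would indicate only that the tier labels track the geometry better than shuffled labels do. It would not by itself rule out structure induced by how the sentences were written, which needs independently sourced sentences to test.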

Figures

Figures reproduced from arXiv: 2606.09894 by Sophie Zhao.

**Figure 2.** Geodesic–Euclidean Stretch (GES) heatmaps (BGE) for k = 15 (left) and k = 30 (right). Within-tier pairs exhibit lower stretch, while distant tiers show higher values. Intermediate tiers show comparatively lower stretch to neighboring tiers, consistent with their role as transitional regions.

**Figure 3.** Tier dynamics under utility-guided trajectories. Paths initialized from low-tier states move through intermediate tiers and converge toward Unity.

Excerpt of surrounding paper text: "… where expressions become more regulated and action-oriented. It subsequently passes through Clarity, characterized by structured and integrative reasoning, before converging to Unity, where expressions become abstract, non-dual, and globally coherent. Notably, p…"

**Figure 4.** Representative BGE trajectory at k = 30 (left), with corresponding path overlaid on the UMAP manifold (right).

Excerpt of surrounding paper text: "Overall, these results indicate that the manifold is not only geometrically structured, but also dynamically navigable. A weak global score-guided direction, combined with local neighborhood constraints, is sufficient to induce consistent and interpretable trajectories from low-level to high-leve…"

**Figure 5.** Geometry-only greedy walk dynamics (k = 30). Left: BGE. Right: Qwen.

Excerpt of surrounding paper text: "Interpretation. Intermediate tiers thus form a geometrically embedded transition corridor. Crucially, this structure emerges without utility guidance, indicating that navigability is an intrinsic property of the embedding manifold rather than a result of trajectory design." The excerpt continues into the paper's Discussion section: "This work provides evidence that embedding spaces ex…"

**Figure 6.** UMAP projections across neighborhood scales k ∈ {10, 15, 20, 30} for BGE, Qwen, MPNET, and MiniLM embeddings. Across models and scales, the overall tier organization remains stable, while larger neighborhood sizes produce smoother local structure, suggesting that the observed manifold geometry is robust to the choice of connectivity scale.

**Figure 7.** Tier-pair Geodesic–Euclidean Stretch (GES) for BGE embeddings across k ∈ {10, 15, 20, 30}. Within-tier pairs show lower stretch, while distant tier pairs show higher values. Absolute stretch decreases as k increases, but the overall structure remains stable.

**Figure 8.** Representative BGE trajectory at k = 30, with the corresponding path overlaid on the UMAP manifold. The trajectory progresses gradually from Collapse through intermediate regions such as Striving, Conflict, Activation, and Growth before converging to Unity.

**Figure 9.** Representative Qwen trajectory at k = 15, with the corresponding path overlaid on the UMAP manifold.

Original abstract

Across contemplative, philosophical, and psychological accounts, human consciousness is often described along a similar spectrum, ranging from reactive and self-focused patterns to more integrative and coherent ones. Understanding whether language models encode such a structured, human-interpretable consciousness spectrum in representation space is important for model guidance, evaluation and alignment. In this work, we study the geometric structure and dynamics of patterns along this spectrum in transformer embedding spaces. We show that embeddings exhibit a globally organized geometry aligned with this spectrum: sentences associated with similar states cluster into locally coherent regions, forming a structured manifold. In particular, higher-level and lower-level regions exhibit convexity-like stability, while intermediate regions form a transition corridor. Dynamically, both utility-guided and geometry-only greedy trajectories consistently traverse from lower- to higher-level regions, passing through intermediate tiers, indicating that navigability is an intrinsic property of the representation space, guided but not dictated by a global directional signal. These results suggest that embedding spaces encode structured and navigable geometry aligned with a hypothesized consciousness-spectrum taxonomy, broadly inspired by recurring structural descriptions of human consciousness across contemplative traditions, philosophy, and modern psychology, providing a representation-level perspective for analyzing and guiding model behavior.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The abstract claims a navigable manifold in embeddings aligned with an author-defined consciousness spectrum, but without methods or controls the geometry is probably just an artifact of sentence curation.

The main takeaway is that this paper describes embeddings forming clusters and trajectories that match a spectrum from reactive to integrative states, with higher and lower regions stable and a corridor in between. The dynamic claim is that both guided and unguided walks move toward higher states.

What is new is the application of standard embedding geometry and trajectory methods to this particular taxonomy drawn from contemplative traditions and psychology. The trajectory analysis adds a dynamic element beyond simple clustering.

The paper does a reasonable job in the abstract of connecting the idea to alignment and evaluation questions, and the framing is clear enough on why someone might care about representation-level structure.

The soft spots are substantial and central. Only the abstract is available, so there are no methods, datasets, clustering details, or metrics to examine. The taxonomy is introduced by the authors and then used to label the sentences whose geometry is reported as matching it. Without independent labeling, label-permutation baselines, random sentence sets, or any check that the structure survives controls for semantic similarity built into the examples, the claim that the manifold is intrinsic rather than induced does not hold up. The stress-test concern, that random or label-shuffled sentence sets might show the same structure, lands directly here. The circularity is load-bearing.

This is aimed at interpretability researchers willing to connect geometric findings to philosophical taxonomies. A reader looking for a fully worked result with falsifiable steps will not get much from it yet.

I would not bring this to a reading group. I would not cite it. It does not deserve peer review in its current state because the central claim cannot be assessed.

Referee Report

2 major / 0 minor

Summary. The manuscript claims that transformer embedding spaces encode a globally organized, navigable manifold aligned with a hypothesized consciousness-spectrum taxonomy (ranging from reactive/self-focused to integrative/coherent states). Sentences instantiating similar states form locally coherent clusters; higher- and lower-level regions exhibit convexity-like stability while intermediate regions act as a transition corridor; both utility-guided and geometry-only greedy trajectories reliably traverse from lower- to higher-level states.

Significance. If the reported geometry were shown to be intrinsic rather than induced by the authors' sentence curation and taxonomy, the work would supply a concrete representation-level lens for analyzing and steering model behavior. However, the absence of controls for label-induced structure substantially weakens the evidential basis for that interpretation.

major comments (2)

[Abstract] Abstract (paragraph 3): the central claim that the observed clusters, stability, corridor, and lower-to-higher trajectories constitute an 'intrinsic' property of the representation space is load-bearing yet unsupported by any reported controls (random sentence baselines, alternative taxonomies, label-permutation tests, or within- vs. between-state semantic-distance comparisons independent of the authors' framing). Without these, the geometry discovery reduces to a description of the input curation.
[Abstract] Abstract (paragraph 3) and methods description: the spectrum taxonomy is introduced by the authors and then used both to generate the sentence exemplars and to interpret the resulting embedding geometry as 'aligned' with that taxonomy. No independent, pre-specified labeling scheme or out-of-sample validation is described, rendering the alignment claim circular by construction.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on the need for controls to substantiate claims of intrinsic geometry and for identifying the risk of circularity in the taxonomy-based approach. We respond to each major comment below, indicating revisions where the manuscript will be updated to address the concerns directly.

Referee: [Abstract] Abstract (paragraph 3): the central claim that the observed clusters, stability, corridor, and lower-to-higher trajectories constitute an 'intrinsic' property of the representation space is load-bearing yet unsupported by any reported controls (random sentence baselines, alternative taxonomies, label-permutation tests, or within- vs. between-state semantic-distance comparisons independent of the authors' framing). Without these, the geometry discovery reduces to a description of the input curation.

Authors: We agree that the submitted manuscript lacks the recommended controls, and this limits the strength of the 'intrinsic' claim. In revision we will add random sentence baselines, label-permutation tests, and within- versus between-state semantic-distance comparisons computed independently of the taxonomy framing. The geometry-only greedy trajectories already operate without label access and still produce consistent lower-to-higher traversals; we will quantify how this exceeds chance under the new controls. These additions will be reported in a new results subsection. revision: yes

Referee: [Abstract] Abstract (paragraph 3) and methods description: the spectrum taxonomy is introduced by the authors and then used both to generate the sentence exemplars and to interpret the resulting embedding geometry as 'aligned' with that taxonomy. No independent, pre-specified labeling scheme or out-of-sample validation is described, rendering the alignment claim circular by construction.

Authors: The taxonomy synthesizes recurring structural descriptions from the cited contemplative, philosophical, and psychological literature rather than being invented ad hoc. Sentence generation was guided by it, yet the manifold geometry and directed navigability are emergent properties of the embeddings. To remove circularity we will add out-of-sample validation on sentences drawn from independent sources never used in generation, plus explicit comparison against two alternative taxonomies. These results will be included in the revised methods and results sections. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical geometry claims remain independent of input taxonomy

The provided abstract describes an empirical study of transformer embeddings for sentences associated with a hypothesized consciousness-spectrum taxonomy that is explicitly framed as inspired by external contemplative, philosophical, and psychological traditions rather than internally defined. No equations, self-citations, or derivation steps are shown that reduce the reported manifold properties (local clusters, convexity-like stability, transition corridor, or lower-to-higher trajectories) to the authors' labeling choices by construction. The central results concern specific geometric and dynamic features in representation space, which are presented as observations rather than tautological outputs of the taxonomy itself. Absent any quoted reduction matching the enumerated patterns, the derivation chain is treated as self-contained.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 1 invented entity

The central claim rests on an author-defined taxonomy whose translation into sentences is not independently validated, plus standard but unspecified embedding and clustering procedures whose parameters are not reported.

free parameters (2)

spectrum level definitions and sentence exemplars
The mapping from contemplative traditions to concrete sentence labels is chosen by the authors and directly determines which points are grouped together.
clustering and trajectory hyperparameters
Any k, distance metric, or step-size choices that produce the reported manifold and corridors are free parameters not constrained by external data.
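
To make the dependence on k concrete, the sketch below computes a geodesic-to-Euclidean stretch on a toy point set for two neighborhood sizes. It assumes the stretch is the ratio of shortest-path length on a symmetrized kNN graph to straight-line distance; that definition, the function names, and the data are illustrative assumptions, since the paper's exact formula is not given in the material available here.

```python
import heapq
import math

def symmetric_knn(points, k):
    """Undirected kNN graph as {i: {j: Euclidean edge length}}."""
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        nearest = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)[:k]
        for d, j in nearest:
            graph[i][j] = d
            graph[j][i] = d
    return graph

def shortest_paths(graph, source):
    """Dijkstra from `source`; unreachable nodes are absent from the result."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def stretch(points, k):
    """Map each connected pair (i, j) with i < j to geodesic / Euclidean distance."""
    graph = symmetric_knn(points, k)
    out = {}
    for i in range(len(points)):
        for j, g in shortest_paths(graph, i).items():
            if j > i:
                out[(i, j)] = g / math.dist(points[i], points[j])
    return out

# Toy data (hypothetical): points on a half circle, so graph paths between
# the two ends must bend around the arc.
points = [(math.cos(math.pi * t / 11), math.sin(math.pi * t / 11)) for t in range(12)]
stretch_small_k = stretch(points, k=2)
stretch_large_k = stretch(points, k=4)
```

Because the k = 4 graph contains every edge of the k = 2 graph, no pair's stretch can increase as k grows. Absolute stretch values are therefore not comparable across k, and only patterns that persist over a range of k say much about the underlying geometry.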

axioms (2)

domain assumption: A single, low-dimensional manifold structure exists in the embedding space that is meaningfully aligned with the authors' consciousness spectrum.
Invoked in the abstract when the geometry is described as 'globally organized' and 'intrinsic'.
domain assumption: Greedy trajectories on the embedding graph reflect intrinsic navigability rather than artifacts of the chosen utility function or local density.
Stated when both utility-guided and geometry-only walks are said to traverse the spectrum consistently.

invented entities (1)

consciousness-spectrum manifold (no independent evidence)
purpose: A geometric object in embedding space that organizes model states according to the hypothesized taxonomy.
Postulated to explain the observed clustering and trajectory behavior; no independent falsifiable prediction (e.g., a specific predicted dimension or external validation set) is supplied.

pith-pipeline@v0.9.1-grok · 5733 in / 1647 out tokens · 49223 ms · 2026-06-28T03:22:37.491638+00:00 · methodology
