Functional Subspace, where language models can use vector algebra to solve problems
Pith reviewed 2026-05-16 08:44 UTC · model grok-4.3
The pith
Large language models create functional subspaces in their activations where evidence accumulates and in-context learning tasks are solved with vector algebra operations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Analyses of residual streams and functional modules collected during in-context learning indicate that LLMs form subspaces in which evidence can be accumulated and that ICL tasks can be solved via simple algebraic operations performed inside those subspaces.
What carries the argument
Functional subspaces within residual stream activations, which serve as regions where evidence from in-context examples is linearly combined to produce task outputs.
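As a rough illustration of what "linearly combined inside a subspace" could mean, here is a minimal sketch in plain Python. It is not the paper's method. The helper names (`project`, `accumulate`, `readout`), the toy 4-dimensional vectors, and the assumption of an orthonormal basis are all hypothetical choices made for this example.

```python
# Illustrative sketch only: every name and number here is hypothetical.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def project(x, basis):
    """Coordinates of x in the subspace spanned by an orthonormal basis."""
    return [dot(x, b) for b in basis]

def accumulate(activations, basis):
    """Add up the subspace coordinates of several activation vectors."""
    total = [0.0] * len(basis)
    for x in activations:
        coords = project(x, basis)
        total = [t + c for t, c in zip(total, coords)]
    return total

def readout(evidence, answer_directions):
    """Pick the answer whose direction aligns best with the evidence."""
    return max(answer_directions,
               key=lambda label: dot(evidence, answer_directions[label]))

# Toy 4-d "residual stream"; the subspace is spanned by the first two axes.
basis = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
# Three in-context examples. Components outside the subspace are ignored.
examples = [[0.5, 0.1, 3.0, -1.0], [0.4, -0.2, -2.0, 0.5], [0.6, 0.2, 1.0, 1.0]]
answers = {"A": [1.0, 0.0], "B": [0.0, 1.0]}

evidence = accumulate(examples, basis)   # approximately [1.5, 0.1]
prediction = readout(evidence, answers)  # "A"
```

In real experiments the basis would have to be estimated from recorded activations, and an array library would replace these list-based helpers.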
Load-bearing premise
The observed patterns in activations during in-context learning reflect subspaces that the model actually uses for computation rather than artifacts produced by the analysis method or layer choices.
What would settle it
Select the dimensions that define a candidate subspace for a given task, zero them out or replace them with noise during inference, and check whether accuracy on that specific in-context learning task falls while unrelated tasks remain unaffected.
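The intervention described above can be sketched on a single activation vector as follows. This is a hedged illustration, not code from the paper: `ablate_subspace`, its `noise_scale` parameter, and the toy 3-dimensional vectors are hypothetical, and the basis is assumed to be orthonormal. A real test would apply the edit inside the model's forward pass and then measure task accuracy.

```python
import random

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def ablate_subspace(x, basis, noise_scale=0.0, rng=None):
    """Remove x's component inside span(basis).

    With noise_scale == 0 the component is zeroed out; otherwise it is
    replaced by Gaussian noise. `basis` is assumed to be orthonormal.
    """
    rng = rng or random.Random(0)
    out = list(x)
    for b in basis:
        coeff = dot(x, b)  # current coordinate along b
        new_coeff = rng.gauss(0.0, noise_scale) if noise_scale > 0 else 0.0
        out = [o + (new_coeff - coeff) * bi for o, bi in zip(out, b)]
    return out

# Hypothetical candidate subspace (one direction) and an unrelated direction.
basis = [[0.6, 0.8, 0.0]]
unrelated = [0.0, 0.0, 1.0]
x = [3.0, 4.0, 5.0]

zeroed = ablate_subspace(x, basis)                   # subspace component removed
noised = ablate_subspace(x, basis, noise_scale=1.0)  # component replaced by noise
```

In this toy case the readout along `basis[0]` drops to roughly zero after zeroing, while the readout along `unrelated` is unchanged. That is the pattern the proposed test looks for at the level of task accuracy.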
Original abstract
Large language models (LLMs) were invented for natural language tasks such as translation, but they have proved able to perform highly complex functions across domains. They are also thought to develop new skills without being trained on them. These learning capabilities have led to LLMs' adoption in a wide range of domains, so it is imperative that we understand their operating mechanisms and limitations for proper diagnostics and repair. Earlier studies proposed that high level concepts are encoded as linear directions in LLMs' activation space and that the geometry of embeddings has semantic meaning. Inspired by these studies, we hypothesize that LLMs may use subspaces and vector algebra in subspaces to perform tasks. To test this hypothesis, we analyze LLMs' functional modules and residual streams collected from LLMs engaging in in-context learning (ICL), one of the emergent abilities. Our analyses suggest that 1) LLMs can create subspaces, where evidence can be accumulated and 2) ICL tasks can be solved via simple algebraic operations in subspaces.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper hypothesizes that LLMs create functional subspaces in activation space during in-context learning (ICL) to accumulate evidence and solve tasks via simple vector algebraic operations. This is investigated through analyses of functional modules and residual streams collected while models perform ICL.
Significance. If the central claims hold with causal validation, the work would advance mechanistic interpretability of emergent ICL abilities and suggest new directions for subspace-based model editing. The current observational analyses, however, do not yet establish that the identified geometric patterns are causally used rather than correlational artifacts.
major comments (2)
- [Abstract] Abstract: the claims that 'LLMs can create subspaces, where evidence can be accumulated' and 'ICL tasks can be solved via simple algebraic operations in subspaces' are stated without any equations defining the operations, any statistical tests, controls, or error bars on the collected activations, rendering it impossible to determine whether the patterns are predictive or post-hoc fits.
- [Analyses (implied)] No section on causal interventions: the manuscript reports geometric patterns and module activations in residual streams but contains no targeted editing, ablation, or algebraic manipulation experiments that would test whether altering the identified directions changes ICL accuracy in the predicted direction; without such tests the patterns could be downstream effects of standard attention or feed-forward layers.
minor comments (1)
- [Abstract] Abstract: the phrase 'earlier studies proposed that high level concepts are encoded as linear directions' would benefit from explicit citations to ground the novelty claim.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below, revising the manuscript to improve rigor in the abstract and results while acknowledging the observational nature of the study.
Point-by-point responses
-
Referee: [Abstract] Abstract: the claims that 'LLMs can create subspaces, where evidence can be accumulated' and 'ICL tasks can be solved via simple algebraic operations in subspaces' are stated without any equations defining the operations, any statistical tests, controls, or error bars on the collected activations, rendering it impossible to determine whether the patterns are predictive or post-hoc fits.
Authors: We agree the original abstract was insufficiently precise. We have revised it to reference the specific operations (evidence accumulation via vector addition in the identified subspace and task resolution via subtraction, as formalized in Equations 2 and 3 of the methods section). We have also added statistical tests (paired t-tests with p < 0.01) and error bars from 5 independent runs in the results figures, along with controls comparing against random subspaces and shuffled ICL examples to rule out post-hoc fitting. revision: yes
-
Referee: [Analyses (implied)] No section on causal interventions: the manuscript reports geometric patterns and module activations in residual streams but contains no targeted editing, ablation, or algebraic manipulation experiments that would test whether altering the identified directions changes ICL accuracy in the predicted direction; without such tests the patterns could be downstream effects of standard attention or feed-forward layers.
Authors: We acknowledge that the work is observational and does not include direct causal interventions such as subspace editing. In the revision we have added module ablation experiments (zeroing activations in the identified functional modules) that reduce ICL accuracy in the expected manner, providing correlational support. We have also expanded the discussion to explicitly note that the patterns could be downstream effects and to outline how future editing experiments could test causality. Full targeted algebraic manipulations remain outside the scope of this initial study. revision: partial
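A control against random subspaces, of the kind mentioned in the responses above, might look like the following sketch. It is an assumption-laden illustration rather than the authors' procedure. The synthetic activations, the helper names, and the choice of "fraction of squared norm captured" as the score are all hypothetical.

```python
import random

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def random_orthonormal_basis(dim, k, rng):
    """Draw k Gaussian vectors and orthonormalize them (modified Gram-Schmidt)."""
    basis = []
    while len(basis) < k:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for b in basis:
            c = dot(v, b)
            v = [vi - c * bi for vi, bi in zip(v, b)]
        length = dot(v, v) ** 0.5
        if length > 1e-8:  # skip degenerate draws
            basis.append([vi / length for vi in v])
    return basis

def captured_fraction(activations, basis):
    """Share of the activations' total squared norm lying in span(basis)."""
    inside = sum(dot(x, b) ** 2 for x in activations for b in basis)
    total = sum(dot(x, x) for x in activations)
    return inside / total

rng = random.Random(0)
dim = 8
# Synthetic activations that vary mostly along the first axis.
activations = [[5.0 + rng.gauss(0.0, 1.0)]
               + [rng.gauss(0.0, 0.1) for _ in range(dim - 1)]
               for _ in range(50)]

candidate = [[1.0] + [0.0] * (dim - 1)]  # hypothetical candidate subspace
candidate_score = captured_fraction(activations, candidate)
random_scores = [captured_fraction(activations, random_orthonormal_basis(dim, 1, rng))
                 for _ in range(200)]
# Share of random subspaces that score at least as well as the candidate.
exceed_rate = sum(s >= candidate_score for s in random_scores) / len(random_scores)
```

A low `exceed_rate` suggests the candidate subspace is not something a random direction would reproduce. It still says nothing about whether the model uses that subspace causally.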
Circularity Check
No circularity: claims rest on observational analysis of residual streams, not self-referential derivation
full rationale
The paper hypothesizes subspaces and vector algebra for ICL based on prior linear geometry studies, then reports analyses of functional modules and residual streams during ICL tasks. No equations, fitted parameters, or self-citations are shown that would reduce the central claims (subspace creation and algebraic solving) to their inputs by construction. The derivation chain is self-contained: it is empirical pattern detection rather than a closed loop of definitions or predictions forced by the authors' prior work.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption High-level concepts are encoded as linear directions in activation space
- domain assumption Residual streams and functional modules can be isolated without destroying the relevant geometry
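To make the first assumption concrete, one common heuristic for estimating a linear concept direction is the normalized difference between the mean activations of two groups of examples. The sketch below shows that heuristic on made-up 3-dimensional vectors. It is not taken from the paper, and the names `mean`, `concept_direction`, `pos`, and `neg` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def concept_direction(pos, neg):
    """Unit vector pointing from the mean of `neg` to the mean of `pos`."""
    diff = [a - b for a, b in zip(mean(pos), mean(neg))]
    length = dot(diff, diff) ** 0.5
    return [d / length for d in diff]

# Made-up activations for examples with and without some concept.
pos = [[1.0, 2.0, 0.0], [1.2, 2.0, 0.2]]
neg = [[-1.0, 2.0, 0.1], [-0.8, 2.0, 0.1]]

direction = concept_direction(pos, neg)
pos_scores = [dot(x, direction) for x in pos]  # projections of concept examples
neg_scores = [dot(x, direction) for x in neg]  # projections of the others
```

If the assumption holds for a given concept, projections onto `direction` separate the two groups, as they do in this toy data. Whether real activations behave this way is an empirical question.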
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (tag: unclear)
Relation between the paper passage and the cited Recognition theorem: unclear.
Our analyses suggest that 1) LLMs can create subspaces, where evidence can be accumulated and 2) ICL tasks can be solved via simple algebraic operations in subspaces.
-
IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction (tag: unclear)
Relation between the paper passage and the cited Recognition theorem: unclear.
residual streams are superpositions of possible answers (Anl_k) stored in FFNs
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.