Sparse Autoencoders Map Brain-LLM Alignment onto Cortical Semantic Topography

Dongxin Guo; Jikun Wu; Siu Ming Yiu

arXiv: 2605.23035 · v1 · pith:HXAK6GFKnew · submitted 2026-05-21 · cs.CL · cs.AI · q-bio.NC

Pith reviewed 2026-05-25 05:36 UTC · model grok-4.3

classification: cs.CL · cs.AI · q-bio.NC

keywords: sparse autoencoders · brain encoding · semantic features · cortical topography · neural encoding models · LLM interpretability · language processing

The pith

Sparse autoencoders extract semantic features from LLMs that map onto distinct brain regions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper decomposes intermediate layers of GPT-2 XL and Llama-3.1-8B using sparse autoencoders into 16K-32K features per layer and shows that the semantic subset alone recovers 94 percent of the models' ability to predict human brain responses to language. It then tests whether five semantic subcategories, chosen in advance from independent neuroscience studies, align with specific cortical regions. A convergence test finds statistically reliable subcategory-to-region mapping. This supplies a mechanistic account for why certain LLM layers predict brain activity and demonstrates that the alignment occurs at a finer grain than earlier methods reached.
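
As a rough illustration of what "decomposing a layer into features" means, the sketch below passes one activation vector through a tiny sparse autoencoder. It assumes the common ReLU formulation (affine encoder plus ReLU, linear decoder); this page does not state the paper's actual SAE architecture or training objective, and every name and weight here (`sae_encode`, `sae_decode`, `W_enc`, `W_dec`, the toy numbers) is invented for the example.

```python
def sae_encode(x, W_enc, b_enc, b_dec):
    """Feature activations f = ReLU(W_enc @ (x - b_dec) + b_enc)."""
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre = [sum(w * c for w, c in zip(row, centered)) + b
           for row, b in zip(W_enc, b_enc)]
    return [max(0.0, p) for p in pre]


def sae_decode(f, W_dec, b_dec):
    """Reconstruction x_hat = W_dec @ f + b_dec (one decoder column per feature)."""
    return [sum(w * fj for w, fj in zip(row, f)) + b
            for row, b in zip(W_dec, b_dec)]


# Toy setup (assumption): a 2-dimensional activation and a 3-feature dictionary.
W_enc = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b_enc = [0.0, 0.0, 0.0]
W_dec = [[1.0, 0.0, -0.5], [0.0, 1.0, -0.5]]
b_dec = [0.0, 0.0]

x = [2.0, -1.0]
features = sae_encode(x, W_enc, b_enc, b_dec)   # sparse: most entries are zero
x_hat = sae_decode(features, W_dec, b_dec)      # approximate reconstruction of x
```

In a real SAE the dictionary has thousands of features (16K-32K per layer here), and each feature's activation over a text becomes one regressor in the brain-encoding model.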

Core claim

Semantic features identified by sparse autoencoders recover 94 percent of peak brain-encoding performance (r=0.285) and exceed variance-matched baselines. The same features, when grouped into five a priori semantic subcategories, map onto distinct brain regions with Spearman ρ=0.72 and hypergeometric p=0.007. These features also improve prediction of reading times beyond lexical controls and the pattern holds across English, Chinese, and French.

What carries the argument

Sparse autoencoder features, especially the human-validated semantic subset, are fed into neural encoding models to predict fMRI responses and then tested for alignment with cortical semantic topography via a formal convergence statistic.

If this is right

Semantic features alone account for nearly all LLM-to-brain predictive power.
The five-category mapping to brain regions survives a formal statistical convergence test.
SAE features improve reading-time prediction beyond lexical variables.
Exploratory analysis indicates the brain may additionally encode semantic prediction errors.
The alignment pattern generalizes across three languages.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same SAE taxonomy could be applied to other imaging modalities or to models of different sizes to test whether the topography mapping remains stable.
If specific SAE features drive the alignment, targeted ablation or steering of those features should measurably change brain-prediction accuracy.
Extending the approach to sentence-level or discourse-level stimuli might reveal whether higher-order semantic structure also shows topographic organization.

Load-bearing premise

The five semantic subcategories taken from prior neuroscience work classify the SAE features correctly and correspond to separate brain regions without any post-hoc selection or adjustment of the mapping.

What would settle it

Absence of significant subcategory-to-region alignment in the convergence test, for example a Spearman ρ near zero or hypergeometric p-value above 0.05, would falsify the topography claim.
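
To make this criterion concrete, here is a minimal stdlib-only sketch of a Spearman rank correlation between a predicted subcategory × region pattern and observed encoding values. The two vectors are invented toy data, the helper names are hypothetical, and how the paper flattens cells or handles ties is an assumption.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def spearman_rho(a, b):
    """Spearman's rho is the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(a), average_ranks(b))


# Hypothetical flattened cells: 1 = predicted primary association, 0 = not.
predicted = [1, 0, 0, 0, 1, 0, 0, 0, 1]
observed_r = [0.21, 0.05, 0.08, 0.04, 0.19, 0.07, 0.06, 0.03, 0.02]
rho = spearman_rho(predicted, observed_r)
```

A ρ near zero on the real matrices, under a suitable permutation null, is the outcome that would count against the topography claim.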

Figures

Figures reproduced from arXiv: 2605.23035 by Dongxin Guo, Jikun Wu, Siu Ming Yiu.

**Figure 2.** Brain prediction across layers. Both mod…

**Figure 3.** Predicted vs. observed subcategory × region patterns. Left: a priori predictions from Binder et al. (2009)/Huth et al. (2016)/Deniz et al. (2019) (dark = predicted primary association). Right: observed SAE encoding r-values with FDR-significant cells marked. Formal convergence: ρ=0.72, p<0.001; hypergeometric p=0.007; Mantel r=0.64, p=0.002.

**Figure 4.** Subcategory × region encoding at L24. ** survives FDR correction (q<0.05, Benjamini-Hochberg); * nominally significant (p<0.05, permutation).

Feature Categorization Confusion Matrix (rows: GPT-4; columns: human)

| GPT-4 ↓ / Human → | Sem | Syn | Lex | Pred | Oth |
|---|---|---|---|---|---|
| Semantic | 82 | 4 | 3 | 2 | 9 |
| Syntactic | 3 | 86 | 2 | 1 | 8 |
| Lexical | 5 | 3 | 84 | 1 | 7 |
| Prediction | 4 | 2 | 1 | 79 | 14 |
| Other | 6 | 5 | 10 | 17 | 62 |
| Per-cat. κ | .78 | .83 | .80 | .74 | .58 |

**Figure 5.** Activation patching at L24 with 95% CIs.
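
The Benjamini-Hochberg correction named in the Figure 4 caption can be sketched as follows. This is the standard step-up procedure applied to made-up p-values; `benjamini_hochberg` is an illustrative helper, not the authors' code, and the paper's actual per-cell p-values are not reproduced on this page.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return booleans: True where the hypothesis is rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest 1-based rank k with p_(k) <= (k / m) * q.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff = rank
    # Reject every hypothesis whose rank is at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected


# Hypothetical permutation p-values for five subcategory x region cells.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.20]
survives = benjamini_hochberg(p_vals, q=0.05)
```

With these toy values, the third and fourth cells are nominally significant (p<0.05) but do not survive correction, which is the distinction the caption draws between * and **.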

Original abstract

Intermediate layers of large language models (LLMs) best predict human brain responses to language, one of the most robust findings in computational neurolinguistics, yet why remains mechanistically unexplained. We address this gap by bridging sparse autoencoders (SAEs) from mechanistic interpretability with neural encoding models, decomposing GPT-2 XL and Llama-3.1-8B into 16K-32K interpretable features per layer. A human-validated taxonomy ($\kappa \geq 0.74$) reveals that semantic features alone recover 94% of peak encoding performance ($r=0.285$), substantially exceeding variance-matched baselines ($p<0.001$, $d=1.31$). Beyond this aggregate dominance, we test a novel cortical topography prediction: five semantic subcategories derived a priori from three independent neuroscience programs should map onto distinct brain regions. A formal convergence test confirms this alignment (Spearman $\rho=0.72$, $p<0.001$; hypergeometric $p=0.007$), demonstrating that SAE-discovered features recapitulate known cortical semantic organization at a granularity inaccessible to prior methods. SAE features further predict human reading times beyond lexical controls ($\Delta\mathrm{logLik}=38.4$, $p<0.001$), and an exploratory prediction-error analysis provides preliminary evidence that the brain additionally encodes unexpected semantic content. Results generalize across English, Chinese, and French.
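
The κ reported for the taxonomy is an agreement statistic that discounts chance agreement. As a sketch of how such values arise, the code below computes Cohen's kappa, overall and one-vs-rest, from the GPT-4 vs. human confusion matrix reproduced with Figure 4. Treating the cells as raw counts is an assumption (each row sums to 100, so they may be percentages), and whether the paper computes its per-category κ this way is not stated here.

```python
def cohen_kappa(matrix):
    """Cohen's kappa for a square confusion matrix (rows: rater A, cols: rater B)."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    p_observed = sum(matrix[i][i] for i in range(k)) / n
    row_tot = [sum(row) for row in matrix]
    col_tot = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    p_chance = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)


def one_vs_rest(matrix, c):
    """Collapse the matrix to a 2x2 table: category c versus everything else."""
    n = sum(sum(row) for row in matrix)
    a = matrix[c][c]
    row_c = sum(matrix[c])
    col_c = sum(row[c] for row in matrix)
    return [[a, row_c - a], [col_c - a, n - row_c - col_c + a]]


labels = ["Sem", "Syn", "Lex", "Pred", "Oth"]
# Cells copied from the confusion matrix in the Figure 4 caption (assumed counts).
confusion = [
    [82, 4, 3, 2, 9],
    [3, 86, 2, 1, 8],
    [5, 3, 84, 1, 7],
    [4, 2, 1, 79, 14],
    [6, 5, 10, 17, 62],
]
overall = cohen_kappa(confusion)
per_category = {lab: cohen_kappa(one_vs_rest(confusion, i))
                for i, lab in enumerate(labels)}
```

Under the counts assumption the overall kappa is about 0.73. The one-vs-rest values land near the reported ones for the four main categories and somewhat lower for Other, which is consistent with the cells being rounded percentages rather than counts.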

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

SAE features recover most of the LLM-brain alignment and map onto cortical regions, but the feature classification needs checking for bias.

This paper's main result is that semantic features identified by SAEs in LLMs explain 94% of peak brain-encoding performance and align with known cortical semantic topography under a formal convergence test. What is new is the decomposition of LLM activations with SAEs, followed by tests of specific subcategory-to-region mappings using Spearman's rho and a hypergeometric test. The authors take their a priori categories from the neuroscience literature and show that the semantic features outperform variance-matched baselines (p<0.001, d=1.31). The work also reports that SAE features predict reading times and offers some evidence for encoding of unexpected content.

The paper does well in reporting effect sizes, p-values, and cross-language results, and in attempting to bridge two fields with a testable prediction. The soft spots are around the human validation of the taxonomy. Although κ≥0.74 is reported, assigning thousands of features to the five subcategories could introduce bias if the raters were not blind to the brain data. The concern about post-hoc selection is reasonable on the basis of the abstract alone, and the full methods are needed to confirm independence. Soundness is hard to judge fully without details on data exclusion and baseline construction.

This is for readers in mechanistic interpretability or neurolinguistics looking for mechanistic links. It has enough structure to deserve a serious referee who can check the methods. I recommend engaging with it in peer review.

Referee Report

2 major / 2 minor

Summary. The paper claims that sparse autoencoders (SAEs) decompose activations from GPT-2 XL and Llama-3.1-8B into 16K–32K features per layer; semantic features alone recover 94% of peak brain encoding performance (r=0.285, exceeding variance-matched baselines at p<0.001, d=1.31); a human-validated taxonomy (κ≥0.74) of five a priori semantic subcategories derived from independent neuroscience programs aligns with distinct cortical regions via a convergence test (Spearman ρ=0.72, p<0.001; hypergeometric p=0.007); SAE features also predict reading times beyond lexical controls (ΔlogLik=38.4, p<0.001); results generalize across English, Chinese, and French.

Significance. If the central claims hold, the work offers a mechanistic bridge between LLM interpretability and neurolinguistics by mapping SAE features onto known cortical semantic topography at a granularity finer than prior methods, supported by formal statistical tests and cross-lingual generalization. The use of a priori subcategories, inter-rater reliability reporting, and explicit baseline comparisons are positive elements.

major comments (2)

[Methods (Convergence Test)] Methods (taxonomy application and convergence test): The manuscript states that the five subcategories are derived a priori and that feature classification was human-validated (κ≥0.74), but provides no description of blinding, pre-registration, total features labeled versus discarded, or whether raters had access to brain encoding or topography results. This information is required to confirm that the reported Spearman ρ=0.72 and hypergeometric p=0.007 are independent of data-dependent choices.
[Results (Encoding Performance)] Results (encoding performance): The claim that semantic features recover 94% of peak performance (r=0.285) requires explicit definition of the peak baseline (which layers, how many features, exact construction of variance-matched controls) and confirmation that the 94% figure is not sensitive to post-hoc feature selection; without these details the dominance claim cannot be fully evaluated.

minor comments (2)

The abstract reports generalization across three languages but the main text should include a dedicated table or supplementary figure breaking down encoding performance and convergence statistics per language.
[Methods] Notation for the hypergeometric test and the exact null model (how many features per subcategory, total feature pool) should be stated explicitly in the Methods to allow replication.
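
To illustrate what an explicit null model would look like, the sketch below computes a hypergeometric upper-tail probability: the chance of seeing at least k overlaps between predicted and observed significant cells when the significant cells are drawn at random. The counts used (25 cells, 7 predicted, 8 significant, 6 overlapping) are hypothetical placeholders, since the paper's actual pool sizes are not given on this page.

```python
from math import comb


def hypergeom_upper_tail(N, K, n, k):
    """P(X >= k) when drawing n of N items without replacement, K of which are hits."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


# Hypothetical counts: N cells in the grid, K predicted associations,
# n cells found significant, k of those falling on predicted cells.
p_overlap = hypergeom_upper_tail(N=25, K=7, n=8, k=6)
```

Reporting N, K, n, and k alongside the p-value is enough for a reader to rerun this calculation, which is the replication detail the comment asks for.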

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight opportunities to improve methodological transparency. We address each major point below and will revise the manuscript accordingly.

Referee: [Methods (Convergence Test)] Methods (taxonomy application and convergence test): The manuscript states that the five subcategories are derived a priori and that feature classification was human-validated (κ≥0.74), but provides no description of blinding, pre-registration, total features labeled versus discarded, or whether raters had access to brain encoding or topography results. This information is required to confirm that the reported Spearman ρ=0.72 and hypergeometric p=0.007 are independent of data-dependent choices.

Authors: We agree that the Methods section requires expanded detail on the taxonomy application to demonstrate independence from data-dependent choices. The five subcategories were fixed a priori from three independent neuroscience programs before any SAE or brain data inspection. Human validation (κ≥0.74) was performed on features drawn from the full set of 16K–32K per layer. We will revise the manuscript to specify: the total number of features labeled and the subset discarded; that classification occurred without reference to per-feature brain encoding values or topography maps; and that the study was not pre-registered. The convergence test (Spearman ρ and hypergeometric p) was computed only after the taxonomy and labels were finalized.

Revision: yes

Referee: [Results (Encoding Performance)] Results (encoding performance): The claim that semantic features recover 94% of peak performance (r=0.285) requires explicit definition of the peak baseline (which layers, how many features, exact construction of variance-matched controls) and confirmation that the 94% figure is not sensitive to post-hoc feature selection; without these details the dominance claim cannot be fully evaluated.

Authors: We will revise the Results and Methods to define the peak baseline explicitly as the highest encoding performance obtained from any layer using the full set of SAE features (no selection). Variance-matched controls were constructed by sampling an equal number of non-semantic features while preserving the distribution of explained variance in the brain data. The 94% ratio is computed directly from these quantities. Semantic feature selection was determined solely by the a priori taxonomy and human labels, independent of brain encoding performance, so the comparison is not post-hoc with respect to the brain data. We will also report sensitivity checks across layer ranges.

Revision: yes
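
The variance-matched control described in this response can be sketched as a greedy nearest-variance match: for each semantic feature, take the unused non-semantic feature whose variance score is closest. This is only one plausible reading of "preserving the distribution of explained variance"; the actual sampling scheme is not specified here, and all names and numbers below are illustrative.

```python
def variance_matched_controls(semantic_var, pool_var):
    """Greedy one-to-one match on a per-feature variance score.

    Returns indices into pool_var, ordered by descending semantic variance.
    """
    if len(pool_var) < len(semantic_var):
        raise ValueError("control pool is smaller than the semantic set")
    available = set(range(len(pool_var)))
    chosen = []
    # Match the highest-variance semantic features first.
    for v in sorted(semantic_var, reverse=True):
        best = min(available, key=lambda j: (abs(pool_var[j] - v), j))
        available.remove(best)
        chosen.append(best)
    return chosen


def recovery_ratio(subset_r, peak_r):
    """Fraction of peak encoding performance recovered by a feature subset."""
    return subset_r / peak_r


# Hypothetical variance scores for 3 semantic and 6 non-semantic features.
semantic_var = [0.9, 0.4, 0.1]
pool_var = [0.05, 0.12, 0.38, 0.5, 0.95, 1.4]
controls = variance_matched_controls(semantic_var, pool_var)
```

Comparing `recovery_ratio` for the semantic set against the same ratio for many such matched control sets gives the baseline comparison the referee asks to see spelled out.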

Circularity Check

0 steps flagged

No significant circularity; derivation uses independent priors and formal tests

The paper's load-bearing steps rest on subcategories derived a priori from three independent neuroscience programs, a human-validated taxonomy with reported κ ≥ 0.74, and formal statistical tests (Spearman ρ=0.72, hypergeometric p=0.007) applied to SAE features extracted from LLMs. No equations, self-citations, or fitted parameters are shown reducing the alignment result to its own inputs by construction. The convergence test is presented as confirmation against external brain data rather than a renaming or self-definition. This is a self-contained derivation against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claims rest on the assumption that the human-validated taxonomy correctly partitions features into semantically meaningful subcategories and that these subcategories have independent grounding in neuroscience programs; no free parameters are explicitly fitted in the abstract beyond the choice of SAE dictionary size, and no new entities are postulated.

free parameters (1)

SAE dictionary size
16K-32K features chosen per layer for decomposition; value is selected rather than derived.

axioms (2)

domain assumption: Semantic features are the dominant component explaining LLM-brain alignment
Invoked when claiming semantic features recover 94% of peak performance.
domain assumption: The five semantic subcategories derived from prior neuroscience programs are valid and distinct
Required for the a priori mapping test and convergence statistic.

Reference graph

Works this paper leans on

79 extracted references · 79 canonical work pages · 2 internal anchors

[1] John Hale. In Language Technologies 2001: The Second Meeting of the North American Chapter of the Association for Computational Linguistics, 2001.

[2] Alexander G. Huth, Wendy A. de Heer, Thomas L. Griffiths, Fr... Natural speech reveals the semantic maps that tile human cerebral cortex. 2016. doi:10.1038/nature17637

[3] Mariya Toneva and Leila Wehbe. Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain). 2019.

[4] Simon Kornblith, Mohammad Norouzi, Honglak Lee, and Geoffrey E. Hinton. Similarity of Neural Network Representations Revisited. 2019.

[5] Robert Huben, Hoagy Cunningham, Logan Riggs Smith, Aidan Ewart, and Lee Sharkey. In The Twelfth International Conference on Learning Representations, 2024.

[6] Charlotte Caucheteux, Alexandre Gramfort, Jean... Disentangling syntax and semantics in the brain with deep networks. 2021.

[7] Alexandre Pasquiou, Yair Lakretz, John T. Hale, Bertrand Thirion, and Christophe Pallier. Neural Language Models are not Born Equal to Fit Brain Data, but Training Helps. 2022.

[8] Nora Belrose, Zach Furman, Logan Smith, Danny Halawi, Igor Ostrovsky, Lev McKinney, Stella Biderman, and Jacob Steinhardt. Eliciting Latent Predictions from Transformers with the Tuned Lens. arXiv preprint, 2023. doi:10.48550/arxiv.2303.08112

[9] Yuxiao Li, Eric J. Michaud, David D. Baek, Joshua Engels, Xiaoqing Sun, and Max Tegmark. Entropy, 2025. doi:10.3390/e27040344

[10] Samuel Marks, Can Rager, Eric J. Michaud, Yonatan Belinkov, David Bau, and Aaron Mueller. In The Thirteenth International Conference on Learning Representations, 2025.
[11] Jing Huang, Zhengxuan Wu, Christopher Potts, Mor Geva, and Atticus Geiger. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024. doi:10.18653/v1/2024.acl-long.470

[13] Ian Covert, Scott M. Lundberg, Su... Explaining by Removing: ... J. Mach. Learn. Res., 2021.

[14] Zhengxuan Wu, Atticus Geiger, Thomas Icard, Christopher Potts, and Noah D. Goodman. Interpretability at Scale: Identifying Causal Mechanisms in Alpaca. 2023.

[15] Atticus Geiger, Zhengxuan Wu, Christopher Potts, Thomas Icard, and Noah D. Goodman. Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations. 2024.

[16] Richard J. Antonello, Aditya R. Vaidya, and Alexander Huth. Scaling laws for language encoding models in fMRI. 2023.

[17] Ian Tenney, Dipanjan Das, and Ellie Pavlick. In Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019. doi:10.18653/v1/p19-1452

[18] Andrew M. Saxe, James L. McClelland, and Surya Ganguli. arXiv preprint, 2018.

[19] Shailee Jain and Alexander Huth. Incorporating Context into Language Encoding Models for fMRI. 2018.

[20] Aleksandar Makelov, Georg Lange, and Neel Nanda. In The Thirteenth International Conference on Learning Representations, 2025.

[21] Hamed Nili, Cai Wingfield, Alexander Walther, Li Su, William D. Marslen... A Toolbox for Representational Similarity Analysis. 2014. doi:10.1371/journal.pcbi.1003553

[22] Radoslaw Martin Cichy, Kshitij Dwivedi, Benjamin Lahner, Alex Lascelles, P. Iamshchinina, Monika Graumann, Alex Andonian, N. A. R. Murty, K. Kay, Gemma Roig, and Aude Oliva. arXiv preprint, 2021.
[23] Roger Levy. Cognition. doi:10.1016/j.cognition.2007.05.006

[24] Tom M. Mitchell, Svetlana V. Shinkareva, Andrew Carlson, Kai-Min Chang, Vicente L. Malave, Robert A. Mason, and Marcel Adam Just. Predicting Human Brain Activity Associated with the Meanings of Nouns. Science. doi:10.1126/science.1152876

[25] Martin Schrimpf, Idan Asher Blank, Greta Tuckute, Carina Kauf, Eghbal A. Hosseini, Nancy Kanwisher, Joshua B. Tenenbaum, and Evelina Fedorenko. Proceedings of the National Academy of Sciences. doi:10.1073/pnas.2105646118

[26] Ariel Goldstein, Zaid Zada, Eliav Buchnik, Mariano Schain, Amy Price, Bobbi Aubrey, Samuel A. Nastase, Amir Feder, Dotan Emanuel, Alon Cohen, Aren Jansen, Harshvardhan Gazula, Gina Choe, Aditi Rao, Catherine Kim, Colton Casto, Lora Fanda, Werner Doyle, Daniel Friedman, Dugan, Patr... doi:10.1038/s41593-022-01026-4

[27] Francisco Pereira, Bin Lou, Brianna Pritchett, Samuel Ritter, Samuel J. Gershman, Nancy Kanwisher, Matthew Botvinick, and Evelina Fedorenko. Toward a universal decoder of linguistic meaning from brain activation. Nature Communications. doi:10.1038/s41467-018-03068-4

[28] Greta Tuckute, Aalok Sathe, Shashank Srikant, Maya Taliaferro, Mingye Wang, Martin Schrimpf, Kendrick Kay, and Evelina Fedorenko. Nature Human Behaviour. doi:10.1038/s41562-023-01783-7

[29] Evelina Fedorenko, Anna A. Ivanova, and Tamar I. Regev. The language network as a natural kind within the broader landscape of the human brain. Nature Reviews Neuroscience. doi:10.1038/s41583-024-00802-4

[30] Charlotte Caucheteux and Jean-Rémi King. Brains and algorithms partially converge in natural language processing. Communications Biology. doi:10.1038/s42003-022-03036-1

[31] Rajesh P. N. Rao and Dana H. Ballard. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience. doi:10.1038/4580

[32] Karl Friston. A theory of cortical responses. Philosophical Transactions of the Royal Society B: Biological Sciences. doi:10.1098/rstb.2005.1622

[33] Karl Friston. The free-energy principle: a unified brain theory? Nature Reviews Neuroscience. doi:10.1038/nrn2787
[34] Andy Clark. Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences. doi:10.1017/s0140525x12000477

[35] Micha Heilbron, Kristijan Armeni, Jan-Mathijs Schoffelen, Peter Hagoort, and Floris P. de Lange. Proceedings of the National Academy of Sciences. doi:10.1073/pnas.2201968119

[36] Richard Antonello and Alexander Huth. Predictive Coding or Just Feature Discovery? An Alternative Account of Why Language Models Fit Brain Data. doi:10.1162/nol_a_00087

[37] Carina Kauf, Greta Tuckute, Roger Levy, Jacob Andreas, and Evelina Fedorenko. Neurobiology of Language. doi:10.1162/nol_a_00116

[38] Subba Reddy Oota, Manish Gupta, and Mariya Toneva. Joint processing of linguistic properties in brains and language models. 2023.

[39] Eghbal A. Hosseini, Martin Schrimpf, Yian Zhang, Samuel Bowman, Noga Zaslavsky, and Evelina Fedorenko. Neurobiology of Language. doi:10.1162/nol_a_00137

[40] Sreejan Kumar, Theodore R. Sumers, Takateru Yamakoshi, Ariel Goldstein, Uri Hasson, Kenneth A. Norman, Thomas L. Griffiths, Robert D. Hawkins, and Samuel A. Nastase. Nature Communications. doi:10.1038/s41467-024-49173-5

[41] Cory Shain, Clara Meister, Tiago Pimentel, Ryan Cotterell, and Roger Levy. Proceedings of the National Academy of Sciences. doi:10.1073/pnas.2307876121

[42] Charlotte Caucheteux, Alexandre Gramfort, and Jean-Rémi King. Evidence of a predictive coding hierarchy in the human brain listening to speech. Nature Human Behaviour. doi:10.1038/s41562-022-01516-2

[43] Kyle Mahowald, Anna A. Ivanova, Idan Asher Blank, Nancy Kanwisher, Joshua B. Tenenbaum, and Evelina Fedorenko. arXiv preprint, 2023. doi:10.48550/arxiv.2301.06627
[44] R. Thomas McCoy, Shunyu Yao, Dan Friedman, Mathew D. Hardy, and Thomas L. Griffiths. Embers of autoregression show how large language models are shaped by the problem they are trained to solve. Proceedings of the National Academy of Sciences. doi:10.1073/pnas.2322420121

[45] Greta Tuckute, Nancy Kanwisher, and Evelina Fedorenko. Language in Brains, Minds, and Machines. Annual Review of Neuroscience. doi:10.1146/annurev-neuro-120623-101142

[46] Jeffrey R. Binder, Rutvik H. Desai, William W. Graves, and Lisa L. Conant. Where Is the Semantic System? A Critical Review and Meta-Analysis of 120 Functional Neuroimaging Studies. Cerebral Cortex. doi:10.1093/cercor/bhp055

[47] Fatma Deniz, Anwar O. Nunez-Elizalde, Alexander G. Huth, and Jack L. Gallant. The Representation of Semantic Information Across Human Cerebral Cortex During Listening Versus Reading Is Invariant to Stimulus Modality. The Journal of Neuroscience. doi:10.1523/jneurosci.0675-19.2019

[48] Amanda LeBel, Lauren Wagner, Shailee Jain, Aneesh Adhikari-Desai, Bhavin Gupta, Allyson Morgenthal, Jerry Tang, Lixiang Xu, and Alexander G. Huth. Scientific Data. doi:10.1038/s41597-023-02437-z

[49] Vinamra Benara, Chandan Singh, John X. Morris, Richard J. Antonello, Ion Stoica, Alexander Huth, and Jianfeng Gao. Crafting Interpretable Embeddings for Language Neuroscience by Asking LLMs Questions. 2024.

[50] Byung... Why Does Surprisal From Larger Transformer-Based Language Models Provide a Poorer Fit to Human Reading Times? 2023. doi:10.1162/tacl_a_00548

[51] Tim C. Kietzmann, Courtney J. Spoerer, Lynn K. A. Sörensen, Radoslaw M. Cichy, Olaf Hauk, and Nikolaus Kriegeskorte. Recurrence is required to capture the representational dynamics of the human visual system. Proceedings of the National Academy of Sciences. doi:10.1073/pnas.1905544116

[52] Uri Hasson, Samuel A. Nastase, and Ariel Goldstein. Neuron. doi:10.1016/j.neuron.2019.12.002

[53] James A. Michaelov, Megan D. Bardolph, Cyma K. Van Petten, Benjamin K. Bergen, and Seana Coulson. Strong Prediction: Language Model Surprisal Explains Multiple N400 Effects. Neurobiology of Language. doi:10.1162/nol_a_00105
[54]

A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders , journal =

David Chanin and James Wilken. A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders , journal =. 2024 , burl =. doi:10.48550/ARXIV.2409.14507 , beprinttype =

work page doi:10.48550/arxiv.2409.14507 2024
[55]

Steven G. Luke and Kiel Christianson. The Provo Corpus: A large eye-tracking corpus with predictability norms. Behavior Research Methods. doi:10.3758/s13428-017-0908-4
[56]

Yoav Benjamini and Yosef Hochberg. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society Series B: Statistical Methodology, 1995. doi:10.1111/j.2517-6161.1995.tb02031.x
[57]

Kanishka Misra and Kyle Mahowald. arXiv preprint, 2024. doi:10.48550/arxiv.2403.19827
[58]

James A. Michaelov, Seana Coulson, and Benjamin K. Bergen. So Cloze Yet So Far: N400 Amplitude Is Better Predicted by Distributional Information Than Human Predictability Judgements. IEEE Transactions on Cognitive and Developmental Systems. doi:10.1109/tcds.2022.3176783
[59]

Nikolaus Kriegeskorte. Representational similarity analysis – connecting the branches of systems neuroscience. 2008. doi:10.3389/neuro.06.004.2008
[60]

Steven T. Piantadosi, Dyana C.Y. Muller, Joshua S. Rule, Karthikeya Kaushik, Mark Gorenstein, Elena R. Leib, and Emily Sanford. Why concepts are (probably) vectors. Trends in Cognitive Sciences, 2024. doi:10.1016/j.tics.2024.06.011
[61]

Jeffrey R. Binder, Lisa L. Conant, Colin J. Humphries, Leonardo Fernandino, Stephen B. Simons, Mario Aguilar, and Rutvik H. Desai. Toward a brain-based componential semantic representation. Cognitive Neuropsychology, 2016. doi:10.1080/02643294.2016.1147426
[62]

Ken McRae, George S. Cree, Mark S. Seidenberg, and Chris McNorgan. Semantic feature production norms for a large set of living and nonliving things. Behavior Research Methods. doi:10.3758/bf03192726
[63]

Marc Brysbaert, Amy Beth Warriner, and Victor Kuperman. Concreteness ratings for 40 thousand generally known English word lemmas. Behavior Research Methods. doi:10.3758/s13428-013-0403-5
[64]

Amy Beth Warriner, Victor Kuperman, and Marc Brysbaert. Norms of valence, arousal, and dominance for 13,915 English lemmas. Behavior Research Methods. doi:10.3758/s13428-012-0314-x
[65]

Martin N. Hebart, Oliver Contier, Lina Teichmann, Adam H. Rockter, Charles Y. Zheng, Alexis Kidder, Anna Corriveau, Maryam Vaziri-Pashkam, and Chris I. Baker. THINGS-data, a multimodal collection of large-scale datasets for investigating object representations in human brain and behavior. doi:10.7554/elife.82580
[66]

T. Memorisation versus Generalisation in Pre-trained Language Models. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022. doi:10.18653/v1/2022.acl-long.521
[67]

Rui Mao, Qian Liu, Xiao Li, Erik Cambria, and Amir Hussain. arXiv preprint, 2025. doi:10.48550/arxiv.2508.20674
[68]

Adly Templeton, Tom Conerly, Jonathan Marcus, Jack Lindsey, Trenton Bricken, Brian Chen, Adam Pearce, Craig Citro, Emmanuel Ameisen, Andy Jones, Hoagy Cunningham, Nicholas L. Turner, Callum McDougall, Monte MacDiarmid, Alex Tamkin, Esin Durmus, Tristan Hume, Francesco Mosconi, Freeman, ... 2024.
[69]

Adrien Doerig, Tim C. Kietzmann, Emily Allen, Yihan Wu, Thomas Naselaris, Kendrick Kay, and Ian Charest. Nature Machine Intelligence.
[70]

Mackenzie Weygandt Mathis and Adriana. Decoding the brain: From neural representations to mechanistic models. 2024. doi:10.1016/j.cell.2024.08.051
[71]

A Mathematical Framework for Transformer Circuits. Transformer Circuits Thread.
[72]

Language Models Can Explain Neurons in Language Models. 2023.
[73]

Karalyn Patterson and Matthew A. Chapter 61 - The Hub-and-Spoke Hypothesis of Semantic Memory. Neurobiology of Language, 2016. doi:10.1016/B978-0-12-407794-2.00061-4
[74]

Shirin Vafaei, Ryohei Fukuma, Takufumi Yanagisawa, Huixiang Yang, Satoru Oshino, Naoki Tani, Hui Ming Khoo, Hidenori Sugano, Yasushi Iimura, Hiroharu Suzuki, Madoka Nakajima, Kentaro Tamura, and Haruhiko Kishima. Communications Biology.
[75]

Veronica Boyce and Roger Levy. Glossa Psycholinguistics.
[76]

Veronica Diveica, Penny M. Pexman, and Richard J. Binney. Quantifying Social Semantics: An Inclusive Definition of Socialness and Ratings for 8388. 2023.
[77]

Jian Liu, Xiongtao Shi, Thai Duy Nguyen, Haitian Zhang, Tianxiang Zhang, Wei Sun, Yanjie Li, Athanasios V. Vasilakos, Giovanni Iacca, Arshad Ali Khan, Arvind Kumar, Jae Won Cho, Ajmal Mian, Lihua Xie, Erik Cambria, and Lin Wang. arXiv preprint, 2025. doi:10.48550/arxiv.2505.07634
[78]

BrainExplore: Large-Scale Discovery of Interpretable Visual Representations in the Human Brain. 2025.
[79]

Georgii Aparin, Tasnima Sadekova, Alexey Rukhovich, Assel Yermekova, Laida Kushnareva, Vadim Popov, Kristian Kuznetsov, and Irina Piontkovskaya. AudioSAE: Towards Understanding of Audio-Processing Models with Sparse AutoEncoders. Proceedings of the 19th Conference of the European Chapter of the Association for Computation... doi:10.18653/v1/2026.eacl-long.149
[80]

Disentangling Superpositions: Interpretable Brain Encoding Model with Sparse Concept Atoms. The Thirty-ninth Annual Conference on Neural Information Processing Systems.
