Computational Lesions in Multilingual Language Models Separate Shared and Language-specific Brain Alignment
Pith reviewed 2026-05-10 16:46 UTC · model grok-4.3
The pith
Lesioning a compact shared core in multilingual models reduces brain prediction accuracy by about 60 percent, while language-specific lesions impair prediction only for the matching native language.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By zeroing small parameter sets identified as shared across languages or as language-specific within six multilingual LLMs, the authors show that shared-core lesions reduce whole-brain fMRI encoding correlation by 60.32 percent relative to intact models. Language-specific lesions, by contrast, preserve cross-language separation in embedding space but selectively weaken brain predictivity for the matched native language. These outcomes support the view that multilingual brain alignment rests on a shared backbone with embedded specializations and supply a causal method for linking model components to human brain responses during naturalistic story listening.
What carries the argument
Targeted zeroing of small parameter sets in multilingual LLMs that have been classified as shared across languages or as specific to one language, used to measure the resulting change in how well the models predict fMRI responses in language areas.
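The lesion operation itself is simple to sketch. Below is a minimal numpy illustration, assuming lesions are implemented by zeroing a pre-identified index set within a weight matrix; the index set here is arbitrary, whereas in the paper it would come from cross-language importance scores.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for one weight matrix of a multilingual model.
W = rng.normal(size=(8, 8))

# Hypothetical index set flagged as "shared core"; in the paper this
# would be derived from importance scores, not chosen by hand.
rows = np.array([0, 1, 2])
cols = np.array([3, 4, 5])

# The "computational lesion": zero the selected parameters, leave the rest.
W_lesioned = W.copy()
W_lesioned[rows, cols] = 0.0
```

The intact and lesioned matrices would then be run on the same stimuli, and their activations compared in how well each predicts the fMRI responses.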
Load-bearing premise
That the parameters marked as shared or language-specific inside the models correspond to the actual shared or language-specific computations performed by the human brain.
What would settle it
Brain prediction accuracy staying the same after zeroing the shared-core parameters, or language-specific lesions reducing accuracy equally for all languages instead of only the matched one.
Original abstract
How the brain supports language across different languages is a basic question in neuroscience and a useful test for multilingual artificial intelligence. Neuroimaging has identified language-responsive brain regions across languages, but it cannot by itself show whether the underlying processing is shared or language-specific. Here we use six multilingual large language models (LLMs) as controllable systems and create targeted "computational lesions" by zeroing small parameter sets that are important across languages or especially important for one language. We then compare intact and lesioned models in predicting functional magnetic resonance imaging (fMRI) responses during 100 minutes of naturalistic story listening in native English, Chinese and French (112 participants). Lesioning a compact shared core reduces whole-brain encoding correlation by 60.32% relative to intact models, whereas language-specific lesions preserve cross-language separation in embedding space but selectively weaken brain predictivity for the matched native language. These results support a shared backbone with embedded specializations and provide a causal framework for studying multilingual brain-model alignment.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper uses six multilingual LLMs to create targeted computational lesions by zeroing small parameter sets identified as important across languages (shared core) or for one language specifically. These lesioned models are compared to intact versions in their ability to predict fMRI responses from 112 participants listening to 100 minutes of naturalistic stories in English, Chinese, and French. The central quantitative finding is that lesioning the compact shared core reduces whole-brain encoding correlation by 60.32% relative to intact models, while language-specific lesions preserve cross-language embedding separation but selectively weaken brain predictivity only for the matched native language. The results are interpreted as evidence for a shared backbone with embedded language-specific specializations in both models and brain language areas.
Significance. If the selective effects can be shown to arise from the shared versus specific nature of the lesioned parameters rather than differences in overall representational disruption, the work supplies a causal, model-based framework for dissecting multilingual brain alignment that goes beyond correlational neuroimaging. The approach of using controllable lesions in LLMs to generate falsifiable predictions about selective brain predictivity is a clear methodological strength and could be extended to other domains where shared versus specialized computations are at issue.
major comments (2)
- [Results] The 60.32% reduction reported for the shared-core lesion (abstract and Results) is the primary quantitative support for the shared-backbone claim, yet the manuscript provides no indication of controls that equate lesion size, total parameter count, magnitude of change in hidden-state geometry, or downstream performance (e.g., perplexity on the stimulus stories) between the shared lesion and the language-specific or random-lesion baselines. Without such equating, the larger effect size cannot be unambiguously attributed to the shared nature of the parameters rather than greater overall model degradation.
- [Methods] The lesion-construction procedure (Methods) identifies the 'compact shared core' via cross-language importance but does not report the exact selection criteria, the relative sizes of the shared versus language-specific parameter sets, or any matching procedure that would ensure the lesions are comparable in their impact on model outputs. This detail is load-bearing for interpreting the selective weakening observed only for matched-language lesions as evidence of embedded specializations.
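The size-matched baseline the first comment asks for is straightforward to state precisely. A minimal sketch, assuming lesions are boolean masks over a flattened weight matrix (function name and interface are illustrative, not the authors'):

```python
import numpy as np

def random_lesion_like(W, lesion_mask, rng):
    """Zero a uniformly random parameter set of exactly the same size as
    lesion_mask, giving a size-matched baseline against which shared-core
    and language-specific lesions can be compared."""
    k = int(np.count_nonzero(lesion_mask))
    idx = rng.choice(W.size, size=k, replace=False)
    out = W.copy().ravel()
    out[idx] = 0.0
    return out.reshape(W.shape)
```

Comparing the encoding drop under this baseline against the targeted lesion is what separates "these parameters are shared" from "more total damage was done".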
minor comments (2)
- [Abstract] The abstract states the participant count (112) and total listening time (100 minutes) but does not break down the distribution across the three languages; adding this information would improve interpretability of the cross-language comparisons.
- [Figures/Tables] Figure legends and table captions should explicitly state the number of random-lesion controls and the statistical test used for the 60.32% reduction to allow readers to assess robustness without consulting the main text.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. The two major comments highlight important issues of interpretability and methodological transparency. We address each point below and will revise the manuscript to incorporate the requested controls and details.
Point-by-point responses
-
Referee: [Results] The 60.32% reduction reported for the shared-core lesion (abstract and Results) is the primary quantitative support for the shared-backbone claim, yet the manuscript provides no indication of controls that equate lesion size, total parameter count, magnitude of change in hidden-state geometry, or downstream performance (e.g., perplexity on the stimulus stories) between the shared lesion and the language-specific or random-lesion baselines. Without such equating, the larger effect size cannot be unambiguously attributed to the shared nature of the parameters rather than greater overall model degradation.
Authors: We agree that equating lesion impact is necessary to isolate the effect of shared versus language-specific parameters. The current manuscript reports random-lesion baselines but does not explicitly match lesion size or report auxiliary metrics such as perplexity on the stimulus stories or changes in hidden-state geometry. In the revision we will add (i) random lesions of exactly matched parameter count to both the shared-core and language-specific conditions, (ii) tables of pre- and post-lesion perplexity on the fMRI stimulus stories for all conditions, and (iii) quantitative measures of representational disruption (e.g., cosine distance in hidden states). These additions will allow readers to evaluate whether the 60.32% drop is attributable to the shared nature of the lesioned parameters. revision: yes
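The auxiliary metrics promised in (ii) and (iii) can be sketched concretely, assuming per-token negative log-likelihoods in natural log and hidden states stored as time-by-dimension matrices (both functions are illustrative stand-ins, not the authors' code):

```python
import numpy as np

def perplexity(token_nlls):
    """Perplexity from per-token negative log-likelihoods (natural log)."""
    return float(np.exp(np.mean(token_nlls)))

def mean_cosine_distance(H_intact, H_lesioned):
    """Mean (1 - cosine similarity) between matched rows of two
    hidden-state matrices (time points x hidden dimensions)."""
    num = np.sum(H_intact * H_lesioned, axis=1)
    den = (np.linalg.norm(H_intact, axis=1)
           * np.linalg.norm(H_lesioned, axis=1))
    return float(np.mean(1.0 - num / den))
```

Reporting both numbers per lesion condition would let readers check whether the shared-core lesion caused disproportionate representational disruption.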
-
Referee: [Methods] The lesion-construction procedure (Methods) identifies the 'compact shared core' via cross-language importance but does not report the exact selection criteria, the relative sizes of the shared versus language-specific parameter sets, or any matching procedure that would ensure the lesions are comparable in their impact on model outputs. This detail is load-bearing for interpreting the selective weakening observed only for matched-language lesions as evidence of embedded specializations.
Authors: We acknowledge that the Methods section currently lacks the precise numerical details needed for full reproducibility and comparability. In the revised manuscript we will add: (a) the exact cross-language importance threshold and aggregation rule used to define the shared core, (b) the per-layer and total parameter counts for the shared core and for each language-specific set, and (c) a description of any post-selection matching or normalization steps. These additions will make the lesion sizes and selection criteria transparent and will support the claim that the observed selectivity arises from the functional specialization of the lesioned parameters rather than from differences in lesion magnitude. revision: yes
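One plausible form of the selection rule the referee asks the authors to spell out is the following sketch. The aggregation choices (minimum over languages for the shared core, a one-versus-rest threshold for specificity) and the thresholds themselves are illustrative assumptions, not values taken from the paper:

```python
import numpy as np

def select_parameter_sets(importance, tau_shared, tau_specific):
    """importance: (n_params, n_languages) array of importance scores.

    Shared core: parameters important in every language (minimum over
    languages at or above tau_shared).  Language-specific set for
    language j: important for j but below threshold for all others."""
    shared = importance.min(axis=1) >= tau_shared
    specific = {}
    for j in range(importance.shape[1]):
        others = np.delete(importance, j, axis=1)
        specific[j] = (importance[:, j] >= tau_specific) & \
                      (others.max(axis=1) < tau_specific)
    return shared, specific
```

Reporting the resulting set sizes per layer, as promised in (b), would follow directly from these masks.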
Circularity Check
No significant circularity; central claims rest on independent empirical measurements
Full rationale
The paper's derivation consists of (1) identifying parameter subsets via cross-language importance metrics, (2) applying zeroing lesions, and (3) measuring changes in linear encoding correlation against held-out fMRI data. These steps are operationally distinct: the lesion definition uses model-internal importance scores, while the reported 60.32% drop and language-specific effects are computed from external brain data. No equations, fitted parameters, or self-citations are shown to make the outcome equivalent to the input by construction. The result is therefore a genuine empirical finding rather than a renaming or definitional reduction.
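The headline figure is a relative drop in mean encoding correlation, computed as below; the example values are illustrative, not the paper's raw correlations:

```python
def percent_drop(r_intact, r_lesioned):
    """Relative reduction in encoding correlation, in percent."""
    return 100.0 * (r_intact - r_lesioned) / r_intact

# Illustrative: an intact correlation of 0.5 falling to 0.2 is a 60% drop.
drop = percent_drop(0.5, 0.2)
```

Because the drop is relative, the same percentage can arise from very different absolute correlations, which is why the absolute intact values matter for interpretation.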
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: fMRI BOLD responses during naturalistic story listening index language-related neural computations
- domain assumption: LLM internal activations can be linearly mapped to brain activity via encoding models
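The second assumption, a linear map from activations to voxels, is standard in the encoding-model literature. A minimal ridge-regression sketch with held-out per-voxel Pearson correlation (shapes and the regularization strength are illustrative; the paper's exact pipeline may differ):

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: B = (X'X + alpha*I)^-1 X'Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

def encoding_score(X_train, Y_train, X_test, Y_test, alpha=1.0):
    """Fit on training runs; return mean per-voxel Pearson r on held-out runs.

    X: (time points, model features), Y: (time points, voxels)."""
    B = fit_ridge(X_train, Y_train, alpha)
    P = X_test @ B
    Pc = P - P.mean(axis=0)
    Yc = Y_test - Y_test.mean(axis=0)
    r = (Pc * Yc).sum(axis=0) / (
        np.linalg.norm(Pc, axis=0) * np.linalg.norm(Yc, axis=0))
    return float(np.mean(r))
```

The lesion effect in the paper is then the change in this held-out score when X comes from a lesioned rather than an intact model.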