Cross-lingual robustness of LLM-brain alignment and its computational roots
Pith reviewed 2026-05-21 05:15 UTC · model grok-4.3
The pith
Brain-LLM alignment holds across languages in cortical and subcortical areas but does not track surprisal or dimensionality.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Using a multilingual whole-brain encoding framework, transformer-based models predicted activity in a distributed landscape spanning widely distributed cortical functional networks like limbic, ventral attention, default mode network, and subcortical structures. Spatial alignment patterns showed substantial cross-linguistic overlap and remained largely stable across model layers, with limited layer progression consistent with functional cortical hierarchies. Contrary to previous evidence, contextual embeddings did not outperform static embeddings. Neither surprisal nor intrinsic dimensionality mirrored neural alignment profiles, suggesting that brain-LLM alignment is spatially robust and cross-linguistically stable.
What carries the argument
Multilingual whole-brain encoding that maps LLM embeddings to fMRI responses during naturalistic story listening to measure spatial overlap and test computational explanations.
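As a rough illustration of what an encoding analysis of this kind involves, the sketch below fits a ridge regression from stimulus features to simulated regional responses and scores it by cross-validated Pearson correlation (the "brain score"). It is not the authors' pipeline: the data are synthetic, the function names `ridge_fit` and `brain_scores`, the fixed `alpha`, and the five contiguous folds are all assumptions, and steps such as hemodynamic response modeling are left out.

```python
import numpy as np

def ridge_fit(X, Y, alpha):
    """Closed-form ridge weights mapping features X (n, d) to responses Y (n, v)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

def brain_scores(X, Y, alpha=1.0, n_folds=5):
    """Cross-validated Pearson r between predicted and observed responses, per region."""
    n = X.shape[0]
    folds = np.array_split(np.arange(n), n_folds)  # contiguous folds keep time order
    scores = np.zeros((n_folds, Y.shape[1]))
    for k, test in enumerate(folds):
        train = np.setdiff1d(np.arange(n), test)
        # Standardize features with training statistics only, to avoid leakage
        mu_x, sd_x = X[train].mean(0), X[train].std(0) + 1e-8
        mu_y = Y[train].mean(0)
        X_tr, X_te = (X[train] - mu_x) / sd_x, (X[test] - mu_x) / sd_x
        W = ridge_fit(X_tr, Y[train] - mu_y, alpha)
        pred = X_te @ W + mu_y
        # Pearson r for each region (column) on the held-out fold
        pc = pred - pred.mean(0)
        yc = Y[test] - Y[test].mean(0)
        denom = np.sqrt((pc ** 2).sum(0) * (yc ** 2).sum(0)) + 1e-12
        scores[k] = (pc * yc).sum(0) / denom
    return scores.mean(0)

# Synthetic stand-in data (hypothetical sizes): 400 time points, 20 feature dims
rng = np.random.default_rng(0)
X = rng.standard_normal((400, 20))
W_true = rng.standard_normal((20, 3))
Y = X @ W_true + 0.5 * rng.standard_normal((400, 3))   # 3 regions driven by X
Y = np.column_stack([Y, rng.standard_normal(400)])     # 1 region unrelated to X
scores = brain_scores(X, Y)
```

In this toy setting the three simulated regions driven by the features score high and the unrelated region scores near zero. In a real analysis `alpha` would typically be chosen by nested cross-validation rather than fixed.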
If this is right
- Alignment covers subcortical structures in addition to multiple cortical networks.
- Patterns remain stable across transformer layers with little evidence of hierarchical progression.
- Static embeddings predict neural responses as effectively as contextual embeddings.
- Alignment does not arise from predictive uncertainty measured by surprisal.
- Alignment does not arise from representational geometry measured by intrinsic dimensionality.
Where Pith is reading between the lines
- If lexical-semantic overlap is the main driver, alignment should weaken when low-frequency or abstract words are removed from the stimuli.
- The result raises the possibility that similar alignments could appear between LLMs and brain data in non-linguistic domains if semantic content overlaps.
- Future experiments could control for lexical overlap explicitly to test whether any residual alignment still tracks model depth or task demands.
Load-bearing premise
That surprisal and intrinsic dimensionality are the right metrics to rule out predictive-processing and compression accounts when they fail to match alignment patterns.
What would settle it
A new dataset or model set in which brain prediction accuracy rises and falls in step with layer-wise surprisal or intrinsic dimensionality would contradict the claim that these metrics do not explain the alignment.
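One way to make "rises and falls in step" concrete is to correlate the layer-wise brain-score profile with the layer-wise metric profile. The sketch below does this with a Pearson correlation and a simple permutation p-value. The 12-layer values are invented for illustration, `pearson_r` and `permutation_p` are hypothetical helper names, and shuffling layer order ignores the dependence between adjacent layers, so this is a crude check rather than a definitive test.

```python
import numpy as np

def pearson_r(a, b):
    """Pearson correlation between two equal-length profiles."""
    a = np.asarray(a, float)
    b = np.asarray(b, float)
    a = a - a.mean()
    b = b - b.mean()
    return float((a @ b) / np.sqrt((a @ a) * (b @ b)))

def permutation_p(a, b, n_perm=5000, seed=0):
    """Two-sided p-value: shuffle the layer order of b and count |r| >= observed."""
    rng = np.random.default_rng(seed)
    b = np.asarray(b, float)
    observed = abs(pearson_r(a, b))
    hits = sum(abs(pearson_r(a, rng.permutation(b))) >= observed
               for _ in range(n_perm))
    return (hits + 1) / (n_perm + 1)

# Hypothetical 12-layer profiles, invented purely for illustration
brain_score = np.array([0.050, 0.052, 0.051, 0.053, 0.052, 0.051,
                        0.052, 0.053, 0.051, 0.052, 0.051, 0.052])
surprisal = np.linspace(9.0, 4.0, 12)   # steadily decreasing with depth

r = pearson_r(brain_score, surprisal)
p = permutation_p(brain_score, surprisal)
```

A strong, significant `r` on real data would count against the paper's claim; a weak one, as reported, is consistent with it.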
Original abstract
Large language models (LLMs) reliably predict neural activity during language comprehension and transformer depth has been interpreted as mirroring hierarchical cortical organization. However, it remains unclear whether such alignment extends to subcortical regions, overlaps spatially across languages, and what the computational roots of such alignment are. Here, we used a multilingual, whole-brain encoding framework to examine brain-LLM alignment across three typologically distinct languages: Mandarin, English, and French during naturalistic story listening. Our results show that across languages, transformer-based models predicted activity in a distributed landscape spanning widely distributed cortical functional networks like limbic, ventral attention, default mode network, and subcortical structures. Spatial alignment patterns showed substantial cross-linguistic overlap and remained largely stable across model layers, with limited layer progression consistent with functional cortical hierarchies. Contrary to previous evidence, contextual embeddings did not outperform static embeddings. To test candidate computational explanations, we examined whether layer-wise brain scores reflect surprisal and intrinsic dimensionality, and thereby predictive processing and information compression. Neither of these two computational metrics mirrored neural alignment profiles. Our findings suggest that brain-LLM alignment is spatially robust and cross-linguistically stable but not explainable from predictive uncertainty or representational geometry. Rather than directly reflecting shared hierarchical computation, neural predictivity may primarily arise from distributed lexical-semantic correspondences that generalize across languages.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper uses a multilingual whole-brain encoding framework to study LLM-brain alignment during naturalistic story listening in Mandarin, English, and French. It reports distributed alignment across cortical networks (limbic, ventral attention, default mode) and subcortical structures, with substantial cross-linguistic spatial overlap and stability across model layers. Contextual embeddings do not outperform static ones, and layer-wise brain scores do not track surprisal or intrinsic dimensionality. The authors conclude that alignment is spatially robust and cross-linguistically stable but arises primarily from distributed lexical-semantic correspondences rather than predictive processing or representational geometry.
Significance. If the empirical patterns hold, the work would meaningfully extend prior LLM-brain alignment studies by demonstrating cross-lingual stability including in subcortical regions and by directly testing (and failing to support) two prominent computational accounts. The negative findings on surprisal and dimensionality provide a useful constraint, and the inference toward lexical-semantic correspondences offers a plausible alternative interpretation. The multilingual naturalistic design is a strength for generality.
major comments (2)
- [Abstract and Results (computational metrics subsection)] The claim that 'neither of these two computational metrics mirrored neural alignment profiles' is central to ruling out predictive-processing and compression accounts, yet the manuscript provides no quantitative details on how surprisal and intrinsic dimensionality were computed (e.g., exact formulas, layer selection, aggregation across stimuli), what statistical thresholds defined 'mirroring,' or how data exclusions were handled. This absence prevents evaluation of whether the mismatch is conclusive or artifactual.
- [Discussion] The positive suggestion that 'neural predictivity may primarily arise from distributed lexical-semantic correspondences' is presented as the favored interpretation after falsifying only surprisal and dimensionality. Because the encoding framework itself relies on lexical-semantic features in the embeddings, this risks circularity; a direct test (e.g., ablation of semantic content or comparison to non-semantic controls) would be needed to make the inference load-bearing.
minor comments (2)
- [Figures] Figure legends and axis labels for cross-lingual overlap maps should explicitly state the similarity metric (e.g., Dice coefficient or Pearson r) and correction method used for spatial overlap quantification.
- [Methods] The manuscript should clarify whether the same participants or independent cohorts were used across the three languages, as this affects the interpretation of cross-linguistic stability.
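For reference, the Dice coefficient named in the first minor comment can be computed from two binarized maps as follows. This is a minimal sketch: the helper name `dice` and the toy masks are assumptions, and it says nothing about which metric the paper actually used.

```python
import numpy as np

def dice(mask_a, mask_b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary maps of equal shape."""
    a = np.asarray(mask_a, bool)
    b = np.asarray(mask_b, bool)
    total = a.sum() + b.sum()
    if total == 0:
        return float("nan")   # undefined when both maps are empty
    return float(2.0 * np.logical_and(a, b).sum() / total)

# Toy example: significance maps over four regions for two languages
lang_1 = [1, 1, 0, 0]
lang_2 = [1, 0, 1, 0]
overlap = dice(lang_1, lang_2)   # one shared region out of two per map -> 0.5
```

Because Dice depends on the threshold used to binarize each map, reporting the threshold and correction method alongside it is what makes the overlap value interpretable.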
Simulated Author's Rebuttal
We thank the referee for their constructive and insightful comments, which have helped us strengthen the methodological transparency and interpretive rigor of the manuscript. We address each major comment point by point below and indicate the revisions made.
- Referee: Abstract and Results (computational metrics subsection): The claim that 'neither of these two computational metrics mirrored neural alignment profiles' is central to ruling out predictive-processing and compression accounts, yet the manuscript provides no quantitative details on how surprisal and intrinsic dimensionality were computed (e.g., exact formulas, layer selection, aggregation across stimuli), what statistical thresholds defined 'mirroring,' or how data exclusions were handled. This absence prevents evaluation of whether the mismatch is conclusive or artifactual.
  Authors: We agree that additional quantitative details are necessary for full evaluation. In the revised manuscript, we have added a dedicated 'Computational Metrics' subsection to the Methods. Surprisal was computed as the negative log probability of each token given prior context from the model's output distribution, averaged per story and layer. Intrinsic dimensionality was estimated via the Levina-Bickel maximum likelihood estimator on per-layer embedding matrices. Metrics were aggregated by averaging across the three languages' stimuli. 'Mirroring' was quantified via Pearson correlation between layer-wise brain scores and metric values, with significance at p < 0.05 after FDR correction across layers. No data exclusions occurred beyond standard fMRI preprocessing. These clarifications make the negative findings easier to evaluate.
  Revision: yes
- Referee: Discussion: The positive suggestion that 'neural predictivity may primarily arise from distributed lexical-semantic correspondences' is presented as the favored interpretation after falsifying only surprisal and dimensionality. Because the encoding framework itself relies on lexical-semantic features in the embeddings, this risks circularity; a direct test (e.g., ablation of semantic content or comparison to non-semantic controls) would be needed to make the inference load-bearing.
  Authors: We appreciate the concern regarding circularity. Our inference rests on the observed empirical patterns rather than the framework in isolation: contextual embeddings showed no advantage over static ones (which primarily encode lexical-semantic information), and alignment lacked the layer-wise progression expected under predictive or hierarchical accounts. We have revised the Discussion to explicitly trace this logic from the negative results on surprisal and dimensionality to the lexical-semantic interpretation, while noting the static-vs-contextual comparison as an internal control. Although a dedicated ablation study would be a valuable extension, it falls outside the current scope; we have added this as a limitation and future direction.
  Revision: partial
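The quantities named in the first response above (token surprisal, the Levina-Bickel maximum likelihood estimator, and FDR correction) can be sketched as below. This is an illustrative reading of that description, not the authors' code: the function names, the choice of base-2 logarithms, the neighborhood size `k=10`, and the Benjamini-Hochberg procedure as the FDR method are all assumptions.

```python
import numpy as np

def token_surprisal(probs):
    """Surprisal in bits: -log2 p(token | context). Base 2 is an assumption."""
    return -np.log2(np.asarray(probs, float))

def mle_intrinsic_dim(X, k=10):
    """Levina-Bickel estimate, averaged over points.

    For each point, m_k = (k - 1) / sum_{j<k} log(T_k / T_j), where T_j is the
    distance to its j-th nearest neighbour. Assumes no duplicate rows in X.
    """
    X = np.asarray(X, float)
    sq = (X ** 2).sum(1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # pairwise squared distances
    np.maximum(d2, 0.0, out=d2)                      # clip tiny negative round-off
    np.fill_diagonal(d2, 0.0)
    dist = np.sqrt(d2)
    dist.sort(axis=1)                                # column 0 is the self-distance
    T = dist[:, 1:k + 1]                             # k nearest-neighbour distances
    log_ratios = np.log(T[:, -1:] / T[:, :-1])       # log(T_k / T_j), j = 1..k-1
    return float(((k - 1) / log_ratios.sum(axis=1)).mean())

def benjamini_hochberg(pvals, q=0.05):
    """Boolean mask of hypotheses rejected at FDR level q (Benjamini-Hochberg)."""
    p = np.asarray(pvals, float)
    n = p.size
    order = np.argsort(p)
    passed = p[order] <= q * np.arange(1, n + 1) / n
    reject = np.zeros(n, bool)
    if passed.any():
        cutoff = np.nonzero(passed)[0].max()         # largest rank meeting its threshold
        reject[order[:cutoff + 1]] = True
    return reject

# Toy check: points on a 2-D plane embedded in 10 dimensions
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 2)) @ rng.standard_normal((2, 10))
est_dim = mle_intrinsic_dim(X)
```

On the toy data the estimate should land near 2, the true dimension of the plane, although this form of the estimator is known to be biased slightly upward for small `k`.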
Circularity Check
No significant circularity detected
The paper's derivation consists of empirical encoding results across languages showing distributed alignment patterns that are stable and overlapping, followed by direct comparisons showing that layer-wise brain scores do not track surprisal or intrinsic dimensionality. These mismatches are reported as observations rather than predictions forced by the model fits themselves. The inference toward lexical-semantic correspondences is offered as an interpretive suggestion, not a self-definitional or fitted-input reduction. No load-bearing step reduces by construction to prior inputs via self-citation chains, uniqueness theorems, or ansatz smuggling. The central claims remain independent of the tested metrics and rest on observable spatial and cross-lingual patterns.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Naturalistic story listening elicits reliable BOLD responses that can be linearly predicted from LLM embeddings
- domain assumption: Surprisal and intrinsic dimensionality are valid proxies for predictive processing and information compression
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "Neither of these two computational metrics [surprisal, intrinsic dimensionality] mirrored neural alignment profiles. ... neural predictivity may primarily arise from distributed lexical-semantic correspondences"
- IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · J_uniquely_calibrated_via_higher_derivative · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "layer-wise brain scores reflect surprisal and intrinsic dimensionality"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.