Independent-Component-Based Encoding Models of Brain Activity During Story Comprehension
Pith reviewed 2026-05-08 03:32 UTC · model grok-4.3
The pith
Independent-component encoding models predict functional brain network activity from language model features during story listening.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We decompose continuous fMRI data from naturalistic story listening into ICs using one subset of the data, and train encoding models on independent data to predict IC time series from large language model representations of linguistic input. Across subjects, a subset of ICs exhibited consistently high predictivity. These ICs were spatially and temporally consistent across subjects and included cognitive networks known to respond during story listening (auditory and language). Auditory component time series were strongly correlated with acoustic stimulus features. Components identified as noise or motion-related artifacts showed uniformly poor predictive performance, confirming that highly predicted components reflect genuine stimulus-related neural signals rather than confounds.
What carries the argument
Independent component (IC)-based encoding framework that decomposes fMRI into ICs on one data subset and predicts their time series from LLM features on held-out data.
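The encoding step of this framework can be illustrated with a minimal sketch. It assumes a closed-form ridge regression (the abstract does not name the regression method), uses synthetic data in place of fMRI components and language model features, and skips the ICA step by treating the component time series as given. All names (`fit_ridge`, `pearson_per_column`, `alpha`) are hypothetical.

```python
import numpy as np

def fit_ridge(X, Y, alpha):
    """Closed-form ridge weights: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)

def pearson_per_column(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    denom = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=0)
    return (A * B).sum(axis=0) / denom

rng = np.random.default_rng(0)
n_train, n_test, n_features = 400, 200, 20
n_total = n_train + n_test

# Synthetic stand-in for language model features, one row per time point.
X = rng.standard_normal((n_total, n_features))
true_w = rng.standard_normal(n_features)

# Column 0 mimics a stimulus-driven component, column 1 a pure-noise component.
signal_ic = X @ true_w + rng.standard_normal(n_total)
noise_ic = rng.standard_normal(n_total)
Y = np.column_stack([signal_ic, noise_ic])

# Fit on the training time points only, then score on held-out time points.
W = fit_ridge(X[:n_train], Y[:n_train], alpha=10.0)
r = pearson_per_column(X[n_train:] @ W, Y[n_train:])
print("held-out predictivity (signal, noise):", r)
```

In this toy setting the stimulus-driven column is predicted well on held-out time points while the noise column is not, which is the pattern the framework relies on to separate the two.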
Load-bearing premise
The assumption that independent components derived from one part of the fMRI dataset capture real stimulus-related brain signals that can be accurately predicted by language model features in the other part, separate from confounds like motion or scanner noise.
What would settle it
Finding that the time series of the highly predictable independent components do not correlate with any measurable features of the auditory stories, such as sound amplitude or word meanings, would indicate they do not reflect stimulus-driven activity.
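One way such a check could be run is sketched below. It correlates a component time series with an acoustic feature and compares the result against a null distribution built from circular shifts of the feature. The data are synthetic, and the circular-shift null, the function names, and the `min_shift` parameter are assumptions for illustration, not the paper's procedure.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation between two 1-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def circular_shift_null(ic, feature, n_shifts, rng, min_shift=20):
    """Correlations after circularly shifting the feature in time.

    Shifting breaks the alignment with the stimulus while keeping the
    feature's own temporal structure.
    """
    shifts = rng.integers(min_shift, len(feature) - min_shift, size=n_shifts)
    return np.array([pearson(ic, np.roll(feature, int(s))) for s in shifts])

rng = np.random.default_rng(1)
n_timepoints = 500

# Synthetic stand-ins: a sound-amplitude envelope and a component driven by it.
envelope = np.abs(rng.standard_normal(n_timepoints))
ic_timeseries = 0.8 * envelope + 0.5 * rng.standard_normal(n_timepoints)

observed = pearson(ic_timeseries, envelope)
null = circular_shift_null(ic_timeseries, envelope, n_shifts=200, rng=rng)

# Two-sided p-value with the usual +1 correction for a finite null sample.
p_value = (1 + np.sum(np.abs(null) >= abs(observed))) / (1 + len(null))
print(f"observed r = {observed:.3f}, p = {p_value:.4f}")
```

If a highly predictable component produced an observed correlation indistinguishable from the shifted null for every stimulus feature tested, that would be the kind of evidence described above.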
Original abstract
Encoding models provide a powerful framework for linking continuous stimulus features to neural activity; however, traditional voxelwise approaches are limited by measurement noise, inter-subject variability, and redundancy arising from spatially correlated voxels encoding overlapping neural signals. Here, we propose an independent component (IC)-based encoding framework that dissociates stimulus-driven and noise-driven signals in fMRI data. We decompose continuous fMRI data from naturalistic story listening into ICs using one subset of the data, and train encoding models on independent data to predict IC time series from large language model representations of linguistic input. Across subjects, a subset of ICs exhibited consistently high predictivity. These ICs were spatially and temporally consistent across subjects and included cognitive networks known to respond during story listening (auditory and language). Auditory component time series were strongly correlated with acoustic stimulus features, highlighting the interpretability of identified component time series. Components identified as noise or motion-related artifacts by ICA-AROMA showed uniformly poor predictive performance, confirming that highly predicted components reflect genuine stimulus-related neural signals rather than confounds. Overall, IC-based encoding models enable analyses at the level of functional networks, accommodating the variability in network locations across individuals and providing interpretable results that are easy to compare across subjects.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents an independent component (IC)-based encoding model for fMRI data during story comprehension. fMRI time series are decomposed into ICs using ICA on one data subset. Encoding models are trained on independent data to predict the time series of these ICs from large language model (LLM) representations of the story stimulus. The authors find that certain ICs, corresponding to auditory and language networks, show high predictivity and consistency across subjects, while noise and motion components identified by ICA-AROMA show poor predictivity. This framework is claimed to enable analyses at the functional network level, accommodating inter-subject variability in network locations and yielding interpretable, comparable results across subjects.
Significance. Should the quantitative results bear out the qualitative descriptions, the work would offer a useful methodological advance for linking linguistic features to brain activity. Traditional voxel-wise encoding models suffer from noise and inter-subject alignment issues; the IC approach mitigates these by operating on data-driven networks. The explicit separation of ICA decomposition and encoding training, combined with the AROMA-based validation that noise components are not predictable, provides a strong internal control against artifactual findings. This could improve the biological interpretability of encoding models and facilitate cross-subject comparisons without requiring spatial normalization of individual networks.
major comments (2)
- Abstract: The abstract reports that 'a subset of ICs exhibited consistently high predictivity' and 'were spatially and temporally consistent across subjects' but supplies no numerical values, error bars, statistical tests, or effect sizes. These metrics are load-bearing for the central claims of improved interpretability and cross-subject comparability; the results section must report mean predictivity correlations (with SE), spatial/temporal consistency measures (e.g., Dice overlap or ICC), and direct statistical comparisons between signal and noise ICs.
- Methods: Although data splitting is used to separate ICA decomposition from encoding-model training and prediction, the manuscript does not specify the exact fraction of data allocated to each stage, the number of runs or time points per stage, or the cross-validation procedure. These details are required to confirm that the reported differential predictivity between stimulus-driven and AROMA noise components is not an artifact of the split.
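The metrics requested in the first major comment are simple to compute. The sketch below shows a Dice overlap between two binary component masks and a mean with standard error across subjects. The masks and correlation values are made-up illustrations, and representing masks as sets of voxel indices in a shared space is an assumption.

```python
import math

def dice_overlap(mask_a, mask_b):
    """Dice coefficient between two binary masks given as sets of voxel indices."""
    if not mask_a and not mask_b:
        return 1.0  # two empty masks are treated as identical
    return 2 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))

def mean_and_se(values):
    """Sample mean and standard error of the mean (needs at least two values)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)

# Hypothetical supra-threshold voxels of one component in two subjects.
subject_a = {1, 2, 3, 4, 5, 6}
subject_b = {4, 5, 6, 7, 8, 9}
dice = dice_overlap(subject_a, subject_b)

# Hypothetical per-subject held-out correlations for one component.
per_subject_r = [0.30, 0.40, 0.50]
mean_r, se_r = mean_and_se(per_subject_r)

print(f"Dice = {dice:.2f}, predictivity = {mean_r:.2f} +/- {se_r:.3f}")
```

Correlations are sometimes Fisher z-transformed before averaging; the plain mean is used here only to keep the example short.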
minor comments (2)
- Abstract: Specify the particular LLM (e.g., GPT-2, LLaMA) and the exact feature extraction (layer, pooling) used to generate linguistic representations, as this choice directly affects the encoding-model results and replicability.
- Results: Consider adding a supplementary table listing per-subject or average predictivity values for the auditory/language ICs versus AROMA components to make the 'uniformly poor' claim for noise components quantitatively verifiable.
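A direct comparison between signal and noise components, as suggested above, could take the form of a permutation test on the predictivity values. The sketch below is one possible version. The values are invented for illustration, and the one-sided label-shuffling test is an assumption rather than anything the manuscript reports.

```python
import random

def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """One-sided permutation test that mean(group_a) exceeds mean(group_b)."""
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    pooled = list(group_a) + list(group_b)
    count = 0
    for _ in range(n_permutations):
        # Shuffle the pooled values and re-split them into two groups.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if diff >= observed:
            count += 1
    # +1 correction keeps the p-value strictly positive.
    return observed, (count + 1) / (n_permutations + 1)

# Hypothetical held-out correlations for signal and AROMA-flagged components.
signal_r = [0.42, 0.38, 0.45, 0.40, 0.36]
noise_r = [0.02, -0.01, 0.05, 0.03, 0.00]

observed, p_value = permutation_test(signal_r, noise_r)
print(f"mean difference = {observed:.3f}, p = {p_value:.4f}")
```

Reporting the observed difference together with the p-value would make the "uniformly poor" claim for noise components checkable from a table.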
Simulated Author's Rebuttal
We thank the referee for their constructive feedback and positive assessment of the potential methodological contribution of our work. We address each major comment below and will revise the manuscript accordingly to improve quantitative reporting and methodological transparency.
- Referee: Abstract: The abstract reports that 'a subset of ICs exhibited consistently high predictivity' and 'were spatially and temporally consistent across subjects' but supplies no numerical values, error bars, statistical tests, or effect sizes. These metrics are load-bearing for the central claims of improved interpretability and cross-subject comparability; the results section must report mean predictivity correlations (with SE), spatial/temporal consistency measures (e.g., Dice overlap or ICC), and direct statistical comparisons between signal and noise ICs.
  Authors: We agree that the abstract should include key quantitative metrics to support the central claims. In the revised manuscript we will update the abstract to report mean predictivity correlations with standard errors, spatial and temporal consistency measures (including Dice overlap or ICC), and direct statistical comparisons between signal and noise ICs. We will also review the results section to ensure all requested metrics, error bars, and comparisons are explicitly presented with appropriate statistical tests.
  Revision: yes
- Referee: Methods: Although data splitting is used to separate ICA decomposition from encoding-model training and prediction, the manuscript does not specify the exact fraction of data allocated to each stage, the number of runs or time points per stage, or the cross-validation procedure. These details are required to confirm that the reported differential predictivity between stimulus-driven and AROMA noise components is not an artifact of the split.
  Authors: We thank the referee for highlighting the need for greater precision on the data-splitting protocol. In the revised manuscript we will explicitly state the fraction of data allocated to ICA decomposition versus encoding-model training, the number of runs and time points per stage, and the full cross-validation procedure. These additions will allow readers to verify that the differential predictivity between stimulus-driven and noise components is not an artifact of the split.
  Revision: yes
Circularity Check
No significant circularity detected
The paper's core derivation relies on an explicit train/test split: ICA decomposition is performed on one data subset to extract independent components, after which encoding models are trained and evaluated on fully held-out data to predict component time series from LLM features. This separation prevents any reported predictivity from reducing to a fitted parameter by construction. Internal validation via AROMA noise components (showing uniformly low predictivity) and cross-subject consistency checks further confirm that selected components reflect stimulus-driven signals rather than artifacts or tautological fits. No load-bearing self-citations, uniqueness theorems imported from prior author work, or ansatzes smuggled via citation are present in the described methodology. The framework does not rename known empirical patterns as novel derivations but instead applies a standard ICA-plus-encoding pipeline with data partitioning to accommodate inter-subject variability.
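The separation described here can be made concrete with a small sketch of a run-level split. It assumes the data are organized as named runs, that whole runs are assigned to either the ICA stage or the encoding stage, and that the encoding stage uses leave-one-run-out cross-validation. The run names, the 2-versus-4 allocation, and the cross-validation scheme are all hypothetical, since the manuscript's actual split is not specified.

```python
def split_runs(run_ids, n_ica_runs):
    """Assign the first n_ica_runs runs to ICA and the remainder to encoding."""
    if not 0 < n_ica_runs < len(run_ids):
        raise ValueError("both stages need at least one run")
    return list(run_ids[:n_ica_runs]), list(run_ids[n_ica_runs:])

def leave_one_run_out(encoding_runs):
    """Yield (train_runs, test_run) pairs over the encoding runs."""
    for i, test_run in enumerate(encoding_runs):
        train_runs = encoding_runs[:i] + encoding_runs[i + 1:]
        yield train_runs, test_run

# Hypothetical run labels for one subject.
runs = ["story01", "story02", "story03", "story04", "story05", "story06"]

ica_runs, encoding_runs = split_runs(runs, n_ica_runs=2)
folds = list(leave_one_run_out(encoding_runs))

# No run may serve both the decomposition and the encoding stage.
assert set(ica_runs).isdisjoint(encoding_runs)

for train_runs, test_run in folds:
    print("train:", train_runs, "test:", test_run)
```

Splitting by whole runs rather than by individual time points avoids leakage through the temporal autocorrelation of fMRI signals within a run.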
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: ICA can dissociate stimulus-driven neural signals from noise and motion artifacts in continuous fMRI data during naturalistic story listening.
- domain assumption: LLM representations of linguistic input contain information sufficient to predict time series of stimulus-related independent components on held-out data.