eLife
6 Pith papers cite this work.
citing papers explorer
- Capability $\neq$ Interpretability: Human Interpretability of Vision Foundation Models
  Foundation models yield less human-interpretable features than supervised vision transformers, with interpretability tied to activation locality and coarse semantic alignment rather than task performance.
- NeuralBench: A Unifying Framework to Benchmark NeuroAI Models
  NeuralBench is a new benchmarking framework for neuroAI models on EEG data that finds foundation models only marginally outperform task-specific ones while many tasks like cognitive decoding stay highly challenging.
- Sparse Autoencoders Map Brain-LLM Alignment onto Cortical Semantic Topography
  Sparse autoencoders applied to GPT-2 and Llama models recover semantic features accounting for 94% of peak brain encoding performance and map onto distinct cortical semantic regions across three languages.
- Universal scale-free representations in human visual cortex
  fMRI responses to natural scenes in human visual cortex exhibit a consistent scale-free structure with power-law decaying variance across four orders of magnitude of dimensions, shared across individuals via hyperalignment.
- Overcoming Output Dimension Collapse: When Sparsity Enables Zero-shot Brain-to-Image Reconstruction at Small Data Scales
  Mathematical analysis shows sparse linear regression mitigates output dimension collapse in brain-to-image reconstruction at small data scales by exploiting sparsity in the brain-to-feature mapping.
- Shared representations in brains and models reveal a two-route cortical organization during scene perception
  RSA on 7T fMRI during natural scene viewing identifies ventromedial and lateral occipitotemporal representational routes for scene context versus animate content, with differential alignment to vision and language models.