Memorisation versus Generalisation in Pre-trained Language Models
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: unverdicted (2)
citing papers explorer
- Sparse Autoencoders Map Brain-LLM Alignment onto Cortical Semantic Topography
  Sparse autoencoders applied to GPT-2 and Llama models recover semantic features accounting for 94% of peak brain encoding performance and map onto distinct cortical semantic regions across three languages.
- Analyzing the Effect of Noise in LLM Fine-tuning
  Label noise hurts fine-tuning performance most while grammatical and typographical noise sometimes act as mild regularizers, with changes concentrated in task-specific layers.