Sparse autoencoders can capture language-specific concepts across diverse languages
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
background 1
citation-polarity summary
fields
cs.CL 2
years
2026 2
verdicts
UNVERDICTED 2
roles
background 1
polarities
unclear 1
representative citing papers
The survey organizes mechanistic interpretability techniques into a Locate-Steer-Improve framework to enable actionable improvements in LLM alignment, capability, and efficiency.
citing papers explorer
- Multilingual Language Models Encode Script Over Linguistic Structure
  Multilingual LMs encode script over linguistic structure, with orthography shaping units more than word order or typology, and abstraction emerging gradually in deeper layers.
- Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models
  The survey organizes mechanistic interpretability techniques into a Locate-Steer-Improve framework to enable actionable improvements in LLM alignment, capability, and efficiency.