The reasoning-memorization interplay in language models is mediated by a single direction
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Multilingual Idioms in Sentences and Conversations Across High-, Medium-, and Low-Resource Languages
  MIDI is a new multilingual idiom dataset with sentence and conversational contexts; benchmarking reveals worse performance in low-resource languages and on literal vs. figurative uses.
- Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models
  The survey organizes mechanistic interpretability techniques into a Locate-Steer-Improve framework to enable actionable improvements in LLM alignment, capability, and efficiency.