Llama See, Llama Do: A Mechanistic Perspective on Contextual Entrainment and Distraction in LLMs
3 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (3)
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
citing papers explorer
- Sentence-Level Contextual Entrainment in Large Language Models
  Sentence-level contextual entrainment exists across LLMs, weakens with scale, and is localized to 2-4% of attention heads whose deactivation removes the effect without performance loss.
- When Context Misleads: Surprisal, Energy and Attention Entropy as Metrics of Coherence Illusions in LLMs
  Dutch LLMs display coherence illusions tracked by surprisal, with attention entropy identifying affected heads and a new energy metric quantifying discourse coherence.
- Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models
  The survey organizes mechanistic interpretability techniques into a Locate-Steer-Improve framework to enable actionable improvements in LLM alignment, capability, and efficiency.