Computational Linguistics
2 Pith papers cite this work.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Emergent Semantic Role Understanding in Language Models
  Semantic role understanding partially emerges during language model pre-training. Linear probes on frozen representations achieve substantial performance that improves with scale but does not match fine-tuned models, and representations shift toward more distributed forms at larger scales.
- MTA: Multi-Granular Trajectory Alignment for Large Language Model Distillation
  MTA is a distillation method that aligns teacher and student LLM representations along their transformation trajectories, using layer-adaptive granularities and dynamic structural plus hidden representation alignment losses.