pith. machine review for the scientific record. sign in

arxiv: 1905.06316 · v1 · submitted 2019-05-15 · 💻 cs.CL

Recognition: unknown

What do you learn from context? Probing for sentence structure in contextualized word representations

Authors on Pith no claims yet
classification 💻 cs.CL
keywords modelsprobingrepresentationstaskscontextualizedphenomenarecentsemantic
0
0 comments X
read the original abstract

Contextualized representation models such as ELMo (Peters et al., 2018a) and BERT (Devlin et al., 2018) have recently achieved state-of-the-art results on a diverse array of downstream NLP tasks. Building on recent token-level probing work, we introduce a novel edge probing task design and construct a broad suite of sub-sentence tasks derived from the traditional structured NLP pipeline. We probe word-level contextual representations from four recent models and investigate how they encode sentence structure across a range of syntactic, semantic, local, and long-range phenomena. We find that existing models trained on language modeling and translation produce strong representations for syntactic phenomena, but only offer comparably small improvements on semantic tasks over a non-contextual baseline.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. REALISTA: Realistic Latent Adversarial Attacks that Elicit LLM Hallucinations

    cs.CL 2026-05 unverdicted novelty 8.0

    REALISTA optimizes continuous combinations of valid editing directions in latent space to produce realistic adversarial prompts that elicit hallucinations more effectively than prior methods, including on large reason...

  2. On the Blessing of Pre-training in Weak-to-Strong Generalization

    cs.LG 2026-05 unverdicted novelty 6.0

    Pre-training provides a geometric warm start in a single-index model that enables weak-to-strong generalization up to a supervisor-limited bound, with empirical phase-transition evidence in LLMs.

  3. Do BERT Embeddings Encode Narrative Dimensions? A Token-Level Probing Analysis of Time, Space, Causality, and Character in Fiction

    cs.CL 2026-04 unverdicted novelty 5.0

    BERT embeddings encode narrative dimensions of time, space, causality, and character at the token level, as a linear probe achieves 94% accuracy versus 47% on variance-matched random embeddings, though unsupervised cl...