
arxiv: 1906.10002 · v1 · pith:XK3LHDAWnew · submitted 2019-06-24 · 💻 cs.CL · cs.AI

LIAAD at SemDeep-5 Challenge: Word-in-Context (WiC)

Pith reviewed 2026-05-25 17:28 UTC · model grok-4.3

keywords: word sense disambiguation, word-in-context, contextual embeddings, sense embeddings, SemDeep challenge, natural language processing

The pith

A word sense disambiguation system using contextual embeddings adapts directly to word-in-context detection and reaches competitive results without task training data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a system that placed second in the SemDeep-5 Word-in-Context challenge. It starts from a word sense disambiguation method that produces sense embeddings from contextual embeddings across a full inventory of senses. This system is then adapted in a straightforward manner to decide whether a target word carries the same sense in two given sentences. The resulting approach achieves competitive performance even when the challenge's training and development sets are not used at all.

Core claim

Our solution is based on a novel system for Word Sense Disambiguation using contextual embeddings and full-inventory sense embeddings. We adapt this WSD system, in a straightforward manner, for the present task of detecting whether the same sense occurs in a pair of sentences. Additionally, we show that our solution is able to achieve competitive performance even without using the provided training or development sets, mitigating potential concerns related to task overfitting.

What carries the argument

The novel WSD system based on contextual embeddings and full-inventory sense embeddings, adapted to decide whether a target word shares the same sense across a sentence pair.
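
To make the mechanism concrete, here is a minimal sketch of nearest-neighbour sense matching followed by a same-sense decision. It is an illustration under stated assumptions, not the authors' implementation: the 3-dimensional vectors, the sense labels (`bank%finance`, `bank%river`), and the function names `nearest_sense` and `same_sense` are all hypothetical, and a real system would obtain the context vectors from a contextual language model and the sense vectors from a full sense inventory.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_sense(context_vec, sense_vecs):
    """1-NN lookup: return the label of the sense vector closest to context_vec."""
    return max(sense_vecs, key=lambda label: cosine(context_vec, sense_vecs[label]))

def same_sense(ctx_a, ctx_b, sense_vecs):
    """Binary decision: True if both contexts map to the same sense label."""
    return nearest_sense(ctx_a, sense_vecs) == nearest_sense(ctx_b, sense_vecs)

# Hypothetical toy data: sense vectors live in the same space as context vectors.
SENSES = {
    "bank%finance": [1.0, 0.1, 0.0],
    "bank%river": [0.0, 0.2, 1.0],
}
ctx_deposit = [0.9, 0.2, 0.1]  # "bank" in a sentence about deposits
ctx_loan = [0.8, 0.0, 0.2]     # "bank" in a sentence about loans
ctx_shore = [0.1, 0.3, 0.9]    # "bank" in a sentence about a river shore

print(same_sense(ctx_deposit, ctx_loan, SENSES))   # True
print(same_sense(ctx_deposit, ctx_shore, SENSES))  # False
```

Nothing in this sketch is fitted to sentence-pair labels, which is the property that lets the decision be made without the task's training or development sets.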

If this is right

  • The WSD system can be repurposed for the WiC task without any task-specific training or fine-tuning.
  • Performance on the challenge remains competitive while avoiding reliance on the supplied training and development sets.
  • Concerns about overfitting to the particular WiC dataset are reduced because the core components are drawn from general WSD resources.
  • Sense distinctions captured by the embeddings transfer to the binary same-sense decision required by WiC.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same embedding-based sense representations could be tested on other binary or multi-way sense comparison tasks without new labeled data.
  • If the approach generalizes, it would reduce the need for large task-specific annotated sets in semantic evaluation benchmarks.
  • Direct comparison of sense embeddings from different sentences offers a parameter-light alternative to models trained end-to-end on WiC.
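
One possible reading of the last bullet is sketched below: skip the sense lookup and compare the two vectors directly, calling the pair "same sense" when their cosine similarity clears a cutoff. This is an assumption made for illustration only. The function name `same_usage`, the toy vectors, and the threshold value of 0.7 are hypothetical, and the vectors could equally be contextual embeddings or the sense embeddings matched to each sentence.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def same_usage(vec_a, vec_b, threshold=0.7):
    """Return True when the two vectors are at least `threshold` similar.

    The threshold is the only free parameter; 0.7 is an arbitrary
    illustrative value, not one taken from the paper.
    """
    return cosine(vec_a, vec_b) >= threshold

# Hypothetical toy vectors for a target word in three different sentences.
vec_a = [0.9, 0.2, 0.1]
vec_b = [0.8, 0.0, 0.2]
vec_c = [0.1, 0.3, 0.9]

print(same_usage(vec_a, vec_b))  # True  (cosine is about 0.97)
print(same_usage(vec_a, vec_c))  # False (cosine is about 0.27)
```

The trade-off is that the cutoff has to come from somewhere: choosing it on labeled pairs reintroduces a small dependence on task data, whereas the sense-lookup route needs no such parameter.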

Load-bearing premise

The novel WSD system based on contextual embeddings and full-inventory sense embeddings can be adapted in a straightforward manner to detect whether the same sense occurs in a pair of sentences.

What would settle it

Running the same adapted system on the official WiC test set and finding that its accuracy falls substantially below the top entries that do use the provided training data.

Figures

Figures reproduced from arXiv: 1906.10002 by Alipio Jorge, Daniel Loureiro.

Figure 1: Illustration of our k-NN approach for WSD, which relies on full-coverage sense embeddings represented in the same space as contextualized embeddings.
Adjacent text (Section 2.3, Binary Classification): The WiC task calls for a binary judgement on whether the meaning of a target word occurring in a pair of sentences is the same or not. As such, our most immediate solution is to perform WSD and base our decision on the resulting sen…
Figure 2: Components and interactions involved in our approaches. The sim
Figure 3: Distribution of Prediction Probabilities
Figure 4: ROC curve for results of our best model on
Original abstract

This paper describes the LIAAD system that was ranked second place in the Word-in-Context challenge (WiC) featured in SemDeep-5. Our solution is based on a novel system for Word Sense Disambiguation (WSD) using contextual embeddings and full-inventory sense embeddings. We adapt this WSD system, in a straightforward manner, for the present task of detecting whether the same sense occurs in a pair of sentences. Additionally, we show that our solution is able to achieve competitive performance even without using the provided training or development sets, mitigating potential concerns related to task overfitting.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper describes the LIAAD system that ranked second in the SemDeep-5 Word-in-Context (WiC) challenge. The approach adapts a novel Word Sense Disambiguation (WSD) system based on contextual embeddings and full-inventory sense embeddings to detect whether the same sense occurs in a pair of sentences. The authors emphasize that competitive performance is achieved without using the provided training or development sets.

Significance. If the performance claim holds, the work is significant for showing that a pre-trained WSD pipeline can be directly adapted to WiC in a zero-shot manner. This provides a concrete example of mitigating task overfitting concerns in lexical semantics and demonstrates the practical utility of full-inventory sense embeddings.

major comments (1)
  1. Abstract: The abstract supplies no quantitative results, error analysis, or derivation; the central performance claim cannot be verified from the given text.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the positive evaluation of our work's significance in demonstrating zero-shot adaptation of a WSD system to the WiC task. We address the single major comment below.

Point-by-point responses
  1. Referee: Abstract: The abstract supplies no quantitative results, error analysis, or derivation; the central performance claim cannot be verified from the given text.

    Authors: We agree that the abstract would be strengthened by including key quantitative results to allow verification of the performance claim. The submitted abstract emphasized the zero-shot nature of the approach but omitted specific metrics. In the revised version, we will add the ranking (second place in the SemDeep-5 WiC challenge) and the corresponding test-set accuracy. Error analysis and derivations are presented in the body of the paper, consistent with typical abstract length constraints.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The manuscript is a competition system report describing a zero-shot adaptation of a pre-existing WSD pipeline to the WiC task. No equations, fitted parameters, or derivation chain appear in the provided text. The central performance claim rests on empirical submission results rather than any self-referential mapping, uniqueness theorem, or renamed empirical pattern. The adaptation is presented as direct and task-independent, with no load-bearing self-citation or construction that reduces the reported outcome to its own inputs by definition.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no free parameters, axioms, or invented entities; ledger left empty.


