Language Modelling Makes Sense: Propagating Representations through WordNet for Full-Coverage Word Sense Disambiguation

Alipio Jorge; Daniel Loureiro

arxiv: 1906.10007 · v1 · pith:ZAAS6I2Znew · submitted 2019-06-24 · 💻 cs.CL

Language Modelling Makes Sense: Propagating Representations through WordNet for Full-Coverage Word Sense Disambiguation

Daniel Loureiro , Alipio Jorge This is my paper

Pith reviewed 2026-05-25 17:25 UTC · model grok-4.3

classification 💻 cs.CL

keywords word sense disambiguationcontextual embeddingsWordNetneural language modelsnearest neighborssense embeddingspolysemy

0 comments

The pith

Propagating contextual embeddings through WordNet produces sense-level vectors that let a simple nearest-neighbor method outperform neural sequence models on word sense disambiguation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates that contextual embeddings learned by neural language models can be turned into sense-specific representations covering every entry in WordNet. Propagation along WordNet relations achieves this coverage without using sense-frequency statistics or any task-specific training. Once obtained, these sense vectors allow a basic k-NN classifier to exceed the accuracy of earlier systems built on powerful neural sequencing architectures. The approach also supports direct examination of how contextual embeddings encode conceptual distinctions at the sense level.

Core claim

Contextual embeddings from neural language models can be propagated through WordNet relations to produce sense-level embeddings with full coverage of the sense inventory. These embeddings require no explicit knowledge of sense distributions and no task-specific modelling. As a result a simple k-NN method using them consistently surpasses the performance of previous systems that employ powerful neural sequencing models.

What carries the argument

Propagation of contextual embeddings through WordNet relations to generate sense-level vectors

If this is right

A k-NN classifier on the sense embeddings outperforms previous neural WSD systems.
The method remains effective when part-of-speech and lemma features are ignored.
Full-inventory disambiguation is possible without recourse to sense-frequency data.
The resulting sense embeddings enable concept-level analyses of contextual embeddings and their source language models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The propagation technique could be applied to other lexical knowledge bases to test whether similar sense coverage emerges.
Comparing the sense vectors against human sense similarity judgments would reveal how faithfully the propagation preserves semantic distance.
Downstream tasks requiring fine-grained meaning, such as semantic role labeling, could benefit from substituting these vectors for raw contextual embeddings.

Load-bearing premise

Propagating contextual embeddings through WordNet relations produces accurate sense-level vectors that preserve the distinctions needed for disambiguation without any sense-frequency information or task-specific training.

What would settle it

A nearest-neighbor classifier using the propagated sense embeddings failing to exceed the accuracy of prior neural WSD systems on standard benchmarks such as SemEval would falsify the central claim.

Figures

Figures reproduced from arXiv: 1906.10007 by Alipio Jorge, Daniel Loureiro.

**Figure 1.** Figure 1: Illustration of our k-NN approach for WSD, which relies on full-coverage sense embeddings represented in the same space as contextualized embeddings. For simplification, we label senses as synsets. Grey nodes belong to different lemmas (see §5.3). Our WSD approach is strictly based on k-NN (see [PITH_FULL_IMAGE:figures/full_fig_p004_1.png] view at source ↗

**Figure 2.** Figure 2: Performance gains with LMMS2348 when accepting additional neighbors as valid predictions. 5.2 Part-of-Speech Mismatches The solution we introduced in §4.4 addressed missing lemmas, but we didn’t propose a solution that addressed missing POS information. Indeed, the confusion matrix in [PITH_FULL_IMAGE:figures/full_fig_p007_2.png] view at source ↗

**Figure 3.** Figure 3: Examples of gender bias found in the sense [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗

read the original abstract

Contextual embeddings represent a new generation of semantic representations learned from Neural Language Modelling (NLM) that addresses the issue of meaning conflation hampering traditional word embeddings. In this work, we show that contextual embeddings can be used to achieve unprecedented gains in Word Sense Disambiguation (WSD) tasks. Our approach focuses on creating sense-level embeddings with full-coverage of WordNet, and without recourse to explicit knowledge of sense distributions or task-specific modelling. As a result, a simple Nearest Neighbors (k-NN) method using our representations is able to consistently surpass the performance of previous systems using powerful neural sequencing models. We also analyse the robustness of our approach when ignoring part-of-speech and lemma features, requiring disambiguation against the full sense inventory, and revealing shortcomings to be improved. Finally, we explore applications of our sense embeddings for concept-level analyses of contextual embeddings and their respective NLMs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Propagating contextual embeddings through WordNet produces full-coverage sense vectors that let plain k-NN beat prior neural WSD systems with no task training or sense frequencies.

read the letter

The main result is that you can take contextual embeddings, spread them across WordNet relations to build sense-level vectors for every entry, and then use nearest neighbors to do disambiguation better than the previous neural sequence models. No sense frequency data or supervised fine-tuning on the task is required. The propagation step is the actual novelty here, and it delivers the full-coverage property that most earlier sense embedding work lacked. The experiments back this up on standard WSD benchmarks, with added checks on what happens when POS and lemma features are dropped or when the system must pick from the entire sense inventory instead of a filtered set. They also show some downstream use of the vectors for inspecting the language models themselves. The soft spots are small. The gains are steady but not dramatic in every configuration, and the paper does not run head-to-head comparisons against other graph-based ways of combining embeddings. Results will move with whatever contextual model is used upstream, which is obvious but worth noting. Nothing in the argument looks circular or unsupported by the controls they report. This is the sort of paper that matters to people working on WSD or anyone who needs sense representations without extra training steps. A reader who wants a straightforward, reproducible method would get concrete value from it. It deserves a serious referee because the method is implementable and the experiments let you judge whether the central claim holds.

Referee Report

0 major / 3 minor

Summary. The paper claims that contextual embeddings from neural language models can be propagated through WordNet relations to produce full-coverage sense-level vectors. These vectors enable a simple, parameter-free k-NN classifier to outperform prior neural WSD systems on standard benchmarks without using sense-frequency information or task-specific training. The work also includes robustness analyses (ignoring POS/lemma features, full-inventory disambiguation) and applications to concept-level analysis of NLMs.

Significance. If the empirical results hold, the work is significant because it shows that sense distinctions can be recovered from existing contextual embeddings and a static lexical resource in a fully unsupervised manner, yielding a simpler and stronger baseline than complex sequence models. The parameter-free nature and full WordNet coverage are notable strengths; the approach also supplies a tool for inspecting what NLMs have learned at the concept level.

minor comments (3)

§3 (method): the precise propagation procedure (which relations, number of hops, aggregation function, handling of cycles) should be stated with pseudocode or a small worked example so that the construction of the sense vectors is fully reproducible from the text alone.
Table 2 / §4.2: report the number of senses per lemma in the evaluation sets and confirm that the k-NN lookup is performed over the entire WordNet inventory rather than a reduced candidate set; this directly affects the strength of the 'full-coverage' claim.
§5 (analysis): the robustness experiments that drop POS and lemma features are valuable, but the paper should also report the corresponding drop in the strongest neural baselines for direct comparison.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The derivation relies on external pre-trained contextual embeddings (from NLMs) and the independent WordNet graph for propagation to produce sense vectors, followed by a parameter-free k-NN. No equation or step reduces by construction to a fitted input, self-definition, or self-citation chain; the performance claim is tested against external WSD benchmarks without sense-frequency data or task-specific training. The approach is self-contained against those benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Review performed on abstract only; no explicit free parameters, axioms, or invented entities are described.

axioms (1)

domain assumption Contextual embeddings from NLMs separate word senses according to local context.
Central premise for using NLM embeddings as the starting point for sense propagation.

pith-pipeline@v0.9.0 · 5683 in / 1041 out tokens · 23405 ms · 2026-05-25T17:25:43.088811+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages · 2 internal anchors

[1]

Sanjeev Arora, Yingyu Liang, and Tengyu Ma. 2017. https://openreview.net/forum?id=SyK00v5xx A simple but tough-to-beat baseline for sentence embeddings . In International Conference on Learning Representations (ICLR)

work page 2017
[2]

Pierpaolo Basile, Annalina Caputo, and Giovanni Semeraro. 2014. https://www.aclweb.org/anthology/C14-1151 An enhanced L esk word sense disambiguation algorithm through a distributional semantic model . In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers , pages 1591--1600, Dublin, Ireland. Dublin...

work page 2014
[3]

Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2017. https://doi.org/10.1162/tacl_a_00051 Enriching word vectors with subword information . Transactions of the Association for Computational Linguistics, 5:135--146

work page doi:10.1162/tacl_a_00051 2017
[4]

Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, and Adam Kalai. 2016. http://dl.acm.org/citation.cfm?id=3157382.3157584 Man is to computer programmer as woman is to homemaker? debiasing word embeddings . In Proceedings of the 30th International Conference on Neural Information Processing Systems, NIPS'16, pages 4356--4364, USA. Curran Asso...

work page arXiv 2016
[5]

Bryson, and Arvind Narayanan

Aylin Caliskan, Joanna J. Bryson, and Arvind Narayanan. 2017. https://doi.org/10.1126/science.aal4230 Semantics derived automatically from language corpora contain human-like biases . Science, 356(6334):183--186

work page doi:10.1126/science.aal4230 2017
[6]

Jose Camacho-Collados and Mohammad Taher Pilehvar. 2018. https://doi.org/10.1613/jair.1.11259 From word to sense embeddings: A survey on vector representations of meaning . J. Artif. Int. Res., 63(1):743--788

work page doi:10.1613/jair.1.11259 2018
[7]

Jose Camacho-Collados, Mohammad Taher Pilehvar, and Roberto Navigli. 2016. https://doi.org/https://doi.org/10.1016/j.artint.2016.07.005 Nasari: Integrating explicit knowledge and corpus statistics for a multilingual representation of concepts and entities . Artificial Intelligence, 240:36 -- 64

work page doi:10.1016/j.artint.2016.07.005 2016
[8]

Xinxiong Chen, Zhiyuan Liu, and Maosong Sun. 2014. https://doi.org/10.3115/v1/D14-1110 A unified model for word sense representation and disambiguation . In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing ( EMNLP ) , pages 1025--1035, Doha, Qatar. Association for Computational Linguistics

work page doi:10.3115/v1/d14-1110 2014
[9]

Jacob Devlin, Ming - Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. http://arxiv.org/abs/1810.04805v1 BERT: pre-training of deep bidirectional transformers for language understanding . CoRR, abs/1810.04805v1

work page internal anchor Pith review Pith/arXiv arXiv 2018
[10]

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. https://www.aclweb.org/anthology/N19-1423 BERT : Pre-training of deep bidirectional transformers for language understanding . In Proceedings of the 2019 Conference of the North A merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (L...

work page 2019
[11]

Christiane Fellbaum. 1998. In WordNet : an electronic lexical database. MIT Press

work page 1998
[12]

Ignacio Iacobacci, Mohammad Taher Pilehvar, and Roberto Navigli. 2016. https://doi.org/10.18653/v1/P16-1085 Embeddings for word sense disambiguation: An evaluation study . In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 897--907, Berlin, Germany. Association for Computational Linguistics

work page doi:10.18653/v1/p16-1085 2016
[13]

Minh Le, Marten Postma, Jacopo Urbani, and Piek Vossen. 2018. https://www.aclweb.org/anthology/C18-1030 A deep dive into word sense disambiguation with LSTM . In Proceedings of the 27th International Conference on Computational Linguistics, pages 354--365, Santa Fe, New Mexico, USA. Association for Computational Linguistics

work page 2018
[14]

Doug Lenat, Mayank Prakash, and Mary Shepherd. 1986. http://dl.acm.org/citation.cfm?id=13432.13435 Cyc: Using common sense knowledge to overcome brittleness and knowledge acquistion bottlenecks . AI Mag., 6(4):65--85

work page arXiv 1986
[15]

Michael Lesk. 1986. https://doi.org/10.1145/318723.318728 Automatic sense disambiguation using machine readable dictionaries: How to tell a pine cone from an ice cream cone . In Proceedings of the 5th Annual International Conference on Systems Documentation, SIGDOC '86, pages 24--26, New York, NY, USA. ACM

work page doi:10.1145/318723.318728 1986
[16]

Daniel Loureiro and Al \' pio M \'a rio Jorge. 2019. Liaad at semdeep-5 challenge: Word-in-context (wic). In SemDeep-5@IJCAI 2019, page forthcoming

work page 2019
[17]

Fuli Luo, Tianyu Liu, Zexue He, Qiaolin Xia, Zhifang Sui, and Baobao Chang. 2018 a . https://www.aclweb.org/anthology/D18-1170 Leveraging gloss knowledge in neural word sense disambiguation by hierarchical co-attention . In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1402--1411, Brussels, Belgium. Associat...

work page 2018
[18]

Fuli Luo, Tianyu Liu, Qiaolin Xia, Baobao Chang, and Zhifang Sui. 2018 b . https://www.aclweb.org/anthology/P18-1230 Incorporating glosses into neural word sense disambiguation . In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2473--2482, Melbourne, Australia. Association for Comput...

work page 2018
[19]

Oren Melamud, Jacob Goldberger, and Ido Dagan. 2016. https://doi.org/10.18653/v1/K16-1006 context2vec: Learning generic context embedding with bidirectional LSTM . In Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning , pages 51--61, Berlin, Germany. Association for Computational Linguistics

work page doi:10.18653/v1/k16-1006 2016
[20]

Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. http://dl.acm.org/citation.cfm?id=2999792.2999959 Distributed representations of words and phrases and their compositionality . In Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2, NIPS'13, pages 3111--3119, USA. Curran Associates Inc

work page arXiv 2013
[21]

Miller, Martin Chodorow, Shari Landes, Claudia Leacock, and Robert G

George A. Miller, Martin Chodorow, Shari Landes, Claudia Leacock, and Robert G. Thomas. 1994. https://www.aclweb.org/anthology/H94-1046 Using a semantic concordance for sense identification . In HUMAN LANGUAGE TECHNOLOGY : Proceedings of a Workshop held at Plainsboro, New Jersey, March 8-11, 1994

work page 1994
[22]

Roberto Navigli. 2009. https://doi.org/10.1145/1459352.1459355 Word sense disambiguation: A survey . ACM Computing Surveys, 41(2):10:1--10:69

work page doi:10.1145/1459352.1459355 2009
[23]

Matthew Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. https://doi.org/10.18653/v1/N18-1202 Deep contextualized word representations . In Proceedings of the 2018 Conference of the North A merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Lon...

work page doi:10.18653/v1/n18-1202 2018
[24]

Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. https://blog.openai.com/language-unsupervised/ Improving language understanding by generative pre-training

work page 2018
[25]

Alessandro Raganato, Jose Camacho-Collados, and Roberto Navigli. 2017 a . https://www.aclweb.org/anthology/E17-1010 Word sense disambiguation: A unified evaluation framework and empirical comparison . In Proceedings of the 15th Conference of the E uropean Chapter of the Association for Computational Linguistics: Volume 1, Long Papers , pages 99--110, Vale...

work page 2017
[26]

Alessandro Raganato, Claudio Delli Bovi, and Roberto Navigli. 2017 b . https://doi.org/10.18653/v1/D17-1120 Neural sequence learning models for word sense disambiguation . In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1156--1167, Copenhagen, Denmark. Association for Computational Linguistics

work page doi:10.18653/v1/d17-1120 2017
[27]

Philip Resnik. 1997. https://www.aclweb.org/anthology/W97-0209 Selectional preference and sense disambiguation . In Tagging Text with Lexical Semantics: Why, What, and How?

work page 1997
[28]

Sascha Rothe and Hinrich Sch \"u tze. 2015. https://doi.org/10.3115/v1/P15-1173 A uto E xtend: Extending word embeddings to embeddings for synsets and lexemes . In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages ...

work page doi:10.3115/v1/p15-1173 2015
[29]

Lo \" c Vial, Benjamin Lecouteux, and Didier Schwab. 2018. http://arxiv.org/abs/1811.00960 Improving the coverage and the generalization ability of neural word sense disambiguation through hypernymy and hyponymy relationships . CoRR, abs/1811.00960

work page internal anchor Pith review Pith/arXiv arXiv 2018
[30]

Dayu Yuan, Julian Richardson, Ryan Doherty, Colin Evans, and Eric Altendorf. 2016. https://www.aclweb.org/anthology/C16-1130 Semi-supervised word sense disambiguation with neural models . In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers , pages 1374--1385, Osaka, Japan. The COLING 2016 Organiz...

work page 2016
[31]

Zhi Zhong and Hwee Tou Ng. 2010. https://www.aclweb.org/anthology/P10-4014 It makes sense: A wide-coverage word sense disambiguation system for free text . In Proceedings of the ACL 2010 System Demonstrations , pages 78--83, Uppsala, Sweden. Association for Computational Linguistics

work page 2010
[32]

URL: " 'urlintro :=

ENTRY address author booktitle chapter edition editor howpublished institution journal key month note number organization pages publisher school series title type volume year eprint doi pubmed url lastchecked label extra.label sort.label short.list INTEGERS output.state before.all mid.sentence after.sentence after.block STRINGS urlintro eprinturl eprintpr...

work page
[33]

write newline

" write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in bbl.in capitalize " " * FUNCT...

work page

[1] [1]

Sanjeev Arora, Yingyu Liang, and Tengyu Ma. 2017. https://openreview.net/forum?id=SyK00v5xx A simple but tough-to-beat baseline for sentence embeddings . In International Conference on Learning Representations (ICLR)

work page 2017

[2] [2]

Pierpaolo Basile, Annalina Caputo, and Giovanni Semeraro. 2014. https://www.aclweb.org/anthology/C14-1151 An enhanced L esk word sense disambiguation algorithm through a distributional semantic model . In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers , pages 1591--1600, Dublin, Ireland. Dublin...

work page 2014

[3] [3]

Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2017. https://doi.org/10.1162/tacl_a_00051 Enriching word vectors with subword information . Transactions of the Association for Computational Linguistics, 5:135--146

work page doi:10.1162/tacl_a_00051 2017

[4] [4]

Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, and Adam Kalai. 2016. http://dl.acm.org/citation.cfm?id=3157382.3157584 Man is to computer programmer as woman is to homemaker? debiasing word embeddings . In Proceedings of the 30th International Conference on Neural Information Processing Systems, NIPS'16, pages 4356--4364, USA. Curran Asso...

work page arXiv 2016

[5] [5]

Bryson, and Arvind Narayanan

Aylin Caliskan, Joanna J. Bryson, and Arvind Narayanan. 2017. https://doi.org/10.1126/science.aal4230 Semantics derived automatically from language corpora contain human-like biases . Science, 356(6334):183--186

work page doi:10.1126/science.aal4230 2017

[6] [6]

Jose Camacho-Collados and Mohammad Taher Pilehvar. 2018. https://doi.org/10.1613/jair.1.11259 From word to sense embeddings: A survey on vector representations of meaning . J. Artif. Int. Res., 63(1):743--788

work page doi:10.1613/jair.1.11259 2018

[7] [7]

Jose Camacho-Collados, Mohammad Taher Pilehvar, and Roberto Navigli. 2016. https://doi.org/https://doi.org/10.1016/j.artint.2016.07.005 Nasari: Integrating explicit knowledge and corpus statistics for a multilingual representation of concepts and entities . Artificial Intelligence, 240:36 -- 64

work page doi:10.1016/j.artint.2016.07.005 2016

[8] [8]

Xinxiong Chen, Zhiyuan Liu, and Maosong Sun. 2014. https://doi.org/10.3115/v1/D14-1110 A unified model for word sense representation and disambiguation . In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing ( EMNLP ) , pages 1025--1035, Doha, Qatar. Association for Computational Linguistics

work page doi:10.3115/v1/d14-1110 2014

[9] [9]

Jacob Devlin, Ming - Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. http://arxiv.org/abs/1810.04805v1 BERT: pre-training of deep bidirectional transformers for language understanding . CoRR, abs/1810.04805v1

work page internal anchor Pith review Pith/arXiv arXiv 2018

[10] [10]

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. https://www.aclweb.org/anthology/N19-1423 BERT : Pre-training of deep bidirectional transformers for language understanding . In Proceedings of the 2019 Conference of the North A merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (L...

work page 2019

[11] [11]

Christiane Fellbaum. 1998. In WordNet : an electronic lexical database. MIT Press

work page 1998

[12] [12]

Ignacio Iacobacci, Mohammad Taher Pilehvar, and Roberto Navigli. 2016. https://doi.org/10.18653/v1/P16-1085 Embeddings for word sense disambiguation: An evaluation study . In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 897--907, Berlin, Germany. Association for Computational Linguistics

work page doi:10.18653/v1/p16-1085 2016

[13] [13]

Minh Le, Marten Postma, Jacopo Urbani, and Piek Vossen. 2018. https://www.aclweb.org/anthology/C18-1030 A deep dive into word sense disambiguation with LSTM . In Proceedings of the 27th International Conference on Computational Linguistics, pages 354--365, Santa Fe, New Mexico, USA. Association for Computational Linguistics

work page 2018

[14] [14]

Doug Lenat, Mayank Prakash, and Mary Shepherd. 1986. http://dl.acm.org/citation.cfm?id=13432.13435 Cyc: Using common sense knowledge to overcome brittleness and knowledge acquistion bottlenecks . AI Mag., 6(4):65--85

work page arXiv 1986

[15] [15]

Michael Lesk. 1986. https://doi.org/10.1145/318723.318728 Automatic sense disambiguation using machine readable dictionaries: How to tell a pine cone from an ice cream cone . In Proceedings of the 5th Annual International Conference on Systems Documentation, SIGDOC '86, pages 24--26, New York, NY, USA. ACM

work page doi:10.1145/318723.318728 1986

[16] [16]

Daniel Loureiro and Al \' pio M \'a rio Jorge. 2019. Liaad at semdeep-5 challenge: Word-in-context (wic). In SemDeep-5@IJCAI 2019, page forthcoming

work page 2019

[17] [17]

Fuli Luo, Tianyu Liu, Zexue He, Qiaolin Xia, Zhifang Sui, and Baobao Chang. 2018 a . https://www.aclweb.org/anthology/D18-1170 Leveraging gloss knowledge in neural word sense disambiguation by hierarchical co-attention . In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1402--1411, Brussels, Belgium. Associat...

work page 2018

[18] [18]

Fuli Luo, Tianyu Liu, Qiaolin Xia, Baobao Chang, and Zhifang Sui. 2018 b . https://www.aclweb.org/anthology/P18-1230 Incorporating glosses into neural word sense disambiguation . In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2473--2482, Melbourne, Australia. Association for Comput...

work page 2018

[19] [19]

Oren Melamud, Jacob Goldberger, and Ido Dagan. 2016. https://doi.org/10.18653/v1/K16-1006 context2vec: Learning generic context embedding with bidirectional LSTM . In Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning , pages 51--61, Berlin, Germany. Association for Computational Linguistics

work page doi:10.18653/v1/k16-1006 2016

[20] [20]

Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. http://dl.acm.org/citation.cfm?id=2999792.2999959 Distributed representations of words and phrases and their compositionality . In Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2, NIPS'13, pages 3111--3119, USA. Curran Associates Inc

work page arXiv 2013

[21] [21]

Miller, Martin Chodorow, Shari Landes, Claudia Leacock, and Robert G

George A. Miller, Martin Chodorow, Shari Landes, Claudia Leacock, and Robert G. Thomas. 1994. https://www.aclweb.org/anthology/H94-1046 Using a semantic concordance for sense identification . In HUMAN LANGUAGE TECHNOLOGY : Proceedings of a Workshop held at Plainsboro, New Jersey, March 8-11, 1994

work page 1994

[22] [22]

Roberto Navigli. 2009. https://doi.org/10.1145/1459352.1459355 Word sense disambiguation: A survey . ACM Computing Surveys, 41(2):10:1--10:69

work page doi:10.1145/1459352.1459355 2009

[23] [23]

Matthew Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. https://doi.org/10.18653/v1/N18-1202 Deep contextualized word representations . In Proceedings of the 2018 Conference of the North A merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Lon...

work page doi:10.18653/v1/n18-1202 2018

[24] [24]

Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. https://blog.openai.com/language-unsupervised/ Improving language understanding by generative pre-training

work page 2018

[25] [25]

Alessandro Raganato, Jose Camacho-Collados, and Roberto Navigli. 2017 a . https://www.aclweb.org/anthology/E17-1010 Word sense disambiguation: A unified evaluation framework and empirical comparison . In Proceedings of the 15th Conference of the E uropean Chapter of the Association for Computational Linguistics: Volume 1, Long Papers , pages 99--110, Vale...

work page 2017

[26] [26]

Alessandro Raganato, Claudio Delli Bovi, and Roberto Navigli. 2017 b . https://doi.org/10.18653/v1/D17-1120 Neural sequence learning models for word sense disambiguation . In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1156--1167, Copenhagen, Denmark. Association for Computational Linguistics

work page doi:10.18653/v1/d17-1120 2017

[27] [27]

Philip Resnik. 1997. https://www.aclweb.org/anthology/W97-0209 Selectional preference and sense disambiguation . In Tagging Text with Lexical Semantics: Why, What, and How?

work page 1997

[28] [28]

Sascha Rothe and Hinrich Sch \"u tze. 2015. https://doi.org/10.3115/v1/P15-1173 A uto E xtend: Extending word embeddings to embeddings for synsets and lexemes . In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages ...

work page doi:10.3115/v1/p15-1173 2015

[29] [29]

Lo \" c Vial, Benjamin Lecouteux, and Didier Schwab. 2018. http://arxiv.org/abs/1811.00960 Improving the coverage and the generalization ability of neural word sense disambiguation through hypernymy and hyponymy relationships . CoRR, abs/1811.00960

work page internal anchor Pith review Pith/arXiv arXiv 2018

[30] [30]

Dayu Yuan, Julian Richardson, Ryan Doherty, Colin Evans, and Eric Altendorf. 2016. https://www.aclweb.org/anthology/C16-1130 Semi-supervised word sense disambiguation with neural models . In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers , pages 1374--1385, Osaka, Japan. The COLING 2016 Organiz...

work page 2016

[31] [31]

Zhi Zhong and Hwee Tou Ng. 2010. https://www.aclweb.org/anthology/P10-4014 It makes sense: A wide-coverage word sense disambiguation system for free text . In Proceedings of the ACL 2010 System Demonstrations , pages 78--83, Uppsala, Sweden. Association for Computational Linguistics

work page 2010

[32] [32]

URL: " 'urlintro :=

ENTRY address author booktitle chapter edition editor howpublished institution journal key month note number organization pages publisher school series title type volume year eprint doi pubmed url lastchecked label extra.label sort.label short.list INTEGERS output.state before.all mid.sentence after.sentence after.block STRINGS urlintro eprinturl eprintpr...

work page

[33] [33]

write newline

" write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in bbl.in capitalize " " * FUNCT...

work page