
arXiv: 1810.01064 · v4 · pith:V3ZXDOHJnew · submitted 2018-10-02 · cs.CL · cs.LG · cs.NE

Improving Sentence Representations with Consensus Maximisation

classification: cs.CL, cs.LG, cs.NE
keywords: sentence, learning, views, different, consensus, downstream, ensemble, maximisation

Consensus maximisation learning can provide self-supervision when different views of the same data are available. The distributional hypothesis provides another form of useful self-supervision from adjacent sentences, which are plentiful in large unlabelled corpora. Motivated by the observation that different learning architectures tend to emphasise different aspects of sentence meaning, we present a new self-supervised framework for learning sentence representations that minimises the disagreement between two views of the same sentence: one view encodes the sentence with a recurrent neural network (RNN), and the other encodes it with a simple linear model. After learning, the individual views (networks) yield higher-quality sentence representations than their single-view counterparts (learnt using only the distributional hypothesis), as judged by performance on standard downstream tasks. An ensemble of both views generalises even better on both supervised and unsupervised downstream tasks. Importantly, the ensemble of views trained with consensus maximisation between the two architectures performs better on downstream tasks than an analogous ensemble built from the single-view trained counterparts.
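
As a rough illustration of what "disagreement between two views" might look like, the sketch below encodes each sentence twice, once with a linear model over averaged word vectors and once with a plain tanh recurrence, and scores disagreement as one minus cosine similarity. The abstract does not state the loss, so the cosine-based measure, the function names (`linear_view`, `rnn_view`, `disagreement`) and the toy dimensions are all assumptions for illustration. The distributional (adjacent-sentence) term is left out, and this is not the authors' implementation.

```python
import math
import random


def linear_view(word_vecs, W):
    """Linear view: average the word vectors, then apply one linear map W."""
    dim = len(word_vecs[0])
    mean = [sum(v[i] for v in word_vecs) / len(word_vecs) for i in range(dim)]
    return [sum(w * m for w, m in zip(row, mean)) for row in W]


def rnn_view(word_vecs, U, V):
    """RNN view: plain tanh recurrence; the final hidden state is the sentence vector."""
    h = [0.0] * len(U)
    for x in word_vecs:
        h = [
            math.tanh(
                sum(u * xi for u, xi in zip(U[k], x))
                + sum(v * hj for v, hj in zip(V[k], h))
            )
            for k in range(len(U))
        ]
    return h


def cosine(a, b):
    """Cosine similarity, with a small constant guarding against zero norms."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-12)


def disagreement(sentences, W, U, V):
    """Assumed measure: mean (1 - cosine) between the two views of each sentence."""
    losses = [1.0 - cosine(linear_view(s, W), rnn_view(s, U, V)) for s in sentences]
    return sum(losses) / len(losses)


def rand_matrix(rng, rows, cols):
    """Small random weight matrix as a list of rows."""
    return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


# Toy setup: two "sentences" of 3 and 5 random word vectors (hypothetical data).
rng = random.Random(0)
emb_dim, hid_dim = 4, 3
W = rand_matrix(rng, hid_dim, emb_dim)   # linear view weights
U = rand_matrix(rng, hid_dim, emb_dim)   # RNN input weights
V = rand_matrix(rng, hid_dim, hid_dim)   # RNN recurrent weights
sentences = [
    [[rng.gauss(0.0, 1.0) for _ in range(emb_dim)] for _ in range(n)]
    for n in (3, 5)
]
loss = disagreement(sentences, W, U, V)  # value in [0, 2]; lower means more consensus
```

In a training loop, `loss` would be minimised with respect to `W`, `U` and `V` alongside whatever distributional objective is used. Both views must map into the same dimension (`hid_dim` here) for the comparison to be defined.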

This paper has not been read by Pith yet.


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Learning Compressed Sentence Representations for On-Device Text Processing

    cs.CL · 2019-06 · unverdicted · novelty 5.0

    Four binarization strategies turn continuous sentence embeddings into binary form, cutting storage by over 98% with only about 2% performance drop on downstream tasks.