Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang

URL http://jmlr · 2016

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

DeBERTa: Decoding-enhanced BERT with Disentangled Attention

cs.CL · 2020-06-05 · unverdicted · novelty 7.0

DeBERTa improves BERT-style models by separating content and relative position in attention and adding absolute positions to the decoder, yielding consistent gains on NLU and NLG tasks and the first single-model superhuman score on SuperGLUE.

DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing

cs.CL · 2021-11-18 · accept · novelty 6.0

DeBERTaV3 improves DeBERTa by switching to replaced token detection pre-training and using gradient-disentangled embedding sharing, reaching 91.37% on GLUE and new SOTA on XNLI zero-shot.

citing papers explorer

Showing 2 of 2 citing papers.

DeBERTa: Decoding-enhanced BERT with Disentangled Attention

cs.CL · 2020-06-05 · unverdicted · none · ref 23

DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing

cs.CL · 2021-11-18 · accept · none · ref 15
