Energy and policy considerations for deep learning in nlp

Emma Strubell, Ananya Ganesh, Andrew McCallum · 2019

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

cs.CL · 2019-10-02 · unverdicted · novelty 6.0

DistilBERT compresses BERT by 40% via pre-training distillation with a triple loss, retaining 97% performance and running 60% faster.

Showing 1 of 1 citing paper.

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter cs.CL · 2019-10-02 · unverdicted · none · ref 32
DistilBERT compresses BERT by 40% via pre-training distillation with a triple loss, retaining 97% performance and running 60% faster.