Ernie: Enhanced representation through knowledge integration

Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Xuyi Chen, Han Zhang, Xin Tian, Danxiang Zhu, Hao Tian, Hua Wu · 2019 · cs.CL · arXiv 1904.09223

7 Pith papers cite this work. Polarity classification is still indexing.

7 Pith papers citing it

open full Pith review browse 7 citing papers arXiv PDF

abstract

We present a novel language representation model enhanced by knowledge called ERNIE (Enhanced Representation through kNowledge IntEgration). Inspired by the masking strategy of BERT, ERNIE is designed to learn language representation enhanced by knowledge masking strategies, which includes entity-level masking and phrase-level masking. Entity-level strategy masks entities which are usually composed of multiple words.Phrase-level strategy masks the whole phrase which is composed of several words standing together as a conceptual unit.Experimental results show that ERNIE outperforms other baseline methods, achieving new state-of-the-art results on five Chinese natural language processing tasks including natural language inference, semantic similarity, named entity recognition, sentiment analysis and question answering. We also demonstrate that ERNIE has more powerful knowledge inference capacity on a cloze test.

representative citing papers

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

cs.CL · 2020-03-23 · conditional · novelty 8.0

ELECTRA replaces masked language modeling with replaced token detection, yielding contextual representations that outperform BERT at equal compute and match larger models like RoBERTa with far less compute.

DeBERTa: Decoding-enhanced BERT with Disentangled Attention

cs.CL · 2020-06-05 · unverdicted · novelty 7.0

DeBERTa improves BERT-style models by separating content and relative position in attention and adding absolute positions to the decoder, yielding consistent gains on NLU and NLG tasks and the first single-model superhuman score on SuperGLUE.

CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation

cs.CL · 2021-09-02 · conditional · novelty 6.0

CodeT5 adds identifier-aware pre-training and bimodal dual generation to a T5-style encoder-decoder, yielding better results on defect detection, clone detection, and code-to-text, text-to-code, and code-to-code tasks than prior encoder-only or decoder-only models.

Object Referring-Guided Scanpath Prediction with Perception-Enhanced Vision-Language Models

cs.CV · 2026-04-22 · unverdicted · novelty 6.0

ScanVLA uses a vision-language model with a history-enhanced decoder and frozen segmentation LoRA to outperform prior methods on object-referring scanpath prediction.

RoBERTa: A Robustly Optimized BERT Pretraining Approach

cs.CL · 2019-07-26 · accept · novelty 5.0

With better hyperparameters, more data, and longer training, an unchanged BERT-Large architecture matches or exceeds XLNet and other successors on GLUE, SQuAD, and RACE.

SGR: A Stepwise Reasoning Framework for LLMs with External Subgraph Generation

cs.CL · 2026-05-15 · unverdicted · novelty 4.0

SGR enhances LLM reasoning accuracy by generating external subgraphs from knowledge bases and guiding progressive inference over them, yielding consistent gains over baselines on benchmarks.

To Tune or Not To Tune? How About the Best of Both Worlds?

cs.CL · 2019-07-09 · unverdicted · novelty 3.0

A sequential fine-tuning strategy for pre-trained language models reports modest accuracy gains of 4.7%, 0.99%, and 0.72% on semantic similarity, sequence labeling, and text classification tasks.

citing papers explorer

Showing 7 of 7 citing papers.

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators cs.CL · 2020-03-23 · conditional · none · ref 9 · internal anchor
ELECTRA replaces masked language modeling with replaced token detection, yielding contextual representations that outperform BERT at equal compute and match larger models like RoBERTa with far less compute.
DeBERTa: Decoding-enhanced BERT with Disentangled Attention cs.CL · 2020-06-05 · unverdicted · none · ref 30
DeBERTa improves BERT-style models by separating content and relative position in attention and adding absolute positions to the decoder, yielding consistent gains on NLU and NLG tasks and the first single-model superhuman score on SuperGLUE.
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation cs.CL · 2021-09-02 · conditional · none · ref 29 · internal anchor
CodeT5 adds identifier-aware pre-training and bimodal dual generation to a T5-style encoder-decoder, yielding better results on defect detection, clone detection, and code-to-text, text-to-code, and code-to-code tasks than prior encoder-only or decoder-only models.
Object Referring-Guided Scanpath Prediction with Perception-Enhanced Vision-Language Models cs.CV · 2026-04-22 · unverdicted · none · ref 35
ScanVLA uses a vision-language model with a history-enhanced decoder and frozen segmentation LoRA to outperform prior methods on object-referring scanpath prediction.
RoBERTa: A Robustly Optimized BERT Pretraining Approach cs.CL · 2019-07-26 · accept · none · ref 41
With better hyperparameters, more data, and longer training, an unchanged BERT-Large architecture matches or exceeds XLNet and other successors on GLUE, SQuAD, and RACE.
SGR: A Stepwise Reasoning Framework for LLMs with External Subgraph Generation cs.CL · 2026-05-15 · unverdicted · none · ref 9 · internal anchor
SGR enhances LLM reasoning accuracy by generating external subgraphs from knowledge bases and guiding progressive inference over them, yielding consistent gains over baselines on benchmarks.
To Tune or Not To Tune? How About the Best of Both Worlds? cs.CL · 2019-07-09 · unverdicted · none · ref 9 · internal anchor
A sequential fine-tuning strategy for pre-trained language models reports modest accuracy gains of 4.7%, 0.99%, and 0.72% on semantic similarity, sequence labeling, and text classification tasks.

Ernie: Enhanced representation through knowledge integration

fields

years

verdicts

representative citing papers

citing papers explorer