Neural Word Segmentation with Rich Pretraining

· 2017 · cs.CL · arXiv 1704.08960

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Neural word segmentation research has benefited from large-scale raw texts by leveraging them for pretraining character and word embeddings. On the other hand, statistical segmentation research has exploited richer sources of external information, such as punctuation, automatic segmentation and POS. We investigate the effectiveness of a range of external training sources for neural word segmentation by building a modular segmentation model, pretraining the most important submodule using rich external sources. Results show that such pretraining significantly improves the model, leading to accuracies competitive to the best methods on six benchmarks.

representative citing papers

Investigating Self-Attention Network for Chinese Word Segmentation

cs.CL · 2019-07-26 · unverdicted · novelty 4.0

Self-attention networks achieve competitive results to BiLSTM-CRF on Chinese word segmentation, with BERT and word integration yielding the best reported performance on six heterogeneous domain benchmarks.

citing papers explorer

Showing 1 of 1 citing paper.

Investigating Self-Attention Network for Chinese Word Segmentation

cs.CL · 2019-07-26 · unverdicted · none · ref 13 · internal anchor

Self-attention networks achieve competitive results to BiLSTM-CRF on Chinese word segmentation, with BERT and word integration yielding the best reported performance on six heterogeneous domain benchmarks.
