Neural Architectures for Named Entity Recognition

Chris Dyer; Guillaume Lample; Kazuya Kawakami; Miguel Ballesteros; Sandeep Subramanian

arxiv: 1603.01360 · v3 · pith:2RLWGX2Cnew · submitted 2016-03-04 · 💻 cs.CL

Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, Chris Dyer

classification 💻 cs.CL

keywords: corpora, entity, knowledge, learned, models, named, neural, recognition

State-of-the-art named entity recognition systems rely heavily on hand-crafted features and domain-specific knowledge in order to learn effectively from the small, supervised training corpora that are available. In this paper, we introduce two new neural architectures---one based on bidirectional LSTMs and conditional random fields, and the other that constructs and labels segments using a transition-based approach inspired by shift-reduce parsers. Our models rely on two sources of information about words: character-based word representations learned from the supervised corpus and unsupervised word representations learned from unannotated corpora. Our models obtain state-of-the-art performance in NER in four languages without resorting to any language-specific knowledge or resources such as gazetteers.
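The transition-based idea in the abstract (building and labeling segments with shift-reduce style moves) can be illustrated with a small sketch. The abstract does not spell out the action inventory, so the action names `SHIFT`, `OUT`, and `REDUCE-<label>`, the function `run_transitions`, and the rule that `OUT` requires an empty stack are assumptions made for this illustration. The sketch only replays a given action sequence; it does not include the neural model that would predict the actions.

```python
from typing import List, Optional, Tuple

Segment = Tuple[str, Optional[str]]  # (text, entity label or None)


def run_transitions(words: List[str], actions: List[str]) -> List[Segment]:
    """Replay a shift-reduce style action sequence over a sentence.

    Assumed (hypothetical) action inventory for this sketch:
      SHIFT          move the next buffer word onto the stack
      OUT            emit the next buffer word as a non-entity token
      REDUCE-<label> pop the whole stack and emit it as one labeled segment
    """
    buffer = list(words)      # words not yet processed
    stack: List[str] = []     # words of the segment being built
    output: List[Segment] = []

    for action in actions:
        if action == "SHIFT":
            if not buffer:
                raise ValueError("SHIFT with empty buffer")
            stack.append(buffer.pop(0))
        elif action == "OUT":
            # Design choice of this sketch: forbid OUT while a segment
            # is open, so emitted segments stay in sentence order.
            if not buffer or stack:
                raise ValueError("OUT needs a non-empty buffer and an empty stack")
            output.append((buffer.pop(0), None))
        elif action.startswith("REDUCE-"):
            if not stack:
                raise ValueError("REDUCE with empty stack")
            label = action[len("REDUCE-"):]
            output.append((" ".join(stack), label))
            stack.clear()
        else:
            raise ValueError(f"unknown action: {action}")

    if buffer or stack:
        raise ValueError("action sequence did not consume the whole sentence")
    return output


if __name__ == "__main__":
    sentence = ["Mark", "Watney", "visited", "Mars"]
    moves = ["SHIFT", "SHIFT", "REDUCE-PER", "OUT", "SHIFT", "REDUCE-LOC"]
    print(run_transitions(sentence, moves))
    # [('Mark Watney', 'PER'), ('visited', None), ('Mars', 'LOC')]
```

Because a segment is labeled as a whole at `REDUCE` time, a multi-word name such as "Mark Watney" receives a single label rather than one tag per token.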

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

ConfusionPrompt: Practical Private Inference for Online Large Language Models
cs.CR 2023-12 unverdicted novelty 6.0

ConfusionPrompt enables private black-box LLM inference via prompt decomposition and pseudo-prompt mixing, claiming better privacy-utility trade-off than perturbation methods and lower memory use than open-source loca...

When Active Learning Falls Short: An Empirical Study on Chemical Reaction Extraction
cs.LG 2026-04 unverdicted novelty 5.0

Active learning for chemical reaction extraction frequently produces non-monotonic learning curves and fails to deliver stable gains over random sampling because of strong pretraining, structured CRF decoding, and lab...

To Tune or Not To Tune? How About the Best of Both Worlds?
cs.CL 2019-07 unverdicted novelty 3.0

A sequential fine-tuning strategy for pre-trained language models reports modest accuracy gains of 4.7%, 0.99%, and 0.72% on semantic similarity, sequence labeling, and text classification tasks.