Bidirectional LSTM-CRF Models for Sequence Tagging

Zhiheng Huang, Wei Xu, Kai Yu

classification 💻 cs.CL

keywords lstm, bidirectional, bi-lstm-crf, layer, model, models, sequence, tagging

In this paper, we propose a variety of Long Short-Term Memory (LSTM) based models for sequence tagging. These models include LSTM networks, bidirectional LSTM (BI-LSTM) networks, LSTM with a Conditional Random Field (CRF) layer (LSTM-CRF) and bidirectional LSTM with a CRF layer (BI-LSTM-CRF). Our work is the first to apply a bidirectional LSTM CRF (denoted as BI-LSTM-CRF) model to NLP benchmark sequence tagging data sets. We show that the BI-LSTM-CRF model can efficiently use both past and future input features thanks to a bidirectional LSTM component. It can also use sentence level tag information thanks to a CRF layer. The BI-LSTM-CRF model can produce state of the art (or close to) accuracy on POS, chunking and NER data sets. In addition, it is robust and has less dependence on word embedding as compared to previous observations.
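
As a rough illustration of how a CRF layer can use sentence-level tag information, the sketch below runs Viterbi decoding over per-token emission scores (which, in a BI-LSTM-CRF, would come from the bidirectional LSTM) combined with a tag-to-tag transition matrix. The function name `viterbi_decode`, the three-tag set, and all scores are assumptions made for illustration; none of them are taken from the paper.

```python
from typing import List, Tuple


def viterbi_decode(emissions: List[List[float]],
                   transitions: List[List[float]]) -> Tuple[List[int], float]:
    """Return the highest-scoring tag sequence and its total score.

    emissions[t][j]   : score of tag j at position t (stand-in for BiLSTM output)
    transitions[i][j] : score of moving from tag i to tag j (CRF parameter)
    """
    if not emissions:
        return [], 0.0
    n_tags = len(transitions)
    score = list(emissions[0])   # best score of any path ending in each tag
    backptr = []                 # backptr[t-1][j] = best previous tag for tag j at t
    for t in range(1, len(emissions)):
        new_score, ptrs = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j]
                             + emissions[t][j])
            ptrs.append(best_i)
        score = new_score
        backptr.append(ptrs)
    # Follow the back-pointers from the best final tag.
    best_last = max(range(n_tags), key=lambda j: score[j])
    path = [best_last]
    for ptrs in reversed(backptr):
        path.append(ptrs[path[-1]])
    path.reverse()
    return path, score[best_last]


# Hypothetical tags: 0 = O, 1 = B, 2 = I. The O -> I transition is penalised.
emissions = [[2.0, 1.0, 0.0],
             [0.0, 0.5, 3.0]]
transitions = [[0.0, 0.0, -10.0],
               [0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]]
print(viterbi_decode(emissions, transitions))  # ([1, 2], 4.0)
```

Picking the best tag at each position independently would give O followed by I, a sequence the transition scores penalise heavily. Decoding over the whole sentence yields B followed by I instead, which is the kind of sentence-level constraint the abstract attributes to the CRF layer.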

Forward citations

Cited by 7 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

A Convolutional Neural Network-Derived Catalog of Solar Flares from Soft X-Ray Observations
astro-ph.SR 2026-04 unverdicted novelty 7.0

The CNN-derived catalog detects over seven times more solar flares than the GOES catalog and extends the power-law distribution of flare peak fluxes to smaller sizes.

Automating Categorization of Scientific Texts with In-Context Learning and Prompt-Chaining in Large Language Models
cs.IR 2026-04 unverdicted novelty 5.0

Prompt chaining with off-the-shelf LLMs outperforms in-context learning and BERT for 1st- and 2nd-level classification on the ORKG taxonomy using the FORC dataset, but struggles at the 3rd level.

A Multimodal Text- and Graph-Based Approach for Open-Domain Event Extraction from Documents
cs.CL 2026-04 unverdicted novelty 5.0

MODEE is a multimodal system that integrates graphs with LLM embeddings to outperform prior open-domain event extraction methods on large datasets.

TabEmb: Joint Semantic-Structure Embedding for Table Annotation
cs.LG 2026-04 unverdicted novelty 5.0

TabEmb decouples LLM-based semantic column embeddings from graph-based structural modeling to produce joint representations that improve table annotation tasks.

A Multi-head Attention Fusion Network for Industrial Prognostics under Discrete Operational Conditions
cs.LG 2026-04 unverdicted novelty 5.0

A multi-head attention fusion network integrates monotonic degradation trends, discrete operating state embeddings from clustering, and residual noise using BiLSTM and attention mechanisms to improve prognostic accura...

Beyond the Basics: Leveraging Large Language Model for Fine-Grained Medical Entity Recognition
cs.AI 2026-04 conditional novelty 4.0

Fine-tuned LLaMA3 with LoRA reaches 81.24% F1 on 18-category fine-grained medical entity recognition, beating zero-shot by 63.11% and few-shot by 35.63%.

A Multi-modal Fusion Network for Star-Galaxy Classification from CSST Simulated Datasets
astro-ph.IM 2026-04 unverdicted novelty 4.0

A ResNet-50 and BiLSTM multi-modal fusion network achieves 99.81% galaxy recall and 99.66% star recall on a CSST simulated dataset of 125,896 objects.