pith. machine review for the scientific record.

arxiv: 1508.01991 · v1 · submitted 2015-08-09 · 💻 cs.CL

Recognition: unknown

Bidirectional LSTM-CRF Models for Sequence Tagging

keywords lstm, bidirectional, bi-lstm-crf, layer, model, models, sequence, tagging

In this paper, we propose a variety of Long Short-Term Memory (LSTM) based models for sequence tagging. These models include LSTM networks, bidirectional LSTM (BI-LSTM) networks, LSTM with a Conditional Random Field (CRF) layer (LSTM-CRF) and bidirectional LSTM with a CRF layer (BI-LSTM-CRF). Our work is the first to apply a bidirectional LSTM CRF (denoted as BI-LSTM-CRF) model to NLP benchmark sequence tagging data sets. We show that the BI-LSTM-CRF model can efficiently use both past and future input features thanks to a bidirectional LSTM component. It can also use sentence level tag information thanks to a CRF layer. The BI-LSTM-CRF model can produce state of the art (or close to) accuracy on POS, chunking and NER data sets. In addition, it is robust and has less dependence on word embedding as compared to previous observations.
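To make the role of the CRF layer concrete, here is a minimal sketch of sentence-level decoding, assuming per-token tag scores (as a BI-LSTM might emit) and a tag-to-tag transition matrix. The function name `viterbi_decode`, the list-of-lists score layout, and the toy numbers are illustrative assumptions, not taken from the paper, and start/end transitions are omitted for brevity.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag sequence and its score.

    emissions[t][j]   : score of tag j at position t (assumed network output)
    transitions[i][j] : score of moving from tag i to tag j (assumed CRF parameters)
    """
    n_tags = len(transitions)
    score = list(emissions[0])      # best score of any path ending in each tag
    backpointers = []               # backpointers[t][j] = best previous tag for tag j

    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(n_tags):
            # Pick the previous tag that maximizes path score into tag j.
            best_prev = max(range(n_tags),
                            key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + transitions[best_prev][j] + emit[j])
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final tag.
    last = max(range(n_tags), key=lambda i: score[i])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]


# Toy example: token-level scores slightly prefer tag 1 in the middle,
# but the transition scores heavily penalize switching tags.
emissions = [[2.0, 0.0], [0.0, 0.5], [2.0, 0.0]]
transitions = [[0.0, -5.0], [-5.0, 0.0]]
print(viterbi_decode(emissions, transitions))  # ([0, 0, 0], 4.0)
```

Choosing each tag independently would give `[0, 1, 0]` here; the transition scores let the decoder prefer the consistent sequence `[0, 0, 0]`, which is the kind of sentence-level tag information the abstract attributes to the CRF layer.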

This paper has not been read by Pith yet.


Forward citations

Cited by 7 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. A Convolutional Neural Network-Derived Catalog of Solar Flares from Soft X-Ray Observations

    astro-ph.SR 2026-04 unverdicted novelty 7.0

    The CNN-derived catalog detects over seven times more solar flares than the GOES catalog and extends the power-law distribution of flare peak fluxes to smaller sizes.

  2. Automating Categorization of Scientific Texts with In-Context Learning and Prompt-Chaining in Large Language Models

    cs.IR 2026-04 unverdicted novelty 5.0

    Prompt chaining with off-the-shelf LLMs outperforms in-context learning and BERT for 1st- and 2nd-level classification on the ORKG taxonomy using the FORC dataset, but struggles at the 3rd level.

  3. A Multimodal Text- and Graph-Based Approach for Open-Domain Event Extraction from Documents

    cs.CL 2026-04 unverdicted novelty 5.0

    MODEE is a multimodal system that integrates graphs with LLM embeddings to outperform prior open-domain event extraction methods on large datasets.

  4. TabEmb: Joint Semantic-Structure Embedding for Table Annotation

    cs.LG 2026-04 unverdicted novelty 5.0

    TabEmb decouples LLM-based semantic column embeddings from graph-based structural modeling to produce joint representations that improve table annotation tasks.

  5. A Multi-head Attention Fusion Network for Industrial Prognostics under Discrete Operational Conditions

    cs.LG 2026-04 unverdicted novelty 5.0

    A multi-head attention fusion network integrates monotonic degradation trends, discrete operating state embeddings from clustering, and residual noise using BiLSTM and attention mechanisms to improve prognostic accura...

  6. Beyond the Basics: Leveraging Large Language Model for Fine-Grained Medical Entity Recognition

    cs.AI 2026-04 conditional novelty 4.0

    Fine-tuned LLaMA3 with LoRA reaches 81.24% F1 on 18-category fine-grained medical entity recognition, beating zero-shot by 63.11% and few-shot by 35.63%.

  7. A Multi-modal Fusion Network for Star-Galaxy Classification from CSST Simulated Datasets

    astro-ph.IM 2026-04 unverdicted novelty 4.0

    A ResNet-50 and BiLSTM multi-modal fusion network achieves 99.81% galaxy recall and 99.66% star recall on a CSST simulated dataset of 125,896 objects.