Bidirectional LSTM-CRF Models for Sequence Tagging
In this paper, we propose a variety of Long Short-Term Memory (LSTM) based models for sequence tagging. These models include LSTM networks, bidirectional LSTM (BI-LSTM) networks, LSTM with a Conditional Random Field (CRF) layer (LSTM-CRF) and bidirectional LSTM with a CRF layer (BI-LSTM-CRF). Our work is the first to apply a bidirectional LSTM CRF (denoted as BI-LSTM-CRF) model to NLP benchmark sequence tagging data sets. We show that the BI-LSTM-CRF model can efficiently use both past and future input features thanks to a bidirectional LSTM component. It can also use sentence level tag information thanks to a CRF layer. The BI-LSTM-CRF model can produce state of the art (or close to) accuracy on POS, chunking and NER data sets. In addition, it is robust and has less dependence on word embedding as compared to previous observations.
Forward citations
Cited by 7 Pith papers
- A Convolutional Neural Network-Derived Catalog of Solar Flares from Soft X-Ray Observations
  The CNN-derived catalog detects over seven times more solar flares than the GOES catalog and extends the power-law distribution of flare peak fluxes to smaller sizes.
- Automating Categorization of Scientific Texts with In-Context Learning and Prompt-Chaining in Large Language Models
  Prompt chaining with off-the-shelf LLMs outperforms in-context learning and BERT for 1st- and 2nd-level classification on the ORKG taxonomy using the FORC dataset, but struggles at the 3rd level.
- A Multimodal Text- and Graph-Based Approach for Open-Domain Event Extraction from Documents
  MODEE is a multimodal system that integrates graphs with LLM embeddings to outperform prior open-domain event extraction methods on large datasets.
- TabEmb: Joint Semantic-Structure Embedding for Table Annotation
  TabEmb decouples LLM-based semantic column embeddings from graph-based structural modeling to produce joint representations that improve table annotation tasks.
- A Multi-head Attention Fusion Network for Industrial Prognostics under Discrete Operational Conditions
  A multi-head attention fusion network integrates monotonic degradation trends, discrete operating state embeddings from clustering, and residual noise using BiLSTM and attention mechanisms to improve prognostic accura...
- Beyond the Basics: Leveraging Large Language Model for Fine-Grained Medical Entity Recognition
  Fine-tuned LLaMA3 with LoRA reaches 81.24% F1 on 18-category fine-grained medical entity recognition, beating zero-shot by 63.11% and few-shot by 35.63%.
- A Multi-modal Fusion Network for Star-Galaxy Classification from CSST Simulated Datasets
  A ResNet-50 and BiLSTM multi-modal fusion network achieves 99.81% galaxy recall and 99.66% star recall on a CSST simulated dataset of 125,896 objects.