Bidirectional LSTM-CRF Models for Sequence Tagging

Kai Yu; Wei Xu; Zhiheng Huang

Not yet reviewed by Pith; the record is open.

arxiv 1508.01991 v1 pith:QX4NG7LV submitted 2015-08-09 cs.CL

Zhiheng Huang, Wei Xu, Kai Yu

classification cs.CL

keywords lstm, bidirectional, bi-lstm-crf, layer, model, models, sequence, tagging

Abstract

In this paper, we propose a variety of Long Short-Term Memory (LSTM) based models for sequence tagging. These models include LSTM networks, bidirectional LSTM (BI-LSTM) networks, LSTM with a Conditional Random Field (CRF) layer (LSTM-CRF), and bidirectional LSTM with a CRF layer (BI-LSTM-CRF). Our work is the first to apply a bidirectional LSTM-CRF (BI-LSTM-CRF) model to NLP benchmark sequence tagging data sets. We show that the BI-LSTM-CRF model can efficiently use both past and future input features thanks to its bidirectional LSTM component. It can also use sentence-level tag information thanks to its CRF layer. The BI-LSTM-CRF model produces state-of-the-art (or close to state-of-the-art) accuracy on POS, chunking, and NER data sets. In addition, it is robust and depends less on word embeddings than previous observations suggested.
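The abstract's remark that a CRF layer lets the model use sentence-level tag information can be illustrated with a small Viterbi decoding sketch. This is not the authors' code. It is a plain-Python illustration under stated assumptions: the function name `viterbi_decode`, the three-tag scheme (0 = O, 1 = B, 2 = I), and all scores are hypothetical, and start and stop transitions are omitted. In a BI-LSTM-CRF the emission scores would come from the bidirectional LSTM; here they are hard-coded numbers.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag sequence.

    emissions[t][j]   : score of tag j at position t (hypothetical values here).
    transitions[i][j] : score of moving from tag i to tag j.
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # best[j] is the best score of any path that ends in tag j so far.
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(num_tags):
            # Best previous tag for current tag j.
            prev = max(range(num_tags),
                       key=lambda i: best[i] + transitions[i][j])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
        best = new_best
        backpointers.append(pointers)
    # Trace the best path backwards from the best final tag.
    tag = max(range(num_tags), key=lambda j: best[j])
    path = [tag]
    for pointers in reversed(backpointers):
        tag = pointers[tag]
        path.append(tag)
    path.reverse()
    return path


# Hypothetical scores for a two-token sentence; tags: 0 = O, 1 = B, 2 = I.
emissions = [[2.0, 1.5, 0.0],
             [0.0, 0.0, 2.0]]
# All transitions are neutral except O -> I, which is heavily penalized.
transitions = [[0.0, 0.0, -10.0],
               [0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]]

# Tagging each token independently picks O then I.
independent = [max(range(3), key=lambda j: row[j]) for row in emissions]
# Decoding with transition scores avoids the penalized O -> I step.
joint = viterbi_decode(emissions, transitions)
print(independent, joint)
```

With these illustrative scores, per-token argmax yields the sequence O, I, while joint decoding prefers B, I because the transition score makes O followed by I costly. That is the kind of sentence-level constraint a per-token classifier cannot express on its own.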

Forward citations

Cited by 21 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

From Text to Voice: A Reproducible and Verifiable Framework for Evaluating Tool Calling LLM Agents
cs.CL 2026-05 unverdicted novelty 7.0

A dataset-agnostic framework converts text tool-calling benchmarks to paired audio versions via TTS and noise, showing model-dependent performance with small text-to-voice gaps of 1.8-4.8 points on Confetti and When2Call.
A Convolutional Neural Network-Derived Catalog of Solar Flares from Soft X-Ray Observations
astro-ph.SR 2026-04 unverdicted novelty 7.0

The CNN-derived catalog detects over seven times more solar flares than the GOES catalog and extends the power-law distribution of flare peak fluxes to smaller sizes.
Utility-Preserving De-Identification for Math Tutoring: Investigating Numeric Ambiguity in the MathEd-PII Benchmark Dataset
cs.CL 2026-02 unverdicted novelty 7.0

The MathEd-PII benchmark shows that math-aware and segment-aware LLM prompting raises PII detection F1 from 0.379 to 0.821 while cutting false redactions of instructional numbers.
MindAlign: Decoding Inner Speech from fMRI Signals via Multimodal Embedding Alignment under Limited Data
cs.CL 2026-06 unverdicted novelty 6.0

MindAlign decodes inner speech from fMRI via subject-specific neural-semantic alignment into a multimodal space followed by prompting of a frozen LM, outperforming baselines and generalizing across subjects.
From Text to Voice: A Reproducible and Verifiable Framework for Evaluating Tool Calling LLM Agents
cs.CL 2026-05 unverdicted novelty 6.0

A dataset-agnostic framework converts text tool-calling benchmarks to paired audio evaluations via TTS, speaker variation and noise, then evaluates seven omni-modal models showing model- and task-dependent performance...
Optimizing Chlorination in Water Distribution Systems via Surrogate-assisted Neuroevolution
cs.NE 2026-02 unverdicted novelty 6.0

Surrogate-assisted neuroevolution produces Pareto-optimal chlorine dosing policies for water distribution systems that outperform PPO on four practical objectives.
A Survey on Vision-Language-Action Models for Embodied AI
cs.RO 2024-05 unverdicted novelty 6.0

This is the first survey on vision-language-action models, providing a taxonomy across three lines, plus summaries of datasets, simulators, benchmarks, challenges, and future directions in embodied AI.
TabTransformer: Tabular Data Modeling Using Contextual Embeddings
cs.LG 2020-12 unverdicted novelty 6.0

TabTransformer uses Transformer self-attention to generate contextual embeddings from categorical features in tabular data, outperforming prior deep learning methods by at least 1% mean AUC and matching tree-based ens...
Approximate Inference in Structured Instances with Noisy Categorical Observations
cs.LG 2019-06 unverdicted novelty 6.0

Approximate algorithm for categorical structured inference with noisy observations achieves Hamming error logarithmic in the number of categories, generalizing prior binary-label results.
Olfactory-Inspired Sparse Combinatorial Coding for Low-Resource Named Entity Recognition
cs.CL 2026-06 unverdicted novelty 5.0

A biologically inspired receptor-glomerular bottleneck improves F1 scores for low-resource NER on six multilingual datasets when trained from scratch, with largest gains in Bangla and Telugu.
Pareto-Guided Teacher Alignment for Fair Personalized Text Generation
cs.CL 2026-06 unverdicted novelty 5.0

Fairness mitigation in personalized text generation is objective-dependent with methods occupying different regions of the fairness-personalization Pareto frontier rather than any single strategy dominating all objectives.
Automating Categorization of Scientific Texts with In-Context Learning and Prompt-Chaining in Large Language Models
cs.IR 2026-04 unverdicted novelty 5.0

Prompt chaining with off-the-shelf LLMs outperforms in-context learning and BERT for 1st- and 2nd-level classification on the ORKG taxonomy using the FORC dataset, but struggles at the 3rd level.
A Multimodal Text- and Graph-Based Approach for Open-Domain Event Extraction from Documents
cs.CL 2026-04 unverdicted novelty 5.0

MODEE is a multimodal system that integrates graphs with LLM embeddings to outperform prior open-domain event extraction methods on large datasets.
TabEmb: Joint Semantic-Structure Embedding for Table Annotation
cs.LG 2026-04 unverdicted novelty 5.0

TabEmb decouples LLM-based semantic column embeddings from graph-based structural modeling to produce joint representations that improve table annotation tasks.
A Multi-head Attention Fusion Network for Industrial Prognostics under Discrete Operational Conditions
cs.LG 2026-04 unverdicted novelty 5.0

A multi-head attention fusion network integrates monotonic degradation trends, discrete operating state embeddings from clustering, and residual noise using BiLSTM and attention mechanisms to improve prognostic accura...
Utilizing Pre-trained and Large Language Models for 10-K Items Segmentation
q-fin.GN 2025-02 unverdicted novelty 5.0

BERT4ItemSeg reaches macro-F1 of 0.9825 on core 10-K items across 3,737 annotated reports, outperforming GPT4ItemSeg (0.9567) and baselines.
Eliciting Knowledge from Experts: Automatic Transcript Parsing for Cognitive Task Analysis
cs.CL 2019-06 unverdicted novelty 5.0

Introduces a weakly-supervised framework partitioning CTA transcript parsing into sequence labeling and text span-pair relation extraction using distant supervision from protocols and neighbor sentences for long-range...
Beyond the Basics: Leveraging Large Language Model for Fine-Grained Medical Entity Recognition
cs.AI 2026-04 conditional novelty 4.0

Fine-tuned LLaMA3 with LoRA reaches 81.24% F1 on 18-category fine-grained medical entity recognition, beating zero-shot by 63.11% and few-shot by 35.63%.
A Multi-modal Fusion Network for Star-Galaxy Classification from CSST Simulated Datasets
astro-ph.IM 2026-04 unverdicted novelty 4.0

A ResNet-50 and BiLSTM multi-modal fusion network achieves 99.81% galaxy recall and 99.66% star recall on a CSST simulated dataset of 125,896 objects.
Short Text Conversation Based on Deep Neural Network and Analysis on Evaluation Measures
cs.CL 2019-07 unverdicted novelty 4.0

Hierarchical DNN models with BERT outperform prior models on DQ and ND subtasks using non-traditional metrics NMD, RSNOD, JSD, and RNSS, plus analysis of traditional metrics.
Rare Disease Detection by Sequence Modeling with Generative Adversarial Networks
cs.LG 2019-07 unverdicted novelty 4.0

A GAN-boosted RNN model reaches 0.56 PR-AUC for rare EPI detection on 1.8 million patients and outperforms benchmarks.