pith. machine review for the scientific record.

arxiv: 1603.01354 · v5 · pith:ROBZ2NOMnew · submitted 2016-03-04 · 💻 cs.LG · cs.CL · stat.ML

End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF

classification 💻 cs.LG · cs.CL · stat.ML
keywords: data, labeling, sequence, corpus, end-to-end, pre-processing, state-of-the-art, system

State-of-the-art sequence labeling systems traditionally require large amounts of task-specific knowledge in the form of hand-crafted features and data pre-processing. In this paper, we introduce a novel neural network architecture that benefits from both word- and character-level representations automatically, by using a combination of bidirectional LSTM, CNN and CRF. Our system is truly end-to-end, requiring no feature engineering or data pre-processing, thus making it applicable to a wide range of sequence labeling tasks. We evaluate our system on two data sets for two sequence labeling tasks: the Penn Treebank WSJ corpus for part-of-speech (POS) tagging and the CoNLL 2003 corpus for named entity recognition (NER). We obtain state-of-the-art performance on both data sets: 97.55% accuracy for POS tagging and 91.21% F1 for NER.
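
The abstract names a CRF as the final component but does not spell out how tags are decoded. As a rough illustration only, the sketch below shows standard Viterbi decoding for a linear-chain CRF in plain Python. It assumes per-position tag scores are already available (in the described architecture these would presumably come from the bidirectional LSTM). The function name `viterbi_decode`, the tag set, and all scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring tag index sequence for a linear-chain CRF.

    emissions[t][k]   : score of tag k at position t (assumed to come from an encoder)
    transitions[j][k] : score of moving from tag j to tag k
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # best[k] = best score of any path that ends in tag k at the current position
    best = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_best = []
        pointers = []
        for k in range(num_tags):
            # Pick the previous tag that maximizes the path score into tag k.
            prev = max(range(num_tags), key=lambda j: best[j] + transitions[j][k])
            new_best.append(best[prev] + transitions[prev][k] + emissions[t][k])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final tag to recover the full path.
    last = max(range(num_tags), key=lambda k: best[k])
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path


# Toy example with made-up scores (hypothetical tag set, not from the paper).
TAGS = ["O", "B-PER", "I-PER"]
emissions = [
    [0.1, 2.0, 0.3],
    [0.2, 0.5, 1.5],
    [2.0, 0.1, 0.4],
]
transitions = [
    [0.5, 0.0, -5.0],  # O -> I-PER is heavily penalized
    [0.0, -1.0, 1.0],
    [0.0, -1.0, 0.5],
]
decoded = [TAGS[i] for i in viterbi_decode(emissions, transitions)]
print(decoded)
```

The transition matrix is what lets a CRF layer discourage invalid tag sequences such as an `I-PER` directly after `O`, which independent per-token classification cannot do.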

This paper has not been read by Pith yet.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. LIMO: Less is More for Reasoning

    cs.CL · 2025-02 · unverdicted · novelty 6.0

    LIMO achieves 63.3% on AIME24 and 95.6% on MATH500 via supervised fine-tuning on roughly 1% of the data used by prior models, supporting the claim that minimal strategic examples suffice when pre-training has already ...

  2. ODMA: On-Demand Memory Allocation Strategy for LLM Serving on LPDDR-Class Accelerators

    cs.AR · 2025-12 · unverdicted · novelty 5.0

    ODMA raises KV-cache utilization by up to 19.25% and throughput by 23-27% on Cambricon MLU accelerators by dynamically adjusting prediction buckets and using a safety pool for LLM serving.