End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF

3 Pith papers cite this work. Polarity classification is still indexing.

abstract

State-of-the-art sequence labeling systems traditionally require large amounts of task-specific knowledge in the form of hand-crafted features and data pre-processing. In this paper, we introduce a novel neural network architecture that benefits from both word- and character-level representations automatically, by using a combination of bidirectional LSTM, CNN and CRF. Our system is truly end-to-end, requiring no feature engineering or data pre-processing, thus making it applicable to a wide range of sequence labeling tasks. We evaluate our system on two data sets for two sequence labeling tasks: the Penn Treebank WSJ corpus for part-of-speech (POS) tagging and the CoNLL 2003 corpus for named entity recognition (NER). We obtain state-of-the-art performance on both data sets: 97.55% accuracy for POS tagging and 91.21% F1 for NER.

fields

cs.CL: 2 · cs.AR: 1

years

2025: 2 · 2019: 1

verdicts

unverdicted: 3

representative citing papers

LIMO: Less is More for Reasoning

cs.CL · 2025-02-05 · unverdicted · novelty 6.0

LIMO achieves 63.3% on AIME24 and 95.6% on MATH500 via supervised fine-tuning on roughly 1% of the data used by prior models, supporting the claim that minimal strategic examples suffice when pre-training has already encoded domain knowledge.

To Tune or Not To Tune? How About the Best of Both Worlds?

cs.CL · 2019-07-09 · unverdicted · novelty 3.0

A sequential fine-tuning strategy for pre-trained language models reports modest accuracy gains of 4.7%, 0.99%, and 0.72% on semantic similarity, sequence labeling, and text classification tasks.

citing papers explorer

Showing 3 of 3 citing papers.

  • LIMO: Less is More for Reasoning

    cs.CL · 2025-02-05 · unverdicted · none · ref 166 · internal anchor

    LIMO achieves 63.3% on AIME24 and 95.6% on MATH500 via supervised fine-tuning on roughly 1% of the data used by prior models, supporting the claim that minimal strategic examples suffice when pre-training has already encoded domain knowledge.

  • ODMA: On-Demand Memory Allocation Strategy for LLM Serving on LPDDR-Class Accelerators

    cs.AR · 2025-12-10 · unverdicted · none · ref 22 · internal anchor

    ODMA raises KV-cache utilization by up to 19.25% and throughput by 23-27% on Cambricon MLU accelerators by dynamically adjusting prediction buckets and using a safety pool for LLM serving.

  • To Tune or Not To Tune? How About the Best of Both Worlds?

    cs.CL · 2019-07-09 · unverdicted · none · ref 33 · internal anchor

    A sequential fine-tuning strategy for pre-trained language models reports modest accuracy gains of 4.7%, 0.99%, and 0.72% on semantic similarity, sequence labeling, and text classification tasks.