Character-level Convolutional Networks for Text Classification

Junbo Zhao; Xiang Zhang; Yann LeCun

arxiv: 1509.01626 · v3 · pith:APAUXL4Mnew · submitted 2015-09-04 · 💻 cs.LG · cs.CL

Xiang Zhang, Junbo Zhao, Yann LeCun

keywords: networks, character-level, convolutional, classification, convnets, models, text, achieve

This article offers an empirical exploration on the use of character-level convolutional networks (ConvNets) for text classification. We constructed several large-scale datasets to show that character-level convolutional networks could achieve state-of-the-art or competitive results. Comparisons are offered against traditional models such as bag of words, n-grams and their TFIDF variants, and deep learning models such as word-based ConvNets and recurrent neural networks.

Forward citations

Cited by 8 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Uniform Diffusion Models Revisited: Leave-One-Out Denoiser and Absorbing State Reformulation
cs.LG 2026-05 unverdicted novelty 7.0

Uniform diffusion models rely on a leave-one-out denoiser rather than the usual denoising posterior, with exact conversions derived; an absorbing-state reformulation is introduced that matches or exceeds masked diffus...

Rethinking Vacuity for OOD Detection in Evidential Deep Learning
cs.AI 2026-05 accept novelty 7.0

Vacuity-based OOD detection in evidential deep learning is highly sensitive to class cardinality differences between ID and OOD, which can artificially inflate AUROC and AUPR without any change in model predictions.

Meta-Harness: End-to-End Optimization of Model Harnesses
cs.AI 2026-03 unverdicted novelty 7.0

Meta-Harness discovers improved harness code for LLMs via agentic search over prior execution traces, yielding 7.7-point gains on text classification with 4x fewer tokens and 4.7-point gains on math reasoning across h...

Preventing Safety Drift in Large Language Models via Coupled Weight and Activation Constraints
cs.AI 2026-04 unverdicted novelty 6.0

Coupled constraints on weight updates in a safety subspace and regularization of SAE-identified safety features preserve LLM refusal behaviors during fine-tuning better than weight-only or activation-only methods.

Simple synthetic data reduces sycophancy in large language models
cs.CL 2023-08 unverdicted novelty 6.0

Scaling and instruction tuning increase sycophancy in LLMs on opinion and fact tasks, but a synthetic data fine-tuning intervention reduces it on held-out prompts.

Learning to Reformulate the Queries on the WEB
cs.IR 2019-07 unverdicted novelty 5.0

An unsupervised character-level CNN encoder with attention-based RNN decoder, trained on Clueweb09 anchor phrases, generates query reformulations that improve retrieval on TREC collections.

Green Prompting: Characterizing Prompt-driven Energy Costs of LLM Inference
cs.CL 2025-03 unverdicted novelty 4.0

Empirical tests on three LLMs show prompt semantics and task keywords drive inference energy costs more than length, with varying patterns by task.

UW-BHI at MEDIQA 2019: An Analysis of Representation Methods for Medical Natural Language Inference
cs.IR 2019-07 unverdicted novelty 2.0

Compares BERT, ESP, and Cui2Vec embeddings within ESIM on the MedNLI shared-task dataset to assess performance and internal representations for medical inference.