
arxiv: 1905.10892 · v1 · pith:E7QQY2TInew · submitted 2019-05-26 · 💻 cs.CL

Extreme Multi-Label Legal Text Classification: A case study in EU Legislation

classification 💻 cs.CL
keywords: multi-label, attention, bigrus, classification, dataset, eurlex, extreme, label-wise

We consider the task of Extreme Multi-Label Text Classification (XMTC) in the legal domain. We release a new dataset of 57k legislative documents from EURLEX, the European Union's public document database, annotated with concepts from EUROVOC, a multidisciplinary thesaurus. The dataset is substantially larger than previous EURLEX datasets and suitable for XMTC, few-shot and zero-shot learning. Experimenting with several neural classifiers, we show that BIGRUs with self-attention outperform the current multi-label state-of-the-art methods, which employ label-wise attention. Replacing CNNs with BIGRUs in label-wise attention networks leads to the best overall performance.
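To make the term "label-wise attention" concrete, here is a minimal sketch of the general mechanism, in which each label has its own attention vector over the token states and its own output weights. It is not the paper's implementation: the encoder (a BIGRU or CNN in the abstract's terms) is replaced by random token states, and every name, shape, and parameter below is a hypothetical choice made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def label_wise_attention(H, U, W, b):
    """Score every label with its own attention distribution over tokens.

    H: T x d token states (assumed to come from some encoder).
    U: L x d per-label attention vectors (hypothetical parameters).
    W: L x d per-label output weights, b: L biases (hypothetical).
    Returns (probs, attn): L label probabilities and L x T attention weights.
    """
    dim = len(H[0])
    probs, attn = [], []
    for u_l, w_l, b_l in zip(U, W, b):
        # Attention weights of label l over the T tokens.
        a_l = softmax([dot(h_t, u_l) for h_t in H])
        # Label-specific document vector: attention-weighted sum of token states.
        d_l = [sum(a * h_t[j] for a, h_t in zip(a_l, H)) for j in range(dim)]
        # Independent sigmoid per label, as is usual for multi-label output.
        logit = dot(w_l, d_l) + b_l
        probs.append(1.0 / (1.0 + math.exp(-logit)))
        attn.append(a_l)
    return probs, attn


def random_matrix(rows, cols, rng):
    """Small random matrix standing in for learned values."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
T, d, L = 5, 4, 3  # tokens, hidden size, labels (toy sizes)
H = random_matrix(T, d, rng)  # stand-in for encoder output
U = random_matrix(L, d, rng)
W = random_matrix(L, d, rng)
b = [0.0] * L

probs, attn = label_wise_attention(H, U, W, b)
```

In this sketch the encoder only determines how `H` is produced, so swapping a CNN for a BIGRU, as the abstract describes, would leave the attention step itself unchanged.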
