arxiv: 1608.03542 · v2 · pith:33HA3E34new · submitted 2016-08-11 · 💻 cs.CL

WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia

classification 💻 cs.CL
keywords task · classification · extraction · language · large-scale · model · models · rich

We present WikiReading, a large-scale natural language understanding task and publicly available dataset with 18 million instances. The task is to predict textual values from the structured knowledge base Wikidata by reading the text of the corresponding Wikipedia articles. The task contains a rich variety of challenging classification and extraction sub-tasks, making it well-suited for end-to-end models such as deep neural networks (DNNs). We compare various state-of-the-art DNN-based architectures for document classification, information extraction, and question answering. We find that models supporting a rich answer space, such as word or character sequences, perform best. Our best-performing model, a word-level sequence-to-sequence model with a mechanism to copy out-of-vocabulary words, obtains an accuracy of 71.8%.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. DebiasRAG: A Tuning-Free Path to Fair Generation in Large Language Models through Retrieval-Augmented Generation

    cs.CL 2026-05 unverdicted novelty 5.0

    DebiasRAG uses a three-stage RAG process to generate and rerank query-specific debiasing contexts that act as fairness constraints for LLM outputs.

  2. Short-term Electric Load Forecasting Using TensorFlow and Deep Auto-Encoders

    eess.SP 2019-07 unverdicted novelty 2.0

    A TensorFlow-based deep auto-encoder model is proposed for short-term electric load forecasting and claimed to outperform traditional neural networks in accuracy and stability.