pith. machine review for the scientific record.

arxiv: 1611.01587 · v5 · submitted 2016-11-05 · 💻 cs.CL · cs.AI

Recognition: unknown

A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks

Authors on Pith: no claims yet
classification 💻 cs.CL cs.AI
keywords: tasks · model · single · growing · joint · linguistic · many-task · other
Original abstract

Transfer and multi-task learning have traditionally focused on either a single source-target pair or very few, similar tasks. Ideally, the linguistic levels of morphology, syntax and semantics would benefit each other by being trained in a single model. We introduce a joint many-task model together with a strategy for successively growing its depth to solve increasingly complex tasks. Higher layers include shortcut connections to lower-level task predictions to reflect linguistic hierarchies. We use a simple regularization term to allow for optimizing all model weights to improve one task's loss without exhibiting catastrophic interference of the other tasks. Our single end-to-end model obtains state-of-the-art or competitive results on five different tasks from tagging, parsing, relatedness, and entailment tasks.
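The abstract's "simple regularization term" keeps shared weights close to their values after the previous task's training, so optimizing one task's loss does not catastrophically interfere with the others. A minimal NumPy sketch of that idea follows; the function and parameter names (`jmt_loss`, `delta`) are illustrative, not taken from the paper:

```python
import numpy as np

def jmt_loss(task_loss, params, prev_params, delta=1e-2):
    """Current task's loss plus a successive-regularization penalty:
    delta * squared L2 distance between the shared parameters and
    their snapshot from the previous task's training epoch."""
    penalty = delta * sum(
        np.sum((p - q) ** 2) for p, q in zip(params, prev_params)
    )
    return task_loss + penalty

# Example: one shared weight vector that has drifted from its snapshot.
params = [np.array([1.0, 2.0])]
prev_params = [np.array([1.0, 1.0])]
total = jmt_loss(0.5, params, prev_params, delta=0.1)
```

With a squared drift of 1.0 and `delta=0.1`, the penalty adds 0.1 to the task loss, gently anchoring shared weights without freezing them.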

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Multitask Prompted Training Enables Zero-Shot Task Generalization

    cs.LG 2021-10 conditional novelty 7.0

    Multitask fine-tuning of an encoder-decoder model on prompted datasets produces zero-shot generalization that often beats models up to 16 times larger on standard benchmarks.

  2. Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning

    cs.CL 2017-08 conditional novelty 7.0

    Seq2SQL uses deep learning plus reinforcement learning to generate SQL from natural language, reaching 59.4% execution accuracy on the new WikiSQL dataset of 80k examples.