pith. machine review for the scientific record.

arxiv: 1611.01587 · v5 · submitted 2016-11-05 · 💻 cs.CL · cs.AI

Recognition: unknown

A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks

Authors on Pith: no claims yet
classification 💻 cs.CL cs.AI
keywords: tasks · model · single · growing · joint · linguistic · many-task · other
Original abstract

Transfer and multi-task learning have traditionally focused on either a single source-target pair or very few, similar tasks. Ideally, the linguistic levels of morphology, syntax and semantics would benefit each other by being trained in a single model. We introduce a joint many-task model together with a strategy for successively growing its depth to solve increasingly complex tasks. Higher layers include shortcut connections to lower-level task predictions to reflect linguistic hierarchies. We use a simple regularization term to allow for optimizing all model weights to improve one task's loss without exhibiting catastrophic interference of the other tasks. Our single end-to-end model obtains state-of-the-art or competitive results on five different tasks from tagging, parsing, relatedness, and entailment tasks.
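The abstract's "simple regularization term" keeps shared weights close to their values after the previous task's training, so optimizing one task's loss does not catastrophically interfere with the others. A minimal NumPy sketch of that idea follows; the function and parameter names (`jmt_loss`, `delta`) are illustrative, not taken from the paper:

```python
import numpy as np

def jmt_loss(task_loss, params, prev_params, delta=1e-2):
    """Current task's loss plus a successive-regularization penalty:
    delta * squared L2 distance between the shared parameters and
    their snapshot from the previous task's training epoch."""
    penalty = delta * sum(
        np.sum((p - q) ** 2) for p, q in zip(params, prev_params)
    )
    return task_loss + penalty

# Example: one shared weight vector that has drifted from its snapshot.
params = [np.array([1.0, 2.0])]
prev_params = [np.array([1.0, 1.0])]
total = jmt_loss(0.5, params, prev_params, delta=0.1)
```

With a squared drift of 1.0 and `delta=0.1`, the penalty adds 0.1 to the task loss, gently anchoring shared weights without freezing them.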

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Multitask Prompted Training Enables Zero-Shot Task Generalization

    cs.LG 2021-10 conditional novelty 7.0

    Multitask fine-tuning of an encoder-decoder model on prompted datasets produces zero-shot generalization that often beats models up to 16 times larger on standard benchmarks.

  2. Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning

    cs.CL 2017-08 conditional novelty 7.0

    Seq2SQL uses deep learning plus reinforcement learning to generate SQL from natural language, reaching 59.4% execution accuracy on the new WikiSQL dataset of 80k examples.