Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath

Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel A. U. Bacchiani, Thomas B. Jablin, Rob Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon

classification cs.LG, stat.ML

keywords framework, lingvo, models, modular, offering, research, sequence-to-sequence, advanced

Abstract

Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus on sequence-to-sequence models. Lingvo models are composed of modular building blocks that are flexible and easily extensible, and experiment configurations are centralized and highly customizable. Distributed training and quantized inference are supported directly within the framework, and it contains existing implementations of a large number of utilities, helper functions, and the newest research ideas. Lingvo has been used in collaboration by dozens of researchers in more than 20 papers over the last two years. This document outlines the underlying design of Lingvo and serves as an introduction to the various pieces of the framework, while also offering examples of advanced features that showcase the capabilities of the framework.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
cs.CV 2022-06 unverdicted novelty 6.0

Scaling an autoregressive Transformer to 20B parameters for text-to-image generation using image token sequences achieves new SOTA zero-shot FID of 7.23 and fine-tuned FID of 3.22 on MS-COCO.

LaMDA: Language Models for Dialog Applications
cs.CL 2022-01 unverdicted novelty 6.0

LaMDA shows that fine-tuning on human-value annotations and consulting external knowledge sources significantly improves safety and factual grounding in large dialog models beyond what scaling alone achieves.

GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
cs.CL 2020-06 unverdicted novelty 6.0

GShard supplies automatic sharding and conditional computation support that enabled training a 600-billion-parameter multilingual translation model on thousands of TPUs with superior quality.