arxiv: 1904.01038 · v1 · pith:26LC64CLnew · submitted 2019-04-01 · 💻 cs.CL

fairseq: A Fast, Extensible Toolkit for Sequence Modeling

keywords: modeling, toolkit, fairseq, fast, GPUs, sequence, training, across

fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks. The toolkit is based on PyTorch and supports distributed training across multiple GPUs and machines. We also support fast mixed-precision training and inference on modern GPUs. A demo video can be found at https://www.youtube.com/watch?v=OtgDdWtHvto

Forward citations

Cited by 6 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale

    cs.LG 2022-08 conditional novelty 7.0

    LLM.int8() performs 8-bit inference for transformers up to 175B parameters with no accuracy loss by combining vector-wise quantization for most features with 16-bit mixed-precision handling of systematic outlier dimensions.

  2. InCoder: A Generative Model for Code Infilling and Synthesis

    cs.SE 2022-04 unverdicted novelty 7.0

    InCoder is the first generative model to directly perform zero-shot code infilling via bidirectional context from a masked-then-appended training scheme, matching left-to-right models on synthesis while improving on t...

  3. Enabling Global, Human-Centered Explanations for LLMs: From Tokens to Interpretable Code and Test Generation

    cs.SE 2025-03 unverdicted novelty 6.0

    CodeQ aggregates token rationales into code categories to enable global interpretability of LLMs, claiming over 50% entropy reduction and revealing model preference for syntactic cues plus human misalignment in a 37-p...

  4. HuggingFace's Transformers: State-of-the-art Natural Language Processing

    cs.CL 2019-10 accept novelty 6.0

    Hugging Face releases an open-source Python library that supplies a unified API and pretrained weights for major Transformer architectures used in natural language processing.

  5. Gated Memory Policy

    cs.RO 2026-04 unverdicted novelty 5.0

    GMP selectively activates and represents memory via a gate and lightweight cross-attention, yielding 30.1% higher success on non-Markovian robotic tasks while staying competitive on Markovian ones.

  6. Video-guided Machine Translation with Global Video Context

    cs.CV 2026-04 unverdicted novelty 4.0

    A globally video-guided multimodal translation framework retrieves semantically related video segments with a vector database and applies attention mechanisms to improve subtitle translation accuracy in long videos.