The curious case of neural text degeneration

Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, Yejin Choi · 2020

3 Pith papers cite this work.

representative citing papers

Vector-quantized Image Modeling with Improved VQGAN

cs.CV · 2021-10-09 · accept · novelty 6.0

Improved ViT-VQGAN enables autoregressive Transformer pretraining on ImageNet tokens to reach IS 175.1 and FID 4.17 for generation plus 73.2% linear-probe accuracy, beating prior iGPT models.

StarCoder: may the source be with you!

cs.CL · 2023-05-09 · accept · novelty 5.0

StarCoderBase matches or beats OpenAI's code-cushman-001 on multi-language code benchmarks; the Python-fine-tuned StarCoder reaches 40% pass@1 on HumanEval while retaining other-language performance.

MiniGPT: Rebuilding GPT from First Principles

cs.CL · 2026-05-17 · conditional · novelty 2.0

MiniGPT is a self-contained PyTorch implementation of standard GPT autoregressive modeling that reaches 1.478 validation loss on Tiny Shakespeare with a 10.77M-parameter model and produces recognizable Shakespeare-style text.

citing papers explorer

Showing 3 of 3 citing papers.

Vector-quantized Image Modeling with Improved VQGAN cs.CV · 2021-10-09 · accept · none · ref 35
StarCoder: may the source be with you! cs.CL · 2023-05-09 · accept · none · ref 44
MiniGPT: Rebuilding GPT from First Principles cs.CL · 2026-05-17 · conditional · none · ref 57
