How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models

Phillip Rust, Jonas Pfeiffer, Ivan Vulić, Sebastian Ruder, Iryna Gurevych · 2021 · DOI 10.18653/v1/2021.acl-

2 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Decoupling the Benefits of Subword Tokenization for Language Model Training via Byte-level Simulation

cs.CL · 2026-04-29 · unverdicted · novelty 6.0 · 2 refs

Byte-level simulations show subword tokenization improves LLM training mainly via increased throughput and boundary priors.

Carbon-Taxed Transformers: A Green Compression Pipeline for Overgrown Language Models

cs.SE · 2026-04-28 · unverdicted · novelty 4.0

CTT is a compression pipeline for LLMs that achieves up to 49x memory reduction, 10x faster inference, 81% lower CO2 emissions, and retains 68-98% accuracy on code clone detection, summarization, and generation tasks.

citing papers explorer


Decoupling the Benefits of Subword Tokenization for Language Model Training via Byte-level Simulation

cs.CL · 2026-04-29 · unverdicted · none · ref 29 · 2 links

Byte-level simulations show subword tokenization improves LLM training mainly via increased throughput and boundary priors.

Carbon-Taxed Transformers: A Green Compression Pipeline for Overgrown Language Models

cs.SE · 2026-04-28 · unverdicted · none · ref 29

CTT is a compression pipeline for LLMs that achieves up to 49x memory reduction, 10x faster inference, 81% lower CO2 emissions, and retains 68-98% accuracy on code clone detection, summarization, and generation tasks.
