pith. sign in

How can we effectively expand the vocabulary of LLMs with 0.01 GB of target language text? Computational Linguistics, pp.\ 1--40, 11 2025 b

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

fields

cs.CL 3

years

2026 2 2025 1

clear filters

representative citing papers

MultiHashFormer: Hash-based Generative Language Models

cs.CL · 2026-06-26 · unverdicted · novelty 7.0

MultiHashFormer enables hash-based autoregression in LMs by encoding tokens as multi-hash signatures, outperforming standard Transformers at 100M-3B scales while keeping parameter count constant for multilingual expansion.

citing papers explorer

Showing 2 of 2 citing papers after filters.

  • MultiHashFormer: Hash-based Generative Language Models cs.CL · 2026-06-26 · unverdicted · none · ref 55

    MultiHashFormer enables hash-based autoregression in LMs by encoding tokens as multi-hash signatures, outperforming standard Transformers at 100M-3B scales while keeping parameter count constant for multilingual expansion.

  • Multilinguality of Large Language Models From a Structural Perspective cs.CL · 2026-06-01 · unverdicted · none · ref 39

    Low-resource languages are structurally more different from English in LLMs than high- or mid-resource ones, and language-specific post-training alters structures while preserving inter-language relationships.