Improbable bigrams expose vulnerabilities of incomplete tokens in byte-level tokenizers. arXiv preprint arXiv:2410.23684, 2024

Eugene Jang, Kimin Lee, Jin-Woo Chung, Keuntae Park, Seungwon Shin · 2024 · arXiv 2410.23684

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

representative citing papers

SPLIT: Cross-Lingual Empathy and Cultural Grounding in English and Ukrainian LLM Responses

cs.CL · 2026-07-02 · unverdicted · novelty 6.0

SPLIT benchmark finds Gemini-2.5-Flash and LLaMA-3.3-70B degrade in Ukrainian while DeepSeek-V3 stays stable, with weak human-AI agreement on cultural grounding.

citing papers explorer

Showing 1 of 1 citing paper.

SPLIT: Cross-Lingual Empathy and Cultural Grounding in English and Ukrainian LLM Responses

cs.CL · 2026-07-02 · unverdicted · none · ref 23

