GlotWeb: Web indexing for minority languages

Abdullah Al Sefat, Amir Hossein Kargaran, François Yvon, Hinrich Schütze · 2026 · arXiv 4904.379288

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

read on arXiv browse 1 citing papers

citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

GlotOCR Bench: OCR Models Still Struggle Beyond a Handful of Unicode Scripts

cs.CL · 2026-04-14 · unverdicted · novelty 7.0

GlotOCR Bench shows that OCR models perform well on fewer than 10 scripts and fail to generalize beyond about 30, with results tracking pretraining coverage and models hallucinating from known scripts on unfamiliar ones.

citing papers explorer

Showing 1 of 1 citing paper.

GlotOCR Bench: OCR Models Still Struggle Beyond a Handful of Unicode Scripts cs.CL · 2026-04-14 · unverdicted · none · ref 52
GlotOCR Bench shows that OCR models perform well on fewer than 10 scripts and fail to generalize beyond about 30, with results tracking pretraining coverage and models hallucinating from known scripts on unfamiliar ones.

GlotWeb: Web indexing for minority languages

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer