W iki M atrix: Mining 135 M Parallel Sentences in 1620 Language Pairs from W ikipedia

Holger Schwenk, Vishrav Chaudhary, Shuo Sun, Hongyu Gong · 2021 · DOI 10.18653/v1/2021.eacl-main.115

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

open at publisher browse 2 citing papers

representative citing papers

Knowledge Beyond Language: Bridging the Gap in Multilingual Machine Unlearning Evaluation

cs.CL · 2026-05-14 · unverdicted · novelty 7.0

New metrics KSS and KPS are introduced to evaluate multilingual machine unlearning quality and cross-language consistency in LLMs, addressing limitations of single-language evaluation protocols.

How Good is Your Wikipedia? Auditing Data Quality for Low-resource and Multilingual NLP

cs.CL · 2024-11-08 · unverdicted · novelty 6.0

The study filters non-English Wikipedia, reveals quality problems, proposes a 4-level ranking, and shows filtered data matches or beats raw data in language modeling with largest gains for lower-quality editions.

citing papers explorer

Showing 2 of 2 citing papers.

Knowledge Beyond Language: Bridging the Gap in Multilingual Machine Unlearning Evaluation cs.CL · 2026-05-14 · unverdicted · none · ref 28
New metrics KSS and KPS are introduced to evaluate multilingual machine unlearning quality and cross-language consistency in LLMs, addressing limitations of single-language evaluation protocols.
How Good is Your Wikipedia? Auditing Data Quality for Low-resource and Multilingual NLP cs.CL · 2024-11-08 · unverdicted · none · ref 55
The study filters non-English Wikipedia, reveals quality problems, proposes a 4-level ranking, and shows filtered data matches or beats raw data in language modeling with largest gains for lower-quality editions.

W iki M atrix: Mining 135 M Parallel Sentences in 1620 Language Pairs from W ikipedia

fields

years

verdicts

representative citing papers

citing papers explorer