Semantic Drift in Multilingual Representations
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
method: 1
citation-polarity summary
fields: cs.CL (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
roles: method (1)
polarities: use method (1)
representative citing papers
Word2Vec on Toki Pona shows distributional patterns suffice for semantic structure even at extreme vocabulary reduction, and incidental non-core tokens tighten rather than disrupt clusters.
citing papers explorer
- Is Textual Similarity Invariant under Machine Translation? Evidence Based on the Political Manifesto Corpus
Using a new non-inferiority testing framework, the paper finds that machine translation preserves embedding similarity structure for ten languages in the Manifesto Corpus but distorts it for four.
- Examining the Limits of Word2Vec with Toki Pona
Word2Vec on Toki Pona shows distributional patterns suffice for semantic structure even at extreme vocabulary reduction, and incidental non-core tokens tighten rather than disrupt clusters.