Offline bilingual word vectors, orthogonal transformations and the inverted softmax

David H. P. Turban; Nils Y. Hammerla; Samuel L. Smith; Steven Hamblin

Not yet reviewed by Pith; the record is open.

arXiv 1702.03859 v1 pith:ERNGIL7Z submitted 2017-02-13 cs.CL cs.AI cs.IR

Samuel L. Smith, David H. P. Turban, Steven Hamblin, Nils Y. Hammerla

classification cs.CL cs.AI cs.IR

keywords transformation, bilingual, orthogonal, precision, english, expert, inverted, italian

Abstract

Usually bilingual word vectors are trained "online". Mikolov et al. showed they can also be found "offline", whereby two pre-trained embeddings are aligned with a linear transformation, using dictionaries compiled from expert knowledge. In this work, we prove that the linear transformation between two spaces should be orthogonal. This transformation can be obtained using the singular value decomposition. We introduce a novel "inverted softmax" for identifying translation pairs, with which we improve the precision @1 of Mikolov's original mapping from 34% to 43%, when translating a test set composed of both common and rare English words into Italian. Orthogonal transformations are more robust to noise, enabling us to learn the transformation without expert bilingual signal by constructing a "pseudo-dictionary" from the identical character strings which appear in both languages, achieving 40% precision on the same test set. Finally, we extend our method to retrieve the true translations of English sentences from a corpus of 200k Italian sentences with a precision @1 of 68%.
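
The two techniques named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code: the function names (`normalize_rows`, `orthogonal_map`, `inverted_softmax`), the inverse temperature `beta`, and the synthetic data are all assumptions made for illustration. The orthogonal map is the standard SVD solution of the orthogonal Procrustes problem, and the inverted softmax is taken here to mean that similarities are normalized over source words rather than over target words before the best target is picked.

```python
import numpy as np


def normalize_rows(M):
    """Scale every row of M to unit Euclidean length."""
    return M / np.linalg.norm(M, axis=1, keepdims=True)


def orthogonal_map(X, Y):
    """Orthogonal W minimizing ||X @ W - Y||_F (orthogonal Procrustes).

    X and Y hold the dictionary pairs row by row: X[k] is the source
    vector whose translation is Y[k].
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt


def inverted_softmax(S, beta=10.0):
    """Column-normalized softmax of a similarity matrix.

    S[i, j] is the similarity of source word i to target word j.
    A conventional softmax would normalize each row (over targets);
    here each column is normalized over source words instead.
    beta is a hypothetical inverse temperature.
    """
    Z = beta * S
    Z = Z - Z.max(axis=0, keepdims=True)  # subtract column max for stability
    E = np.exp(Z)
    return E / E.sum(axis=0, keepdims=True)


# Synthetic, noise-free toy data (not the paper's English-Italian set):
# the "source" space is an orthogonally rotated copy of the "target" space.
rng = np.random.default_rng(0)
n, d = 200, 50
Y = normalize_rows(rng.normal(size=(n, d)))    # target vectors
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))   # hidden orthogonal matrix
X = Y @ Q.T                                    # source vectors, so X @ Q == Y

W = orthogonal_map(X, Y)        # recovered transformation
S = (X @ W) @ Y.T               # cosine similarities after mapping
P = inverted_softmax(S)
pred = P.argmax(axis=1)         # predicted target index for each source word
precision_at_1 = float((pred == np.arange(n)).mean())
print("precision@1 on toy data:", precision_at_1)
```

On real embeddings the dictionary rows would come from a bilingual lexicon (or, as the abstract describes, from identical strings shared by both vocabularies), and `beta` would need tuning on held-out pairs.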

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Unsupervised Adversarial Graph Alignment with Graph Embedding
cs.SI 2019-07 unverdicted novelty 6.0

UAGA aligns two graph embedding spaces via adversarial training in a fully unsupervised setting, with an incremental extension iUAGA that uses discovered pseudo-anchors to refine both embeddings and alignments.

Domain Fine-Tuning FinBERT on Finnish Histopathological Reports: Train-Time Signals and Downstream Correlations
cs.CL 2026-04 unverdicted novelty 4.0

Fine-tuning FinBERT on Finnish medical text produces embedding geometry shifts whose correlation with downstream performance the authors attempt to measure as a potential early signal for domain adaptation benefit.

Cross-lingual Data Transformation and Combination for Text Classification
cs.IR 2019-06 unverdicted novelty 3.0

Cross-lingual data combined via translation or aligned embeddings can improve performance of CNN and RNN text classifiers.