ForMaT is a new parallel corpus of 3,956 PDFs across 15 language pairs that preserves original layout metadata and serves as a benchmark for visually-grounded multilingual translation.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
ForMaT: Dataset for Visually-Grounded Multilingual PDF Translation