arxiv: 1906.10282 · v2 · pith:C6DF4TEGnew · submitted 2019-06-25 · 💻 cs.CL

Saliency-driven Word Alignment Interpretation for Neural Machine Translation

Pith reviewed 2026-05-25 17:13 UTC · model grok-4.3

classification 💻 cs.CL
keywords neural machine translation, word alignment, saliency, interpretability, Transformer, force decoding, alignment quality

The pith

NMT models learn interpretable word alignments that saliency methods can extract.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that neural machine translation models, including Transformers, do learn word alignments even though they are often perceived as not doing so. These alignments become visible only through saliency-based interpretation techniques, which measure how much each source word influences the prediction of each target word. The methods work without changing the model and apply in both force decoding and free decoding. If correct, this means alignment information is already present in standard NMT training and can be recovered post hoc for analysis or use.

Core claim

NMT models learn interpretable word alignments, revealed by saliency-driven interpretation methods. Under force decoding, these alignments exceed fast-align quality for some systems, and in free decoding they align well with automatic tools. The methods are model-agnostic and require no parameter updates.

What carries the argument

Saliency scores that quantify the contribution of each source word to the model's output predictions for target words.
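As a rough illustration of the idea, the sketch below computes a gradient-style saliency matrix for a toy scoring function and reads a hard alignment off it. Everything here is an assumption made for illustration: `toy_score`, `TOY_WEIGHTS`, the finite-difference gradient, and the L1 aggregation over embedding dimensions are hypothetical stand-ins, not the paper's model or its exact saliency formulation.

```python
import math

def numeric_gradient(f, x, eps=1e-5):
    """Central finite-difference gradient of a scalar function f at vector x."""
    grad = []
    for k in range(len(x)):
        up = x[:k] + [x[k] + eps] + x[k + 1:]
        down = x[:k] + [x[k] - eps] + x[k + 1:]
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

def saliency_matrix(score_fn, src_embs, n_tgt):
    """saliency[t][i]: L1 norm of the gradient of target t's score
    with respect to the embedding of source word i."""
    matrix = []
    for t in range(n_tgt):
        row = []
        for i in range(len(src_embs)):
            def score_of_word(vec, t=t, i=i):
                # Swap in a perturbed embedding for source word i only.
                embs = src_embs[:i] + [vec] + src_embs[i + 1:]
                return score_fn(t, embs)
            grad = numeric_gradient(score_of_word, src_embs[i])
            row.append(sum(abs(g) for g in grad))
        matrix.append(row)
    return matrix

def hard_alignment(matrix):
    """Align each target position to its most salient source position."""
    return [max(range(len(row)), key=row.__getitem__) for row in matrix]

# Hypothetical toy "model": target t mixes the source words with fixed weights.
TOY_WEIGHTS = [[0.1, 0.8, 0.1], [0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]

def toy_score(t, src_embs):
    return sum(w * math.tanh(sum(vec)) for w, vec in zip(TOY_WEIGHTS[t], src_embs))

src_embs = [[0.1, 0.2], [0.2, -0.1], [-0.1, 0.1]]
alignment = hard_alignment(saliency_matrix(toy_score, src_embs, n_tgt=3))
print(alignment)  # [1, 0, 2]
```

Because the score function is treated as a black box, the same loop would apply to any model that exposes a per-target score, which is the sense in which such methods need no parameter update or architectural change. A real system would use automatic differentiation rather than finite differences.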

Load-bearing premise

Saliency scores accurately reflect the word alignment information learned by the model rather than unrelated computational effects.

What would settle it

The claim would be refuted if saliency-based alignments agreed with human or gold alignments no better than random baselines do, with fast-align results as the reference point for comparison.
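A minimal sketch of such a check, assuming Alignment Error Rate (AER) as the agreement measure; the text above does not name a metric, so that choice is an assumption. The sentence pair, the gold links, and the "saliency" links are all made up for illustration.

```python
import random

def aer(hyp, sure, possible):
    """Alignment Error Rate: 1 - (|A∩S| + |A∩P|) / (|A| + |S|). Lower is better."""
    possible = possible | sure  # sure links count as possible links too
    return 1.0 - (len(hyp & sure) + len(hyp & possible)) / (len(hyp) + len(sure))

def random_alignment(n_src, n_tgt, rng):
    """Baseline: link every target position to a uniformly random source position."""
    return {(rng.randrange(n_src), t) for t in range(n_tgt)}

# Hypothetical 3x3 sentence pair; links are (source_index, target_index).
gold_sure = {(0, 0), (1, 1), (2, 2)}
gold_possible = {(1, 2)}
saliency_links = {(0, 0), (1, 1), (1, 2)}  # made-up system output

rng = random.Random(0)
baseline = [aer(random_alignment(3, 3, rng), gold_sure, gold_possible)
            for _ in range(1000)]
saliency_aer = aer(saliency_links, gold_sure, gold_possible)
baseline_aer = sum(baseline) / len(baseline)
print(f"saliency AER {saliency_aer:.3f}, mean random AER {baseline_aer:.3f}")
```

If the saliency-derived AER were not clearly below the random-baseline AER on real data, the load-bearing premise would be in trouble.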

Figures

Figures reproduced from arXiv: 1906.10282 by Hainan Xu, Philipp Koehn, Shuoyang Ding.

Figure 1: Comparison of our saliency-based word alignment interpretation of convolutional NMT model with reference and attention interpretation.
Figure 2: Saliency interpretation of FConv de-en model
Figure 3: Saliency interpretation of Transformer de-en mod…
Original abstract

Despite their original goal to jointly learn to align and translate, Neural Machine Translation (NMT) models, especially Transformer, are often perceived as not learning interpretable word alignments. In this paper, we show that NMT models do learn interpretable word alignments, which could only be revealed with proper interpretation methods. We propose a series of such methods that are model-agnostic, are able to be applied either offline or online, and do not require parameter update or architectural change. We show that under the force decoding setup, the alignments induced by our interpretation method are of better quality than fast-align for some systems, and when performing free decoding, they agree well with the alignments induced by automatic alignment tools.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that NMT models (including Transformers) learn interpretable word alignments that can be recovered using a series of model-agnostic saliency-based interpretation methods applicable offline or online without parameter updates or architectural changes. Under forced decoding the induced alignments are reported to exceed fast-align quality for some systems; under free decoding they agree well with automatic alignment tools.

Significance. If the saliency methods are shown to isolate alignments actually learned by the model rather than gradient or perturbation artifacts, the result would be significant for NMT interpretability research by providing a practical way to inspect alignments in pre-trained models. The model-agnostic and no-retraining design is a clear strength that enables direct application to existing systems.

major comments (2)
  1. [Abstract / Experimental Results] Abstract and experimental sections: the central claim that saliency scores recover alignments the model has learned (rather than gradient saturation, decoder-state dependencies, or input-normalization artifacts) is load-bearing, yet the manuscript provides no ablations, random baselines, or controls that would distinguish these possibilities. This directly affects the force-decoding and free-decoding comparisons.
  2. [Experimental Results] The reported superiority over fast-align under forced decoding and agreement with automatic tools under free decoding lacks details on statistical significance testing, variance across multiple runs, or dataset-specific breakdowns, making it impossible to assess whether the differences are robust.
minor comments (2)
  1. [Methods] Notation for the different saliency variants (model-agnostic offline vs. online) should be introduced with explicit equations or pseudocode early in the methods section to improve readability.
  2. [Related Work] The paper should include a short related-work subsection contrasting the proposed saliency approach with prior gradient- or attention-based alignment extraction methods.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below, agreeing where the manuscript is lacking and outlining planned revisions.

  1. Referee: [Abstract / Experimental Results] Abstract and experimental sections: the central claim that saliency scores recover alignments the model has learned (rather than gradient saturation, decoder-state dependencies, or input-normalization artifacts) is load-bearing, yet the manuscript provides no ablations, random baselines, or controls that would distinguish these possibilities. This directly affects the force-decoding and free-decoding comparisons.

    Authors: We acknowledge that the manuscript does not include explicit ablations, random baselines, or controls to isolate learned alignments from potential artifacts such as gradient saturation or decoder-state dependencies. While the reported comparisons to fast-align and automatic tools provide supporting evidence, they do not fully rule out these alternatives. In the revised version we will add random saliency baselines and targeted controls for decoder dependencies and input normalization. revision: yes

  2. Referee: [Experimental Results] The reported superiority over fast-align under forced decoding and agreement with automatic tools under free decoding lacks details on statistical significance testing, variance across multiple runs, or dataset-specific breakdowns, making it impossible to assess whether the differences are robust.

    Authors: The current results are presented as averages without statistical tests, variance, or per-dataset breakdowns. We will incorporate bootstrap significance testing, report standard deviations from multiple runs where feasible, and add dataset-specific result tables in the revision to allow assessment of robustness. revision: yes
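The bootstrap test promised in the second response could look roughly like the sketch below: a paired bootstrap over sentences, comparing corpus-level AER of two systems. This is a generic sketch under stated assumptions, not the authors' procedure; the per-sentence statistics, the function names, and the choice of AER as the metric are all hypothetical.

```python
import random

def corpus_aer(stats):
    """Corpus AER from per-sentence (matches, size) pairs, where
    matches = |A∩S| + |A∩P| and size = |A| + |S|."""
    matches = sum(m for m, _ in stats)
    size = sum(s for _, s in stats)
    return 1.0 - matches / size

def paired_bootstrap(stats_a, stats_b, n_resamples=1000, seed=0):
    """Fraction of resampled test sets on which system A has lower AER than B."""
    assert len(stats_a) == len(stats_b)
    rng = random.Random(seed)
    n = len(stats_a)
    wins = 0
    for _ in range(n_resamples):
        # Resample sentence indices with replacement, same indices for both systems.
        idx = [rng.randrange(n) for _ in range(n)]
        if corpus_aer([stats_a[i] for i in idx]) < corpus_aer([stats_b[i] for i in idx]):
            wins += 1
    return wins / n_resamples

# Made-up per-sentence statistics for two systems on the same 20 sentences.
stats_a = [(5, 6), (4, 6), (6, 6), (5, 6)] * 5
stats_b = [(3, 6), (4, 6), (4, 6), (3, 6)] * 5

win_rate = paired_bootstrap(stats_a, stats_b)
print(f"A beats B on {win_rate:.1%} of resamples")
```

A win rate near 1.0 would suggest the difference is unlikely to be an artifact of which sentences happened to be in the test set; variance across training runs would still need to be reported separately.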

Circularity Check

0 steps flagged

No circularity: methods applied to pre-trained models with external comparisons

The paper applies saliency-based interpretation techniques to existing pre-trained NMT models and evaluates the resulting alignments against independent external tools (fast-align and automatic aligners). No equations, parameters, or central claims are defined in terms of the paper's own outputs or fitted values. The derivation chain consists of standard gradient/perturbation computations followed by post-hoc comparison, with no self-definitional steps, fitted-input predictions, or load-bearing self-citations that reduce the result to the input by construction. This is the expected non-circular outcome for an interpretation study on fixed models.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that saliency-based scores faithfully reflect learned alignments; no free parameters or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption: Saliency methods applied to NMT models extract meaningful word alignment information
    The paper's claim that alignments can be revealed with proper interpretation methods depends on this assumption about the validity of saliency for alignment extraction.

Reference graph

Works this paper leans on

46 extracted references · 46 canonical work pages · 5 internal anchors

  1. [1]

    Tamer Alkhouli, Gabriel Bretschner, and Hermann Ney. 2018. https://aclanthology.info/papers/W18-6318/w18-6318 On the alignment problem in multi-head attention-based neural machine translation . In Proceedings of the Third Conference on Machine Translation: Research Papers, WMT 2018, Belgium, Brussels, October 31 - November 1, 2018 , pages 177--185

  2. [2]

    Tamer Alkhouli, Gabriel Bretschner, Jan-Thorsten Peter, Mohammed Hethnawi, Andreas Guta, and Hermann Ney. 2016. http://aclweb.org/anthology/W/W16/W16-2206.pdf Alignment-based neural machine translation . In Proceedings of the First Conference on Machine Translation, WMT 2016, colocated with ACL 2016, August 11-12, Berlin, Germany , pages 54--65

  3. [3]

    Mihael Arcan, Marco Turchi, Sara Tonelli, and Paul Buitelaar. 2014. Enhancing statistical machine translation with bilingual terminology in a cat environment. In Proceedings of the 11th Biennial Conference of the Association for Machine Translation in the Americas (AMTA 2014), pages 54--68

  4. [4]

    Sebastian Bach, Alexander Binder, Grégoire Montavon, Frederick Klauschen, Klaus-Robert Müller, and Wojciech Samek. 2015. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PloS one, 10(7):e0130140

  5. [5]

    Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. http://arxiv.org/abs/1409.0473 Neural machine translation by jointly learning to align and translate . CoRR, abs/1409.0473

  6. [6]

    Gosse Bouma and Yannick Parmentier, editors. 2014. http://aclweb.org/anthology/E/E14/ Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2014, April 26-30, 2014, Gothenburg, Sweden . The Association for Computer Linguistics

  7. [7]

    Peter F. Brown, Stephen Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. 1993. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19(2):263--311

  8. [8]

    William Chan, Navdeep Jaitly, Quoc V. Le, and Oriol Vinyals. 2016. https://doi.org/10.1109/ICASSP.2016.7472621 Listen, attend and spell: A neural network for large vocabulary conversational speech recognition . In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016, Shanghai, China, March 20-25, 2016 , pages 4960--4964

  9. [9]

    Huadong Chen, Shujian Huang, David Chiang, and Jiajun Chen. 2017. https://doi.org/10.18653/v1/P17-1177 Improved neural machine translation with a syntax-aware encoder and decoder . In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, Volume 1: Long Papers , pages 1936--1945

  10. [10]

    Jan Chorowski, Dzmitry Bahdanau, Dmitriy Serdyuk, Kyunghyun Cho, and Yoshua Bengio. 2015. http://papers.nips.cc/paper/5847-attention-based-models-for-speech-recognition Attention-based models for speech recognition . In Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7-12, 201...

  11. [11]

    Yanzhuo Ding, Yang Liu, Huanbo Luan, and Maosong Sun. 2017. https://doi.org/10.18653/v1/P17-1106 Visualizing and understanding neural machine translation . In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, Volume 1: Long Papers , pages 1150--1159

  12. [12]

    Chris Dyer, Victor Chahuneau, and Noah A. Smith. 2013. http://aclweb.org/anthology/N/N13/N13-1073.pdf A simple, fast, and effective reparameterization of IBM model 2 . In Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, June 9-14, 2013, Westin Peachtree Plaza Hotel, Atlanta...

  13. [13]

    Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, and Yann N. Dauphin. 2017. http://proceedings.mlr.press/v70/gehring17a.html Convolutional sequence to sequence learning . In Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6-11 August 2017 , pages 1243--1252

  14. [14]

    Hamidreza Ghader and Christof Monz. 2017. https://aclanthology.info/papers/I17-1004/i17-1004 What does attention in neural machine translation pay attention to? In Proceedings of the Eighth International Joint Conference on Natural Language Processing, IJCNLP 2017, Taipei, Taiwan, November 27 - December 1, 2017 - Volume 1: Long Papers , pages 30--39

  15. [15]

    Reza Ghaeini, Xiaoli Z. Fern, and Prasad Tadepalli. 2018. https://aclanthology.info/papers/D18-1537/d18-1537 Interpreting recurrent and attention-based neural models: a case study on natural language inference . In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31 - November 4, 2018, page...

  16. [16]

    Eva Hasler, Adrià de Gispert, Gonzalo Iglesias, and Bill Byrne. 2018. https://aclanthology.info/papers/N18-2081/n18-2081 Neural machine translation decoding with terminology constraints . In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, New Orl...

  17. [17]

    Philipp Koehn and Rebecca Knowles. 2017. https://aclanthology.info/papers/W17-3204/w17-3204 Six challenges for neural machine translation . In Proceedings of the First Workshop on Neural Machine Translation, NMT@ACL 2017, Vancouver, Canada, August 4, 2017, pages 28--39

  18. [18]

    Philipp Koehn, Franz Josef Och, and Daniel Marcu. 2003. http://aclweb.org/anthology/N/N03/N03-1017.pdf Statistical phrase-based translation . In Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, HLT-NAACL 2003, Edmonton, Canada, May 27 - June 1, 2003

  19. [19]

    Jaesong Lee, Joong-Hwi Shin, and Jun-Seok Kim. 2017. https://aclanthology.info/papers/D17-2021/d17-2021 Interactive visualization and manipulation of attention-based neural machine translation . In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017, Copenhagen, Denmark, September 9-11, 2017 - System Demo...

  20. [20]

    Joël Legrand, Michael Auli, and Ronan Collobert. 2016. http://aclweb.org/anthology/W/W16/W16-2207.pdf Neural network-based word alignment through score aggregation . In Proceedings of the First Conference on Machine Translation, WMT 2016, colocated with ACL 2016, August 11-12, Berlin, Germany , pages 66--73

  21. [21]

    Jiwei Li, Xinlei Chen, Eduard H. Hovy, and Dan Jurafsky. 2016. http://aclweb.org/anthology/N/N16/N16-1082.pdf Visualizing and understanding neural models in NLP . In NAACL HLT 2016, The 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego California, USA, June 12-17, 2016 , ...

  22. [22]

    Lemao Liu, Masao Utiyama, Andrew M. Finch, and Eiichiro Sumita. 2016. http://aclweb.org/anthology/C/C16/C16-1291.pdf Neural machine translation with supervised attention . In COLING 2016, 26th International Conference on Computational Linguistics, Proceedings of the Conference: Technical Papers, December 11-16, 2016, Osaka, Japan , pages 3093--3102

  23. [23]

    Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. http://aclweb.org/anthology/D/D15/D15-1166.pdf Effective approaches to attention-based neural machine translation . In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, September 17-21, 2015 , pages 1412--1421

  24. [24]

    V Menon. 2015. Salience network

  25. [25]

    Haitao Mi, Zhiguo Wang, and Abe Ittycheriah. 2016. http://aclweb.org/anthology/D/D16/D16-1249.pdf Supervised attentions for neural machine translation . In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016, Austin, Texas, USA, November 1-4, 2016 , pages 2283--2288

  26. [26]

    Grégoire Montavon, Wojciech Samek, and Klaus-Robert Müller. 2018. https://doi.org/10.1016/j.dsp.2017.10.011 Methods for interpreting and understanding deep neural networks . Digital Signal Processing, 73:1--15

  27. [27]

    Toan Q. Nguyen and David Chiang. 2018. https://aclanthology.info/papers/N18-1031/n18-1031 Improving lexical choice in neural machine translation . In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2018, New Orleans, Louisiana, USA, June 1-6, 2018, Vo...

  28. [28]

    Franz Josef Och and Hermann Ney. 2003. https://doi.org/10.1162/089120103321337421 A systematic comparison of various statistical alignment models . Computational Linguistics, 29(1):19--51

  29. [29]

    Ankur P. Parikh, Oscar Täckström, Dipanjan Das, and Jakob Uszkoreit. 2016. http://aclweb.org/anthology/D/D16/D16-1244.pdf A decomposable attention model for natural language inference . In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016, Austin, Texas, USA, November 1-4, 2016 , pages 2249--2255

  30. [30]

    Alessandro Raganato and Jörg Tiedemann. 2018. https://aclanthology.info/papers/W18-5431/w18-5431 An analysis of encoder representations in transformer-based machine translation . In Proceedings of the Workshop: Analyzing and Interpreting Neural Networks for NLP, BlackboxNLP@EMNLP 2018, Brussels, Belgium, November 1, 2018, pages 287--297

  31. [31]

    Alexander M. Rush, Sumit Chopra, and Jason Weston. 2015. http://aclweb.org/anthology/D/D15/D15-1044.pdf A neural attention model for abstractive sentence summarization . In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, September 17-21, 2015 , pages 379--389

  32. [32]

    Abigail See, Peter J. Liu, and Christopher D. Manning. 2017. https://doi.org/10.18653/v1/P17-1099 Get to the point: Summarization with pointer-generator networks . In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, Volume 1: Long Papers , pages 1073--1083

  33. [33]

    Richard M Shiffrin and Walter Schneider. 1977a. Controlled and automatic human information processing: Ii. perceptual learning, automatic attending and a general theory. Psychological review, 84(2):127

  34. [34]

    Richard M Shiffrin and Walter Schneider. 1977b. Controlled and automatic human information processing: Ii. perceptual learning, automatic attending and a general theory. Psychological review, 84(2):127

  35. [35]

    Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. 2013. http://arxiv.org/abs/1312.6034 Deep inside convolutional networks: Visualising image classification models and saliency maps . CoRR, abs/1312.6034

  36. [36]

    Daniel Smilkov, Nikhil Thorat, Been Kim, Fernanda B. Viégas, and Martin Wattenberg. 2017. http://arxiv.org/abs/1706.03825 Smoothgrad: removing noise by adding noise . CoRR, abs/1706.03825

  37. [37]

    Jost Tobias Springenberg, Alexey Dosovitskiy, Thomas Brox, and Martin A. Riedmiller. 2014. http://arxiv.org/abs/1412.6806 Striving for simplicity: The all convolutional net . CoRR, abs/1412.6806

  38. [38]

    Mukund Sundararajan, Ankur Taly, and Qiqi Yan. 2017. http://proceedings.mlr.press/v70/sundararajan17a.html Axiomatic attribution for deep networks . In Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6-11 August 2017 , pages 3319--3328

  39. [39]

    Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks Sequence to sequence learning with neural networks . In Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, December 8-13 2014, Montreal, Quebec, Canada,...

  40. [40]

    Gongbo Tang, Mathias Müller, Annette Rios, and Rico Sennrich. 2018a. https://aclanthology.info/papers/D18-1458/d18-1458 Why self-attention? A targeted evaluation of neural machine translation architectures . In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31 - November 4, 2018, p...

  41. [41]

    Gongbo Tang, Rico Sennrich, and Joakim Nivre. 2018b. https://aclanthology.info/papers/W18-6304/w18-6304 An analysis of attention mechanisms: The case of word sense disambiguation in neural machine translation . In Proceedings of the Third Conference on Machine Translation: Research Papers, WMT 2018, Belgium, Brussels, October 31 - November 1, 2018 , pag...

  42. [42]

    Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. http://papers.nips.cc/paper/7181-attention-is-all-you-need Attention is all you need . In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4-9 December 201...

  43. [43]

    Weiyue Wang, Derui Zhu, Tamer Alkhouli, Zixuan Gan, and Hermann Ney. 2018. https://aclanthology.info/papers/P18-2060/p18-2060 Neural hidden markov model for machine translation . In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, July 15-20, 2018, Volume 2: Short Papers , pages 377--382

  44. [44]

    Thomas Zenkel, Joern Wuebker, and John DeNero. 2019. http://arxiv.org/abs/1901.11359 Adding interpretable attention to neural translation models improves word alignment . CoRR, abs/1901.11359
