arXiv: 1903.00802 · v1 · submitted 2019-03-03 · 💻 cs.LG · cs.CL · stat.ML

Recognition: unknown

Calibration of Encoder Decoder Models for Neural Machine Translation

classification 💻 cs.LG · cs.CL · stat.ML
keywords calibration, models, beam-search, machine, neural, translation, accuracy, attention

We study the calibration of several state-of-the-art neural machine translation (NMT) systems built on attention-based encoder-decoder models. For structured outputs such as those in NMT, calibration is important not just for reliable prediction confidence, but also for the proper functioning of beam-search inference. We show that most modern NMT models are surprisingly miscalibrated even when conditioned on the true previous tokens. Our investigation points to two main causes: severe miscalibration of EOS (the end-of-sequence marker) and suppression of attention uncertainty. We design recalibration methods based on these signals and demonstrate improved accuracy, better sequence-level calibration, and more intuitive results from beam search.
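
For readers unfamiliar with the term, a model is calibrated when its predicted probabilities match its empirical accuracy. The sketch below computes a standard binned expected calibration error (ECE) over token-level predictions. It is an illustration only and not necessarily the metric used in the paper; the function name, the equal-width binning, and the default of 10 bins are all assumptions made for this example.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted average of |accuracy - mean confidence| per bin.

    confidences: predicted probability of the chosen token, each in [0, 1].
    correct:     1 if the chosen token matched the reference, else 0.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)

    # Equal-width bins over [0, 1]; digitizing against the interior edges
    # yields bin indices 0 .. n_bins - 1 (confidence 1.0 lands in the last bin).
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.digitize(confidences, edges[1:-1])

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        gap = abs(correct[mask].mean() - confidences[mask].mean())
        ece += mask.mean() * gap  # weight by the fraction of samples in the bin
    return ece
```

A value near 0 means confidence tracks accuracy; for instance, predictions made at confidence 0.8 that are right 80% of the time contribute no error, while predictions at 0.9 that are always wrong contribute a gap of 0.9.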

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Confident in a Confidence Score: Investigating the Sensitivity of Confidence Scores to Supervised Fine-Tuning

    cs.CL 2026-04 unverdicted novelty 5.0

    Supervised fine-tuning degrades the correlation between confidence scores and output quality in language models, driven by factors like training distribution similarity rather than true quality.