arxiv: 1804.05958 · v1 · pith:B72EDDAInew · submitted 2018-04-16 · 💻 cs.CL · stat.ML

Can Neural Machine Translation be Improved with User Feedback?

keywords feedback · translation · machine · user · bandit · collected · explicit · implicit

We present the first real-world application of methods for improving neural machine translation (NMT) with human reinforcement, based on explicit and implicit user feedback collected on the eBay e-commerce platform. Previous work has been confined to simulation experiments, whereas in this paper we work with real logged feedback for offline bandit learning of NMT parameters. We conduct a thorough analysis of the available explicit user judgments---five-star ratings of translation quality---and show that they are not reliable enough to yield significant improvements in bandit learning. In contrast, we successfully utilize implicit task-based feedback collected in a cross-lingual search task to improve task-specific and machine translation quality metrics.
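To make "offline bandit learning" from logged feedback concrete, here is a minimal sketch. It is not the paper's objective, which the abstract does not specify; it assumes a toy softmax policy over three hypothetical candidate translations for one source sentence and maximizes a reward-weighted log-likelihood of the logged outputs. The names `train_offline` and `logs`, and the reward values, are illustrative assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_offline(logs, num_actions, lr=0.5, steps=200):
    """Gradient ascent on the mean of reward * log p(action) over a fixed log.

    logs: list of (action_index, reward) pairs collected beforehand.
    No new outputs are sampled during training, which is what makes the
    learning offline: the policy only ever sees the logged feedback.
    """
    logits = [0.0] * num_actions
    for _ in range(steps):
        probs = softmax(logits)
        grad = [0.0] * num_actions
        for action, reward in logs:
            for k in range(num_actions):
                indicator = 1.0 if k == action else 0.0
                # d/d logit_k of reward * log softmax(logits)[action]
                grad[k] += reward * (indicator - probs[k]) / len(logs)
        logits = [z + lr * g for z, g in zip(logits, grad)]
    return softmax(logits)

# Hypothetical log: candidate 0 was rated highly, candidate 1 poorly,
# candidate 2 received no positive feedback at all.
logs = [(0, 1.0), (1, 0.2), (2, 0.0)]
probs = train_offline(logs, num_actions=3)
print(probs)
```

In this toy setting the policy shifts probability toward the candidates with higher logged reward. A real NMT system would replace the three logits with sequence-level model probabilities, and noisy rewards (such as the unreliable star ratings the abstract describes) would make the learned ordering correspondingly less trustworthy.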

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Learning to summarize from human feedback

    cs.CL · 2020-09 · conditional · novelty 7.0

    Reinforcement learning on a reward model trained from human summary comparisons produces summaries humans prefer over supervised fine-tuning or human references on TL;DR and transfers to CNN/DM.

  2. Aligning Text-to-Image Models using Human Feedback

    cs.LG · 2023-02 · unverdicted · novelty 6.0

    A three-stage fine-tuning process uses human ratings to train a reward model and then improves text-to-image alignment by maximizing reward-weighted likelihood.