arxiv: 1907.04618 · v1 · pith:ZZH3SZTLnew · submitted 2019-07-10 · 💻 cs.CL

Lingua Custodia at WMT'19: Attempts to Control Terminology

Pith reviewed 2026-05-24 23:59 UTC · model grok-4.3

keywords: machine translation, terminology control, constrained decoding, backtranslation, WMT shared task, German-French translation, EU elections domain

The pith

Backtranslation with constrained decoding guarantees correct translation of specific unseen terms in machine translation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reports experiments adapting machine translation terminology for a German-to-French news task on EU elections, a topic with no provided in-domain parallel training data. The core method generates backtranslations using a decoding approach that inserts constraints to force accurate rendering of particular entities such as political parties and person names. This targets cases where the needed terms do not appear in the training data. A reader would care because the technique offers a way to steer output terminology for restricted domains without collecting new parallel text.

Core claim

Our primary submission to the shared task uses backtranslation generated with a type of decoding allowing the insertion of constraints in the output in order to guarantee the correct translation of specific terms that are not necessarily observed in the data.

What carries the argument

Constrained decoding during backtranslation that permits insertion of required terms into the generated output.
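
As a rough illustration of the idea, and not the paper's actual decoder, the sketch below runs a small beam search in which hypotheses are grouped by how many required terms they already contain, and a hypothesis may only end once every term is present. Everything here is an assumption made for illustration: the function names, the `forced_penalty` value, the toy bigram table, and the simplification that each constrained term is a single token.

```python
from typing import Callable, Dict, List, Optional, Tuple

EOS = "</s>"
Hyp = Tuple[Tuple[str, ...], float]  # (tokens so far, cumulative log-score)


def constrained_beam_search(
    next_scores: Callable[[Tuple[str, ...]], Dict[str, float]],
    constraints: List[str],
    beam_size: int = 2,
    max_len: int = 10,
    forced_penalty: float = -10.0,  # hypothetical score for a term the model did not propose
) -> Optional[List[str]]:
    """Return the best-scoring hypothesis that contains every constraint token."""

    def met(tokens: Tuple[str, ...]) -> int:
        return sum(c in tokens for c in constraints)

    beams: List[Hyp] = [((), 0.0)]
    finished: List[Hyp] = []
    for _ in range(max_len):
        banks: Dict[int, List[Hyp]] = {}  # number of satisfied constraints -> hypotheses
        for prefix, score in beams:
            scores = dict(next_scores(prefix))
            # Always offer each unmet term, even if the model never proposes it.
            for c in constraints:
                if c not in prefix:
                    scores.setdefault(c, forced_penalty)
            for tok, s in scores.items():
                if tok == EOS:
                    # A hypothesis may only end once all constraints are satisfied.
                    if met(prefix) == len(constraints):
                        finished.append((prefix, score + s))
                    continue
                hyp = (prefix + (tok,), score + s)
                banks.setdefault(met(hyp[0]), []).append(hyp)
        # Keep the top hypotheses per bank so constrained paths are not pruned away.
        beams = []
        for bank in banks.values():
            bank.sort(key=lambda h: h[1], reverse=True)
            beams.extend(bank[:beam_size])
        if not beams:
            break
    if not finished:
        return None
    return list(max(finished, key=lambda h: h[1])[0])


# Toy stand-in for a translation model: log-scores of the next token given the last one.
TOY_BIGRAMS = {
    "<s>": {"die": -0.1},
    "die": {"Partei": -0.1, "CDU": -3.0},
    "Partei": {"gewinnt": -0.1},
    "CDU": {"gewinnt": -0.1},
    "gewinnt": {EOS: -0.1},
}


def toy_next_scores(prefix: Tuple[str, ...]) -> Dict[str, float]:
    last = prefix[-1] if prefix else "<s>"
    return TOY_BIGRAMS.get(last, {EOS: -0.1})


print(constrained_beam_search(toy_next_scores, []))       # unconstrained output
print(constrained_beam_search(toy_next_scores, ["CDU"]))  # output forced to contain "CDU"
```

Grouping hypotheses by satisfied-constraint count is one design choice among several; without it, a narrow beam can fill up with high-scoring hypotheses that never include the required term.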

If this is right

  • Terminology can be adapted to a narrow topic such as EU elections without any in-domain parallel data.
  • Specific entities including political parties and person names receive more accurate translations than would be produced by unconstrained generation.
  • The same constrained backtranslation pipeline applies to the German-to-French news translation direction.
  • Terms absent from training data can still be rendered correctly by direct insertion during decoding.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same constraint mechanism could be tested on other language pairs where terminology consistency matters more than raw fluency.
  • If term lists can be extracted automatically from source text or glossaries, the method might reduce reliance on manual term identification.
  • Combining the constrained decoder with later fine-tuning steps could produce additive gains on domain-specific accuracy.

Load-bearing premise

The approach assumes that the specific terms requiring constraints can be reliably identified in advance and that forcing their inclusion via constrained decoding will not degrade overall translation quality, fluency, or adequacy on the rest of the sentence.

What would settle it

A side-by-side comparison in which the same backtranslation model is run once with constraints and once without, then measured for whether the constrained version produces lower automatic scores or human judgments on overall sentence quality.
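
A minimal sketch of how such a comparison could be tallied, assuming per-sentence quality scores (automatic or human) are already available for both runs. The function names and the example values are hypothetical and do not come from the paper.

```python
from statistics import mean
from typing import List, Sequence


def term_coverage(outputs: Sequence[str], required: Sequence[List[str]]) -> float:
    """Fraction of required terms found verbatim in the matching output.

    Uses plain substring matching, so it does not handle inflected variants.
    """
    hits = total = 0
    for out, terms in zip(outputs, required):
        for term in terms:
            total += 1
            hits += term in out
    return hits / total if total else 1.0


def paired_delta(constrained: Sequence[float], unconstrained: Sequence[float]) -> float:
    """Mean per-sentence score difference (constrained minus unconstrained)."""
    if len(constrained) != len(unconstrained):
        raise ValueError("both runs must score the same sentences")
    return mean(c - u for c, u in zip(constrained, unconstrained))


# Hypothetical outputs and scores, for illustration only.
required_terms = [["CDU"], ["CDU"]]
constrained_out = ["die CDU gewinnt", "die CDU verliert"]
unconstrained_out = ["die CDU gewinnt", "die Partei verliert"]

print(term_coverage(constrained_out, required_terms))
print(term_coverage(unconstrained_out, required_terms))
print(paired_delta([30.0, 40.0], [32.0, 40.0]))
```

A negative `paired_delta` alongside higher `term_coverage` would indicate the trade-off the premise above worries about; a real study would also need a significance test over many sentences.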

Original abstract

This paper describes Lingua Custodia's submission to the WMT'19 news shared task for German-to-French on the topic of the EU elections. We report experiments on the adaptation of the terminology of a machine translation system to a specific topic, aimed at providing more accurate translations of specific entities like political parties and person names, given that the shared task provided no in-domain training parallel data dealing with the restricted topic. Our primary submission to the shared task uses backtranslation generated with a type of decoding allowing the insertion of constraints in the output in order to guarantee the correct translation of specific terms that are not necessarily observed in the data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript describes Lingua Custodia's submission to the WMT'19 German-to-French news translation shared task on the EU elections topic. It reports experiments on terminology adaptation via backtranslation generated with constrained decoding, with the goal of forcing correct translations of specific entities (political parties, person names) that are absent from the provided parallel training data.

Significance. If the constrained-decoding backtranslation is shown to preserve overall translation quality while guaranteeing term accuracy, the approach would supply a practical, deployable technique for domain-specific terminology control when in-domain parallel data are unavailable. The work is a standard system-description contribution to a shared-task track.

major comments (2)
  1. [Abstract] The central claim that constrained decoding can "guarantee the correct translation of specific terms that are not necessarily observed in the data" is presented without quantitative support (BLEU, TER, or human adequacy/fluency scores) comparing the constrained backtranslation to an unconstrained baseline. Without that comparison, it cannot be verified that the synthetic data remain usable for final model training.
  2. [Method description, primary submission paragraph] No ablation or diagnostic is reported on how forced term insertion interacts with the rest of the sentence. Without such analysis, the weakest assumption cannot be confirmed: that constraint placement does not degrade fluency or adequacy on the non-constrained tokens.
minor comments (1)
  1. The manuscript should include the official WMT automatic and human evaluation scores for the submitted system, together with a brief comparison to the unconstrained backtranslation baseline used in the same pipeline.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the comments on our WMT'19 system description. We respond to each major comment below.

Point-by-point responses
  1. Referee: [Abstract] The central claim that constrained decoding can "guarantee the correct translation of specific terms that are not necessarily observed in the data" is presented without quantitative support (BLEU, TER, or human adequacy/fluency scores) comparing the constrained backtranslation to an unconstrained baseline. Without that comparison, it cannot be verified that the synthetic data remain usable for final model training.

    Authors: The guarantee is a direct consequence of the constrained decoding procedure, which forces inclusion of the specified terms regardless of their presence in training data. The manuscript reports the final system performance on the shared-task test set, which was trained using the constrained backtranslations; this serves as the quantitative evidence of usability. No separate BLEU comparison isolating the backtranslation step was conducted, as the paper is a system description rather than an ablation study. We will revise the abstract to explicitly link the claim to the reported final scores. revision: yes

  2. Referee: [Method description, primary submission paragraph] No ablation or diagnostic is reported on how forced term insertion interacts with the rest of the sentence. Without such analysis, the weakest assumption cannot be confirmed: that constraint placement does not degrade fluency or adequacy on the non-constrained tokens.

    Authors: We agree that a targeted diagnostic on how constraints affect surrounding tokens would be informative. As a concise shared-task system paper, the manuscript prioritizes description of the submitted pipeline and its overall results over internal ablations. The final system scores provide an indirect check on overall quality. We will add a short limitations paragraph in the revision acknowledging the absence of such diagnostics. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical system description with no derivations or self-referential reductions.

Full rationale

The paper is a WMT shared-task system description that reports an engineering pipeline (backtranslation + constrained decoding for terminology). No equations, fitted parameters, uniqueness theorems, or ansatzes are present. The central claim is an empirical assertion about term insertion, not a derivation that reduces to its own inputs by construction. No self-citation load-bearing steps or renaming of known results occur. The work is self-contained as a practical report; absence of ablations is a separate evidence question, not circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an empirical machine translation system paper; the abstract contains no mathematical derivations, free parameters, axioms, or invented entities.

pith-pipeline@v0.9.0 · 5622 in / 1091 out tokens · 21631 ms · 2026-05-24T23:59:18.225864+00:00 · methodology