The CUED's Grammatical Error Correction Systems for BEA-2019
Pith reviewed 2026-05-25 13:11 UTC · model grok-4.3
The pith
Two grammatical error correction systems are described for the BEA-2019 shared task: one hybrid FST-NLM for low resources and one purely neural NMT-LM for restricted settings.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Two systems are submitted. The low-resource track entry is a hybrid system that combines finite state transducers with strong neural language models. The restricted track entry is a purely neural system of neural language models and neural machine translation models, trained with back-translation plus checkpoint averaging and fine-tuning, and it uses no additional tools.
What carries the argument
Finite state transducers paired with neural language models for the low-resource track and neural machine translation models trained via back-translation with checkpoint averaging and fine-tuning for the restricted track.
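One plausible reading of the hybrid recipe is that a transducer proposes candidate corrections and a language model picks the most fluent one. The sketch below illustrates only that idea and is not the authors' implementation: the confusion sets stand in for the finite state transducer, the bigram table stands in for a neural language model, and every name and probability in it is a hypothetical placeholder.

```python
import math
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical confusion sets standing in for the FST hypothesis space.
CONFUSIONS: Dict[str, List[str]] = {
    "a": ["a", "an"],
    "an": ["an", "a"],
}

# Hypothetical bigram log-probabilities standing in for a neural LM.
BIGRAM_LOGPROB: Dict[Tuple[str, str], float] = {
    ("ate", "a"): math.log(0.5),
    ("ate", "an"): math.log(0.5),
    ("a", "apple"): math.log(0.01),
    ("an", "apple"): math.log(0.6),
}
UNSEEN_LOGPROB = math.log(1e-4)  # flat penalty for bigrams not in the table


def candidates(tokens: List[str]) -> List[List[str]]:
    """Enumerate every sentence reachable by swapping confusable words."""
    options = [CONFUSIONS.get(tok, [tok]) for tok in tokens]
    return [list(choice) for choice in product(*options)]


def lm_score(tokens: List[str]) -> float:
    """Sum bigram log-probabilities over adjacent token pairs."""
    pairs = zip(tokens, tokens[1:])
    return sum(BIGRAM_LOGPROB.get(pair, UNSEEN_LOGPROB) for pair in pairs)


def correct(sentence: str) -> str:
    """Return the candidate sentence with the highest LM score."""
    best = max(candidates(sentence.split()), key=lm_score)
    return " ".join(best)


corrected = correct("she ate a apple")
print(corrected)
```

Exhaustive enumeration is only workable for toy inputs; a real transducer represents the candidate space compactly instead of listing it.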
If this is right
- The hybrid FST-NLM approach enables participation in the low-resource track.
- Purely neural NMT and LM components can be assembled into a working restricted-track system without spell checkers or other tools.
- Back-translation combined with checkpoint averaging and fine-tuning supports training of the neural models.
- The restricted-track system can be merged into a larger system combination entry.
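Checkpoint averaging, as named in the list above, generally means taking the element-wise mean of model parameters saved at several points in training. A minimal sketch follows, assuming each checkpoint is a plain dictionary of parameter vectors; the `Params` type, the parameter names, and the toy values are illustrative assumptions, and the paper's actual tooling is not described in this summary.

```python
from typing import Dict, List

# Assumed representation: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def average_checkpoints(checkpoints: List[Params]) -> Params:
    """Return the element-wise mean of each parameter across checkpoints."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    averaged: Params = {}
    for name in checkpoints[0]:
        vectors = [ckpt[name] for ckpt in checkpoints]
        # zip(*vectors) groups the values at the same position
        # in every checkpoint so they can be averaged together.
        averaged[name] = [sum(vals) / len(vals) for vals in zip(*vectors)]
    return averaged


# Hypothetical toy checkpoints with two parameters each.
ckpts = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
]
avg = average_checkpoints(ckpts)
print(avg)
```

The sketch assumes all checkpoints share the same parameter names and shapes, which holds when they come from a single training run.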
Where Pith is reading between the lines
- Back-translation may reduce reliance on manually annotated error-correction data for building GEC models.
- Checkpoint averaging after fine-tuning could stabilize outputs across different neural GEC architectures.
- The same hybrid and pure-neural recipes might transfer to other text-correction tasks that have similar resource constraints.
Load-bearing premise
That the listed training procedures and component choices will produce functional entries for the shared task when implemented as described.
What would settle it
The submitted systems produce no measurable improvement in grammatical error correction accuracy on the BEA-2019 test data compared with a no-correction baseline.
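To make the comparison with a no-correction baseline concrete, the sketch below assumes correction quality is summarized by a precision-weighted F-score over edits (F0.5 is a common choice in grammatical error correction evaluation). The edit counts are hypothetical and are not results from the paper. A system that proposes no edits has zero true positives, so its score is 0 under this convention.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """F-beta over edit counts; a 0/0 precision or recall is treated as 0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: 10 gold edits in the reference.
baseline = f_beta(tp=0, fp=0, fn=10)  # no-correction baseline proposes nothing
system = f_beta(tp=6, fp=2, fn=4)     # made-up system output

print(f"baseline F0.5 = {baseline:.3f}, system F0.5 = {system:.3f}")
```

With beta below 1 the score weights precision more heavily than recall, so a system is penalized more for wrong edits than for missed ones.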
Original abstract
We describe two entries from the Cambridge University Engineering Department to the BEA 2019 Shared Task on grammatical error correction. Our submission to the low-resource track is based on prior work on using finite state transducers together with strong neural language models. Our system for the restricted track is a purely neural system consisting of neural language models and neural machine translation models trained with back-translation and a combination of checkpoint averaging and fine-tuning -- without the help of any additional tools like spell checkers. The latter system has been used inside a separate system combination entry in cooperation with the Cambridge University Computer Lab.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper describes two grammatical error correction systems submitted by the Cambridge University Engineering Department to the BEA-2019 Shared Task. The low-resource track submission uses finite state transducers with strong neural language models based on prior work. The restricted track submission is a purely neural system using neural language models and neural machine translation models trained with back-translation, checkpoint averaging, and fine-tuning, without additional tools such as spell checkers. This system was also used in a system combination entry with the Cambridge University Computer Lab.
Significance. If the descriptions are accurate, the paper contributes to the documentation of approaches in the grammatical error correction shared task. It highlights specific training techniques and component choices, which can be of interest to researchers working on similar systems. However, the lack of any reported performance metrics or comparisons means the significance is primarily in providing a record of the submitted systems rather than advancing new findings or demonstrating effectiveness.
Minor comments (1)
- [Abstract] The abstract mentions 'prior work' on FST with NLM but does not provide a citation, which would help readers locate the referenced methods.
Simulated Author's Rebuttal
We thank the referee for the review and the recommendation of minor revision. The manuscript is a system description paper for the BEA-2019 shared task, whose primary purpose is to document the submitted systems rather than to present new experimental findings.
Point-by-point responses
- Referee: However, the lack of any reported performance metrics or comparisons means the significance is primarily in providing a record of the submitted systems rather than advancing new findings or demonstrating effectiveness.
  Authors: We agree that the paper's main role is to document the systems entered in the shared task. Results and comparisons for all participating systems are provided in the official shared-task overview paper; individual system-description papers conventionally omit them to avoid duplication. We can add a brief clarifying sentence in the introduction if the referee considers it helpful.
  Revision: partial
Circularity Check
No significant circularity
The paper is a system-description report for a shared-task submission with no equations, derivations, fitted parameters, or quantitative claims. It simply enumerates component choices (FST+NLM for low-resource; NMT+LM with back-translation, checkpoint averaging, fine-tuning for restricted track) and notes a combination entry. No load-bearing steps reduce to self-definition, fitted inputs called predictions, or self-citation chains. The text is self-contained factual description.