In: Proceedings of the Eighteenth Conference on Computational Natural Language Learning: Shared Task
3 Pith papers cite this work.
Fields: cs.CL (3)
Verdicts: UNVERDICTED (3)
Citing papers

- Multi-Dimensional Evaluation of LLMs for Grammatical Error Correction
  Fine-tuned GPT-4o reaches state-of-the-art on grammatical error correction while reference-based metrics underestimate performance by missing 73.76 percent of valid or superior outputs.
- A Generative Model for Punctuation in Dependency Trees
  A generative model of latent underlying punctuation in dependency trees, trained on incomplete data via local likelihood maximization, produces plausible reconstructions across languages and beats baselines on restoration.
- Edit-level Majority Voting Mitigates Over-Correction in LLM-based Grammatical Error Correction
  Edit-level majority voting on multiple LLM-generated candidates reduces over-correction in grammatical error correction and outperforms greedy and MBR decoding on nine multilingual benchmarks while remaining stable to prompt variations.