COMET: A Neural Framework for MT Evaluation

Rei, R · 2020 · DOI 10.18653/v1/2020.emnlp-main.213

16 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

Creativity Bias: How Machine Evaluation Struggles with Creativity in Literary Translations

cs.CL · 2026-05-13 · unverdicted · novelty 7.0

Automatic evaluation tools for literary translations correlate poorly with expert human judgments on creativity and exhibit bias favoring machine-translated texts.

ReflectMT: Internalizing Reflection for Efficient and High-Quality Machine Translation

cs.CL · 2026-04-21 · unverdicted · novelty 7.0

ReflectMT internalizes reflection via two-stage RL to enable direct high-quality machine translation that outperforms explicit reasoning models like DeepSeek-R1 on WMT24 while using 94% fewer tokens.

LQM: Linguistically Motivated Multidimensional Quality Metrics for Machine Translation

cs.CL · 2026-04-20 · unverdicted · novelty 7.0

LQM introduces a six-level linguistically motivated error taxonomy for MT evaluation and applies it via expert annotation to LLM outputs on a new 3,850-sentence multi-dialect Arabic corpus.

Reinforcement Learning with Semantic Rewards Enables Low-Resource Language Expansion without Alignment Tax

cs.CL · 2026-05-14 · unverdicted · novelty 6.0

Reinforcement learning with semantic rewards lets LLMs gain low-resource language skills without the alignment tax that degrades general capabilities in supervised fine-tuning.

SLoW: Select Low-frequency Words! Automatic Dictionary Selection for Translation on Large Language Models

cs.CL · 2025-07-25 · conditional · novelty 6.0

SLoW selects low-frequency word dictionaries to boost LLM translation quality and efficiency across 100 languages from FLORES.

CompactQE: Interpretable Translation Quality Estimation via Small Open-Weight LLMs

cs.CL · 2026-05-15 · unverdicted · novelty 5.0

Small open-source LLMs achieve competitive system-level correlations with human judgments in machine translation quality estimation, outperforming traditional neural metrics and fine-tuned models via single-pass multi-output prompting.

COPRA: Conditional Parameter Adaptation with Reinforcement Learning for Video Anomaly Detection

cs.CV · 2026-05-14 · unverdicted · novelty 5.0

COPRA introduces conditional parameter adaptation via RL to dynamically tune frozen VLMs for video anomaly detection, outperforming static methods in in-domain and cross-domain settings while generalizing to other video tasks.

Towards Visually-Guided Movie Subtitle Translation for Indic Languages

cs.CL · 2026-05-12 · unverdicted · novelty 5.0

Selective replacement of the worst 20-30% of text-only subtitle segments with visual-enhanced outputs raises COMET scores for Indic languages, but full visual grounding is ineffective because of temporal misalignment between subtitles and frames.

Smarter edits? Post-editing with error highlights and translation suggestions

cs.CL · 2026-05-20 · unverdicted · novelty 4.0

User study with professional En-Nl translators found LLM-based error highlights and APE correction suggestions did not improve productivity or quality over standard post-editing but were better received and enhanced user experience.

Benchmarked Yet Not Measured -- Generative AI Should be Evaluated Against Real-World Utility

cs.LG · 2026-05-07 · unverdicted · novelty 4.0 · 2 refs

Generative AI evaluation must shift from static benchmark scores to measuring sustained improvements in human capabilities within specific deployment contexts.

Effects of Cross-lingual Evidence in Multilingual Medical Question Answering

cs.CL · 2026-04-22 · unverdicted · novelty 4.0

Combining English and target-language web retrieval boosts medical QA for low-resource languages to match high-resource performance, while English web data benefits high-resource languages most and specialized sources like PubMed lack multilingual coverage.

An Empirical Study of Many-Shot In-Context Learning for Machine Translation of Low-Resource Languages

cs.CL · 2026-04-03 · unverdicted · novelty 4.0

BM25 retrieval makes many-shot ICL for low-resource MT roughly 5x more data-efficient, with 50 examples matching 250 random ones and 250 matching 1000.

Adam's Law: Textual Frequency Law on Large Language Models

cs.CL · 2026-04-02 · unverdicted · novelty 3.0

Frequent sentence-level text improves LLM prompting and fine-tuning performance across math, translation, commonsense, and tool-use tasks via a proposed frequency law and curriculum ordering.

Bridging the Linguistic Divide: A Survey on Leveraging Large Language Models for Machine Translation

cs.CL · 2025-04-02 · unverdicted · novelty 3.0

A literature survey that organizes prompting, fine-tuning, preference optimization, and context-aware techniques for LLM-based machine translation with emphasis on low-resource languages.

Dynamic Meta-Metrics: Source-Sentence Conditioned Weighting for MT Evaluation

cs.CL · 2026-05-09

VIDA: A dataset for Visually Dependent Ambiguity in Multimodal Machine Translation

cs.CL · 2026-05-03

citing papers explorer

Creativity Bias: How Machine Evaluation Struggles with Creativity in Literary Translations · cs.CL · 2026-05-13 · unverdicted · none · ref 13
ReflectMT: Internalizing Reflection for Efficient and High-Quality Machine Translation · cs.CL · 2026-04-21 · unverdicted · none · ref 34
LQM: Linguistically Motivated Multidimensional Quality Metrics for Machine Translation · cs.CL · 2026-04-20 · unverdicted · none · ref 73
Reinforcement Learning with Semantic Rewards Enables Low-Resource Language Expansion without Alignment Tax · cs.CL · 2026-05-14 · unverdicted · none · ref 17
SLoW: Select Low-frequency Words! Automatic Dictionary Selection for Translation on Large Language Models · cs.CL · 2025-07-25 · conditional · none · ref 23
CompactQE: Interpretable Translation Quality Estimation via Small Open-Weight LLMs · cs.CL · 2026-05-15 · unverdicted · none · ref 10
COPRA: Conditional Parameter Adaptation with Reinforcement Learning for Video Anomaly Detection · cs.CV · 2026-05-14 · unverdicted · none · ref 51
Towards Visually-Guided Movie Subtitle Translation for Indic Languages · cs.CL · 2026-05-12 · unverdicted · none · ref 14
Smarter edits? Post-editing with error highlights and translation suggestions · cs.CL · 2026-05-20 · unverdicted · none · ref 3
Benchmarked Yet Not Measured -- Generative AI Should be Evaluated Against Real-World Utility · cs.LG · 2026-05-07 · unverdicted · none · ref 13 · 2 links
Effects of Cross-lingual Evidence in Multilingual Medical Question Answering · cs.CL · 2026-04-22 · unverdicted · none · ref 57
An Empirical Study of Many-Shot In-Context Learning for Machine Translation of Low-Resource Languages · cs.CL · 2026-04-03 · unverdicted · none · ref 26
Adam's Law: Textual Frequency Law on Large Language Models · cs.CL · 2026-04-02 · unverdicted · none · ref 32
Bridging the Linguistic Divide: A Survey on Leveraging Large Language Models for Machine Translation · cs.CL · 2025-04-02 · unverdicted · none · ref 81
Dynamic Meta-Metrics: Source-Sentence Conditioned Weighting for MT Evaluation · cs.CL · 2026-05-09 · unreviewed · ref 9
VIDA: A dataset for Visually Dependent Ambiguity in Multimodal Machine Translation · cs.CL · 2026-05-03 · unreviewed · ref 9
