OmniScore is a family of lightweight deterministic learned metrics that approximate LLM-judge behavior for reliable multilingual evaluation of generative text in tasks such as QA, translation, and summarization.
- Do not penalize Plausibility just because it differs from the source; penalize only if it becomes illogical or self-contradictory
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
- Beyond LLM-as-a-Judge: Deterministic Metrics for Multilingual Generative Text Evaluation