Deploying Tiny LVLM Judges for Real-World Evaluation of Chart Models: Lessons Learned and Best Practices

Laskar, Md Tahmid Rahman, Islam, Mohammed Saidul, Mahbub, Ridwan, Rahman, Mizanur, Bhuiyan, Amran, Jahan, Israt · 2025 · DOI 10.18653/v1/2025.emnlp-industry.134

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

open at publisher browse 1 citing papers

representative citing papers

Lost in Translation: Do LVLM Judges Generalize Across Languages?

cs.CL · 2026-04-21 · unverdicted · novelty 8.0

MM-JudgeBench shows substantial cross-lingual performance variance in 22 LVLM judges, with model size and architecture as poor predictors of multilingual robustness.

citing papers explorer

Showing 1 of 1 citing paper.

Lost in Translation: Do LVLM Judges Generalize Across Languages? cs.CL · 2026-04-21 · unverdicted · none · ref 41
MM-JudgeBench shows substantial cross-lingual performance variance in 22 LVLM judges, with model size and architecture as poor predictors of multilingual robustness.

Deploying Tiny LVLM Judges for Real-World Evaluation of Chart Models: Lessons Learned and Best Practices

fields

years

verdicts

representative citing papers

citing papers explorer