arXiv preprint arXiv:2404.01954
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CL (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
representative citing papers
SCRIPT is a model-agnostic injection module that enhances Korean PLM embeddings with subcharacter compositional knowledge from Jamo, leading to better performance on NLU and NLG tasks and more linguistically coherent embedding spaces.
SemEval-2026 Task 7 presents a benchmark and two evaluation tracks for assessing LLMs on everyday knowledge in diverse languages and cultures without allowing training on the test data.
citing papers explorer
- K-MetBench: A Multi-Dimensional Benchmark for Fine-Grained Evaluation of Expert Reasoning, Locality, and Multimodality in Meteorology
  K-MetBench shows LLMs have large gaps in interpreting meteorology diagrams and Korean-specific context, with smaller local models beating much larger global ones.
- SCRIPT: A Subcharacter Compositional Representation Injection Module for Korean Pre-Trained Language Models
  SCRIPT is a model-agnostic injection module that enhances Korean PLM embeddings with subcharacter compositional knowledge from Jamo, leading to better performance on NLU and NLG tasks and more linguistically coherent embedding spaces.
- SemEval-2026 Task 7: Everyday Knowledge Across Diverse Languages and Cultures
  SemEval-2026 Task 7 presents a benchmark and two evaluation tracks for assessing LLMs on everyday knowledge in diverse languages and cultures without allowing training on the test data.