HyperCLOVA X technical report. Preprint at https://arxiv.org/abs/2404.01954 (2024)

HyperCLOVA X Team · 2024 · arXiv 2404.01954

7 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Ko-WideSearch: A Korean Breadth-Search Benchmark for Exhaustive Set Enumeration by Web Agents

cs.CL · 2026-06-25 · unverdicted · novelty 7.0

Ko-WideSearch is a new Korean breadth-search benchmark spanning 16 categories and three difficulty tiers that evaluates web agents on full set membership plus per-item attributes, showing consistent gaps between set recovery and row completion.

Anchoring LLM Gender Bias to Human Baselines: A Cross-Lingual Audit

cs.CL · 2026-05-29 · unverdicted · novelty 6.0

LLM gender stereotyping across four languages spans roughly 2.5 times the human cross-country range on HEXACO-100, with translation altering specific stereotyped attributes and effects that can compound.

Discovering Lexical Gaps Using Embeddings from Multilingual LLMs

cs.CL · 2026-05-23 · unverdicted · novelty 6.0

A framework extracts embeddings from Korean-English bilingual LLMs across thousands of spaces and uses similarity distributions plus logistic classifiers to identify lexical gaps with AUCs of 0.81 and 0.76.

K-MetBench: A Multi-Dimensional Benchmark for Fine-Grained Evaluation of Expert Reasoning, Locality, and Multimodality in Meteorology

cs.CL · 2026-04-27 · unverdicted · novelty 6.0

K-MetBench shows LLMs have large gaps in interpreting meteorology diagrams and Korean-specific context, with smaller local models beating much larger global ones.

SCRIPT: A Subcharacter Compositional Representation Injection Module for Korean Pre-Trained Language Models

cs.CL · 2026-04-14 · unverdicted · novelty 6.0

SCRIPT is a model-agnostic injection module that enhances Korean PLM embeddings with subcharacter compositional knowledge from Jamo, leading to better performance on NLU and NLG tasks and more linguistically coherent embedding spaces.

CHERRY: Compressed Hierarchical Experts with Recurrent Representational Yield

cs.CL · 2026-06-30 · unverdicted · novelty 5.0

CHERRY combines selective ground-truth token training, recurrent depth compression from 48 to 6 layers, and mixture-of-efficient-experts to achieve competitive loss with fewer parameters on a 1.8B Korean model.

SemEval-2026 Task 7: Everyday Knowledge Across Diverse Languages and Cultures

cs.CL · 2026-05-04 · unverdicted · novelty 4.0

SemEval-2026 Task 7 presents a benchmark and two evaluation tracks for assessing LLMs on everyday knowledge in diverse languages and cultures without allowing training on the test data.

citing papers explorer

Showing 7 of 7 citing papers.

Ko-WideSearch: A Korean Breadth-Search Benchmark for Exhaustive Set Enumeration by Web Agents
cs.CL · 2026-06-25 · unverdicted · none · ref 14

Anchoring LLM Gender Bias to Human Baselines: A Cross-Lingual Audit
cs.CL · 2026-05-29 · unverdicted · none · ref 3

Discovering Lexical Gaps Using Embeddings from Multilingual LLMs
cs.CL · 2026-05-23 · unverdicted · none · ref 5

K-MetBench: A Multi-Dimensional Benchmark for Fine-Grained Evaluation of Expert Reasoning, Locality, and Multimodality in Meteorology
cs.CL · 2026-04-27 · unverdicted · none · ref 7

SCRIPT: A Subcharacter Compositional Representation Injection Module for Korean Pre-Trained Language Models
cs.CL · 2026-04-14 · unverdicted · none · ref 4

CHERRY: Compressed Hierarchical Experts with Recurrent Representational Yield
cs.CL · 2026-06-30 · unverdicted · none · ref 33

SemEval-2026 Task 7: Everyday Knowledge Across Diverse Languages and Cultures
cs.CL · 2026-05-04 · unverdicted · none · ref 35
