CDEval: A Benchmark for Measuring the Cultural Dimensions of Large Language Models
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
KG-FairDiff is an inference-time framework that uses a knowledge graph to guide prompt refinement and reduce gender, race, age, and intersectional biases in text-to-image generation while preserving semantics.
citing papers explorer
- CultureForest: Understanding and Evaluating Cultural Norm Grounded Reasoning in LLMs
  CultureForest benchmark shows top LLMs degrade sharply on open-ended cultural reasoning tasks, exhibit regional disparities, and are limited more by effective use of knowledge than by lack of knowledge itself.
- KG-FairDiff: Knowledge Graph-Guided Prompt Refinement for Demographically Fair Text-to-Image Generation
  KG-FairDiff is an inference-time framework that uses a knowledge graph to guide prompt refinement and reduce gender, race, age, and intersectional biases in text-to-image generation while preserving semantics.