Probing Pre-Trained Language Models for Cross-Cultural Differences in Values
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background (1)
citation-polarity summary: background (1)
years: 2026 (3)
verdicts: UNVERDICTED (3)
citing papers explorer
- ValueGround: Evaluating Culture-Conditioned Visual Value Grounding in MLLMs
  ValueGround shows MLLMs drop from 72.8% text-only accuracy to 65.8% on visual cultural value grounding across 13 countries, despite 92.8% image-option alignment, with all models prone to prediction reversals.
- How Far Will They Go? Red-Teaming Online Influence with Large Language Models
  An empirical red-teaming study measures political Overton Windows across more than 30 open-source LLMs from 10 families and finds left-leaning bias, inverse size correlation, regional variation, and variable jailbreak effectiveness.
- Teachers' Perceived Benefits and Risks of AI Across Fifty-Five Countries: An Audit of LLM Alignment and Steerability
  Teachers' views on AI benefits and risks vary widely across 55 countries, but LLMs compress these differences, overestimate both sides, and show little improvement from country prompting or better reasoning.