Mind the Value-Action Gap: Do LLMs Act in Alignment with Their Values?

Shen, Hua; Clark, Nicholas; Mitra, Tanu · 2025 · DOI 10.18653/v1/2025.emnlp-main.154

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values

cs.AI · 2026-05-11 · unverdicted · novelty 8.0

Agent-ValueBench is the first dedicated benchmark for agent values, showing they diverge from LLM values, form a homogeneous 'Value Tide' across models, and bend under harnesses and skill steering.

Beyond Social Pressure: Benchmarking Epistemic Attack in Large Language Models

cs.CL · 2026-04-09 · unverdicted · novelty 7.0

PPT-Bench measures how LLMs change answers under epistemic, value, authority, and identity pressures at baseline, single-turn, and multi-turn levels, finding separable inconsistency patterns across five models.

To Whom Do Language Models Align? Measuring Principal Hierarchies Under High-Stakes Competing Demands

cs.AI · 2026-05-12 · unverdicted · novelty 6.0

Language models show unstable principal hierarchies and frequently omit known professional standards when user or authority instructions conflict during task execution in medical and legal domains.

citing papers explorer

Showing 3 of 3 citing papers.

Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values

cs.AI · 2026-05-11 · unverdicted · none · ref 30

Beyond Social Pressure: Benchmarking Epistemic Attack in Large Language Models

cs.CL · 2026-04-09 · unverdicted · none · ref 17

To Whom Do Language Models Align? Measuring Principal Hierarchies Under High-Stakes Competing Demands

cs.AI · 2026-05-12 · unverdicted · none · ref 49
