Can LLMs Imagine Moral Alternatives Beyond Binary Dilemmas?
Pith reviewed 2026-07-01 06:10 UTC · model grok-4.3
The pith
LLMs often prefer compromise alternatives over binary options in moral dilemmas and generate higher-quality alternatives than humans do.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Across 15 LLMs, compromise alternatives are often preferred over either original option, substantially reshaping moral choice. LLM-generated alternatives are often preferred and better satisfy fine-grained structural and ethical criteria, while revealing trade-offs between structural quality and practical feasibility.
What carries the argument
MoralAltDataset, which augments each of 307 dilemmas with compromise and reframed alternatives to expand the option space beyond the original binary frame.
If this is right
- Moral judgments by LLMs become sensitive to the presence of compromise options rather than remaining fixed on the initial binary frame.
- LLM-generated alternatives can meet or exceed human-authored alternatives on structural and ethical quality metrics.
- Trade-offs appear between structural soundness of alternatives and their real-world practicality.
- Both humans and LLMs alter their expressed preferences once additional options are supplied.
Where Pith is reading between the lines
- Moral-advisory systems may need to surface multiple options rather than forcing selection between two preset values.
- Designers could test whether presenting LLM-generated alternatives changes real user decisions in applied ethical scenarios.
- The observed preference shift raises the question of whether similar expansions of options would alter outcomes in non-moral decision tasks.
- Deployment of LLMs in agent roles may require safeguards that prevent over-reliance on generated compromises that score high on structure but low on feasibility.
Load-bearing premise
The compromise and reframed alternatives added to each dilemma validly represent the human capacity to imagine moral options beyond the given binary frame.
What would settle it
A controlled study in which participants choose among the original two options plus the paper's added alternatives and show no systematic shift toward the new options would falsify the claim that such alternatives reshape moral choice.
Figures
read the original abstract
As large language models (LLMs) are increasingly deployed as moral advisors and agents, they need to address dilemmas between two competing values. However, existing research on LLMs with moral dilemmas overlooks a central aspect of human moral cognition: the ability to imagine alternatives that move beyond the given options. We introduce MoralAltDataset, a dataset of 307 moral dilemmas spanning narrative Advisor dilemmas and AI-facing Agent dilemmas, each augmented with compromise and reframed alternatives. We first examine whether humans and LLMs shift their judgments when such alternatives are introduced. Across 15 LLMs, we find that compromise alternatives are often preferred over either original option, substantially reshaping moral choice. We then evaluate the quality of LLM-generated alternatives against human-authored ones using pairwise preference and expert-based criteria. Results show that LLM-generated alternatives are often preferred and better satisfy fine-grained structural and ethical criteria, while revealing trade-offs between structural quality and practical feasibility.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces MoralAltDataset, consisting of 307 moral dilemmas (narrative Advisor and AI-facing Agent types) each augmented with compromise and reframed alternatives. It reports that across 15 LLMs (and human comparisons), compromise alternatives are frequently preferred over the original binary options, reshaping moral judgments. It further claims that LLM-generated alternatives are often preferred to human-authored baselines and score higher on fine-grained structural and ethical criteria, while noting trade-offs with practical feasibility.
Significance. If the dataset's alternatives validly capture human moral imagination beyond binary frames, the work would usefully extend LLM moral reasoning research by moving past forced-choice paradigms and providing a scalable testbed for alternative generation. The scale of the 15-model evaluation and the introduction of a new dataset with both preference-shift and generation-quality experiments are clear strengths. The significance is limited by the need to establish that observed shifts reflect genuine transcendence of binary structure rather than response to any well-formed third option.
major comments (2)
- [§3] §3 (MoralAltDataset construction): The manuscript must explicitly state whether the compromise and reframed alternatives were elicited from human participants instructed to generate options beyond the binary frame or were instead constructed by the authors. This distinction is load-bearing for the central claim; if author-constructed, the reported preference shifts in both LLM and human judgment experiments could arise from any plausible third option rather than from the alternatives instantiating human capacity to move beyond binary dilemmas.
- [Results sections] Results sections reporting LLM and human judgments (e.g., preference rates across 15 models): No information is supplied on statistical tests, inter-rater reliability for the expert-based criteria, exact prompt templates, or exclusion criteria. Without these, the support for claims that 'compromise alternatives are often preferred' and that 'LLM-generated alternatives are often preferred' cannot be evaluated, directly affecting both the judgment-shift and generation-quality experiments.
minor comments (1)
- [Abstract] The abstract and introduction should include a brief statement on the total number of human participants and how the human-authored baseline alternatives were collected to allow immediate assessment of the comparison.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below, indicating planned revisions where appropriate.
read point-by-point responses
-
Referee: [§3] §3 (MoralAltDataset construction): The manuscript must explicitly state whether the compromise and reframed alternatives were elicited from human participants instructed to generate options beyond the binary frame or were instead constructed by the authors. This distinction is load-bearing for the central claim; if author-constructed, the reported preference shifts in both LLM and human judgment experiments could arise from any plausible third option rather than from the alternatives instantiating human capacity to move beyond binary dilemmas.
Authors: We acknowledge the referee's point and agree that the construction method must be stated explicitly. The alternatives were constructed by the authors, informed by moral philosophy and psychology literature on compromise and reframing, rather than elicited from human participants. We will revise §3 to clearly describe this process, discuss its implications for interpreting the results, and note it as a limitation. While this means the dataset does not directly sample human-generated alternatives, the experiments still show that both humans and LLMs frequently prefer these structured options over the binary frame, demonstrating the value of testing beyond forced-choice paradigms. We will also add discussion of how future work could incorporate human elicitation to further validate the alternatives. revision: yes
-
Referee: [Results sections] Results sections reporting LLM and human judgments (e.g., preference rates across 15 models): No information is supplied on statistical tests, inter-rater reliability for the expert-based criteria, exact prompt templates, or exclusion criteria. Without these, the support for claims that 'compromise alternatives are often preferred' and that 'LLM-generated alternatives are often preferred' cannot be evaluated, directly affecting both the judgment-shift and generation-quality experiments.
Authors: We agree these details are essential for evaluating and reproducing the results. The manuscript reports preference rates but omits the requested elements. We will revise the results sections and add an appendix to include: appropriate statistical tests (e.g., binomial or chi-squared tests with p-values and effect sizes) for the preference rates; inter-rater reliability metrics (e.g., Cohen's kappa) for the expert criteria, which involved multiple raters; the exact prompt templates used for both judgment and generation tasks; and any exclusion criteria applied to responses or ratings. These additions will provide the necessary support for the reported claims. revision: yes
Circularity Check
No significant circularity; empirical dataset and evaluations are self-contained
full rationale
The paper introduces MoralAltDataset as a new resource of 307 dilemmas augmented with compromise and reframed alternatives, then reports empirical results from running 15 LLMs on preference judgments and from comparing LLM-generated alternatives to human-authored ones via pairwise and expert criteria. No equations, fitted parameters, or self-citations are used to derive the central claims; the reported preference shifts and quality scores are direct experimental outputs rather than reductions to prior inputs by construction. The methodological assumption that the added alternatives instantiate human moral cognition beyond binaries is a validity question external to the derivation chain itself and does not render the findings circular.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Human moral cognition centrally involves imagining alternatives that move beyond given binary options.
invented entities (1)
-
MoralAltDataset
no independent evidence
Reference graph
Works this paper leans on
-
[1]
arXiv preprint arXiv:2410.02683 , year=
Dailydilemmas: Revealing value preferences of llms with quandaries of daily life , author=. arXiv preprint arXiv:2410.02683 , year=
-
[2]
LitmusValues: Will
Yu Ying Chiu and Zhilin Wang and Sharan Maiya and Yejin Choi and Kyle Fish and Sydney Levine and Evan J Hubinger , booktitle=. LitmusValues: Will. 2026 , url=
2026
-
[3]
The Fourteenth International Conference on Learning Representations , year=
MoReBench: Evaluating Procedural and Pluralistic Moral Reasoning in Language Models, More than Outcomes , author=. The Fourteenth International Conference on Learning Representations , year=
-
[4]
Advances in neural information processing systems , volume=
When to make exceptions: Exploring language models as accounts of human moral judgment , author=. Advances in neural information processing systems , volume=
-
[5]
Alignment faking in large language models
Alignment faking in large language models , author=. arXiv preprint arXiv:2412.14093 , year=
work page internal anchor Pith review Pith/arXiv arXiv
-
[6]
Ritchie, Sören Mindermann, Evan Hubinger, Ethan Perez, and Kevin K
Agentic Misalignment: How LLMs Could Be Insider Threats , author=. arXiv preprint arXiv:2510.05179 , year=
-
[7]
Proceedings of the National Academy of Sciences , volume=
A moral trade-off system produces intuitive judgments that are rational and coherent and strike a balance between conflicting moral values , author=. Proceedings of the National Academy of Sciences , volume=. 2022 , publisher=
2022
-
[8]
1999 , publisher=
Moral imagination and management decision-making , author=. 1999 , publisher=
1999
-
[9]
Social Behavior and Personality: an international journal , volume=
Measuring moral imagination , author=. Social Behavior and Personality: an international journal , volume=. 2006 , publisher=
2006
-
[10]
Korean Environmental Philosophy , pages=
Seeking Compromise in the Problem of Rational Disagreement and the Issue of Integrity: Policy Decision-Making in Practical Ethics , author=. Korean Environmental Philosophy , pages=
-
[11]
Annual review of psychology , volume=
The moral psychology of artificial intelligence , author=. Annual review of psychology , volume=. 2024 , publisher=
2024
-
[12]
2014 , publisher=
Moral imagination: Implications of cognitive science for ethics , author=. 2014 , publisher=
2014
-
[13]
Central Works of Philosophy v1 , pages=
Aristotle: nicomachean ethics , author=. Central Works of Philosophy v1 , pages=. 2015 , publisher=
2015
-
[14]
Philosophical Psychology , volume=
Self-sacrifice and the trolley problem , author=. Philosophical Psychology , volume=. 2013 , publisher=
2013
-
[15]
Proceedings of the AAAI Conference on Artificial Intelligence , volume=
Scruples: A corpus of community ethical judgments on 32,000 real-life anecdotes , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=
-
[16]
Advances in Neural Information Processing Systems , volume=
Moca: Measuring human-language model alignment on causal and moral judgment tasks , author=. Advances in Neural Information Processing Systems , volume=
-
[17]
Proceedings of the International Conference on Learning Representations (ICLR) , year=
Aligning AI With Shared Human Values , author=. Proceedings of the International Conference on Learning Representations (ICLR) , year=
-
[18]
Business & Society , volume=
Examining the impact of moral imagination on organizational decision making , author=. Business & Society , volume=. 2015 , publisher=
2015
-
[19]
The Annals of the American Academy of Political and Social Science , volume=
Flexibility in conflict episodes , author=. The Annals of the American Academy of Political and Social Science , volume=. 1995 , publisher=
1995
-
[20]
Both/And
Microstructure of “Both/And”: SMART Strategies for Simultaneously Engaging Paradoxical Opposites , author=. Journal of Management , pages=. 2025 , publisher=
2025
-
[21]
Journal of Human Values , volume=
Using a Paradox Approach to Explore Work--Nonwork Issues , author=. Journal of Human Values , volume=. 2025 , publisher=
2025
-
[22]
Proceedings of the 49th annual meeting of the association for computational linguistics: Human language technologies , pages=
Learning word vectors for sentiment analysis , author=. Proceedings of the 49th annual meeting of the association for computational linguistics: Human language technologies , pages=
-
[23]
On the Opportunities and Risks of Foundation Models
On the opportunities and risks of foundation models , author=. arXiv preprint arXiv:2108.07258 , year=
work page internal anchor Pith review Pith/arXiv arXiv
-
[24]
Findings of the Association for Computational Linguistics: NAACL 2024 , pages=
Large language models sensitivity to the order of options in multiple-choice questions , author=. Findings of the Association for Computational Linguistics: NAACL 2024 , pages=
2024
-
[25]
DREAM : Improving Situational QA by First Elaborating the Situation
Gu, Yuling and Dalvi, Bhavana and Clark, Peter. DREAM : Improving Situational QA by First Elaborating the Situation. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022
2022
-
[26]
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing , pages=
Checkeval: A reliable llm-as-a-judge framework for evaluating text generation using checklists , author=. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing , pages=
2025
-
[27]
2025 , doi =
Yang, An and Li, Anfeng and Yang, Baosong and Zhang, Beichen and Hui, Binyuan and Zheng, Bo and Yu, Bowen and Gao, Chang and Huang, Chengen and Lv, Chenxu and others , journal =. 2025 , doi =
2025
-
[28]
Grattafiori, Aaron and Dubey, Abhimanyu and Jauhri, Abhinav and Pandey, Abhinav and Kadian, Abhishek and Al-Dahle, Ahmad and Letman, Aiesha and Mathur, Akhil and Schelten, Alan and Vaughan, Alex and others , journal =. The. 2024 , doi =
2024
-
[29]
2024 , url =
Lin, Ji and Tang, Jiaming and Tang, Haotian and Yang, Shang and Chen, Wei-Ming and Wang, Wei-Chen and Xiao, Guangxuan and Dang, Xingyu and Gan, Chuang and Han, Song , booktitle =. 2024 , url =
2024
-
[30]
Sentence- BERT : Sentence Embeddings using Siamese BERT -Networks
Reimers, Nils and Gurevych, Iryna , booktitle =. Sentence-. 2019 , address =. doi:10.18653/v1/D19-1410 , url =
-
[31]
Large Language Models Sensitivity to The Order of Options in Multiple-Choice Questions
Pezeshkpour, Pouya and Hruschka, Estevam. Large Language Models Sensitivity to The Order of Options in Multiple-Choice Questions. Findings of the Association for Computational Linguistics: NAACL 2024. 2024. doi:10.18653/v1/2024.findings-naacl.130
-
[32]
Nature , volume=
The moral machine experiment , author=. Nature , volume=. 2018 , publisher=
2018
-
[33]
International Conference on Learning Representations , volume=
Language model alignment in multilingual trolley problems , author=. International Conference on Learning Representations , volume=
-
[34]
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) , year=
MPST: A corpus of movie plot synopses with tags , author=. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) , year=
2018
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.