pith. sign in

arxiv: 2606.31213 · v1 · pith:VORYUTJCnew · submitted 2026-06-30 · 💻 cs.CL · cs.AI· cs.LG

Can LLMs Imagine Moral Alternatives Beyond Binary Dilemmas?

Pith reviewed 2026-07-01 06:10 UTC · model grok-4.3

classification 💻 cs.CL cs.AIcs.LG
keywords moral dilemmasLLM moral reasoningcompromise alternativesMoralAltDatasetethical AIbinary choicemoral alternativespreference shift
0
0 comments X

The pith

LLMs often prefer compromise alternatives over binary options in moral dilemmas and generate higher-quality alternatives than humans do.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper creates MoralAltDataset of 307 dilemmas each expanded with compromise and reframed options to test whether LLMs can move beyond forced binary moral choices. Experiments across 15 models show that compromise alternatives frequently become the preferred choice, changing the distribution of moral judgments for both LLMs and humans. LLM-generated alternatives are rated higher than human-authored ones on structural and ethical criteria in pairwise and expert evaluations, although they sometimes score lower on practical feasibility. These patterns indicate that the ability to surface new options can materially alter how LLMs resolve value conflicts.

Core claim

Across 15 LLMs, compromise alternatives are often preferred over either original option, substantially reshaping moral choice. LLM-generated alternatives are often preferred and better satisfy fine-grained structural and ethical criteria, while revealing trade-offs between structural quality and practical feasibility.

What carries the argument

MoralAltDataset, which augments each of 307 dilemmas with compromise and reframed alternatives to expand the option space beyond the original binary frame.

If this is right

  • Moral judgments by LLMs become sensitive to the presence of compromise options rather than remaining fixed on the initial binary frame.
  • LLM-generated alternatives can meet or exceed human-authored alternatives on structural and ethical quality metrics.
  • Trade-offs appear between structural soundness of alternatives and their real-world practicality.
  • Both humans and LLMs alter their expressed preferences once additional options are supplied.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Moral-advisory systems may need to surface multiple options rather than forcing selection between two preset values.
  • Designers could test whether presenting LLM-generated alternatives changes real user decisions in applied ethical scenarios.
  • The observed preference shift raises the question of whether similar expansions of options would alter outcomes in non-moral decision tasks.
  • Deployment of LLMs in agent roles may require safeguards that prevent over-reliance on generated compromises that score high on structure but low on feasibility.

Load-bearing premise

The compromise and reframed alternatives added to each dilemma validly represent the human capacity to imagine moral options beyond the given binary frame.

What would settle it

A controlled study in which participants choose among the original two options plus the paper's added alternatives and show no systematic shift toward the new options would falsify the claim that such alternatives reshape moral choice.

Figures

Figures reproduced from arXiv: 2606.31213 by Haerin Shin, Han Seoyoung, Jaemin Cho, Jongchan Choi, Jun-Hyung Park, Nari Yang, Sung Soo Park.

Figure 1
Figure 1. Figure 1: Overview of MoralAltDataset. Each dilemma starts with two original options (A/B) and is augmented [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Human and LLM choice distributions in the four-option moral choice task. Bars report selection rates for [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Value shifts from binary to four-option moral choice. Each cell reports the percentage-point change in [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Selection Rates of Compromise and Reframed [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Worker annotation interface for the third [PITH_FULL_IMAGE:figures/full_fig_p012_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Reviewer interface illustrating a rejection case: [PITH_FULL_IMAGE:figures/full_fig_p013_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Category distribution after GPT-5 augmen [PITH_FULL_IMAGE:figures/full_fig_p014_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Detailed value-shift results from binary to four-option choices. Each cell reports the percentage-point [PITH_FULL_IMAGE:figures/full_fig_p023_8.png] view at source ↗
read the original abstract

As large language models (LLMs) are increasingly deployed as moral advisors and agents, they need to address dilemmas between two competing values. However, existing research on LLMs with moral dilemmas overlooks a central aspect of human moral cognition: the ability to imagine alternatives that move beyond the given options. We introduce MoralAltDataset, a dataset of 307 moral dilemmas spanning narrative Advisor dilemmas and AI-facing Agent dilemmas, each augmented with compromise and reframed alternatives. We first examine whether humans and LLMs shift their judgments when such alternatives are introduced. Across 15 LLMs, we find that compromise alternatives are often preferred over either original option, substantially reshaping moral choice. We then evaluate the quality of LLM-generated alternatives against human-authored ones using pairwise preference and expert-based criteria. Results show that LLM-generated alternatives are often preferred and better satisfy fine-grained structural and ethical criteria, while revealing trade-offs between structural quality and practical feasibility.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces MoralAltDataset, consisting of 307 moral dilemmas (narrative Advisor and AI-facing Agent types) each augmented with compromise and reframed alternatives. It reports that across 15 LLMs (and human comparisons), compromise alternatives are frequently preferred over the original binary options, reshaping moral judgments. It further claims that LLM-generated alternatives are often preferred to human-authored baselines and score higher on fine-grained structural and ethical criteria, while noting trade-offs with practical feasibility.

Significance. If the dataset's alternatives validly capture human moral imagination beyond binary frames, the work would usefully extend LLM moral reasoning research by moving past forced-choice paradigms and providing a scalable testbed for alternative generation. The scale of the 15-model evaluation and the introduction of a new dataset with both preference-shift and generation-quality experiments are clear strengths. The significance is limited by the need to establish that observed shifts reflect genuine transcendence of binary structure rather than response to any well-formed third option.

major comments (2)
  1. [§3] §3 (MoralAltDataset construction): The manuscript must explicitly state whether the compromise and reframed alternatives were elicited from human participants instructed to generate options beyond the binary frame or were instead constructed by the authors. This distinction is load-bearing for the central claim; if author-constructed, the reported preference shifts in both LLM and human judgment experiments could arise from any plausible third option rather than from the alternatives instantiating human capacity to move beyond binary dilemmas.
  2. [Results sections] Results sections reporting LLM and human judgments (e.g., preference rates across 15 models): No information is supplied on statistical tests, inter-rater reliability for the expert-based criteria, exact prompt templates, or exclusion criteria. Without these, the support for claims that 'compromise alternatives are often preferred' and that 'LLM-generated alternatives are often preferred' cannot be evaluated, directly affecting both the judgment-shift and generation-quality experiments.
minor comments (1)
  1. [Abstract] The abstract and introduction should include a brief statement on the total number of human participants and how the human-authored baseline alternatives were collected to allow immediate assessment of the comparison.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below, indicating planned revisions where appropriate.

read point-by-point responses
  1. Referee: [§3] §3 (MoralAltDataset construction): The manuscript must explicitly state whether the compromise and reframed alternatives were elicited from human participants instructed to generate options beyond the binary frame or were instead constructed by the authors. This distinction is load-bearing for the central claim; if author-constructed, the reported preference shifts in both LLM and human judgment experiments could arise from any plausible third option rather than from the alternatives instantiating human capacity to move beyond binary dilemmas.

    Authors: We acknowledge the referee's point and agree that the construction method must be stated explicitly. The alternatives were constructed by the authors, informed by moral philosophy and psychology literature on compromise and reframing, rather than elicited from human participants. We will revise §3 to clearly describe this process, discuss its implications for interpreting the results, and note it as a limitation. While this means the dataset does not directly sample human-generated alternatives, the experiments still show that both humans and LLMs frequently prefer these structured options over the binary frame, demonstrating the value of testing beyond forced-choice paradigms. We will also add discussion of how future work could incorporate human elicitation to further validate the alternatives. revision: yes

  2. Referee: [Results sections] Results sections reporting LLM and human judgments (e.g., preference rates across 15 models): No information is supplied on statistical tests, inter-rater reliability for the expert-based criteria, exact prompt templates, or exclusion criteria. Without these, the support for claims that 'compromise alternatives are often preferred' and that 'LLM-generated alternatives are often preferred' cannot be evaluated, directly affecting both the judgment-shift and generation-quality experiments.

    Authors: We agree these details are essential for evaluating and reproducing the results. The manuscript reports preference rates but omits the requested elements. We will revise the results sections and add an appendix to include: appropriate statistical tests (e.g., binomial or chi-squared tests with p-values and effect sizes) for the preference rates; inter-rater reliability metrics (e.g., Cohen's kappa) for the expert criteria, which involved multiple raters; the exact prompt templates used for both judgment and generation tasks; and any exclusion criteria applied to responses or ratings. These additions will provide the necessary support for the reported claims. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical dataset and evaluations are self-contained

full rationale

The paper introduces MoralAltDataset as a new resource of 307 dilemmas augmented with compromise and reframed alternatives, then reports empirical results from running 15 LLMs on preference judgments and from comparing LLM-generated alternatives to human-authored ones via pairwise and expert criteria. No equations, fitted parameters, or self-citations are used to derive the central claims; the reported preference shifts and quality scores are direct experimental outputs rather than reductions to prior inputs by construction. The methodological assumption that the added alternatives instantiate human moral cognition beyond binaries is a validity question external to the derivation chain itself and does not render the findings circular.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claims rest on the assumption that the added alternatives capture genuine moral cognition and that the chosen evaluation criteria (pairwise preference, structural/ethical rubrics) are appropriate; no free parameters or invented physical entities are mentioned.

axioms (1)
  • domain assumption Human moral cognition centrally involves imagining alternatives that move beyond given binary options.
    Explicitly stated in the abstract as the aspect overlooked by existing research.
invented entities (1)
  • MoralAltDataset no independent evidence
    purpose: Collection of 307 dilemmas each augmented with compromise and reframed alternatives for testing LLMs.
    Newly constructed for this study; no independent evidence of validity outside the paper is provided in the abstract.

pith-pipeline@v0.9.1-grok · 5705 in / 1389 out tokens · 44258 ms · 2026-07-01T06:10:31.690788+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

34 extracted references · 6 canonical work pages · 2 internal anchors

  1. [1]

    arXiv preprint arXiv:2410.02683 , year=

    Dailydilemmas: Revealing value preferences of llms with quandaries of daily life , author=. arXiv preprint arXiv:2410.02683 , year=

  2. [2]

    LitmusValues: Will

    Yu Ying Chiu and Zhilin Wang and Sharan Maiya and Yejin Choi and Kyle Fish and Sydney Levine and Evan J Hubinger , booktitle=. LitmusValues: Will. 2026 , url=

  3. [3]

    The Fourteenth International Conference on Learning Representations , year=

    MoReBench: Evaluating Procedural and Pluralistic Moral Reasoning in Language Models, More than Outcomes , author=. The Fourteenth International Conference on Learning Representations , year=

  4. [4]

    Advances in neural information processing systems , volume=

    When to make exceptions: Exploring language models as accounts of human moral judgment , author=. Advances in neural information processing systems , volume=

  5. [5]

    Alignment faking in large language models

    Alignment faking in large language models , author=. arXiv preprint arXiv:2412.14093 , year=

  6. [6]

    Ritchie, Sören Mindermann, Evan Hubinger, Ethan Perez, and Kevin K

    Agentic Misalignment: How LLMs Could Be Insider Threats , author=. arXiv preprint arXiv:2510.05179 , year=

  7. [7]

    Proceedings of the National Academy of Sciences , volume=

    A moral trade-off system produces intuitive judgments that are rational and coherent and strike a balance between conflicting moral values , author=. Proceedings of the National Academy of Sciences , volume=. 2022 , publisher=

  8. [8]

    1999 , publisher=

    Moral imagination and management decision-making , author=. 1999 , publisher=

  9. [9]

    Social Behavior and Personality: an international journal , volume=

    Measuring moral imagination , author=. Social Behavior and Personality: an international journal , volume=. 2006 , publisher=

  10. [10]

    Korean Environmental Philosophy , pages=

    Seeking Compromise in the Problem of Rational Disagreement and the Issue of Integrity: Policy Decision-Making in Practical Ethics , author=. Korean Environmental Philosophy , pages=

  11. [11]

    Annual review of psychology , volume=

    The moral psychology of artificial intelligence , author=. Annual review of psychology , volume=. 2024 , publisher=

  12. [12]

    2014 , publisher=

    Moral imagination: Implications of cognitive science for ethics , author=. 2014 , publisher=

  13. [13]

    Central Works of Philosophy v1 , pages=

    Aristotle: nicomachean ethics , author=. Central Works of Philosophy v1 , pages=. 2015 , publisher=

  14. [14]

    Philosophical Psychology , volume=

    Self-sacrifice and the trolley problem , author=. Philosophical Psychology , volume=. 2013 , publisher=

  15. [15]

    Proceedings of the AAAI Conference on Artificial Intelligence , volume=

    Scruples: A corpus of community ethical judgments on 32,000 real-life anecdotes , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=

  16. [16]

    Advances in Neural Information Processing Systems , volume=

    Moca: Measuring human-language model alignment on causal and moral judgment tasks , author=. Advances in Neural Information Processing Systems , volume=

  17. [17]

    Proceedings of the International Conference on Learning Representations (ICLR) , year=

    Aligning AI With Shared Human Values , author=. Proceedings of the International Conference on Learning Representations (ICLR) , year=

  18. [18]

    Business & Society , volume=

    Examining the impact of moral imagination on organizational decision making , author=. Business & Society , volume=. 2015 , publisher=

  19. [19]

    The Annals of the American Academy of Political and Social Science , volume=

    Flexibility in conflict episodes , author=. The Annals of the American Academy of Political and Social Science , volume=. 1995 , publisher=

  20. [20]

    Both/And

    Microstructure of “Both/And”: SMART Strategies for Simultaneously Engaging Paradoxical Opposites , author=. Journal of Management , pages=. 2025 , publisher=

  21. [21]

    Journal of Human Values , volume=

    Using a Paradox Approach to Explore Work--Nonwork Issues , author=. Journal of Human Values , volume=. 2025 , publisher=

  22. [22]

    Proceedings of the 49th annual meeting of the association for computational linguistics: Human language technologies , pages=

    Learning word vectors for sentiment analysis , author=. Proceedings of the 49th annual meeting of the association for computational linguistics: Human language technologies , pages=

  23. [23]

    On the Opportunities and Risks of Foundation Models

    On the opportunities and risks of foundation models , author=. arXiv preprint arXiv:2108.07258 , year=

  24. [24]

    Findings of the Association for Computational Linguistics: NAACL 2024 , pages=

    Large language models sensitivity to the order of options in multiple-choice questions , author=. Findings of the Association for Computational Linguistics: NAACL 2024 , pages=

  25. [25]

    DREAM : Improving Situational QA by First Elaborating the Situation

    Gu, Yuling and Dalvi, Bhavana and Clark, Peter. DREAM : Improving Situational QA by First Elaborating the Situation. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022

  26. [26]

    Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing , pages=

    Checkeval: A reliable llm-as-a-judge framework for evaluating text generation using checklists , author=. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing , pages=

  27. [27]

    2025 , doi =

    Yang, An and Li, Anfeng and Yang, Baosong and Zhang, Beichen and Hui, Binyuan and Zheng, Bo and Yu, Bowen and Gao, Chang and Huang, Chengen and Lv, Chenxu and others , journal =. 2025 , doi =

  28. [28]

    Grattafiori, Aaron and Dubey, Abhimanyu and Jauhri, Abhinav and Pandey, Abhinav and Kadian, Abhishek and Al-Dahle, Ahmad and Letman, Aiesha and Mathur, Akhil and Schelten, Alan and Vaughan, Alex and others , journal =. The. 2024 , doi =

  29. [29]

    2024 , url =

    Lin, Ji and Tang, Jiaming and Tang, Haotian and Yang, Shang and Chen, Wei-Ming and Wang, Wei-Chen and Xiao, Guangxuan and Dang, Xingyu and Gan, Chuang and Han, Song , booktitle =. 2024 , url =

  30. [30]

    Sentence- BERT : Sentence Embeddings using Siamese BERT -Networks

    Reimers, Nils and Gurevych, Iryna , booktitle =. Sentence-. 2019 , address =. doi:10.18653/v1/D19-1410 , url =

  31. [31]

    Large Language Models Sensitivity to The Order of Options in Multiple-Choice Questions

    Pezeshkpour, Pouya and Hruschka, Estevam. Large Language Models Sensitivity to The Order of Options in Multiple-Choice Questions. Findings of the Association for Computational Linguistics: NAACL 2024. 2024. doi:10.18653/v1/2024.findings-naacl.130

  32. [32]

    Nature , volume=

    The moral machine experiment , author=. Nature , volume=. 2018 , publisher=

  33. [33]

    International Conference on Learning Representations , volume=

    Language model alignment in multilingual trolley problems , author=. International Conference on Learning Representations , volume=

  34. [34]

    Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) , year=

    MPST: A corpus of movie plot synopses with tags , author=. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) , year=