Smart But Not Moral? Moral Alignment In Human-AI Decision-Making

Christiane Ernst; Domenique Zipperling; Kathrin Figl; Luis Gutmann; Niklas Kühl

arxiv: 2604.14371 · v1 · submitted 2026-04-15 · 💻 cs.HC

Pith reviewed 2026-05-10 12:09 UTC · model grok-4.3

keywords: moral alignment, human-AI decision-making, value congruence, Moral Foundations Theory, multi-stakeholder perspective, AI ethics, high-stakes decisions, fairness and responsibility

The pith

Moral alignment may be a more fundamental dimension of human-AI decision-making than functional or behavioral alignment.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that high-stakes AI-supported decisions involve moral judgments about fairness, responsibility, and harm that go beyond purely technical performance. It defines moral alignment as the perceived congruence between values embedded in an AI system's decision logic and the moral intuitions of stakeholders. This dimension is positioned as potentially more basic than functional alignment because misalignment on morals can block meaningful integration of AI even when the system performs its tasks correctly. The argument draws on Moral Foundations Theory from a multi-stakeholder perspective to show why such congruence matters in sensitive contexts.

Core claim

Moral alignment, defined as the perceived congruence between the values embedded in an AI system's decision logic and the moral intuitions of stakeholders, may be a more fundamental dimension of human-AI decision-making than functional or behavioral alignment, and moral (mis)alignment therefore shapes whether AI can be integrated meaningfully in contexts that involve fairness, responsibility, and harm.

What carries the argument

Moral alignment, defined as the perceived congruence between the values embedded in an AI system's decision logic and the moral intuitions of stakeholders.

If this is right

AI design in sensitive domains must address embedded values to achieve stakeholder congruence beyond task accuracy.
Misalignment on moral intuitions can prevent adoption even when an AI system is functionally effective.
A multi-stakeholder view identifies differing moral foundations that an AI decision logic must accommodate.
Moral Foundations Theory supplies specific categories for diagnosing where value mismatches occur.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Design processes could incorporate explicit checks for how stakeholders perceive the moral values in an AI's rules.
Evaluation of deployed systems might include measures of moral congruence alongside accuracy and usability metrics.
Training data selection and model objectives could be adjusted to increase the chance of producing decisions that feel morally aligned to users.
Organizations using AI in high-stakes settings may need to map stakeholder moral intuitions before choosing or tuning decision logic.
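One way to make a "measure of moral congruence" concrete is sketched below. It is an illustration only, not a method from the paper: it assumes that each stakeholder group and the AI system can each be described by a numeric weight per moral foundation (for example from survey ratings), and it uses cosine similarity as the congruence score. The function name `congruence`, the shortened foundation labels, and all numbers are hypothetical.

```python
from math import sqrt

# Shortened labels for the five foundations of Moral Foundations Theory.
FOUNDATIONS = ("care", "fairness", "loyalty", "authority", "sanctity")


def congruence(stakeholder, system):
    """Cosine similarity between two foundation-weight profiles.

    Both arguments map every name in FOUNDATIONS to a non-negative number.
    The choice of cosine similarity is an assumption made for this sketch.
    """
    dot = sum(stakeholder[f] * system[f] for f in FOUNDATIONS)
    norm_s = sqrt(sum(stakeholder[f] ** 2 for f in FOUNDATIONS))
    norm_a = sqrt(sum(system[f] ** 2 for f in FOUNDATIONS))
    if norm_s == 0 or norm_a == 0:
        raise ValueError("profiles must not be all zeros")
    return dot / (norm_s * norm_a)


# Hypothetical profiles: how strongly each group weighs each foundation,
# and how stakeholders perceive the values embedded in the system.
perceived_system = {"care": 1, "fairness": 5, "loyalty": 0, "authority": 2, "sanctity": 0}
groups = {
    "patients": {"care": 5, "fairness": 3, "loyalty": 1, "authority": 1, "sanctity": 2},
    "clinicians": {"care": 3, "fairness": 5, "loyalty": 1, "authority": 3, "sanctity": 0},
}

# One score per stakeholder group, reflecting the multi-stakeholder view.
scores = {name: congruence(profile, perceived_system) for name, profile in groups.items()}
```

Reporting one score per group, rather than a single average, keeps disagreements between stakeholder groups visible.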

Load-bearing premise

Moral alignment can be treated as a distinct and more fundamental dimension than functional or behavioral alignment without empirical validation of its priority.

What would settle it

An experiment that measures acceptance and trust in a high-stakes AI decision system while varying only functional performance versus moral congruence and finds that functional performance alone accounts for the outcomes.
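The experiment described above can be read as a 2x2 factorial design that crosses functional performance (low/high) with moral congruence (low/high). The sketch below shows how the two main effects and their interaction could be computed from cell means. The design, the function name `factorial_effects`, and the ratings are assumptions made for illustration; the paper reports no such study or data.

```python
from statistics import mean


def factorial_effects(cells):
    """Main effects and interaction for a 2x2 design.

    `cells` maps (performance, congruence) pairs, each "low" or "high",
    to a list of acceptance ratings collected in that condition.
    """
    m = {key: mean(values) for key, values in cells.items()}
    performance = (mean([m[("high", "low")], m[("high", "high")]])
                   - mean([m[("low", "low")], m[("low", "high")]]))
    congruence = (mean([m[("low", "high")], m[("high", "high")]])
                  - mean([m[("low", "low")], m[("high", "low")]]))
    interaction = ((m[("high", "high")] - m[("high", "low")])
                   - (m[("low", "high")] - m[("low", "low")]))
    return {"performance": performance,
            "moral_congruence": congruence,
            "interaction": interaction}


# Made-up ratings in which only functional performance moves acceptance,
# i.e. the pattern that would count against the paper's priority claim.
ratings = {
    ("low", "low"): [2, 2, 3],
    ("low", "high"): [2, 3, 2],
    ("high", "low"): [5, 4, 5],
    ("high", "high"): [4, 5, 5],
}
effects = factorial_effects(ratings)
```

A real study would also need inferential statistics (for example a two-way ANOVA) and a validated manipulation of moral congruence. The sketch only shows which contrasts bear on the claim.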

Original abstract

In high-stakes AI-supported decisions, considerations are not purely technical but involve moral judgments about fairness, responsibility, and harm. While prior research has focused mainly on functional or behavioral alignment, this paper argues that moral alignment may be a more fundamental dimension of human-AI decision-making. Moral alignment is defined as the perceived congruence between the values embedded in an AI system's decision logic and the moral intuitions of stakeholders. Building on Moral Foundations Theory, the paper adopts a multi-stakeholder perspective and highlights why moral (mis)alignment matters for the meaningful integration of AI in sensitive contexts.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This is a short position paper that asserts moral alignment as more fundamental than functional alignment in AI decisions but provides no derivation or evidence for the priority.

This paper argues that in high-stakes AI-supported decisions, moral alignment may matter more than the functional or behavioral alignment that most prior work targets. Moral alignment here means the perceived match between values built into the AI and the moral intuitions of the people affected, drawing on Moral Foundations Theory and a multi-stakeholder angle to explain why clashes could hinder adoption in areas like healthcare or justice.

Referee Report

3 major / 2 minor

Summary. The paper argues that in high-stakes AI-supported decisions, moral alignment—defined as the perceived congruence between the values embedded in an AI system's decision logic and the moral intuitions of stakeholders, drawing on Moral Foundations Theory—represents a more fundamental dimension of human-AI decision-making than functional or behavioral alignment. It adopts a multi-stakeholder perspective to explain the importance of moral (mis)alignment for meaningful AI integration in sensitive contexts.

Significance. If the priority of moral alignment over other forms could be derived or validated, the argument would reorient HCI and AI ethics research toward value congruence as a core design criterion, with potential implications for guidelines in domains like healthcare, justice, and autonomous systems. The multi-stakeholder framing offers a useful lens, but the lack of supporting analysis limits its current contribution to theory-building.

major comments (3)

[Abstract] The claim that moral alignment 'may be a more fundamental dimension' is presented as an argument without any logical derivation, comparative analysis, or counter-example showing why it takes precedence over functional/behavioral alignment rather than coexisting as one factor among several.
[Abstract] The manuscript relies on Moral Foundations Theory to define moral alignment but provides no explicit mapping of its foundations (e.g., care, fairness) to AI decision logics, nor any justification for why this establishes priority in high-stakes decisions.
[Abstract] No empirical evidence, illustrative scenarios, or formal steps are offered to demonstrate the distinction between moral misalignment and other alignment failures or to show differential impacts on integration.

minor comments (2)

The title 'Smart But Not Moral?' is somewhat misleading as the focus is on alignment rather than AI possessing or lacking morality.
Terms such as 'meaningful integration' and 'sensitive contexts' would benefit from more precise operational definitions to clarify the scope of the argument.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback on our manuscript. The comments highlight important opportunities to strengthen the clarity and rigor of our conceptual argument. We address each major comment point by point below and indicate where the next version of the paper will be revised.

Referee: [Abstract] The claim that moral alignment 'may be a more fundamental dimension' is presented as an argument without any logical derivation, comparative analysis, or counter-example showing why it takes precedence over functional/behavioral alignment rather than coexisting as one factor among several.

Authors: We acknowledge that the abstract presents the core claim concisely as a conceptual proposition rather than a formally derived result. The full manuscript develops the argument through a multi-stakeholder lens grounded in Moral Foundations Theory, contrasting moral alignment with functional and behavioral approaches by showing how value congruence underpins stakeholder acceptance in high-stakes contexts even when technical performance is adequate. To address the request for greater explicitness, we will revise the abstract to briefly signal the reasoning and add a dedicated comparative analysis subsection in the introduction that includes counter-examples from domains such as healthcare and justice. revision: partial

Referee: [Abstract] The manuscript relies on Moral Foundations Theory to define moral alignment but provides no explicit mapping of its foundations (e.g., care, fairness) to AI decision logics, nor any justification for why this establishes priority in high-stakes decisions.

Authors: This observation is fair and points to a useful clarification. The manuscript invokes Moral Foundations Theory to ground the definition of moral alignment from a multi-stakeholder perspective, but we agree that an explicit mapping would make the link to AI decision logics more transparent. In the revised version we will insert a new subsection (or table) that maps each foundation (care/harm, fairness/cheating, loyalty/betrayal, authority/subversion, sanctity/degradation) to concrete AI decision scenarios, together with a short justification of why these mappings confer priority in high-stakes settings where moral intuitions directly affect integration. revision: yes

Referee: [Abstract] No empirical evidence, illustrative scenarios, or formal steps are offered to demonstrate the distinction between moral misalignment and other alignment failures or to show differential impacts on integration.

Authors: As a theoretical contribution focused on theory-building rather than empirical validation, the manuscript does not present new data. It does, however, employ illustrative scenarios drawn from sensitive decision contexts to differentiate moral misalignment (e.g., an AI that is functionally aligned yet violates fairness or care intuitions, eroding trust) from purely technical or behavioral failures. We will expand these scenarios, add a short formal outline of the distinction, and include a discussion of differential impacts on meaningful integration to make the argument more accessible. revision: partial

Circularity Check

0 steps flagged

No circularity; conceptual argument relies on external Moral Foundations Theory without self-referential reduction

The paper offers a conceptual argument that moral alignment (defined as perceived value congruence with stakeholder intuitions) may be more fundamental than functional or behavioral alignment in human-AI decisions. It adopts a multi-stakeholder lens and builds explicitly on Moral Foundations Theory from external literature. No equations, fitted parameters, predictions, or derivations exist that reduce to the paper's own inputs by construction. No self-citations are load-bearing in a circular manner, and the priority claim is asserted as an analytical perspective rather than derived from a self-referential loop. The argument therefore rests on external literature rather than on its own outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The claim rests on the domain assumption that Moral Foundations Theory validly captures moral intuitions applicable to AI systems and that perceived congruence constitutes a measurable and primary alignment dimension.

axioms (1)

Domain assumption: Moral Foundations Theory provides an appropriate framework for stakeholder moral intuitions in AI decision contexts.
Invoked to ground the multi-stakeholder perspective without additional justification in the abstract.

Reference graph

Works this paper leans on

11 extracted references · 11 canonical work pages

[1] Atari, M., Haidt, J., Graham, J., Koleva, S., Stevens, S. T., & Dehghani, M. (2023). Morality beyond the WEIRD: How the nomological network of morality varies across cultures. Journal of Personality and Social Psychology, 125(5), 1157-1188.
[2] Bhat, S., Lyons, J. B., Shi, C., & Yang, X. J. (2024). Value Alignment and Trust in Human-robot Interaction: Insights From Simulation and User Study. In Discovering the Frontiers of Human-Robot Interaction: Insights and Innovations in Collaboration, Communication, and Control (pp. 39-63). Springer.
[3] Bonnefon, J.-F., Rahwan, I., & Shariff, A. (2024). The moral psychology of Artificial Intelligence. Annual Review of Psychology, 75(1), 653-675.
[4] Byrne, D. E. (1972). The attraction paradigm. Behavior Therapy, 3(2), 337-338.
[5] Dargnies, M.-P., Hakimov, R., & Kübler, D. (2024). Aversion to hiring algorithms: Transparency, gender profiling, and self-confidence. Management Science.
[6] Evans, J. (2025). Calling for Fair Visibility for All on LinkedIn. Change.org. Retrieved March 2026 from https://www.change.org/p/calling-for-fair-visibility-for-all-on-linkedin
[7] Graham, J., Haidt, J., & Nosek, B. A. (2009). Liberals and conservatives rely on different sets of moral foundations. Journal of Personality and Social Psychology, 96(5), 1029-1046.
[8] Haidt, J., & Graham, J. (2007). When morality opposes justice: Conservatives have moral intuitions that liberals may not recognize. Social Justice Research, 20(1), 98-116.
[9] Kirshner, S. N. (2024). Psychological Distance and Algorithm Aversion: Congruency and Advisor Confidence. Service Science.
[10] McPherson, M., Smith-Lovin, L., & Cook, J. M. (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27(1), 415-444.
[11] Zipperling, D., Deck, L., Lanzl, J., & Kühl, N. (2025). It's Only Fair When I Think It's Fair: How Gender Bias Alignment Undermines Distributive Fairness in Human-AI Collaboration. Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency, Athens, Greece.
