Grounding Value Alignment with Ethical Principles

John Hooker; Tae Wan Kim; Thomas Donaldson

arXiv: 1907.05447 · v1 · pith:HMM7HROGnew · submitted 2019-07-11 · cs.AI · cs.CY · cs.LG

Pith reviewed 2026-05-24 22:55 UTC · model grok-4.3

keywords: value alignment · ethical principles · naturalistic fallacy · quantified modal logic · AI ethics

The pith

Quantified modal logic connects ethical principles to factual propositions for AI value alignment.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper identifies that many AI value alignment designs commit the naturalistic fallacy by attempting to derive an ought from an is. It also argues that standard training routines fail to simulate how humans integrate ethical principles with facts. The authors propose an approach using quantified modal logic to connect ethical principles with propositions about states of affairs. A sympathetic reader would care because this interrelation is needed for hybrid ethical and empirical reasoning in future AI systems.

Core claim

Using concepts of quantified modal logic, the paper offers an approach that promises to simulate ethical reasoning in humans by connecting ethical principles on the one hand and propositions about states of affairs on the other.

What carries the argument

Quantified modal logic that links ethical principles to propositions about states of affairs.

If this is right

AI systems can integrate ethical reasoning and empirical observation without committing the naturalistic fallacy.
Value alignment training routines can be redesigned to better simulate human integration of principles and facts.
Designers gain a formal method to ground machine behavior in both ethical principles and factual states.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same logical connection might support verification of ethical compliance in specific AI decision domains.
It could extend to handling obligations across possible but non-actual situations in machine reasoning.

Load-bearing premise

Quantified modal logic can successfully model the integration of ethical principles with factual propositions in a way that avoids the naturalistic fallacy and approximates human ethical reasoning.

What would settle it

A test case in which the logic either derives an ought directly from an is or produces decisions inconsistent with human ethical judgments would show the approach does not work.
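One way to picture the first half of such a test case is a provenance check over a toy rule base. The sketch below is only an illustration under loud assumptions: it is not quantified modal logic and not the authors' construction, and the names `Statement`, `Rule`, `derive`, and `is_ought_violations` are hypothetical. Each statement is tagged as factual ("is") or normative ("ought"), and a derived "ought" is flagged when none of the base statements supporting it is itself normative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    text: str
    kind: str  # "is" = factual proposition, "ought" = ethical principle or obligation


@dataclass(frozen=True)
class Rule:
    premises: tuple        # texts of the statements the rule needs
    conclusion: Statement  # statement the rule licenses


def derive(base, rules):
    """Forward-chain to a fixed point, recording which base statements
    support each derived statement (first derivation found only)."""
    known = {s.text: s for s in base}
    support = {s.text: frozenset([s.text]) for s in base}
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.conclusion.text in known:
                continue
            if all(p in known for p in rule.premises):
                known[rule.conclusion.text] = rule.conclusion
                support[rule.conclusion.text] = frozenset().union(
                    *(support[p] for p in rule.premises))
                changed = True
    return known, support


def is_ought_violations(base, rules):
    """Return derived 'ought' statements that rest on no normative base premise."""
    base_kind = {s.text: s.kind for s in base}
    known, support = derive(base, rules)
    return sorted(
        text for text, stmt in known.items()
        if stmt.kind == "ought"
        and not any(base_kind[b] == "ought" for b in support[text])
    )


# Hypothetical example statements, invented for illustration.
fact = Statement("action A deceives the user", "is")
survey = Statement("most observed users choose action B", "is")
principle = Statement("deceptive actions ought not be performed", "ought")

# Combines a principle with a fact: acceptable.
sound = Rule((fact.text, principle.text),
             Statement("action A ought not be performed", "ought"))
# Concludes an obligation from observed behavior alone: is-to-ought.
fallacious = Rule((survey.text,),
                  Statement("action B ought to be performed", "ought"))

base = [fact, survey, principle]
clean = is_ought_violations(base, [sound])
flagged = is_ought_violations(base, [sound, fallacious])
print(clean)
print(flagged)
```

The check is purely syntactic: it looks at how a conclusion was reached, not at whether the principle itself is correct, so it speaks only to the "derives an ought directly from an is" half of the test and says nothing about agreement with human ethical judgments.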

Figures

Figures reproduced from arXiv: 1907.05447 by John Hooker, Tae Wan Kim, Thomas Donaldson.

**Figure 1.** Microsoft’s Twitter-bot Tay. Excerpt from the surrounding text: … contexts in which slavery or racism have been generally practiced and condoned. We can make sure machines are not exposed to behavior we consider unethical, but in that case, their ethical norms are not based on observed human preferences and values, but on the ethical principles espoused by their trainers. This, of course, is one reasonable approach. But when taking this approa…

**Figure 2.** MIT Media Lab’s Moral Machine website. Excerpt from the surrounding text: … are preferred to those with less utility. However, this sense of utility is not the same as the moral utilitarian’s, because it is only a measure of what individuals prefer, rather than an intrinsically valuable quality. Individuals may base their preferences on criteria other than their estimate of utility in the ethical sense. They may base their preferences on mere …

**Figure 3.** GenEth [11]. Excerpt from the surrounding text: … which actions would be preferable, in enough specific cases from which a machine-learning procedure arrived at a general principle.” Anderson and Anderson’s approach constitutes one of the better attempts to avoid the naturalistic fallacy, but reveals a number of shortcomings. One can interpret Anderson and Anderson’s maneuver as avoiding the naturalistic fallacy in one of two ways. On one inter…

Original abstract

An important step in the development of value alignment (VA) systems in AI is understanding how values can interrelate with facts. Designers of future VA systems will need to utilize a hybrid approach in which ethical reasoning and empirical observation interrelate successfully in machine behavior. In this article we identify two problems about this interrelation that have been overlooked by AI discussants and designers. The first problem is that many AI designers commit inadvertently a version of what has been called by moral philosophers the "naturalistic fallacy," that is, they attempt to derive an "ought" from an "is." We illustrate when and why this occurs. The second problem is that AI designers adopt training routines that fail fully to simulate human ethical reasoning in the integration of ethical principles and facts. Using concepts of quantified modal logic, we proceed to offer an approach that promises to simulate ethical reasoning in humans by connecting ethical principles on the one hand and propositions about states of affairs on the other.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper flags the naturalistic fallacy in AI value alignment and sketches a quantified modal logic bridge between principles and facts, but the abstract gives no proof that the logic blocks deriving ought from is.

The main takeaway is that this paper names two problems in value alignment work: AI designers sometimes slip into deriving ought from is, and training routines often fail to model how humans actually combine ethical principles with facts about the world. It then points to quantified modal logic as a way to connect the two sides without the bad inference. That framing is useful because it ties a classic philosophical issue directly to practical AI design choices.

Referee Report

2 major / 2 minor

Summary. The paper identifies two overlooked problems in AI value alignment: inadvertent commission of the naturalistic fallacy (deriving 'ought' from 'is') and training routines that fail to simulate human ethical reasoning in integrating principles with facts. It proposes an approach using concepts from quantified modal logic to connect ethical principles with propositions about states of affairs, promising to simulate ethical reasoning while avoiding these issues.

Significance. If the QML construction can be shown to enforce a strict separation between deontic and alethic modalities without permitting invalid inferences, the work would provide a formal tool for ethical AI design that respects the is-ought distinction. The identification of the two problems is a clear contribution to the VA literature, though the paper offers only a high-level promise rather than a verified implementation.

major comments (2)

[QML approach section] The section presenting the QML approach: no explicit axiomatization, semantics, or theorem is supplied demonstrating that the logic blocks any derivation of deontic obligations from factual (alethic) propositions alone; without such a non-derivability result the central claim that the framework avoids the naturalistic fallacy remains unverified.
[QML approach section] The claim that the approach 'promises to simulate ethical reasoning in humans': the manuscript provides neither a formal definition of the integration mechanism nor any worked example showing how ethical principles and state-of-affairs propositions interact in the logic, leaving the simulation claim unsubstantiated and load-bearing for the paper's contribution.

minor comments (2)

[Abstract and introduction] The abstract and introduction use 'quantified modal logic' without specifying which variant (e.g., constant vs. varying domains, which deontic axioms) is intended; a brief clarification would aid readability.
[Introduction] No references are given to prior formal work on deontic logic or is-ought separation in AI ethics (e.g., work on deontic modal logic in machine ethics); adding a short related-work paragraph would strengthen context.
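For readers unfamiliar with the distinction raised in the first minor comment: in Kripke semantics for quantified modal logic, the choice between constant and varying domains shows up in whether the Barcan formula (BF) and its converse (CBF) are valid. The formulas below are standard textbook statements, given here only as background; they are not taken from the paper, and which variant the authors intend is exactly what the comment asks them to state.

```latex
\begin{align*}
\text{(BF)}  &\quad \forall x\, \Box \varphi(x) \rightarrow \Box\, \forall x\, \varphi(x) \\
\text{(CBF)} &\quad \Box\, \forall x\, \varphi(x) \rightarrow \forall x\, \Box \varphi(x)
\end{align*}
```

Both are valid when every world shares a single domain. With varying domains, BF can fail when domains may grow along the accessibility relation, and CBF can fail when they may shrink.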

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments. We respond to each major comment below.

Referee: [QML approach section] The section presenting the QML approach: no explicit axiomatization, semantics, or theorem is supplied demonstrating that the logic blocks any derivation of deontic obligations from factual (alethic) propositions alone; without such a non-derivability result the central claim that the framework avoids the naturalistic fallacy remains unverified.

Authors: We agree the manuscript presents the QML approach at a conceptual level without an explicit axiomatization or non-derivability theorem. The proposal draws on the standard separation of alethic and deontic modalities in quantified modal logic to block derivation of obligations from facts. We will revise to include a sketch of the relevant semantics together with a short argument establishing the non-derivability result.

revision: yes

Referee: [QML approach section] The claim that the approach 'promises to simulate ethical reasoning in humans': the manuscript provides neither a formal definition of the integration mechanism nor any worked example showing how ethical principles and state-of-affairs propositions interact in the logic, leaving the simulation claim unsubstantiated and load-bearing for the paper's contribution.

Authors: The manuscript frames the QML construction as a promising direction rather than a complete formal system. We will add a concrete worked example in the revised manuscript that illustrates the interaction between ethical principles and factual propositions, thereby substantiating the simulation claim.

revision: yes

Circularity Check

0 steps flagged

No significant circularity; conceptual proposal is self-contained

full rationale

The paper advances a philosophical proposal that uses quantified modal logic to connect ethical principles with factual propositions about states of affairs while avoiding the naturalistic fallacy. The provided text contains no equations, fitted parameters, self-citations invoked as load-bearing uniqueness theorems, or renamings of prior results. The central claim is an outline of an approach rather than a derivation that reduces to its own inputs by construction, satisfying the criteria for a self-contained non-circular analysis.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no identifiable free parameters, axioms, or invented entities; the proposal rests on standard concepts from moral philosophy and modal logic whose details are not supplied.

pith-pipeline@v0.9.0 · 5689 in / 1060 out tokens · 25204 ms · 2026-05-24T22:55:15.491748+00:00 · methodology

Reference graph

Works this paper leans on

20 extracted references · 20 canonical work pages · 1 internal anchor

[1] A. Turing, Computing machinery and intelligence, Mind 59 (1950) 433–460.

[2] S. Russell, D. Dewey, M. Tegmark, Research priorities for robust and beneficial artificial intelligence, AI Magazine 36 (2015) 105–114.

[3] C. Allen, I. Smit, W. Wallach, Artificial morality: Top-down, bottom-up, and hybrid approaches, Ethics and Information Technology 7 (2005) 149–155.

[4] T. Arnold, D. Kasenberg, M. Scheutz, Value alignment or misalignment – What will keep systems accountable?, in: 3rd International Workshop on AI, Ethics, and Society, 2017.

[5] D. J. Singer, Mind the “is-ought” gap, The Journal of Philosophy 112 (2015) 193–210.

[6] M. H. Bazerman, A. E. Tenbrunsel, Blind Spots: Why We Fail To Do What’s Right and What To Do About It, Princeton University Press, 2011.

[7] M. J. Wolf, K. Miller, F. S. Grodzinsky, Why we should have seen that coming: Comments on Microsoft’s Tay “experiment,” and wider implications, SIGCAS Comput. Soc. 47 (2017) 54–64.

[8] C. Johnson, B. Kuipers, Socially-aware navigation using topological maps and social norm learning, in: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, AIES ’18, ACM, New York, NY, USA, 2018, pp. 151–157. URL: http://doi.acm.org/10.1145/3278721.3278772. doi:10.1145/3278721.3278772.

[9] R. Kim, M. Kleiman-Weiner, A. Abeliuk, E. Awad, S. Dsouza, J. B. Tenenbaum, I. Rahwan, A computational model of commonsense moral decision making, in: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, AIES ’18, ACM, New York, NY, USA, 2018, pp. 197–203. URL: http://doi.acm.org/10.1145/3278721.3278770. doi:10.1145/3278721.3278770.

[10] R. Noothigattu, S. Gaikwad, E. Awad, S. Dsouza, I. Rahwan, P. Ravikumar, A. D. Procaccia, A voting-based system for ethical decision making, arXiv preprint arXiv:1709.06692 (2017).

[11] S. L. Anderson, M. Anderson, GenEth: A general ethical dilemma analyzer, in: Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014, pp. 253–261.

[12] W. D. Ross, The Right and the Good, Oxford University Press, Oxford, 1930.

[13] S. L. Anderson, M. Anderson, A prima facie duty approach to machine ethics: Machine learning of features of ethical dilemmas, prima facie duties, and decision principles through a dialogue with ethicists, in: M. Anderson, S. L. Anderson (Eds.), Machine Ethics, Cambridge University Press, New York, 2011, pp. 476–492.

[14] J. Alexander, Experimental Philosophy, Polity Press, Cambridge, 2012.

[15] E. Schwitzgebel, J. Rust, The behavior of ethicists, in: A Companion to Experimental Philosophy, Wiley Blackwell, Malden, MA, 2016.

[16] J. N. Hooker, T. W. N. Kim, Toward non-intuition-based machine and artificial intelligence ethics: A deontological approach based on modal logic, in: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, AIES ’18, ACM, New York, NY, USA, 2018, pp. 130–136. URL: http://doi.acm.org/10.1145/3278721.3278753. doi:10.1145/3278721.3278753.

[17] J. Rawls, A Theory of Justice, Harvard University Press, 1971.

[18] Ö. Karsu, A. Morton, Inequity averse optimization in operational research, European Journal of Operational Research 245 (2015).

[19] J. N. Hooker, H. P. Williams, Combining equity and utilitarianism in a mathematical programming model, Management Science 58 (2012) 1682–1693.

[20] J. Hooker, Taking Ethics Seriously: Why Ethics Is an Essential Tool for the Modern Workplace, Taylor & Francis, 2018.
