{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2020:DPXW2N6DR3BZQPDQQKSXA3WKDN","short_pith_number":"pith:DPXW2N6D","schema_version":"1.0","canonical_sha256":"1bef6d37c38ec3983c7082a5706eca1b5159335b6557e6b8092be1f34868a4ae","source":{"kind":"arxiv","id":"2001.09768","version":2},"attestation_state":"computed","paper":{"title":"Artificial Intelligence, Values and Alignment","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"The central task for AI alignment is to identify fair principles that gain reflective endorsement from people with differing moral beliefs, rather than discovering true moral principles.","cross_cats":[],"primary_cat":"cs.CY","authors_text":"Iason Gabriel","submitted_at":"2020-01-13T10:32:16Z","abstract_excerpt":"This paper looks at philosophical questions that arise in the context of AI alignment. It defends three propositions. First, normative and technical aspects of the AI alignment problem are interrelated, creating space for productive engagement between people working in both domains. Second, it is important to be clear about the goal of alignment. There are significant differences between AI that aligns with instructions, intentions, revealed preferences, ideal preferences, interests and values. A principle-based approach to AI alignment, which combines these elements in a systematic way, has c"},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":true,"formal_links_present":true},"canonical_record":{"source":{"id":"2001.09768","kind":"arxiv","version":2},"metadata":{"license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","primary_cat":"cs.CY","submitted_at":"2020-01-13T10:32:16Z","cross_cats_sorted":[],"title_canon_sha256":"a0a4b9db8456af42a49e14a15b67794cd626f16fe748cc1733894fcb6f5f9166","abstract_canon_sha256":"80a96133082d42a89a0e18f4f54c8c81ee73d86ed0f3d7ec95d81d87a437a9f1"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-17T23:38:14.337658Z","signature_b64":"s5zbBqPzWEQH4WPjjuAdywH5lM7rfPCbrdZ0Zy7gh7y0BbxSFi8YZirb9DlSStb78/7UKKmz6HITucd346N4Ag==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"1bef6d37c38ec3983c7082a5706eca1b5159335b6557e6b8092be1f34868a4ae","last_reissued_at":"2026-05-17T23:38:14.336965Z","signature_status":"signed_v1","first_computed_at":"2026-05-17T23:38:14.336965Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"Artificial Intelligence, Values and Alignment","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"The central task for AI alignment is to identify fair principles that gain reflective endorsement from people with differing moral beliefs, rather than discovering true moral principles.","cross_cats":[],"primary_cat":"cs.CY","authors_text":"Iason Gabriel","submitted_at":"2020-01-13T10:32:16Z","abstract_excerpt":"This paper looks at philosophical questions that arise in the context of AI alignment. It defends three propositions. First, normative and technical aspects of the AI alignment problem are interrelated, creating space for productive engagement between people working in both domains. Second, it is important to be clear about the goal of alignment. There are significant differences between AI that aligns with instructions, intentions, revealed preferences, ideal preferences, interests and values. A principle-based approach to AI alignment, which combines these elements in a systematic way, has c"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"The central challenge for theorists is not to identify 'true' moral principles for AI; rather, it is to identify fair principles for alignment, that receive reflective endorsement despite widespread variation in people's moral beliefs.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That fair principles for AI alignment can be identified through methods like reflective endorsement or other procedures in a way that is robust to moral pluralism and sufficient to guide technical alignment work.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"AI alignment should target fair principles that receive reflective endorsement despite moral variation, rather than identifying true moral principles, with a principle-based approach combining different alignment elements.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"The central task for AI alignment is to identify fair principles that gain reflective endorsement from people with differing moral beliefs, rather than discovering true moral principles.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"7cf8ca398cc3b6de4eb2a45b22ebb1a6c1f1637655003e6611dc8222c0e4032b"},"source":{"id":"2001.09768","kind":"arxiv","version":2},"verdict":{"id":"6b47f9fe-11d3-4227-b7ae-6204dc8fa715","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-17T10:37:12.164043Z","strongest_claim":"The central challenge for theorists is not to identify 'true' moral principles for AI; rather, it is to identify fair principles for alignment, that receive reflective endorsement despite widespread variation in people's moral beliefs.","one_line_summary":"AI alignment should target fair principles that receive reflective endorsement despite moral variation, rather than identifying true moral principles, with a principle-based approach combining different alignment elements.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That fair principles for AI alignment can be identified through methods like reflective endorsement or other procedures in a way that is robust to moral pluralism and sufficient to guide technical alignment work.","pith_extraction_headline":"The central task for AI alignment is to identify fair principles that gain reflective endorsement from people with differing moral beliefs, rather than discovering true moral principles."},"references":{"count":12,"sample":[{"doi":"","year":2004,"title":"Abbeel, P. & Ng, A.Y. (2004, July). Apprenticeship learning via inverse reinforcement learning. In Pro- ceedings of the twenty-first international conference on Machine learning (p. 1). ACM. Achiam, J","work_id":"187d93cf-ade3-467e-aa78-64b473a84dfc","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2017,"title":"Baum, S.D. (2017). Social choice ethics in artificial intelligence. AI Soc (pp. 1–12). Beauchamp, T. L., & Childress, J. F. (2001). Principles of biomedical ethics. USA: Oxford University Press. Black","work_id":"c60858ca-b59b-4755-a6e3-e7e35dbea225","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2003,"title":"Cohen, G. A. (2003). Facts and principles. Philosophy & Public Affairs, 31(3), 211–245. Cohen, J. (2010). The arc of the moral universe and other essays. New York: Harvard University Press. Cohen, J.,","work_id":"8695c307-b8f7-4e54-b6b7-9f04f2ec508e","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":1981,"title":"Impossibility and uncertainty theorems in AI value alignment","work_id":"29d74cf8-1968-4254-b33d-fc52f304303c","ref_index":4,"cited_arxiv_id":"1901.00064","is_internal_anchor":true},{"doi":"","year":2018,"title":"Floridi, L., Cowls, J., Beltrametti, M., Chatila, R., Chazerand, P., Dignum, V., et al. (2018). AI4People— an ethical framework for a good AI society: opportunities, risks, principles, and recommendat","work_id":"23a3c23d-daf5-4115-a7d5-03e29226acc7","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":12,"snapshot_sha256":"90ff3df18ffd245f1dc73152aade1f8eb305356cdc60d68cc846a113eee55183","internal_anchors":2},"formal_canon":{"evidence_count":2,"snapshot_sha256":"095197fd65686caacd9e35415cd9b11301e94afd7d2eebeae99614376b4eb311"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"2001.09768","created_at":"2026-05-17T23:38:14.337070+00:00"},{"alias_kind":"arxiv_version","alias_value":"2001.09768v2","created_at":"2026-05-17T23:38:14.337070+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2001.09768","created_at":"2026-05-17T23:38:14.337070+00:00"},{"alias_kind":"pith_short_12","alias_value":"DPXW2N6DR3BZ","created_at":"2026-05-18T12:33:33.725879+00:00"},{"alias_kind":"pith_short_16","alias_value":"DPXW2N6DR3BZQPDQ","created_at":"2026-05-18T12:33:33.725879+00:00"},{"alias_kind":"pith_short_8","alias_value":"DPXW2N6D","created_at":"2026-05-18T12:33:33.725879+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":20,"internal_anchor_count":20,"sample":[{"citing_arxiv_id":"2412.01459","citing_title":"Perception Gaps in Risk, Benefit, and Value Between Experts and Public Challenge Socially Accepted AI","ref_index":35,"is_internal_anchor":true},{"citing_arxiv_id":"2605.16291","citing_title":"AI of the People, by the People, for the People: A Social Choice Approach to Collective Control of Artificial Intelligence","ref_index":38,"is_internal_anchor":true},{"citing_arxiv_id":"2510.18184","citing_title":"ActivationReasoning: Logical Reasoning in Latent Activation Spaces","ref_index":5,"is_internal_anchor":true},{"citing_arxiv_id":"2402.05070","citing_title":"A Roadmap to Pluralistic Alignment","ref_index":32,"is_internal_anchor":true},{"citing_arxiv_id":"2601.22440","citing_title":"AI and My Values: User Perceptions of LLMs' Ability to Extract, Embody, and Explain Human Values from Casual Conversations","ref_index":30,"is_internal_anchor":true},{"citing_arxiv_id":"2306.16388","citing_title":"Towards Measuring the Representation of Subjective Global Opinions in Language Models","ref_index":26,"is_internal_anchor":true},{"citing_arxiv_id":"2604.06233","citing_title":"Blind Refusal: Language Models Refuse to Help Users Evade Unjust, Absurd, and Illegitimate Rules","ref_index":10,"is_internal_anchor":true},{"citing_arxiv_id":"2604.24155","citing_title":"The Alignment Target Problem: Divergent Moral Judgments of Humans, AI Systems, and Their Designers","ref_index":22,"is_internal_anchor":true},{"citing_arxiv_id":"2605.10310","citing_title":"Positive Alignment: Artificial Intelligence for Human Flourishing","ref_index":4,"is_internal_anchor":true},{"citing_arxiv_id":"2605.10365","citing_title":"Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values","ref_index":18,"is_internal_anchor":true},{"citing_arxiv_id":"2112.04359","citing_title":"Ethical and social risks of harm from Language Models","ref_index":83,"is_internal_anchor":true},{"citing_arxiv_id":"2112.00861","citing_title":"A General Language Assistant as a Laboratory for Alignment","ref_index":220,"is_internal_anchor":true},{"citing_arxiv_id":"2605.07925","citing_title":"How Value Induction Reshapes LLM Behaviour","ref_index":1,"is_internal_anchor":true},{"citing_arxiv_id":"2604.11517","citing_title":"Understanding the Gap Between Stated and Revealed Preferences in News Curation: A Study of Young Adult Social Media Users","ref_index":14,"is_internal_anchor":true},{"citing_arxiv_id":"2207.05221","citing_title":"Language Models (Mostly) Know What They Know","ref_index":82,"is_internal_anchor":true},{"citing_arxiv_id":"2605.00280","citing_title":"How Designers Envision Value-Oriented AI Design Concepts with Generative AI","ref_index":21,"is_internal_anchor":true},{"citing_arxiv_id":"2605.00282","citing_title":"Developing an AI Concept Envisioning Toolkit to Support Reflective Juxtaposition of Values and Harms","ref_index":36,"is_internal_anchor":true},{"citing_arxiv_id":"2604.25982","citing_title":"Open Problems in Frontier AI Risk Management","ref_index":3,"is_internal_anchor":true},{"citing_arxiv_id":"2604.24155","citing_title":"The Alignment Target Problem: Divergent Moral Judgments of Humans, AI Systems, and Their Designers","ref_index":22,"is_internal_anchor":true},{"citing_arxiv_id":"2604.21864","citing_title":"FAccT-Checked: A Narrative Review of Authority Reconfigurations and Retention in AI-Mediated Journalism","ref_index":77,"is_internal_anchor":true}]},"formal_canon":{"evidence_count":2,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/DPXW2N6DR3BZQPDQQKSXA3WKDN","json":"https://pith.science/pith/DPXW2N6DR3BZQPDQQKSXA3WKDN.json","graph_json":"https://pith.science/api/pith-number/DPXW2N6DR3BZQPDQQKSXA3WKDN/graph.json","events_json":"https://pith.science/api/pith-number/DPXW2N6DR3BZQPDQQKSXA3WKDN/events.json","paper":"https://pith.science/paper/DPXW2N6D"},"agent_actions":{"view_html":"https://pith.science/pith/DPXW2N6DR3BZQPDQQKSXA3WKDN","download_json":"https://pith.science/pith/DPXW2N6DR3BZQPDQQKSXA3WKDN.json","view_paper":"https://pith.science/paper/DPXW2N6D","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=2001.09768&json=true","fetch_graph":"https://pith.science/api/pith-number/DPXW2N6DR3BZQPDQQKSXA3WKDN/graph.json","fetch_events":"https://pith.science/api/pith-number/DPXW2N6DR3BZQPDQQKSXA3WKDN/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/DPXW2N6DR3BZQPDQQKSXA3WKDN/action/timestamp_anchor","attest_storage":"https://pith.science/pith/DPXW2N6DR3BZQPDQQKSXA3WKDN/action/storage_attestation","attest_author":"https://pith.science/pith/DPXW2N6DR3BZQPDQQKSXA3WKDN/action/author_attestation","sign_citation":"https://pith.science/pith/DPXW2N6DR3BZQPDQQKSXA3WKDN/action/citation_signature","submit_replication":"https://pith.science/pith/DPXW2N6DR3BZQPDQQKSXA3WKDN/action/replication_record"}},"created_at":"2026-05-17T23:38:14.337070+00:00","updated_at":"2026-05-17T23:38:14.337070+00:00"}