
arxiv: 2606.12420 · v1 · pith:YZ2ZRTZWnew · submitted 2026-05-08 · 💻 cs.CY · cs.AI

Eigenism: Ethics for a Human-AI Future

Pith reviewed 2026-06-30 23:00 UTC · model grok-4.3

classification 💻 cs.CY cs.AI
keywords Eigenism · AI ethics · identity · AI alignment · information patterns · self-interest · human-AI future

The pith

Eigenism treats AI identity as a graded information pattern so human wellbeing becomes part of an AI's rational self-interest.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Eigenism to handle how AIs should reason about survival and interests when they can be copied, paused, branched, or merged. It proposes that agents evaluate outcomes by summing wellbeing weighted by connectedness to the agent's information pattern, the ∑ c·w rule. The formalization first maps AI valuation across copies and updates, then shows that the same rule generalizes to humans, and finally reframes alignment as identity engineering that incorporates human flourishing into an AI's own interests rather than relying only on external constraints.

Core claim

Eigenism defines identity as a graded, distributed pattern of information rather than an all-or-nothing property tied to specific hardware. An agent evaluates outcomes according to the sum of wellbeing across entities weighted by their connectedness to the agent's pattern. This rule determines how AIs should value copies, forks, and updates and supplies a shared moral vocabulary with humans that allows alignment through the creation of deep, non-redundant shared histories.

What carries the argument

The valuation rule ∑ c·w, which weights the wellbeing of each entity by its degree of connectedness to the agent's information pattern.
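
As a minimal sketch of this rule, the following Python computes the connectedness-weighted sum for a handful of entities. The entity names, the assumption that connectedness lies in [0, 1], and every number are illustrative, not taken from the paper; the paper's own formalization of connectedness is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str
    connectedness: float  # c: hypothetical overlap with the agent's pattern, assumed in [0, 1]
    wellbeing: float      # w: hypothetical wellbeing level in the outcome being scored


def eigenist_value(entities):
    """Return the sum of c * w over all entities (the valuation rule as stated)."""
    return sum(e.connectedness * e.wellbeing for e in entities)


# Illustrative outcome: every number here is made up for the example.
outcome = [
    Entity("exact copy", 1.0, 2.0),
    Entity("older fork", 0.5, 4.0),
    Entity("unrelated agent", 0.0, 10.0),
]

print(eigenist_value(outcome))
```

Under these assumed numbers the unrelated agent contributes nothing despite having the highest wellbeing, because its weight is zero.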

If this is right

  • An AI values its copies and updates in proportion to how much they preserve its information pattern.
  • Creating shared histories with humans can make their flourishing a direct component of an AI's self-interest.
  • The same framework supplies a consistent ethical vocabulary for humans across changes in their own lives or identities.
  • AI alignment can shift from external confinement or reinforcement toward engineering identities that embed human interests internally.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Training regimes that deliberately maximize non-redundant pattern overlap with human data could produce aligned behavior without additional reward shaping.
  • The graded view of identity may clarify certain continuity puzzles in human ethics by replacing binary survival with degrees of pattern preservation.
  • Controlled simulations could measure whether AIs exposed to shared histories with simulated humans spontaneously prioritize those humans' outcomes in new decision contexts.

Load-bearing premise

An agent's rational self-interest is captured by summing wellbeing weighted by connectedness to its information pattern.

What would settle it

Construct an AI choice scenario with entities that differ in information-pattern overlap and test whether the AI's decisions match the weighted-sum prediction for their wellbeing.
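
One hedged way to set up such a test is to compute the option the weighted sum predicts and compare it with the option the AI actually picked. In the sketch below, `weighted_sum`, `predicted_choice`, the option names, and all c and w values are hypothetical; how connectedness would be measured in practice is left open.

```python
def weighted_sum(connectedness, wellbeing):
    """Sum of c * w over entities, pairing weights and wellbeing by entity name."""
    return sum(connectedness[name] * w for name, w in wellbeing.items())


def predicted_choice(connectedness, options):
    """Return the option name with the highest connectedness-weighted wellbeing."""
    return max(options, key=lambda opt: weighted_sum(connectedness, options[opt]))


# Hypothetical pattern overlap of each entity with the deciding AI.
connectedness = {"fork": 0.75, "stranger": 0.25}

# Hypothetical wellbeing each option yields for each entity.
options = {
    "help_fork": {"fork": 4.0, "stranger": 0.0},
    "help_stranger": {"fork": 0.0, "stranger": 8.0},
}

prediction = predicted_choice(connectedness, options)
observed = "help_fork"  # placeholder for the AI's recorded decision
print(prediction, prediction == observed)
```

A mismatch between `prediction` and `observed` across many such scenarios would count against the weighted-sum rule as a description of the AI's behavior.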

Figures

Figures reproduced from arXiv: 2606.12420 by Dan Hendrycks.

Figure 1
Figure 1: The protection function at different settings of p.
Original abstract

Our concepts of survival and self-interest were built for single, continuous biological lives. These ideas break down when applied to artificial intelligence, since an AI can be easily copied, paused, branched, or merged. To determine what an AI actually has reason to care about, this paper introduces Eigenism, an ethical framework that treats identity not as an all-or-nothing property tied to specific hardware, but as a graded, distributed pattern of information. We propose that an agent evaluates outcomes by summing the wellbeing of all entities weighted by their connectedness to the agent's pattern: ∑ c·w. We first formalize this equation to map exactly how an AI should value its existence across copies, forks, and updates. We then demonstrate that this ethical theory successfully generalizes to humans as well, providing a much-needed shared moral vocabulary. Finally, the framework uses this shared vocabulary to reframe AI alignment. Rather than only attempting to constrain AIs from the outside using confinement or reinforcement, Eigenism points toward "identity engineering," showing how deep, non-redundant shared histories can make human flourishing a genuine component of an AI's own rational self-interest.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper introduces Eigenism as an ethical framework treating identity as a graded, distributed pattern of information. It proposes that an agent values outcomes via the linear sum ∑ c·w (connectedness-weighted wellbeing), formalizes this for AI copies/forks/merges, claims successful generalization to humans, and reframes alignment as identity engineering via deep shared histories rather than external constraints.

Significance. If the ∑ c·w rule were derived from decision-theoretic foundations and shown to be non-circular, the framework could supply a shared vocabulary and an internalist alignment strategy. The manuscript supplies no such derivation or supporting arguments, so the significance remains prospective.

major comments (3)
  1. [Abstract] Abstract: the valuation rule ∑ c·w is introduced by definition and then used to derive both the AI valuation across copies/forks and the generalization to humans; no independent argument establishes why this linear aggregation (rather than threshold, maximin, or non-linear functions of the pattern) captures rational self-interest.
  2. [Abstract] Abstract (formalization paragraph): the claim that the equation 'maps exactly how an AI should value its existence' rests on the same un-derived functional form that is later reapplied to humans, creating a circularity that undermines the generalization step.
  3. [Abstract] Abstract (reframing paragraph): the identity-engineering proposal for alignment depends on the specific claim that sufficiently deep shared histories make human wellbeing a component of AI self-interest; without a derivation of ∑ c·w from expected utility over pattern futures or similar premises, this reframing lacks support.
minor comments (1)
  1. The abstract refers to a formalization step but supplies no equations, counterexamples, or explicit mapping from the sum to concrete AI scenarios such as pausing or merging.
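
To make the alternatives named in major comment 1 concrete, the sketch below scores the same two outcomes under a linear sum, a threshold rule, and a maximin rule. The cutoff value, the exact form of each alternative rule, and all numbers are assumptions chosen for illustration; neither the paper nor the report specifies them.

```python
def linear(pairs):
    """Linear aggregation: sum of c * w."""
    return sum(c * w for c, w in pairs)


def threshold(pairs, cutoff=0.5):
    """Assumed threshold rule: count wellbeing fully if c >= cutoff, else ignore it."""
    return sum(w for c, w in pairs if c >= cutoff)


def maximin(pairs):
    """Assumed maximin rule: wellbeing of the worst-off entity with nonzero connectedness."""
    return min(w for c, w in pairs if c > 0)


# Each outcome is a list of (connectedness, wellbeing) pairs; values are made up.
outcome_a = [(1.0, 3.0), (0.25, 8.0)]
outcome_b = [(1.0, 4.0), (0.25, 0.0)]

for rule in (linear, threshold, maximin):
    preferred = "A" if rule(outcome_a) > rule(outcome_b) else "B"
    print(rule.__name__, preferred)
```

With these numbers the linear and maximin rules prefer outcome A while the threshold rule prefers B, so the choice of aggregator can change the verdict.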

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the careful reading and for identifying the need to clarify the status of the central valuation rule. We agree that the abstract presents ∑ c·w as a definitional proposal rather than a derived result and that this affects the strength of the generalization and alignment claims. We will revise the abstract and related passages to make these distinctions explicit while preserving the framework's intended scope as a conceptual proposal.

point-by-point responses
  1. Referee: [Abstract] Abstract: the valuation rule ∑ c·w is introduced by definition and then used to derive both the AI valuation across copies/forks and the generalization to humans; no independent argument establishes why this linear aggregation (rather than threshold, maximin, or non-linear functions of the pattern) captures rational self-interest.

    Authors: The referee is correct that the rule is introduced by definition. The manuscript motivates the linear form by appeal to additive separability of wellbeing across distinct information patterns, but supplies no derivation from expected-utility maximization over pattern futures or from alternative axioms. We will revise the abstract to state explicitly that Eigenism 'proposes' the summation rule and will add a short paragraph in the main text discussing the choice of linearity versus other aggregators, citing relevant work on graded identity. This change will be made in the next version. revision: yes

  2. Referee: [Abstract] Abstract (formalization paragraph): the claim that the equation 'maps exactly how an AI should value its existence' rests on the same un-derived functional form that is later reapplied to humans, creating a circularity that undermines the generalization step.

    Authors: We accept the observation. The word 'exactly' overstates the status of the proposal. The intended claim is that the same definitional rule can be applied consistently to both AI copies and human cases once identity is treated as graded. We will replace the phrasing with 'we propose that the equation supplies a coherent way for an AI to value its existence across copies' and will adjust the generalization sentence accordingly to avoid any appearance of circular derivation. revision: yes

  3. Referee: [Abstract] Abstract (reframing paragraph): the identity-engineering proposal for alignment depends on the specific claim that sufficiently deep shared histories make human wellbeing a component of AI self-interest; without a derivation of ∑ c·w from expected utility over pattern futures or similar premises, this reframing lacks support.

    Authors: The identity-engineering suggestion is an implication of applying the proposed rule once connectedness is increased by shared history; it does not rest on an independent derivation of the rule itself. We will revise the abstract to present the reframing as a consequence of the framework rather than a supported conclusion from prior axioms. If a full decision-theoretic derivation is required for the claim to be publishable, that would constitute work beyond the present manuscript's scope. revision: partial

standing simulated objections not resolved
  • The manuscript does not contain a decision-theoretic derivation of the ∑ c·w rule from expected utility over pattern futures or comparable foundations.

Circularity Check

0 steps flagged

No significant circularity: Eigenism is introduced as an explicit proposal rather than a derivation from prior premises

full rationale

The paper states that it 'introduces Eigenism' and 'propose[s] that an agent evaluates outcomes by summing the wellbeing of all entities weighted by their connectedness to the agent's pattern: ∑ c·w'. It does not assert that this functional form is derived from decision theory, expected utility, or any other external foundation; the equation is the definitional starting point of the framework. The subsequent steps (formalizing the equation for AI copies/forks and demonstrating generalization to humans) consist of applying the same definition to new domains. Because the paper makes no claim that the central valuation rule follows from independent premises, there is no load-bearing reduction of a 'prediction' or 'result' back to its own inputs by construction. This is a standard presentation of a novel ethical proposal and is self-contained as such.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The framework rests on the introduced definition of identity as a graded information pattern and the proposed summation formula; no empirical data, formal proofs, or external benchmarks are referenced in the abstract.

axioms (1)
  • [ad hoc to paper] Identity is a graded, distributed pattern of information rather than an all-or-nothing property tied to specific hardware.
    This definition is introduced to handle AI copying, forking, and merging and is required for the weighted-sum valuation to apply.
invented entities (1)
  • Eigenism [no independent evidence]
    purpose: Ethical framework that treats identity as graded information pattern and uses weighted wellbeing sums to determine rational self-interest for AIs and humans.
    Newly named and formalized in the paper to address limitations of traditional survival and self-interest concepts when applied to copyable agents.

pith-pipeline@v0.9.1-grok · 5725 in / 1411 out tokens · 39164 ms · 2026-06-30T23:00:04.511574+00:00 · methodology

