Why Are We Moral? An LLM-based Agent Simulation Approach to Study Moral Evolution
Pith reviewed 2026-05-18 14:49 UTC · model grok-4.3
The pith
LLM simulations of hunter-gatherer agents show that cooperative moral types survive because mutual help sustains groups, while selfish types are selected against.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In simulated prehistoric societies, agents whose moral reasoning favors cooperation and mutual assistance achieve higher long-term survival rates than selfish agents. Universal and reciprocal moral dispositions produce the most stable population outcomes, while selfish dispositions are consistently selected against. The simulations further show that the cognitive cost of making moral judgments can shift which moral type wins in different environments, and that selfish agents exhibit a self-purging dynamic in which they reduce one another's presence through their own decisions.
What carries the argument
LLM-based agent simulation in which each agent maintains a moral type that governs its perception, memory, reasoning, and action choices inside a resource-limited hunter-gatherer environment.
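The perceive, remember, reason, act cycle described above can be sketched as a small loop. This is a minimal illustration, not the paper's implementation: the names `Agent`, `perceive`, `decide`, `act`, and `step`, the action labels, and every HP value are assumptions, and `decide` is a hand-written rule standing in for the LLM query.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    moral_type: str  # e.g. "universal", "reciprocal", "selfish" (labels from the summary)
    hp: int = 10     # hypothetical starting health
    memory: list = field(default_factory=list)


def perceive(agent, world):
    """Gather the agent's own state, the shared plant pool, and the weakest other agent."""
    others = [a for a in world["agents"] if a is not agent]
    return {
        "own_hp": agent.hp,
        "plants": world["plants"],
        "weakest": min(others, key=lambda a: a.hp, default=None),
    }


def decide(agent, obs):
    """Stand-in for the LLM query: a fixed rule keyed on moral type (assumption)."""
    weakest = obs["weakest"]
    if (agent.moral_type != "selfish" and weakest is not None
            and weakest.hp < 5 and obs["own_hp"] > 5):
        return ("allocate", weakest)
    if obs["plants"] > 0:
        return ("collect", None)
    return ("idle", None)


def act(agent, action, world):
    """Apply the chosen action to the world; the HP amounts are placeholders."""
    kind, target = action
    if kind == "allocate":
        agent.hp -= 2
        target.hp += 2
    elif kind == "collect":
        world["plants"] -= 1
        agent.hp += 3


def step(agent, world):
    """One cycle: perceive, decide, act, then record the action in memory."""
    obs = perceive(agent, world)
    action = decide(agent, obs)
    act(agent, action, world)
    agent.memory.append((action[0], action[1].name if action[1] else None))
    return action[0]


helper = Agent("a", "universal", hp=10)
taker = Agent("b", "selfish", hp=10)
needy = Agent("c", "reciprocal", hp=3)
world = {"plants": 1, "agents": [helper, taker, needy]}
print(step(helper, world), step(taker, world))
```

Keeping `decide` as the only function that depends on moral type makes it the single place where an LLM call, or any other decision rule, would be swapped in.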
If this is right
- Cooperation through mutual help, rather than pure self-interest, becomes the dominant pattern under repeated interaction.
- Universal and reciprocal moral types remain stable across changes in how visible an agent's type is to others.
- Adding a cost to the act of judging another's morality can change which moral type ultimately dominates.
- Selfish agents tend to reduce their own numbers through internal conflict even without external enforcement.
Where Pith is reading between the lines
- The same framework could be used to test whether increasing communication bandwidth between agents accelerates the spread of reciprocal norms.
- If the cognitive-cost effect holds, real-world groups that lower the effort required to track reputations should see faster convergence on cooperative morals.
- The approach supplies a way to vary one cognitive feature at a time while holding resource and interaction structure fixed, something traditional mathematical models of evolution typically cannot do.
Load-bearing premise
Current large language models can stand in for the actual cognitive mechanisms that governed decision-making in prehistoric human groups.
What would settle it
Re-running the same population dynamics with agents whose decision rules are replaced by fixed payoff tables that ignore reasoning steps and finding that selfish types no longer decline.
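One way to picture this control condition is a population update driven only by a fixed payoff table, with no reasoning step. The sketch below uses a discrete replicator-style update as a stand-in for the paper's population dynamics. The table `PAYOFF`, the `rate` parameter, and the function names are all hypothetical, and whatever happens under these placeholder values says nothing about the paper's result.

```python
# Hypothetical payoff table: PAYOFF[row][col] is the payoff to a `row` agent
# that meets a `col` agent. The values are placeholders, not from the paper.
PAYOFF = {
    "reciprocal": {"reciprocal": 2.0, "selfish": -1.0},
    "selfish": {"reciprocal": 3.0, "selfish": -2.0},
}


def expected_payoff(kind, shares):
    """Average payoff of `kind` against a population with the given type shares."""
    return sum(PAYOFF[kind][other] * share for other, share in shares.items())


def replicator_step(shares, rate=0.1):
    """One discrete replicator-style update: above-average types gain share."""
    fitness = {k: expected_payoff(k, shares) for k in shares}
    mean_fit = sum(fitness[k] * shares[k] for k in shares)
    raw = {k: shares[k] * (1 + rate * (fitness[k] - mean_fit)) for k in shares}
    total = sum(raw.values())
    return {k: v / total for k, v in raw.items()}


def run(shares, steps=200):
    """Iterate the update and return the full trajectory of type shares."""
    history = [shares]
    for _ in range(steps):
        shares = replicator_step(shares)
        history.append(shares)
    return history


history = run({"reciprocal": 0.8, "selfish": 0.2}, steps=50)
print(history[-1])
```

Comparing the selfish share over time in such a table-driven run against the LLM-agent run is the kind of contrast the test above calls for.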
Original abstract
The evolution of morality presents a puzzle: natural selection should favor self-interest, yet humans developed moral systems promoting altruism. Traditional approaches must abstract away cognitive processes, leaving open how cognitive factors shape moral evolution. We introduce an LLM-based agent simulation framework that brings cognitive realism to this question: agents with varying moral dispositions perceive, remember, reason, and decide in a simulated prehistoric hunter-gatherer society. This enables us to manipulate factors that traditional models cannot represent -- such as moral type observability and communication bandwidth -- and to discover emergent cognitive mechanisms from agent interactions. Across 20 runs spanning four settings, we find that cooperation and mutual help are the central driver of evolutionary survival, with universal and reciprocal morality exhibiting the most stable outcomes across conditions while selfishness is strongly disfavoured. Beyond cooperation itself, we further identify cognition as a central mediator -- most clearly through a cost of moral judgment that shifts the winning moral type across settings, with a self-purging effect among selfish agents as an additional cognitive pattern. We validate robustness across multiple LLM backbones, architecture ablations, and prompt sensitivity analyses. This work establishes LLM-based simulation as a powerful new paradigm to complement traditional research in evolutionary biology and anthropology, opening new avenues for investigating the complexities of moral and social evolution.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces an LLM-based multi-agent simulation framework to study moral evolution in a simulated prehistoric hunter-gatherer society. Agents are assigned varying moral dispositions and use LLMs for perception, memory, reasoning, and decision-making. Across 20 runs in four settings, the authors report that cooperation and mutual help drive evolutionary survival, with universal and reciprocal morality showing the most stable outcomes while selfishness is strongly disfavoured; cognitive mediators include a moral judgment cost that shifts winning types and a self-purging effect among selfish agents. Robustness is checked via multiple LLM backbones, architecture ablations, and prompt sensitivity analyses.
Significance. If the core mapping from LLM outputs to ancestral cognitive mechanisms holds, the work provides a novel paradigm that incorporates cognitive realism and permits direct manipulation of factors such as moral type observability and communication bandwidth, complementing traditional evolutionary models. The explicit robustness checks across LLM backbones and prompt variations are a clear strength that supports reproducibility of the simulation results.
major comments (2)
- The interpretation of simulation outcomes as evidence about moral evolution rests on the assumption that current LLM reasoning and decision processes approximate the information-processing constraints and fitness trade-offs that shaped cognition under ancestral selection pressures. The manuscript does not supply a dedicated discussion or empirical check of this mapping in the methods or discussion sections, leaving open whether observed stability patterns reflect emergent selection or training-data and prompt artifacts.
- Results section: the reported stability of universal and reciprocal morality across conditions is presented without quantitative effect sizes, confidence intervals, or error bars on the 20-run aggregates, and implementation details for agent memory capacity and the moral judgment cost are not fully specified, which limits assessment of how strongly these factors mediate the central claim.
minor comments (3)
- Clarify the precise operational definitions of the four experimental settings and the exact parameter values used for moral type observability and communication bandwidth.
- Add a table or appendix listing the prompt templates used to instantiate each moral type and the judgment-cost mechanism to facilitate replication.
- Ensure all result figures display run-to-run variability (e.g., standard deviation or 95% intervals) rather than point estimates alone.
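The reporting requested in the major and minor comments (effect sizes, intervals, run-to-run variability) can be sketched with the standard library alone. This is an illustrative outline, not the authors' analysis: the function names are assumptions, the sample data are made up, and the t critical value must match the number of runs actually aggregated (19 degrees of freedom for 20 runs, 4 for 5 runs).

```python
import math
import statistics

# Two-sided 95% Student-t critical values, keyed by degrees of freedom (n - 1).
T_CRIT_95 = {4: 2.776, 19: 2.093}


def summarize(values):
    """Mean, sample SD, and 95% confidence interval for one set of runs."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    half_width = T_CRIT_95[n - 1] * sd / math.sqrt(n)
    return {"mean": mean, "sd": sd, "ci95": (mean - half_width, mean + half_width)}


def cohens_d(a, b):
    """Cohen's d with a pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var)


# Made-up survival rates for two moral types across five runs (placeholders).
type_a = [0.60, 0.70, 0.65, 0.72, 0.68]
type_b = [0.20, 0.35, 0.25, 0.30, 0.28]
print(summarize(type_a))
print(cohens_d(type_a, type_b))
```

The `ci95` tuple maps directly onto error bars in a figure, and `cohens_d` gives the standardized difference between two moral types.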
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive comments on our manuscript. We address each of the major comments below and indicate the revisions we will make to strengthen the paper.
- Referee: The interpretation of simulation outcomes as evidence about moral evolution rests on the assumption that current LLM reasoning and decision processes approximate the information-processing constraints and fitness trade-offs that shaped cognition under ancestral selection pressures. The manuscript does not supply a dedicated discussion or empirical check of this mapping in the methods or discussion sections, leaving open whether observed stability patterns reflect emergent selection or training-data and prompt artifacts.
  Authors: We agree that explicitly addressing the mapping between LLM-based decision processes and ancestral cognitive mechanisms is important for interpreting the results as evidence for moral evolution. Although the manuscript discusses the advantages of LLM agents in providing cognitive realism, we will add a dedicated paragraph in the Discussion section to elaborate on this assumption, including potential limitations such as training data artifacts and prompt engineering influences. We will also discuss how our robustness analyses across different LLM backbones and prompt variations provide some support against artifactual explanations. An empirical check of the mapping is inherently limited in a simulation study, but we will clarify the theoretical grounding and cite relevant literature on using LLMs to model cognitive processes.
  Revision: yes
- Referee: Results section: the reported stability of universal and reciprocal morality across conditions is presented without quantitative effect sizes, confidence intervals, or error bars on the 20-run aggregates, and implementation details for agent memory capacity and the moral judgment cost are not fully specified, which limits assessment of how strongly these factors mediate the central claim.
  Authors: We appreciate this observation and will revise the Results section to include quantitative details. Specifically, we will report effect sizes (e.g., Cohen's d or mean survival rate differences) along with 95% confidence intervals for the key comparisons across moral types. Error bars representing standard error or standard deviation across the 20 runs will be added to the relevant figures. Furthermore, we will expand the Methods section with precise implementation details: agent memory capacity will be specified as the maximum number of past events stored (e.g., last 10 interactions with token limits), and the moral judgment cost will be detailed as an additional computational step in the reasoning prompt that incurs a fixed penalty in decision utility or processing steps. These changes will better quantify the mediation effects.
  Revision: yes
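The two implementation details promised in the rebuttal, a capped event memory and a fixed judgment penalty, could look roughly like the sketch below. It is only one reading of the rebuttal text: the class `BoundedMemory`, the function `utility`, and the penalty value are hypothetical, and the token limits mentioned by the authors are not modeled.

```python
from collections import deque


class BoundedMemory:
    """Keeps only the most recent `capacity` events; older ones are dropped."""

    def __init__(self, capacity=10):  # 10 follows the "last 10 interactions" example
        self.events = deque(maxlen=capacity)

    def remember(self, event):
        self.events.append(event)

    def recall(self):
        return list(self.events)


JUDGMENT_COST = 0.5  # hypothetical fixed penalty, not a value from the paper


def utility(base_utility, judged_partner, cost=JUDGMENT_COST):
    """Decision utility after subtracting the penalty if a moral judgment was made."""
    return base_utility - cost if judged_partner else base_utility


memory = BoundedMemory(capacity=3)
for t in range(5):
    memory.remember({"step": t, "event": "shared food"})
print([e["step"] for e in memory.recall()])
print(utility(2.0, judged_partner=True), utility(2.0, judged_partner=False))
```

Exposing `capacity` and `cost` as explicit parameters is what would let a reader vary one cognitive factor at a time, as the referee's second comment asks.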
Circularity Check
No significant circularity in simulation-based derivation
The paper defines moral types and agent rules upfront, then runs LLM-mediated simulations across multiple settings to observe emergent stability patterns such as the advantage of cooperation and of universal and reciprocal morality. These outcomes are generated by agent interactions rather than being mathematically equivalent to the input definitions or fitted parameters by construction. No equations, self-citation chains, uniqueness theorems, or ansatzes are invoked that would reduce the central claims to the authors' own prior inputs. The work is self-contained as an experimental simulation study whose results can be checked against the stated agent rules and LLM backbones.
Axiom & Free-Parameter Ledger
free parameters (2)
- moral type observability
- communication bandwidth
axioms (1)
- domain assumption: LLM agents can model human-like perception, memory, reasoning, and decision-making in moral contexts
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "We introduce an LLM-based agent simulation framework that brings cognitive realism to this question: agents with varying moral dispositions perceive, remember, reason, and decide in a simulated prehistoric hunter-gatherer society."
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, theorem reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "Across 20 runs spanning four settings, we find that cooperation and mutual help are the central driver of evolutionary survival, with universal and reciprocal morality exhibiting the most stable outcomes"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.