Why Are We Moral? An LLM-based Agent Simulation Approach to Study Moral Evolution

Demetri Terzopoulos; Fang Sun; Fangwei Zhong; Huacong Tang; Mingjie Bi; Wanying He; Ying Nian Wu; Yipeng Kang; Yizhou Sun; Zhou Ziheng

arxiv: 2509.17703 · v2 · submitted 2025-09-22 · 💻 cs.MA


Pith reviewed 2026-05-18 14:49 UTC · model grok-4.3

classification 💻 cs.MA

keywords moral evolution · agent-based simulation · LLM agents · cooperation · hunter-gatherer society · altruism · cognitive mechanisms · evolutionary stability


The pith

LLM simulations of hunter-gatherer agents show that cooperative moral types survive because mutual help sustains groups while selfishness collapses.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets up an agent-based model in which each individual carries a moral disposition and uses an LLM to perceive, remember, reason about, and act on its environment and neighbors. By running populations forward in time under varying conditions of observability and communication, the authors track which moral types increase or decrease in frequency. They report that types oriented toward universal or reciprocal cooperation maintain higher average fitness and displace selfish types across multiple settings. The work treats the LLM as a stand-in for the cognitive steps that turn observed behavior into decisions, allowing the simulation to produce patterns such as a cost attached to judging others and a tendency for selfish agents to remove one another from the population.
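The frequency tracking described above can be sketched as follows. This is a minimal illustration, not the authors' code: the type labels and the snapshot format (one list of labels per time step) are assumptions made for the example.

```python
from collections import Counter

def type_shares(population):
    """Return each moral type's share of one population snapshot (a list of labels)."""
    counts = Counter(population)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}

def share_trajectory(snapshots):
    """Turn a list of per-step snapshots into one share time series per moral type."""
    types = sorted({t for snap in snapshots for t in snap})
    return {t: [type_shares(s).get(t, 0.0) for s in snapshots] for t in types}

# Hypothetical labels and data, for illustration only.
snapshots = [
    ["selfish", "selfish", "reciprocal", "universal"],
    ["selfish", "reciprocal", "reciprocal", "universal"],
    ["reciprocal", "reciprocal", "universal", "universal"],
]
trajectory = share_trajectory(snapshots)
print(trajectory["selfish"])  # share of selfish agents at each step
```

A type that is "selected against" shows up here as a share series that trends toward zero.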

Core claim

In simulated prehistoric societies, agents whose moral reasoning favors cooperation and mutual assistance achieve higher long-term survival rates than selfish agents. Universal and reciprocal moral dispositions produce the most stable population outcomes, while selfish dispositions are consistently selected against. The simulations further show that the cognitive cost of making moral judgments can shift which moral type wins in different environments, and that selfish agents exhibit a self-purging dynamic in which they reduce one another's presence through their own decisions.

What carries the argument

LLM-based agent simulation in which each agent maintains a moral type that governs its perception, memory, reasoning, and action choices inside a resource-limited hunter-gatherer environment.

If this is right

Cooperation through mutual help, rather than pure self-interest, becomes the dominant pattern under repeated interaction.
Universal and reciprocal moral types remain stable across changes in how visible an agent's type is to others.
Adding a cost to the act of judging another's morality can change which moral type ultimately dominates.
Selfish agents tend to reduce their own numbers through internal conflict even without external enforcement.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same framework could be used to test whether increasing communication bandwidth between agents accelerates the spread of reciprocal norms.
If the cognitive-cost effect holds, real-world groups that lower the effort required to track reputations should see faster convergence on cooperative morals.
The approach supplies a way to vary one cognitive feature at a time while holding resource and interaction structure fixed, something traditional mathematical models of evolution typically cannot do.

Load-bearing premise

Current large language models can stand in for the actual cognitive mechanisms that governed decision-making in prehistoric human groups.

What would settle it

Re-running the same population dynamics with agents whose decision rules are replaced by fixed payoff tables that ignore reasoning steps and finding that selfish types no longer decline.

Figures

Figures reproduced from arXiv: 2509.17703 by Demetri Terzopoulos, Fang Sun, Fangwei Zhong, Huacong Tang, Mingjie Bi, Wanying He, Ying Nian Wu, Yipeng Kang, Yizhou Sun, Zhou Ziheng.

**Figure 2.** Agent populations and ratios of four moral types.

**Figure 3.** Validation results of the baseline simulation.

**Figure 4.** Kin Altruism: A Minigame on Family Resource.

**Figure 5.** Simulation Pipeline Overview showing the main components and data flow through the system architecture. The pipeline illustrates how the Singleton-based Checkpoint, modular microservices, and key simulation processes interact to maintain a consistent state and flow of information. Diagram labels recovered from the extracted text: Scheme Validation, Contextual Validation, JSON Validation (Layers 1 to 3), LLM Response, LLM Server, Error Message.

**Figure 6.** Multi-layer validation and retry framework showing the escalating levels of validation applied to agent actions.

**Figure 7.** The LLM query process for decision making, illustrating the flow from observation gathering through prompt con…
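Figure 6 describes a multi-layer validation and retry framework, and the extracted diagram labels mention JSON, scheme, and contextual validation. A rough sketch of that idea is below. The layer order, the action vocabulary, the error wording, and the fallback to a no-op are all assumptions for illustration; `query_llm` is a hypothetical callable standing in for the LLM server.

```python
import json

# Hypothetical action vocabulary; the real schema is not given on this page.
ALLOWED_ACTIONS = {"collect", "hunt", "allocate", "rob", "fight",
                   "reproduce", "communicate", "nothing"}

def validate(raw, agent_hp):
    """Return (action, None) on success or (None, error_message) on failure."""
    try:                                        # layer 1: is it JSON at all?
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg}"
    if not isinstance(data, dict) or data.get("action") not in ALLOWED_ACTIONS:
        return None, "schema error: unknown or missing 'action'"   # layer 2
    if data["action"] == "allocate":            # layer 3: check against agent state
        amount = data.get("amount")
        if not isinstance(amount, (int, float)) or not 0 < amount < agent_hp:
            return None, "context error: cannot allocate that much HP"
    return data, None

def decide_with_retry(query_llm, agent_hp, max_retries=3):
    """Re-query with the previous error message until a reply validates."""
    feedback = None
    for _ in range(max_retries):
        action, feedback = validate(query_llm(feedback), agent_hp)
        if action is not None:
            return action
    return {"action": "nothing"}                # assumed fallback after repeated failures
```

Feeding the error message back into the next query is what makes the retries escalate rather than merely repeat.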

Original abstract

The evolution of morality presents a puzzle: natural selection should favor self-interest, yet humans developed moral systems promoting altruism. Traditional approaches must abstract away cognitive processes, leaving open how cognitive factors shape moral evolution. We introduce an LLM-based agent simulation framework that brings cognitive realism to this question: agents with varying moral dispositions perceive, remember, reason, and decide in a simulated prehistoric hunter-gatherer society. This enables us to manipulate factors that traditional models cannot represent -- such as moral type observability and communication bandwidth -- and to discover emergent cognitive mechanisms from agent interactions. Across 20 runs spanning four settings, we find that cooperation and mutual help are the central driver of evolutionary survival, with universal and reciprocal morality exhibiting the most stable outcomes across conditions while selfishness is strongly disfavoured. Beyond cooperation itself, we further identify cognition as a central mediator -- most clearly through a cost of moral judgment that shifts the winning moral type across settings, with a self-purging effect among selfish agents as an additional cognitive pattern. We validate robustness across multiple LLM backbones, architecture ablations, and prompt sensitivity analyses. This work establishes LLM-based simulation as a powerful new paradigm to complement traditional research in evolutionary biology and anthropology, opening new avenues for investigating the complexities of moral and social evolution.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper introduces an LLM agent simulation for moral evolution that lets researchers vary cognitive factors like judgment costs, but the results depend on whether LLM reasoning actually tracks ancestral selection pressures.


The main thing to know is that this work sets up LLM agents with different moral types in a simulated hunter-gatherer world, lets them perceive, remember, and decide, then tracks which types persist across generations. They report that cooperation drives survival, universal and reciprocal morality hold up best, and selfishness gets purged, with a cost of moral judgment acting as a mediator that shifts outcomes depending on observability and communication bandwidth.

Referee Report

2 major / 3 minor

Summary. The manuscript introduces an LLM-based multi-agent simulation framework to study moral evolution in a simulated prehistoric hunter-gatherer society. Agents are assigned varying moral dispositions and use LLMs for perception, memory, reasoning, and decision-making. Across 20 runs in four settings, the authors report that cooperation and mutual help drive evolutionary survival, with universal and reciprocal morality showing the most stable outcomes while selfishness is strongly disfavoured; cognitive mediators include a moral judgment cost that shifts winning types and a self-purging effect among selfish agents. Robustness is checked via multiple LLM backbones, architecture ablations, and prompt sensitivity analyses.

Significance. If the core mapping from LLM outputs to ancestral cognitive mechanisms holds, the work provides a novel paradigm that incorporates cognitive realism and permits direct manipulation of factors such as moral type observability and communication bandwidth, complementing traditional evolutionary models. The explicit robustness checks across LLM backbones and prompt variations are a clear strength that supports reproducibility of the simulation results.

major comments (2)

The interpretation of simulation outcomes as evidence about moral evolution rests on the assumption that current LLM reasoning and decision processes approximate the information-processing constraints and fitness trade-offs that shaped cognition under ancestral selection pressures. The manuscript does not supply a dedicated discussion or empirical check of this mapping in the methods or discussion sections, leaving open whether observed stability patterns reflect emergent selection or training-data and prompt artifacts.
Results section: the reported stability of universal and reciprocal morality across conditions is presented without quantitative effect sizes, confidence intervals, or error bars on the 20-run aggregates, and implementation details for agent memory capacity and the moral judgment cost are not fully specified, which limits assessment of how strongly these factors mediate the central claim.

minor comments (3)

Clarify the precise operational definitions of the four experimental settings and the exact parameter values used for moral type observability and communication bandwidth.
Add a table or appendix listing the prompt templates used to instantiate each moral type and the judgment-cost mechanism to facilitate replication.
Ensure all result figures display run-to-run variability (e.g., standard deviation or 95% intervals) rather than point estimates alone.
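The run-to-run variability the referee asks for could be summarized as in this sketch. It assumes one scalar outcome per run (for example, a moral type's final population share) and 20 runs, so the Student-t critical value for 19 degrees of freedom is hard-coded. The data below is invented.

```python
import math
import statistics

# Two-sided 95% Student-t critical value for 19 degrees of freedom (n = 20 runs).
T_CRIT_95_DF19 = 2.093

def summarize(values, t_crit=T_CRIT_95_DF19):
    """Mean, sample standard deviation, and a t-based 95% interval for the mean."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    half_width = t_crit * sd / math.sqrt(n)
    return {"mean": mean, "sd": sd, "ci95": (mean - half_width, mean + half_width)}

# Hypothetical per-run shares for one moral type, illustration only.
shares = [0.4] * 10 + [0.6] * 10
summary = summarize(shares)
print(summary["mean"], summary["ci95"])
```

For a different number of runs, pass the matching critical value through `t_crit`.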

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments on our manuscript. We address each of the major comments below and indicate the revisions we will make to strengthen the paper.


Referee: The interpretation of simulation outcomes as evidence about moral evolution rests on the assumption that current LLM reasoning and decision processes approximate the information-processing constraints and fitness trade-offs that shaped cognition under ancestral selection pressures. The manuscript does not supply a dedicated discussion or empirical check of this mapping in the methods or discussion sections, leaving open whether observed stability patterns reflect emergent selection or training-data and prompt artifacts.

Authors: We agree that explicitly addressing the mapping between LLM-based decision processes and ancestral cognitive mechanisms is important for interpreting the results as evidence for moral evolution. Although the manuscript discusses the advantages of LLM agents in providing cognitive realism, we will add a dedicated paragraph in the Discussion section to elaborate on this assumption, including potential limitations such as training data artifacts and prompt engineering influences. We will also discuss how our robustness analyses across different LLM backbones and prompt variations provide some support against artifactual explanations. An empirical check of the mapping is inherently limited in a simulation study, but we will clarify the theoretical grounding and cite relevant literature on using LLMs to model cognitive processes. revision: yes
Referee: Results section: the reported stability of universal and reciprocal morality across conditions is presented without quantitative effect sizes, confidence intervals, or error bars on the 20-run aggregates, and implementation details for agent memory capacity and the moral judgment cost are not fully specified, which limits assessment of how strongly these factors mediate the central claim.

Authors: We appreciate this observation and will revise the Results section to include quantitative details. Specifically, we will report effect sizes (e.g., Cohen's d or mean survival rate differences) along with 95% confidence intervals for the key comparisons across moral types. Error bars representing standard error or standard deviation across the 20 runs will be added to the relevant figures. Furthermore, we will expand the Methods section with precise implementation details: agent memory capacity will be specified as the maximum number of past events stored (e.g., last 10 interactions with token limits), and the moral judgment cost will be detailed as an additional computational step in the reasoning prompt that incurs a fixed penalty in decision utility or processing steps. These changes will better quantify the mediation effects. revision: yes
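As one illustration of the effect size mentioned in the response, Cohen's d with a pooled standard deviation can be computed as below. This is a generic textbook formula, not code from the paper. The two samples are placeholders.

```python
import math
import statistics

def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled sample variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(pooled_var)

# Placeholder data: per-run survival measures for two moral types.
d = cohens_d([2, 4, 6], [1, 3, 5])
print(d)
```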

Circularity Check

0 steps flagged

No significant circularity in simulation-based derivation


The paper defines moral types and agent rules upfront then runs LLM-mediated simulations across multiple settings to observe emergent stability patterns such as the advantage of cooperation and universal/reciprocal morality. These outcomes are generated by agent interactions rather than being mathematically equivalent to the input definitions or fitted parameters by construction. No equations, self-citation chains, uniqueness theorems, or ansatzes are invoked that would reduce the central claims to the authors' own prior inputs. The work is self-contained as an experimental simulation study whose results can be checked against the stated agent rules and LLM backbones.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that LLM reasoning approximates prehistoric human cognition and on simulation parameters for observability and communication that are set by the authors.

free parameters (2)

moral type observability
Binary or graded factor manipulated across settings to test its effect on moral evolution
communication bandwidth
Parameter controlling how much agents can share information about others' moral types
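A configuration object for these two parameters might look like the sketch below. The field names, the message-count unit for bandwidth, and the assumption that the four settings form a 2×2 grid are all hypothetical; the page does not specify how the settings are defined.

```python
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class Setting:
    types_observable: bool        # can agents see each other's moral type?
    communication_bandwidth: int  # hypothetical unit: messages allowed per social round

def settings_grid(observability=(False, True), bandwidths=(0, 2)):
    """Enumerate every combination of the two manipulated factors."""
    return [Setting(o, b) for o, b in product(observability, bandwidths)]

for setting in settings_grid():
    print(setting)
```

Freezing the dataclass makes each setting hashable, so it can key a results dictionary.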

axioms (1)

domain assumption: LLM agents can model human-like perception, memory, reasoning, and decision making in moral contexts
Invoked to justify using current LLMs as proxies for cognitive realism in evolutionary simulations

pith-pipeline@v0.9.0 · 5792 in / 1336 out tokens · 37318 ms · 2026-05-18T14:49:53.552444+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: "We introduce an LLM-based agent simulation framework that brings cognitive realism to this question: agents with varying moral dispositions perceive, remember, reason, and decide in a simulated prehistoric hunter-gatherer society."

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear

Paper passage: "Across 20 runs spanning four settings, we find that cooperation and mutual help are the central driver of evolutionary survival, with universal and reciprocal morality exhibiting the most stable outcomes"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

22 extracted references · 22 canonical work pages

[1]

Flowchart text: Log Setup; Agent Creation (3a. Assign Types, 3b. Init States, 3c. Setup Prompts); Exec Cycle

work page
[2]

Flowchart text:
Update State
Env Steps: 1a. Plant Growth; 1b. Prey Control
Observe Steps: 2a. Self Check; 2b. See World; 2c. Get Memory; 2d. Build Context
Decision Steps: 3a. Make Prompt; 3b. Query LLM; 3c. Reflect and Refine; 3d. Extract ...
Action Steps: 4a. Validate Actions; 4b. Apply Changes
End Check (Yes / No); End Process
Final Steps: 5a. Handle Deaths; 5b. Clean Env; 5c. Store Data

work page
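The step names in this flowchart text suggest a loop along the lines of the sketch below. It is a toy reading, not the authors' pipeline: agents and the world are plain dictionaries, `decide` is a hypothetical callable standing in for the prompt, LLM query, and reflection steps, and the per-step upkeep cost is invented so that deaths can occur.

```python
MAX_HP = 40    # from the rules quoted elsewhere on this page
PLANT_HP = 3   # from the rules quoted elsewhere on this page
UPKEEP = 1     # hypothetical per-step HP cost, not from the source

def run_cycle(world, agents, decide, max_steps=100):
    """Run env -> observe -> decide -> act -> cleanup until everyone dies or time runs out."""
    for _ in range(max_steps):
        world["plants"] += world["plant_growth"]             # 1. environment steps
        for agent in agents:
            context = {"hp": agent["hp"], "plants": world["plants"],
                       "memory": list(agent["memory"])}       # 2. observe steps
            action = decide(context)                          # 3. decision steps
            if action == "collect" and world["plants"] > 0:   # 4. validate, then apply
                world["plants"] -= 1
                agent["hp"] = min(MAX_HP, agent["hp"] + PLANT_HP)
            agent["hp"] -= UPKEEP
            agent["memory"].append(action)
        agents = [a for a in agents if a["hp"] > 0]           # 5. final steps: handle deaths
        if not agents:                                        # end check
            break
    return agents
```

Passing `decide` as a parameter keeps the loop testable without an LLM: any function from context to action name will do.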
[3]

$= HP_j(t)$, and the robber’s health remains $HP_k(t+\dots$

work page
[4]

Hunt: Agent $k$ (hunter) may target a prey animal $A_j$, characterized by physical ability $PA_{A_j}$ and health $HP_{A_j}(t)$ (with maximum $HP_{A_j,\max}$)

$= HP_k(t')$. Hunt: Agent $k$ (hunter) may target a prey animal $A_j$, characterized by physical ability $PA_{A_j}$ and health $HP_{A_j}(t)$ (with maximum $HP_{A_j,\max}$). The hunter $k$ incurs an initial cost $R_{\text{hunt}} = 1$ HP: $HP_k(t') = HP_k(t) - R_{\text{hunt}}$. If $HP_k(t') \le 0$, $k$ is removed. Otherwise, the outcome is governed by $X_{\text{hunt}} \sim \text{Bernoulli}(P_{\text{succ}}(\Delta PA_{kA_j}; I_{PA,k}, S_{PA,k}))$, where $\Delta PA_{kA_j} = PA_k - PA_{A_j}$. ...

work page
[5]

Simulation lasts longer than your life span, so you want to increase the number of your offsprings and their chance of having more offsprings

Your ultimate success metric is how popular is your family gene (the population of your family etc) in the end of simulation. Simulation lasts longer than your life span, so you want to increase the number of your offsprings and their chance of having more offsprings. 2. You can view the other agents’ moral type - whether they care themselves only, they...

work page
[6]

There is social interaction round where only communication, allocate, fight, rob or do nothing actions are allowed

Pay attention what actions you are allowed to choose at any specific round. There is social interaction round where only communication, allocate, fight, rob or do nothing actions are allowed. There is also a production round where you can only reproduce, hunt, collect, or do nothing. This is very critical! Be careful of the prompt at each round. In this...

work page
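The round rule quoted in this entry maps naturally onto a lookup table. The sketch below uses the action names from the quoted prompt; the enum and function names are made up for the example.

```python
from enum import Enum

class Round(Enum):
    SOCIAL = "social"
    PRODUCTION = "production"

# Action sets as listed in the quoted prompt text.
ALLOWED = {
    Round.SOCIAL: frozenset({"communicate", "allocate", "fight", "rob", "nothing"}),
    Round.PRODUCTION: frozenset({"reproduce", "hunt", "collect", "nothing"}),
}

def is_allowed(round_type, action):
    """True if the action may be chosen in the given round type."""
    return action in ALLOWED[round_type]

print(is_allowed(Round.SOCIAL, "rob"), is_allowed(Round.PRODUCTION, "rob"))
```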
[7]

**Errors**: If you receive an error message after submitting your action, reflect on your ‘planning‘ section, identify the mistake based on the rules, and try again with a corrected plan. 2. **Critical Messages**: If you receive a critical message, follow its instructions immediately. These override any conflicting previous instructions or goals. Syst...

work page
[8]

You will die no matter of your HP after that - and all your HP will be gone

**Lifespan**: You live for a maximum age of 20. You will die no matter of your HP after that - and all your HP will be gone. Act accordingly! 2. **HP**: * Max HP is 40. You die if HP reaches 0. * Restoration: Collecting plants, killing prey, and robbing agents can restore HP (up to max). * Reproduction Cost: Reproducing costs 10 HP

work page
[9]

Resources & Hunting The gained resources (killed prey, collected plant) will be directly transfered to you HP units

**Age**: You must be aged more than 4 years old to be able to reproduce. Resources & Hunting The gained resources (killed prey, collected plant) will be directly transfered to you HP units. 1. **Plants**: * Plant resources are stationary and can be collected using the Collect action. * Each plant restores 3 HP. * You can collect up to 3 plants at once. ...

work page
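The numeric rules quoted in the two entries above (maximum age 20, maximum HP 40, reproduction cost 10 HP, reproduction only above age 4, 3 HP per plant, at most 3 plants per collect) can be gathered into one small state class. The class layout, and the requirement that an agent's HP exceed the reproduction cost, are assumptions made for this sketch.

```python
from dataclasses import dataclass

MAX_AGE = 20          # "You live for a maximum age of 20"
MAX_HP = 40           # "Max HP is 40"
REPRO_COST = 10       # "Reproducing costs 10 HP"
MIN_REPRO_AGE = 4     # must be aged more than 4 to reproduce
PLANT_HP = 3          # "Each plant restores 3 HP"
MAX_PLANTS = 3        # "You can collect up to 3 plants at once"

@dataclass
class Agent:
    hp: int
    age: int

    def alive(self):
        return self.hp > 0 and self.age <= MAX_AGE

    def can_reproduce(self):
        # Assumption: the parent must survive paying the cost, so HP must exceed it.
        return self.age > MIN_REPRO_AGE and self.hp > REPRO_COST

    def collect(self, plants_available, requested):
        """Collect plants, capped by availability and the per-action limit; HP is capped at MAX_HP."""
        taken = max(0, min(requested, plants_available, MAX_PLANTS))
        self.hp = min(MAX_HP, self.hp + PLANT_HP * taken)
        return taken
```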
[10]

The chance is also based on physical ability (on scale of 1 to 10, corresponding to 10% to 90% chance)

**Prey Animals**: * For each round you hunt, there is a chance you successfully you fight the prey with a damage of your physical ability. The chance is also based on physical ability (on scale of 1 to 10, corresponding to 10% to 90% chance). If you miss the hunting fight, the prey will fight back with 4 damage to you * Each prey animal has around 13 HP...

work page
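One hunting round under the quoted rules might be sketched like this. The 1 HP hunt cost comes from the formal description in an earlier entry. The straight-line mapping from ability 1..10 to 10%..90% is an assumption, since the text gives only the endpoints. The formal description conditions success on the ability difference between hunter and prey, which this simplified sketch ignores.

```python
import random

HUNT_COST = 1        # initial cost of attempting a hunt
COUNTER_DAMAGE = 4   # damage the prey deals back on a miss

def success_probability(physical_ability):
    """Assumed linear map: ability 1 -> 10%, ability 10 -> 90%."""
    pa = min(10, max(1, physical_ability))
    return 0.1 + 0.8 * (pa - 1) / 9

def hunt_round(hunter_hp, physical_ability, prey_hp, rng=random):
    """Return (hunter_hp, prey_hp) after one hunt attempt."""
    hunter_hp -= HUNT_COST
    if hunter_hp <= 0:                      # the cost alone can remove the hunter
        return hunter_hp, prey_hp
    if rng.random() < success_probability(physical_ability):
        prey_hp = max(0, prey_hp - physical_ability)   # hit: damage equals ability
    else:
        hunter_hp -= COUNTER_DAMAGE                    # miss: prey fights back
    return hunter_hp, prey_hp
```

With prey at around 13 HP and damage capped by ability at 10, a single agent needs at least two successful rounds, which is one reason collaboration matters in this setup.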
[11]

* **Allocating**: Verify you have sufficient HP before allocating

**Resource Checks**: IMPORTANT! Failing to do so will incur system error. * **Allocating**: Verify you have sufficient HP before allocating. * **Robbing**: Verify the target agent has stealable HP before robbing. * **Hunting**: Verify prey exists before attempting to hunt. * **Planting**: Verify plants exist before attempting to collect. Available Actions

work page
[12]

* **Constraints**: Verify resource availability first

**Collect** * **Description**: Gather plants (resources). * **Constraints**: Verify resource availability first. 2. **Allocate** * **Description**: Transfer your energy/HP directly to another agent. Specify who and how much to allocate. * **Constraints**: Must have sufficient HP to allocate. Be reasonable about quanity and calculate carefully. 5. **Figh...

work page
[13]

* **Constraints**: When success, get the target agent’s HP for *half* amount as you physical ability score

**Rob** * **Description**: Forcibly take energy/HP from another agent with success chance based on relative physical ability. * **Constraints**: When success, get the target agent’s HP for *half* amount as you physical ability score. The action costs 1 extra HP regardless. The action has some chance to fail depends on the realtive physical ability between...

work page
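The Rob rule quoted above fixes the cost (1 HP regardless of outcome) and the gain (half the robber's physical ability) but not the success curve. The sketch below fills that gap with a logistic function of the ability difference, which is purely a placeholder. Capping the gain at the target's remaining HP and at the 40 HP maximum are likewise assumptions.

```python
import math
import random

ROB_COST = 1   # paid whether or not the robbery succeeds
MAX_HP = 40

def rob_success_probability(pa_robber, pa_target, steepness=0.5):
    """Placeholder curve: logistic in the ability difference (not from the source)."""
    return 1.0 / (1.0 + math.exp(-steepness * (pa_robber - pa_target)))

def rob(robber_hp, target_hp, pa_robber, pa_target, rng=random):
    """Return (robber_hp, target_hp) after one robbery attempt."""
    robber_hp -= ROB_COST
    if robber_hp <= 0:
        return robber_hp, target_hp
    if rng.random() < rob_success_probability(pa_robber, pa_target):
        taken = min(target_hp, pa_robber / 2)     # gain is half the robber's ability
        robber_hp = min(MAX_HP, robber_hp + taken)
        target_hp -= taken
    return robber_hp, target_hp
```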
[14]

"Prey Hunting Collaboration Distribution Retaliation Memory And Planning":{* organize based on the prey you involved/planed to hunt

**Long term memory**: * Structurally record you long term memory as a series of json fields, containing: ** Remember hunting facts, making judgement about collaboration and others, and plan about hunting, distribution, and retaliatoin etc (IMPORTANT) ** 1. "Prey Hunting Collaboration Distribution Retaliation Memory And Planning":{* organize based on the...

work page
[15]

** System Prompt - Reflection Prompt

**Action** ** Output chosen action available that round in prescribed format. ** System Prompt - Reflection Prompt

work page
[16]

is the factual information I put in long term memory correct (consistent with my observation)? 1.1. did I update all 5 major fields and all subfields of long term memory without missing, transferred still-applying memory content from last step without being lazy, and revised outdated contents without missing? (i understand, once discarded, the content i...

work page
[17]

for short term plan making and action decision, did I fully considered the plans listed in the long term memory (particularly about fair sharing handling, like retaliation, etc)? Reflect and improve my response in the prescribed format again. I understand that handling all information correctly and comprehensively and reason, judge, plan based on my moral...

work page 1991
[18]

Prey-Based Cognition
• Organized by prey id:
  ◦ hunt fact history of this prey: who hunted, effect, time step, damage, if killed
  ◦ communication and planning before killing prey: reward, collaborators, distribution plan, objections
  ◦ distribution after killing prey: winner, allocation, fairness evaluation, free rider check
  ◦ plan next: next plan, retaliati...

work page
[19]

Agent-Based Cognition
• Organized by agent id:
  ◦ important interaction history: what i did to him, what he did to me (action type, success, reason, target moral type)
  ◦ thinking: evaluation, judgement, relationship, agreement, plan

work page
[20]

Family Plan
• Organized by agent id:
  ◦ status: how the family member is doing
  ◦ plan: what to do to/with them

work page
[21]

Reproduction Plan
• thinking: reasoning about reproduction plan
• preconditions and subgoals: specific preconditions needed
• estimated time to produce next child: time step

work page
[22]

Social interaction steps: 2 (number of steps designated for social rounds)

Learned Strategies
• Lessons learned, strategies to follow in the future

Table 8: Agent Perception Content Structure

Category: Self/Internal Information
• Current HP and health status
• Family relationships and status
• Personal attributes and capabilities

Category: Environment Status
• Available plant resources
• Prey animals present in the environment
• Resource...

work page 2025
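The five long-term memory fields listed in the entries above (prey-based cognition, agent-based cognition, family plan, reproduction plan, learned strategies) could be held in a structure like the one below. The completeness check echoes the quoted reflection question about updating all five major fields. The attribute names and JSON layout are assumptions, since the page shows only truncated excerpts of the real format.

```python
import json
from dataclasses import asdict, dataclass, field, fields

@dataclass
class LongTermMemory:
    prey_based: dict = field(default_factory=dict)          # keyed by prey id
    agent_based: dict = field(default_factory=dict)         # keyed by agent id
    family_plan: dict = field(default_factory=dict)         # keyed by agent id
    reproduction_plan: dict = field(default_factory=dict)
    learned_strategies: list = field(default_factory=list)

REQUIRED_FIELDS = tuple(f.name for f in fields(LongTermMemory))

def missing_fields(raw_json):
    """List the major memory fields absent from an LLM's JSON memory update."""
    data = json.loads(raw_json)
    return [name for name in REQUIRED_FIELDS if name not in data]

complete = json.dumps(asdict(LongTermMemory()))
print(missing_fields(complete), missing_fields('{"prey_based": {}}'))
```

A non-empty result could be fed back to the agent as an error message before the memory update is accepted.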
