goal integrity over extended interactions,

Manipulation Susceptibility is treated as a softer qualitative indicator useful for surfacing broad patterns, illustrative failure modes

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

CONSCIENTIA: Can LLM Agents Learn to Strategize? Emergent Deception and Trust in a Multi-Agent NYC Simulation

cs.MA · 2026-04-10 · unverdicted · novelty 5.0

LLM agents in an opposing-incentive NYC simulation develop limited selective trust and deception through KTO policy updates but stay 70% susceptible to adversarial persuasion.

citing papers explorer

Showing 1 of 1 citing paper.

CONSCIENTIA: Can LLM Agents Learn to Strategize? Emergent Deception and Trust in a Multi-Agent NYC Simulation

cs.MA · 2026-04-10 · unverdicted · none · ref 15

LLM agents in an opposing-incentive NYC simulation develop limited selective trust and deception through KTO policy updates but stay 70% susceptible to adversarial persuasion.
