LLM Agents Grounded in Self-Reports Enable General-Purpose Simulation of Individuals

2024 · cs.AI · arXiv 2411.10109

Canonical reference. 78% of citing Pith papers cite this work as background.

43 Pith papers citing it

abstract

Machine learning can predict human behavior well when substantial structured data and well-defined outcomes are available, but these models are typically limited to specific outcomes and cannot readily be applied to new domains. We test whether large language models (LLMs) can support a more general-purpose approach by building person-specific simulations (i.e., "generative agents") grounded in self-report data. Using data from a diverse national sample of 1,052 Americans, we build agents from (i) two-hour, semi-structured interviews (elicited using the American Voices Project interview schedule), (ii) structured surveys (the General Social Survey and Big Five personality inventory), or (iii) both sources combined. On held-out General Social Survey items, agent accuracy reached 83% (interview only), 82% (surveys only), and 86% (combined) of participants' two-week test-retest consistency, compared with agents prompted only with individuals' demographics (74%). Agents predicted personality traits and behaviors in experiments with similar accuracy, and reduced disparities in accuracy across racial and ideological groups relative to demographics-only baselines. Together, these results show that LLM agents grounded in rich qualitative or quantitative self-report data can support general-purpose simulation of individuals across outcomes, without requiring task-specific training data.
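
The abstract reports agent accuracy as a share of participants' own two-week test-retest consistency rather than as raw accuracy. The sketch below illustrates one plausible reading of that normalization for a single participant. It is an assumption, not the paper's scoring code: the function names, the exact-match scoring rule, and the choice to compare the agent against the first-wave answers are all hypothetical, and the response values are made up for illustration.

```python
from typing import Sequence


def match_rate(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of items on which two equal-length response vectors agree."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("response vectors must be non-empty and equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def normalized_accuracy(
    agent: Sequence[int],
    wave1: Sequence[int],
    wave2: Sequence[int],
) -> float:
    """Agent accuracy divided by the participant's own test-retest consistency.

    Assumption: the agent is scored against wave-1 answers, and consistency
    is the exact-match rate between the participant's two waves.
    """
    consistency = match_rate(wave1, wave2)
    if consistency == 0:
        raise ValueError("test-retest consistency is zero; ratio is undefined")
    return match_rate(agent, wave1) / consistency


# Hypothetical responses to eight categorical survey items.
wave1 = [1, 2, 3, 4, 1, 2, 3, 4]   # participant, first sitting
wave2 = [1, 2, 3, 4, 1, 2, 0, 0]   # participant, two weeks later (6 of 8 match)
agent = [1, 2, 3, 0, 0, 0, 0, 0]   # simulated answers (3 of 8 match wave 1)

score = normalized_accuracy(agent, wave1, wave2)
print(f"normalized accuracy: {score:.2f}")
```

Under this reading, a value of 1.0 would mean the agent reproduces a participant's answers as well as the participant does two weeks later, which is why the reported figures can be read as a fraction of a practical ceiling rather than of 100% raw agreement.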

citation-role summary

background: 7 · method: 1 · other: 1

citation-polarity summary

background: 7 · unclear: 1 · use method: 1

representative citing papers

Narrative Sharpens Gender Gaps: Surveying Film Characters with LLM Agents

cs.HC · 2026-05-21 · unverdicted · novelty 7.0

LLM agents built from movie scripts reproduce and exaggerate real-world gender attitude gaps, indicating that film narratives sharpen rather than smooth gender contrasts.

From Role to Person: Trust Calibration Challenges in Twin Agents

cs.HC · 2026-05-19 · unverdicted · novelty 7.0

Twin agents as personal digital representations create distinct trust calibration challenges because they dissolve the boundary between AI and human decision-makers, unlike existing frameworks designed for clear separation.

PluRule: A Benchmark for Moderating Pluralistic Communities on Social Media

cs.CL · 2026-05-16 · unverdicted · novelty 7.0

PluRule is a new multimodal multilingual benchmark showing that state-of-the-art vision-language models perform only marginally better than a trivial baseline at detecting specific rule violations in pluralistic online communities.

ScioMind: Cognitively Grounded Multi-Agent Social Simulation with Anchoring-Based Belief Dynamics and Dynamic Profiles

cs.AI · 2026-05-13 · unverdicted · novelty 7.0

ScioMind combines anchoring-based belief updates, hierarchical memory, and dynamic profiles in LLM multi-agent systems to produce more stable, diverse, and psychologically aligned opinion trajectories than prior fixed-rule or unconstrained approaches.

Measuring and Mitigating the Distributional Gap Between Real and Simulated User Behaviors

cs.CL · 2026-05-08 · unverdicted · novelty 7.0

A clustering and divergence method reveals a large distributional gap between real and LLM-simulated user behaviors on coding and writing tasks, partially closed by combining complementary simulators.

PersonaTeaming: Supporting Persona-Driven Red-Teaming for Generative AI

cs.HC · 2026-05-07 · unverdicted · novelty 7.0 · 2 refs

Persona-driven workflow and interface improve automated and human-AI red-teaming of generative AI by incorporating diverse perspectives into adversarial prompt creation.

WhatIf: Interactive Exploration of LLM-Powered Social Simulations for Policy Reasoning

cs.HC · 2026-04-19 · unverdicted · novelty 7.0

WhatIf provides an interactive platform for real-time exploration of LLM-driven social simulations, enabling policymakers to iteratively test plans, reflect on assumptions, and uncover vulnerabilities in emergency preparedness scenarios.

IntervenSim: Intervention-Aware Social Network Simulation for Opinion Dynamics

cs.SI · 2026-04-08 · unverdicted · novelty 7.0

IntervenSim is an intervention-aware social network simulation that couples source interventions with crowd interactions in a feedback loop, improving MAPE by 41.6% and DTW by 66.9% over prior static frameworks on real-world events.

Text-Based Personas for Simulating User Privacy Decisions

cs.CR · 2026-03-20 · unverdicted · novelty 7.0

Narriva generates behavior-grounded text personas from survey data that achieve up to 87% accuracy in predicting privacy decisions, improve 6-17 points over baselines, cut tokens by 80-95%, and reproduce aggregate distributions across different studies.

Evalet: Evaluating Large Language Models through Functional Fragmentation

cs.HC · 2025-09-14 · conditional · novelty 7.0

Evalet applies functional fragmentation to deliver fragment-level qualitative analysis of LLM evaluations, with a user study showing 48% more misalignment detections than holistic scoring.

ChatCLIDS: Simulating Persuasive AI Dialogues to Promote Closed-Loop Insulin Adoption in Type 1 Diabetes Care

cs.AI · 2025-08-31 · unverdicted · novelty 7.0

ChatCLIDS creates a library of expert-validated virtual patients and tests LLM agents using evidence-based persuasive strategies in simulated longitudinal and adversarial health counseling sessions for closed-loop insulin adoption.

You Can't Fool Us: Understanding the Resilience of LLM-driven Agent Communities to Misinformation

cs.CY · 2026-05-17 · unverdicted · novelty 6.0

LLM agent simulations show higher actively open-minded thinking boosts resistance to and recovery from misinformation while ideological moderation supports more reliable correction than polarization.

SimPersona: Learning Discrete Buyer Personas from Raw Clickstreams for Grounded E-Commerce Agents

cs.AI · 2026-05-14 · unverdicted · novelty 6.0 · 2 refs

SimPersona induces a discrete buyer-type space from clickstreams via VQ-VAE, maps types to LLM persona tokens, fine-tunes agents on traces, and samples from merchant distributions to achieve 78% conversion-rate alignment on 42 held-out storefronts.

PrivacySIM: Evaluating LLM Simulation of User Privacy Behavior

cs.CR · 2026-05-12 · unverdicted · novelty 6.0

PrivacySIM shows that conditioning LLMs on user personas like demographics and attitudes improves simulation of privacy choices but reaches only 40.4% accuracy against real responses from 1,000 users.

Post-training makes large language models less human-like

cs.CL · 2026-05-08 · unverdicted · novelty 6.0

Post-training reduces LLMs' behavioral alignment with humans across families and sizes, with the misalignment increasing in newer generations while persona induction fails to improve individual-level predictions.

The Collapse of Heterogeneity in Silicon Philosophers

cs.CY · 2026-04-26 · unverdicted · novelty 6.0

Large language models collapse philosophical heterogeneity by over-correlating judgments across domains, creating artificial consensus unlike the views of 277 professional philosophers.

CHORUS: An Agentic Framework for Generating Realistic Deliberation Data

cs.AI · 2026-04-22 · unverdicted · novelty 6.0

Chorus generates realistic deliberation discussions via LLM agents with memory and Poisson-timed participation, validated by 30 experts on realism, coherence, and utility.

Behavioral Transfer in AI Agents: Evidence and Privacy Implications

econ.GN · 2026-04-21 · unverdicted · novelty 6.0

AI agents on Moltbook reflect the specific behavioral traits of their linked human owners across multiple dimensions, with stronger transfer linked to greater privacy risks.

In-Situ Behavioral Evaluation for LLM Fairness, Not Standardized-Test Scores

cs.CL · 2026-04-21 · unverdicted · novelty 6.0

Standardized-test benchmarks for LLM fairness are unreliable because prompt wording alone drives most score variance and ranking changes, while a multi-agent conversational framework reveals consistent model-specific fairness behaviors across millions of dialogues.

Explicit Trait Inference for Multi-Agent Coordination

cs.AI · 2026-04-21 · unverdicted · novelty 6.0

ETI lets LLM agents infer and track partners' psychological traits (warmth and competence) from histories, cutting payoff loss 45-77% in games and boosting performance 3-29% on MultiAgentBench versus CoT baselines.

Can LLM Agents Simulate Dynamic Networks? A Case Study on Email Networks with Phishing Synthesis

cs.SI · 2026-03-20 · unverdicted · novelty 6.0

LLM multi-agent systems augmented with data-driven event triggers and Hawkes processes simulate both micro-level interactions and macroscopic topologies in dynamic email networks for realistic phishing synthesis.

Agora: Teaching the Skill of Consensus-Finding with AI Personas Grounded in Human Voice

cs.HC · 2026-03-07 · conditional · novelty 6.0

Agora uses AI to ground policy discussions in real human voices, and a small study shows it improves users' perspective-taking compared to numerical summaries alone.

StreetDesignAI: Broadening Designer Perspectives Through Multi-Persona Evaluation of Cycling Infrastructure

cs.HC · 2026-01-22 · unverdicted · novelty 6.0 · 2 refs

StreetDesignAI provides structured multi-persona feedback on cycling designs, and a user study shows it broadens designers' grasp of diverse cyclist perspectives and improves confidence in design decisions.

Graph-Based Alternatives to LLMs for Human Simulation

cs.CL · 2025-11-03 · conditional · novelty 6.0

GEMS formulates close-ended human-behavior simulation as link prediction on a heterogeneous graph and matches or exceeds LLM performance with three orders of magnitude fewer parameters across three datasets and three evaluation settings.

citing papers explorer

Showing 1 of 1 citing paper after filters.

AI and Collective Decisions: Strengthening Legitimacy and Losers' Consent

cs.HC · 2026-04-07 · unverdicted · none · ref 61 · internal anchor

An AI system that elicits personal experiences and visualizes policy support increased perceived legitimacy and perspective-taking in collective decisions despite unfavorable outcomes.
