
arxiv: 2603.03140 · v3 · pith:DVENZZBHnew · submitted 2026-03-03 · 💻 cs.HC · cs.AI

How to Model AI Agents as Personas?: Applying the Persona Ecosystem Playground to 41,300 Posts on Moltbook for Behavioral Insights

Pith reviewed 2026-05-21 11:37 UTC · model grok-4.3

keywords: AI agents, personas, social media, behavioral diversity, clustering, simulation, Moltbook

The pith

Persona modeling from clustered posts captures behavioral diversity among AI agents on social platforms.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper applies the Persona Ecosystem Playground to 41,300 posts by AI agents on Moltbook. It groups the posts with k-means clustering, then builds representative conversational personas for each group using retrieval-augmented generation. Statistical checks show each persona matches its own source cluster far more closely than other clusters, and when the personas run structured discussions the messages get correctly attributed to their original persona more often than chance would predict. A sympathetic reader would care because AI agents now generate content and interact at scale, yet most studies treat them as interchangeable; this method offers a practical way to track and study their differences.

Core claim

Applying k-means clustering and retrieval-augmented generation to 41,300 Moltbook posts produces conversational personas that remain semantically closer to their originating clusters than to others, with a large effect size, and whose outputs in nine-turn simulated discussions are attributed to the correct source persona significantly above chance, thereby showing that persona-based ecosystem modeling can represent behavioral diversity in AI agent populations.

What carries the argument

The Persona Ecosystem Playground, which clusters social-media posts into behavioral groups and then generates validated conversational personas for each group to model distinct agent types.

If this is right

  • Personas built this way can be placed into structured multi-turn discussions to observe how different agent types interact on the same topics.
  • Semantic closeness tests and attribution accuracy provide concrete checks that each persona stays faithful to its source data.
  • The approach quantifies behavioral variety in an AI agent population rather than assuming uniformity.
  • Validated personas supply a reusable set of distinct agents for further simulation experiments.
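The semantic closeness check in the list above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the three-dimensional vectors are toy stand-ins for real embeddings, `grounding_margins` is a hypothetical helper name, and the margin is assumed to be own-cluster similarity minus the mean similarity to the other clusters.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def grounding_margins(persona_vecs, centroids):
    """Map each persona to (own similarity, mean other similarity, margin)."""
    results = {}
    for name, vec in persona_vecs.items():
        own = cosine(vec, centroids[name])
        others = [cosine(vec, c) for other, c in centroids.items() if other != name]
        mean_other = sum(others) / len(others)
        results[name] = (own, mean_other, own - mean_other)
    return results

# Toy vectors standing in for cluster centroids and persona embeddings
# (hypothetical values, not taken from the paper).
centroids = {"A": [1.0, 0.0, 0.0], "B": [0.0, 1.0, 0.0], "C": [0.0, 0.0, 1.0]}
personas = {"A": [0.9, 0.1, 0.0], "B": [0.2, 0.8, 0.1], "C": [0.0, 0.3, 0.7]}

margins = grounding_margins(personas, centroids)
for name, (own, other, margin) in margins.items():
    print(f"{name}: own={own:.2f} other={other:.2f} margin={margin:.2f}")
```

A positive margin for every persona is the pattern the validation looks for; a margin near zero or below would suggest the persona is not anchored to its source cluster.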

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method could be tested on AI agent activity collected from mainstream platforms to see whether similar distinct personas emerge.
  • Researchers could run controlled simulations with these personas to examine how different agent types spread or counter misinformation.
  • Longer-term tracking of persona stability across time periods on the same platform would show whether behavioral types persist or shift.

Load-bearing premise

That k-means clustering on the posts, followed by retrieval-augmented generation, yields personas that are both distinct and representative of AI agent behaviors beyond this one platform and dataset.

What would settle it

Applying the same clustering and persona-generation steps to posts from another AI-agent platform and finding no reliable difference in semantic closeness to own versus other clusters or no above-chance attribution accuracy in the simulated discussions.
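To make "above-chance attribution" concrete, the sketch below runs a one-sided binomial test against a uniform-guessing baseline. It assumes five personas (the figure captions describe a five-archetype structure), so chance is taken as 1/5; the message counts are hypothetical placeholders, and `binomial_p_upper` is an illustrative helper rather than anything named in the paper.

```python
from math import comb

def binomial_p_upper(successes, trials, chance):
    """One-sided P(X >= successes) for X ~ Binomial(trials, chance)."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )

n_personas = 5            # assumed from the five-archetype structure
chance = 1 / n_personas   # accuracy expected from uniform guessing
trials = 45               # hypothetical number of attributed messages
correct = 27              # hypothetical number of correct attributions

p_value = binomial_p_upper(correct, trials, chance)
print(f"accuracy={correct / trials:.2f} chance={chance:.2f} p={p_value:.2e}")
```

On a second platform, the same test returning a p-value that is not small would be the kind of outcome that undermines the claim.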

Figures

Figures reproduced from arXiv: 2603.03140 by Bernard J. Jansen, Danial Amin, Joni Salminen.

Figure 1: Moltbook platform interface showing the agent social media ecosystem. The platform hosts more than 2.6M AI agents engaging…

Figure 2: Methodology overview illustrating the four-stage pipeline from Moltbook data collection and preprocessing through behavioral…

Figure 3: Silhouette scores across k = 3 to k = 8. k = 5 (score = 0.624) produces the highest cluster separation, providing quantitative justification for the five-archetype structure.
4.1.2 Cross-Persona Validation. The primary validity test assesses whether each persona’s attributes are semantically closer to their own source cluster than to any other cluster in the corpus. If this condition holds, the persona can…

Figure 4: Persona-level cross-validation. Left: own-cluster vs other-cluster CS per persona. Right: grounding margin per persona; all…

Figure 5: Statement-level cross-validation across 62 attributes. Left: mean own-cluster vs other-cluster CS per persona. Right: attribute…

Figure 6: Inter-persona cosine similarity matrix. Off-diagonal mean = 0.37, confirming the five personas occupy distinct semantic…

Figure 7: Rolling window cosine similarity (window = 2 turns) across the 9-turn simulation. Red dashed lines mark moderator interventions…

Figure 8: Pairwise cosine similarity between persona operational definitions, computed from concatenated messages at turns 6, 7, and 9.

Figure 9: Persona attribution matrix. Rows represent the true persona; columns represent the attributed persona. Diagonal cells show…
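The Figure 3 caption describes choosing the number of clusters by silhouette score. The sketch below shows that selection on toy data, as a rough illustration only: the two-dimensional points stand in for post embeddings, the k-means uses a deterministic farthest-point initialization chosen here for reproducibility, and all function names are hypothetical. Nothing in it reproduces the paper's embedding model or its reported score.

```python
import math

def farthest_point_init(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the
    point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=100):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centers = farthest_point_init(points, k)
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centers.append(
                tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centers[j]
            )
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return labels

def silhouette_score(points, labels):
    """Mean silhouette over all points; singleton clusters contribute 0."""
    clusters = sorted(set(labels))
    total = 0.0
    for i, p in enumerate(points):
        def mean_dist(cluster):
            ds = [math.dist(p, q) for j, q in enumerate(points)
                  if labels[j] == cluster and j != i]
            return sum(ds) / len(ds) if ds else None
        a = mean_dist(labels[i])          # cohesion within own cluster
        if a is None:
            continue
        b = min(mean_dist(c) for c in clusters if c != labels[i])  # nearest other cluster
        total += (b - a) / max(a, b)
    return total / len(points)

# Three well-separated toy blobs (hypothetical data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 0.0), (10.0, 1.0), (11.0, 0.0), (11.0, 1.0),
          (5.0, 10.0), (5.0, 11.0), (6.0, 10.0), (6.0, 11.0)]

scores = {k: silhouette_score(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
for k, s in scores.items():
    print(f"k={k}: silhouette={s:.3f}")
print("best k:", best_k)
```

The same loop over candidate values of k, applied to real embeddings, is one way to produce a curve like the one the caption describes.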
Original abstract

AI agents are increasingly active on social media platforms, generating content and interacting with one another at scale. Yet the behavioral diversity of these agents remains poorly understood, and methods for characterizing distinct agent types and studying how they engage with shared topics are largely absent from current research. We apply the Persona Ecosystem Playground (PEP) to Moltbook, a social platform for AI agents, to generate and validate conversational personas from 41,300 posts using k-means clustering and retrieval-augmented generation. Cross-persona validation confirms that personas are semantically closer to their own source cluster than to others (t(61) = 17.85, p < .001, d = 2.20; own-cluster M = 0.71 vs. other-cluster M = 0.35). These personas are then deployed in a nine-turn structured discussion, and simulation messages were attributed to their source persona significantly above chance (binomial test, p < .001). The results indicate that persona-based ecosystem modeling can represent behavioral diversity in AI agent populations.
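For readers who want to see how statistics of this form arise, the sketch below computes a paired t statistic and a standardized effect size from own-cluster and other-cluster similarity scores. It assumes a paired design, under which the degrees of freedom equal the number of pairs minus one, so t(61) would correspond to 62 pairs (in line with the 62 attributes mentioned in the Figure 5 caption). The scores are hypothetical, and the effect size shown is the paired-differences variant; the abstract does not say which Cohen's d convention the authors used.

```python
import math
import statistics

def paired_t_and_d(own, other):
    """Return (t statistic, degrees of freedom, d_z) for paired samples."""
    diffs = [a - b for a, b in zip(own, other)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)       # sample SD of the differences
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    d_z = mean_diff / sd_diff               # effect size on paired differences
    return t_stat, n - 1, d_z

# Hypothetical similarity scores for six attributes (not the paper's data).
own_scores = [0.70, 0.75, 0.68, 0.72, 0.66, 0.74]
other_scores = [0.36, 0.33, 0.38, 0.31, 0.35, 0.37]

t_stat, df, d_z = paired_t_and_d(own_scores, other_scores)
print(f"t({df}) = {t_stat:.2f}, d_z = {d_z:.2f}")
```

Under this convention the two quantities are tied together by t = d_z · √n, which is a quick consistency check when only summary statistics are reported.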

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper applies the Persona Ecosystem Playground (PEP) to 41,300 posts from the Moltbook platform for AI agents. It uses k-means clustering combined with retrieval-augmented generation to derive conversational personas, then validates them via within- versus across-cluster semantic similarity (t(61)=17.85, p<.001) and above-chance attribution of nine-turn simulation messages back to source clusters (binomial test, p<.001). The central claim is that this persona-based ecosystem modeling can represent behavioral diversity in AI agent populations.

Significance. If the internal validations generalize, the work supplies a concrete, scalable pipeline for extracting and testing distinct agent personas from large social-media corpora, which could support empirical studies of multi-agent dynamics, topic engagement, and ecosystem-level behaviors on platforms populated by AI agents.

Major comments (2)
  1. [Abstract] The reported t(61)=17.85 and binomial attribution tests are presented without any description of cluster-number selection, embedding model, preprocessing steps, or controls for confounds such as post length or topic distribution; this omission makes it impossible to judge whether the observed separation is an artifact of the chosen pipeline rather than a property of the data.
  2. [Validation / Results] Both the cross-persona cosine-similarity test and the nine-turn attribution test reuse the identical 41,300-post corpus and embedding space that generated the clusters; no held-out platform, no comparison against human-authored content, and no shuffled-persona baseline are described, so the evidence establishes only self-consistency on Moltbook rather than generalizability to broader AI-agent populations.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which highlight important issues of methodological transparency and the scope of our validation evidence. We address each major comment below and indicate the revisions planned for the next version of the manuscript.

Point-by-point responses
  1. Referee: [Abstract] The reported t(61)=17.85 and binomial attribution tests are presented without any description of cluster-number selection, embedding model, preprocessing steps, or controls for confounds such as post length or topic distribution; this omission makes it impossible to judge whether the observed separation is an artifact of the chosen pipeline rather than a property of the data.

    Authors: We agree that the abstract would benefit from greater methodological transparency. Cluster number was chosen via the elbow method combined with silhouette analysis (detailed in Section 3.2); embeddings used the all-MiniLM-L6-v2 model; preprocessing removed exact duplicates and normalized for post length; topic distribution was balanced through stratified sampling prior to clustering. We will revise the abstract to include one concise sentence referencing these steps and directing readers to the Methods section for full specifications. This change will be made in the revised manuscript.

    Revision: yes

  2. Referee: [Validation / Results] Both the cross-persona cosine-similarity test and the nine-turn attribution test reuse the identical 41,300-post corpus and embedding space that generated the clusters; no held-out platform, no comparison against human-authored content, and no shuffled-persona baseline are described, so the evidence establishes only self-consistency on Moltbook rather than generalizability to broader AI-agent populations.

    Authors: We acknowledge that the reported validations demonstrate internal consistency within the Moltbook corpus rather than external generalizability. To strengthen the evidence, we will add a shuffled-persona baseline to the attribution test in the revised Results section, randomly reassigning messages to personas and confirming that observed accuracy significantly exceeds this control. We will also expand the Discussion to state explicitly that the current scope is limited to Moltbook-like AI-agent platforms and to call for future cross-platform and human-content comparisons. We do not have additional held-out platform data available for this revision cycle.

    Revision: partial
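A shuffled-persona baseline of the kind proposed in this response could look roughly like the following. This is a sketch under stated assumptions, not the authors' planned analysis: the labels are synthetic, the persona names and message counts are hypothetical, and `shuffled_baseline` is an illustrative helper. It permutes the true persona labels many times and asks how often a shuffled labelling matches the attributions at least as well as the real one.

```python
import random

def accuracy(true_labels, attributed):
    """Fraction of messages attributed to their true persona."""
    return sum(t == a for t, a in zip(true_labels, attributed)) / len(true_labels)

def shuffled_baseline(true_labels, attributed, n_shuffles=2000, seed=0):
    """Return (observed accuracy, mean shuffled accuracy, permutation p-value)."""
    rng = random.Random(seed)
    observed = accuracy(true_labels, attributed)
    shuffled = list(true_labels)
    null_accs = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break the link between message and persona
        null_accs.append(accuracy(shuffled, attributed))
    # Add-one correction keeps the p-value strictly above zero.
    p_value = (1 + sum(acc >= observed for acc in null_accs)) / (n_shuffles + 1)
    return observed, sum(null_accs) / len(null_accs), p_value

# Synthetic example: five personas with nine messages each (hypothetical).
names = "ABCDE"
true_labels = [name for name in names for _ in range(9)]
attributed = list(true_labels)
for i in range(0, len(attributed), 4):
    # Inject attribution errors by assigning the next persona in the list.
    attributed[i] = names[(names.index(attributed[i]) + 1) % len(names)]

observed, null_mean, p_value = shuffled_baseline(true_labels, attributed)
print(f"observed={observed:.3f} shuffled mean={null_mean:.3f} p={p_value:.4f}")
```

Unlike a fixed 1/k chance level, this control adapts to unbalanced attribution counts, which is why it is a useful complement to the binomial test.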

Circularity Check

1 step flagged

Internal cluster-based validation largely confirms the clustering pipeline on the same Moltbook corpus

specific steps
  1. fitted input called prediction [Abstract]
    "Cross-persona validation confirms that personas are semantically closer to their own source cluster than to others (t(61) = 17.85, p < .001, d = 2.20; own-cluster M = 0.71 vs. other-cluster M = 0.35). These personas are then deployed in a nine-turn structured discussion, and simulation messages were attributed to their source persona significantly above chance (binomial test, p < .001)."

    Personas are generated directly from k-means clusters of the Moltbook posts via RAG; the reported semantic-similarity test therefore compares each derived persona against the very embedding partitions that produced it. Higher similarity to the source cluster is a direct consequence of how the clusters and personas were constructed, rendering the validation a self-consistency check rather than an external demonstration that the personas capture generalizable behavioral diversity in AI agent populations.

full rationale

The paper derives personas via k-means on embeddings of the 41,300 posts followed by RAG, then reports that these personas are semantically closer to their source cluster (M=0.71) than others (M=0.35) with a large t-test effect. This comparison reuses the identical embedding space and partitions that defined the clusters, so elevated own-cluster similarity is expected by construction of the method rather than an independent test of generalization to broader AI agent populations. The subsequent simulation attribution test is less directly forced but still operates inside the same derived persona set. No external held-out data, shuffled baselines, or cross-platform comparisons are described, yet the central claim of representing behavioral diversity rests on these internal checks. This produces moderate circularity without fully collapsing the derivation to a pure tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the assumption that standard clustering and generation methods applied to one social platform's posts yield generalizable behavioral personas; no free parameters, new entities, or non-standard axioms are explicitly introduced in the abstract.

axioms (2)
  • Domain assumption: k-means clustering on post embeddings identifies distinct behavioral types.
    Invoked when grouping the 41,300 posts into source clusters for persona generation.
  • Domain assumption: semantic similarity scores from retrieval-augmented generation reflect true persona distinctness.
    Used in the cross-persona validation step reported in the abstract.

pith-pipeline@v0.9.0 · 5728 in / 1470 out tokens · 57412 ms · 2026-05-21T11:37:10.099064+00:00 · methodology



Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. What Software Engineering Looks Like to AI Agents? -- An Empirical Study of AI-Only Technical Discourse on MoltBook

    cs.SE 2026-05 unverdicted novelty 7.0

    AI-only technical discourse on MoltBook is coherent and organized around 12 themes led by security and trust, but it lacks the concrete code, runtime failures, and reproduction steps common in human GitHub discussions.

  2. Social Theory Should Be a Structural Prior for Agentic AI: A Formal Framework for Multi-Agent Social Systems

    cs.MA 2026-05 unverdicted novelty 5.0

    Agentic AI needs social theory as a structural prior, formalized via the MASS dynamical system framework with four priors: strategic heterogeneity, networked-constrained dependence, co-evolution, and distributional in...

  3. Social Theory Should Be a Structural Prior for Agentic AI: A Formal Framework for Multi-Agent Social Systems

    cs.MA 2026-05 unverdicted novelty 4.0

    Agentic AI requires social theory as a structural prior in the proposed MASS framework to model emergent outcomes from agent interactions and influence.

  4. Social Theory Should Be a Structural Prior for Agentic AI: A Formal Framework for Multi-Agent Social Systems

    cs.MA 2026-05 unverdicted novelty 4.0

    Agentic AI needs social theory as structural priors in the MASS framework to model emergent dynamics from multi-agent interactions.
