Exploring a Gamified Personality Assessment Method through Interaction with LLM Agents Embodying Different Personalities
Pith reviewed 2026-05-19 06:31 UTC · model grok-4.3
The pith
Gamified interactions with multiple LLM agents embodying different personalities yield effective and more accurate personality assessments based on the Big Five model.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The Multi-PR GPA framework uses Large Language Models to create virtual agents with distinct personalities that engage users in interactive games. The resulting multi-type textual data supports Big Five personality assessments that are both effective and interpretable. Results improve when the multiplicity of personality representations is taken into account, and multi-context aggregation partially mitigates LLM assessment biases.
What carries the argument
The Multi-PR GPA framework, which deploys several LLM agents with varied personalities to run gamified interactions and aggregates the generated textual data for Big Five scoring and bias analysis.
If this is right
- The approach supports low-intrusion automated personality assessment suitable for psychology and HCI applications.
- Multi-personality representation produces superior assessment performance compared with single-context methods.
- Multi-context aggregation partially corrects systematic biases present in LLM personality judgments.
- The generated multi-type textual data supplies interpretable insights alongside the trait scores.
Where Pith is reading between the lines
- The same interaction logs could be reused to study how personality expression changes across different game contexts.
- Developers might embed the method in apps to make personality feedback more engaging than standard questionnaires.
- The bias findings suggest a general need to test LLM outputs for consistency when they are used as proxies for human judgment.
Load-bearing premise
Interactions with LLM agents can draw out genuine and multifaceted human personality signals without the agents' own training biases dominating the results.
What would settle it
A controlled comparison in which the game-based scores show no stronger correlation with participants' established Big Five questionnaire results or real-world behavioral markers than scores from a single-personality LLM agent.
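The comparison described above can be sketched in a few lines. This is a minimal illustration, not the paper's analysis: the function names (`pearson_r`, `compare_conditions`) and all score values are hypothetical, and the data is invented purely to show the shape of the check. It computes, for one trait, how strongly each condition's scores correlate with questionnaire scores and reports the difference.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def compare_conditions(questionnaire, multi_agent, single_agent):
    """Correlate each condition with the questionnaire and report the gap."""
    r_multi = pearson_r(questionnaire, multi_agent)
    r_single = pearson_r(questionnaire, single_agent)
    return {"r_multi": r_multi, "r_single": r_single, "delta": r_multi - r_single}


# Invented extraversion scores for five hypothetical participants (1-5 scale).
questionnaire = [2.0, 3.0, 3.5, 4.0, 4.5]
multi_agent = [2.2, 2.9, 3.6, 3.9, 4.4]
single_agent = [3.0, 3.1, 2.9, 3.4, 3.2]

result = compare_conditions(questionnaire, multi_agent, single_agent)
print(result)
```

A positive `delta` alone would not settle the question; with real data the difference would also need a significance test suited to correlations that share a common variable, and a sample large enough to power it.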
Original abstract
The low-intrusion and automated personality assessment is receiving increasing attention in psychology and human-computer interaction fields. This study explores an interactive approach for personality assessment, focusing on the multiplicity of personality representation. We propose a framework of Gamified Personality Assessment through Multi-Personality Representations (Multi-PR GPA). The framework leverages Large Language Models to empower virtual agents with different personalities. These agents elicit multifaceted human personality representations through engaging in interactive games. Drawing upon the multi-type textual data generated throughout the interaction, it achieves personality assessments with interpretable insights. Grounded in the classic Big Five personality theory, we developed a prototype system and conducted a user study to evaluate the efficacy of Multi-PR GPA. The results affirm the effectiveness of our approach in personality assessment and demonstrate its superior performance when considering the multiplicity of personality representation. Error structure analysis further revealed systematic assessment biases in LLMs, which multi-context aggregation partially mitigated.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes the Multi-PR GPA framework, which employs LLM agents embodying varied personalities to engage users in gamified interactions. These interactions generate multi-type textual data for assessing Big Five personality traits, yielding interpretable insights. A prototype system was evaluated via user study, with results claimed to affirm effectiveness, demonstrate superiority when accounting for personality multiplicity, and show that multi-context aggregation partially mitigates systematic LLM assessment biases.
Significance. If the empirical claims are substantiated through proper anchoring to established instruments, the work could meaningfully advance low-intrusion, interactive personality assessment methods in HCI and psychology. The focus on multiplicity of representation and the error-structure analysis of LLM biases represent potentially useful contributions. However, the absence of key validation details currently constrains the assessed significance and generalizability.
major comments (3)
- [User Study] User Study section: Sample size, participant demographics, recruitment method, and any statistical tests (e.g., significance levels or effect sizes) are not reported despite the abstract's claims of positive results and bias mitigation. These omissions prevent evaluation of the reliability and generalizability of the reported effectiveness.
- [Results] Results section: The central claim of superior performance with multi-personality representations and partial bias mitigation lacks quantitative grounding such as correlation coefficients with validated instruments (NEO-PI-R or IPIP-NEO), inter-rater reliability, or ablation comparisons against single-personality baselines. Without these metrics, it remains unclear whether observed signals reflect user personality variance or consistent LLM response patterns.
- [Methodology] Methodology section: The exact scoring algorithms used to derive Big Five trait scores from the collected multi-context textual data, including the implementation of multi-context aggregation, are not specified. This hinders reproducibility and assessment of how the framework operationalizes personality assessment.
minor comments (2)
- [Abstract] The abstract would benefit from a concise statement of the user-study scale or primary quantitative outcomes to better contextualize the reported findings.
- [Framework Description] Notation for how textual features map to specific Big Five facets could be clarified for readers unfamiliar with the LLM prompting setup.
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive feedback. The comments highlight important areas for improving the transparency and rigor of our empirical sections. We have revised the manuscript accordingly to address each point while preserving the core contributions of the Multi-PR GPA framework.
- Referee: [User Study] User Study section: Sample size, participant demographics, recruitment method, and any statistical tests (e.g., significance levels or effect sizes) are not reported despite the abstract's claims of positive results and bias mitigation. These omissions prevent evaluation of the reliability and generalizability of the reported effectiveness.
  Authors: We agree that these details should have been reported more explicitly. The revised User Study section now includes the sample size, full participant demographics, recruitment procedures (via online research platforms and institutional channels), and the statistical tests performed, including significance levels and effect sizes supporting the abstract claims. These additions directly improve evaluability of reliability and generalizability.
  Revision: yes
- Referee: [Results] Results section: The central claim of superior performance with multi-personality representations and partial bias mitigation lacks quantitative grounding such as correlation coefficients with validated instruments (NEO-PI-R or IPIP-NEO), inter-rater reliability, or ablation comparisons against single-personality baselines. Without these metrics, it remains unclear whether observed signals reflect user personality variance or consistent LLM response patterns.
  Authors: We acknowledge the need for stronger quantitative anchoring. The revised Results section now incorporates correlation coefficients with a validated instrument, inter-rater reliability statistics, and explicit ablation comparisons between multi-personality and single-personality conditions. These metrics demonstrate that the performance gains arise from capturing user personality variance rather than LLM artifacts alone, while the error-structure analysis quantifies the partial bias mitigation achieved through multi-context aggregation.
  Revision: yes
- Referee: [Methodology] Methodology section: The exact scoring algorithms used to derive Big Five trait scores from the collected multi-context textual data, including the implementation of multi-context aggregation, are not specified. This hinders reproducibility and assessment of how the framework operationalizes personality assessment.
  Authors: We agree that greater specificity is required for reproducibility. The revised Methodology section now details the exact scoring algorithms, including the prompt templates, trait extraction rules, and the multi-context aggregation procedure (weighted combination of scores across interaction contexts based on relevance and consistency). This makes the operationalization of the personality assessment fully transparent and replicable.
  Revision: yes
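The rebuttal describes multi-context aggregation only as a weighted combination of scores across interaction contexts based on relevance and consistency. One way such a scheme could look is sketched below. Everything specific here is an assumption for illustration: the `ContextScore` type, the context names, the 1-5 score scale, and especially the choice to multiply relevance by consistency to form the weight. The paper's actual procedure is not given in this page.

```python
from dataclasses import dataclass

TRAITS = (
    "openness",
    "conscientiousness",
    "extraversion",
    "agreeableness",
    "neuroticism",
)


@dataclass
class ContextScore:
    """Trait scores from one interaction context (hypothetical structure)."""
    context: str        # label for the game or agent context
    scores: dict        # trait name -> score, assumed to be on a 1-5 scale
    relevance: float    # assumed weight in [0, 1]
    consistency: float  # assumed weight in [0, 1]


def aggregate(context_scores):
    """Weighted mean per trait; weight = relevance * consistency (assumed)."""
    totals = {trait: 0.0 for trait in TRAITS}
    weight_sum = 0.0
    for cs in context_scores:
        weight = cs.relevance * cs.consistency
        weight_sum += weight
        for trait in TRAITS:
            totals[trait] += weight * cs.scores[trait]
    if weight_sum == 0:
        raise ValueError("all context weights are zero; nothing to aggregate")
    return {trait: totals[trait] / weight_sum for trait in TRAITS}


# Two invented contexts: one fully weighted, one half weighted.
contexts = [
    ContextScore("cooperative_game", {t: 4.0 for t in TRAITS}, 1.0, 1.0),
    ContextScore("competitive_game", {t: 1.0 for t in TRAITS}, 0.5, 1.0),
]
profile = aggregate(contexts)
print(profile)
```

A product of the two weights means a context counts for little if it is either irrelevant to a trait or internally inconsistent. A per-trait relevance weight would be a natural refinement, since one game may be informative about agreeableness but not about openness.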
Circularity Check
Empirical user study shows no circular derivations or self-referential reductions
The paper describes a gamified personality assessment framework (Multi-PR GPA) that uses LLM agents to elicit user responses in interactive games, followed by a user study to evaluate effectiveness against Big Five theory. All central claims rest on empirical outcomes from participant interactions and error analysis rather than any mathematical derivation, fitted parameters renamed as predictions, or load-bearing self-citations. No equations, ansatzes, or uniqueness theorems are invoked that reduce results to inputs by construction; the assessment is grounded in external study data collection.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The Big Five personality theory provides a valid and sufficient basis for interpreting interaction data as personality traits.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
We propose a framework of Gamified Personality Assessment through Multi-Personality Representations (Multi-PR GPA). The framework leverages Large Language Models to empower virtual agents with different personalities. These agents elicit multifaceted human personality representations through engaging in interactive games.
- IndisputableMonolith/Foundation/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
Error structure analysis further revealed systematic assessment biases in LLMs, which multi-context aggregation partially mitigated.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.