Reflecti-Mate: A Conversational Agent for Adaptive Decision-Making Support Through System 1 and System 2 Thinking
Pith reviewed 2026-05-22 03:58 UTC · model grok-4.3
The pith
A conversational agent that adapts to individual thinking patterns promotes better integration of intuitive and analytical processes during personal decision making.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central discovery is that the Reflecti-Mate agent, by adapting to users' thought patterns, enables more personalized reflective trajectories, elicits more integrative reflective language, and is perceived as providing stronger support for holistic reflection, whereas a baseline agent leads to homogenized profiles dominated by cognitive language.
What carries the argument
Reflecti-Mate, a conversational agent that adapts its support based on detected patterns in the user's reflective language to balance System 1 intuitive and System 2 analytical thinking.
If this is right
- Users follow more personalized paths in their reflection instead of uniform ones.
- Reflective language becomes more integrative across cognitive and non-cognitive elements.
- Participants rate the agent higher for supporting complete, holistic reflection.
- Without adaptation, user outputs converge to similar cognitive-heavy styles.
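One hedged way to make "converge to similar cognitive-heavy styles" concrete is to measure how spread out participants' language profiles are within a condition. The sketch below assumes each participant is summarized as a hypothetical (cognitive, emotional, intuitive) proportion vector. The function name, the example numbers, and the choice of mean pairwise Euclidean distance are illustrative assumptions, not the paper's method.

```python
from itertools import combinations
from math import dist


def mean_pairwise_distance(profiles):
    """Average Euclidean distance between all pairs of profile vectors.

    Lower values mean the profiles in a condition look more alike.
    """
    pairs = list(combinations(profiles, 2))
    if not pairs:
        raise ValueError("need at least two profiles")
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


# Made-up (cognitive, emotional, intuitive) proportions, for illustration only.
baseline_profiles = [(0.80, 0.10, 0.10), (0.78, 0.12, 0.10), (0.82, 0.09, 0.09)]
adaptive_profiles = [(0.60, 0.30, 0.10), (0.30, 0.30, 0.40), (0.45, 0.15, 0.40)]

baseline_spread = mean_pairwise_distance(baseline_profiles)
adaptive_spread = mean_pairwise_distance(adaptive_profiles)
print(f"baseline spread: {baseline_spread:.3f}")
print(f"adaptive spread: {adaptive_spread:.3f}")
```

A smaller spread in one condition would be consistent with homogenized profiles, though it says nothing by itself about whether the shared style is cognitive-heavy.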
Where Pith is reading between the lines
- Decision-support tools in fields like finance or medicine could benefit from similar adaptation to capture emotional considerations.
- Long-term studies tracking whether integrated reflection leads to better actual decisions would test the practical value.
- Interface designers might incorporate language monitoring to detect when users are stuck in one mode of thinking.
Load-bearing premise
The language analysis measures and perception scales truly indicate integration of System 1 and System 2 thinking rather than being artifacts of how the agents were designed or what participants believed the researchers expected.
What would settle it
Re-running the study with the same agents but a different set of language metrics, or with blinded evaluators, and finding no increase in integrative language or perceived holism for the adaptive agent.
Original abstract
Making high-stakes personal decisions involves cognitive, emotional, and intuitive processes, and individuals differ in how they allocate attention across these modes. Integration of these processes has been shown to benefit decision making. Yet, most current decision-support systems focus primarily on supporting cognitive aspects, rather than adapting to the individual's thinking profile to support integration of different types of thoughts. In this study, we investigate an agent designed to encourage integration by adapting to the individual user's thought patterns. We explore its effects on participants' perceptions of the agent and their reflective behavior, in comparison with unaided pre-reflection and a baseline agent. In a between-subjects study (N = 128), our agent, which fostered broad and elaborated thinking, enabled more personalized reflective trajectories, elicited more integrative reflective language, and was perceived as providing stronger support for holistic reflection. In contrast, the baseline agent produced homogenized profiles dominated by cognitive language across participants.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Reflecti-Mate, an adaptive conversational agent designed to support integration of System 1 (intuitive/emotional) and System 2 (analytical) thinking during high-stakes personal decisions. It reports a between-subjects study (N=128) comparing the adaptive agent against unaided pre-reflection and a non-adaptive baseline agent. Key claims are that the adaptive agent produces more personalized reflective trajectories, elicits more integrative reflective language, and is perceived as offering stronger holistic support, while the baseline yields homogenized, cognitively dominated profiles.
Significance. If the central results hold under scrutiny, the work offers a useful contribution to HCI research on adaptive decision-support systems by shifting focus from purely cognitive aids to agents that promote balanced cognitive-emotional integration. The between-subjects design and dual measurement approach (language analysis plus perception scales) are positive elements that could inform future personalized reflection tools. The paper would benefit from stronger validation of its language metrics to realize this potential.
major comments (2)
- [Methods] Methods section: The paper must supply precise details on the language-analysis pipeline used to quantify 'integrative reflective language' and 'personalized reflective trajectories' (e.g., specific dictionaries, coding rubrics, inter-rater reliability, or automated feature extraction). Without these, it is impossible to determine whether observed differences reflect genuine System 1/2 integration or simply surface-level compliance with the agent's prompts. This is load-bearing for the headline claim.
- [Results] Results section: The abstract asserts directional superiority on integrative language and personalization, yet the provided text supplies no statistical tests, effect sizes, confidence intervals, or descriptive statistics for the between-condition comparisons. Include a results table (or §5) reporting means, SDs, and inferential tests so readers can evaluate the magnitude and reliability of the reported effects.
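As a hedged illustration of the kind of reporting the Results comment asks for, the sketch below computes a mean, a sample SD, and Cohen's d for two independent groups. The scores are invented and the variable names are hypothetical. The paper's actual measures and tests are not given in the text reviewed here.

```python
from math import sqrt
from statistics import mean, stdev, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (
        n_a + n_b - 2
    )
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Invented per-participant scores, for illustration only.
adaptive = [4.0, 5.0, 6.0, 5.0, 5.0]
baseline = [3.0, 4.0, 5.0, 4.0, 4.0]

for name, scores in (("adaptive", adaptive), ("baseline", baseline)):
    print(f"{name}: M = {mean(scores):.2f}, SD = {stdev(scores):.2f}, n = {len(scores)}")
print(f"Cohen's d = {cohens_d(adaptive, baseline):.2f}")
```

An effect size of this kind complements, rather than replaces, the inferential tests and confidence intervals the referee requests.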
minor comments (2)
- [System Design] Clarify the exact differences in prompt structure between the adaptive and baseline agents in the system description to help readers assess the degree of adaptation.
- [Abstract] The abstract would be strengthened by a brief parenthetical mention of the primary statistical outcomes or effect sizes supporting the main claims.
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which have helped us identify areas for improvement in our manuscript. We address each major comment below and plan to incorporate the suggested changes in the revised version.
Point-by-point responses
- Referee: [Methods] Methods section: The paper must supply precise details on the language-analysis pipeline used to quantify 'integrative reflective language' and 'personalized reflective trajectories' (e.g., specific dictionaries, coding rubrics, inter-rater reliability, or automated feature extraction). Without these, it is impossible to determine whether observed differences reflect genuine System 1/2 integration or simply surface-level compliance with the agent's prompts. This is load-bearing for the headline claim.
  Authors: We fully agree with the referee that a precise description of the language-analysis pipeline is crucial for validating our findings and ensuring the results are not merely due to prompt compliance. In the revised manuscript, we will provide detailed information on the specific dictionaries, coding rubrics, inter-rater reliability metrics, and automated feature extraction methods used to measure integrative reflective language and personalized reflective trajectories. This addition will strengthen the methodological transparency and allow for better assessment of the claims regarding System 1 and System 2 integration.
  Revision: yes
- Referee: [Results] Results section: The abstract asserts directional superiority on integrative language and personalization, yet the provided text supplies no statistical tests, effect sizes, confidence intervals, or descriptive statistics for the between-condition comparisons. Include a results table (or §5) reporting means, SDs, and inferential tests so readers can evaluate the magnitude and reliability of the reported effects.
  Authors: We acknowledge that the current manuscript does not include the requested statistical details in the Results section. We will revise the paper to include a results table or expanded section reporting means, standard deviations, statistical tests (e.g., appropriate inferential statistics for the between-subjects design), effect sizes, and confidence intervals for the comparisons on integrative language and personalization measures. This will enable readers to properly evaluate the magnitude and reliability of the effects.
  Revision: yes
Circularity Check
No significant circularity in empirical evaluation
The paper reports results from a between-subjects experiment (N=128) that directly compares participant outcomes across three conditions using standard language-analysis metrics and perception scales. No derivations, equations, fitted parameters, or first-principles predictions are present; the reported effects on reflective trajectories, integrative language, and perceived support are measured outcomes rather than quantities constructed from the inputs by definition. The work contains no self-citation chains that bear the central claim, no uniqueness theorems imported from prior author work, and no ansatz smuggled via citation. The analysis is therefore self-contained as an empirical comparison whose validity rests on the chosen measures rather than on any internal reduction to its own premises.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Each main thought t_i is represented as a tuple: t_i = (text_i, category_i) where category_i in {internal,external,experiential,other}. ... S_k = sum (1 + |E_i|) ... a_t = explore with probability epsilon
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
MANOVA on cognitive/emotional/intuitive composite scores from LIWC-22; cluster analysis of pre-interaction linguistic profiles
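The first quoted passage above mentions a thought tuple, a score of the form S_k = sum (1 + |E_i|), and an explore step taken with probability epsilon. The sketch below is one hedged reading of those fragments. It assumes that E_i is the set of elaborations attached to thought i, that S_k sums over the thoughts in category k, and that the agent otherwise targets the lowest-scoring category. The passage elides these details, so all three are assumptions, as are the names Thought, category_scores, and choose_category and the example thoughts.

```python
import random
from dataclasses import dataclass, field

CATEGORIES = ("internal", "external", "experiential", "other")


@dataclass
class Thought:
    text: str
    category: str
    elaborations: list = field(default_factory=list)  # assumed meaning of E_i


def category_scores(thoughts):
    """Assumed reading: S_k sums (1 + |E_i|) over the thoughts in category k."""
    scores = {k: 0 for k in CATEGORIES}
    for t in thoughts:
        scores[t.category] += 1 + len(t.elaborations)
    return scores


def choose_category(scores, epsilon, rng=random):
    """Epsilon-greedy choice over categories.

    With probability epsilon pick a category at random (explore); otherwise
    pick the lowest-scoring category (an assumed exploit rule).
    """
    if rng.random() < epsilon:
        return rng.choice(list(scores))
    return min(scores, key=scores.get)


# Made-up thoughts with hand-assigned categories, for illustration only.
thoughts = [
    Thought("The salary is higher", "external", ["It covers rent"]),
    Thought("I feel uneasy about moving", "internal"),
    Thought("My last move went badly", "experiential",
            ["I felt isolated", "It took a year to settle"]),
]
scores = category_scores(thoughts)
print(scores)
print(choose_category(scores, epsilon=0.1, rng=random.Random(0)))
```

Passing a seeded random.Random instance keeps the explore step reproducible when experimenting with different epsilon values.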
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Source paper: Tarvirdians et al., UMAP '26, June 08–11, 2026, Gothenburg, Sweden.