Reflecti-Mate: A Conversational Agent for Adaptive Decision-Making Support Through System 1 and System 2 Thinking

Catharine Oertel; Catholijn M. Jonker; Hayley Hung; Morita Tarvirdians; Senthil Chandrasegaran

arxiv: 2605.22509 · v1 · pith:XUXOXJ5Rnew · submitted 2026-05-21 · 💻 cs.HC · cs.CL

Morita Tarvirdians, Senthil Chandrasegaran, Hayley Hung, Catholijn M. Jonker, Catharine Oertel

Pith reviewed 2026-05-22 03:58 UTC · model grok-4.3

keywords conversational agents · decision making · reflective thinking · system 1 · system 2 · adaptive systems · human-computer interaction · holistic reflection

The pith

A conversational agent that adapts to individual thinking patterns promotes better integration of intuitive and analytical processes during personal decision making.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

High-stakes decisions require blending cognitive analysis with emotional and intuitive insights, yet most support systems focus only on the cognitive side. This paper tests an agent named Reflecti-Mate that monitors a user's language in real time and adjusts its questions to encourage a wider range of thinking. In a between-subjects study with 128 participants, those using the adaptive agent followed more individualized reflection paths, more often used language that mixed different thought types, and rated the support as more holistic. A comparison agent without adaptation drew similar, mostly cognitive language from all participants. If the result holds, decision tools could become more effective by adapting to how each person naturally thinks rather than imposing one style.

Core claim

The central discovery is that the Reflecti-Mate agent, by adapting to users' thought patterns, enables more personalized reflective trajectories, elicits more integrative reflective language, and is perceived as providing stronger support for holistic reflection, whereas a baseline agent leads to homogenized profiles dominated by cognitive language.

What carries the argument

Reflecti-Mate, a conversational agent that adapts its support based on detected patterns in the user's reflective language to balance System 1 intuitive and System 2 analytical thinking.

If this is right

Users follow more personalized paths in their reflection instead of uniform ones.
Reflective language becomes more integrative across cognitive and non-cognitive elements.
Participants rate the agent higher for supporting complete, holistic reflection.
Without adaptation, user outputs converge to similar cognitive-heavy styles.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Decision-support tools in fields like finance or medicine could benefit from similar adaptation to capture emotional considerations.
Long-term studies tracking whether integrated reflection leads to better actual decisions would test the practical value.
Interface designers might incorporate language monitoring to detect when users are stuck in one mode of thinking.

Load-bearing premise

The language analysis measures and perception scales truly indicate integration of System 1 and System 2 thinking rather than being artifacts of how the agents were designed or what participants believed the researchers expected.

What would settle it

A re-run of the study with the same agents, scored with a different set of language metrics or by blinded evaluators, that shows no increase in integrative language or perceived holism for the adaptive agent.

Figures

Figures reproduced from arXiv: 2605.22509 by Catharine Oertel, Catholijn M. Jonker, Hayley Hung, Morita Tarvirdians, Senthil Chandrasegaran.

Figure 2: Overview of the Conditions. The baseline agent selects the next prompt based on the conversational history, including ...

Figure 3: Transformation of Linguistic Markers by Condition.

Figure 5: Reflection trajectories across clusters and conditions.

Original abstract

Making high-stakes personal decisions involves cognitive, emotional, and intuitive processes, and individuals differ in how they allocate attention across these modes. Integration of these processes has been shown to benefit decision making. Yet, most current decision-support systems focus primarily on supporting cognitive aspects, rather than adapting to the individual's thinking profile to support integration of different types of thoughts. In this study, we investigate an agent designed to encourage integration by adapting to the individual user's thought patterns. We explore its effects on participants' perceptions of the agent and their reflective behavior, in comparison with unaided pre-reflection and a baseline agent. In a between-subjects study (N = 128), our agent, which fostered broad and elaborated thinking, enabled more personalized reflective trajectories, elicited more integrative reflective language, and was perceived as providing stronger support for holistic reflection. In contrast, the baseline agent produced homogenized profiles dominated by cognitive language across participants.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The adaptive agent produces more varied reflections than the baseline in this N=128 study, but the language measures may simply track prompt compliance rather than genuine System 1/2 integration.

The paper builds a conversational agent that detects whether a user is leaning intuitive or analytical and adjusts its prompts to push for integration of both. In the between-subjects experiment the adaptive version yielded more individualized reflection sequences and language that mixed cognitive and affective terms, while the non-adaptive baseline produced uniform cognitive-heavy profiles across participants. That contrast is the clearest new piece of evidence here, and the study design itself is straightforward enough to be replicable.

Referee Report

2 major / 2 minor

Summary. The paper introduces Reflecti-Mate, an adaptive conversational agent designed to support integration of System 1 (intuitive/emotional) and System 2 (analytical) thinking during high-stakes personal decisions. It reports a between-subjects study (N=128) comparing the adaptive agent against unaided pre-reflection and a non-adaptive baseline agent. Key claims are that the adaptive agent produces more personalized reflective trajectories, elicits more integrative reflective language, and is perceived as offering stronger holistic support, while the baseline yields homogenized, cognitively dominated profiles.

Significance. If the central results hold under scrutiny, the work offers a useful contribution to HCI research on adaptive decision-support systems by shifting focus from purely cognitive aids to agents that promote balanced cognitive-emotional integration. The between-subjects design and dual measurement approach (language analysis plus perception scales) are positive elements that could inform future personalized reflection tools. The paper would benefit from stronger validation of its language metrics to realize this potential.

major comments (2)

[Methods] Methods section: The paper must supply precise details on the language-analysis pipeline used to quantify 'integrative reflective language' and 'personalized reflective trajectories' (e.g., specific dictionaries, coding rubrics, inter-rater reliability, or automated feature extraction). Without these, it is impossible to determine whether observed differences reflect genuine System 1/2 integration or simply surface-level compliance with the agent's prompts. This is load-bearing for the headline claim.
[Results] Results section: The abstract asserts directional superiority on integrative language and personalization, yet the provided text supplies no statistical tests, effect sizes, confidence intervals, or descriptive statistics for the between-condition comparisons. Include a results table (or §5) reporting means, SDs, and inferential tests so readers can evaluate the magnitude and reliability of the reported effects.
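
To make the second request concrete, here is a minimal Python sketch of the descriptive statistics and one standardized effect size (Cohen's d with a pooled standard deviation) for a single between-condition comparison. The scores, the condition names, and the choice of Cohen's d are illustrative assumptions; the paper's actual measures and tests are not given in the text above.

```python
from math import sqrt
from statistics import mean, stdev, variance


def describe(scores):
    """Return n, mean, and sample SD for one condition."""
    return {"n": len(scores), "mean": mean(scores), "sd": stdev(scores)}


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical per-participant scores on some integrative-language measure.
# These numbers are made up for illustration and are NOT from the paper.
adaptive = [3.1, 4.0, 3.6, 4.4, 3.9]
baseline = [2.8, 3.0, 3.3, 2.9, 3.5]

for name, scores in (("adaptive", adaptive), ("baseline", baseline)):
    print(name, describe(scores))
print("Cohen's d:", cohens_d(adaptive, baseline))
```

A table built from values like these (n, mean, SD per condition, plus the effect size and the accompanying inferential test) is the kind of reporting the comment asks for.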

minor comments (2)

[System Design] Clarify the exact differences in prompt structure between the adaptive and baseline agents in the system description to help readers assess the degree of adaptation.
[Abstract] The abstract would be strengthened by a brief parenthetical mention of the primary statistical outcomes or effect sizes supporting the main claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which have helped us identify areas for improvement in our manuscript. We address each major comment below and plan to incorporate the suggested changes in the revised version.

Referee: [Methods] Methods section: The paper must supply precise details on the language-analysis pipeline used to quantify 'integrative reflective language' and 'personalized reflective trajectories' (e.g., specific dictionaries, coding rubrics, inter-rater reliability, or automated feature extraction). Without these, it is impossible to determine whether observed differences reflect genuine System 1/2 integration or simply surface-level compliance with the agent's prompts. This is load-bearing for the headline claim.

Authors: We fully agree with the referee that a precise description of the language-analysis pipeline is crucial for validating our findings and ensuring the results are not merely due to prompt compliance. In the revised manuscript, we will provide detailed information on the specific dictionaries, coding rubrics, inter-rater reliability metrics, and automated feature extraction methods used to measure integrative reflective language and personalized reflective trajectories. This addition will strengthen the methodological transparency and allow for better assessment of the claims regarding System 1 and System 2 integration.

revision: yes

Referee: [Results] Results section: The abstract asserts directional superiority on integrative language and personalization, yet the provided text supplies no statistical tests, effect sizes, confidence intervals, or descriptive statistics for the between-condition comparisons. Include a results table (or §5) reporting means, SDs, and inferential tests so readers can evaluate the magnitude and reliability of the reported effects.

Authors: We acknowledge that the current manuscript does not include the requested statistical details in the Results section. We will revise the paper to include a results table or expanded section reporting means, standard deviations, statistical tests (e.g., appropriate inferential statistics for the between-subjects design), effect sizes, and confidence intervals for the comparisons on integrative language and personalization measures. This will enable readers to properly evaluate the magnitude and reliability of the effects.

revision: yes
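
The first response promises inter-rater reliability metrics without naming one. For two coders assigning categorical labels, Cohen's kappa is a common choice, and the Python sketch below shows how it is computed. The labels and the two-coder setup are hypothetical; nothing above says which statistic the authors will report.

```python
from collections import Counter


def cohens_kappa(coder_1, coder_2):
    """Cohen's kappa for two coders labelling the same items."""
    if len(coder_1) != len(coder_2) or not coder_1:
        raise ValueError("both coders must label the same non-empty item list")
    n = len(coder_1)
    # Observed agreement: share of items given the same label by both coders.
    p_observed = sum(a == b for a, b in zip(coder_1, coder_2)) / n
    # Chance agreement: product of each coder's marginal label proportions.
    counts_1, counts_2 = Counter(coder_1), Counter(coder_2)
    labels = set(counts_1) | set(counts_2)
    p_chance = sum(counts_1[k] * counts_2[k] for k in labels) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both coders used one identical label for every item
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical labels for four text segments (illustration only).
coder_a = ["cognitive", "cognitive", "emotional", "emotional"]
coder_b = ["cognitive", "cognitive", "emotional", "cognitive"]
print(cohens_kappa(coder_a, coder_b))
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when one label (for example, cognitive language) dominates the data.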

Circularity Check

0 steps flagged

No significant circularity in empirical evaluation

The paper reports results from a between-subjects experiment (N=128) that directly compares participant outcomes across three conditions using standard language-analysis metrics and perception scales. No derivations, equations, fitted parameters, or first-principles predictions are present; the reported effects on reflective trajectories, integrative language, and perceived support are measured outcomes rather than quantities constructed from the inputs by definition. The work contains no self-citation chains that bear the central claim, no uniqueness theorems imported from prior author work, and no ansatz smuggled via citation. The analysis is therefore self-contained as an empirical comparison whose validity rests on the chosen measures rather than on any internal reduction to its own premises.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a purely empirical HCI user study; it introduces no mathematical derivations, new theoretical entities, or fitted parameters that the central claim depends on.

pith-pipeline@v0.9.0 · 5712 in / 1120 out tokens · 36280 ms · 2026-05-22T03:58:56.269890+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: Each main thought t_i is represented as a tuple: t_i = (text_i, category_i), where category_i in {internal, external, experiential, other}. ... S_k = sum (1 + |E_i|) ... a_t = explore with probability epsilon

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Paper passage: MANOVA on cognitive/emotional/intuitive composite scores from LIWC-22; cluster analysis of pre-interaction linguistic profiles
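
The first quoted passage (thought tuples, a per-category score S_k, and exploration with probability epsilon) reads like an epsilon-greedy selection rule. The Python sketch below is one possible reading, not the authors' implementation. It assumes that E_i is the set of elaborations attached to thought i, that S_k sums over the thoughts in category k, and that the non-exploring step targets the category with the lowest score. The names `Thought`, `category_scores`, and `choose_target_category` are hypothetical.

```python
import random
from dataclasses import dataclass, field

CATEGORIES = ("internal", "external", "experiential", "other")


@dataclass
class Thought:
    text: str
    category: str  # one of CATEGORIES
    elaborations: list = field(default_factory=list)  # assumed meaning of E_i


def category_scores(thoughts):
    """S_k = sum of (1 + |E_i|) over thoughts i in category k (assumed reading)."""
    scores = {k: 0 for k in CATEGORIES}
    for t in thoughts:
        scores[t.category] += 1 + len(t.elaborations)
    return scores


def choose_target_category(thoughts, epsilon=0.2, rng=random):
    """Epsilon-greedy choice of the category the next prompt should target."""
    if rng.random() < epsilon:
        # Explore: pick any category uniformly at random.
        return rng.choice(CATEGORIES)
    # Exploit (assumption): target the least-covered category to broaden
    # reflection. Ties resolve to the first category in CATEGORIES order.
    scores = category_scores(thoughts)
    return min(CATEGORIES, key=lambda k: scores[k])


# Hypothetical conversation state, for illustration only.
thoughts = [
    Thought("The salary is higher", "external", ["covers rent", "pays off loan"]),
    Thought("I feel uneasy about moving", "internal"),
]
print(category_scores(thoughts))
print(choose_target_category(thoughts, epsilon=0.0))
```

With `epsilon=0.0` the choice is deterministic, which makes the scoring easy to inspect; a non-zero epsilon occasionally prompts for a category the scores would not have selected.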

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

[1] [1]

Phi-4 Technical Report

Marah I. Abdin, Jyoti Aneja, Harkirat Behl, Sébastien Bubeck, Ronen Eldan, Suriya Gunasekar, Michael Harrison, Russell J. Hewett, Mojan Javaheripi, Piero Kauffmann, James R. Lee, Yin Tat Lee, Yuanzhi Li, Weishung Liu, Caio C.T. Mendes, Anh Nguyen, Eric Price, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Xin Wang, Rachel Ward, Yue Wu, Dingli Y...

work page internal anchor Pith review Pith/arXiv arXiv 2024

[2] [2]

Annalena Aicher, Daniel Kornmüller, Yuki Matsuda, Stefan Ultes, Wolfgang Minker, and Keiichi Yasumoto. 2023. Towards breaking the self-imposed filter bubble in argumentative dialogues. (2023)

work page 2023

[3] [3]

Annalena Aicher, Wolfgang Minker, and Stefan Ultes. 2022. Towards modelling self-imposed filter bubbles in argumentative dialogue systems. (2022)

work page 2022

[4] [4]

Riku Arakawa and Hiromu Yakura. [n. d.]. Coaching Copilot: Blended Form of an LLM-Powered Chatbot and a Human Coach to Effectively Support Self-Reflection for Leadership Growth. InProceedings of the 6th ACM Conference on Conversational User Interfaces. 1–14

work page

[5] [5]

Ruben T Azevedo, Salvatore Maria Aglioti, and Bigna Lenggenhager. 2016. Participants’ above-chance recognition of own-heart sound combined with poor metacognitive awareness suggests implicit knowledge of own heart cardiodynamics. Scientific reports6, 1 (2016), 26545

work page 2016

[6] [6]

Max H Bazerman and Dolly Chugh. 2006. Bounded awareness: Focusing failures in negotiation. InNegotiation theory and research. Psychology Press, 7–26

work page 2006

[7] [7]

Antoine Bechara and Antonio R Damasio. 2005. The somatic marker hypothesis: A neural theory of economic decision.Games and economic behavior52, 2 (2005), 336–372

work page 2005

[8] [8]

Godfred O Boateng, Torsten B Neilands, Edward A Frongillo, Hugo R Melgar-Quiñonez, and Sera L Young. 2018. Best practices for developing and validating scales for health, social, and behavioral research: a primer.Frontiers in public health6 (2018), 149

work page 2018

[9] [9]

2025.Altman says Gen Z uses ChatGPT for life decisions, here’s why that’s both smart and risky

Becca Caddy. 2025.Altman says Gen Z uses ChatGPT for life decisions, here’s why that’s both smart and risky. TechRadar. https://www.techradar.com/computing/artificial-intelligence/altman-says-gen- z-uses-chatgpt-for-life-decisions-heres-why-thats-both-smart-and-risky Accessed: 2025-12-05

work page 2025

[10] [10]

Adrian R Camilleri. 2023. An investigation of big life decisions.Judgment and Decision Making18 (2023), e32

work page 2023

[11] [11]

Timothy A Carey and Richard J Mullan. 2004. What is Socratic questioning? Psychotherapy: theory, research, practice, training41, 3 (2004), 217

work page 2004

[12] [12]

Chun-Wei Chiang, Zhuoran Lu, Zhuoyan Li, and Ming Yin. 2024. Enhancing ai-assisted group decision making through llm-powered devil’s advocate. In Proceedings of the 29th International Conference on Intelligent User Interfaces. 103–119

work page 2024

[13] [13]

NAJ Cornelissen, RJM Van Eerdt, HK Schraffenberger, and Willem FG Haselager. 2022. Reflection machines: increasing meaningful human control over Decision Support Systems.Ethics and Information Technology24, 2 (2022), 19

work page 2022

[14] [14]

Patricia G Devine, Patrick S Forscher, Anthony J Austin, and William TL Cox. 2012. Long-term reduction in implicit race bias: A prejudice habit-breaking intervention. Journal of experimental social psychology48, 6 (2012), 1267–1278

work page 2012

[15] [15]

2006.Head, Heart and Guts: How the world’s best companies develop complete leaders

David L Dotlich, Peter C Cairo, and Stephen H Rhinesmith. 2006.Head, Heart and Guts: How the world’s best companies develop complete leaders. John Wiley & Sons

work page 2006

[16] [16]

Glyn Elwyn and Talya Miron-Shatz. 2010. Deliberation before determination: the definition and evaluation of good decision making.Health Expectations13, 2 (2010), 139–147

work page 2010

[17] [17]

Jonathan St B T Evans. 2008. Dual-processing accounts of reasoning, judgment, and social cognition.Annual Review of Psychology59 (2008), 255–278

work page 2008

[18] [18]

Franz Faul, Edgar Erdfelder, Axel Buchner, and Albert-Georg Lang. 2009. Statistical power analyses using G* Power 3.1: Tests for correlation and regression analyses. Behavior research methods41, 4 (2009), 1149–1160

work page 2009

[19] [19]

Franz Faul, Edgar Erdfelder, Albert-Georg Lang, and Axel Buchner. 2007. G* Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences.Behavior research methods39, 2 (2007), 175–191

[20]

Robert E Goodin and Simon J Niemeyer. 2003. When does deliberation begin? Internal reflection versus public discussion in deliberative democracy. Political Studies 51, 4 (2003), 627–649

[21]

Adam J Guastella and Mark R Dadds. 2006. Cognitive-behavioral models of emotional writing: A validation study. Cognitive Therapy and Research 30, 3 (2006), 397–414

[22]

Emmanuel Hadoux, Anthony Hunter, and Sylwia Polberg. 2023. Strategic argumentation dialogues for persuasion: Framework and experiments based on modelling the beliefs and concerns of the persuadee. Argument & Computation 14, 2 (2023), 109–161

[23]

Kate E Hamilton-West and Lyn Quine. 2007. Effects of written emotional disclosure on health outcomes in patients with ankylosing spondylitis. Psychology and Health 22, 6 (2007), 637–657

[24]

Hsieh-Hong Huang, Jack Shih-Chieh Hsu, and Cheng-Yuan Ku. 2012. Understanding the role of computer-mediated counter-argument in countering confirmation bias. Decision Support Systems 53, 3 (2012), 438–447

[25]

Bryan D Jones. 1999. Bounded rationality. Annual Review of Political Science 2, 1 (1999), 297–321

[26]

Daniel Kahneman and Shane Frederick. 2002. Representativeness revisited: Attribute substitution in intuitive judgment. In Heuristics and Biases: The Psychology of Intuitive Judgment, Thomas Gilovich, Dale Griffin, and Daniel Kahneman (Eds.). Cambridge University Press, 49–81

[27]

G Klein. 1998. Sources of Power: How People Make Decisions. MIT Press, Cambridge, MA

[28]

Philipp Koralus. 2025. The philosophic turn for AI agents: replacing centralized digital rhetoric with decentralized truth-seeking. Mind & Society (2025), 1–24

[29]

Russell F Korte. 2003. Biases in decision making and implications for human resource development. Advances in Developing Human Resources 5, 4 (2003), 440–457

[30]

Ivica Kostric, Krisztian Balog, and Ujwal Gadiraju. 2025. Should We Tailor the Talk? Understanding the Impact of Conversational Styles on Preference Elicitation in Conversational Recommender Systems. In Proceedings of the 33rd ACM Conference on User Modeling, Adaptation and Personalization. 164–173

[31]

Nguyen-Thinh Le and Laura Wartschinski. 2018. A cognitive assistant for improving human reasoning skills. International Journal of Human-Computer Studies 117 (2018), 45–54

[32]

David D. Lewis and William A. Gale. 1994. A sequential algorithm for training text classifiers. Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (1994), 3–12

[33]

Mary R Lynn. 1986. Determination and quantification of content validity. Nursing Research 35, 6 (1986), 382–386

[34]

Ine Mols, Elise Van den Hoven, and Berry Eggen. 2016. Informing design for reflection: An overview of current everyday practices. In Proceedings of the 9th Nordic Conference on Human-Computer Interaction. 1–10

[35]

Paul Norris and Seymour Epstein. 2011. An experiential thinking style: Its facets and relations with objective and subjective criterion measures. Journal of Personality 79, 5 (2011), 1043–1080

[36]

Soya Park and Chinmay Kulkarni. 2023. Thinking assistants: LLM-based conversational assistants that help users think by asking rather than answering. arXiv preprint arXiv:2312.06024 (2023)

[37]

Richard Paul and Linda Elder. 2007. Critical thinking: The art of Socratic questioning. Journal of Developmental Education 31, 1 (2007), 36

[38]

James W. Pennebaker, Ryan L. Boyd, Richard J. Booth, A. Ashokkumar, and M. E. Francis. 2022. Linguistic Inquiry and Word Count: LIWC-22. Pennebaker Conglomerates, Austin, TX. https://www.liwc.app

[39]

Leon Reicherts, Gun Woo Park, and Yvonne Rogers. 2022. Extending Chatbots to probe users: Enhancing complex decision-making through probing conversations. In Proceedings of the 4th Conference on Conversational User Interfaces. 1–10

[40]

Leon Reicherts, Zelun Tony Zhang, Elisabeth von Oswald, Yuanting Liu, Yvonne Rogers, and Mariam Hassib. 2025. AI, help me think—but for myself: Assisting people in complex decision-making by providing different kinds of cognitive support. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems. 1–19

[41]

Troy D Sadler and Dana L Zeidler. 2005. Patterns of informal reasoning in the context of socioscientific decision making. Journal of Research in Science Teaching: The Official Journal of the National Association for Research in Science Teaching 42, 1 (2005), 112–138

[42]

Lucrezia Savioni, Stefano Triberti, Ilaria Durosini, and Gabriella Pravettoni. 2023. How to make big decisions: A cross-sectional study on the decision making process in life choices. Current Psychology 42, 18 (2023), 15223–15236

[43]

Irene Scopelliti, Carey K Morewedge, Erin McCormick, H Lauren Min, Sophie Lebrecht, and Karim S Kassam. 2015. Bias blind spot: Structure, measurement, and consequences. Management Science 61, 10 (2015), 2468–2486

UMAP ’26, June 08–11, 2026, Gothenburg, Sweden. Tarvirdians et al.

[44]

Sarah Seraj, Kate G Blackburn, and James W Pennebaker. 2021. Language left behind on social media exposes the emotional and cognitive costs of a romantic breakup. Proceedings of the National Academy of Sciences 118, 7 (2021), e2017154118

[45]

Burr Settles. 2009. Active learning literature survey. Ph.D. Dissertation. University of Wisconsin-Madison

[46]

Li Shi, Houjiang Liu, Yian Wong, Utkarsh Mujumdar, Dan Zhang, Jacek Gwizdka, and Matthew Lease. 2024. Argumentative experience: Reducing confirmation bias on controversial issues through LLM-generated multi-persona debates. arXiv preprint arXiv:2412.04629 (2024)

[47]

Paul J Silvia. 2022. The self-reflection and insight scale: Applying item response theory to craft an efficient short form. Current Psychology 41, 12 (2022), 8635–8645

[48]

Steven A. Sloman. 1996. The empirical case for two systems of reasoning. Psychological Bulletin 119, 1 (1996), 3–22

[49]

Grant Soosalu, Suzanne Henwood, and Arun Deo. 2019. Head, heart, and gut in decision making: Development of a multiple brain preference questionnaire. Sage Open 9, 1 (2019), 2158244019837439

[50]

Richard S. Sutton and Andrew G. Barto. 2018. Reinforcement Learning: An Introduction. MIT Press

[51]

Barbara G. Tabachnick and Linda S. Fidell. 2019. Using Multivariate Statistics (7th ed.). Pearson

[52]

Morita Tarvirdians, Senthil Chandrasegaran, Hayley Hung, Catholijn M Jonker, and Catharine Oertel. 2025. Reflection Before Action: Designing a Framework for Quantifying Thought Patterns for Increased Self-awareness in Personal Decision Making. arXiv preprint arXiv:2510.04364 (2025)

[53]

Amos Tversky and Daniel Kahneman. 1974. Judgment under Uncertainty: Heuristics and Biases: Biases in judgments reveal some heuristics of thinking under uncertainty. Science 185, 4157 (1974), 1124–1131

[54]

Christopher J. C. H. Watkins and Peter Dayan. 1992. Q-learning. Machine Learning 8, 3-4 (1992), 279–292

[55]

Klaus Weber, Annalena Aicher, Wolfgang Minker, Stefan Ultes, and Elisabeth André. 2023. Fostering user engagement in the critical reflection of arguments. arXiv preprint arXiv:2308.09061 (2023)

[57]

Yu Zhang, Jingwei Sun, Li Feng, Cen Yao, Mingming Fan, Liuxin Zhang, Qianying Wang, Xin Geng, and Yong Rui. 2024. See widely, think wisely: Toward designing a generative multi-agent system to burst filter bubbles. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems. 1–24

[58]

Zelun Tony Zhang and Leon Reicherts. 2025. Augmenting Human Cognition With Generative AI: Lessons From AI-Assisted Decision-Making. arXiv preprint arXiv:2504.03207 (2025)
