
arxiv: 2606.21886 · v1 · pith:N6GKTGYKnew · submitted 2026-06-20 · 💻 cs.HC · cs.AI · cs.CL · cs.CY

AI-Mediated Negotiation: Design Reflections and Lessons

Pith reviewed 2026-06-26 12:09 UTC · model grok-4.3

classification 💻 cs.HC · cs.AI · cs.CL · cs.CY
keywords conversational AI · negotiation preparation · coaching system · user study · empowerment · usability · design guidelines · recursive tasks

The pith

Conversational AI imposes a linear model on negotiation preparation, which is fundamentally recursive, so a static handbook outperformed both AI conditions on empowerment and usability.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper built Trucey, a theory-driven conversational AI coaching system for high-stakes workplace negotiations, encoding four assumptions about how AI should help: articulation supports clarification, personalization builds strategic competence, chunked delivery reduces cognitive load, and structured scaffolding removes metacognitive burden. A pre-registered experiment with 267 participants and interviews with 15 showed that a static handbook control outperformed both AI versions on empowerment and usability. The authors conclude that each assumption encoded a linear model of preparation, but negotiation preparation is recursive and iterative. They identify an unexamined scope condition on HAI design guidelines and propose a sequencing principle for future AI coaching: map before path, path before simulation.

Core claim

Conversational AI for negotiation coaching imposes a linear execution model on a task that is fundamentally recursive, as shown by the static handbook outperforming the AI conditions on empowerment and usability in the experiment.

What carries the argument

The Trucey system and its four encoded assumptions about AI support, tested against a passive static handbook control whose stronger results exposed the linear-recursive mismatch.

Load-bearing premise

The four assumptions about how preparation unfolds and the chosen measures of empowerment and usability adequately capture the recursive nature of negotiation preparation.

What would settle it

An experiment comparing the handbook to an AI system redesigned with explicit support for revisiting and revising earlier steps at any time, measuring whether the AI system's empowerment and usability scores then exceed those of the handbook.
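
The redesign this test calls for can be pictured with a small sketch. The Python below is illustrative only and is not Trucey's implementation: the stage names are taken from the paper's sequencing principle, while the class names, the `submit` method, and the `stale` set are hypothetical. It contrasts a forward-only session with one that lets a user reopen an earlier stage and flags later work for review.

```python
# Illustrative sketch only; not the Trucey implementation.
# Stage order follows the paper's sequencing principle.
STAGES = ["map", "path", "simulation"]


class LinearSession:
    """Forward-only flow: a completed stage cannot be reopened."""

    def __init__(self):
        self.notes = {}
        self.position = 0  # index of the only stage currently open

    def submit(self, stage, text):
        if self.position >= len(STAGES) or stage != STAGES[self.position]:
            raise ValueError(f"stage {stage!r} is not open")
        self.notes[stage] = text
        self.position += 1


class RevisitableSession:
    """Hypothetical recursive flow: earlier stages stay editable."""

    def __init__(self):
        self.notes = {}
        self.stale = set()  # later stages that may need rework (assumption)

    def submit(self, stage, text):
        idx = STAGES.index(stage)
        # Keep the sequencing constraint: earlier stages must exist first.
        missing = [s for s in STAGES[:idx] if s not in self.notes]
        if missing:
            raise ValueError(f"complete {missing} before {stage!r}")
        self.notes[stage] = text
        self.stale.discard(stage)
        # Revising a stage marks already-completed later stages for review.
        self.stale.update(s for s in STAGES[idx + 1:] if s in self.notes)
```

In the linear session, calling `submit("map", ...)` a second time raises an error once the session has moved on. The revisitable session accepts the revision and marks the existing `path` notes as stale, which is one possible way to support the backtracking the paper describes.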

Original abstract

Conversational AI promises a new kind of preparation for high-stakes workplace negotiations -- personalized, interactive, and capable of simulating realistic resistance. That promise is intuitive. We built Trucey, a theory-driven coaching system, to test it. The system encoded four assumptions: that articulation supports clarification, that personalization builds strategic competence, that chunked delivery reduces cognitive load, and that structured scaffolding removes metacognitive burden. A pre-registered experiment (N=267) and interviews (N=15) complicated each of them. Notably, the static handbook we included as a passive control outperformed both AI conditions on empowerment and usability. We reflect on why: each assumption encoded a specific model of how preparation unfolds, and the findings revealed that conversational AI imposes a linear execution model on a task that is fundamentally recursive. We identify an unexamined scope condition on established HAI design guidelines and close with a sequencing principle -- map before path, path before simulation -- for future AI coaching design.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript reports the design of Trucey, a conversational AI coaching system for workplace negotiation preparation that encodes four assumptions (articulation supports clarification, personalization builds strategic competence, chunked delivery reduces cognitive load, and structured scaffolding removes metacognitive burden). A pre-registered experiment (N=267) and qualitative interviews (N=15) found that a static handbook control outperformed both AI conditions on empowerment and usability scales; the authors interpret this as evidence that conversational AI imposes a linear execution model on a task that is fundamentally recursive and propose a sequencing principle ('map before path, path before simulation') for future AI coaching systems.

Significance. If the interpretation holds, the work supplies empirical evidence of a scope condition on established HAI design guidelines for coaching and offers a concrete, testable sequencing principle. The pre-registered experiment with a passive control condition and the mixed-methods design are clear strengths that allow the central claim to be evaluated against data rather than theory alone.

major comments (3)
  1. [Discussion; §4.3 Results] The claim that handbook outperformance demonstrates imposition of a linear model on a recursive task is not directly supported by any manipulation check or measure of perceived linearity, iteration, or backtracking; the empowerment and usability instruments could reflect content fidelity, interaction friction, or implementation differences instead.
  2. [Methods, qualitative analysis subsection] The N=15 interviews are described as complicating the four assumptions, yet no coding scheme, inter-rater reliability, or explicit extraction of recursion/backtracking evidence is reported, leaving the post-hoc interpretation without a traceable empirical anchor.
  3. [Experiment design] Equivalence of substantive content across the handbook and the two AI conditions is not demonstrated (e.g., via expert review or content analysis), so differences in outcomes cannot be attributed solely to interaction style versus information quality.
minor comments (2)
  1. [Introduction] The four assumptions are listed in the abstract and introduction but never restated with their exact operationalizations in the methods; adding a table that maps each assumption to its implemented feature and chosen measure would improve traceability.
  2. [Results] Figure captions and axis labels for the empowerment/usability plots should explicitly state the scale ranges and whether higher scores indicate better outcomes.

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback, which helps clarify the scope and limitations of our claims. We address each major comment below, proposing revisions to strengthen the manuscript where the concerns are valid.

Point-by-point responses
  1. Referee: [Discussion; §4.3 Results] The claim that handbook outperformance demonstrates imposition of a linear model on a recursive task is not directly supported by any manipulation check or measure of perceived linearity, iteration, or backtracking; the empowerment and usability instruments could reflect content fidelity, interaction friction, or implementation differences instead.

    Authors: We agree that the linear-versus-recursive interpretation is inferential, drawn from the pattern of quantitative results combined with interview themes rather than a dedicated manipulation check. The pre-registered scales captured downstream effects on empowerment and usability, while interviews surfaced participant descriptions of iterative preparation that the AI flow did not support. We will revise the discussion section to present this interpretation explicitly as a post-hoc hypothesis, enumerate alternative explanations (including content fidelity and interaction friction), and qualify the claim accordingly without changing the reported findings.
    revision: partial

  2. Referee: [Methods, qualitative analysis subsection] The N=15 interviews are described as complicating the four assumptions, yet no coding scheme, inter-rater reliability, or explicit extraction of recursion/backtracking evidence is reported, leaving the post-hoc interpretation without a traceable empirical anchor.

    Authors: The qualitative analysis was conducted inductively to interpret the quantitative pattern. We will expand the methods subsection to describe the analytic process in greater detail, including how transcripts were reviewed for themes related to the four assumptions and how evidence of recursive or backtracking behavior was identified. Because no formal coding scheme or inter-rater reliability assessment was performed, we will note this as a limitation of the exploratory component.
    revision: yes

  3. Referee: [Experiment design] Equivalence of substantive content across the handbook and the two AI conditions is not demonstrated (e.g., via expert review or content analysis), so differences in outcomes cannot be attributed solely to interaction style versus information quality.

    Authors: The handbook was derived from the same negotiation principles and scenario set used to populate the AI knowledge base and prompts. We will add an explicit statement in the methods section describing this shared foundation. We acknowledge, however, that the absence of a formal content analysis or expert review means we cannot fully exclude differences in information quality as a contributing factor; this will be noted as a study limitation.
    revision: partial

Circularity Check

0 steps flagged

Empirical user study with no derivation chain or self-referential reduction

Full rationale

The paper reports a pre-registered experiment (N=267) and interviews (N=15) testing four design assumptions encoded in Trucey, then offers post-hoc reflections on why the static handbook outperformed the AI conditions. No mathematical derivations, equations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided text. The central interpretive claim (conversational AI imposes a linear model on a recursive task) is a qualitative reflection on empirical outcomes rather than a step that reduces to its own inputs by construction. The study is self-contained as an empirical report; external benchmarks or falsifiability are not required for a non-circularity finding here.

Axiom & Free-Parameter Ledger

0 free parameters · 4 axioms · 0 invented entities

The paper is an empirical design study; the four assumptions tested function as domain assumptions that the system encoded and the experiment evaluated. No free parameters or invented entities are evident from the abstract.

axioms (4)
  • domain assumption: articulation supports clarification
    One of the four assumptions encoded in Trucey's design and tested in the experiment.
  • domain assumption: personalization builds strategic competence
    One of the four assumptions encoded in Trucey's design and tested in the experiment.
  • domain assumption: chunked delivery reduces cognitive load
    One of the four assumptions encoded in Trucey's design and tested in the experiment.
  • domain assumption: structured scaffolding removes metacognitive burden
    One of the four assumptions encoded in Trucey's design and tested in the experiment.

pith-pipeline@v0.9.1-grok · 5722 in / 1148 out tokens · 20722 ms · 2026-06-26T12:09:20.641617+00:00 · methodology

