
arxiv: 1907.01923 · v1 · pith:XZ4WGMDWnew · submitted 2019-07-03 · 💻 cs.HC

A Need for Trust in Conversational Interface Research

Pith reviewed 2026-05-25 09:56 UTC · model grok-4.3

classification 💻 cs.HC
keywords trust, conversational interfaces, social robots, embodied agents, conversational assistants, human-computer interaction, measurement

The pith

Trust is critical yet inconsistently defined and measured across conversational interface research.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how trust has been treated in research on social robots, embodied agents, and conversational assistants. It finds that while users consistently flag trust as important, the field has not settled on a shared understanding of the concept or reliable ways to measure it. The authors review various dimensions of trust and existing measurement approaches with the goal of encouraging more cross-field discussion. This matters because without clearer concepts, it is difficult to compare results or design systems that reliably earn user trust.

Core claim

Across several branches of conversational interaction research, including interactions with social robots, embodied agents, and conversational assistants, users have identified trust as a critical part of those interactions. Nevertheless, there is little agreement on what trust means within these sorts of interactions or how it can be measured. The paper explores some of the dimensions of trust as understood in previous work and outlines some of the ways trust has been measured, in the hope of furthering discussion of the concept across the field.

What carries the argument

The review and comparison of trust dimensions and measurement techniques drawn from prior studies on robots, agents, and assistants.

If this is right

  • Shared definitions would let researchers directly compare trust findings from robot studies with those from agent and assistant studies.
  • Agreed-upon measures could improve how conversational systems are evaluated for their ability to build trust.
  • Greater consensus on trust might guide the design of interfaces that more consistently earn user confidence.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If trust differs by interaction type, context-specific scales for robots versus assistants may work better than one universal framework.
  • Future tests could check whether using a common trust measure alters which features designers prioritize in new interfaces.
  • Linking this review to psychological models of trust might clarify whether conversational trust is a distinct type of relationship.

Load-bearing premise

The lack of agreement on trust definitions and measures is mainly a barrier to progress rather than a sign that trust means genuinely different things in each context.

What would settle it

An empirical study showing that trust in social robot interactions and trust in text-based conversational assistant interactions are unrelated constructs with no shared ability to predict user behavior or preferences.

Original abstract

Across several branches of conversational interaction research including interactions with social robots, embodied agents, and conversational assistants, users have identified trust as a critical part of those interactions. Nevertheless, there is little agreement on what trust means within these sort of interactions or how trust can be measured. In this paper, we explore some of the dimensions of trust as it has been understood in previous work and we outline some of the ways trust has been measured in the hopes of furthering discussion of the concept across the field.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript is a position paper claiming that users identify trust as a critical factor in interactions with social robots, embodied agents, and conversational assistants, yet the field has little agreement on what trust means in these contexts or how it should be measured. The paper reviews dimensions of trust from prior work and outlines measurement approaches with the goal of stimulating discussion in the field.

Significance. If the central observation of fragmented understanding of trust holds, the paper could play a useful role in prompting the conversational interfaces community to develop more shared definitions and metrics, potentially improving comparability of studies across sub-areas like robotics and virtual agents.

major comments (2)
  1. Abstract: The assertion that 'there is little agreement on what trust means within these sort of interactions or how trust can be measured' is presented without specific examples of conflicting definitions or measures from the literature, which is load-bearing for the paper's motivation to survey dimensions and methods.
  2. Introduction (or equivalent section): The paper does not address whether observed differences in trust conceptualizations reflect genuinely distinct phenomena across robot, agent, and assistant contexts rather than a lack of consensus that requires resolution.
minor comments (2)
  1. The abstract could benefit from one or two concrete citations illustrating divergent trust definitions to ground the claim of disagreement.
  2. The manuscript would be strengthened by a brief concluding section that proposes next steps for the community rather than ending solely on the invitation to discuss.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our position paper. We address each major comment below and indicate planned revisions where appropriate.

  1. Referee: Abstract: The assertion that 'there is little agreement on what trust means within these sort of interactions or how trust can be measured' is presented without specific examples of conflicting definitions or measures from the literature, which is load-bearing for the paper's motivation to survey dimensions and methods.

    Authors: We agree the abstract claim would be stronger with immediate grounding. The manuscript body reviews multiple dimensions and measurement approaches drawn from prior work across robots, agents, and assistants, which collectively illustrate the variation. In revision we will expand the introduction to include two or three brief, concrete examples of conflicting definitions and measures (e.g., differing emphasis on competence versus benevolence, or questionnaire versus behavioral metrics) so the motivation is explicitly supported before the survey sections.

    Revision: yes

  2. Referee: Introduction (or equivalent section): The paper does not address whether observed differences in trust conceptualizations reflect genuinely distinct phenomena across robot, agent, and assistant contexts rather than a lack of consensus that requires resolution.

    Authors: The paper frames the observed fragmentation as motivation for cross-field discussion rather than asserting that all differences must be resolved into a single consensus. We acknowledge the referee's point that some differences may legitimately reflect context-specific phenomena. In the revised introduction we will add a short paragraph explicitly noting this alternative explanation and positioning the call for discussion as a means to determine whether shared metrics, context-tailored approaches, or both are warranted.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

The paper is a qualitative position statement surveying dimensions and measurements of trust across conversational interaction research. It advances no derivations, equations, predictions, or fitted quantities. The core claim of limited agreement on trust definitions is presented as an observation drawn from external literature rather than derived from any internal construction or self-citation chain. No load-bearing steps reduce to inputs by definition or renaming.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Position paper without formal models, data fitting, or technical derivations; no free parameters, axioms, or invented entities are introduced.

pith-pipeline@v0.9.0 · 5595 in / 953 out tokens · 22429 ms · 2026-05-25T09:56:42.988331+00:00 · methodology


Reference graph

Works this paper leans on

30 extracted references · 30 canonical work pages

  1. [1]

    Oya Celiktutan and Hatice Gunes. 2017. Automatic Prediction of Impressions in Time and across Varying Context: Personality, Attractiveness and Likeability. IEEE Transactions on Affective Computing 8, 1 (Jan. 2017), 29–42. https://doi.org/10.1109/TAFFC.2015.2513401

  2. [2]

    Leigh Clark, Phillip Doyle, Diego Garaialde, Emer Gilmartin, Stephan Schlögl, Jens Edlund, Matthew Aylett, João Cabral, Cosmin Munteanu, and Benjamin Cowan. 2018. The State of Speech in HCI: Trends, Themes and Challenges. arXiv:1810.06828 [cs] (Oct. 2018). http://arxiv.org/abs/1810.06828

  3. [3]

    Leigh Clark, Cosmin Munteanu, Vincent Wade, Benjamin R. Cowan, Nadia Pantidi, Orla Cooney, Philip Doyle, Diego Garaialde, Justin Edwards, Brendan Spillane, Emer Gilmartin, and Christine Murad. 2019. What Makes a Good Conversation?: Challenges in Designing Truly Conversational Agents. In Proceedings of the 2019 CHI Conference on Human Factors in Com...

  4. [4]

    Benjamin R Cowan, Nadia Pantidi, David Coyle, Kellie Morrissey, Peter Clarke, Sara Al-Shehri, David Earley, and Natasha Bandeira. 2017. What can i help you with?: infrequent users’ experiences of intelligent personal assistants. In Proceedings of the 19th International Conference on Human-Computer Interaction with Mobile Devices and Services. ACM, 43

  5. [5]

    Ewart J de Visser, Samuel S Monfort, Ryan McKendrick, Melissa AB Smith, Patrick E McKnight, Frank Krueger, and Raja Parasuraman. 2016. Almost human: Anthropomorphism increases trust resilience in cognitive agents. Journal of Experimental Psychology: Applied 22, 3 (2016), 331

  6. [6]

    Florian N Egger. 2000. Trust me, I’m an online vendor: towards a model of trust for e-commerce system design. In CHI’00 extended abstracts on Human factors in computing systems. ACM, 101–102

  7. [7]

    Andrew J. Flanagin and Miriam J. Metzger. 2007. The role of site features, user attributes, and information verification behaviors on the perceived credibility of web-based information. New Media & Society 9, 2 (April 2007), 319–342. https://doi.org/10.1177/1461444807075015

  8. [8]

    BJ Fogg and Hsiang Tseng. 1999. The elements of computer credibility. In Proceedings of the SIGCHI conference on Human Factors in Computing Systems. ACM, 80–87

  9. [9]

    Amos Freedy, Ewart DeVisser, Gershon Weltman, and Nicole Coeyman. 2007. Measurement of trust in human-robot collaboration. In 2007 International Symposium on Collaborative Technologies and Systems. IEEE, 106–114

  10. [10]

    Kerstin Sophie Haring, David Silvera-Tawil, Yoshio Matsumoto, Mari Velonaki, and Katsumi Watanabe. 2014. Perception of an android robot in Japan and Australia: A cross-cultural comparison. In International conference on social robotics. Springer, 166–175

  11. [11]

    Kevin Anthony Hoff and Masooda Bashir. 2015. Trust in automation: Integrating empirical evidence on factors that influence trust. Human Factors 57, 3 (2015), 407–434

  12. [12]

    Oliver P John, Sanjay Srivastava, and others. 1999. The Big Five trait taxonomy: History, measurement, and theoretical perspectives. Handbook of personality: Theory and research 2, 1999 (1999), 102–138

  13. [13]

    Spiro Kiousis. 2001. Public Trust or Mistrust? Perceptions of Media Credibility in the Information Age. Mass Communication and Society 4, 4 (Nov. 2001), 381–403. https://doi.org/10.1207/S15327825MCS0404_4

  14. [14]

    John D Lee and Katrina A See. 2004. Trust in automation: Designing for appropriate reliance. Human factors 46, 1 (2004), 50–80

  15. [15]

    Jin Joo Lee, Brad Knox, Jolie Baumann, Cynthia Breazeal, and David DeSteno

  16. [16]

    Computationally modeling interpersonal trust. Frontiers in psychology 4 (2013), 893

  17. [17]

    Ewa Luger and Abigail Sellen. 2016. "Like Having a Really Bad PA": The Gulf between User Expectation and Experience of Conversational Agents. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems - CHI ’16. ACM Press, Santa Clara, California, USA, 5286–5297. https://doi.org/10.1145/2858036.2858288

  18. [18]

    D Harrison McKnight, Vivek Choudhury, and Charles Kacmar. 2002. The impact of initial consumer trust on intentions to transact with a web site: a trust building model. The journal of strategic information systems 11, 3-4 (2002), 297–323. A Need for Trust in Conversational Interface Research CUI 2019, August 22–23, 2019, Dublin, Ireland

  19. [19]

    Panagiotis Mitkidis, John J McGraw, Andreas Roepstorff, and Sebastian Wallot

  20. [20]

    Building trust: Heart rate synchrony and arousal during joint action increased by public goods game. Physiology & behavior 149 (2015), 101–106

  21. [21]

    Christie Olson and Kelli Kemery. 2019. 2019 Voice report: Consumer adoption of voice technology and digital assistants. Technical Report. Microsoft

  22. [22]

    Jens Riegelsberger, M Angela Sasse, and John D McCarthy. 2003. Shiny happy people building trust?: photos on e-commerce websites and consumer trust. In Proceedings of the SIGCHI conference on Human factors in computing systems. ACM, 121–128

  23. [23]

    Denise M Rousseau, Sim B Sitkin, Ronald S Burt, and Colin Camerer. 1998. Not so different after all: A cross-discipline view of trust. Academy of management review 23, 3 (1998), 393–404

  24. [24]

    Maha Salem, Gabriella Lakatos, Farshid Amirabdollahian, and Kerstin Dautenhahn. 2015. Would you trust a (faulty) robot?: Effects of error, task type and personality on human-robot cooperation and trust. In Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction. ACM, 141–148

  25. [25]

    Lauren E. Scissors, Alastair J. Gill, Kathleen Geraghty, and Darren Gergle. 2009. In CMC we trust: the role of similarity. In Proceedings of the 27th international conference on Human factors in computing systems - CHI 09. ACM Press, Boston, MA, USA, 527. https://doi.org/10.1145/1518701.1518783

  26. [26]

    Elaine Short, Justin Hart, Michelle Vu, and Brian Scassellati. 2010. No fair an interaction with a cheating robot. In 2010 5th ACM/IEEE International Conference on Human-Robot Interaction (HRI). IEEE, 219–226

  27. [27]

    Ilaria Torre, Leigh Clark, and Benjamin R Cowan. 2018. Measuring and designing trust in Human-Agent Interaction

  28. [28]

    Ilaria Torre, Jeremy Goslin, Laurence White, and Debora Zanatto. 2018. Trust in artificial voices: A congruency effect of first impressions and behavioural experience. In Proceedings of the Technology, Mind, and Society. ACM, 40

  29. [29]

    Lin Wang, Pei-Luen Patrick Rau, Vanessa Evers, Benjamin Krisper Robinson, and Pamela Hinds. 2010. When in Rome: the role of culture & context in adherence to robot recommendations. In Proceedings of the 5th ACM/IEEE international conference on Human-robot interaction. IEEE Press, 359–366

  30. [30]

    James E Young, Richard Hawkins, Ehud Sharlin, and Takeo Igarashi. 2009. Toward acceptable domestic robots: Applying insights from social psychology. International Journal of Social Robotics 1, 1 (2009), 95