How Much Trust is Enough? Towards Calibrating Trust in Technology
Pith reviewed 2026-05-10 18:48 UTC · model grok-4.3
The pith
The Human-Computer Trust Scale offers an initial measure of trust propensity but requires contextual interpretation to be useful.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper's central claim is that the HCTS is a promising tool for an initial evaluation of propensity to trust, but such an assessment requires reflection and interpretation that should be considered within the context of the interaction. The authors present the process used to develop a guideline for interpreting the instrument's results and explain the rationale for their decisions, advocating for calibrating trust in technology within HCI.
What carries the argument
The Human-Computer Trust Scale (HCTS) as a tool for initial trust propensity evaluation, supported by newly developed interpretation guidelines derived from empirical data.
If this is right
- Designers can use HCTS scores early to adjust system transparency and reduce mismatched expectations.
- Users can apply the guidelines to reflect on their trust tendencies before committing to new technologies.
- HCI researchers obtain a structured method for turning abstract trust concepts into actionable assessments.
- Trust calibration shifts from an ideal to a practical process integrated into system development.
Where Pith is reading between the lines
- These guidelines could be adapted for high-stakes domains such as medical or autonomous systems to mitigate risks from misplaced trust.
- The focus on context implies that static trust metrics alone may fall short for rapidly evolving technologies.
- Training users to apply the interpretation process might increase long-term adoption rates of complex systems.
Load-bearing premise
That the empirical study provides sufficient evidence to create reliable, generalizable interpretation guidelines for the HCTS without further validation or details on study methods, sample, or limitations.
What would settle it
A replication study that applies the developed HCTS interpretation guidelines in a different interaction context and finds that the resulting trust level predictions do not align with observed user behaviors or system outcomes.
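One way such a replication could quantify "alignment" is a chance-corrected agreement statistic between the trust category the guideline assigns and the category inferred from observed behavior. The sketch below uses Cohen's kappa as an illustrative choice; the paper does not name a statistic. The category labels and the data are hypothetical.

```python
from collections import Counter


def cohens_kappa(predicted, observed):
    """Chance-corrected agreement between two equal-length lists of labels."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    n = len(predicted)
    # Observed agreement: share of participants with matching labels.
    p_o = sum(p == o for p, o in zip(predicted, observed)) / n
    # Expected agreement if the two labelings were independent.
    pred_counts = Counter(predicted)
    obs_counts = Counter(observed)
    p_e = sum(pred_counts[c] * obs_counts[c] for c in pred_counts) / (n * n)
    if p_e == 1:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical data: guideline-assigned categories vs. behavior-derived ones.
guideline_labels = ["low", "high", "low", "high"]
behavior_labels = ["low", "high", "high", "low"]
kappa = cohens_kappa(guideline_labels, behavior_labels)
```

A kappa near 0 would indicate agreement no better than chance, which is the kind of result that would count against the guidelines; values near 1 would support them.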
Original abstract
The role of trust within Human-Computer Interaction is being redefined. With the increasing omnipresence, autonomy, and opacity of technology, users often struggle to understand the capabilities and limitations of systems. In this article, we present the results of an empirical study designed to provide a practical, evidence-based interpretation of trust propensity assessment using the Human-Computer Trust Scale (HCTS). We outline the process used to develop a guideline for interpreting the instrument's results and explain the rationale for our decisions, advocating for calibrating trust in technology within HCI. Our findings demonstrate that the HCTS is a promising tool for conducting an initial evaluation of propensity to trust, but that such an assessment requires reflection and interpretation that should be considered within the context of the interaction.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents results from an empirical study intended to yield a practical, evidence-based guideline for interpreting scores on the Human-Computer Trust Scale (HCTS) as a measure of users' propensity to trust technology. The authors describe the guideline-development process, rationalize their decisions, and conclude that the HCTS is a promising instrument for initial propensity assessment provided that interpretation remains contextual and reflective.
Significance. If the underlying empirical evidence is shown to be robust, the work would offer a timely contribution to HCI by supplying a concrete tool for calibrating trust in increasingly autonomous and opaque systems. The explicit caveat that assessment requires contextual reflection is a constructive strength that guards against over-interpretation. Transparent reporting of the decision rationale during guideline construction is also a positive feature.
Major comments (2)
- §3 (Empirical Study / Methods): The manuscript provides no information on participant count, demographics, recruitment, task design, or statistical procedures used to derive the interpretation guidelines. These omissions make it impossible to evaluate whether the guidelines are supported by adequate evidence or influenced by post-hoc choices.
- §4 (Results / Guideline Presentation): No quantitative findings, tables, or validation metrics (e.g., reliability coefficients, inter-rater agreement, or cross-validation results) are reported to justify the specific score thresholds or interpretive categories in the proposed guideline.
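To illustrate what a reliability coefficient of the kind requested in the second comment looks like in practice, here is a minimal sketch of Cronbach's alpha for a multi-item Likert scale. Cronbach's alpha is only one common choice and is an assumption here; the manuscript does not say which coefficient the authors computed. The response matrix is hypothetical and is not data from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of item scores (one row per participant)."""
    if len(responses) < 2:
        raise ValueError("need at least two participants")
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("need at least two items and rows of equal length")
    # Sample variance of each item (column) across participants.
    item_vars = [variance(col) for col in zip(*responses)]
    # Sample variance of each participant's summed score.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-point Likert answers: 5 participants, 3 items.
hypothetical_responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 3],
    [1, 2, 1],
]
alpha = cronbach_alpha(hypothetical_responses)
```

Reporting a value like this alongside the score thresholds would let readers judge whether the scale's items behave consistently in the study sample.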
Minor comments (2)
- [Abstract] The abstract could include a brief statement of sample size and primary quantitative outcomes to help readers gauge the strength of the claims at first reading.
- [Discussion] The limitations paragraph should explicitly discuss the scope of generalizability given the (currently unreported) participant pool and study context.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback. The comments identify key omissions in the description of our empirical study that must be addressed to allow proper evaluation of the work. We respond to each major comment below and will revise the manuscript to incorporate the requested details.
- Referee: §3 (Empirical Study / Methods): The manuscript provides no information on participant count, demographics, recruitment, task design, or statistical procedures used to derive the interpretation guidelines. These omissions make it impossible to evaluate whether the guidelines are supported by adequate evidence or influenced by post-hoc choices.
  Authors: We acknowledge that the methods section omits these critical details. This was an oversight during manuscript preparation. In the revised version we will expand §3 to report the exact participant count, full demographics, recruitment procedures, task design, and the statistical analyses used to derive the interpretation guidelines. These additions will make the empirical foundation transparent and allow readers to assess whether the guidelines rest on adequate evidence.
  Revision: yes
- Referee: §4 (Results / Guideline Presentation): No quantitative findings, tables, or validation metrics (e.g., reliability coefficients, inter-rater agreement, or cross-validation results) are reported to justify the specific score thresholds or interpretive categories in the proposed guideline.
  Authors: We agree that the results section currently lacks quantitative findings, tables, and validation metrics. The submitted manuscript focused on the guideline and its rationale but did not present the supporting data. We will revise §4 to include the relevant quantitative results, tables summarizing participant responses, reliability coefficients, and any validation metrics (such as inter-rater agreement or cross-validation) that justify the chosen thresholds and categories.
  Revision: yes
Circularity Check
No circularity: empirical guideline development does not reduce to self-referential inputs
The paper presents results from an empirical study to develop an interpretation guideline for the Human-Computer Trust Scale (HCTS). No mathematical derivations, equations, parameter fitting, or predictive models are described that could reduce by construction to the study's own inputs. The central claim rests on the outcomes of the empirical process itself, which is presented as external evidence rather than a self-defining loop. Self-citations are not invoked as load-bearing uniqueness theorems or ansatzes. The work is self-contained as a descriptive empirical contribution without the circular patterns enumerated in the analysis criteria.
From the paper's Appendix A, "How to use the HCTS and interpret the results":
The Human Computer Trust scale (HCTS) is a quick-and-dirty psychometric scale that assesses individuals' predisposition to trust a technological artifact. This instrument can be used for e...