
arxiv: 2606.28090 · v1 · pith:5HFEFYNYnew · submitted 2026-06-26 · 💻 cs.HC

Typing Behavior in Human-LLM Interaction: Keystroke Dynamics Reveal Cognitive Effort During Prompting

Pith reviewed 2026-06-29 02:24 UTC · model grok-4.3

classification 💻 cs.HC
keywords keystroke dynamics · cognitive effort · human-LLM interaction · prompting behavior · typing patterns · mental workload · user study · NASA-TLX

The pith

Harder LLM prompting tasks increase keystrokes, slow typing, and add pauses as signs of cognitive effort.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper tests whether keystroke dynamics during prompt entry can measure the mental effort users expend when interacting with large language models. In a controlled study, participants faced easy and hard tasks on desktop or mobile devices, with typing behavior logged and workload assessed via NASA-TLX. Hard tasks produced more keystrokes, slower typing rates, more pauses between keys, and higher workload scores. Effects from device type were smaller, mainly slight reductions in input length and speed on mobile. Keystroke data distinguished effort levels but showed no link to how useful participants rated the LLM outputs.

Core claim

The central finding is that task difficulty in LLM prompting reliably alters typing behavior: hard tasks led to significantly more keystrokes, slower typing, increased pauses, and higher self-reported workload. Device type exerted weaker effects, with mobile use slightly reducing input length and typing speed. Keystroke dynamics captured differences in cognitive effort but failed to predict perceived usefulness of the LLM's responses.

What carries the argument

Keystroke dynamics (count, speed, and pause patterns) as real-time behavioral measures of cognitive effort in prompt formulation.
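
As a rough illustration of what these three measures could look like in practice, the sketch below derives a keystroke count, a typing rate, and a pause count from a list of key-down timestamps. Everything in it is an assumption made for illustration: the names `KeystrokeSummary` and `summarize_keystrokes`, the definition of rate as keystrokes divided by the time between first and last key, and the 2-second pause threshold. The paper's own feature definitions and thresholds are not given in the material summarized here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class KeystrokeSummary:
    keystroke_count: int
    keys_per_second: float
    pause_count: int


def summarize_keystrokes(
    timestamps_s: Sequence[float],
    pause_threshold_s: float = 2.0,  # hypothetical threshold, not taken from the paper
) -> KeystrokeSummary:
    """Summarize one prompt's key-down timestamps (seconds, ascending order)."""
    count = len(timestamps_s)
    if count < 2:
        # With fewer than two keystrokes there is no interval to measure.
        return KeystrokeSummary(count, 0.0, 0)

    # Gaps between consecutive keystrokes.
    gaps = [later - earlier for earlier, later in zip(timestamps_s, timestamps_s[1:])]

    # Assumed rate definition: keystrokes over the span from first to last key.
    duration = timestamps_s[-1] - timestamps_s[0]
    rate = count / duration if duration > 0 else 0.0

    # A gap at or above the threshold counts as one pause.
    pauses = sum(1 for gap in gaps if gap >= pause_threshold_s)
    return KeystrokeSummary(count, rate, pauses)


if __name__ == "__main__":
    # Illustrative timestamps only, not study data.
    print(summarize_keystrokes([0.0, 0.5, 1.0, 4.0, 4.5]))
```

Under these assumptions, a harder task would be expected to show a larger count, a lower rate, and more pauses than an easier one for the same participant.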

If this is right

  • Task difficulty drives measurable changes in typing effort and workload reports.
  • Mobile devices produce modestly shorter and slower inputs compared to desktops.
  • Keystroke features indicate cognitive effort levels during interaction.
  • These measures do not relate to users' judgments of LLM output usefulness.
  • Keystroke dynamics offer potential as indicators of mental demand in human-LLM collaboration.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Interfaces could monitor typing in real time to detect high effort and offer assistance or simpler prompts.
  • The method might apply to other text-based AI tools to gauge user strain without explicit surveys.
  • Controlling for individual typing habits in future experiments could strengthen the link to effort.
  • Combining keystroke data with other signals like mouse movements may improve prediction of collaboration outcomes.

Load-bearing premise

Observed differences in keystroke counts, speeds, and pauses stem mainly from cognitive effort induced by task difficulty rather than from task wording, personal habits, or other unmeasured variables.

What would settle it

A replication where task difficulty varies but keystroke patterns stay similar across conditions, or where keystroke data begins to predict usefulness ratings.

Figures

Figures reproduced from arXiv: 2606.28090 by Clara Sayffaerth, Francesco Chiossi, Laura Schütz, Thomas Weber, Yousri Cherif.

Figure 1: We investigated how device type (mobile versus desktop) and task difficulty (easy versus hard) shape …
Figure 2: Experiment protocol of the between-subjects user study. Participants either completed the desktop or …
Figure 3: Interaction counts (number of prompts) by device type and task.
Figure 4: Box plots of keystroke metrics by task difficulty and device type.
Figure 5: Raw NASA-TLX results by task difficulty and device type for all six subscales: Mental Demand, Physical …
Original abstract

As Large Language Models (LLMs) become increasingly integrated into daily routines, understanding how users interact with these systems is crucial for effective human-AI collaboration. This work investigates keystroke dynamics as a behavioral measure of user mental effort and perceived output usefulness in human-LLM interaction. We conducted a user study (N = 36) to examine how task difficulty (easy vs. hard) and device type (desktop vs. mobile) influence typing behavior and workload (NASA-TLX) during interactions. Our results indicate that hard tasks led to significantly more keystrokes, slower typing, increased pauses, and higher self-reported workload. Device type had weaker effects, with mobile use slightly reducing input length and typing speed. While keystrokes captured differences in cognitive effort, they did not predict perceived LLM output usefulness. These findings highlight the potential of keystroke dynamics as real-time indicators of cognitive effort during LLM prompting, while also showing their limitations in capturing perceived collaboration success.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper reports results from a user study (N=36) examining how task difficulty (easy vs. hard) and device type (desktop vs. mobile) affect keystroke dynamics (keystroke count, typing speed, pauses) and NASA-TLX workload scores during LLM prompting interactions. It claims that hard tasks produce significantly more keystrokes, slower typing, more pauses, and higher workload, with weaker device effects; keystroke measures capture cognitive effort differences but do not predict perceived LLM output usefulness.

Significance. If the central claims hold after addressing potential confounds, the work provides initial empirical evidence that keystroke dynamics can function as real-time behavioral proxies for cognitive effort in human-LLM interactions, complementing subjective measures like NASA-TLX. This could inform adaptive interfaces that detect user mental load. The inclusion of both behavioral logging and validated workload scales is a methodological strength for an exploratory study in this emerging area.

major comments (2)
  1. [Abstract] The claim of 'significantly more keystrokes, slower typing, increased pauses' for hard tasks is presented without any reference to statistical tests, p-values, effect sizes, exclusion criteria, or power analysis. This omission prevents evaluation of whether the data support the stated effects, especially with N=36.
  2. [Results/Discussion] The interpretation that keystroke increases index cognitive effort per se is load-bearing for the central claim, yet the manuscript does not report normalization of keystroke counts by final prompt length, pre-validation that easy/hard tasks equate on required input volume, or baseline typing measures. Without these, differences may arise from task wording or output requirements rather than mental demand (as raised by the length-confound concern).
minor comments (1)
  1. [Abstract] Adding one sentence on the statistical approach used (e.g., ANOVA or mixed models) would make the significance claims more transparent.
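
To make the request for effect sizes concrete, here is a minimal sketch of one common paired effect size, often written d_z: the mean of the per-participant differences divided by their sample standard deviation. It assumes each participant contributes one value per difficulty level. The function name `paired_cohens_dz` and the example numbers are hypothetical, and nothing here is claimed to be the analysis the authors ran.

```python
from statistics import mean, stdev
from typing import Sequence


def paired_cohens_dz(easy: Sequence[float], hard: Sequence[float]) -> float:
    """Paired effect size d_z = mean(hard - easy) / sd(hard - easy)."""
    if len(easy) != len(hard):
        raise ValueError("each participant needs one easy and one hard value")
    if len(easy) < 2:
        raise ValueError("at least two participants are required")

    # Per-participant difference between conditions.
    diffs = [h - e for e, h in zip(easy, hard)]

    spread = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if spread == 0:
        raise ValueError("all differences are identical; d_z is undefined")
    return mean(diffs) / spread


if __name__ == "__main__":
    # Made-up keystroke counts for four hypothetical participants.
    easy_counts = [10, 12, 11, 13]
    hard_counts = [12, 15, 13, 17]
    print(round(paired_cohens_dz(easy_counts, hard_counts), 3))
```

Several variants of Cohen's d exist for repeated measures, so a report would need to state which one is used alongside the test statistic and p-value.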

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment point by point below, proposing revisions where they strengthen the work without misrepresenting our exploratory study design.

  1. Referee: [Abstract] The claim of 'significantly more keystrokes, slower typing, increased pauses' for hard tasks is presented without any reference to statistical tests, p-values, effect sizes, exclusion criteria, or power analysis. This omission prevents evaluation of whether the data support the stated effects, especially with N=36.

    Authors: The full Results section reports the relevant statistical tests (paired t-tests and mixed ANOVAs), associated p-values, effect sizes (Cohen's d), exclusion criteria (e.g., incomplete sessions), and a post-hoc power analysis for the N=36 sample. The abstract was intentionally concise. We will revise the abstract to incorporate brief references to the key statistical outcomes and significance levels to allow readers to evaluate the claims directly from the abstract. revision: yes

  2. Referee: [Results/Discussion] The interpretation that keystroke increases index cognitive effort per se is load-bearing for the central claim, yet the manuscript does not report normalization of keystroke counts by final prompt length, pre-validation that easy/hard tasks equate on required input volume, or baseline typing measures. Without these, differences may arise from task wording or output requirements rather than mental demand (as raised by the length-confound concern).

    Authors: This is a valid methodological concern. Our task prompts were constructed to require comparable base input volumes across difficulty levels (verified informally in pilot testing), and the within-subjects design isolates relative changes; however, we did not report explicit pre-validation details, perform normalization by final prompt length, or collect separate baseline typing speed measures. We will add a Methods subsection describing the task-equivalence rationale and pilot checks, include a supplementary normalization analysis (keystrokes per character of final prompt) using the existing data, and explicitly discuss the absence of baseline measures as a limitation while noting that relative within-subject differences still support the cognitive-effort interpretation. revision: partial
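
The normalization proposed in the second response (keystrokes per character of final prompt) can be sketched as a simple ratio. The function name `keystrokes_per_character` and the example prompts are hypothetical. The idea resembles the keystrokes-per-character measure used in text-entry research, but this is only an assumed reading of what the supplementary analysis would compute.

```python
def keystrokes_per_character(keystroke_count: int, final_prompt: str) -> float:
    """Ratio of logged keystrokes to characters in the submitted prompt.

    A value near 1.0 suggests the prompt was typed almost straight through;
    larger values suggest more deletion and rewriting per character kept.
    """
    if keystroke_count < 0:
        raise ValueError("keystroke count cannot be negative")
    if not final_prompt:
        raise ValueError("final prompt is empty; the ratio is undefined")
    return keystroke_count / len(final_prompt)


if __name__ == "__main__":
    # Illustrative values only, not study data.
    easy_ratio = keystrokes_per_character(14, "hello world!")
    hard_ratio = keystrokes_per_character(30, "hello world!")
    print(easy_ratio, hard_ratio)
```

If hard tasks still show a higher ratio than easy tasks after this normalization, the length confound raised by the referee becomes a less likely explanation for the raw keystroke difference, though it would not rule out other unmeasured factors.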

Circularity Check

0 steps flagged

Empirical reporting of observed differences; no derivations or self-referential predictions


The paper is a user study (N=36) reporting statistical comparisons of keystroke metrics (count, speed, pauses) and NASA-TLX scores across task difficulty and device conditions. The abstract and described results contain no equations, no fitted parameters renamed as predictions, no self-citation chains, and no uniqueness theorems. Claims rest on direct observation of group differences rather than any derivation that reduces to its own inputs by construction. This is self-contained empirical work with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the validity of the experimental tasks, the assumption that keystroke metrics index cognitive effort, and standard statistical inference; none of these are detailed beyond the abstract.

axioms (1)
  • [standard math] Standard inferential statistics can identify meaningful differences between easy and hard task conditions.
    The abstract uses the word 'significantly' without naming tests or corrections.


