Multitasking with Alexa: How Using Intelligent Personal Assistants Impacts Language-based Primary Task Performance
Pith reviewed 2026-05-25 09:53 UTC · model grok-4.3
The pith
Using intelligent personal assistants disrupts content generation in writing more than copying.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In a dual-task experiment, IPA interactions disrupted content-generating writing tasks significantly more than copying tasks, because content generation shares more of the cognitive resources that IPA use requires.
What carries the argument
Dual-task paradigm with two writing primary tasks: copying versus generating content, during IPA interactions.
If this is right
- Content-generation writing is more impaired by concurrent IPA use than copying is.
- Multiple resource theory and working memory explain why language-based tasks vary in susceptibility to IPA interference.
- Future studies should examine how interruption length, relevance, and timing affect primary task performance.
- IPA design may need to account for the cognitive demands of primary tasks to minimize disruption.
Where Pith is reading between the lines
- Designers of IPAs could prioritize minimizing interruptions during creative tasks.
- This suggests that voice interfaces might be better suited for low-demand primary tasks.
- Similar effects might appear in other multitasking scenarios involving speech and writing.
Load-bearing premise
That the observed differences in disruption between tasks are caused by shared cognitive resources rather than variations in interruption timing, length, or user familiarity with the tasks.
What would settle it
An experiment that controls for interruption timing and length and finds no difference in disruption between copying and generating tasks would falsify the claim.
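The comparison described here amounts to an interaction contrast: the dual-task cost in the generation condition minus the dual-task cost in the copying condition. The following is a minimal sketch of that arithmetic only. It assumes each participant has a baseline score and a with-IPA score per task; the function names, the score unit, and every number are hypothetical placeholders, not the paper's measures or data.

```python
from statistics import mean


def dual_task_cost(baseline, with_ipa):
    """Mean per-participant drop in primary-task score when the IPA is in use."""
    return mean(b - w for b, w in zip(baseline, with_ipa))


def interaction_contrast(copy_base, copy_ipa, gen_base, gen_ipa):
    """Extra disruption for generation over copying (a difference of differences)."""
    return dual_task_cost(gen_base, gen_ipa) - dual_task_cost(copy_base, copy_ipa)


# Hypothetical placeholder scores (e.g. words per minute), NOT taken from the paper.
copy_base = [30.0, 32.0, 28.0]
copy_ipa = [28.0, 30.0, 26.0]
gen_base = [20.0, 22.0, 18.0]
gen_ipa = [14.0, 16.0, 12.0]

contrast = interaction_contrast(copy_base, copy_ipa, gen_base, gen_ipa)
print(f"copy cost: {dual_task_cost(copy_base, copy_ipa)}")
print(f"generation cost: {dual_task_cost(gen_base, gen_ipa)}")
print(f"interaction contrast: {contrast}")
```

A contrast near zero, obtained with interruption timing and length held equal across conditions, is the pattern the test above would look for. A real analysis would also need an inferential test and a measure of variability, which this sketch omits.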
Original abstract
Intelligent personal assistants (IPAs) are supposed to help us multitask. Yet the impact of IPA use on multitasking is not clearly quantified, particularly in situations where primary tasks are also language based. Using a dual task paradigm, our study observes how IPA interactions impact two different types of writing primary tasks: copying and generating content. We found writing tasks that involve content generation, which are more cognitively demanding and share more of the resources needed for IPA use, are significantly more disrupted by IPA interaction than less demanding tasks such as copying content. We discuss how theories of cognitive resources, including multiple resource theory and working memory, explain these results. We also outline the need for future work on how interruption length and relevance may impact primary task performance as well as the need to identify effects of interruption timing in user and IPA led interruptions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript reports a dual-task experiment on the effects of IPA (Alexa) interactions on two language-based primary tasks: copying text versus generating content. It claims that content-generation tasks are significantly more disrupted by IPA use than copying tasks because they are more cognitively demanding and share more resources with IPA interaction, consistent with multiple resource theory and working memory models. The authors discuss theoretical implications and flag the need for future work on interruption length, relevance, and timing.
Significance. If the differential disruption result holds after proper controls and statistical reporting, the work would offer empirical support for resource-competition accounts of voice-assistant multitasking in language tasks and could inform IPA interface design. The contribution is modest in scope, however, because the study is purely empirical and the abstract already identifies the key confounds as open questions.
major comments (2)
- [Abstract] The claim that content-generation tasks 'are significantly more disrupted' supplies no sample size, statistical test, p-value, effect size, or error bars, so the central empirical result cannot be evaluated from the text provided.
- [Abstract] The explicit statement that future work is needed on 'interruption length and relevance' and 'interruption timing in user and IPA led interruptions' indicates these factors were not controlled or matched between the copying and generation conditions; this directly undermines the attribution of the performance gap to resource overlap rather than to differences in interruption properties.
Simulated Author's Rebuttal
We thank the referee for the detailed comments. We address each major point below and have prepared revisions to strengthen the abstract.
Point-by-point responses
- Referee: [Abstract] The claim that content-generation tasks 'are significantly more disrupted' supplies no sample size, statistical test, p-value, effect size, or error bars, so the central empirical result cannot be evaluated from the text provided.
  Authors: We agree that the abstract should report these details for transparency. In the revised version we will add the sample size, the statistical test performed, exact p-value, effect size, and reference to variability measures from the results section.
  revision: yes
- Referee: [Abstract] The explicit statement that future work is needed on 'interruption length and relevance' and 'interruption timing in user and IPA led interruptions' indicates these factors were not controlled or matched between the copying and generation conditions; this directly undermines the attribution of the performance gap to resource overlap rather than to differences in interruption properties.
  Authors: We disagree. The study used identical IPA interaction scripts in both primary-task conditions, holding interruption length, relevance, and timing constant by design. The observed difference is therefore attributable to primary-task resource demands. The abstract flags future work on variations of these factors (e.g., longer or user-initiated interruptions), not on the matched controls already implemented. We will revise the abstract to state explicitly that interruption properties were matched across conditions.
  revision: no
Circularity Check
Purely empirical study; no derivations or load-bearing self-citations
full rationale
This is an experimental HCI paper reporting observed performance differences in a dual-task paradigm (copying vs. content-generation writing under IPA interruption). The abstract and description contain no equations, fitted parameters, model predictions, uniqueness theorems, or ansatzes. Results are presented as direct empirical observations, with explicit notes on future work for confounds such as interruption timing. No self-citation chains or renamings of known results appear as load-bearing steps. The central claim rests on data collection rather than any reduction to inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Dual-task performance differences can be attributed to overlap in cognitive resources (multiple resource theory and working memory limits).
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "writing tasks that involve content generation... are significantly more disrupted by IPA interaction than less demanding tasks such as copying content"
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean: absolute_floor_iff_bare_distinguishability (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "theories of cognitive resources, including multiple resource theory and working memory"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.