Effects of Swarm Size Variability on Operator Workload
Pith reviewed 2026-05-09 21:31 UTC · model grok-4.3
The pith
Small decreases in swarm size leave operator workload elevated, small increases keep it low, and large changes in either direction reset it.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that objective performance is largely unaffected by small changes in swarm size, while subjective workload is sensitive to both change direction and magnitude. Small increases preserve lower workload, whereas small decreases leave workload elevated, indicating workload residue; large changes in either direction attenuate these effects, suggesting a reset response.
What carries the argument
Workload history, the carryover of prior effort levels into current perception, combined with a cognitive reset triggered when swarm size changes exceed a threshold.
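The direction-and-magnitude logic described above can be sketched as a small classifier. This is only an illustration of the claim's structure, not the authors' method: the `Transition` type, the `predicted_response` function, and the 50% cutoff in `LARGE_CHANGE_FRACTION` are all hypothetical, since the page does not state where the paper draws the line between small and large changes.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the paper's actual small/large boundary is not given here.
LARGE_CHANGE_FRACTION = 0.5


@dataclass(frozen=True)
class Transition:
    """Swarm sizes in two consecutive episodes (illustrative type)."""
    previous_size: int
    current_size: int

    @property
    def relative_change(self) -> float:
        # Signed change as a fraction of the previous swarm size.
        return (self.current_size - self.previous_size) / self.previous_size


def predicted_response(t: Transition, threshold: float = LARGE_CHANGE_FRACTION) -> str:
    """Map a transition to the qualitative effect the core claim predicts."""
    change = t.relative_change
    if change == 0:
        return "no change"
    if abs(change) >= threshold:
        return "reset"      # large shift in either direction attenuates history effects
    if change < 0:
        return "residue"    # small decrease: workload stays elevated
    return "preserved"      # small increase: lower workload carries over
```

Under these assumptions, `predicted_response(Transition(10, 8))` falls in the residue case and `predicted_response(Transition(10, 20))` in the reset case. The threshold is the quantity a replication would need to pin down empirically.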
Load-bearing premise
That workload dynamics measured in short simulated drone monitoring episodes with discrete size shifts accurately capture the effects in continuous, high-stakes real-world human-swarm operations.
What would settle it
A study that measures subjective workload continuously during long-running real drone missions and finds no sustained elevation after small reductions or no drop after large shifts would falsify the residue and reset claims.
Original abstract
Real-world deployments of human-swarm teams depend on balancing operator workload to leverage human strengths without inducing overload. A key challenge is that swarm size is often dynamic: robots may join or leave the mission due to failures or redeployment, causing abrupt workload fluctuations. Understanding how such changes affect human workload and performance is critical for robust human-swarm interaction design. This paper investigates how the magnitude and direction of changes in swarm size influence operator workload. Drawing on the concept of workload history, we test three hypotheses: (1) workload remains elevated following decreases in swarm size, (2) small increases are more manageable than large jumps, and (3) sufficiently large changes override these effects by inducing a cognitive reset. We conducted two studies (N = 34) using a monitoring task with simulated drone swarms of varying sizes. By varying the swarm size between episodes, we measured perceived workload relative to swarm size changes. Results show that objective performance is largely unaffected by small changes in swarm size, while subjective workload is sensitive to both change direction and magnitude. Small increases preserve lower workload, whereas small decreases leave workload elevated, indicating workload residue; large changes in either direction attenuate these effects, suggesting a reset response. These findings offer actionable guidance for managing swarm-size transitions to support operator workload in dynamic human-swarm systems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper examines how changes in swarm size affect operator workload and performance in human-swarm interaction. Using two empirical studies with a total of 34 participants in a simulated drone monitoring task, it tests hypotheses regarding workload history effects: elevated workload after decreases in swarm size, better manageability of small increases, and cognitive reset from large changes. Key findings are that objective performance remains largely unaffected by small swarm size variations, while subjective workload is sensitive to both the direction and magnitude of changes, with small decreases causing persistent elevated workload (residue) and large changes leading to a reset effect.
Significance. If validated, these results offer practical guidance for managing dynamic swarm sizes in real-world deployments to optimize operator workload without compromising performance. The work contributes empirical evidence on workload dynamics in HRI, highlighting the importance of considering change history and magnitude in system design. Strengths include the hypothesis-driven approach with two studies testing specific predictions about direction and magnitude effects.
major comments (2)
- [Methods] Methods section: The total sample size is reported as N=34 across two studies, but no details on power analysis, effect sizes, or statistical power are provided. This is critical because the central claim relies on detecting directional effects in subjective workload measures while finding no effect on objective performance; without power information, it is unclear if null results on objective measures reflect true absence or insufficient sensitivity.
- [Results and Discussion] Results and Discussion: The interpretations of 'workload residue' following small decreases and 'reset response' from large changes depend on measurements across discrete episodes. The manuscript does not provide evidence addressing potential confounds such as task switching effects, adaptation during inter-episode breaks, or differences due to the low-stakes simulated environment, which directly impacts the validity of generalizing to continuous, high-stakes real-world operations.
minor comments (1)
- [Abstract] The abstract states 'two studies (N = 34)' but does not clarify the distribution of participants between studies, which would help assess the robustness of the findings.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment below and indicate where revisions will be made to improve clarity and rigor.
Point-by-point responses
- Referee: [Methods] Methods section: The total sample size is reported as N=34 across two studies, but no details on power analysis, effect sizes, or statistical power are provided. This is critical because the central claim relies on detecting directional effects in subjective workload measures while finding no effect on objective performance; without power information, it is unclear if null results on objective measures reflect true absence or insufficient sensitivity.
Authors: We agree that explicit reporting of power and effect sizes would strengthen the paper. The studies were designed drawing on sample sizes from prior HRI workload research, but a formal a priori power analysis was not included. In the revised manuscript we will add a post-hoc power analysis (using G*Power or equivalent) based on the observed effect sizes for the significant subjective workload effects, report Cohen's d or partial eta-squared for all key comparisons, and explicitly discuss the implications for interpreting the null findings on objective performance measures. This will allow readers to better evaluate the sensitivity of the design.
Revision: yes
- Referee: [Results and Discussion] Results and Discussion: The interpretations of 'workload residue' following small decreases and 'reset response' from large changes depend on measurements across discrete episodes. The manuscript does not provide evidence addressing potential confounds such as task switching effects, adaptation during inter-episode breaks, or differences due to the low-stakes simulated environment, which directly impacts the validity of generalizing to continuous, high-stakes real-world operations.
Authors: We appreciate the referee's attention to these methodological considerations. The experimental protocol maintained the same monitoring task across episodes specifically to minimize task-switching confounds, and breaks were kept brief and standardized to permit workload ratings without continuous-operation carryover. Nevertheless, we acknowledge that discrete episodes and the low-stakes simulation cannot fully replicate adaptation dynamics or stakes in real deployments. In the revision we will add an expanded limitations paragraph that directly discusses these factors, their potential influence on the residue and reset interpretations, and the consequent boundaries on generalizability. We maintain that the controlled design still yields internally valid evidence on workload-history effects that can inform subsequent real-world studies.
Revision: partial
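The effect-size and power reporting discussed in the first response can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the authors' analysis: it assumes a within-subject (paired) comparison of workload ratings, uses a normal approximation rather than the noncentral t distribution that G*Power relies on, and the function names `cohens_dz`, `approx_power`, and `min_detectable_effect` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

_STD_NORMAL = NormalDist()  # standard normal, used for the approximation


def cohens_dz(before, after):
    """Standardized mean difference for paired ratings (Cohen's d_z)."""
    diffs = [a - b for a, b in zip(after, before)]
    return mean(diffs) / stdev(diffs)


def approx_power(effect_size, n, alpha=0.05):
    """Two-sided power for a paired design (normal approximation to the t test)."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = abs(effect_size) * sqrt(n)
    # Probability of landing in either rejection tail under the alternative.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def min_detectable_effect(n, power=0.8, alpha=0.05):
    """Smallest |d_z| detectable with the requested power (same approximation,
    ignoring the negligible far tail)."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return (z_crit + z_power) / sqrt(n)
```

A sensitivity figure from `min_detectable_effect(n)`, computed with each study's own n, speaks most directly to the referee's concern about the null result on objective performance. The page reports only the combined N = 34, so the per-study values would have to come from the manuscript.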
Circularity Check
No circularity: purely empirical hypothesis-testing study
full rationale
The paper reports results from two controlled studies (N=34) testing three hypotheses on workload history effects in a simulated drone monitoring task. No mathematical derivations, equations, fitted parameters, or predictions appear in the central claims. Workload residue and reset interpretations are direct inferences from measured subjective ratings across discrete episodes, with no self-citation chains or ansatzes invoked to justify the findings. The analysis is self-contained against external benchmarks and contains no load-bearing reductions to inputs by construction.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Subjective workload scales validly capture mental load in monitoring tasks
- standard math: Standard statistical assumptions for comparing conditions across episodes