A Practical Framework of Key Performance Indicators for Multi-Robot Lunar and Planetary Field Tests
Pith reviewed 2026-05-21 15:10 UTC · model grok-4.3
The pith
A KPI framework derived from three realistic multi-robot lunar scenarios links robot performance to science objectives and enables consistent trial comparisons.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors establish a practical framework of key performance indicators by starting from three realistic multi-robot lunar scenarios that incorporate scientific objectives like resource prospecting and operational constraints such as harsh terrain. This framework organizes KPIs around scenario-dependent emphases on efficiency, robustness, and precision, making it directly applicable to field tests. Validation in an actual multi-robot deployment showed that efficiency and robustness indicators are straightforward to measure, while precision ones depend on reliable ground-truth information not always available in analog environments. The result is presented as a common standard to make cross
What carries the argument
The KPI framework structured around scenario-dependent priorities in efficiency, robustness, and precision and derived from three realistic multi-robot lunar prospecting scenarios.
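One way to picture scenario-dependent priorities is as a weighted combination of per-category scores. The sketch below is illustrative only: the scenario names, the numeric weights, and the `weighted_score` function are assumptions made for this example, since the material reviewed here does not state numeric priorities or an aggregation rule.

```python
from typing import Dict

# Hypothetical weights per scenario; these numbers are NOT from the paper.
SCENARIO_WEIGHTS: Dict[str, Dict[str, float]] = {
    "scenario_a": {"efficiency": 0.5, "robustness": 0.3, "precision": 0.2},
    "scenario_b": {"efficiency": 0.2, "robustness": 0.5, "precision": 0.3},
}


def weighted_score(scenario: str, category_scores: Dict[str, float]) -> float:
    """Combine normalized category scores (0..1) using the scenario's weights."""
    weights = SCENARIO_WEIGHTS[scenario]
    missing = set(weights) - set(category_scores)
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    total_weight = sum(weights.values())
    # Weighted mean, so the result stays in the same 0..1 range as the inputs.
    return sum(weights[c] * category_scores[c] for c in weights) / total_weight


# The same trial scored under two different scenario emphases.
trial = {"efficiency": 0.8, "robustness": 0.6, "precision": 0.4}
score_a = weighted_score("scenario_a", trial)
score_b = weighted_score("scenario_b", trial)
```

The same trial data yields different overall scores depending on which scenario's emphasis applies, which is the sense in which the priorities are scenario-dependent.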
If this is right
- Enables consistent, goal-oriented comparison of multi-robot field trials across different teams and platforms.
- Supports systematic development of robotic systems for future planetary exploration by tying metrics to science objectives.
- Shows that efficiency- and robustness-related KPIs can be measured practically in outdoor analog environments.
- Indicates that precision-oriented KPIs require reliable ground-truth data that is not always feasible to obtain in field settings.
- Provides a ready-to-use evaluation tool explicitly designed for practical applicability in deployments.
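The asymmetry between the KPI categories can be made concrete with a small sketch. An efficiency-style indicator can often be computed from timestamps the robots log anyway, whereas a precision-style indicator needs an external reference. The function names, and the choice of mission duration and position RMSE as representative indicators, are assumptions for illustration and are not taken from the paper's KPI list.

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


def mission_duration_s(start_s: float, end_s: float) -> float:
    """Efficiency-style KPI: needs only logged timestamps."""
    return end_s - start_s


def localization_rmse(
    estimated: Sequence[Point],
    ground_truth: Optional[Sequence[Point]],
) -> Optional[float]:
    """Precision-style KPI: RMSE of estimated positions against a reference.

    Returns None when no ground truth was collected, which is the case the
    review flags as common in outdoor analog environments.
    """
    if not ground_truth:
        return None
    if len(estimated) != len(ground_truth):
        raise ValueError("estimated and ground_truth must have equal length")
    squared_errors = [
        (ex - gx) ** 2 + (ey - gy) ** 2
        for (ex, ey), (gx, gy) in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Returning `None` rather than a placeholder number keeps a missing precision measurement distinguishable from a measured value of zero when trials are compared later.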
Where Pith is reading between the lines
- If widely adopted, the framework could reduce redundant metric design across research groups and speed up creation of shared benchmarks for lunar robotics.
- Extending the underlying scenarios to cover additional terrains or resource types could increase the framework's coverage for varied mission profiles.
- Pairing the KPIs with onboard sensor logging might lessen reliance on external ground-truth collection for precision measures.
- Comparable KPI structures could be developed for single-robot operations or mixed human-robot teams in similar exploration contexts.
Load-bearing premise
The three realistic multi-robot lunar scenarios used to derive the KPIs accurately capture the scientific objectives and operational constraints of actual planetary prospecting missions.
What would settle it
Apply the framework and the original custom metrics to the same set of independent multi-robot field tests on lunar analogs and check whether the framework produces more consistent, science-aligned comparisons across trials.
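A minimal sketch of how such a check might be scored, assuming each trial receives one overall score from the framework, one from the trial's original custom metrics, and an independent science-return rating. Spearman rank correlation is used here as one possible agreement measure; it is a choice made for this example, not a method described in the paper, and all function and variable names are hypothetical.

```python
import math
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Under these assumptions, the framework would count as more science-aligned if `spearman(framework_scores, science_ratings)` exceeds `spearman(custom_scores, science_ratings)` over the same set of trials.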
Original abstract
Robotic prospecting for critical resources on the Moon, such as ilmenite, rare earth elements, and water ice, requires robust exploration methods given the diverse terrain and harsh environmental conditions. Although numerous analog field trials address these goals, comparing their results remains challenging because of differences in robot platforms and experimental setups. These missions typically assess performance using selected, scenario-specific engineering metrics that fail to establish a clear link between field performance and science-driven objectives. In this paper, we address this gap by deriving a structured framework of KPI from three realistic multi-robot lunar scenarios reflecting scientific objectives and operational constraints. Our framework emphasizes scenario-dependent priorities in efficiency, robustness, and precision, and is explicitly designed for practical applicability in field deployments. We validated the framework in a multi-robot field test and found it practical and easy to apply for efficiency- and robustness-related KPI, whereas precision-oriented KPI require reliable ground-truth data that is not always feasible to obtain in outdoor analog environments. Overall, we propose this framework as a common evaluation standard enabling consistent, goal-oriented comparison of multi-robot field trials and supporting systematic development of robotic systems for future planetary exploration.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper derives a structured KPI framework for multi-robot lunar and planetary field tests from three realistic scenarios, emphasizing scenario-dependent priorities among efficiency, robustness, and precision. The framework is validated in one outdoor multi-robot field test, with explicit discussion that efficiency and robustness KPIs are practical to apply while precision KPIs require ground-truth data that is often unavailable in analog environments. The authors propose the framework as a common evaluation standard to enable consistent, goal-oriented comparisons across trials and support systematic robotic system development.
Significance. If the framework transfers beyond the source scenarios, it would provide a valuable bridge between engineering metrics and science-driven objectives in planetary robotics, addressing a real gap in comparing heterogeneous multi-robot analog missions. The explicit acknowledgment of practical limitations (e.g., ground-truth requirements) and the derivation from stated operational constraints are strengths that increase the work's utility for field deployments.
major comments (1)
- [Abstract and concluding section] The central claim that the framework 'enables consistent, goal-oriented comparison of multi-robot field trials' as a common evaluation standard is load-bearing but only partially supported. The derivation is from three scenarios and validation occurs in a single author-conducted test; no retrospective application to independent published multi-robot analog trials (e.g., prior NASA or ESA efforts) is shown to confirm that the same KPI definitions yield comparable results across differing platforms and setups.
minor comments (1)
- [Framework description] The manuscript would benefit from a table summarizing the full KPI set with explicit formulas or measurement procedures for each category to improve reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive comments and for recognizing the framework's potential to bridge engineering metrics with science objectives. We address the major comment point by point below.
- Referee: The central claim that the framework 'enables consistent, goal-oriented comparison of multi-robot field trials' as a common evaluation standard is load-bearing but only partially supported. The derivation is from three scenarios and validation occurs in a single author-conducted test; no retrospective application to independent published multi-robot analog trials (e.g., prior NASA or ESA efforts) is shown to confirm that the same KPI definitions yield comparable results across differing platforms and setups.
Authors: We agree that the claim of serving as a common evaluation standard is only partially supported by the current evidence and merits qualification. The KPIs were derived directly from three scenarios that reflect documented lunar prospecting objectives and operational constraints (e.g., limited communication windows, heterogeneous robot teams, and science return requirements). The outdoor multi-robot validation then demonstrated that efficiency and robustness KPIs can be computed from readily available field data, while precision KPIs require ground truth that is often unavailable. We acknowledge that a retrospective application to independent prior trials would provide stronger confirmation of transferability. However, such an analysis would require access to raw logs, exact robot trajectories, and environmental metadata from those experiments, which are frequently not published at the necessary granularity. To address this concern without overstating the current results, we will revise the abstract and concluding section to state that the framework supplies a structured, scenario-derived basis for consistent comparisons in analogous settings, while explicitly noting the need for future cross-trial validation. We will also add a short discussion paragraph outlining how the KPI definitions could be mapped onto existing published datasets where sufficient information is available.
Revision: yes
Circularity Check
No circularity: KPI framework derived from external scenarios without reduction to inputs by construction
The paper derives its KPI framework directly from three realistic multi-robot lunar scenarios that reflect stated scientific objectives and operational constraints, as described in the abstract. This process begins from external scenario descriptions and field-test observations rather than from fitted parameters, self-referential equations, or self-citations that would force the outputs to match the inputs by definition. Validation occurs in a separate multi-robot field test as an application step, with no load-bearing claims that reduce the central proposal of a common evaluation standard to a renaming or tautological fit. The derivation chain remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- scenario-dependent priorities for efficiency, robustness, and precision
axioms (1)
- domain assumption: Three realistic multi-robot lunar scenarios reflect scientific objectives and operational constraints.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "We address this gap by deriving a structured framework of Key Performance Indicators (KPIs) from three realistic multi-robot lunar scenarios... efficiency, robustness, and precision"
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · tag: unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "validated the framework in a multi-robot field test... precision-oriented KPI require reliable ground-truth data"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.