A Position Statement on Endovascular Models and Effectiveness Metrics for Mechanical Thrombectomy Navigation, on behalf of the Stakeholder Taskforce for AI-assisted Robotic Thrombectomy (START)
Pith reviewed 2026-05-14 22:06 UTC · model grok-4.3
The pith
Expert consensus defines four testbed environments and two macro-classes of metrics for validating AI-assisted robotic thrombectomy systems.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that expert consensus has produced frameworks specifying four essential testbed environments with distinct validation roles and graded realism requirements, alongside two macro-classes of effectiveness metrics focused on technical navigation in simulated settings and clinical outcomes in living subjects, to guide safe development of AI-assisted robotic thrombectomy navigation.
What carries the argument
The consensus frameworks for testbeds and metrics, built through an incubator day and Delphi process, which assign specific realism levels and metric types to each validation stage.
If this is right
- Simpler testbeds require realistic vessel anatomy compatible with guidewire and catheter use.
- Standard testbeds must include deformable vessels.
- Advanced testbeds incorporate blood flow, pulsatility, and disease features.
- Technical navigation metrics apply to in silico, in vitro, and ex vivo stages.
- Clinical outcome metrics are required for in vivo validation stages.
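The mapping in the list above can be written down as a small lookup, which may help when checking that a planned experiment pairs the right metric class with its environment. This is a minimal sketch, not the paper's own artifact: the names `Environment`, `MetricClass`, `TIER_FEATURES`, `metric_class_for`, and `required_features` are hypothetical, and treating the realism tiers as cumulative (each tier inheriting the features of the tiers below it) is an assumption that the abstract does not state explicitly.

```python
from enum import Enum


class Environment(Enum):
    IN_SILICO = "in silico"
    IN_VITRO = "in vitro"
    EX_VIVO = "ex vivo"
    IN_VIVO = "in vivo"


class MetricClass(Enum):
    TECHNICAL_NAVIGATION = "technical navigation"
    CLINICAL_OUTCOMES = "clinical outcomes"


# Realism features named for each tier in the consensus summary.
TIER_ORDER = ["simpler", "standard", "advanced"]
TIER_FEATURES = {
    "simpler": ["realistic vessel anatomy compatible with guidewire and catheter use"],
    "standard": ["deformable vessels"],
    "advanced": ["blood flow", "pulsatility", "disease features"],
}


def metric_class_for(env: Environment) -> MetricClass:
    """In vivo stages use clinical outcome metrics; all other stages use technical navigation."""
    if env is Environment.IN_VIVO:
        return MetricClass.CLINICAL_OUTCOMES
    return MetricClass.TECHNICAL_NAVIGATION


def required_features(tier: str) -> list[str]:
    """List features for a tier.

    Assumption (not stated in the source): tiers are cumulative, so a higher
    tier also needs everything required by the tiers below it.
    """
    idx = TIER_ORDER.index(tier)  # raises ValueError for an unknown tier
    features: list[str] = []
    for name in TIER_ORDER[: idx + 1]:
        features.extend(TIER_FEATURES[name])
    return features
```

If the tiers turn out not to be cumulative, `required_features` could simply return `TIER_FEATURES[tier]` instead.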
Where Pith is reading between the lines
- Standardized environments could shorten the path from prototype to regulatory approval by providing clear validation benchmarks.
- The graded approach may allow less experienced operators to train safely before performing procedures in patients.
- Linking metrics across environments could highlight specific robot design changes that lower complication risks.
- Widespread adoption might expand timely thrombectomy access to regions without specialized neurointerventionalists.
Load-bearing premise
Expert consensus through the Delphi process yields practical frameworks sufficient for safe development, including the assumption that correlating in vitro measurements to in vivo complications can be achieved without additional empirical validation.
What would settle it
A follow-up study finding no predictive correlation between in vitro technical navigation metrics and actual rates of in vivo complications would show the proposed metrics do not support safety claims.
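One way such a follow-up study might quantify the relationship is a rank correlation between a per-case in vitro metric and a per-case in vivo complication measure. The sketch below computes Spearman's rho in plain Python. The choice of Spearman's rho is an assumption made for illustration, not a method named in the paper, and the function names `average_ranks` and `spearman_rho` are hypothetical. A real study would also need a significance test and an adequate sample size.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of values tied with order[i].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

A rho near zero across a well-powered sample would be the kind of result described above; a single small sample would not settle the question either way.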
Original abstract
While we are making progress in overcoming infectious diseases and cancer, one of the major medical challenges of the mid-21st century will be the rising prevalence of stroke. Large vessel occlusions are especially debilitating, yet effective treatment (needed within hours to achieve best outcomes) remains limited due to geography. One solution for improving timely access to mechanical thrombectomy in geographically diverse populations is the deployment of robotic surgical systems. Artificial intelligence (AI) assistance may enable the upskilling of operators in this emerging therapeutic delivery approach. Our aim was to establish consensus frameworks for developing and validating AI-assisted robots for thrombectomy. Objectives included standardizing effectiveness metrics and defining reference testbeds across in silico, in vitro, ex vivo, and in vivo environments. To achieve this, we convened experts in neurointervention, robotics, data science, health economics, policy, statistics, and patient advocacy. Consensus was built through an incubator day, a Delphi process, and a final Position Statement. We identified that the four essential testbed environments each had distinct validation roles. Realism requirements vary: simpler testbeds should include realistic vessel anatomy compatible with guidewire and catheter use, while standard testbeds should incorporate deformable vessels. More advanced testbeds should include blood flow, pulsatility, and disease features. There are two macro-classes of effectiveness metrics: one for in silico, in vitro, and ex vivo stages, focused on technical navigation, and another for in vivo stages, focused on clinical outcomes. Patient safety is central to this technology's development. One requisite patient safety task needed now is to correlate in vitro measurements to in vivo complications.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript is a position statement reporting the outcomes of a multi-stakeholder incubator day and Delphi process involving experts in neurointervention, robotics, data science, health economics, policy, statistics, and patient advocacy. It establishes consensus frameworks for validating AI-assisted robotic systems for mechanical thrombectomy, defining four essential testbed environments (in silico, in vitro, ex vivo, in vivo) with distinct roles and tiered realism requirements: simpler testbeds need realistic vessel anatomy compatible with guidewires and catheters, standard testbeds require deformable vessels, and advanced testbeds add blood flow, pulsatility, and disease features. Two macro-classes of effectiveness metrics are proposed—technical navigation metrics for in silico/in vitro/ex vivo stages and clinical outcomes metrics for in vivo stages—while emphasizing patient safety and identifying correlation of in vitro measurements to in vivo complications as a requisite next task.
Significance. If the consensus frameworks are adopted, the work could standardize development and validation pathways for AI-assisted robotic thrombectomy, helping address geographic disparities in timely treatment for large vessel occlusion strokes. The multi-disciplinary stakeholder process provides a strength by grounding recommendations in practical expertise across technical and clinical domains. Explicit delineation of testbed roles and metric classes, together with the forward-looking identification of the in vitro–in vivo correlation task, supplies a clear roadmap that could accelerate safe technology maturation without overclaiming empirical validation.
minor comments (2)
- Abstract: the realism tiers are described at a high level (simpler vs. standard vs. advanced testbeds); adding a concise summary table in the main text that maps each environment to its required features would improve readability and adoption.
- Main text: while the two macro-classes of metrics are introduced, the manuscript would benefit from one or two illustrative examples of specific technical navigation metrics (e.g., path length, force thresholds) to make the framework more immediately usable by developers.
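The two example metrics the referee names, path length and force thresholds, could be computed along the following lines. This is a minimal sketch under stated assumptions: the function names `path_length` and `fraction_over_threshold` are hypothetical, the device tip is assumed to be tracked as a sequence of 3D points, and contact force is assumed to be sampled at regular intervals. No particular threshold value is taken from the paper.

```python
import math


def path_length(points):
    """Total distance travelled by the device tip.

    `points` is a sequence of (x, y, z) positions in a consistent unit
    (millimetres, for example); the result is in the same unit.
    """
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def fraction_over_threshold(forces, threshold):
    """Share of force samples strictly above `threshold`.

    `forces` and `threshold` must use the same unit. The threshold is an
    input chosen by the user, not a value defined by the position statement.
    """
    if not forces:
        return 0.0
    return sum(1 for f in forces if f > threshold) / len(forces)


# Illustrative placeholder values only, not measurements from any study.
tip_track = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
force_samples = [0.1, 0.5, 1.2, 0.3]
print(path_length(tip_track))                       # 17.0
print(fraction_over_threshold(force_samples, 1.0))  # 0.25
```

Reporting the fraction of samples over a threshold rather than only the peak force is a design choice here; a peak-force metric would be `max(forces)`.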
Simulated Author's Rebuttal
We thank the referee for their positive and constructive review, which accurately summarizes the multi-stakeholder consensus process and the proposed frameworks for testbed environments and effectiveness metrics. We appreciate the recognition of the work's potential to standardize validation pathways for AI-assisted robotic thrombectomy and address geographic disparities in stroke care. No specific major comments were raised in the report, so we have no point-by-point rebuttals to provide. We will incorporate any minor editorial or formatting suggestions in the revised manuscript to meet the minor revision recommendation.
Circularity Check
No significant circularity
The manuscript is explicitly a position statement reporting outcomes of an expert incubator day and Delphi process involving multiple stakeholders from neurointervention, robotics, data science, and related fields. Its central claims describe consensus-derived testbed roles, realism tiers, and two macro-classes of metrics without any equations, fitted parameters, derivations, or self-referential models. No load-bearing steps reduce by construction to inputs, self-citations, or prior author-specific results; the text instead acknowledges remaining tasks such as correlating in vitro measurements to in vivo complications as future work rather than established results.