W. et al. Surgical Robot Transformer (SRT): Imitation Learning for Surgical Tasks (2024)
5 Pith papers cite this work.
citing papers explorer
- Supervised Mixture-of-Experts for Surgical Grasping and Retraction
  Supervised MoE on top of ACT achieves higher success in bowel grasping/retraction from <150 demos than standard ACT or generalist VLAs, with OOD robustness, unseen viewpoint generalization, and zero-shot ex vivo porcine transfer.
- Investigating Robot Control Policy Learning for Autonomous X-ray-guided Spine Procedures
  Trains imitation learning policies in a simulated bi-plane X-ray environment for spine cannula insertion, achieving 68.5% first-attempt success with partial sim-to-real transfer.
- Wavelet Policy: Imitation Learning in the Scale Domain with World Prior Memory
  Wavelet Policy combines world prior memory from background images with wavelet-domain multi-scale action modeling via a single-encoder multiple-decoder architecture to improve long-horizon robotic imitation learning.
- Towards Robust Surgical Automation via Digital Twin Representations from Foundation Models
  Digital twin representations from vision foundation models enable LLM-based planning for robust peg transfer and gauze retrieval on the dVRK surgical platform with claimed generalizability.
- SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge
  SegSTRONG-C provides a new benchmark where top models reach 0.9394 DSC and 0.9301 NSD on corrupted surgical tool segmentation tests, showing conventional techniques help but calling for more innovative robustness methods.