Weakly Supervised Recognition of Surgical Gestures

Beatrice van Amsterdam; Danail Stoyanov; Elena De Momi; Hirenkumar Nakawala

arXiv: 1907.10993 · v1 · pith:BFV64YYVnew · submitted 2019-07-25 · 💻 cs.RO · cs.CV · eess.IV

Pith reviewed 2026-05-24 16:27 UTC · model grok-4.3

classification 💻 cs.RO · cs.CV · eess.IV

keywords surgical gesture recognition · weakly supervised learning · Gaussian mixture model · kinematic trajectories · robot-assisted surgery · action segmentation · skill assessment

The pith

A Gaussian mixture model (GMM) initialized from a single expert demonstration and its ground-truth labels recognizes surgical gestures more accurately than the same model with standard task-agnostic initialization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Kinematic data from surgical robots encodes gestures but full manual labeling of large sets is impractical. Unsupervised GMM approaches often require heavy tuning and underperform on variable trajectories. The paper shows that deriving initial parameters from a minimum of one annotated expert demonstration yields significantly higher recognition accuracy on real demonstrations. This weak-supervision step avoids labeling entire datasets while beating task-agnostic initialization. Additional accuracy gains follow from redefining action classes and selecting better input features.

Core claim

Parameters derived from at least one expert demonstration and its ground-truth annotations supply an appropriate initialization for a GMM-based gesture recognition algorithm; on real surgical demonstrations this initialization produces significantly higher accuracy than standard task-agnostic methods, and further improvement is obtained by redefining the actions and optimizing the inputs.

What carries the argument

GMM algorithm whose initial parameters are taken from one expert demonstration and its annotations.
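The initialization step described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes diagonal covariances, integer gesture labels per frame, and a plain EM loop, none of which is specified on this page. All function names (`init_from_labeled_demo`, `em`, `predict`, `make_toy_demo`) are hypothetical, and the toy data stands in for real kinematic features.

```python
import math
import random


def init_from_labeled_demo(frames, labels, n_classes, var_floor=1e-3):
    """Per-class weight, mean and diagonal variance from ONE annotated demo."""
    dim = len(frames[0])
    weights, means, variances = [], [], []
    for k in range(n_classes):
        members = [x for x, y in zip(frames, labels) if y == k]
        if not members:
            raise ValueError(f"class {k} is absent from the labeled demo")
        n = len(members)
        mu = [sum(x[d] for x in members) / n for d in range(dim)]
        var = [max(sum((x[d] - mu[d]) ** 2 for x in members) / n, var_floor)
               for d in range(dim)]
        weights.append(n / len(frames))
        means.append(mu)
        variances.append(var)
    return weights, means, variances


def _log_gauss(x, mu, var):
    # Log-density of a diagonal-covariance Gaussian (assumed structure).
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mu, var))


def responsibilities(frames, weights, means, variances):
    """E-step: posterior probability of each component for each frame."""
    out = []
    for x in frames:
        logp = [math.log(w) + _log_gauss(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logp)
        p = [math.exp(lp - top) for lp in logp]
        total = sum(p)
        out.append([pi / total for pi in p])
    return out


def em(frames, weights, means, variances, n_iter=20, var_floor=1e-3):
    """Refine the label-derived parameters on unlabeled frames."""
    dim = len(frames[0])
    for _ in range(n_iter):
        resp = responsibilities(frames, weights, means, variances)
        new_w, new_m, new_v = [], [], []
        for k in range(len(weights)):
            nk = max(sum(r[k] for r in resp), 1e-12)
            mu = [sum(r[k] * x[d] for r, x in zip(resp, frames)) / nk
                  for d in range(dim)]
            var = [max(sum(r[k] * (x[d] - mu[d]) ** 2
                           for r, x in zip(resp, frames)) / nk, var_floor)
                   for d in range(dim)]
            new_w.append(nk / len(frames))
            new_m.append(mu)
            new_v.append(var)
        weights, means, variances = new_w, new_m, new_v
    return weights, means, variances


def predict(frames, weights, means, variances):
    """Assign each frame to its most probable component (= gesture index)."""
    resp = responsibilities(frames, weights, means, variances)
    return [max(range(len(r)), key=r.__getitem__) for r in resp]


def make_toy_demo(rng, n_per_class=50):
    # Hypothetical stand-in for kinematic features: two separated 2-D blobs.
    frames, labels = [], []
    for k, (cx, cy) in enumerate([(0.0, 0.0), (5.0, 5.0)]):
        for _ in range(n_per_class):
            frames.append([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)])
            labels.append(k)
    return frames, labels
```

Because component k starts from the statistics of gesture k, the fitted components keep the gesture indices of the labeled demonstration, so no cluster-to-label matching is needed afterwards. A typical call would be `params = init_from_labeled_demo(seed_frames, seed_labels, 2)` followed by `predict(new_frames, *em(new_frames, *params))`.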

If this is right

Kinematic trajectories can be segmented into gestures without labeling every demonstration.
New quantitative metrics for surgical skill become feasible once gestures are automatically identified.
Surgical automation pipelines can operate on segmented rather than raw trajectories.
Redefining action boundaries and choosing input features raises recognition accuracy further.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same one-shot initialization tactic could reduce annotation cost in other trajectory domains that exhibit high inter-trial variability.
If the expert demonstration is itself atypical, the method may embed bias that later data cannot correct without additional labeled examples.

Load-bearing premise

Parameters taken from a single expert demonstration supply a generalizable starting point for the GMM on other demonstrations that vary substantially.

What would settle it

A new collection of surgical demonstrations where the single-expert initialization produces no accuracy gain over standard random or k-means initializations.
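One way to run that comparison is sketched below, under stated assumptions: predictions and ground truth are per-frame integer labels, and the score is frame-wise accuracy. The paper's exact metric and matching procedure are not given on this page, and the helper names are hypothetical. Clusters from a task-agnostic initialization carry arbitrary indices, so the baseline is scored under its most favourable one-to-one relabeling; the exhaustive search is only practical for small gesture dictionaries.

```python
from itertools import permutations


def frame_accuracy(pred, truth):
    """Fraction of frames whose predicted label equals the ground truth."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def best_mapped_accuracy(pred, truth, n_classes):
    """Accuracy under the best one-to-one relabeling of cluster indices."""
    best = 0.0
    for perm in permutations(range(n_classes)):
        mapped = [perm[p] for p in pred]
        best = max(best, frame_accuracy(mapped, truth))
    return best


def mean_gain(weak_scores, baseline_scores):
    """Average per-demonstration accuracy gain of the seeded model."""
    gains = [w - b for w, b in zip(weak_scores, baseline_scores)]
    return sum(gains) / len(gains)
```

A mean gain close to zero on a new collection, even with the baseline given the benefit of optimal relabeling, would be the outcome described above.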

Figures

Figures reproduced from arXiv: 1907.10993 by Beatrice van Amsterdam, Danail Stoyanov, Elena De Momi, Hirenkumar Nakawala.

**Figure 2.** The schematic shows the augmented state vector [PITH_FULL_IMAGE:figures/full_fig_p003_2.png]

**Figure 3.** Redefined action dictionary. Each surgeme is represented with a [PITH_FULL_IMAGE:figures/full_fig_p004_3.png]

**Figure 4.** We conducted a first set of experiments on expert demonstrations [PITH_FULL_IMAGE:figures/full_fig_p004_4.png]

**Figure 5.** Accuracy score as a function of the sliding window length W. The [PITH_FULL_IMAGE:figures/full_fig_p005_5.png]

**Figure 7.** Example of normalized position trajectory (top) and normalized [PITH_FULL_IMAGE:figures/full_fig_p006_7.png]

**Figure 8.** Example of segmentation output (bottom) and corresponding ground [PITH_FULL_IMAGE:figures/full_fig_p006_8.png]

**Figure 9.** t-SNE representation of the transition point distribution identified [PITH_FULL_IMAGE:figures/full_fig_p006_9.png]

Original abstract

Kinematic trajectories recorded from surgical robots contain information about surgical gestures and potentially encode cues about surgeon's skill levels. Automatic segmentation of these trajectories into meaningful action units could help to develop new metrics for surgical skill assessment as well as to simplify surgical automation. State-of-the-art methods for action recognition relied on manual labelling of large datasets, which is time consuming and error prone. Unsupervised methods have been developed to overcome these limitations. However, they often rely on tedious parameter tuning and perform less well than supervised approaches, especially on data with high variability such as surgical trajectories. Hence, the potential of weak supervision could be to improve unsupervised learning while avoiding manual annotation of large datasets. In this paper, we used at a minimum one expert demonstration and its ground truth annotations to generate an appropriate initialization for a GMM-based algorithm for gesture recognition. We showed on real surgical demonstrations that the latter significantly outperforms standard task-agnostic initialization methods. We also demonstrated how to improve the recognition accuracy further by redefining the actions and optimising the inputs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper shows a simple way to seed GMMs for surgical gesture segmentation from one annotated expert demo, but the abstract gives no numbers, so the performance claim cannot be judged from the abstract alone.

The core contribution is using parameters from a single expert demonstration and its labels to initialize the means and covariances of a GMM instead of relying on random or k-means starts. They report that this beats standard task-agnostic initialization on real surgical trajectories and that further accuracy comes from redefining the gesture classes and choosing better input features. That is a practical, low-effort form of weak supervision tailored to a domain where full labeling is expensive.

The motivation is stated clearly: unsupervised methods are brittle on variable surgical data while full supervision is impractical. The engineering steps around action redefinition and input optimization are the kind of details that often matter more than the base model.

The soft spots are straightforward. The abstract asserts significant outperformance yet supplies no metrics, no baseline descriptions, no statistical tests, and no information on how the single expert demo was chosen or whether results hold across different surgeons. The stress-test concern about inter-surgeon variability is therefore hard to dismiss without seeing the experiments; if the chosen demo is atypical, the reported gains could be an artifact of the split rather than a reliable property of the initialization.

The work is aimed at people already working on robotic surgery skill assessment or automation. A reader in that niche could try the initialization trick quickly and see whether it helps on their own data. Outside that subfield the paper is too incremental and domain-specific to draw broader attention.

I would send it for review because the idea is easy to reproduce and the annotation-cost problem is real, but the authors must add the missing quantitative results and cross-expert checks before the claim can be evaluated.

Referee Report

2 major / 1 minor

Summary. The paper proposes initializing a GMM-based gesture segmentation algorithm for kinematic surgical trajectories using parameters derived from a minimum of one expert demonstration and its ground-truth annotations. It claims this weakly supervised initialization significantly outperforms standard task-agnostic methods on real surgical data and reports further accuracy gains from redefining actions and optimizing inputs.

Significance. If the empirical outperformance claim holds with proper validation, the method could meaningfully reduce annotation effort for surgical gesture recognition while handling trajectory variability better than fully unsupervised baselines. The work directly targets a practical bottleneck in surgical robotics and skill assessment.

major comments (2)

[Abstract] The central claim that the proposed initialization 'significantly outperforms standard task-agnostic initialization methods' on real surgical demonstrations is asserted without any reported metrics, baselines, statistical tests, number of demonstrations, or cross-validation details, preventing verification that the data supports the stated result.
[Method (initialization procedure)] The generalization assumption that parameters fit from a single annotated expert trajectory provide a reliable GMM initialization for other demonstrations is load-bearing for the weak-supervision claim, yet the manuscript supplies no cross-expert or cross-trial validation to address high inter-surgeon variability in timing, speed, and sub-gesture execution.

minor comments (1)

[Abstract] The abstract refers to 'redefining the actions and optimising the inputs' as sources of further improvement but does not specify the exact changes or their quantitative contribution.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and indicate where revisions will be made to strengthen the manuscript.

Referee: [Abstract] The central claim that the proposed initialization 'significantly outperforms standard task-agnostic initialization methods' on real surgical demonstrations is asserted without any reported metrics, baselines, statistical tests, number of demonstrations, or cross-validation details, preventing verification that the data supports the stated result.

Authors: We agree that the abstract would be more verifiable with quantitative details. In the revision we will update the abstract to include the key accuracy metrics, number of demonstrations evaluated, baselines compared, and reference to the cross-validation procedure already described in the experiments section.

revision: yes

Referee: [Method (initialization procedure)] The generalization assumption that parameters fit from a single annotated expert trajectory provide a reliable GMM initialization for other demonstrations is load-bearing for the weak-supervision claim, yet the manuscript supplies no cross-expert or cross-trial validation to address high inter-surgeon variability in timing, speed, and sub-gesture execution.

Authors: The manuscript initializes the GMM from one expert demonstration and reports results on multiple real surgical demonstrations, showing consistent outperformance versus task-agnostic initialization. We will add a discussion subsection on cross-trial performance within the available dataset and explicitly acknowledge limitations due to inter-surgeon variability. Full cross-expert validation is not feasible with the current data.

revision: partial
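A cross-trial check of the kind discussed in this exchange could be organized as a seed-sensitivity sweep: each demonstration takes a turn as the single labeled seed and the resulting model is scored on all the others. The sketch below is only an outline of that protocol, not something taken from the manuscript. The callback `score_with_seed(seed_id, test_id)` is hypothetical and is assumed to return an accuracy in [0, 1] for a model seeded from one demonstration and evaluated on another.

```python
from statistics import mean, pstdev


def seed_sensitivity(demo_ids, score_with_seed):
    """Mean and spread of held-out accuracy for every choice of seed demo."""
    report = {}
    for seed_id in demo_ids:
        held_out = [d for d in demo_ids if d != seed_id]
        scores = [score_with_seed(seed_id, d) for d in held_out]
        report[seed_id] = (mean(scores), pstdev(scores))
    return report


def worst_seed(report):
    """Seed whose held-out mean accuracy is lowest (a possible atypical demo)."""
    return min(report, key=lambda seed_id: report[seed_id][0])
```

A wide gap between the best and worst seed would suggest that the reported gains depend on which demonstration was annotated. Grouping `demo_ids` by surgeon would turn the same loop into a cross-expert check where the data allow it.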

Circularity Check

0 steps flagged

No significant circularity; empirical comparison is self-contained

The paper describes a practical initialization procedure for GMM-based gesture segmentation that draws parameters from one annotated expert trajectory and then reports empirical accuracy gains versus standard task-agnostic initializers on held-out surgical demonstrations. No equations, uniqueness theorems, or predictions are presented that reduce by construction to the fitted inputs; the central claim rests on an external performance comparison rather than self-definition or self-citation chains. The method is therefore not circular under the stated criteria.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no information on free parameters, axioms, or invented entities.

pith-pipeline@v0.9.0 · 5719 in / 964 out tokens · 28844 ms · 2026-05-24T16:27:16.312294+00:00 · methodology

Reference graph

Works this paper leans on

40 extracted references · 40 canonical work pages · 1 internal anchor

[1] K. Moorthy, Y. Munz, A. Dosis, J. Hernandez, S. Martin, F. Bello, T. Rockall, and A. Darzi, “Dexterity enhancement with robotic surgery,” Surgical Endoscopy, vol. 18, no. 5, pp. 790–795, 2004

[2] C. E. Reiley and G. D. Hager, “Task versus subtask surgical skill evaluation of robotic minimally invasive surgery,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5761 LNCS, no. PART 1, pp. 435–442, 2009

[3] H. C. Lin, I. Shafran, D. Yuh, and G. D. Hager, “Towards automatic skill evaluation: Detection and segmentation of robot-assisted surgical motions,” Computer Aided Surgery, vol. 11, no. 5, pp. 220–230, 2006

[4] N. Ettehadi, S. Manaffam, and A. Behal, “Learning from demonstration: Generalization via task segmentation,” in IOP Conference Series: Materials Science and Engineering, vol. 261, p. 012001, IOP Publishing, 2017

[5] R. Fox, S. Krishnan, I. Stoica, and K. Goldberg, “Multi-level discovery of deep options,” arXiv preprint arXiv:1703.08294, 2017

[6] Y. Gao, S. S. Vedula, C. E. Reiley, N. Ahmidi, B. Varadarajan, H. C. Lin, L. Tao, L. Zappella, B. Béjar, D. D. Yuh, C. C. G. Chen, R. Vidal, S. Khudanpur, and G. D. Hager, “JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS): A Surgical Activity Dataset for Human Motion Modeling,” Modeling and Monitoring of Computer Assisted Interventions (M2CAI...

[7] C. Cao, C. MacKenzie, and S. Payandeh, “Task and motion analyses in endoscopic surgery,” in Proceedings ASME Dynamic Systems and Control Division, pp. 583–590, Citeseer, 1996

[8] S. Krishnan, A. Garg, S. Patil, C. Lea, G. Hager, P. Abbeel, and K. Goldberg, “Transition state clustering: Unsupervised surgical trajectory segmentation for robot learning,” The International Journal of Robotics Research, vol. 36, no. 13-14, pp. 1595–1618, 2017

[9] L. Tao, E. Elhamifar, S. Khudanpur, G. D. Hager, and R. Vidal, “Sparse hidden markov models for surgical gesture classification and skill evaluation,” in International conference on information processing in computer-assisted interventions, pp. 167–177, Springer, 2012

[10] B. Varadarajan, C. Reiley, H. Lin, S. Khudanpur, and G. Hager, “Data-derived models for segmentation with application to surgical assessment and training,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5761 LNCS, no. PART 1, pp. 426–434, 2009

[11] L. Tao, L. Zappella, G. D. Hager, and R. Vidal, “Surgical gesture segmentation and recognition,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8151 LNCS, no. PART 3, pp. 339–346, 2013

[12] E. Mavroudi, D. Bhaskara, S. Sefati, H. Ali, and R. Vidal, “End-to-end fine-grained action segmentation and recognition using conditional random field models and discriminative sparse coding,” in 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1558–1567, IEEE, 2018

[13] A. P. Twinanda, S. Shehata, D. Mutter, J. Marescaux, M. De Mathelin, and N. Padoy, “EndoNet: A Deep Architecture for Recognition Tasks on Laparoscopic Videos,” IEEE Transactions on Medical Imaging, vol. 36, no. 1, pp. 86–97, 2017

[14] C. Lea, R. Vidal, and G. D. Hager, “Learning convolutional action primitives for fine-grained action recognition,” Proceedings - IEEE International Conference on Robotics and Automation, vol. 2016-June, pp. 1642–1649, 2016

[15] C. L. B, A. Reiter, and G. D. Hager, “Temporal Convolutional Networks: A Unified Approach to Action Segmentation,” vol. 9915, pp. 47–54, 2016

[16] F. Despinoy, D. Bouget, G. Forestier, C. Penet, N. Zemiti, P. Poignet, and P. Jannin, “Unsupervised Trajectory Segmentation for Surgical Gesture Recognition in Robotic Training,” IEEE Transactions on Biomedical Engineering, vol. 63, no. 6, pp. 1280–1291, 2016

[17] M. J. Fard, S. Ameri, R. B. Chinnam, and R. D. Ellis, “Soft Boundary Approach for Unsupervised Gesture Segmentation in Robotic-Assisted Surgery,” IEEE Robotics and Automation Letters, vol. 2, no. 1, pp. 171–178, 2017

[18] J. Blömer and K. Bujna, “Simple methods for initializing the em algorithm for gaussian mixture models,” CoRR, 2013

[19] N. Ahmidi, L. Tao, S. Sefati, Y. Gao, C. Lea, B. B. Haro, L. Zappella, S. Khudanpur, R. Vidal, and G. D. Hager, “A Dataset and Benchmarks for Segmentation and Recognition of Gestures in Robotic Surgery,” IEEE Transactions on Biomedical Engineering, vol. 64, no. 9, pp. 2025–2041, 2017

[20] S. H. Lee, I. H. Suh, S. Calinon, and R. Johansson, “Autonomous framework for segmenting robot trajectories of manipulation task,” Autonomous Robots, vol. 38, no. 2, pp. 107–141, 2014

[21] A. Murali, A. Garg, S. Krishnan, F. T. Pokorny, P. Abbeel, T. Darrell, and K. Goldberg, “TSC-DL: Unsupervised trajectory segmentation of multi-modal surgical demonstrations with Deep Learning,” Proceedings - IEEE International Conference on Robotics and Automation, vol. 2016-June, pp. 4150–4157, 2016

[22] A. Fod, M. J. Matarić, and O. C. Jenkins, “Automated derivation of primitives for movement classification,” Autonomous robots, vol. 12, no. 1, pp. 39–54, 2002

[23] B. Rohrer and N. Hogan, “Avoiding spurious submovement decompositions ii: a scattershot algorithm,” Biological cybernetics, vol. 94, no. 5, pp. 409–414, 2006

[24] R. Lioutikov, G. Neumann, G. Maeda, and J. Peters, “Learning movement primitive libraries through probabilistic segmentation,” The International Journal of Robotics Research, vol. 36, no. 8, pp. 879–894, 2017

[25] G. Quellec, K. Charrière, M. Lamard, Z. Droueche, C. Roux, B. Cochener, and G. Cazuguel, “Real-time recognition of surgical tasks in eye surgery videos,” Medical image analysis, vol. 18, no. 3, pp. 579–590, 2014

[26] N. Padoy, T. Blum, S.-A. Ahmadi, H. Feussner, M.-O. Berger, and N. Navab, “Statistical modeling and recognition of surgical workflow,” Medical image analysis, vol. 16, no. 3, pp. 632–641, 2012

[27] F. Lalys, L. Riffaud, D. Bouget, and P. Jannin, “An application-dependent framework for the recognition of high-level surgical tasks in the or,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 331–338, Springer, 2011

[28] P. Kazanzides, Z. Chen, A. Deguet, G. S. Fischer, R. H. Taylor, and S. P. DiMaio, “An open-source research kit for the da vinci® surgical system,” in 2014 IEEE international conference on robotics and automation (ICRA), pp. 6434–6439, IEEE, 2014

[29] T. M. Moldovan, S. Levine, M. I. Jordan, and P. Abbeel, “Optimism-Driven Exploration for Nonlinear Systems,” pp. 3239–3246, 2015

[30] T. K. Moon, “The expectation-maximization algorithm,” IEEE Signal processing magazine, vol. 13, no. 6, pp. 47–60, 1996

[31] A. Strehl and J. Ghosh, “Cluster ensembles—a knowledge reuse framework for combining multiple partitions,” Journal of machine learning research, vol. 3, no. Dec, pp. 583–617, 2002

[32] D. Arthur and S. Vassilvitskii, “k-means++: The advantages of careful seeding,” in Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, pp. 1027–1035, Society for Industrial and Applied Mathematics, 2007

[33] P. S. Bradley and U. M. Fayyad, “Refining initial points for k-means clustering.,” in ICML, vol. 98, pp. 91–99, Citeseer, 1998

[34] L. V. D. Maaten and G. Hinton, “Visualizing Data using t-SNE,” Journal of Machine Learning Research 1, vol. 620, no. 1, pp. 267–84, 2008

[35] X. Du, T. Kurmann, P.-L. Chang, M. Allan, S. Ourselin, R. Sznitman, J. D. Kelly, and D. Stoyanov, “Articulated multi-instrument 2-d pose estimation using fully convolutional networks,” IEEE transactions on medical imaging, vol. 37, no. 5, pp. 1276–1287, 2018

[36] M. Allan, S. Ourselin, D. J. Hawkes, J. D. Kelly, and D. Stoyanov, “3-d pose estimation of articulated instruments in robotic minimally invasive surgery,” IEEE transactions on medical imaging, vol. 37, no. 5, pp. 1204–1213, 2018

[37] S. Calinon, D. Florent, E. L. Sauser, D. G. Caldwell, and A. G. Billard, “An approach based on Hidden Markov Model and Gaussian Mixture Regression,” IEEE Robotics and Automation Magazine, vol. 17, pp. 44–45, 2010

[38] H. Tang, M. Hasegawa-Johnson, and T. S. Huang, “Toward robust learning of the gaussian mixture state emission densities for hidden markov models,” Audio, pp. 5242–5245, 2010

[39] C. Loukas and E. Georgiou, “Surgical workflow analysis with Gaussian mixture multivariate autoregressive (GMMAR) models: A simulation study,” Computer Aided Surgery, vol. 18, no. 3-4, pp. 47–62, 2013

[40] D. Sarikaya, J. J. Corso, and K. A. Guru, “Detection and localization of robotic tools in robot-assisted surgery videos using deep neural networks for region proposal and detection,” IEEE Transactions on Medical Imaging, vol. 36, pp. 1542–1549, July 2017

[1] [1]

Dexterity enhancement with robotic surgery,

K. Moorthy, Y . Munz, A. Dosis, J. Hernandez, S. Martin, F. Bello, T. Rockall, and A. Darzi, “Dexterity enhancement with robotic surgery,” Surgical Endoscopy, vol. 18, no. 5, pp. 790–795, 2004

work page 2004

[2] [2]

Task versus subtask surgical skill evaluation of robotic minimally invasive surgery,

C. E. Reiley and G. D. Hager, “Task versus subtask surgical skill evaluation of robotic minimally invasive surgery,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artiﬁcial Intelligence and Lecture Notes in Bioinformatics) , vol. 5761 LNCS, no. PART 1, pp. 435–442, 2009

work page 2009

[3] [3]

Towards automatic skill evaluation: Detection and segmentation of robot-assisted surgical motions,

H. C. Lin, I. Shafran, D. Yuh, and G. D. Hager, “Towards automatic skill evaluation: Detection and segmentation of robot-assisted surgical motions,” Computer Aided Surgery, vol. 11, no. 5, pp. 220–230, 2006

work page 2006

[4] [4]

Learning from demon- stration: Generalization via task segmentation,

N. Ettehadi, S. Manaffam, and A. Behal, “Learning from demon- stration: Generalization via task segmentation,” in IOP Conference Series: Materials Science and Engineering , vol. 261, p. 012001, IOP Publishing, 2017

work page 2017

[5] [5]

Multi-Level Discovery of Deep Options

R. Fox, S. Krishnan, I. Stoica, and K. Goldberg, “Multi-level discovery of deep options,” arXiv preprint arXiv:1703.08294 , 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[6] [6]

JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS): A Surgical Activity Dataset for Human Motion Modeling,

Y . Gao, S. S. Vedula, C. E. Reiley, N. Ahmidi, B. Varadarajan, H. C. Lin, L. Tao, L. Zappella, B. B ´ejar, D. D. Yuh, C. C. G. Chen, R. Vidal, S. Khudanpur, and G. D. Hager, “JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS): A Surgical Activity Dataset for Human Motion Modeling,” Modeling and Monitoring of Computer Assisted Interventions (M2CAI...

work page 2014

[7] [7]

Task and motion analyses in endoscopic surgery,

C. Cao, C. MacKenzie, and S. Payandeh, “Task and motion analyses in endoscopic surgery,” in Proceedings ASME Dynamic Systems and Control Division, pp. 583–590, Citeseer, 1996

work page 1996

[8] [8]

Transition state clustering: Unsupervised surgical tra- jectory segmentation for robot learning,

S. Krishnan, A. Garg, S. Patil, C. Lea, G. Hager, P. Abbeel, and K. Goldberg, “Transition state clustering: Unsupervised surgical tra- jectory segmentation for robot learning,” The International Journal of Robotics Research, vol. 36, no. 13-14, pp. 1595–1618, 2017

work page 2017

[9] [9]

Sparse hidden markov models for surgical gesture classiﬁcation and skill evaluation,

L. Tao, E. Elhamifar, S. Khudanpur, G. D. Hager, and R. Vidal, “Sparse hidden markov models for surgical gesture classiﬁcation and skill evaluation,” in International conference on information processing in computer-assisted interventions, pp. 167–177, Springer, 2012

work page 2012

[10] [10]

Data- derived models for segmentation with application to surgical assess- ment and training,

B. Varadarajan, C. Reiley, H. Lin, S. Khudanpur, and G. Hager, “Data- derived models for segmentation with application to surgical assess- ment and training,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artiﬁcial Intelligence and Lecture Notes in Bioinformatics), vol. 5761 LNCS, no. PART 1, pp. 426–434, 2009

work page 2009

[11] [11]

Surgical gesture segmentation and recognition,

L. Tao, L. Zappella, G. D. Hager, and R. Vidal, “Surgical gesture segmentation and recognition,” Lecture Notes in Computer Science (in- cluding subseries Lecture Notes in Artiﬁcial Intelligence and Lecture Notes in Bioinformatics), vol. 8151 LNCS, no. PART 3, pp. 339–346, 2013

work page 2013

[12] [12]

End-to- end ﬁne-grained action segmentation and recognition using conditional random ﬁeld models and discriminative sparse coding,

E. Mavroudi, D. Bhaskara, S. Sefati, H. Ali, and R. Vidal, “End-to- end ﬁne-grained action segmentation and recognition using conditional random ﬁeld models and discriminative sparse coding,” in 2018 IEEE Winter Conference on Applications of Computer Vision (WACV) , pp. 1558–1567, IEEE, 2018

work page 2018

[13] [13]

EndoNet: A Deep Architecture for Recognition Tasks on Laparoscopic Videos,

A. P. Twinanda, S. Shehata, D. Mutter, J. Marescaux, M. De Mathelin, and N. Padoy, “EndoNet: A Deep Architecture for Recognition Tasks on Laparoscopic Videos,” IEEE Transactions on Medical Imaging , vol. 36, no. 1, pp. 86–97, 2017

work page 2017

[14] [14]

Learning convolutional action primitives for ﬁne-grained action recognition,

C. Lea, R. Vidal, and G. D. Hager, “Learning convolutional action primitives for ﬁne-grained action recognition,” Proceedings - IEEE International Conference on Robotics and Automation, vol. 2016-June, pp. 1642–1649, 2016

work page 2016

[15] [15]

Temporal Convolutional Networks: A Uniﬁed Approach to Action Segmentation,

C. L. B, A. Reiter, and G. D. Hager, “Temporal Convolutional Networks: A Uniﬁed Approach to Action Segmentation,” vol. 9915, pp. 47–54, 2016

work page 2016

[16] [16]

Unsupervised Trajectory Segmentation for Surgical Gesture Recognition in Robotic Training,

F. Despinoy, D. Bouget, G. Forestier, C. Penet, N. Zemiti, P. Poignet, and P. Jannin, “Unsupervised Trajectory Segmentation for Surgical Gesture Recognition in Robotic Training,” IEEE Transactions on Biomedical Engineering, vol. 63, no. 6, pp. 1280–1291, 2016

work page 2016

[17] [17]

Soft Boundary Approach for Unsupervised Gesture Segmentation in Robotic-Assisted Surgery,

M. J. Fard, S. Ameri, R. B. Chinnam, and R. D. Ellis, “Soft Boundary Approach for Unsupervised Gesture Segmentation in Robotic-Assisted Surgery,” IEEE Robotics and Automation Letters , vol. 2, no. 1, pp. 171–178, 2017

work page 2017

[18] [18]

Simple methods for initializing the em algorithm for gaussian mixture models,

J. Bl ¨omer and K. Bujna, “Simple methods for initializing the em algorithm for gaussian mixture models,” CoRR, 2013

work page 2013

[19] [19]

A Dataset and Benchmarks for Segmentation and Recognition of Gestures in Robotic Surgery,

N. Ahmidi, L. Tao, S. Sefati, Y . Gao, C. Lea, B. B. Haro, L. Zap- pella, S. Khudanpur, R. Vidal, and G. D. Hager, “A Dataset and Benchmarks for Segmentation and Recognition of Gestures in Robotic Surgery,” IEEE Transactions on Biomedical Engineering , vol. 64, no. 9, pp. 2025–2041, 2017

work page 2017

[20] [20]

Autonomous framework for segmenting robot trajectories of manipulation task,

S. H. Lee, I. H. Suh, S. Calinon, and R. Johansson, “Autonomous framework for segmenting robot trajectories of manipulation task,” Autonomous Robots, vol. 38, no. 2, pp. 107–141, 2014

work page 2014

[21] [21]

TSC-DL: Unsupervised trajectory segmentation of multi-modal surgical demonstrations with Deep Learning,

A. Murali, A. Garg, S. Krishnan, F. T. Pokorny, P. Abbeel, T. Darrell, and K. Goldberg, “TSC-DL: Unsupervised trajectory segmentation of multi-modal surgical demonstrations with Deep Learning,” Proceedings - IEEE International Conference on Robotics and Automation, vol. 2016-June, pp. 4150–4157, 2016

work page 2016

[22] [22]

Automated derivation of primitives for movement classification,

A. Fod, M. J. Matarić, and O. C. Jenkins, “Automated derivation of primitives for movement classification,” Autonomous robots, vol. 12, no. 1, pp. 39–54, 2002

work page 2002

[23] [23]

Avoiding spurious submovement decompositions ii: a scattershot algorithm,

B. Rohrer and N. Hogan, “Avoiding spurious submovement decompositions ii: a scattershot algorithm,” Biological cybernetics, vol. 94, no. 5, pp. 409–414, 2006

work page 2006

[24] [24]

Learning movement primitive libraries through probabilistic segmentation,

R. Lioutikov, G. Neumann, G. Maeda, and J. Peters, “Learning movement primitive libraries through probabilistic segmentation,” The International Journal of Robotics Research, vol. 36, no. 8, pp. 879–894, 2017

work page 2017

[25] [25]

Real-time recognition of surgical tasks in eye surgery videos,

G. Quellec, K. Charrière, M. Lamard, Z. Droueche, C. Roux, B. Cochener, and G. Cazuguel, “Real-time recognition of surgical tasks in eye surgery videos,” Medical image analysis, vol. 18, no. 3, pp. 579–590, 2014

work page 2014

[26] [26]

Statistical modeling and recognition of surgical workflow,

N. Padoy, T. Blum, S.-A. Ahmadi, H. Feussner, M.-O. Berger, and N. Navab, “Statistical modeling and recognition of surgical workflow,” Medical image analysis, vol. 16, no. 3, pp. 632–641, 2012

work page 2012

[27] [27]

An application-dependent framework for the recognition of high-level surgical tasks in the or,

F. Lalys, L. Riffaud, D. Bouget, and P. Jannin, “An application-dependent framework for the recognition of high-level surgical tasks in the or,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 331–338, Springer, 2011

work page 2011

[28] [28]

An open-source research kit for the da Vinci® surgical system,

P. Kazanzides, Z. Chen, A. Deguet, G. S. Fischer, R. H. Taylor, and S. P. DiMaio, “An open-source research kit for the da Vinci® surgical system,” in 2014 IEEE international conference on robotics and automation (ICRA), pp. 6434–6439, IEEE, 2014

work page 2014

[29] [29]

Optimism-Driven Exploration for Nonlinear Systems,

T. M. Moldovan, S. Levine, M. I. Jordan, and P. Abbeel, “Optimism-Driven Exploration for Nonlinear Systems,” pp. 3239–3246, 2015

work page 2015

[30] [30]

The expectation-maximization algorithm,

T. K. Moon, “The expectation-maximization algorithm,” IEEE Signal processing magazine, vol. 13, no. 6, pp. 47–60, 1996

work page 1996

[31] [31]

Cluster ensembles—a knowledge reuse framework for combining multiple partitions,

A. Strehl and J. Ghosh, “Cluster ensembles—a knowledge reuse framework for combining multiple partitions,” Journal of machine learning research, vol. 3, no. Dec, pp. 583–617, 2002

work page 2002

[32] [32]

k-means++: The advantages of careful seeding,

D. Arthur and S. Vassilvitskii, “k-means++: The advantages of careful seeding,” in Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, pp. 1027–1035, Society for Industrial and Applied Mathematics, 2007

work page 2007

[33] [33]

Refining initial points for k-means clustering.,

P. S. Bradley and U. M. Fayyad, “Refining initial points for k-means clustering.,” in ICML, vol. 98, pp. 91–99, Citeseer, 1998

work page 1998

[34] [34]

Visualizing Data using t-SNE,

L. V. D. Maaten and G. Hinton, “Visualizing Data using t-SNE,” Journal of Machine Learning Research 1, vol. 620, no. 1, pp. 267–84, 2008

work page 2008

[35] [35]

Articulated multi-instrument 2-d pose estimation using fully convolutional networks,

X. Du, T. Kurmann, P.-L. Chang, M. Allan, S. Ourselin, R. Sznitman, J. D. Kelly, and D. Stoyanov, “Articulated multi-instrument 2-d pose estimation using fully convolutional networks,” IEEE transactions on medical imaging, vol. 37, no. 5, pp. 1276–1287, 2018

work page 2018

[36] [36]

3-d pose estimation of articulated instruments in robotic minimally invasive surgery,

M. Allan, S. Ourselin, D. J. Hawkes, J. D. Kelly, and D. Stoyanov, “3-d pose estimation of articulated instruments in robotic minimally invasive surgery,” IEEE transactions on medical imaging, vol. 37, no. 5, pp. 1204–1213, 2018

work page 2018

[37] [37]

An approach based on Hidden Markov Model and Gaussian Mixture Regression,

S. Calinon, D. Florent, E. L. Sauser, D. G. Caldwell, and A. G. Billard, “An approach based on Hidden Markov Model and Gaussian Mixture Regression,” IEEE Robotics and Automation Magazine, vol. 17, pp. 44–45, 2010

work page 2010

[38] [38]

Toward robust learning of the gaussian mixture state emission densities for hidden markov models,

H. Tang, M. Hasegawa-Johnson, and T. S. Huang, “Toward robust learning of the gaussian mixture state emission densities for hidden markov models,” Audio, pp. 5242–5245, 2010

work page 2010

[39] [39]

Surgical workflow analysis with Gaussian mixture multivariate autoregressive (GMMAR) models: A simulation study,

C. Loukas and E. Georgiou, “Surgical workflow analysis with Gaussian mixture multivariate autoregressive (GMMAR) models: A simulation study,” Computer Aided Surgery, vol. 18, no. 3-4, pp. 47–62, 2013

work page 2013

[40] [40]

Detection and localization of robotic tools in robot-assisted surgery videos using deep neural networks for region proposal and detection,

D. Sarikaya, J. J. Corso, and K. A. Guru, “Detection and localization of robotic tools in robot-assisted surgery videos using deep neural networks for region proposal and detection,” IEEE Transactions on Medical Imaging, vol. 36, pp. 1542–1549, July 2017

work page 2017