SAGES consensus recommendations on an annotation framework for surgical video

  • Consensus Statement
  • Published in Surgical Endoscopy

Abstract

Background

The growing interest in the analysis of surgical video through machine learning has led to increased research effort; however, common methods of annotating video data are lacking. Recommendations on the annotation of surgical video data are needed to enable the assessment of algorithms and multi-institutional collaboration.

Methods

Four working groups were formed from a pool of participants that included clinicians, engineers, and data scientists. The working groups focused on four themes: (1) temporal models, (2) actions and tasks, (3) tissue characteristics and general anatomy, and (4) software and data structure. A modified Delphi process was used to create a consensus survey based on the recommendations suggested by each working group.
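The tallying step of a Delphi round is simple to automate. Below is a minimal Python sketch of scoring one survey item against a fixed agreement threshold; the `AGREEMENT_THRESHOLD` value and the Likert-style response options are illustrative assumptions, not the study's actual consensus criteria.

```python
from collections import Counter

AGREEMENT_THRESHOLD = 0.80  # hypothetical cutoff; the study's actual criterion may differ

def reaches_consensus(responses: list[str]) -> bool:
    """Return True when a single response option meets the agreement threshold."""
    if not responses:
        return False
    counts = Counter(responses)
    _, top_count = counts.most_common(1)[0]  # most frequent response and its tally
    return top_count / len(responses) >= AGREEMENT_THRESHOLD

# Example round: 9 of 10 panelists choose "agree", so the item passes.
votes = ["agree"] * 9 + ["disagree"]
print(reaches_consensus(votes))  # True
```

Items that fail a round would be revised and recirculated in the next round, which is what distinguishes Delphi iteration from a one-shot survey.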

Results

After three Delphi rounds, consensus was reached on recommendations for annotation within each of these domains. A hierarchy for annotation of temporal events in surgery was established.
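To make the idea of a hierarchy of temporal events concrete, here is a minimal Python sketch of nested, time-stamped annotation segments. The `Segment` class, its field names, and the phase/step labels are illustrative assumptions and not the consensus schema itself.

```python
from dataclasses import dataclass, field

@dataclass
class Segment:
    """One temporal annotation; nesting yields a phase -> step -> task hierarchy."""
    label: str                 # e.g., "dissection" or "expose cystic duct"
    start_s: float             # start time, seconds from video start
    end_s: float               # end time, seconds from video start
    children: list["Segment"] = field(default_factory=list)

    def add(self, child: "Segment") -> None:
        # Finer-grained events must nest inside their parent's interval.
        assert self.start_s <= child.start_s <= child.end_s <= self.end_s
        self.children.append(child)

# Example: an operative phase containing one constituent step.
phase = Segment("dissection", start_s=120.0, end_s=900.0)
phase.add(Segment("expose cystic duct", start_s=150.0, end_s=400.0))
```

A nested interval structure like this lets the same video carry coarse labels for phase-recognition benchmarks and fine labels for action recognition without duplicating the timeline.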

Conclusions

While additional work remains to achieve accepted standards for video annotation in surgery, the consensus recommendations on a general framework for annotation presented here lay the foundation for standardization. This type of framework is critical to enabling diverse datasets, performance benchmarks, and collaboration.



Acknowledgements

The authors thank SAGES staff Sallie Matthews, Jillian Kelly, Jason Levine, and Shelley Ginsberg for their administrative support of this work. We also thank Dr. Aurora Pryor for her support as part of SAGES leadership. The SAGES Video Annotation for AI Working Groups include all members listed in Table 1.

Funding

This work was supported by the SAGES Foundation, Digital Surgery, Imagestream, Intuitive Surgical, Johnson & Johnson CSATS, Karl Storz, Medtronic, Olympus, Stryker, Theator, and Verb Surgical.

Author information

Corresponding authors

Correspondence to Ozanan R. Meireles or Daniel A. Hashimoto.

Ethics declarations

Disclosures

Ozanan Meireles is a consultant for Olympus and Medtronic and has received research support from Olympus. Guy Rosman is an employee of Toyota Research Institute (TRI); the views expressed in this paper do not reflect those of TRI or any other Toyota entity. He has received research support from Olympus. Amin Madani is a consultant for Activ Surgical. Gregory Hager is a consultant for theator.io and has an equity interest in the company. Nicolas Padoy is a consultant for Caresyntax and has received research support from Intuitive Surgical. Thomas Ward has received research support from Olympus. Daniel Hashimoto is a consultant for Johnson & Johnson and Verily Life Sciences. He has received research support from Olympus and the Intuitive Foundation. Maria S. Altieri, Lawrence Carin, Carla M. Pugh and Patricia Sylla have no conflicts of interest or financial ties to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The SAGES Video Annotation for AI Working Groups are listed in Table 1.



About this article


Cite this article

Meireles, O.R., Rosman, G., Altieri, M.S. et al. SAGES consensus recommendations on an annotation framework for surgical video. Surg Endosc 35, 4918–4929 (2021). https://doi.org/10.1007/s00464-021-08578-9
