Interactive Surgical Training in Neuroendoscopy: Real-Time Anatomical Feature Localization Using Natural Language Expressions

Cited by: 2
Authors
Matasyoh, Nevin M. [1]
Schmidt, Ruediger [2, 3]
Zeineldin, Ramy A. [1]
Spetzger, Uwe [4, 5]
Mathis-Ullrich, Franziska [1]
Affiliations
[1] Friedrich Alexander Univ Erlangen Nurnberg, Dept Artificial Intelligence Biomed Engn, D-91052 Erlangen, Germany
[2] Klinikum Karlsruhe, Dept Neurosurg, Karlsruhe, Germany
[3] Klin Hirslanden, Ctr Endoscop & Minimally Invas Neurosurg, Zurich, Switzerland
[4] Karlsruhe Inst Technol, Inst Anthropomat & Robot, Karlsruhe, Germany
[5] Dept Neurosurg, Aachen, Germany
Keywords
Surgery; Training; Neurosurgery; Biomedical imaging; Transformers; Visualization; Task analysis; Anatomical feature localization; endoscopic third ventriculostomy; feature fusion; multimodal deep learning; neuroendoscopy; surgical training; transformer;
DOI
10.1109/TBME.2024.3405814
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering];
Discipline classification code
0831;
Abstract
Objective: This study addresses challenges in surgical education, particularly in neuroendoscopy, where the demand for an optimized workflow conflicts with the need for trainees' active participation in surgeries. To overcome these challenges, we propose a framework that accurately identifies anatomical structures within images guided by language descriptions, facilitating authentic and interactive learning experiences in neuroendoscopy. Methods: Utilizing the encoder-decoder architecture of a conventional transformer, our framework processes multimodal inputs (images and language descriptions) to identify and localize features in neuroendoscopic images. We curate a dataset from recorded endoscopic third ventriculostomy (ETV) procedures for training and evaluation. Using evaluation metrics including "R@n," "IoU=theta," "mIoU," and top-1 accuracy, we systematically benchmark our framework against state-of-the-art methodologies. Results: The framework demonstrates excellent generalization, surpassing the compared methods with 93.67% accuracy and 76.08% mIoU on unseen data. It is also computationally faster than the compared methods. Qualitative results affirm the framework's effectiveness in precisely localizing referred anatomical features within neuroendoscopic images. Conclusion: The framework's adeptness at localizing anatomical features using language descriptions positions it as a valuable tool for integration into future interactive clinical learning systems, enhancing surgical training in neuroendoscopy. Significance: The exemplary performance reinforces the framework's potential in enhancing surgical education, leading to improved skills and outcomes for trainees in neuroendoscopy.
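The abstract names IoU-based metrics without defining them. As a rough illustration only, the sketch below shows how IoU, mIoU, and accuracy at an IoU threshold are commonly computed for localization tasks. It assumes predictions and ground truth are axis-aligned boxes in (x1, y1, x2, y2) form and that a prediction counts as correct when IoU >= theta. Both are assumptions, not details taken from the paper, and the function names are hypothetical.

```python
from typing import Sequence, Tuple

# Assumed box format (not stated in the abstract): (x1, y1, x2, y2)
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_iou(preds: Sequence[Box], targets: Sequence[Box]) -> float:
    """Average IoU over paired predictions and ground-truth boxes."""
    ious = [box_iou(p, t) for p, t in zip(preds, targets)]
    return sum(ious) / len(ious) if ious else 0.0


def accuracy_at_iou(
    preds: Sequence[Box], targets: Sequence[Box], theta: float = 0.5
) -> float:
    """Fraction of predictions whose IoU with the target is >= theta.

    theta = 0.5 is a common convention, assumed here for illustration.
    """
    ious = [box_iou(p, t) for p, t in zip(preds, targets)]
    return sum(iou >= theta for iou in ious) / len(ious) if ious else 0.0
```

With one exact match and one box that overlaps its target by a single unit square, `mean_iou` averages the two IoU values, while `accuracy_at_iou` counts only the exact match as a hit at theta = 0.5.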
Pages: 2991-2999
Number of pages: 9