Interactive Surgical Training in Neuroendoscopy: Real-Time Anatomical Feature Localization Using Natural Language Expressions

Cited by: 2

Authors:

Matasyoh, Nevin M. ^{[1]}

Schmidt, Ruediger ^{[2,3]}

Zeineldin, Ramy A. ^{[1]}

Spetzger, Uwe ^{[4,5]}

Mathis-Ullrich, Franziska ^{[1]}

Affiliations:

[1] Friedrich Alexander Univ Erlangen Nurnberg, Dept Artificial Intelligence Biomed Engn, D-91052 Erlangen, Germany

[2] Klinikum Karlsruhe, Dept Neurosurg, Karlsruhe, Germany

[3] Klin Hirslanden, Ctr Endoscop & Minimally Invas Neurosurg, Zurich, Switzerland

[4] Karlsruhe Inst Technol, Inst Anthropomat & Robot, Karlsruhe, Germany

[5] Dept Neurosurg, Aachen, Germany

Source:

IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING | 2024 / Vol. 71 / Issue 10

Keywords:

Surgery; Training; Neurosurgery; Biomedical imaging; Transformers; Visualization; Task analysis; Anatomical feature localization; endoscopic third ventriculostomy; feature fusion; multimodal deep learning; neuroendoscopy; surgical training; transformer;

DOI:

10.1109/TBME.2024.3405814

Chinese Library Classification (CLC) number:

R318 [Biomedical Engineering];

Discipline classification code:

0831 ;

Abstract:

Objective: This study addresses challenges in surgical education, particularly in neuroendoscopy, where the demand for an optimized workflow conflicts with the need for trainees to participate actively in surgeries. To overcome these challenges, we propose a framework that accurately identifies anatomical structures within images guided by language descriptions, facilitating authentic and interactive learning experiences in neuroendoscopy. Methods: Utilizing the encoder-decoder architecture of a conventional transformer, our framework processes multimodal inputs (images and language descriptions) to identify and localize features in neuroendoscopic images. We curate a dataset from recorded endoscopic third ventriculostomy (ETV) procedures for training and evaluation. Using evaluation metrics including "R@n," "IoU=theta," "mIoU," and top-1 accuracy, we systematically benchmark our framework against state-of-the-art methodologies. Results: The framework demonstrates excellent generalization, surpassing the compared methods with 93.67% accuracy and 76.08% mIoU on unseen data. It is also computationally faster than the other methods. Qualitative results affirm the framework's effectiveness in precisely localizing referred anatomical features within neuroendoscopic images. Conclusion: The framework's adeptness at localizing anatomical features using language descriptions positions it as a valuable tool for integration into future interactive clinical learning systems, enhancing surgical training in neuroendoscopy. Significance: The exemplary performance reinforces the framework's potential in enhancing surgical education, leading to improved skills and outcomes for trainees in neuroendoscopy.
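The abstract names IoU-based metrics without defining them here. As a rough illustration only, the sketch below computes intersection over union (IoU), mean IoU, and accuracy at an IoU threshold for axis-aligned bounding boxes. The box format `(x1, y1, x2, y2)`, the function names, and the default threshold of 0.5 are assumptions for this example and are not taken from the paper, which may evaluate on a different output representation (for instance, masks) and different thresholds.

```python
from typing import Sequence, Tuple

# Assumed box format for illustration: (x1, y1, x2, y2) with x1 <= x2, y1 <= y2.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_iou(preds: Sequence[Box], targets: Sequence[Box]) -> float:
    """Average IoU over paired predictions and ground-truth boxes."""
    ious = [box_iou(p, t) for p, t in zip(preds, targets)]
    return sum(ious) / len(ious) if ious else 0.0


def accuracy_at_iou(
    preds: Sequence[Box], targets: Sequence[Box], threshold: float = 0.5
) -> float:
    """Fraction of predictions whose IoU meets the (assumed) threshold."""
    ious = [box_iou(p, t) for p, t in zip(preds, targets)]
    hits = sum(1 for v in ious if v >= threshold)
    return hits / len(ious) if ious else 0.0
```

For example, two 2x2 boxes offset by one unit in each direction overlap in a 1x1 square, so their IoU is 1 / (4 + 4 - 1) = 1/7.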


Pages: 2991-2999

Number of pages: 9
