Rendezvous in time: an attention-based temporal fusion approach for surgical triplet recognition

Cited by: 6
Authors
Sharma, Saurav [1 ]
Nwoye, Chinedu Innocent [1 ]
Mutter, Didier [2 ,3 ]
Padoy, Nicolas [1 ,2 ]
Affiliations
[1] Univ Strasbourg, ICube, CNRS, Strasbourg, France
[2] IHU Strasbourg, Strasbourg, France
[3] Univ Hosp Strasbourg, Strasbourg, France
Keywords
Surgical triplet recognition; Laparoscopic surgery; Temporal modeling; Action triplet; Attention model
DOI
10.1007/s11548-023-02914-1
Chinese Library Classification
R318 [Biomedical Engineering]
Subject Classification Code
0831
Abstract
Purpose: One of the recent advances in surgical AI is the recognition of surgical activities as triplets of (instrument, verb, target). Although they provide detailed information for computer-assisted intervention, current triplet recognition approaches rely only on single-frame features. Exploiting temporal cues from earlier frames would improve the recognition of surgical action triplets from videos.
Methods: In this paper, we propose Rendezvous in Time (RiT), a deep learning model that extends the state-of-the-art model, Rendezvous, with temporal modeling. Focusing more on the verbs, RiT explores the connectedness of current and past frames to learn temporal attention-based features for enhanced triplet recognition.
Results: We validate our proposal on the challenging surgical triplet dataset, CholecT45, demonstrating improved recognition of the verb and the triplet, along with other interactions involving the verb such as (instrument, verb). Qualitative results show that RiT produces smoother predictions for most triplet instances than state-of-the-art methods.
Conclusion: We present a novel attention-based approach that leverages the temporal fusion of video frames to model the evolution of surgical actions and exploit their benefits for surgical triplet recognition.
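To make the Methods description concrete, the following is a minimal, hypothetical PyTorch sketch of attention-based temporal fusion: the current frame's feature queries the features of past frames via cross-attention, and the fused feature drives a multi-label triplet classifier. The dimensions, layer choices, and residual fusion scheme here are illustrative assumptions, not the published RiT architecture (see the paper at the DOI above for the actual design).

import torch
import torch.nn as nn

class TemporalAttentionFusion(nn.Module):
    # Illustrative sketch only; feat_dim, num_heads, and the residual
    # fusion scheme are assumptions for demonstration, not the authors' RiT.
    def __init__(self, feat_dim: int = 256, num_heads: int = 8, num_classes: int = 100):
        super().__init__()
        # The current frame's feature serves as the attention query;
        # past-frame features serve as keys and values.
        self.cross_attn = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(feat_dim)
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, current: torch.Tensor, past: torch.Tensor) -> torch.Tensor:
        # current: (B, D) feature of the frame to classify
        # past:    (B, T, D) features of the T preceding frames
        query = current.unsqueeze(1)                      # (B, 1, D)
        attended, _ = self.cross_attn(query, past, past)  # attend over time
        fused = self.norm(query + attended).squeeze(1)    # residual fusion -> (B, D)
        return self.classifier(fused)                     # one logit per triplet class

# Example usage: 100 triplet classes (as defined in CholecT45), 5 past frames.
model = TemporalAttentionFusion(feat_dim=256, num_heads=8, num_classes=100)
current_feat = torch.randn(2, 256)        # batch of 2 current-frame features
past_feats = torch.randn(2, 5, 256)       # 5 past-frame features per sample
logits = model(current_feat, past_feats)  # shape (2, 100)

The design intuition matches the abstract: attending over earlier frames lets the verb prediction draw on how the action evolves, rather than on a single frame's appearance alone.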
Pages: 1053-1059
Page count: 7