TRandAugment: temporal random augmentation strategy for surgical activity recognition from videos

Cited by: 2

Authors:

Ramesh, Sanat ^{[1,2]}

Dall'Alba, Diego ^{[1]}

Gonzalez, Cristians ^{[3,5]}

Yu, Tong ^{[2]}

Mascagni, Pietro ^{[5,6]}

Mutter, Didier ^{[3,4,5]}

Marescaux, Jacques ^{[4]}

Fiorini, Paolo ^{[1]}

Padoy, Nicolas ^{[2,5]}

Affiliations:

[1] Univ Verona, Altair Robot Lab, I-37134 Verona, Italy

[2] Univ Strasbourg, CNRS, ICube, F-67000 Strasbourg, France

[3] Univ Hosp Strasbourg, F-67000 Strasbourg, France

[4] IRCAD, F-67000 Strasbourg, France

[5] IHU Strasbourg, Inst Image Guided Surg, F-67000 Strasbourg, France

[6] Fdn Policlin Univ Agostino Gemelli IRCCS, I-00168 Rome, Italy

Source:

INTERNATIONAL JOURNAL OF COMPUTER ASSISTED RADIOLOGY AND SURGERY | 2023 / Vol. 18 / Issue 09

Keywords:

Data augmentation; Temporal augmentation; Surgical activity recognition; Temporal convolutional networks; Gastric bypass procedures; Cataract procedures;

DOI:

10.1007/s11548-023-02864-8

Chinese Library Classification (CLC) number:

R318 [Biomedical Engineering];

Discipline classification code:

0831 ;

Abstract:

Purpose: Automatic recognition of surgical activities from intraoperative surgical videos is crucial for developing intelligent support systems for computer-assisted interventions. Current state-of-the-art recognition methods are based on deep learning, where data augmentation has shown the potential to improve generalization. This has spurred work on automated and simplified augmentation strategies for image classification and object detection on datasets of still images. Extending such augmentation methods to videos is not straightforward, as the temporal dimension needs to be considered. Furthermore, surgical videos pose additional challenges, as they are composed of multiple, interconnected, and long-duration activities.

Methods: This work proposes a new simplified augmentation method, called TRandAugment, specifically designed for long surgical videos. It treats each video as an assembly of temporal segments and applies consistent but random transformations to each segment. The proposed augmentation method is used to train an end-to-end spatiotemporal model consisting of a CNN (ResNet50) followed by a TCN.

Results: The effectiveness of the proposed method is demonstrated on two surgical video datasets, namely Bypass40 and CATARACTS, and two tasks, surgical phase and step recognition. TRandAugment adds a performance boost of 1-6% over previous state-of-the-art methods that use manually designed augmentations.

Conclusion: This work presents a simplified and automated augmentation method for long surgical videos. The proposed method has been validated on different datasets and tasks, indicating the importance of devising temporal augmentation methods for long surgical videos.
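
The segment-wise idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how segments are chosen, which transformations are used, or how their strength is set, so the equal-length split, the operation names in OPS, and the functions split_into_segments and plan_augmentations are all assumptions made for illustration. The sketch only shows the core property, that one randomly sampled transformation is applied consistently to every frame of a segment.

```python
import random

# Hypothetical transform names; the abstract does not list the actual operations.
OPS = ["identity", "rotate", "brightness", "contrast"]


def split_into_segments(num_frames, num_segments):
    """Split frame indices [0, num_frames) into contiguous, near-equal segments.

    Assumption: equal-length segments. The abstract does not specify the split.
    """
    if not 1 <= num_segments <= num_frames:
        raise ValueError("num_segments must be between 1 and num_frames")
    # Integer arithmetic keeps the bounds strictly increasing.
    bounds = [i * num_frames // num_segments for i in range(num_segments + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(num_segments)]


def plan_augmentations(num_frames, num_segments, magnitude, seed=None):
    """Return one (op, magnitude) pair per frame, constant within each segment."""
    rng = random.Random(seed)
    plan = []
    for start, end in split_into_segments(num_frames, num_segments):
        op = rng.choice(OPS)  # sampled once per segment, not once per frame
        plan.extend([(op, magnitude)] * (end - start))
    return plan


if __name__ == "__main__":
    # Example: a 10-frame clip split into 3 segments with a shared magnitude of 5.
    for frame_index, (op, mag) in enumerate(plan_augmentations(10, 3, 5, seed=0)):
        print(frame_index, op, mag)
```

In a real pipeline, each planned (op, magnitude) pair would be mapped to an image transformation and applied to the corresponding frame before the frames are fed to the CNN and TCN.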

Pages: 1665-1672

Page count: 8
