A Fusion Deep Learning Model of ResNet and Vision Transformer for 3D CT Images

Cited by: 1

Authors:

Liu, Chiyu [1,2]

Sun, Cunjie [1,3]

Affiliations:

[1] Xuzhou Med Univ, Dept Med Imaging, Xuzhou 221004, Peoples R China

[2] First Peoples Hosp Xuzhou, Imaging Ctr, Xuzhou 221002, Peoples R China

[3] Xuzhou Med Univ, Affiliated Hosp, Informat Dept, Xuzhou 221006, Peoples R China

Source:

IEEE ACCESS | 2024 / Vol. 12

Keywords:

Deep learning; fusion model; 3D CT images; COVID-19; ResNet 3D; Video Swin Transformer

DOI:

10.1109/ACCESS.2024.3423689

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

The outbreak of COVID-19 has had a serious impact on the safety of human life and property. Rapid and effective diagnosis is key to the prevention and treatment of the disease. In this study, we introduce a new fusion model called "Reswin", which was trained on 3D CT data to diagnose COVID-19. The model combines two mainstream computer vision models, ResNet 3D (a convolutional neural network) and Video Swin Transformer (a vision transformer neural network), whose outputs are fused by soft voting. We compared the proposed Reswin model with ResNet 3D-50, Swin-T, MViT, R(2+1)D-50, SlowFast-50, X3D, and CSN101, which are state-of-the-art deep learning models for 3D image classification. The Reswin model achieved an accuracy of 0.9099, precision of 0.9266, F1 score of 0.9425, AUC of 0.9541, and AUPR of 0.9861 in binary classification, and an accuracy of 0.8655, precision of 0.8580, and F1 score of 0.8620 in three-class classification. Reswin provides a new solution for 3D CT image classification tasks and new ideas for the development of deep learning in 3D medical imaging.
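The abstract states that the two branches are combined by soft voting but gives no further detail. As a rough illustration of that general technique only (not the authors' implementation), the sketch below averages the class probabilities produced by two models and picks the class with the highest fused probability. The function names `soft_vote` and `predict`, the equal default weights, and the example probabilities are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def soft_vote(
    model_probs: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Return the weighted average of per-model class probabilities."""
    if not model_probs:
        raise ValueError("need probabilities from at least one model")
    n_classes = len(model_probs[0])
    if any(len(p) != n_classes for p in model_probs):
        raise ValueError("all models must score the same set of classes")
    if weights is None:
        # Assumption: every model gets the same weight.
        weights = [1.0] * len(model_probs)
    if len(weights) != len(model_probs):
        raise ValueError("need exactly one weight per model")
    total = float(sum(weights))
    return [
        sum(w * p[c] for w, p in zip(weights, model_probs)) / total
        for c in range(n_classes)
    ]


def predict(fused: Sequence[float]) -> int:
    """Return the index of the class with the highest fused probability."""
    return max(range(len(fused)), key=lambda c: fused[c])


# Hypothetical softmax outputs for one CT volume
# (class 0 = non-COVID, class 1 = COVID); not values from the paper.
resnet3d_probs = [0.40, 0.60]
swin_probs = [0.70, 0.30]

fused = soft_vote([resnet3d_probs, swin_probs])
label = predict(fused)
print(fused, label)
```

In this made-up case the two branches disagree, and the more confident one decides the fused label. That is the main difference from hard (majority) voting, which would only count the two conflicting votes.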


Pages: 93389-93397

Page count: 9
