Automated efficient traffic gesture recognition using swin transformer-based multi-input deep network with radar images

Cited by: 0
Authors
Firat, Huseyin [1]
Uzen, Huseyin [2]
Atila, Orhan [3]
Sengur, Abdulkadir [4]
Affiliations
[1] Dicle Univ, Fac Engn, Dept Comp Engn, Diyarbakir, Turkiye
[2] Bingol Univ, Fac Engn & Architecture, Dept Comp Engn, Bingol, Turkiye
[3] Firat Univ, Technol Fac, Elect Elect Engn Dept, Elazig, Turkiye
[4] Firat Univ, Fac Technol, Dept Elect & Elect Engn, Elazig, Turkiye
Keywords
Deep learning; Radar images; Swin transformers; Traffic hand gesture
DOI
10.1007/s11760-024-03664-6
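Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract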
Radar-based artificial intelligence (AI) applications, ranging from fall detection to gesture recognition, have gained significant attention recently. The growing interest in this field has led to a shift towards deep convolutional networks, and transformers, which emerged to address limitations of convolutional neural network methods, have become increasingly popular in the AI community. In this paper, we present a novel hybrid approach for radar-based traffic hand gesture classification using transformers. Traffic hand gesture recognition (HGR) is important in AI applications, and our proposed three-phase approach addresses both the efficiency and the effectiveness of traffic HGR. In the first phase, feature vectors are extracted from the input radar images using the pre-trained DenseNet-121 model. These features are then concatenated to combine information from the different radar sensors, followed by a patch extraction operation. The concatenated features from all inputs are then processed by the Swin transformer block. The classification stage applies global average pooling, Dense, and Softmax layers in sequence. To assess the effectiveness of our method on the Ulm University radar dataset, we use several performance metrics, including accuracy, precision, recall, and F1-score, and achieve an average accuracy of 90.54%. We compare this score with existing approaches to demonstrate the competitiveness of the proposed method.
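The data flow described in the abstract (per-sensor features, concatenation, patch extraction, then pooling, Dense, and Softmax) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the DenseNet-121 backbone is replaced by random feature maps, the Swin transformer block is an identity stand-in, and the sensor count, map size, patch size, and class count are all hypothetical values chosen only to keep the example small.

```python
import math
import random


def concat_channels(feature_maps):
    """Concatenate several H x W x C feature maps along the channel axis."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    return [
        [sum((fm[i][j] for fm in feature_maps), []) for j in range(width)]
        for i in range(height)
    ]


def extract_patches(fmap, patch):
    """Split an H x W x C map into flattened, non-overlapping patch tokens."""
    height, width = len(fmap), len(fmap[0])
    tokens = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            token = []
            for i in range(top, top + patch):
                for j in range(left, left + patch):
                    token.extend(fmap[i][j])
            tokens.append(token)
    return tokens


def global_average_pool(tokens):
    """Average each feature dimension over all tokens."""
    count = len(tokens)
    return [sum(column) / count for column in zip(*tokens)]


def dense(vector, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weights, bias)
    ]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical sizes, not taken from the paper.
NUM_SENSORS, SIZE, CHANNELS, PATCH, NUM_CLASSES = 3, 4, 2, 2, 8
random.seed(0)


def random_map():
    """Stand-in for a backbone feature map (random values, not DenseNet)."""
    return [
        [[random.random() for _ in range(CHANNELS)] for _ in range(SIZE)]
        for _ in range(SIZE)
    ]


sensor_maps = [random_map() for _ in range(NUM_SENSORS)]
fused = concat_channels(sensor_maps)      # SIZE x SIZE x (sensors * channels)
tokens = extract_patches(fused, PATCH)    # one flattened token per patch
# A Swin transformer block would transform `tokens` here; identity stand-in.
pooled = global_average_pool(tokens)
weights = [[random.uniform(-0.1, 0.1) for _ in pooled] for _ in range(NUM_CLASSES)]
bias = [0.0] * NUM_CLASSES
probs = softmax(dense(pooled, weights, bias))
```

With these toy sizes, fusion yields 6 channels per position, patch extraction yields 4 tokens of 24 values each, and the head returns one probability per class.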
Pages: 11
Related papers
50 in total
  • [21] Shape-based Human Action Recognition using Multi-input Topology of Deep Belief Networks
    Nickfarjam, A. M.
    Ebrahimpour-komleh, H.
    2017 9TH INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE TECHNOLOGY (IKT 2017), 2017, : 1 - 4
  • [22] Automated Hand Gesture Recognition using a Deep Convolutional Neural Network model
    Dhall, Ishika
    Vashisth, Shubham
    Aggarwal, Garima
    PROCEEDINGS OF THE CONFLUENCE 2020: 10TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, DATA SCIENCE & ENGINEERING, 2020, : 811 - 816
  • [23] Enhancing Dynamic Hand Gesture Recognition using Feature Concatenation via Multi-Input Hybrid Model
    Korti, Djazila Souhila
    Slimane, Zohra
    Lakhdari, Kheira
    INTERNATIONAL JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING SYSTEMS, 2023, 14 (05) : 535 - 546
  • [24] CSTSUNet: A Cross Swin Transformer-Based Siamese U-Shape Network for Change Detection in Remote Sensing Images
    Wu, Yaping
    Li, Lu
    Wang, Nan
    Li, Wei
    Fan, Junfang
    Tao, Ran
    Wen, Xin
    Wang, Yanfeng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [25] Maxillary sinus detection on cone beam computed tomography images using ResNet and Swin Transformer-based UNet
    Celebi, Adalet
    Imak, Andac
    Uzen, Huseyin
    Budak, Umit
    Turkoglu, Muammer
    Hanbay, Davut
    Sengur, Abdulkadir
    ORAL SURGERY ORAL MEDICINE ORAL PATHOLOGY ORAL RADIOLOGY, 2024, 138 (01) : 149 - 161
  • [26] TDFNet: Transformer-Based Deep-Scale Fusion Network for Multimodal Emotion Recognition
    Zhao, Zhengdao
    Wang, Yuhua
    Shen, Guang
    Xu, Yuezhu
    Zhang, Jiayuan
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 3771 - 3782
  • [27] Dental Caries Detection Using Score-Based Multi-Input Deep Convolutional Neural Network
    Imak, Andac
    Celebi, Adalet
    Siddique, Kamran
    Turkoglu, Muammer
    Sengur, Abdulkadir
    Salam, Iftekhar
    IEEE ACCESS, 2022, 10 : 18320 - 18329
  • [28] Intervention Prediction in MOOCs Based on Learners' Comments: A Temporal Multi-input Approach Using Deep Learning and Transformer Models
    Alrajhi, Laila
    Alamri, Ahmed
    Cristea, Alexandra I.
    INTELLIGENT TUTORING SYSTEMS, ITS 2022, 2022, 13284 : 227 - 237
  • [29] Automatic segmentation of cardiac magnetic resonance images based on multi-input fusion network
    Shi, Jianshe
    Ye, Yuguang
    Zhu, Daxin
    Su, Lianta
    Huang, Yifeng
    Huang, Jianlong
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2021, 209
  • [30] Deep Network-based Hand Gesture Recognition using Optical Flow guided Trajectory Images
    Kavyasree, V
    Sarma, Debajit
    Gupta, Priyanka
    Bhuyan, M. K.
    PROCEEDINGS OF 2020 IEEE APPLIED SIGNAL PROCESSING CONFERENCE (ASPCON 2020), 2020, : 252 - 256