A vision transformer-based automated human identification using ear biometrics

Cited by: 6
Authors
Mehta, Ravishankar [1]
Shukla, Sindhuja [1]
Pradhan, Jitesh [1]
Singh, Koushlendra Kumar [1]
Kumar, Abhinav [2]
Affiliations
[1] Natl Inst Technol Jamshedpur, Dept CSE, Machine Vis & Intelligence Lab, Jamshedpur 831014, Jharkhand, India
[2] Motilal Nehru Natl Inst Technol Allahabad, Allahabad 211001, Uttar Pradesh, India
Keywords
Vision transformer; Patch; Embedding; Attention network; Data augmentation
DOI
10.1016/j.jisa.2023.103599
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In recent years, Vision Transformers (ViTs) have gained significant attention in the field of computer vision for their impressive performance in various tasks, including image recognition, machine translation, question answering, text classification, and image captioning. ViTs perform better on several benchmark image datasets, such as ImageNet, with fewer parameters and less computation than CNN-based models. The self-attention part takes on the role of the feature-extraction component of a convolutional neural network (CNN). The proposed work provides a vision transformer-based framework for 2D ear recognition, in which the self-attention part is applied jointly with CNNs. Adjustments and fine-tuning have been carried out based on the specific characteristics of the ear dataset and the desired performance requirements. In deep learning, CNNs have become the de facto choice, mainly due to their capability to learn spatially local representations based on their inductive biases; learning global representations through the self-attention mechanism of ViTs further enhances recognition accuracy. This has been made possible by applying the transformer directly to the sequence of image patches for better image classification performance. The proposed work uses various image patch sizes during model training. The experimental analysis shows that a patch size of 16 × 16 achieves the highest accuracy, 99.36%. The proposed model has been validated on the Kaggle and IITD-II datasets. The efficiency of the proposed model compared with existing models is also reported.
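The patch-and-attend idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' model: the 224 × 224 × 3 input size, the embedding dimension of 64, the random weights, and the helper names image_to_patches and self_attention are all assumptions chosen for illustration. Only the 16 × 16 patch size comes from the abstract.

```python
import numpy as np


def image_to_patches(image, patch_size=16):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be multiples of patch_size")
    rows, cols = h // patch_size, w // patch_size
    # (rows, patch, cols, patch, C) -> (rows, cols, patch, patch, C)
    patches = image.reshape(rows, patch_size, cols, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    # One row per patch, each flattened to patch_size * patch_size * C values.
    return patches.reshape(rows * cols, patch_size * patch_size * c)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a token sequence."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v, weights


rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))        # hypothetical input, not from the paper
patches = image_to_patches(image, 16)    # 14 * 14 = 196 patches of 768 values

embed_dim = 64                           # hypothetical embedding size
w_embed = rng.normal(scale=0.02, size=(patches.shape[1], embed_dim))
tokens = patches @ w_embed               # linear patch embedding

# Random projection matrices stand in for learned parameters.
w_q, w_k, w_v = (rng.normal(scale=0.1, size=(embed_dim, embed_dim)) for _ in range(3))
attended, weights = self_attention(tokens, w_q, w_k, w_v)
```

Each row of weights describes how strongly one patch attends to every other patch, which is the sense in which self-attention captures a global representation. A full ViT would additionally add position embeddings and a class token and would stack several multi-head attention layers.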
Pages: 12