A vision transformer-based automated human identification using ear biometrics

Cited by: 6
Authors
Mehta, Ravishankar [1]
Shukla, Sindhuja [1]
Pradhan, Jitesh [1]
Singh, Koushlendra Kumar [1]
Kumar, Abhinav [2]
Affiliations
[1] Natl Inst Technol Jamshedpur, Dept CSE, Machine Vis & Intelligence Lab, Jamshedpur 831014, Jharkhand, India
[2] Motilal Nehru Natl Inst Technol Allahabad, Allahabad 211001, Uttar Pradesh, India
Keywords
Vision transformer; Patch; Embedding; Attention network; Data augmentation
DOI
10.1016/j.jisa.2023.103599
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In recent years, Vision Transformers (ViTs) have gained significant attention in the field of computer vision for their impressive performance in various tasks, including image recognition, machine translation, question answering, text classification, and image captioning. ViTs perform better than CNN-based models on several benchmark image datasets, such as ImageNet, with fewer parameters and less computation. The self-attention part performs the role of the feature extraction component of a convolutional neural network (CNN). The proposed work provides a vision transformer-based framework for 2D ear recognition, in which self-attention is applied jointly with CNNs. Adjustments and fine-tuning have been made based on the specific characteristics of the ear dataset and the desired performance requirements. In deep learning, CNNs have become the de facto choice mainly because of their ability to learn spatially local representations through their inductive biases; learning global representations through the self-attention mechanism of ViTs further enhances recognition accuracy. This is made possible by applying the transformer directly to the sequence of image patches for better image classification performance. The proposed work uses various image patch sizes during model training. Experimental analysis shows that a patch size of 16 x 16 achieves the highest accuracy, 99.36%. The proposed model has been validated on the Kaggle and IITD-II datasets. Its efficiency relative to existing models is also reported.
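The patch step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `split_into_patches`, the 224 x 224 input size, and the synthetic grayscale image are assumptions made only for illustration. The abstract states the 16 x 16 patch size but not the input resolution. The sketch shows how an image is cut into non-overlapping square patches, each flattened into a vector, which is the sequence a transformer would then embed and attend over.

```python
def split_into_patches(image, patch_size):
    """Split a 2D image (a list of equal-length rows) into flattened,
    non-overlapping square patches, returned in row-major order."""
    height = len(image)
    width = len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size block into a single vector.
            patch = [
                image[r][c]
                for r in range(top, top + patch_size)
                for c in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# Hypothetical 224 x 224 grayscale input (the resolution is an assumption).
image = [[(r * 224 + c) % 256 for c in range(224)] for r in range(224)]
patches = split_into_patches(image, 16)
print(len(patches), len(patches[0]))  # 196 256
```

Under these assumptions, a 16 x 16 patch size yields a sequence of 196 patches of 256 values each. A smaller patch size gives a longer sequence, which is one reason patch size affects both cost and accuracy.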
Pages: 12
Related papers
50 in total
  • [21] ImplantFormer: vision transformer-based implant position regression using dental CBCT data
    Yang, Xinquan
    Li, Xuguang
    Li, Xuechen
    Wu, Peixi
    Shen, Linlin
    Deng, Yongqiang
    NEURAL COMPUTING & APPLICATIONS, 2024, 36 (12): 6643-6658
  • [22] Vision Transformer-Based Ensemble Learning for Hyperspectral Image Classification
    Liu, Jun
    Guo, Haoran
    He, Yile
    Li, Huali
    REMOTE SENSING, 2023, 15 (21)
  • [23] Vision Transformer-Based Emotion Detection in HCI for Enhanced Interaction
    Soni, Jayesh
    Prabakar, Nagarajan
    Upadhyay, Himanshu
    INTELLIGENT HUMAN COMPUTER INTERACTION, IHCI 2023, PT I, 2024, 14531: 76-86
  • [24] Vision transformer-based visual language understanding of the construction process
    Yang, Bin
    Zhang, Binghan
    Han, Yilong
    Liu, Boda
    Hu, Jiniming
    Jin, Yiming
    ALEXANDRIA ENGINEERING JOURNAL, 2024, 99: 242-256
  • [25] DHFormer: A Vision Transformer-Based Attention Module for Image Dehazing
    Wasi, Abdul
    Shiney, O. Jeba
    COMPUTER VISION AND IMAGE PROCESSING, CVIP 2023, PT I, 2024, 2009: 148-159
  • [26] Automated redaction of names in adverse event reports using transformer-based neural networks
    Meldau, Eva-Lisa
    Bista, Shachi
    Melgarejo-Gonzalez, Carlos
    Noren, G. Niklas
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2024, 24 (01)
  • [27] Identification of Intra-Domain Ambiguity using Transformer-based Machine Learning
    Moharil, Ambarish
    Sharma, Arpit
    2022 IEEE/ACM 1ST INTERNATIONAL WORKSHOP ON NATURAL LANGUAGE-BASED SOFTWARE ENGINEERING (NLBSE 2022), 2022: 51-58
  • [28] 3D Dental Biometrics: Transformer-based Dental Arch Extraction and Matching
    Zhang, Zhiyuan
    Zhong, Xin
    2023 IEEE CONFERENCE ON ARTIFICIAL INTELLIGENCE, CAI, 2023: 139-140
  • [29] A deep learning approach for person identification using ear biometrics
    Ramar Ahila Priyadharshini
    Selvaraj Arivazhagan
    Madakannu Arun
    Applied Intelligence, 2021, 51: 2161-2172
  • [30] Automated design tools for piezoelectric transformer-based power supplies
    Forrester, Jack
    Davidson, Jonathan N.
    Foster, Martin P.
    Horsley, Edward L.
    Stone, David A.
    JOURNAL OF ENGINEERING-JOE, 2019, (17): 4163-4166