A vision transformer-based automated human identification using ear biometrics

Cited by: 6

Authors:

Mehta, Ravishankar [1]

Shukla, Sindhuja [1]

Pradhan, Jitesh [1]

Singh, Koushlendra Kumar [1]

Kumar, Abhinav [2]

Affiliations:

[1] Natl Inst Technol Jamshedpur, Dept CSE, Machine Vis & Intelligence Lab, Jamshedpur 831014, Jharkhand, India

[2] Motilal Nehru Natl Inst Technol Allahabad, Allahabad 211001, Uttar Pradesh, India

Source:

JOURNAL OF INFORMATION SECURITY AND APPLICATIONS | 2023 / Volume 78

Keywords:

Vision transformer; Patch; Embedding; Attention network; Data augmentation;

DOI:

10.1016/j.jisa.2023.103599

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

In recent years, Vision Transformers (ViTs) have gained significant attention in the field of computer vision for their impressive performance in various tasks, including image recognition, machine translation, question answering, text classification, and image captioning. ViTs perform better on several benchmark image datasets, such as ImageNet, with fewer parameters and less computation than CNN-based models. The self-attention part takes over the feature-extraction role of the convolutional neural network (CNN). The proposed work provides a framework for 2D ear recognition built on a vision transformer-based model. In the proposed model, the self-attention part is applied jointly with CNNs. Adjustments and fine-tuning have been made based on the specific characteristics of the ear dataset and the desired performance requirements. In deep learning, CNNs have become the de facto choice across application areas, mainly because their inductive biases let them learn spatially local representations; learning global representations through the self-attention mechanism of ViTs further enhances recognition accuracy. This is made possible by applying the transformer directly to the sequence of image patches, which improves image classification performance. The proposed work uses various image patch sizes during model training. The experimental analysis shows that a patch size of 16 x 16 achieves the highest accuracy, 99.36%. The proposed model has been validated on the Kaggle and IITD-II datasets. The efficiency of the proposed model relative to existing models is also reported.
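The abstract's step of turning an image into a sequence of 16 x 16 patches can be illustrated with a minimal sketch. This is not the authors' implementation: the 224 x 224 single-channel input, the synthetic pixel values, and the helper names `split_into_patches` and `num_patches` are all assumptions made for illustration, and the linear projection and attention layers that would follow are omitted.

```python
def split_into_patches(image, patch_size):
    """Split a 2D image (list of rows) into non-overlapping square patches.

    Each patch is returned flattened in row-major order. In a ViT, each
    flattened patch would then be linearly projected into a token.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


def num_patches(height, width, patch_size):
    """Length of the patch sequence for a given image and patch size."""
    return (height // patch_size) * (width // patch_size)


# Hypothetical 224 x 224 grayscale image filled with synthetic values
# (the input resolution is an assumption; the abstract does not state it).
image = [[(r * 224 + c) % 256 for c in range(224)] for r in range(224)]
patches = split_into_patches(image, 16)
print(len(patches), len(patches[0]))  # 196 256
```

Under these assumptions, a 16 x 16 patch size yields a sequence of 196 patches of 256 values each. Smaller patches lengthen the sequence, which is one reason patch size is a natural hyperparameter to vary during training.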


Pages: 12

