Research on facial recognition of sika deer based on vision transformer

Cited by: 3
Authors
Gong, He [1, 2, 3, 4]
Luo, Tianye [1]
Ni, Lingyun [1]
Li, Ji [1]
Guo, Jie [1]
Liu, Tonghe [1]
Feng, Ruilong [1]
Mu, Ye [1, 2, 3, 4]
Hu, Tianli [1, 2, 3, 4]
Sun, Yu [1, 2, 3, 4]
Guo, Ying [1, 2, 3, 4]
Li, Shijun [5, 6]
Affiliations
[1] Jilin Agr Univ, Coll Informat Technol, Changchun 130118, Peoples R China
[2] Jilin Prov Agr Internet Things Technol Collaborat, Changchun 130118, Peoples R China
[3] Jilin Prov Intelligent Environm Engn Res Ctr, Changchun 130118, Peoples R China
[4] Jilin Prov Coll & Univ 13 Five Year Engn Res Ctr, Changchun 130118, Peoples R China
[5] Wuzhou Univ, Coll Informat Technol, Wuzhou 543003, Peoples R China
[6] Guangxi Key Lab Machine Vis & Intelligent Control, Wuzhou 543003, Peoples R China
Keywords
Sika deer; Vision transformer; DenseNet; Face recognition; Patch flattening
DOI
10.1016/j.ecoinf.2023.102334
Chinese Library Classification
Q14 [Ecology (Bioecology)]
Discipline classification codes
071012; 0713
Abstract
In the face of global concerns about endangered ecosystems, identifying individual animals is vital. In this work, a Vision Transformer (ViT) based model was designed for individual recognition of sika deer from facial data. Satisfactory results require low-level features such as texture and color in addition to high-level semantic information, so good results are difficult to obtain from high-level features alone. The standard ViT, or a ViT with a ResNet (residual neural network) backbone, may not be the best solution, because the direct patch flattening used for feature embedding in the conventional ViT is not well suited to deer face recognition. Therefore, a DenseNet (densely connected convolutional network) block was used as Module 1 to extract low-level features. DenseNet layers enable feature reuse through dense connections, and any layer can communicate directly with any other layer, which maximizes the flow of information between layers in the network. In Module 2, a mask approach was used to eliminate extraneous information from the images and to reduce the interference of complicated backgrounds with identification accuracy. In addition, pixel-wise multiplication of the feature maps output by the two modules fused local features with global features, enriching the expressiveness of the feature map. Finally, a pre-trained ViT structure was used. The experimental results showed that the proposed model reached an accuracy of 97.68% in identifying individual sika deer and exhibited excellent generalization capability. This work also provides a valid database for the individual identification of sika deer, contributing significantly to the conservation and promotion of the ecosystem.
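The two operations named in the abstract, pixel-wise multiplication of two feature maps and direct patch flattening, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names `fuse` and `flatten_patches`, the 4 x 4 toy feature map, the binary mask standing in for the Module 2 output, and the choice to flatten after fusion are all assumptions made for illustration.

```python
# Illustrative sketch only; names, sizes, and the order of steps are assumptions.

def fuse(map_a, map_b):
    """Element-wise (pixel-wise) product of two equally shaped 2-D feature maps."""
    if len(map_a) != len(map_b) or any(len(a) != len(b) for a, b in zip(map_a, map_b)):
        raise ValueError("feature maps must have the same shape")
    return [[a * b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(map_a, map_b)]


def flatten_patches(image, patch):
    """Split an H x W grid into non-overlapping patch x patch tiles and
    flatten each tile row by row, as in conventional ViT patch flattening."""
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by the patch size")
    tokens = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            tokens.append([image[r][c]
                           for r in range(top, top + patch)
                           for c in range(left, left + patch)])
    return tokens


# Toy single-channel feature map (hypothetical stand-in for the Module 1 output).
local_features = [[1, 2, 3, 4],
                  [5, 6, 7, 8],
                  [9, 10, 11, 12],
                  [13, 14, 15, 16]]

# Hypothetical binary mask (stand-in for the Module 2 output):
# 1 keeps a pixel, 0 suppresses background.
mask_features = [[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [1, 1, 1, 1],
                 [0, 0, 0, 0]]

fused = fuse(local_features, mask_features)
tokens = flatten_patches(fused, 2)  # four tokens, each of length 4
```

In a real model the maps would be multi-channel tensors and each flattened patch would be linearly projected before entering the transformer; the sketch shows only how multiplication zeroes out masked regions before the map is tokenized.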
Pages: 14