Research on facial recognition of sika deer based on vision transformer

Cited by: 3
Authors
Gong, He [1, 2, 3, 4]
Luo, Tianye [1]
Ni, Lingyun [1]
Li, Ji [1]
Guo, Jie [1]
Liu, Tonghe [1]
Feng, Ruilong [1]
Mu, Ye [1, 2, 3, 4]
Hu, Tianli [1, 2, 3, 4]
Sun, Yu [1, 2, 3, 4]
Guo, Ying [1, 2, 3, 4]
Li, Shijun [5, 6]
Affiliations
[1] Jilin Agr Univ, Coll Informat Technol, Changchun 130118, Peoples R China
[2] Jilin Prov Agr Internet Things Technol Collaborat, Changchun 130118, Peoples R China
[3] Jilin Prov Intelligent Environm Engn Res Ctr, Changchun 130118, Peoples R China
[4] Jilin Prov Coll & Univ 13 Five Year Engn Res Ctr, Changchun 130118, Peoples R China
[5] Wuzhou Univ, Coll Informat Technol, Wuzhou 543003, Peoples R China
[6] Guangxi Key Lab Machine Vis & Intelligent Control, Wuzhou 543003, Peoples R China
Keywords
Sika deer; Vision transformer; DenseNet; Face recognition; Patch flattening
DOI
10.1016/j.ecoinf.2023.102334
Chinese Library Classification number
Q14 [Ecology (bioecology)]
Subject classification codes
071012; 0713
Abstract
In the face of global concerns about endangered ecosystems, it is vital to identify individual animals. In this work, a Vision Transformer (ViT) based model was designed for recognizing individual sika deer from facial data. To obtain satisfactory results, low-level aspects such as texture and color must be considered in addition to high-level semantic information; it is therefore difficult to get good results by applying only high-level retrieval features. The standard ViT, or a ViT with ResNet (residual neural network) as the backbone, may not be the best solution, because the direct patch flattening used for feature embedding in the conventional ViT is not suitable for deer face recognition. A DenseNet (densely connected convolutional network) block was therefore used as Module 1 to extract low-level features. DenseNet layers enable feature reuse through dense connections, and any layer can communicate directly with any other, which maximizes the flow of information between layers in the network. In Module 2, a mask approach was used to eliminate extraneous information from the images and to reduce the interference of complicated backgrounds with identification accuracy. In addition, pixel-wise multiplication of the feature maps output by the two modules fused the local features with the global features, enriching the expressiveness of the feature map. Finally, a pre-trained ViT structure was used. The experimental results showed that the proposed model reached an accuracy of 97.68% in identifying individual sika deer and exhibited excellent generalization capabilities. Our work provides a valid database for the individual identification of sika deer and contributes to the conservation and promotion of the ecosystem.
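The fusion step described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the tensor shapes, the rectangular mask, and the choice to turn the fused map into tokens by non-overlapping patch flattening are all assumptions made for illustration. It only shows what pixel-wise multiplication of two same-shaped feature maps does and how a fused map could be handed to a transformer as a token sequence.

```python
import numpy as np


def fuse_by_pixel_multiplication(local_feats, global_feats):
    """Element-wise product of two feature maps of identical shape (C, H, W)."""
    if local_feats.shape != global_feats.shape:
        raise ValueError("feature maps must have the same shape")
    return local_feats * global_feats


def flatten_to_patch_tokens(feature_map, patch):
    """Split a (C, H, W) map into non-overlapping patch x patch tiles
    and flatten each tile into one token of length C * patch * patch."""
    c, h, w = feature_map.shape
    if h % patch or w % patch:
        raise ValueError("H and W must be divisible by the patch size")
    tiles = feature_map.reshape(c, h // patch, patch, w // patch, patch)
    tiles = tiles.transpose(1, 3, 0, 2, 4)  # (H/p, W/p, C, p, p)
    return tiles.reshape((h // patch) * (w // patch), c * patch * patch)


rng = np.random.default_rng(0)

# Hypothetical Module 1 output: low-level (local) features.
local_feats = rng.random((8, 16, 16))

# Hypothetical Module 2 output: features with the background masked out.
# The central square stands in for the face region (assumed, not from the paper).
mask = np.zeros((16, 16))
mask[4:12, 4:12] = 1.0
global_feats = rng.random((8, 16, 16)) * mask

fused = fuse_by_pixel_multiplication(local_feats, global_feats)
tokens = flatten_to_patch_tokens(fused, patch=4)
print(tokens.shape)  # (16, 128)
```

Because the product is element-wise, any position zeroed by the mask stays zero in the fused map, so background positions contribute nothing to the tokens passed on.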
Pages: 14