Comparative Analysis of Vision Transformer Models for Facial Emotion Recognition Using Augmented Balanced Datasets

Cited by: 6
Authors
Bobojanov, Sukhrob [1]
Kim, Byeong Man [1]
Arabboev, Mukhriddin [2]
Begmatov, Shohruh [2]
Affiliations
[1] Kumoh Natl Inst Technol, Comp Software Engn, Gumi 39177, South Korea
[2] Tashkent Univ Informat Technol, Tashkent 10084, Uzbekistan
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 22
Keywords
facial emotion recognition; vision transformer; data augmentation; balanced data; FER2013; RAF-DB;
DOI
10.3390/app132212271
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Facial emotion recognition (FER) is highly important in the field of human-machine interfaces. Because human facial expressions are intricate and images vary widely in facial pose and lighting conditions, FER remains a challenging task for computer-based models. Vision transformer (ViT) models have recently attained state-of-the-art results across various computer vision tasks, including image classification, object detection, and segmentation. Moreover, one of the most important aspects of creating strong machine learning models is correcting data imbalances: to avoid biased predictions and obtain reliable findings, it is essential to keep the class distribution of the training dataset balanced. In this work, we have chosen two widely used open-source datasets, RAF-DB and FER2013. To resolve the imbalance problem, we also present a new, balanced dataset, built by applying data augmentation techniques and cleaning poor-quality images from the FER2013 dataset. We then conduct a comprehensive evaluation of thirteen different ViT models on these three datasets. Our investigation concludes that ViT models present a promising approach for FER tasks. Among these ViT models, Mobile ViT and Tokens-to-Token ViT models appear to be the most effective, followed by PiT and Cross Former models.
Pages: 14
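
The abstract describes balancing the training data through augmentation but gives no procedure. As a rough illustration only, and not the authors' method, the sketch below computes how many augmented samples each class would need to match the largest class. The function name `augmentation_plan` and the toy label list are hypothetical and do not reflect the actual FER2013 or RAF-DB class counts.

```python
from collections import Counter


def augmentation_plan(labels):
    """Return, per class, how many extra samples are needed to match the largest class.

    Assumption: "balanced" means every class is brought up to the size of the
    most frequent class; the record above does not state the target size.
    """
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


# Hypothetical toy labels, not a real dataset distribution.
toy_labels = ["happy"] * 5 + ["sad"] * 3 + ["disgust"] * 1
plan = augmentation_plan(toy_labels)
print(plan)  # {'happy': 0, 'sad': 2, 'disgust': 4}
```

In practice, the counts in such a plan would drive how many augmented copies (for example flips, rotations, or crops) are generated per minority class.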