Background
Identifying the emotional state of individuals is difficult, and interest in automatic emotion recognition has grown rapidly in recent years. Many technologies have been developed to recognize emotional expression from facial expressions, vocal expressions, physiological signals, and body expressions. Among these, facial expression is particularly informative for multimodal recognition. Understanding facial emotions has applications in mental well-being, decision-making, and even social change, as emotions play a crucial role in our lives. Recognition is complicated by the high dimensionality of the data and by non-linear interactions across modalities. Moreover, the way emotion is expressed varies from person to person, and identifying discriminative features remains challenging; deep learning models help overcome these limitations.

Methods
This work addresses facial emotion recognition with a deep learning model, the proposed Residual Fused-Graph Convolution Network (RF-GCN). The multimodal input comprises video and Electroencephalogram (EEG) signals. A Non-Local Means (NLM) filter is used to pre-process the input video frames. Features are extracted from both the pre-processed video frames and the input EEG signals, and feature selection is then carried out using the chi-square test. Finally, facial emotion recognition and its types are determined by RF-GCN, a combination of the Deep Residual Network (DRN) and the Graph Convolutional Network (GCN).

Results
RF-GCN was evaluated with metrics such as accuracy, recall, and precision, achieving superior values of 91.6%, 96.5%, and 94.7%, respectively.

Conclusions
RF-GCN captures the nuanced relationships between different emotional states and improves recognition accuracy. The model is trained and evaluated on a dataset that reflects real-world conditions. In this work, RF-GCN for facial emotion recognition is devised.
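The chi-square selection step mentioned above can be sketched as follows. This is a minimal illustration of the standard chi-square feature-scoring statistic, not the paper's implementation; the function names are our own, and the statistic assumes a non-negative feature matrix.

```python
import numpy as np

def chi_square_scores(X, y):
    """Chi-square score of each feature against the class labels.

    X: (n_samples, n_features) non-negative feature matrix.
    y: (n_samples,) integer class labels.
    """
    classes = np.unique(y)
    # Observed: per-class sum of each feature.
    observed = np.array([X[y == c].sum(axis=0) for c in classes])
    # Expected under independence: class frequency times total feature mass.
    class_prob = np.array([(y == c).mean() for c in classes])
    expected = np.outer(class_prob, X.sum(axis=0))
    return ((observed - expected) ** 2 / expected).sum(axis=0)

def select_top_k(X, y, k):
    """Keep the k features with the highest chi-square score."""
    idx = np.argsort(chi_square_scores(X, y))[::-1][:k]
    return X[:, idx], idx
```

Features whose per-class distribution deviates most from the overall distribution score highest, so selection keeps the features most associated with the emotion labels.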
First, input video is acquired from the dataset. The frames are then pre-processed using the NLM filter to remove noise. After that, features such as statistical features, LGBP, MBP, CLBP, LTP, and LTrP are extracted using the Discrete Wavelet Transform. Simultaneously, the input EEG signal is acquired and fed to the feature extraction process, where the necessary features are extracted. Appropriate features are then selected using chi-square feature selection. Finally, facial emotion recognition and its types are detected by RF-GCN, which is developed by fusing DRN and GCN.
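The fusion of a residual connection with graph convolution can be sketched as follows. This is a minimal NumPy illustration of the two ingredients named above (a DRN-style skip connection and a GCN propagation step), under our own assumed shapes and function names; the abstract does not specify RF-GCN's actual architecture.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetric GCN normalization: D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])        # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def residual_gcn_layer(H, A_norm, W):
    """One graph convolution with a residual (skip) connection.

    H: (n_nodes, d) node features; W: (d, d) weights, square so that
    the skip addition is shape-compatible. Output: ReLU(A_norm H W) + H.
    """
    return np.maximum(A_norm @ H @ W, 0.0) + H
```

The residual term lets each layer learn a refinement of the node features rather than a full transformation, which is the usual motivation for combining residual networks with graph convolutions.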