Speech Emotion Recognition Based on Two-Stream Deep Learning Model Using Korean Audio Information

Cited by: 13
Authors
Jo, A-Hyeon [1]
Kwak, Keun-Chang [2]
Affiliations
[1] Chosun Univ, Elect Engn IT Bio Convergence Syst Major, Gwangju 61452, South Korea
[2] Chosun Univ, Elect Engn, Gwangju 61452, South Korea
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 04
Keywords
speech emotion recognition; human-computer interaction; two-stream; bidirectional long-short term memory; convolutional neural network;
DOI
10.3390/app13042167
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
Identifying a person's emotions is an important element of communication, and the voice in particular is a means of expressing emotions easily and naturally. Speech emotion recognition is therefore a crucial component of human-computer interaction (HCI), in which accurately identifying emotions is key. This study presents a two-stream emotion recognition model that combines bidirectional long short-term memory (Bi-LSTM) and convolutional neural networks (CNNs), and comparatively analyzes its performance on a Korean speech emotion database. The data used in the experiments were obtained from the Korean speech emotion recognition database built by Chosun University. Two deep learning models, Bi-LSTM and YAMNet (a CNN-based transfer learning model), were connected in a two-stream architecture to form the emotion recognition model. Various speech feature extraction methods and deep learning models were compared in terms of performance. Used alone, Bi-LSTM and YAMNet achieved speech emotion recognition performance of 90.38% and 94.91%, respectively. The two-stream model reached 96%, an improvement of 1.09 to 5.62 percentage points over the single models.
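As an illustration of the two-stream idea, the sketch below merges the outputs of two streams before a single classifier. The abstract does not state how the streams are joined, so the feature concatenation, the linear softmax classifier, the function names, and every numeric value here are assumptions chosen for illustration, not the authors' implementation. The two input vectors stand in for features that a Bi-LSTM stream and a YAMNet stream might produce.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def fuse_two_streams(
    seq_features: Sequence[float],
    cnn_features: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> List[float]:
    """Concatenate both streams' features and apply one linear softmax layer.

    Concatenation is an assumed fusion rule; the source only says the two
    models are connected in a two-stream architecture.
    """
    fused = list(seq_features) + list(cnn_features)
    if any(len(row) != len(fused) for row in weights):
        raise ValueError("each weight row must match the fused feature length")
    if len(weights) != len(bias):
        raise ValueError("weights and bias must cover the same classes")
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Toy, hand-picked values (hypothetical): 2 sequence features, 3 CNN features,
# and 3 unnamed emotion classes.
seq_features = [0.2, -0.1]
cnn_features = [0.7, 0.3, -0.4]
weights = [
    [1.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]
bias = [0.0, 0.0, 0.0]

probs = fuse_two_streams(seq_features, cnn_features, weights, bias)
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted, [round(p, 3) for p in probs])
```

In a trained system the weights would be learned jointly with, or on top of, the two streams; here they are fixed only so the example runs on its own.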
Pages: 14
Related papers
50 in total
  • [1] Optimal feature selection based speech emotion recognition using two-stream deep convolutional neural network
    Mustaqeem
    Kwon, Soonil
    [J]. INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2021, 36 (09) : 5116 - 5135
  • [2] Two-stream Emotion-embedded Autoencoder for Speech Emotion Recognition
    Zhang, Chenghao
    Xue, Lei
    [J]. 2021 IEEE INTERNATIONAL IOT, ELECTRONICS AND MECHATRONICS CONFERENCE (IEMTRONICS), 2021 : 969 - 974
  • [3] Speech Emotion Recognition Using Deep Learning on audio recordings
    Suganya, S.
    Charles, E. Y. A.
    [J]. 2019 19TH INTERNATIONAL CONFERENCE ON ADVANCES IN ICT FOR EMERGING REGIONS (ICTER - 2019), 2019
  • [4] Emotion recognition of audio/speech data using deep learning approaches
    Gupta, Vedika
    Juyal, Stuti
    Singh, Gurvinder Pal
    Killa, Chirag
    Gupta, Nishant
    [J]. JOURNAL OF INFORMATION & OPTIMIZATION SCIENCES, 2020, 41 (06) : 1309 - 1317
  • [5] Deep Learning and Audio Based Emotion Recognition
    Demir, Asli
    Atila, Orhan
    Sengur, Abdulkadir
    [J]. 2019 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND DATA PROCESSING (IDAP 2019), 2019
  • [6] Deep learning based Affective Model for Speech Emotion Recognition
    Zhou, Xi
    Guo, Junqi
    Bie, Rongfang
    [J]. 2016 INT IEEE CONFERENCES ON UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING AND COMMUNICATIONS, CLOUD AND BIG DATA COMPUTING, INTERNET OF PEOPLE, AND SMART WORLD CONGRESS (UIC/ATC/SCALCOM/CBDCOM/IOP/SMARTWORLD), 2016 : 841 - 846
  • [7] Two-Stream Deep Learning Architecture-Based Human Action Recognition
    Shehzad, Faheem
    Khan, Muhammad Attique
    Yar, Muhammad Asfand E.
    Sharif, Muhammad
    Alhaisoni, Majed
    Tariq, Usman
    Majumdar, Arnab
    Thinnukool, Orawit
    [J]. CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 74 (03) : 5931 - 5949
  • [8] Multi-Features Audio Extraction for Speech Emotion Recognition Based on Deep Learning
    Gondohanindijo, Jutono
    Muljono
    Noersasongko, Edi
    Pujiono
    Setiadi, De Rosal Moses
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, 14 (06) : 198 - 206
  • [9] Speech Emotion Recognition Using Deep Learning
    Alagusundari, N.
    Anuradha, R.
    [J]. ARTIFICIAL INTELLIGENCE: THEORY AND APPLICATIONS, VOL 1, AITA 2023, 2024, 843 : 313 - 325
  • [10] Speech Emotion Recognition Using Deep Learning
    Ahmed, Waqar
    Riaz, Sana
    Iftikhar, Khunsa
    Konur, Savas
    [J]. ARTIFICIAL INTELLIGENCE XL, AI 2023, 2023, 14381 : 191 - 197