Comparative Analysis of Windows for Speech Emotion Recognition Using CNN

Cited by: 0

Authors:

Teixeira, Felipe L. [1,2]

Soares, Salviano Pinto [4,5]

Abreu, J. L. Pio [6,7]

Oliveira, Paulo M. [8]

Teixeira, Joao P. [1,3]

Affiliations:

[1] Inst Politecn Braganca, Res Ctr Digitalizat & Intelligent Robot CEDRI, Campus Santa Apolonia, P-5300253 Braganca, Portugal

[2] Univ Tras Os Montes & Alto Douro UTAD, Sch Sci & Technol, Engn Dept, P-5000801 Vila Real, Portugal

[3] Inst Politecn Braganca, Associate Lab Sustainabil & Technol SusTEC, Campus Santa Apolonia, P-5300253 Braganca, Portugal

[4] Univ Aveiro, Inst Elect & Informat Engn Aveiro IEETA, P-3810193 Aveiro, Portugal

[5] Univ Aveiro, Intelligent Syst Associate Lab LASI, P-3810193 Aveiro, Portugal

[6] Hosp Univ Coimbra, P-3004561 Coimbra, Portugal

[7] Univ Coimbra, Fac Med, P-3000548 Coimbra, Portugal

[8] Univ Tras Os Montes & Alto Douro UTAD, INESC TEC, Vila Real, Portugal

Source:

OPTIMIZATION, LEARNING ALGORITHMS AND APPLICATIONS, PT I, OL2A 2023 | 2024 / Vol. 1981

Keywords:

Speech Emotion Recognition; Hamming; Hanning; CNN; FEATURES; SPECTROGRAM; SELECTION;

DOI:

10.1007/978-3-031-53025-8_17

Chinese Library Classification number:

TP39 [Computer applications];

Discipline classification codes:

081203 ; 0835 ;

Abstract:

The paper compares accuracy in the Speech Emotion Recognition task when the Hamming and Hanning windows are used to frame the speech and compute the spectrogram that serves as input to a convolutional neural network. The detection of between 4 and 10 emotional states was tested for both windows. The results show significant differences in accuracy between the two window types and provide valuable insights for the development of more efficient emotional state detection systems. The best accuracies for 4 to 10 emotions were 64.1% (4 emotions), 57.8% (5 emotions), 59.8% (6 emotions), 48.4% (7 emotions), 47.8% (8 emotions), 51.4% (9 emotions), and 45.9% (10 emotions). These accuracies are at the state-of-the-art level.
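The framing step described in the abstract can be illustrated with a minimal sketch, which is not the authors' pipeline. It frames a signal, applies either a Hamming or a Hanning window, and computes a log-magnitude spectrogram with NumPy. The function name `spectrogram`, the 400-sample frame, the 160-sample hop, the 16 kHz sampling rate, and the synthetic test tone are all assumptions made for illustration; the record does not state the paper's actual parameters.

```python
import numpy as np

def spectrogram(signal, frame_len=400, hop=160, window="hamming"):
    """Log-magnitude spectrogram from windowed, overlapping frames.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    # The window function is the only difference between the two variants.
    win = {"hamming": np.hamming, "hanning": np.hanning}[window](frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Stack overlapping frames into an array of shape (n_frames, frame_len).
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    # Apply the window to each frame, then take the one-sided FFT magnitude.
    mag = np.abs(np.fft.rfft(frames * win, axis=1))
    # Convert to decibels; the small constant avoids log10(0).
    return 20.0 * np.log10(mag + 1e-10).T  # (frame_len // 2 + 1, n_frames)

# Synthetic 1 s tone at 16 kHz, standing in for a real speech recording.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)

spec_hamming = spectrogram(tone, window="hamming")
spec_hanning = spectrogram(tone, window="hanning")
```

Both arrays have the same shape, so either one could be fed to the same network input layer; only the spectral leakage pattern around each frequency peak differs between the two windows.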

Pages: 233-248

Number of pages: 16
