Connecting Subspace Learning and Extreme Learning Machine in Speech Emotion Recognition

Cited by: 35

Authors:

Xu, Xinzhou [1,2,3]

Deng, Jun [4]

Coutinho, Eduardo [5,6]

Wu, Chen [1]

Zhao, Li [2]

Schuller, Bjoern W. [7,8]

Affiliations:

[1] Nanjing Univ Posts & Telecommun, Coll Internet Things, Nanjing 210003, Jiangsu, Peoples R China

[2] Southeast Univ, Minist Educ, Key Lab Underwater Acoust Signal Proc, Nanjing 210096, Jiangsu, Peoples R China

[3] Tech Univ Munich, MMK, Machine Intelligence & Signal Proc Grp, D-80290 Munich, Germany

[4] AudEERING GmbH, D-82205 Gilching, Germany

[5] Imperial Coll London, Dept Comp, London SW7 2AZ, England

[6] Univ Liverpool, Dept Mus, Liverpool L69 3BX, Merseyside, England

[7] Imperial Coll London, Grp Language Audio & Mus, London SW7 2AZ, England

[8] Univ Augsburg, Chair Embedded Intelligence Hlth Care & Wellbeing, D-86159 Augsburg, Germany

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2019 / Vol. 21 / No. 03

Funding:

EU Horizon 2020;

Keywords:

Speech emotion recognition; extreme learning machine; subspace learning; graph embedding; spectral regression; DISCRIMINANT-ANALYSIS; SPECTRAL REGRESSION; GENERAL FRAMEWORK; APPROXIMATION; ALGORITHMS; SELECTION;

DOI:

10.1109/TMM.2018.2865834

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Speech emotion recognition (SER) is a powerful tool for endowing computers with the capacity to process information about the affective states of users in human-machine interactions. Recent research has shown the effectiveness of graph embedding-based subspace learning and extreme learning machines applied to SER, but there are still various drawbacks in these two techniques that limit their application. Regarding subspace learning, the change from linearity to nonlinearity is usually achieved through kernelization, whereas extreme learning machines only take label information into consideration at the output layer. In order to overcome these drawbacks, this paper leverages extreme learning machines for dimensionality reduction and proposes a novel framework to combine spectral regression-based subspace learning and extreme learning machines. The proposed framework contains three stages: data mapping, graph decomposition, and regression. At the data mapping stage, various mapping strategies provide different views of the samples. At the graph decomposition stage, specifically designed embedding graphs provide a possibility to better represent the structure of data through generating virtual coordinates. Finally, at the regression stage, dimension-reduced mappings are achieved by connecting the virtual coordinates and data mapping. Using this framework, we propose several novel dimensionality reduction algorithms, apply them to SER tasks, and compare their performance to relevant state-of-the-art methods. Our results on several paralinguistic corpora show that our proposed techniques lead to significant improvements.
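
The three stages named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm: the abstract does not specify the mapping strategies, the embedding graphs, or the regression details, so everything concrete here is an assumption made for illustration. The sketch assumes a random tanh hidden layer for data mapping, an LDA-style class graph for graph decomposition, and ridge regression for the final stage. The names `fit_elm_sr` and `transform`, and the parameters `n_hidden`, `n_components`, and `alpha`, are hypothetical.

```python
import numpy as np


def fit_elm_sr(X, y, n_hidden=64, n_components=2, alpha=1e-2, seed=0):
    """Hypothetical helper: ELM-style mapping + class graph + ridge regression."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape

    # Stage 1 (data mapping): random, untrained hidden layer (assumed tanh).
    W = rng.standard_normal((n_features, n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)

    # Stage 2 (graph decomposition): assumed LDA-style graph in which
    # samples of the same class are linked with weight 1 / class size.
    A = np.zeros((n_samples, n_samples))
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        A[np.ix_(idx, idx)] = 1.0 / idx.size
    # Subtracting 1/n moves the trivial all-ones eigenvector to eigenvalue 0.
    A -= 1.0 / n_samples
    eigvals, eigvecs = np.linalg.eigh(A)
    # Keep the leading eigenvectors; at most (number of classes - 1)
    # of them carry class information for this graph.
    top = np.argsort(eigvals)[::-1][:n_components]
    Y = eigvecs[:, top]  # virtual coordinates, one row per sample

    # Stage 3 (regression): ridge regression from hidden outputs to Y.
    P = np.linalg.solve(H.T @ H + alpha * np.eye(n_hidden), H.T @ Y)
    return {"W": W, "b": b, "P": P}, Y


def transform(model, X):
    """Project samples into the learned low-dimensional space."""
    return np.tanh(X @ model["W"] + model["b"]) @ model["P"]


# Toy usage on synthetic data (three classes, five features).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(m, 1.0, (20, 5)) for m in (-4.0, 0.0, 4.0)])
y = np.repeat([0, 1, 2], 20)
model, Y = fit_elm_sr(X, y)
Z = transform(model, X)
print(Z.shape)  # (60, 2)
```

Because the regression stage only solves a regularized linear system, the label information enters through the virtual coordinates rather than through a kernelized eigenproblem, which is the kind of connection between the two techniques that the abstract describes.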

Pages: 795-808

Page count: 14

50 items in total

[1] SER: Speech Emotion Recognition Application Based on Extreme Learning Machine
Ainurrochman
Febriansyah, Irfanur Ilham
Yuhana, Umi Laili
[J]. PROCEEDINGS OF 2021 13TH INTERNATIONAL CONFERENCE ON INFORMATION & COMMUNICATION TECHNOLOGY AND SYSTEM (ICTS), 2021, : 179 - 183
[2] A FEATURE FUSION METHOD BASED ON EXTREME LEARNING MACHINE FOR SPEECH EMOTION RECOGNITION
Guo, Lili
Wang, Longbiao
Dang, Jianwu
Zhang, Linjuan
Guan, Haotian
[J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 2666 - 2670
[3] Speech Emotion Recognition Using Deep Neural Network and Extreme Learning Machine
Han, Kun
Yu, Dong
Tashev, Ivan
[J]. 15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 223 - 227
[4] Machine Learning Approach for Emotion Recognition in Speech
Gjoreski, Martin
Gjoreski, Hristijan
Kulakov, Andrea
[J]. INFORMATICA-JOURNAL OF COMPUTING AND INFORMATICS, 2014, 38 (04): : 377 - 383
[5] Speech emotion recognition based on feature selection and extreme learning machine decision tree
Liu, Zhen-Tao
Wu, Min
Cao, Wei-Hua
Mao, Jun-Wei
Xu, Jian-Ping
Tan, Guan-Zheng
[J]. NEUROCOMPUTING, 2018, 273 : 271 - 280
[6] Speech emotion recognition using optimized genetic algorithm-extreme learning machine
Musatafa Abbas Abbood Albadr
Sabrina Tiun
Masri Ayob
Fahad Taha AL-Dhief
Khairuddin Omar
Mhd Khaled Maen
[J]. Multimedia Tools and Applications, 2022, 81 : 23963 - 23989
[7] Exploration of Complementary Features for Speech Emotion Recognition Based on Kernel Extreme Learning Machine
Guo, Lili
Wang, Longbiao
Dang, Jianwu
Liu, Zhilei
Guan, Haotian
[J]. IEEE ACCESS, 2019, 7 : 75798 - 75809
[8] Speech emotion recognition using optimized genetic algorithm-extreme learning machine
Albadr, Musatafa Abbas Abbood
Tiun, Sabrina
Ayob, Masri
AL-Dhief, Fahad Taha
Omar, Khairuddin
Maen, Mhd Khaled
[J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (17) : 23963 - 23989
[9] Emotion Recognition On Speech Signals Using Machine Learning
Ghai, Mohan
Lal, Shamit
Duggal, Shivam
Manik, Shrey
[J]. PROCEEDINGS OF THE 2017 INTERNATIONAL CONFERENCE ON BIG DATA ANALYTICS AND COMPUTATIONAL INTELLIGENCE (ICBDAC), 2017, : 34 - 39
[10] Speech based Emotion Recognition using Machine Learning
Deshmukh, Girija
Gaonkar, Apurva
Golwalkar, Gauri
Kulkarni, Sukanya
[J]. PROCEEDINGS OF THE 2019 3RD INTERNATIONAL CONFERENCE ON COMPUTING METHODOLOGIES AND COMMUNICATION (ICCMC 2019), 2019, : 812 - 817
