SPEAKER VARIABILITY IN EMOTION RECOGNITION - AN ADAPTATION BASED APPROACH

Cited by: 0

Authors:

Ding, Ni [1]

Sethu, Vidhyasaharan [1]

Epps, Julien [1]

Ambikairajah, Eliathamby [1]

Affiliations:

[1] Univ New S Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2052, Australia

Source:

2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2012

Keywords:

Speaker adaptation; emotion classification; speaker normalisation; bootstrapping;

DOI:

Not available

Chinese Library Classification:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

None of the features commonly used in automatic emotion classification systems completely dissociates emotion-specific information from speaker-specific information. Consequently, speaker-specific variability degrades the performance of emotion classification systems, and existing systems frequently mitigate it with some form of speaker normalisation. Speaker adaptation offers an alternative to normalisation. This paper proposes a novel bootstrapping technique for GMM-based emotion classification, which selects appropriate initial models from a large training pool prior to speaker adaptation of the emotion models. Evaluations on the LDC Emotional Prosody and FAU Aibo corpora reveal that an emotion classification system based on the proposed bootstrapping method outperforms systems based on speaker normalisation, as long as a small amount of labelled adaptation data is available. It also outperforms speaker adaptation from common initial models estimated from all training speakers.
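The idea summarised in the abstract (choose an initial model from a pool, then adapt it to the target speaker) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how initial models are selected, so the selection rule below (highest average log-likelihood on the adaptation frames) is an assumption, as is the use of relevance-MAP adaptation of the means only. The dictionary layout, the function names and the `relevance` parameter are all hypothetical, and each pool entry holds a single diagonal-covariance GMM for one emotion to keep the example short.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def component_log_joint(x, gmm):
    """log(w_k * N(x | mu_k, var_k)) for every mixture component k."""
    return [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(gmm["weights"], gmm["means"], gmm["vars"])
    ]

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of the frames under the GMM."""
    total = sum(logsumexp(component_log_joint(x, gmm)) for x in frames)
    return total / len(frames)

def select_initial_model(frames, pool):
    """ASSUMED selection rule: return the key of the pool model that
    gives the adaptation frames the highest average log-likelihood."""
    return max(pool, key=lambda name: avg_log_likelihood(frames, pool[name]))

def map_adapt_means(frames, gmm, relevance=16.0):
    """Relevance-MAP adaptation of the component means only.
    Weights and variances are copied from the initial model."""
    n_comp = len(gmm["weights"])
    dim = len(gmm["means"][0])
    counts = [0.0] * n_comp                      # soft frame counts per component
    sums = [[0.0] * dim for _ in range(n_comp)]  # posterior-weighted frame sums
    for x in frames:
        log_joint = component_log_joint(x, gmm)
        norm = logsumexp(log_joint)
        for k in range(n_comp):
            post = math.exp(log_joint[k] - norm)  # P(component k | x)
            counts[k] += post
            for d in range(dim):
                sums[k][d] += post * x[d]
    # Interpolate between the data mean and the prior mean:
    # new_mean = (sum + relevance * prior_mean) / (count + relevance)
    new_means = [
        [(sums[k][d] + relevance * gmm["means"][k][d]) / (counts[k] + relevance)
         for d in range(dim)]
        for k in range(n_comp)
    ]
    return {
        "weights": list(gmm["weights"]),
        "means": new_means,
        "vars": [list(v) for v in gmm["vars"]],
    }

# Toy usage with hypothetical one-dimensional models and frames.
pool = {
    "model_a": {"weights": [1.0], "means": [[0.0]], "vars": [[1.0]]},
    "model_b": {"weights": [1.0], "means": [[5.0]], "vars": [[1.0]]},
}
adaptation_frames = [[4.0], [5.0], [6.0]]
chosen = select_initial_model(adaptation_frames, pool)
adapted = map_adapt_means(adaptation_frames, pool[chosen], relevance=3.0)
```

In this sketch a small `relevance` value lets a handful of labelled adaptation frames move the means quickly, while a large value keeps the adapted model close to the selected initial model. The sketch makes no claim about which setting the paper uses.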

Citation

Pages: 5101-5104

Number of pages: 4

50 records in total

[11] SVM Based Speaker Emotion Recognition in Continuous Scale
Hric, Martin
Chmulik, Michal
Guoth, Igor
Jarina, Roman
2015 25TH INTERNATIONAL CONFERENCE RADIOELEKTRONIKA (RADIOELEKTRONIKA), 2015, : 339 - 342
[12] Speaker Clustering in Emotion Recognition
Ding, Ni
Epps, Julien
13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1162 - 1165
[13] SPEAKER VARIABILITY IN SPEECH BASED EMOTION MODELS - ANALYSIS AND NORMALISATION
Sethu, Vidhyasaharan
Epps, Julien
Ambikairajah, Eliathamby
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7522 - 7526
[14] COMPARISON OF SPEAKER DEPENDENT AND SPEAKER INDEPENDENT EMOTION RECOGNITION
Rybka, Jan
Janicki, Artur
INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS AND COMPUTER SCIENCE, 2013, 23 (04) : 797 - 808
[15] SPEAKER ADAPTATION OF RNN-BLSTM FOR SPEECH RECOGNITION BASED ON SPEAKER CODE
Huang, Zhiying
Tang, Jian
Xue, Shaofei
Dai, Lirong
2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5305 - 5309
[16] Graph Learning Based Speaker Independent Speech Emotion Recognition
Xu, Xinzhou
Huang, Chengwei
Wu, Chen
Wang, Qingyun
Zhao, Li
ADVANCES IN ELECTRICAL AND COMPUTER ENGINEERING, 2014, 14 (02) : 17 - 22
[17] An approach to speaker adaptation based on analytic functions
McDonough, J
Zavaliagkos, G
Gish, H
1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 721 - 724
[18] Speaker Awareness for Speech Emotion Recognition
Assuncao, Gustavo
Menezes, Paulo
Perdigao, Fernando
INTERNATIONAL JOURNAL OF ONLINE AND BIOMEDICAL ENGINEERING, 2020, 16 (04) : 15 - 22
[19] Performance Comparison of Speaker and Emotion Recognition
Revathy, A.
Shanmugapriya, P.
Mohan, V.
2015 3RD INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATION AND NETWORKING (ICSCN), 2015,
[20] Speaker Attentive Speech Emotion Recognition
Le Moine, Clement
Obin, Nicolas
Roebel, Axel
INTERSPEECH 2021, 2021, : 2866 - 2870
