SPEAKER VARIABILITY IN EMOTION RECOGNITION - AN ADAPTATION BASED APPROACH

Authors
Ding, Ni [1]
Sethu, Vidhyasaharan [1]
Epps, Julien [1]
Ambikairajah, Eliathamby [1]
Affiliations
[1] University of New South Wales, School of Electrical Engineering and Telecommunications, Sydney, NSW 2052, Australia
Keywords
Speaker adaptation; emotion classification; speaker normalisation; bootstrapping
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
None of the features commonly utilised in automatic emotion classification systems completely disassociate emotion-specific information from speaker-specific information. Consequently, this speaker-specific variability adversely affects the performance of the emotion classification system and, in existing systems, is frequently mitigated by some form of speaker normalisation. Speaker adaptation offers an alternative to normalisation. This paper proposes a novel bootstrapping technique for GMM-based emotion classification, in which appropriate initial models are selected from a large training pool prior to speaker adaptation of the emotion models. Evaluations on the LDC Emotional Prosody and FAU Aibo corpora reveal that an emotion classification system based on the proposed bootstrapping method outperforms systems based on speaker normalisation, as long as a small amount of labelled adaptation data is available. It also outperforms speaker adaptation from common initial models estimated from all training speakers.
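The general idea of picking initial models from a training pool and then adapting them to a new speaker can be illustrated with a minimal sketch. The abstract does not state the selection criterion or the adaptation algorithm, so everything below is an assumption made for illustration only: selection by average log-likelihood of the labelled adaptation data, mean-only relevance-MAP adaptation, one-dimensional features, and all function names, speaker names and numbers are hypothetical.

```python
import math

def log_gauss(x, mean, var):
    """Log-density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def component_logs(gmm, x):
    """Per-component log(weight * density); gmm is a list of (weight, mean, var)."""
    return [math.log(w) + log_gauss(x, m, v) for w, m, v in gmm]

def gmm_loglik(gmm, frames):
    """Average per-frame log-likelihood of the frames under the GMM."""
    total = 0.0
    for x in frames:
        logs = component_logs(gmm, x)
        mx = max(logs)
        total += mx + math.log(sum(math.exp(l - mx) for l in logs))
    return total / len(frames)

def select_initial_models(pool, adapt_data):
    """Assumed selection rule: choose the training speaker whose emotion GMMs
    give the highest likelihood on the labelled adaptation data.
    pool: {speaker: {emotion: gmm}}, adapt_data: {emotion: frames}."""
    def score(models):
        return sum(gmm_loglik(models[e], f) for e, f in adapt_data.items())
    return max(pool, key=lambda spk: score(pool[spk]))

def map_adapt_means(gmm, frames, relevance=4.0):
    """Assumed adaptation step: relevance-MAP update of the means only."""
    counts = [0.0] * len(gmm)
    sums = [0.0] * len(gmm)
    for x in frames:
        logs = component_logs(gmm, x)
        mx = max(logs)
        post = [math.exp(l - mx) for l in logs]
        z = sum(post)
        for k, p in enumerate(post):
            counts[k] += p / z
            sums[k] += p / z * x
    adapted = []
    for k, (w, m, v) in enumerate(gmm):
        alpha = counts[k] / (counts[k] + relevance)
        data_mean = sums[k] / counts[k] if counts[k] > 0 else m
        adapted.append((w, alpha * data_mean + (1 - alpha) * m, v))
    return adapted

def classify(models, frames):
    """Return the emotion whose GMM gives the highest average log-likelihood."""
    return max(models, key=lambda e: gmm_loglik(models[e], frames))

# Toy pool of two training speakers (made-up numbers).
pool = {
    "spk_a": {"neutral": [(1.0, 0.0, 1.0)], "angry": [(1.0, 2.0, 1.0)]},
    "spk_b": {"neutral": [(1.0, 5.0, 1.0)], "angry": [(1.0, 8.0, 1.0)]},
}
# A small amount of labelled adaptation data from the target speaker.
adapt_data = {"neutral": [5.5, 6.0, 5.8], "angry": [8.5, 9.0, 8.7]}

best = select_initial_models(pool, adapt_data)
adapted = {e: map_adapt_means(pool[best][e], adapt_data[e]) for e in adapt_data}
prediction = classify(adapted, [8.6, 8.9])
```

In this toy setup the target speaker's data sits closest to the second training speaker, so that speaker's models serve as the starting point, and the adapted means move part of the way towards the adaptation data. A baseline using common initial models would instead start adaptation from a single set of models estimated from all training speakers.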
Pages: 5101-5104
Number of pages: 4