Emotion Invariant Speaker Embeddings for Speaker Identification with Emotional Speech

Cited by: 0
Authors
Sarma, Biswajit Dev [1]
Das, Rohan Kumar [2]
Affiliations
[1] Indian Inst Technol Guwahati, Gauhati, India
[2] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore
Funding
National Research Foundation, Singapore;
Keywords
WHISPERED SPEECH; RECOGNITION; VERIFICATION;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The emotional state of a speaker has a significant effect on speech production, causing speech to deviate from that produced in a neutral state. This makes identifying speakers across different emotions a challenging task, since speaker models are generally trained on neutral speech. In this work, we propose to overcome this problem by creating emotion-invariant speaker embeddings. We learn an extractor network that maps test embeddings with different emotions, obtained using an i-vector based system, to an emotion-invariant space. The resulting test embeddings thus become emotion invariant and thereby compensate for the mismatch between emotional states. The studies are conducted using four emotion classes from the IEMOCAP database. We obtain an absolute improvement of 2.6% in speaker identification accuracy using the emotion-invariant speaker embeddings over an average speaker model based framework with different emotions.
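The idea of mapping emotional test embeddings into an emotion-invariant space before scoring them against speaker models can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the abstract does not describe the extractor network's architecture or training objective, so a closed-form linear ridge-regression map stands in for it, the embeddings are synthetic toy vectors rather than i-vectors from IEMOCAP, emotion is modeled as a simple additive offset, and the function names (`fit_linear_map`, `identify`, `make_toy_embeddings`, `demo`) are hypothetical.

```python
import numpy as np


def fit_linear_map(embeddings, targets, ridge=1e-3):
    """Closed-form ridge regression: find W so that embeddings @ W ~ targets.

    This linear map is a stand-in for a learned extractor network.
    """
    dim = embeddings.shape[1]
    gram = embeddings.T @ embeddings + ridge * np.eye(dim)
    return np.linalg.solve(gram, embeddings.T @ targets)


def identify(test_embeddings, speaker_models):
    """Return, per test embedding, the index of the most cosine-similar model."""
    t = test_embeddings / np.linalg.norm(test_embeddings, axis=1, keepdims=True)
    m = speaker_models / np.linalg.norm(speaker_models, axis=1, keepdims=True)
    return np.argmax(t @ m.T, axis=1)


def make_toy_embeddings(rng, speakers, offsets, per_cell, noise=0.1):
    """Synthetic embeddings: speaker vector + emotion offset + Gaussian noise."""
    n_spk, dim = speakers.shape
    n_emo = offsets.shape[0]
    labels = np.repeat(np.arange(n_spk), n_emo * per_cell)
    emotions = np.tile(np.repeat(np.arange(n_emo), per_cell), n_spk)
    jitter = noise * rng.normal(size=(labels.size, dim))
    return speakers[labels] + offsets[emotions] + jitter, labels


def demo(seed=0):
    rng = np.random.default_rng(seed)
    speakers = rng.normal(size=(10, 16))       # toy "neutral" speaker models
    offsets = 2.0 * rng.normal(size=(4, 16))   # four toy emotion classes
    train_x, train_y = make_toy_embeddings(rng, speakers, offsets, per_cell=20)
    test_x, test_y = make_toy_embeddings(rng, speakers, offsets, per_cell=20)

    # Learn a map from emotional embeddings to each speaker's neutral model.
    w = fit_linear_map(train_x, speakers[train_y])

    # Score raw and mapped test embeddings against the same neutral models.
    baseline = np.mean(identify(test_x, speakers) == test_y)
    mapped = np.mean(identify(test_x @ w, speakers) == test_y)
    return baseline, mapped


if __name__ == "__main__":
    baseline_acc, mapped_acc = demo()
    print(f"baseline accuracy: {baseline_acc:.3f}")
    print(f"mapped accuracy:   {mapped_acc:.3f}")
```

The sketch only shows the shape of the pipeline (extract embeddings, map them, score against neutral speaker models); any accuracies it prints describe the toy data and say nothing about the results reported in the abstract.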
Pages: 610-615
Page count: 6
Related papers
50 in total
  • [1] Speaker Modeling Using Emotional Speech for More Robust Speaker Identification
    M. Milošević
    Ž. Nedeljković
    U. Glavitsch
    Ž. Đurović
    [J]. Journal of Communications Technology and Electronics, 2019, 64 : 1256 - 1265
  • [2] Speaker Modeling Using Emotional Speech for More Robust Speaker Identification
    Milosevic, M.
    Nedeljkovic, Z.
    Glavitsch, U.
    Durovic, Z.
    [J]. JOURNAL OF COMMUNICATIONS TECHNOLOGY AND ELECTRONICS, 2019, 64 (11) : 1256 - 1265
  • [3] Emotion Attribute Projection for Speaker Recognition on Emotional Speech
    Bao, Huanjun
    Xu, Mingxing
    Zheng, Thomas Fang
    [J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 601 - 604
  • [4] Text Independent Speaker and Emotion Independent Speech Recognition in Emotional Environment
    Revathi, A.
    Venkataramani, Y.
    [J]. INFORMATION SYSTEMS DESIGN AND INTELLIGENT APPLICATIONS, VOL 1, 2015, 339 : 43 - 52
  • [5] Speaker Awareness for Speech Emotion Recognition
    Assuncao, Gustavo
    Menezes, Paulo
    Perdigao, Fernando
    [J]. INTERNATIONAL JOURNAL OF ONLINE AND BIOMEDICAL ENGINEERING, 2020, 16 (04) : 15 - 22
  • [6] Speaker Attentive Speech Emotion Recognition
    Le Moine, Clement
    Obin, Nicolas
    Roebel, Axel
    [J]. INTERSPEECH 2021, 2021, : 2866 - 2870
  • [7] IMPROVING SPEAKER IDENTIFICATION FOR SHARED DEVICES BY ADAPTING EMBEDDINGS TO SPEAKER SUBSETS
    Tan, Zhenning
    Yang, Yuguang
    Han, Eunjung
    Stolcke, Andreas
    [J]. 2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 1124 - 1131
  • [8] Domain Invariant Feature Learning for Speaker-Independent Speech Emotion Recognition
    Lu, Cheng
    Zong, Yuan
    Zheng, Wenming
    Li, Yang
    Tang, Chuangao
    Schuller, Bjoern W.
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 2217 - 2230
  • [9] Cross-Age Speaker Verification: Learning Age-Invariant Speaker Embeddings
    Qin, Xiaoyi
    Li, Na
    Weng, Chao
    Su, Dan
    Li, Ming
    [J]. INTERSPEECH 2022, 2022, : 1436 - 1440
  • [10] On Deep Speaker Embeddings for Speaker Verification
    Jakubec, Maros
    Jarina, Roman
    Lieskovska, Eva
    Chmulik, Michal
    [J]. 2021 44TH INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS AND SIGNAL PROCESSING (TSP), 2021, : 162 - 166