On the Correlation and Transferability of Features between Automatic Speech Recognition and Speech Emotion Recognition

Cited by: 19
Authors
Fayek, Haytham M. [1 ]
Lech, Margaret [1 ]
Cavedon, Lawrence [2 ]
Affiliations
[1] RMIT Univ, Sch Engn, Melbourne, Vic 3001, Australia
[2] RMIT Univ, Sch Sci, Melbourne, Vic 3001, Australia
Source
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016
Keywords
deep learning; emotion recognition; neural networks; speech recognition; transfer learning; NEURAL-NETWORKS;
DOI
10.21437/Interspeech.2016-868
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
The correlation between Automatic Speech Recognition (ASR) and Speech Emotion Recognition (SER) is poorly understood. Studying such correlation may pave the way for integrating both tasks into a single system or may provide insights that can aid in advancing both systems such as improving ASR in dealing with emotional speech or embedding linguistic input into SER. In this paper, we quantify the relation between ASR and SER by studying the relevance of features learned between both tasks in deep convolutional neural networks using transfer learning. Experiments are conducted using the TIMIT and IEMOCAP databases. Results reveal an intriguing correlation between both tasks, where features learned in some layers particularly towards initial layers of the network for either task were found to be applicable to the other task with varying degree.
Pages: 3618-3622
Number of pages: 5