On the use of speech parameter contours for emotion recognition

Cited: 0
Authors
Vidhyasaharan Sethu
Eliathamby Ambikairajah
Julien Epps
Affiliations
[1] School of Electrical Engineering and Telecommunications, The University of New South Wales
Keywords
Emotion recognition; Paralinguistic information; Pitch contours; Formant contours; Glottal spectrum; Temporal information; LDC emotional prosody speech corpus;
DOI: not available
Abstract
Many features have been proposed for speech-based emotion recognition, and a majority of them are frame based or statistics estimated from frame-based features. Temporal information is typically modelled on a per-utterance basis, with either functionals of frame-based features or a suitable back-end. This paper investigates an approach that combines both, using temporal contours of parameters extracted from a three-component model of speech production as features in an automatic emotion recognition system with a hidden Markov model (HMM)-based back-end. Consequently, the proposed system models information on a segment-by-segment scale that is larger than the frame scale but smaller than the utterance level. Specifically, linear approximations to temporal contours of formant frequencies, glottal parameters and pitch are used to model short-term temporal information over individual segments of voiced speech. This is followed by the use of HMMs to model longer-term temporal information contained in sequences of voiced segments. Listening tests were conducted to validate the use of linear approximations in this context. Automatic emotion classification experiments were carried out on the Linguistic Data Consortium emotional prosody speech and transcripts corpus and the FAU Aibo corpus to validate the proposed approach.
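The abstract describes approximating each voiced segment's parameter contour (e.g. pitch) by a straight line, so that a whole segment is summarized by two numbers rather than a frame-by-frame sequence. A minimal sketch of that idea, assuming a least-squares line fit over frame-level F0 values (the function name, frame shift, and exact feature pairing are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def linear_contour_features(f0_values, frame_shift_s=0.01):
    """Fit a straight line to an F0 contour over one voiced segment.

    Returns (intercept_hz, slope_hz_per_s): a two-parameter linear
    approximation of the contour, in the spirit of the segment-level
    contour features described in the abstract. The same fit could be
    applied to formant-frequency or glottal-parameter contours.
    """
    f0 = np.asarray(f0_values, dtype=float)
    t = np.arange(len(f0)) * frame_shift_s       # frame times in seconds
    slope, intercept = np.polyfit(t, f0, deg=1)  # least-squares line fit
    return intercept, slope

# Example: a steadily rising pitch contour over a 5-frame voiced segment
intercept, slope = linear_contour_features([200.0, 202.0, 204.0, 206.0, 208.0])
```

Sequences of such per-segment feature pairs could then be passed to an HMM back-end to capture the longer-term temporal structure across voiced segments.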
Related papers (50 total)
  • [21] Emotion recognition in Arabic speech
    Hadjadji, Imene
    Falek, Leila
    Demri, Lyes
    Teffahi, Hocine
    2019 INTERNATIONAL CONFERENCE ON ADVANCED ELECTRICAL ENGINEERING (ICAEE), 2019,
  • [22] Bengali Speech Emotion Recognition
    Mohanta, Abhijit
    Sharma, Uzzal
    PROCEEDINGS OF THE 10TH INDIACOM - 2016 3RD INTERNATIONAL CONFERENCE ON COMPUTING FOR SUSTAINABLE GLOBAL DEVELOPMENT, 2016, : 2812 - 2814
  • [23] Emotion recognition in Arabic speech
    Klaylat, Samira
    Osman, Ziad
    Hamandi, Lama
    Zantout, Rached
    ANALOG INTEGRATED CIRCUITS AND SIGNAL PROCESSING, 2018, 96 (02) : 337 - 351
  • [24] Multiroom Speech Emotion Recognition
    Shalev, Erez
    Cohen, Israel
EUROPEAN SIGNAL PROCESSING CONFERENCE, 2022 : 135 - 139
  • [25] The Impact of Face Mask and Emotion on Automatic Speech Recognition (ASR) and Speech Emotion Recognition (SER)
    Oh, Qi Qi
    Seow, Chee Kiat
    Yusuff, Mulliana
    Pranata, Sugiri
    Cao, Qi
    2023 8TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYTICS, ICCCBDA, 2023, : 523 - 531
  • [26] Emotion Recognition using Imperfect Speech Recognition
    Metze, Florian
    Batliner, Anton
    Eyben, Florian
    Polzehl, Tim
    Schuller, Bjoern
    Steidl, Stefan
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 478 - +
  • [27] Use of Multiple Classifier System for Gender Driven Speech Emotion Recognition
    Ladde, Pravina P.
    Deshmukh, Vaishali S.
    2015 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMMUNICATION NETWORKS (CICN), 2015, : 713 - 717
  • [28] Parameter sharing in speech recognition
    Guo, Rui
    Zhu, Xiaoyan
    2002, Press of Tsinghua University (42):
  • [29] Research on Emergency Parking Instruction Recognition Based on Speech Recognition and Speech Emotion Recognition
    Tian Kexin
    Huang Yongming
    Zhang Guobao
    Zhang Lin
    2019 CHINESE AUTOMATION CONGRESS (CAC2019), 2019, : 2933 - 2937
  • [30] Speech Emotion Recognition Using CNN
    Huang, Zhengwei
    Dong, Ming
    Mao, Qirong
    Zhan, Yongzhao
    PROCEEDINGS OF THE 2014 ACM CONFERENCE ON MULTIMEDIA (MM'14), 2014, : 801 - 804