Speech activity detection and automatic prosodic processing unit segmentation for emotion recognition

Cited by: 4
Authors
Sztaho, David [1]
Vicsi, Klara [1]
Affiliations
[1] Budapest Univ Technol & Econ, Dept Telecommun & Mediainformat, Magyar Tudosok Korutja 2, H-1117 Budapest, Hungary
Source
Keywords
Speech acoustics; speech segmentation; hidden Markov models; speech processing
DOI
10.3233/IDT-140199
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In speech communication, emotions play an important role in conveying information. They arise partly as reactions to our environment and to our partners during a conversation. Understanding these reactions and recognizing them automatically is highly important, because they give a clearer picture of how a conversation partner responds. In Cognitive Infocommunications, this kind of information helps in developing robots and devices that are more aware of the needs of the user, making them easy and enjoyable to use. In our laboratory we conducted automatic emotion classification and speech segmentation experiments. To build an automatic speech-based emotion recognition system, an automatic speech segmenter is also needed to separate out the speech segments used for emotion analysis. In our earlier research we found that the intonational phrase can be a suitable unit for emotion analysis. In this paper, speech detection and segmentation methods are developed. For speech detection, hidden Markov models are used with various noise and speech acoustic models. The results show that the procedure can detect speech in the sound signal with more than 91% accuracy and segment it into intonational phrases.
Pages: 315-324
Page count: 10
Related papers
  • [1] Acoustic-Prosodic Recognition of Emotion in Speech
    Montenegro, Chuchi S.
    Maravillas, Elmer A.
    [J]. 2015 INTERNATIONAL CONFERENCE ON HUMANOID, NANOTECHNOLOGY, INFORMATION TECHNOLOGY, COMMUNICATION AND CONTROL, ENVIRONMENT AND MANAGEMENT (HNICEM), 2015, : 527 - +
  • [2] Fully Automatic Segmentation for Prosodic Speech Corpora
    Hoffmann, Sarah
    Pfister, Beat
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1389 - 1392
  • [3] Impact of action unit detection in automatic emotion recognition
    Senechal, Thibaud
    Bailly, Kevin
    Prevost, Lionel
    [J]. PATTERN ANALYSIS AND APPLICATIONS, 2014, 17 (01) : 51 - 67
  • [4] Impact of action unit detection in automatic emotion recognition
    Thibaud Senechal
    Kevin Bailly
    Lionel Prevost
    [J]. Pattern Analysis and Applications, 2014, 17 : 51 - 67
  • [5] AUTOMATIC DETECTION OF PROSODIC BOUNDARIES IN SPEECH
    CAMPBELL, N
    [J]. SPEECH COMMUNICATION, 1993, 13 (3-4) : 343 - 354
  • [6] Prosodic and accentual information for automatic speech recognition
    Milone, DH
    Rubio, AJ
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2003, 11 (04) : 321 - 333
  • [7] Prosodic knowledge sources for automatic speech recognition
    Vergyri, D
    Stolcke, A
    Gadde, VRR
    Ferrer, L
    Shriberg, E
    [J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 208 - 211
  • [8] A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
    Esmaileyan, Z.
    Marvi, H.
    [J]. INTERNATIONAL JOURNAL OF ENGINEERING, 2014, 27 (01) : 79 - 89
  • [9] Spontaneous Speech Segmentation: Functional and Prosodic Aspects with Applications for Automatic Segmentation
    Barbosa, Plinio A.
    Raso, Tommaso
    [J]. REVISTA DE ESTUDOS DA LINGUAGEM, 2018, 26 (04) : 1361 - 1396
  • [10] Automatic statistical analysis of the signal and prosodic signs of emotion in speech
    Cowie, R
    Douglas-Cowie, E
    [J]. ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 1989 - 1992