Emotional Intensity Estimation of a Japanese Speech Corpus Using Acoustic Features

Times cited: 0
Authors
Kawase, Megumi [1]
Nakayama, Minoru [1]
Affiliations
[1] Tokyo Institute of Technology, Meguro-ku, Tokyo, Japan
Keywords
speech; emotion; intensity; acoustic features; deep learning
DOI
10.1109/IV53921.2021.00032
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Many recent studies have addressed emotion estimation from non-linguistic speech data, but only a few have examined emotional intensity. Failure to read emotional intensity correctly, however, can lead to errors in the responses that humans and machines should make when communicating with each other. In this paper, we developed three deep learning models for emotional intensity estimation and evaluated them on a Japanese speech corpus, obtaining an emotional intensity estimation accuracy of 52.4%. To improve estimation accuracy, we also investigated the correlations between acoustic features and analyzed their properties, finding that the differentiation of gammatone cepstral coefficients varied significantly between intensities.
Pages: 148-153
Number of pages: 6