Emotional Intensity Estimation of a Japanese Speech Corpus Using Acoustic Features

Cited by: 0
Authors
Kawase, Megumi [1]
Nakayama, Minoru [1]
Affiliation
[1] Tokyo Institute of Technology, Meguro-ku, Tokyo, Japan
Keywords
speech; emotion; intensity; acoustic features; deep learning
DOI
10.1109/IV53921.2021.00032
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, there have been many studies of emotion estimation using non-linguistic speech data, but only a few on emotional intensity. Failure to read emotional intensity correctly, however, can lead to errors in the responses that humans and machines make when communicating with each other. In this paper, we developed three deep learning models for emotional intensity estimation and evaluated them on a Japanese speech corpus, obtaining an estimation accuracy of 52.4%. To improve estimation accuracy, we also investigated the correlations between acoustic features and analyzed their properties, and found that the differentiation of gammatone cepstral coefficients varied significantly between intensities.
Pages: 148-153
Number of pages: 6