Emotional Intensity Estimation of a Japanese Speech Corpus Using Acoustic Features

Cited by: 0
Authors
Kawase, Megumi [1]
Nakayama, Minoru [1]
Affiliation
[1] Tokyo Institute of Technology, Meguro-ku, Tokyo, Japan
Keywords
speech; emotion; intensity; acoustic features; deep learning
DOI
10.1109/IV53921.2021.00032
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Recently, there have been many studies of emotion estimation from non-linguistic speech data, but only a few have addressed emotional intensity. Misreading emotional intensity, however, can lead to errors in the responses humans and machines should make when communicating with each other. In this paper, we developed three deep-learning models for emotional intensity estimation and evaluated them on a Japanese speech corpus, achieving 52.4% estimation accuracy. To improve this accuracy, we also investigated the correlations between acoustic features and analyzed their properties, and found that the differentiated (delta) gammatone cepstral coefficients varied significantly between intensity levels.
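The abstract describes a pipeline in which frame-level cepstral features and their differentiated (delta) versions are fed to deep-learning models that predict a discrete emotional-intensity level. The following is a minimal sketch of that kind of pipeline, not the authors' implementation: MFCCs stand in for the gammatone cepstral coefficients used in the paper (librosa has no built-in gammatone front end), the utterance file names and the three intensity levels are hypothetical, and a small feed-forward network replaces whatever architectures the paper actually compares.

```python
# Sketch of an emotional-intensity classifier over pooled cepstral features.
# Assumptions: MFCCs approximate GTCCs; wav paths and labels are placeholders.
import numpy as np
import librosa
import torch
import torch.nn as nn

def utterance_features(path, sr=16000, n_ceps=13):
    """Pool cepstral coefficients and their first-order deltas over one utterance."""
    y, _ = librosa.load(path, sr=sr)
    ceps = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_ceps)  # stand-in for GTCC
    d_ceps = librosa.feature.delta(ceps, order=1)            # differentiated coefficients
    feats = np.concatenate([ceps, d_ceps], axis=0)           # (2 * n_ceps, frames)
    return np.concatenate([feats.mean(axis=1), feats.std(axis=1)])  # utterance vector

class IntensityClassifier(nn.Module):
    """Small feed-forward network mapping pooled acoustic features to intensity levels."""
    def __init__(self, in_dim, n_levels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, n_levels),
        )

    def forward(self, x):
        return self.net(x)

if __name__ == "__main__":
    # Hypothetical labelled utterances: (wav path, intensity level in {0, 1, 2}).
    data = [("utt_weak.wav", 0), ("utt_medium.wav", 1), ("utt_strong.wav", 2)]
    X = torch.tensor(np.stack([utterance_features(p) for p, _ in data]),
                     dtype=torch.float32)
    y = torch.tensor([label for _, label in data])

    model = IntensityClassifier(in_dim=X.shape[1])
    optim = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(100):                  # toy training loop
        optim.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        optim.step()
    print(model(X).argmax(dim=1))         # predicted intensity levels
```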
Pages: 148-153
Number of pages: 6