Prosodic feature normalization for emotion recognition by using synthesized speech

Cited by: 1
Authors
Suzuki, Motoyuki [1]
Nakagawa, Shohei [1]
Kita, Kenji [1]
Affiliations
[1] Univ Tokushima, Inst Sci & Technol, Tokushima 7708506, Japan
Keywords
Emotion recognition of speech; prosodic feature normalization; synthesized speech
DOI
10.3233/978-1-61499-105-2-306
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Emotion recognition from speech signals is one of the most important technologies for natural conversation between humans and robots. Most emotion recognizers extract prosodic features from the input speech for use in emotion recognition. However, prosodic features change drastically depending on the uttered text. To normalize the differences in prosodic features related to the uttered text, we used a synthesized speech signal. Most speech synthesizers output speech signals with a "neutral" emotion. After prosodic features are extracted from the input speech, they are normalized using the prosodic features extracted from the synthesized speech. We propose two types of normalization: frame-level normalization and vector-level normalization. The experimental results showed that frame-level normalization is effective for two important emotional dimensions. The average normalized difference decreased by 0.41% (pleasantness) and 1.14% (arousal).
Pages: 306-313
Number of pages: 8
Related papers
50 in total
  • [1] Study of prosodic feature extraction for multidialectal Odia speech emotion recognition
    Swain, Monorama
    Routray, Aurobinda
    Kabisatpathy, P.
    Kundu, Jogendra N.
    PROCEEDINGS OF THE 2016 IEEE REGION 10 CONFERENCE (TENCON), 2016, : 1644 - 1649
  • [2] Prosodic Feature Based Speech Emotion Recognition At Segmental and Supra Segmental Levels
    Jacob, Agnes
    Mythili, P.
    2015 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, INFORMATICS, COMMUNICATION AND ENERGY SYSTEMS (SPICES), 2015,
  • [3] Emotion Recognition from Speech using Prosodic and Linguistic Features
    Pervaiz, Mahwish
    Khan, Tamim Ahmed
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2016, 7 (08) : 84 - 90
  • [4] Emotion recognition method based on normalization of prosodic features
    Suzuki, Motoyuki
    Nakagawa, Shohei
    Kita, Kenji
    2013 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2013,
  • [5] Acoustic-Prosodic Recognition of Emotion in Speech
    Montenegro, Chuchi S.
    Maravillas, Elmer A.
    2015 INTERNATIONAL CONFERENCE ON HUMANOID, NANOTECHNOLOGY, INFORMATION TECHNOLOGY, COMMUNICATION AND CONTROL, ENVIRONMENT AND MANAGEMENT (HNICEM), 2015, : 527 - +
  • [6] Speech emotion recognition using emotion perception spectral feature
    Jiang, Lin
    Tan, Ping
    Yang, Junfeng
    Liu, Xingbao
    Wang, Chao
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, 33 (11):
  • [7] Speech Emotion Recognition Using Speech Feature and Word Embedding
    Atmaja, Bagus Tris
    Shirai, Kiyoaki
    Akagi, Masato
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 519 - 523
  • [8] Emotion recognition from speech using global and local prosodic features
    Rao K.S.
    Koolagudi S.G.
    Vempada R.R.
    International Journal of Speech Technology, 2013, 16 (2) : 143 - 160
  • [9] Emotion recognition from speech using source, system, and prosodic features
    Koolagudi, Shashidhar G.
    Rao, K. Sreenivasa
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2012, 15 (02) : 265 - 289
  • [10] Improving Speech Emotion Recognition System Using Spectral and Prosodic Features
    Chakhtouna, Adil
    Sekkate, Sara
    Adib, Abdellah
    INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS, ISDA 2021, 2022, 418 : 399 - 409