Prosodic feature normalization for emotion recognition by using synthesized speech

Cited by: 1

Authors:

Suzuki, Motoyuki [1]

Nakagawa, Shohei [1]

Kita, Kenji [1]

Affiliations:

[1] Univ Tokushima, Inst Sci & Technol, Tokushima 7708506, Japan

Source:

ADVANCES IN KNOWLEDGE-BASED AND INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS | 2012 / Vol. 243

Keywords:

Emotion recognition of speech; prosodic feature normalization; synthesized speech

DOI:

10.3233/978-1-61499-105-2-306

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Emotion recognition from speech signals is one of the most important technologies for natural conversation between humans and robots. Most emotion recognizers extract prosodic features from the input speech and use them for emotion recognition. However, prosodic features change drastically depending on the uttered text. To normalize the differences in prosodic features related to the uttered text, we used a synthesized speech signal. Most speech synthesizers output speech signals with a "neutral" emotion. After prosodic features are extracted from the input speech, they are normalized using the prosodic features extracted from the synthesized speech. We propose two types of normalization: frame-level normalization and vector-level normalization. The experimental results showed that frame-level normalization is effective for two important emotional dimensions: the average normalized difference decreased by 0.41% (pleasantness) and 1.14% (arousal).
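The abstract does not give the normalization formulas, so the following is only a minimal sketch of how the two levels might differ. It assumes that normalization means subtracting the neutral (synthesized) features, that the synthesized speech renders the same text as the input, and that a prosodic contour (for example F0 per frame) is a 1-D array. The function names, the choice of summary statistics, and the linear resampling used as a stand-in for time alignment are all hypothetical, not taken from the paper.

```python
import numpy as np


def resample_contour(contour, n_frames):
    """Linearly resample a 1-D contour to n_frames.

    Assumption: a crude stand-in for whatever time alignment is
    actually used between the input and the synthesized speech.
    """
    contour = np.asarray(contour, dtype=float)
    src = np.linspace(0.0, 1.0, num=len(contour))
    dst = np.linspace(0.0, 1.0, num=n_frames)
    return np.interp(dst, src, contour)


def utterance_vector(contour):
    """Summarize a contour as [mean, std, max, min] (assumed statistics)."""
    contour = np.asarray(contour, dtype=float)
    return np.array([contour.mean(), contour.std(), contour.max(), contour.min()])


def frame_level_normalize(input_contour, synth_contour):
    """Subtract the neutral contour frame by frame, then summarize."""
    input_contour = np.asarray(input_contour, dtype=float)
    aligned = resample_contour(synth_contour, len(input_contour))
    return utterance_vector(input_contour - aligned)


def vector_level_normalize(input_contour, synth_contour):
    """Summarize each contour first, then subtract the neutral vector."""
    return utterance_vector(input_contour) - utterance_vector(synth_contour)
```

Under these assumptions, the frame-level variant removes the text-dependent contour before any statistics are computed, whereas the vector-level variant only compares utterance-level statistics.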


Pages: 306-313

Page count: 8

