Speech synthesis of emotions using vowel features of a speaker

Cited by: 3
Authors
Boku, Kanu [1 ]
Asada, Taro [1 ]
Yoshitomi, Yasunari [1 ]
Tabuse, Masayoshi [1 ]
Affiliations
[1] Kyoto Prefectural Univ, Grad Sch Life & Environm Sci, Sakyo Ku, 1-5 Nakaragi Cho, Shimogamo, Kyoto 6068522, Japan
Keywords
Emotional speech; Feature parameter; Synthetic speech; Emotional synthetic speech; Vowel
DOI
10.1007/s10015-013-0126-9
Chinese Library Classification (CLC)
TP24 [Robotics]
Discipline classification codes
080202; 1405
Abstract
Recently, methods for adding emotion to synthetic speech have received considerable attention in the field of speech synthesis research. We previously proposed a case-based method for generating emotional synthetic speech by exploiting the characteristics of the maximum amplitude and utterance duration of vowels, together with the fundamental frequency, of emotional speech. In the present study, we improve on that method by controlling the fundamental frequency of the emotional synthetic speech. As an initial investigation, we adopted the utterance of a Japanese name that is semantically neutral. Using the proposed method, emotional synthetic speech generated from the emotional speech of one male subject was discriminated with a mean accuracy of 83.9% when 18 subjects listened to synthetic utterances of "angry," "happy," "neutral," "sad," or "surprised" for the Japanese names "Taro" and "Hiroko." Further adjustment of the fundamental frequency in the proposed method gave the subjects a much clearer impression of the intended emotion in the synthetic speech.
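The abstract names three vowel-level feature parameters: maximum amplitude, utterance duration, and the fundamental frequency (F0). As a minimal illustrative sketch of F0 and amplitude control only, and not the authors' actual implementation, the Python snippet below analyzes a neutral utterance with the WORLD vocoder (via the pyworld bindings, an assumed stand-in; the paper does not name its synthesis tool), scales the F0 contour and amplitude by hypothetical per-emotion ratios, and resynthesizes the waveform.

```python
# Minimal sketch: resynthesize a neutral utterance with emotion-dependent
# F0 and amplitude scaling. pyworld/soundfile and the ratio values are
# illustrative assumptions, not the paper's case-based method.
import numpy as np
import pyworld as pw    # WORLD vocoder bindings (pip install pyworld)
import soundfile as sf  # audio I/O (pip install soundfile)

# Hypothetical per-emotion scaling ratios for F0 and amplitude, standing in
# for values that the case-based approach would measure from a speaker's
# recorded emotional speech.
EMOTION_RATIOS = {
    "angry":     {"f0": 1.4, "amp": 1.3},
    "happy":     {"f0": 1.2, "amp": 1.1},
    "neutral":   {"f0": 1.0, "amp": 1.0},
    "sad":       {"f0": 0.8, "amp": 0.8},
    "surprised": {"f0": 1.5, "amp": 1.2},
}

def synthesize_emotion(wav_in: str, wav_out: str, emotion: str) -> None:
    x, fs = sf.read(wav_in)
    x = x.astype(np.float64)  # WORLD expects float64 mono input
    r = EMOTION_RATIOS[emotion]

    # WORLD analysis: F0 contour, spectral envelope, aperiodicity.
    f0, t = pw.harvest(x, fs)
    sp = pw.cheaptrick(x, f0, t, fs)
    ap = pw.d4c(x, f0, t, fs)

    # Control the fundamental frequency of the synthetic speech by scaling
    # the contour (voiced frames only; unvoiced frames stay at 0).
    f0_mod = np.where(f0 > 0, f0 * r["f0"], 0.0)

    y = pw.synthesize(f0_mod, sp, ap, fs)
    y = np.clip(y * r["amp"], -1.0, 1.0)  # amplitude scaling, clip guard
    sf.write(wav_out, y, fs)

# Example: synthesize_emotion("taro_neutral.wav", "taro_angry.wav", "angry")
```

In the method described in the abstract, the ratios would be derived from the speaker's emotional speech rather than fixed by hand, and vowel utterance duration (the third feature parameter) would also be modified, e.g., by repeating or dropping WORLD analysis frames; both are omitted here for brevity.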
Pages: 27-32
Number of pages: 6
Related papers
50 records in total
  • [1] Speech synthesis of emotions using vowel features of a speaker
    Boku, K.
    Asada, T.
    Yoshitomi, Y.
    Tabuse, M.
    PROCEEDINGS OF THE EIGHTEENTH INTERNATIONAL SYMPOSIUM ON ARTIFICIAL LIFE AND ROBOTICS (AROB 18TH '13), 2013, : 176 - 179
  • [2] Speech Synthesis of Emotions Using Vowel Features
    Boku, Kanu
    Asada, Taro
    Yoshitomi, Yasunari
    Tabuse, Masayoshi
    INTERNATIONAL JOURNAL OF SOFTWARE INNOVATION, 2013, 1 (01) : 54 - 67
  • [3] Speech Synthesis of Emotions in a Sentence Using Vowel Features
    Makino, Rintaro
    Yoshitomi, Yasunari
    Asada, Taro
    Tabuse, Masayoshi
    PROCEEDINGS OF THE 2020 INTERNATIONAL CONFERENCE ON ARTIFICIAL LIFE AND ROBOTICS (ICAROB2020), 2020, : 403 - 406
  • [4] Speech Synthesis of Emotions in a Sentence using Vowel Features
    Makino, Rintaro
    Yoshitomi, Yasunari
    Asada, Taro
    Tabuse, Masayoshi
    JOURNAL OF ROBOTICS NETWORKING AND ARTIFICIAL LIFE, 2020, 7 (02): : 107 - 110
  • [5] VOWEL AND SPEAKER IDENTIFICATION IN NATURAL AND SYNTHETIC SPEECH
    LEHISTE, I
    MELTZER, D
    LANGUAGE AND SPEECH, 1973, 16 (OCT-D) : 356 - 364
  • [6] VOWEL AND SPEAKER IDENTIFICATION IN NATURAL AND SYNTHETIC SPEECH
    MELTZER, D
    LEHISTE, I
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1972, 51 (01): : 131 - &
  • [7] Speaker identification using speech and lip features
    Ou, GB
    Li, X
    Yao, XC
    Jia, HB
    Murphey, YL
    PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), VOLS 1-5, 2005, : 2565 - 2570
  • [8] SPEAKER-INDEPENDENT VOWEL RECOGNITION IN PERSIAN SPEECH
    Nazari, Mohammad
    Sayadiyan, Abolghasem
    Valiollahzadeh, Seyyed Majid
    2008 3RD INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGIES: FROM THEORY TO APPLICATIONS, VOLS 1-5, 2008, : 672 - 676
  • [9] Classification of Emotions from Speech using Implicit Features
    Srivastava, Mohit
    Agarwal, Anupam
    2014 9TH INTERNATIONAL CONFERENCE ON INDUSTRIAL AND INFORMATION SYSTEMS (ICIIS), 2014, : 266 - 271
  • [10] SPEAKER NORMALIZATION OF STATIC AND DYNAMIC VOWEL SPECTRAL FEATURES
    ZAHORIAN, SA
    JAGHARGHI, AJ
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1991, 90 (01): : 67 - 75