Prosodic Processing for the Automatic Synthesis of Emotional Russian Speech

Cited by: 0

Authors:

Kaliyev, Arman ^{[1]}

Matveev, Yuri N. ^{[1]}

Lyakso, Elena E. ^{[1]}

Rybin, Sergey V. ^{[2]}

Affiliations:

[1] St Petersburg State Univ, St Petersburg, Russia

[2] ITMO Univ, St Petersburg, Russia

Source:

2018 IEEE INTERNATIONAL CONFERENCE QUALITY MANAGEMENT, TRANSPORT AND INFORMATION SECURITY, INFORMATION TECHNOLOGIES (IT&QM&IS) | 2018

Funding:

Russian Science Foundation;

Keywords:

emotional speech; pause prediction; prosody; speech synthesis; statistical models; CLASSIFICATION; RECOGNITION; EXPRESSION;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Automatic speech synthesis technology is currently undergoing significant changes driven by new solutions in machine learning. These solutions qualitatively improve the sound of synthesized speech, bringing it closer to natural human speech. Against this backdrop, and under the influence of business, the development of artificial emotional speech for human-machine interaction systems has gained new momentum. As a result, prosodic processing for the synthesis of emotional Russian speech has become an important research direction for our research group. The article presents an algorithm for predicting pause locations for three categories of emotional speech. In particular, the authors trained classifiers on three corpora of emotional speech, collected by emotional category (neutral, excited, and depressed). The obtained results can be used to create a high-quality automatic synthesizer of emotional speech.


Pages: 653-655

Page count: 3
