Prosodic Processing for the Automatic Synthesis of Emotional Russian Speech

Authors
Kaliyev, Arman [1]
Matveev, Yuri N. [1]
Lyakso, Elena E. [1]
Rybin, Sergey V. [2]
Affiliations
[1] St Petersburg State Univ, St Petersburg, Russia
[2] ITMO Univ, St Petersburg, Russia
Funding
Russian Science Foundation
Keywords
emotional speech; pause prediction; prosody; speech synthesis; statistical models; CLASSIFICATION; RECOGNITION; EXPRESSION;
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Automatic speech synthesis technology is undergoing significant changes driven by new solutions in machine learning. These solutions qualitatively improve the sound of synthesized speech, bringing it closer to natural human speech. Against this backdrop, and under the influence of business, the development of artificial emotional speech for human-machine interaction systems has gained new momentum. As a result, prosodic processing for the synthesis of emotional Russian speech has become an important research direction for our group. The article presents an algorithm for predicting pause locations in three categories of emotional speech. In particular, the authors trained classifiers on three corpora of emotional speech, one per emotional category (neutral, excited, and depressed). The results can be used to build a high-quality automatic synthesizer of emotional speech.
Pages: 653-655
Number of pages: 3