Emotional speech synthesis based on improved codebook mapping voice conversion

Cited by: 0
Authors
Wang, YP [1]
Ling, ZH [1]
Wang, RH [1]
Affiliation
[1] Univ Sci & Technol China, iFlytek Speech Lab, Hefei 230026, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a spectral transformation method for emotional speech synthesis based on a voice conversion framework. Three emotions are studied: anger, happiness, and sadness. To achieve high naturalness, speech quality, and emotional expressiveness, our original STASC system is modified by introducing a new feature selection strategy and a hierarchical codebook mapping procedure. Our results show that the low-frequency LSF coefficients carry more emotion-related information, and therefore only these coefficients are converted. Listening tests show that the proposed method achieves a satisfactory balance between emotional expression and speech quality in the converted speech signals.
Pages: 374-381
Number of pages: 8