An emotional speech synthesis markup language processor for multi-speaker and emotional text-to-speech applications

Authors
Ryu, Se-Hui [1]
Cho, Hee [1]
Lee, Ju-Hyun [1]
Hong, Ki-Hyung [1]
Affiliations
[1] Sungshin Womens Univ, Dept Serv Design Engn, 34 Da Gil 2, Bomun Ro 02844, South Korea
Source
Keywords
Text-to-speech; Markup language; Emotion; Multiple voice colors
DOI
10.7776/ASK.2021.40.5.523
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we designed and developed an Emotional Speech Synthesis Markup Language (SSML) processor. Multi-speaker emotional speech synthesis technology that can express multiple voice colors and emotions has been developed, and we designed Emotional SSML by extending SSML to support multiple voice colors and emotional expressions. The Emotional SSML processor has a graphical user interface and consists of the following four components. First, a multi-speaker emotional text editor that makes it easy to mark specific voice colors and emotions at desired positions in the text. Second, an Emotional SSML document generator that automatically creates an Emotional SSML document from the output of the multi-speaker emotional text editor. Third, an Emotional SSML parser that parses the Emotional SSML document. Last, a sequencer that controls a multi-speaker emotional Text-to-Speech (TTS) engine based on the output of the Emotional SSML parser. Because it is based on SSML, an open standard that is independent of programming language and platform, the Emotional SSML processor can be easily integrated with various speech synthesis engines and facilitates the development of multi-speaker emotional text-to-speech applications.
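The generator, parser, and sequencer stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not give the Emotional SSML syntax, so the `<emotion type="...">` element, the `Segment` class, and the function names `generate_ssml`, `parse_ssml`, and `run_sequencer` are all hypothetical. Only `<speak>` and `<voice name="...">` come from standard SSML, and the TTS engine is stood in for by a caller-supplied function.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Segment:
    """One span of text marked with a voice color and an emotion (hypothetical)."""
    text: str
    voice: str
    emotion: str


def generate_ssml(segments):
    """Generator stage: build an SSML-like document from marked-up segments."""
    speak = ET.Element("speak", {"version": "1.1"})
    for seg in segments:
        voice = ET.SubElement(speak, "voice", {"name": seg.voice})
        # <emotion> is an assumed extension element, not part of standard SSML.
        emotion = ET.SubElement(voice, "emotion", {"type": seg.emotion})
        emotion.text = seg.text
    return ET.tostring(speak, encoding="unicode")


def parse_ssml(document, default_voice="default", default_emotion="neutral"):
    """Parser stage: flatten the document into (text, voice, emotion) segments."""
    segments = []

    def walk(node, voice, emotion):
        # Update the inherited context when entering a voice or emotion element.
        if node.tag == "voice":
            voice = node.get("name", voice)
        elif node.tag == "emotion":
            emotion = node.get("type", emotion)
        if node.text and node.text.strip():
            segments.append(Segment(node.text.strip(), voice, emotion))
        for child in node:
            walk(child, voice, emotion)
            # Text after a child's closing tag belongs to the current node's context.
            if child.tail and child.tail.strip():
                segments.append(Segment(child.tail.strip(), voice, emotion))

    walk(ET.fromstring(document), default_voice, default_emotion)
    return segments


def run_sequencer(segments, synthesize):
    """Sequencer stage: call the TTS engine once per segment, in document order."""
    return [synthesize(seg.text, seg.voice, seg.emotion) for seg in segments]


if __name__ == "__main__":
    marked = [
        Segment("Hello.", "speaker_a", "happy"),
        Segment("Goodbye.", "speaker_b", "sad"),
    ]
    document = generate_ssml(marked)
    print(document)
    # A stand-in for a real engine call; it only describes what would be spoken.
    for line in run_sequencer(parse_ssml(document),
                              lambda t, v, e: f"[{v}/{e}] {t}"):
        print(line)
```

Keeping the parser output as a flat list of segments means the sequencer needs no knowledge of the markup, which is one way a processor like this could stay independent of the underlying speech synthesis engine.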
Pages: 523-529
Number of pages: 7