Emotional speech synthesis based on improved codebook mapping voice conversion

Cited by: 0

Authors:

Wang, YP [1]

Ling, ZH [1]

Wang, RH [1]

Affiliations:

[1] Univ Sci & Technol China, iFlytek Speech Lab, Hefei 230026, Peoples R China

Source:

AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION, PROCEEDINGS | 2005 / Vol. 3784

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents a spectral transformation method for emotional speech synthesis based on a voice conversion framework. Three emotions are studied: anger, happiness and sadness. To achieve high naturalness, superior speech quality and emotional expressiveness, our original STASC system is modified by introducing a new feature selection strategy and a hierarchical codebook mapping procedure. Our results show that the LSF coefficients at low frequencies carry more emotion-related information, and therefore only these coefficients are converted. Listening tests show that the proposed method achieves a satisfactory balance between emotional expression and speech quality in the converted speech signals.
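The idea of converting only the low-frequency LSF coefficients can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no details of the feature selection or the hierarchical procedure, so the sketch assumes a plain soft (weighted) codebook mapping between paired source and target codebooks. The function name `convert_low_lsf`, the parameters `n_low` and `gamma`, and the tiny codebooks are all hypothetical.

```python
import numpy as np

def convert_low_lsf(source_lsf, src_codebook, tgt_codebook, n_low=4, gamma=10.0):
    """Map only the first n_low LSF coefficients of one frame.

    Assumptions (not from the paper): codebooks are paired row by row,
    weights come from an exponential of the Euclidean distance, and
    n_low / gamma are free illustrative parameters.
    """
    source_lsf = np.asarray(source_lsf, dtype=float)
    src_codebook = np.asarray(src_codebook, dtype=float)
    tgt_codebook = np.asarray(tgt_codebook, dtype=float)

    # Distance from the frame to each source codeword, low-order part only.
    dists = np.linalg.norm(src_codebook[:, :n_low] - source_lsf[:n_low], axis=1)

    # Closer codewords receive larger weights; subtracting the minimum
    # distance keeps the exponentials numerically well behaved.
    weights = np.exp(-gamma * (dists - dists.min()))
    weights /= weights.sum()

    # Replace the low-order coefficients with the weighted target codewords;
    # the higher-order coefficients are copied through unchanged.
    converted = source_lsf.copy()
    converted[:n_low] = weights @ tgt_codebook[:, :n_low]
    return converted

# Hypothetical two-entry codebooks with four LSFs per codeword (radians).
src_cb = np.array([[0.10, 0.20, 0.50, 0.90],
                   [0.30, 0.40, 0.60, 1.00]])
tgt_cb = np.array([[0.15, 0.25, 0.70, 1.10],
                   [0.35, 0.45, 0.80, 1.20]])
frame = np.array([0.10, 0.20, 0.55, 0.95])

converted_frame = convert_low_lsf(frame, src_cb, tgt_cb, n_low=2, gamma=1000.0)
```

With a large `gamma` the mapping approaches a hard nearest-codeword lookup; smaller values blend several target codewords. A practical system would also need to verify that the converted LSF vector stays in ascending order, since mixing converted and unconverted coefficients can violate that ordering.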

Citation

Pages: 374-381

Number of pages: 8

Related records: 50 in total

[21] A COMPARISON OF DISCRETE AND SOFT SPEECH UNITS FOR IMPROVED VOICE CONVERSION
van Niekerk, Benjamin
Carbonneau, Marc-Andre
Zaidi, Julian
Baas, Matthew
Seute, Hugo
Kamper, Herman
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6562 - 6566
[22] Spectral voice conversion for text-to-speech synthesis
Kain, A
Macon, MW
PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 285 - 288
[23] Synthesis of Child Speech With HMM Adaptation and Voice Conversion
Watts, Oliver
Yamagishi, Junichi
King, Simon
Berkling, Kay
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (05): : 1005 - 1016
[24] Codebook Clustering for Unit Selection based EMG-to-Speech Conversion
Diener, Lorenz
Janke, Matthias
Schultz, Tanja
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2420 - 2424
[25] An improved CycleGAN-based emotional voice conversion model by augmenting temporal dependency with a transformer
Fu, Changzeng
Liu, Chaoran
Ishi, Carlos Toshinori
Ishiguro, Hiroshi
SPEECH COMMUNICATION, 2022, 144 : 110 - 121
[26] Speech Analysis/Synthesis by Gaussian Mixture Approximation of the Speech Spectrum for Voice Conversion
Amini, Jamal
Shahrebabaki, Abdoreza Sabzi
Shokouhi, Navid
Sheikhzadeh, Hamid
Raahemifa, Kaamran
Eslami, Mehdi
2013 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (IEEE ISSPIT 2013), 2013, : 428 - 433
[27] A Dual Alignment Scheme for Improved Speech-to-Singing Voice Conversion
Vijayan, Karthika
Dong, Minghui
Li, Haizhou
2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017), 2017, : 1598 - 1606
[28] Electrolaryngeal Speech Enhancement Based on Statistical Voice Conversion
Nakamura, Keigo
Toda, Tomoki
Saruwatari, Hiroshi
Shikano, Kiyohiro
INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1443 - 1446
[29] Voice quality conversion in TD-PSOLA speech synthesis
Sun, XJ
2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000, : 953 - 956
[30] HMM adaptation and voice conversion for the synthesis of child speech: a comparison
Watts, Oliver
Yamagishi, Junichi
King, Simon
Berkling, Kay
INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2595 - +
