Text-independent voice conversion based on state mapped codebook

Cited by: 0

Authors:

Zhang, Meng [1]

Tao, Jianhua [1]

Tian, Jilei [2]

Wang, Xia [3]

Affiliations:

[1] Chinese Academy of Sciences, Institute of Automation, National Laboratory of Pattern Recognition, Beijing, People's Republic of China

[2] Nokia Research Center, Interaction Core Technology Center, Tampere, Finland

[3] Nokia Research Center, Tampere, Finland

Source:

2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12 | 2008

Funding:

National Natural Science Foundation of China;

Keywords:

text-independent; voice conversion; hidden Markov model; state mapped codebook;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Voice conversion has become increasingly important in speech technology, but most current work requires parallel utterances from both the source and target speakers as the training corpus, which limits the technology's applicability. In this paper, we propose a new text-independent voice conversion method that trains on a non-parallel corpus. A hidden Markov model (HMM) is used to represent the phonetic structure of the training speech and to generate training pairs for the source and target speakers by mapping HMM states between the source and target speech. HMM state mapped codebooks are then generated to create the mapping function for text-independent voice conversion. Subjective experiments based on ABX and MOS tests show that the proposed method achieves similar conversion performance and better speech quality compared with conventional voice conversion systems.
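The idea of a state mapped codebook can be illustrated with a minimal sketch. It is not the authors' implementation, and several things in it are assumptions made only for illustration: that each frame already carries an HMM state label (for example from forced alignment), that both speakers share one state inventory so the label itself acts as the state mapping, that each codebook entry is the per-state mean feature vector, and that conversion picks the nearest source entry by Euclidean distance. The function names and the toy data are hypothetical.

```python
from collections import defaultdict
from math import dist


def build_state_codebooks(frames, states):
    """Average the feature vectors assigned to each HMM state label."""
    groups = defaultdict(list)
    for vec, state in zip(frames, states):
        groups[state].append(vec)
    # One centroid per state: the column-wise mean of its frames.
    return {
        state: [sum(col) / len(vecs) for col in zip(*vecs)]
        for state, vecs in groups.items()
    }


def convert(frame, src_codebook, tgt_codebook):
    """Replace a source frame with the target centroid of its nearest shared state."""
    # Assumption: only states seen for both speakers can be mapped.
    shared = [s for s in src_codebook if s in tgt_codebook]
    nearest = min(shared, key=lambda s: dist(frame, src_codebook[s]))
    return tgt_codebook[nearest]


# Toy 2-D features from non-parallel utterances; labels are made up.
src_frames = [[1.0, 1.0], [1.2, 0.8], [5.0, 5.0]]
src_states = ["a_1", "a_1", "o_1"]
tgt_frames = [[2.0, 2.0], [6.0, 7.0], [6.0, 5.0]]
tgt_states = ["a_1", "o_1", "o_1"]

src_cb = build_state_codebooks(src_frames, src_states)
tgt_cb = build_state_codebooks(tgt_frames, tgt_states)
converted = convert([4.8, 5.1], src_cb, tgt_cb)
```

Because the pairing is made at the state level rather than the frame level, the two speakers do not need to have spoken the same sentences, which is the property the abstract attributes to training on a non-parallel corpus.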


Pages: 4605 / +

Number of pages: 2
