Text-independent voice conversion based on state mapped codebook

Cited by: 0
Authors
Zhang, Meng [1]
Tao, Jianhua [1]
Tian, Jilei [2]
Wang, Xia [3]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing, Peoples R China
[2] Nokia Res Ctr, Interact Core Technol Ctr, Tampere, Finland
[3] Nokia Res Ctr, Tampere, Finland
Funding
National Natural Science Foundation of China
Keywords
text-independent; voice conversion; hidden Markov model; state mapped codebook
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Voice conversion has become increasingly important in speech technology, but most current work requires parallel utterances from both the source and target speakers as the training corpus, which limits the application of the technology. In this paper, we propose a new method of text-independent voice conversion that uses a non-parallel corpus for training. A hidden Markov model (HMM) is used to represent the phonetic structure of the training speech and to generate training pairs of source and target speakers by mapping the HMM states between the source and target speech. HMM state mapped codebooks are then generated to create the mapping function for text-independent voice conversion. Subjective experiments based on ABX and MOS tests show that the proposed method achieves similar conversion performance and better speech quality compared with conventional voice conversion systems.
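The abstract does not give implementation details, so the following is only an illustrative sketch of the general idea of a state mapped codebook, not the authors' method. It assumes that every frame has already been labelled with an HMM state and that a state-to-state mapping (the hypothetical `state_map`) is available. Per-state mean vectors serve as codebook entries, and conversion is a hard nearest-neighbour lookup. All function names are hypothetical.

```python
from collections import defaultdict


def state_centroids(frames, states):
    """Average the feature vectors assigned to each HMM state label."""
    sums = {}
    counts = defaultdict(int)
    for vec, state in zip(frames, states):
        if state not in sums:
            sums[state] = [0.0] * len(vec)
        sums[state] = [a + b for a, b in zip(sums[state], vec)]
        counts[state] += 1
    return {s: [v / counts[s] for v in total] for s, total in sums.items()}


def build_codebook(src_frames, src_states, tgt_frames, tgt_states, state_map):
    """Pair source and target centroids through the assumed state mapping."""
    src_c = state_centroids(src_frames, src_states)
    tgt_c = state_centroids(tgt_frames, tgt_states)
    return [
        (src_c[s], tgt_c[t])
        for s, t in state_map.items()
        if s in src_c and t in tgt_c
    ]


def convert(frame, codebook):
    """Return the target centroid paired with the nearest source centroid."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, target = min(codebook, key=lambda pair: sqdist(frame, pair[0]))
    return target
```

Because the pairing goes through state labels rather than time-aligned frames, the source and target utterances in this sketch do not need to share the same text. A practical system would likely use several entries per state and a smoother, weighted mapping.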
Pages: 4605 / +
Number of pages: 2