High Quality Voice Conversion based on ISODATA Clustering Algorithm

Cited by: 0
Authors
Li, Yanping [1]
Zuo, Yutao [1]
Yang, Zhen [1]
Shao, Xi [1]
Affiliation
[1] Nanjing Univ Posts & Telecommun, Coll Telecommun & Informat Engn, Nanjing, Jiangsu, Peoples R China
Keywords
voice conversion; ISODATA; similarity; quality; bilinear frequency; Gaussian mixture model
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Two main challenges in current voice conversion are the dependence on parallel training data and the trade-off between speaker similarity and speech quality. To tackle the latter problem, this paper proposes a novel conversion method based on the Iterative Self-Organizing Data Analysis Techniques Algorithm (ISODATA) for clustering. Specifically, we use ISODATA during the training of the Gaussian mixture model (GMM). The optimized mixture number can guarantee the validity and accuracy of the GMM, which can then effectively capture the speaker's identity, the factor that determines the speaker similarity between the original target speech and the converted speech. Next, we combine the improved GMM with bilinear frequency warping in the conversion stage, which strikes a good balance between speaker similarity and speech quality. Theoretical analysis and experimental results demonstrate that the proposed algorithm achieves higher quality and similarity than two other methods.
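The idea of letting a clustering step choose the mixture number can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified ISODATA-style loop (assign, discard small clusters, split wide clusters, merge close centers) run on synthetic 2-D points. The function names `isodata_count` and `make_blobs`, and the thresholds `max_std`, `min_dist` and `min_size`, are assumptions chosen for illustration only.

```python
import math
import random
import statistics


def make_blobs(seed=0, per_blob=100, sigma=0.5):
    """Hypothetical toy data: three well-separated 2-D Gaussian blobs."""
    rng = random.Random(seed)
    blob_means = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    return [
        (rng.gauss(mx, sigma), rng.gauss(my, sigma))
        for mx, my in blob_means
        for _ in range(per_blob)
    ]


def isodata_count(points, max_std=1.5, min_dist=3.0, min_size=5, n_iter=20):
    """Return the cluster count found by a simplified ISODATA-style loop.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    dims = len(points[0])
    centers = [list(points[0])]  # start from a single cluster
    for _ in range(n_iter):
        # 1) Assign every point to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda j: math.dist(p, centers[j]))
            groups[nearest].append(p)

        # 2) Discard tiny clusters; split clusters that are too spread out.
        candidates = []
        for g in groups:
            if len(g) < min_size:
                continue
            mean = [statistics.fmean([p[d] for p in g]) for d in range(dims)]
            std = [statistics.pstdev([p[d] for p in g]) for d in range(dims)]
            widest = max(range(dims), key=lambda d: std[d])
            if std[widest] > max_std and len(g) >= 2 * min_size:
                upper, lower = list(mean), list(mean)
                upper[widest] += std[widest]
                lower[widest] -= std[widest]
                candidates += [upper, lower]
            else:
                candidates.append(mean)

        # 3) Merge candidate centers that lie closer than min_dist.
        merged = []
        for c in candidates:
            for i, m in enumerate(merged):
                if math.dist(c, m) < min_dist:
                    merged[i] = [(a + b) / 2 for a, b in zip(c, m)]
                    break
            else:
                merged.append(c)

        if merged:  # keep the previous centers if every cluster was discarded
            centers = merged
    return len(centers)


if __name__ == "__main__":
    print("estimated number of clusters:", isodata_count(make_blobs()))
```

In the setting the abstract describes, a count obtained this way would serve as the mixture number when training the GMM. The split and merge thresholds have to be chosen consistently (here a split separates the two new centers by more than `min_dist`), otherwise the loop can oscillate between splitting and merging.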
Pages: 5
Related papers
50 in total
  • [21] High-quality Voice Conversion Using Spectrogram-Based WaveNet Vocoder
    Chen, Kuan
    Chen, Bo
    Lai, Jiahao
    Yu, Kai
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 1993 - 1997
  • [22] Brain extraction using isodata clustering algorithm aided by histogram analysis
    Computer Engineering Department, University of Kashan, Kashan, Iran
    [J]. Conf. Proc. Int. Conf. Knowl.-Based Eng. Innov., KBEI, (847-852):
  • [23] Vocal Tract Spectrum Transformation Based on Clustering in Voice Conversion System
    Xie Weichao
    Zhang Linghua
    [J]. PROCEEDING OF THE IEEE INTERNATIONAL CONFERENCE ON INFORMATION AND AUTOMATION, 2012, : 236 - 240
  • [24] APPLYING IMPROVED SPECTRAL MODELING FOR HIGH QUALITY VOICE CONVERSION
    Villavicencio, Fernando
    Roebel, Axel
    Rodet, Xavier
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS, 2009, : 4285 - +
  • [25] Voice Conversion Using Dynamic Features for High Quality Transformation
    Wang, Wei
    Yang, Zhen
    [J]. SECOND INTERNATIONAL CONFERENCE ON DIGITAL IMAGE PROCESSING, 2010, 7546
  • [26] High Quality Algorithm for Chinese Short Messages Text Clustering Based on Semantic
    Yang, Fengxia
    [J]. PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION APPLICATIONS (ICCIA 2012), 2012, : 1263 - 1266
  • [27] HIGH-QUALITY NONPARALLEL VOICE CONVERSION BASED ON CYCLE-CONSISTENT ADVERSARIAL NETWORK
    Fang, Fuming
    Yamagishi, Junichi
    Echizen, Isao
    Lorenzo-Trueba, Jaime
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5279 - 5283
  • [28] A Revisit to Feature Handling for High-quality Voice Conversion Based on Gaussian Mixture Model
    Suda, Hitoshi
    Kotani, Gaku
    Takamichi, Shinnosuke
    Saito, Daisuke
    [J]. 2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 816 - 822
  • [29] Voice conversion based on self organization clustering and modified particle swarm optimization
    Xie, Weichao
    Zhang, Linghua
    [J]. Shengxue Xuebao/Acta Acustica, 2014, 39 (01) : 130 - 136
  • [30] An algorithm for voice conversion with limited corpus
    GU Dong
    JIAN Zhihua
    [J]. Chinese Journal of Acoustics, 2018, 37 (03) : 371 - 384