Part-Syllable Transformation-Based Voice Conversion with Very Limited Training Data

Authors
Mohammad Javad Jannati
Abolghasem Sayadiyan
Affiliations
[1] Iran University of Science and Technology, School of Computer Engineering
[2] Amirkabir University of Technology, Department of Electrical Engineering
Keywords
Voice conversion; Very limited training data; Part-syllable
DOI
Not available
Abstract
Voice conversion suffers from two drawbacks: it requires a large number of sentences from the target speaker, and concatenative methods introduce concatenation error. This research introduces part-syllable transformation-based voice conversion (PST-VC), a method that performs voice conversion with very limited data from the target speaker while also reducing concatenation error. In this method, every syllable is segmented into three parts: a left transition, a vowel core, and a right transition. Using this new language unit, called the part-syllable (PS), PST-VC reduces concatenation error by moving segmentation and concatenation from the transition points to the relatively stable points of a syllable. Since the greatest amount of information about any speaker is contained in the vowels, PST-VC uses this information to transform the vowels into all of the PSs of the language. To this end, a series of transformations is trained that can generate all of the PSs of a target speaker’s voice from a single vowel core as input. With all of the PSs available, any speech can be imitated in the target speaker's voice. PST-VC therefore reduces the training material needed to a single-syllable word and also reduces concatenation error.
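The three-way segmentation described in the abstract can be illustrated with a minimal sketch. It assumes a syllable is already available as a sequence of per-frame values and that the vowel-core boundaries are given; the abstract does not say how PST-VC locates those boundaries or which features it uses, so the names `PartSyllables`, `split_syllable`, `core_start`, and `core_end` are hypothetical and not taken from the paper.

```python
from typing import List, NamedTuple, Sequence


class PartSyllables(NamedTuple):
    """The three parts of one syllable (illustrative container, not from the paper)."""
    left_transition: List[float]
    vowel_core: List[float]
    right_transition: List[float]


def split_syllable(frames: Sequence[float], core_start: int, core_end: int) -> PartSyllables:
    """Split a syllable into left transition, vowel core, and right transition.

    core_start and core_end are assumed to be known frame indices that
    delimit the vowel core as the half-open range [core_start, core_end).
    """
    if not 0 <= core_start < core_end <= len(frames):
        raise ValueError("vowel core must be a non-empty range inside the syllable")
    return PartSyllables(
        left_transition=list(frames[:core_start]),
        vowel_core=list(frames[core_start:core_end]),
        right_transition=list(frames[core_end:]),
    )


# Toy per-frame values standing in for real acoustic features.
frames = [0.1, 0.5, 0.9, 0.9, 0.8, 0.4, 0.1]
parts = split_syllable(frames, core_start=2, core_end=5)
print(parts)
```

Joining the three parts back in order reproduces the original frame sequence, so the split itself loses no information; in this sketch the cut points simply sit at the edges of the assumed vowel core rather than inside the transitions.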
Pages: 1935-1957
Number of pages: 22
Published in
Jannati, Mohammad Javad; Sayadiyan, Abolghasem. Part-Syllable Transformation-Based Voice Conversion with Very Limited Training Data. Circuits, Systems, and Signal Processing, 2018, 37 (05): 1935-1957