Part-Syllable Transformation-Based Voice Conversion with Very Limited Training Data

Cited by: 0
Authors
Mohammad Javad Jannati
Abolghasem Sayadiyan
Affiliations
[1] Iran University of Science and Technology, School of Computer Engineering
[2] Amirkabir University of Technology, Department of Electrical Engineering
Keywords
Voice conversion; Very limited training data; Part-syllable
DOI
Not available
Abstract
Voice conversion suffers from two drawbacks: it requires a large number of sentences from the target speaker, and concatenative methods introduce concatenation error. This research introduces the part-syllable transformation-based voice conversion (PST-VC) method, which performs voice conversion with very limited data from a target speaker while also reducing concatenation error. In this method, every syllable is segmented into three parts: left transition, vowel core, and right transition. Using this new language unit, called the part-syllable (PS), PST-VC reduces concatenation error by moving segmentation and concatenation from the transition points to the relatively stable points of a syllable. Since the greatest amount of information about any speaker is contained in the vowels, PST-VC uses this information to transform the vowels into all of the PSs of the language. In this approach, a series of transformations is trained that can generate all of the PSs of a target speaker’s voice from a single vowel core as input. With all of the PSs available, any utterance can be imitated in the target speaker’s voice. PST-VC therefore reduces the required training data from many sentences to a single-syllable word and also reduces the concatenation error.
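The three-way segmentation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the frame values, the function names `split_syllable` and `join_parts`, and the boundary indices `core_start` and `core_end` are all hypothetical, and the abstract does not say how the boundaries are located. The sketch only shows a syllable being cut into left transition, vowel core, and right transition, and the parts being concatenated back in order.

```python
from typing import List, NamedTuple, Sequence


class PartSyllables(NamedTuple):
    """The three part-syllables (PSs) of one syllable."""
    left_transition: List[float]
    vowel_core: List[float]
    right_transition: List[float]


def split_syllable(frames: Sequence[float], core_start: int, core_end: int) -> PartSyllables:
    """Cut a syllable's frames into three parts.

    core_start and core_end are assumed to be supplied by the caller
    (hypothetical boundary indices); the vowel core is
    frames[core_start:core_end].
    """
    if not 0 <= core_start < core_end <= len(frames):
        raise ValueError("core boundaries must satisfy 0 <= start < end <= len(frames)")
    return PartSyllables(
        left_transition=list(frames[:core_start]),
        vowel_core=list(frames[core_start:core_end]),
        right_transition=list(frames[core_end:]),
    )


def join_parts(parts: PartSyllables) -> List[float]:
    """Concatenate the three parts back into one frame sequence."""
    return parts.left_transition + parts.vowel_core + parts.right_transition


# Toy per-frame values standing in for real acoustic features.
frames = [0.1, 0.4, 0.9, 1.0, 1.0, 0.8, 0.3]
parts = split_syllable(frames, core_start=2, core_end=5)
print(parts.vowel_core)
print(join_parts(parts) == frames)
```

In this toy setup the cut points are chosen by hand; a real system would need some procedure for locating the relatively stable points, which the abstract does not specify.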
Pages: 1935-1957
Page count: 22