Voice Conversion Using Bilinear Model Integrated with Joint GMM-based Classification

Cited by: 0
Authors
Sun, Xinjian [1]
Zhang, Xiongwei [1]
Yang, Jibin [1]
Cao, Tieyong [1]
Affiliations
[1] PLA Univ Sci & Technol, Coll Commun Engn, Nanjing, Jiangsu, Peoples R China
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A bilinear model (BM) can independently represent the characteristics within a speaker (style) and the phonemes shared across speakers (content) in a speech database. It has been applied successfully to voice conversion (VC) by extrapolation. However, extrapolation requires the BM to be rebuilt repeatedly and a large number of parameters to be estimated. To tackle these problems, we propose enhancing the standard BM-based VC scheme by integrating joint Gaussian mixture model (GMM)-based classification, under the assumption that the GMM components correspond to quasi-phoneme content classes. In evaluation experiments, the enhanced scheme not only makes the VC algorithm more efficient computationally but also improves speech quality compared with both the standard BM-based scheme and a traditional GMM-based mapping system.
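The classification step mentioned in the abstract can be illustrated with a minimal sketch: each time-aligned source/target frame pair is stacked into a joint vector and assigned to the GMM component with the highest posterior, which then serves as its quasi-phoneme class. Everything below is an assumption made for illustration, not the authors' implementation: the function names, the diagonal covariances, and the toy one-dimensional features are hypothetical, and the GMM parameters are given by hand rather than trained.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def classify_joint_frame(src, tgt, weights, means, variances):
    """Assign one aligned source/target frame pair to a GMM component.

    Returns (index of the most probable component, list of posteriors).
    """
    z = list(src) + list(tgt)  # joint vector [source; target]
    log_scores = [
        math.log(w) + log_gaussian_diag(z, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(log_scores)  # subtract the maximum for numerical stability
    unnorm = [math.exp(s - top) for s in log_scores]
    total = sum(unnorm)
    posteriors = [u / total for u in unnorm]
    return posteriors.index(max(posteriors)), posteriors


# Hypothetical toy GMM: 2 components over 2-dimensional joint vectors
# (1 source dimension + 1 target dimension). Values are made up.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

label_a, post_a = classify_joint_frame([0.2], [-0.1], weights, means, variances)
label_b, post_b = classify_joint_frame([2.9], [3.2], weights, means, variances)
```

In a real system the GMM would be trained on joint feature vectors from parallel speech, and the resulting class labels would take the place of explicit phoneme labels when building the bilinear model; the abstract does not give those details, so they are left out here.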
Pages: 1225-1228
Number of pages: 4
Related papers
50 in total
  • [1] GMM-Based Speaker Gender and Age Classification After Voice Conversion
    Pribil, Jiri
    Pribilova, Anna
    Matousek, Jindrich
    [J]. 2016 FIRST INTERNATIONAL WORKSHOP ON SENSING, PROCESSING AND LEARNING FOR INTELLIGENT MACHINES (SPLINE), 2016
  • [2] Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech
    Nakamura, Keigo
    Toda, Tomoki
    Saruwatari, Hiroshi
    Shikano, Kiyohiro
    [J]. SPEECH COMMUNICATION, 2012, 54 (01) : 134 - 146
  • [3] Incorporating Global Variance in the Training Phase of GMM-based Voice Conversion
    Hwang, Hsin-Te
    Tsao, Yu
    Wang, Hsin-Min
    Wang, Yih-Ru
    Chen, Sin-Horng
    [J]. 2013 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2013
  • [4] Enhancing a Glossectomy Patient's Speech via GMM-based Voice Conversion
    Tanaka, Kei
    Hara, Sunao
    Abe, Masanobu
    Minagi, Shogo
    [J]. 2016 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2016
  • [5] Modulation Spectrum-Based Post-Filter for GMM-Based Voice Conversion
    Takamichi, Shinnosuke
    Toda, Tomoki
    Black, Alan W.
    Nakamura, Satoshi
    [J]. 2014 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2014
  • [6] GMM-based classification of genomic sequences
    Akhtar, Mahmood
    Ambikairajah, Eliathamby
    Epps, Julien
    [J]. PROCEEDINGS OF THE 2007 15TH INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING, 2007, : 103 - +
  • [7] Alleviating the Over-Smoothing Problem in GMM-Based Voice Conversion with Discriminative Training
    Hwang, Hsin-Te
    Tsao, Yu
    Wang, Hsin-Min
    Wang, Yih-Ru
    Chen, Sin-Horng
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 3061 - 3065
  • [8] MODULATION SPECTRUM-CONSTRAINED TRAJECTORY TRAINING ALGORITHM FOR GMM-BASED VOICE CONVERSION
    Takamichi, Shinnosuke
    Toda, Tomoki
    Black, Alan W.
    Nakamura, Satoshi
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4859 - 4863
  • [9] Using a DBN to integrate Sparse Classification and GMM-based ASR
    Sun, Yang
    Gemmeke, Jort F.
    Cranen, Bert
    ten Bosch, Louis
    Boves, Lou
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2098 - 2101
  • [10] A Statistical Sample-Based Approach to GMM-Based Voice Conversion Using Tied-Covariance Acoustic Models
    Takamichi, Shinnosuke
    Toda, Tomoki
    Neubig, Graham
    Sakti, Sakriani
    Nakamura, Satoshi
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2016, E99D (10) : 2490 - 2498