Voice Conversion System using SVM for Vocal Tract Modification and Codebook based Model for Pitch Contour Modification

Cited by: 0
Authors
Laskar, R. H. [1]
Talukdar, F. A. [1]
Bhattacharjee, Rajib [1]
Das, Saugat [1]
Affiliations
[1] Natl Inst Technol, Dept Elect & Telecommun Engn, Silchar, India
Keywords
Support Vector Machine; Vector Quantization; Radial Basis Function Network; Regression Analysis; Intonation Pattern; Pitch Contour; Codebook
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents an alternative voice conversion technique that uses a support vector machine (SVM) as a regression tool to convert the voice of a source speaker to that of a specific standard target speaker. A nonlinear mapping function between the acoustic feature parameters of the two speakers is captured. The vocal tract characteristics are represented by line spectral frequencies (LSFs). The kernel-induced feature space is obtained using a radial basis function network type SVM with a Gaussian basis function. A codebook-based technique is used to modify the intonation characteristics (pitch contour): the pitch contour is mapped at the word level by associating codebooks derived from the pitch contours of the source and target speakers. The speech signals for the desired target speaker are synthesized using the transformed LSFs along with the modified pitch contour and evaluated using both subjective and listening tests. The results indicate that the proposed model improves voice conversion performance in terms of capturing the speaker's identity. However, performance can be improved further by suitably tuning the user-defined parameters used in the regression analysis and by using more LSF vectors in the training stage.
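The word-level codebook mapping described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the codebook size, the contour normalization, or the clustering method, so the fixed-length resampling, the k-means style quantizer, the farthest-point initialization, and all function names and toy pitch values below are hypothetical.

```python
# Illustrative sketch only: every name and number here is an assumption.

def resample(contour, n=8):
    """Linearly interpolate a word-level pitch contour (Hz) to n points."""
    if len(contour) == 1:
        return [float(contour[0])] * n
    out = []
    for i in range(n):
        pos = i * (len(contour) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(contour) - 1)
        frac = pos - lo
        out.append(contour[lo] * (1 - frac) + contour[hi] * frac)
    return out

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def nearest(codebook, vec):
    """Index of the codeword closest to vec."""
    return min(range(len(codebook)), key=lambda k: sq_dist(codebook[k], vec))

def init_codebook(vectors, size):
    """Deterministic farthest-point initialization (an assumed choice)."""
    codebook = [list(vectors[0])]
    while len(codebook) < size:
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in codebook))
        codebook.append(list(far))
    return codebook

def train_codebooks(src_words, tgt_words, size=2, n=8, iters=10):
    """Quantize source contours, then pair each source codeword with the
    mean of the target contours of the same (time-aligned) words."""
    src = [resample(w, n) for w in src_words]
    tgt = [resample(w, n) for w in tgt_words]
    src_cb = init_codebook(src, size)
    for _ in range(iters):  # plain k-means refinement
        labels = [nearest(src_cb, v) for v in src]
        for k in range(size):
            members = [v for v, lab in zip(src, labels) if lab == k]
            if members:
                src_cb[k] = mean(members)
    labels = [nearest(src_cb, v) for v in src]
    tgt_cb = []
    for k in range(size):
        members = [t for t, lab in zip(tgt, labels) if lab == k]
        # Fall back to the source codeword if a cell has no paired target.
        tgt_cb.append(mean(members) if members else list(src_cb[k]))
    return src_cb, tgt_cb

def convert(contour, src_cb, tgt_cb, n=8):
    """Replace a source contour with the target codeword that is paired
    with its nearest source codeword."""
    return tgt_cb[nearest(src_cb, resample(contour, n))]

# Toy parallel data: the same four words from a low-pitched source
# speaker and a high-pitched target speaker (made-up values).
src_words = [[100, 110, 120, 130], [102, 112, 122, 131, 140],
             [140, 130, 120, 110], [138, 128, 119, 108, 100]]
tgt_words = [[200, 215, 230, 245], [205, 220, 235, 250, 262],
             [260, 245, 230, 215], [258, 244, 231, 216, 205]]

src_cb, tgt_cb = train_codebooks(src_words, tgt_words)
converted = convert([101, 111, 121, 131], src_cb, tgt_cb)
print([round(f0, 1) for f0 in converted])
```

In this toy setup a rising source contour is replaced by the rising target codeword in the target speaker's pitch range. A practical system would need many more word-level contours, a larger codebook, and a way to stretch the selected codeword back to the duration of the word being converted.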
Pages: 2205-2210
Number of pages: 6