Voice Conversion System using SVM for Vocal Tract Modification and Codebook based Model for Pitch Contour Modification

Authors
Laskar, R. H. [1 ]
Talukdar, F. A. [1 ]
Bhattacharjee, Rajib [1 ]
Das, Saugat [1 ]
Affiliations
[1] Natl Inst Technol, Dept Elect & Telecommun Engn, Silchar, India
Keywords
Support Vector Machine; Vector Quantization; Radial Basis Function Network; Regression Analysis; Intonation pattern; Pitch Contour; Codebook;
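DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract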
This paper presents an alternative voice conversion technique that uses a support vector machine (SVM) as a regression tool to convert the voice of a source speaker to that of a specific standard target speaker. The method captures a nonlinear mapping function between the acoustic feature parameters of the two speakers. Vocal tract characteristics are represented by line spectral frequencies (LSFs). The kernel-induced feature space is obtained with a radial basis function network type SVM using a Gaussian basis function. A codebook-based technique modifies the intonation characteristics (pitch contour): the pitch contour is mapped at the word level by associating codebooks derived from the pitch contours of the source and target speakers. Speech for the desired target speaker is synthesized from the transformed LSFs together with the modified pitch contour and is evaluated using both subjective and listening tests. The results indicate that the proposed model improves voice conversion performance in terms of capturing the speaker's identity. Performance could be improved further by tuning the user-defined parameters of the regression analysis and by using more LSF vectors in the training stage.
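The word-level codebook association described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the codebooks are built or paired, so everything below is an assumption: contours are resampled to a fixed length, the source codebook comes from a plain k-means pass, and each target codeword is the mean of the target contours paired with that cluster. All function names and the toy pitch values are hypothetical.

```python
import math

N_POINTS = 5  # assumed fixed length for every word-level contour


def resample(contour, n=N_POINTS):
    """Linearly resample a pitch contour (Hz) to n points (n >= 2)."""
    if len(contour) == 1:
        return [float(contour[0])] * n
    out = []
    for i in range(n):
        pos = i * (len(contour) - 1) / (n - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(contour) - 1)
        frac = pos - lo
        out.append(contour[lo] * (1 - frac) + contour[hi] * frac)
    return out


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def nearest(codebook, vec):
    """Index of the codeword closest to vec (squared Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: sq_dist(codebook[k], vec))


def train_codebooks(src_words, tgt_words, size, iters=20):
    """Assumed scheme: k-means on source contours, paired means as target codewords."""
    src = [resample(c) for c in src_words]
    tgt = [resample(c) for c in tgt_words]
    src_cb = [list(v) for v in src[:size]]  # deterministic initialisation
    for _ in range(iters):
        labels = [nearest(src_cb, v) for v in src]
        for k in range(size):
            members = [v for v, lab in zip(src, labels) if lab == k]
            if members:
                src_cb[k] = mean_vector(members)
    labels = [nearest(src_cb, v) for v in src]
    tgt_cb = []
    for k in range(size):
        members = [t for t, lab in zip(tgt, labels) if lab == k]
        # An empty cluster falls back to the source codeword (no change).
        tgt_cb.append(mean_vector(members) if members else list(src_cb[k]))
    return src_cb, tgt_cb


def convert(contour, src_cb, tgt_cb):
    """Map a source word contour to the target codeword paired with its cluster."""
    return tgt_cb[nearest(src_cb, resample(contour))]


# Toy paired data (invented): rising and falling word contours in Hz.
src_words = [[100, 110, 120], [130, 120, 110], [102, 112, 122], [132, 122, 112]]
tgt_words = [[200, 220, 240], [260, 240, 220], [204, 224, 244], [264, 244, 224]]
src_cb, tgt_cb = train_codebooks(src_words, tgt_words, size=2)
print(convert([101, 111, 121], src_cb, tgt_cb))
```

With real data the contours would first need word alignment between the two speakers, and the codebook size becomes a tuning parameter; neither detail is given in the abstract.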
Pages: 2205-2210
Number of pages: 6