Voice Conversion System using SVM for Vocal Tract Modification and Codebook based Model for Pitch Contour Modification

Cited by: 0

Authors:

Laskar, R. H. [1]

Talukdar, F. A. [1]

Bhattacharjee, Rajib [1]

Das, Saugat [1]

Affiliation:

[1] Natl Inst Technol, Dept Elect & Telecommun Engn, Silchar, India

Source:

2008 IEEE REGION 10 CONFERENCE: TENCON 2008, VOLS 1-4 | 2008

Keywords:

Support Vector Machine; Vector Quantization; Radial Basis Function Network; Regression Analysis; Intonation pattern; Pitch Contour; Codebook;

DOI:

Not available

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

This paper presents an alternative voice conversion technique that uses a support vector machine (SVM) as a regression tool to convert the voice of a source speaker to that of a specific standard target speaker. The work captures a nonlinear mapping function between the acoustic feature parameters of the two speakers. The vocal tract characteristics are represented by line spectral frequencies (LSFs). The kernel-induced feature space is obtained using a radial basis function network type SVM with a Gaussian basis function. A codebook-based technique is used to modify the intonation characteristic (pitch contour). Mapping of the pitch contour is achieved at the word level by associating the codebooks derived from the pitch contours of the source and target speakers. The speech signals for the desired target speaker are synthesized using the transformed LSFs along with the modified pitch contour and evaluated using both subjective and listening tests. The results indicate that the proposed model improves voice conversion performance in terms of capturing the speaker's identity. However, performance can be improved further by suitably tuning the user-defined parameters of the regression analysis and by using more LSF vectors in the training stage.
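The word-level codebook association described in the abstract might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the codebooks are built or associated, so this sketch assumes paired source/target word contours, a fixed-length linear resampling, a plain k-means source codebook, and a target codebook formed by averaging the target contours paired with each source cluster. All function names (`resample`, `train_codebooks`, `convert`) and the pitch values are hypothetical.

```python
from math import dist

def resample(contour, n):
    """Linearly resample a pitch contour (Hz values) to n >= 2 points."""
    if len(contour) == 1:
        return [float(contour[0])] * n
    out = []
    for i in range(n):
        pos = i * (len(contour) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(contour) - 1)
        frac = pos - lo
        out.append(contour[lo] * (1 - frac) + contour[hi] * frac)
    return out

def mean_vector(vectors):
    """Element-wise mean of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def nearest(vector, codebook):
    """Index of the codeword closest to vector (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: dist(vector, codebook[k]))

def train_codebooks(src, tgt, size, iters=20):
    """Build associated codebooks from paired, equal-length word contours.

    Assumption: src[i] and tgt[i] are the same word spoken by the source
    and target speakers. The source codebook comes from k-means; each
    target codeword is the mean of the target contours whose paired
    source contour fell into the matching cluster.
    """
    src_cb = [list(v) for v in src[:size]]  # deterministic initialisation
    for _ in range(iters):
        labels = [nearest(v, src_cb) for v in src]
        for k in range(size):
            members = [v for v, lab in zip(src, labels) if lab == k]
            if members:
                src_cb[k] = mean_vector(members)
    labels = [nearest(v, src_cb) for v in src]
    tgt_cb = []
    for k in range(size):
        members = [t for t, lab in zip(tgt, labels) if lab == k]
        # Fall back to the source codeword if a cluster ended up empty.
        tgt_cb.append(mean_vector(members) if members else list(src_cb[k]))
    return src_cb, tgt_cb

def convert(contour, src_cb, tgt_cb, n):
    """Map a source word contour to its associated target codeword."""
    return tgt_cb[nearest(resample(contour, n), src_cb)]

# Made-up training data: two rising and two falling word contours (Hz).
src = [[100, 110, 120, 130], [130, 120, 110, 100],
       [102, 112, 122, 132], [128, 118, 108, 98]]
tgt = [[200, 220, 240, 260], [260, 240, 220, 200],
       [204, 224, 244, 264], [256, 236, 216, 196]]

src_cb, tgt_cb = train_codebooks(src, tgt, size=2)
converted = convert([105, 112, 118, 125, 133], src_cb, tgt_cb, n=4)
print(converted)  # [202.0, 222.0, 242.0, 262.0]
```

In this toy setup a rising source contour of any length is mapped to the averaged rising target contour. A real system would need many more codewords and some handling of word duration, neither of which the abstract specifies.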


Pages: 2205-2210

Page count: 6
