Interpretable parametric voice conversion functions based on Gaussian mixture models and constrained transformations

Cited by: 10
Authors
Erro, Daniel [1, 2]
Alonso, Agustin [1]
Serrano, Luis [1]
Navas, Eva [1]
Hernaez, Inma [1]
Affiliations
[1] Univ Basque Country, Aholab, Bilbao, Spain
[2] Ikerbasque, Basque Fdn Sci, Bilbao, Spain
Source
COMPUTER SPEECH AND LANGUAGE | 2015 / Volume 30 / Issue 01
Keywords
Voice conversion; Gaussian mixture models; Frequency warping; Amplitude scaling; Spectral tilt
DOI
10.1016/j.csl.2014.03.001
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Voice conversion functions based on Gaussian mixture models and parametric speech signal representations are opaque in the sense that it is not straightforward to interpret the physical meaning of the conversion parameters. Following the line of recent works based on the frequency warping plus amplitude scaling paradigm, in this article we show that voice conversion functions can be designed according to physically meaningful constraints in such a manner that they become highly informative. The resulting voice conversion method can be used to visualize the differences between source and target voices or styles in terms of formant location in frequency, spectral tilt, and amplitude in a number of spectral bands. (C) 2014 Elsevier Ltd. All rights reserved.
Pages: 3-15
Number of pages: 13