Iterative MMSE Estimation of Vocal Tract Length Normalization Factors for Voice Transformation

Cited by: 0
Authors
Erro, Daniel [1]
Navas, Eva [1]
Hernaez, Inma [1]
Affiliations
[1] Univ Basque Country UPV EHU, AHOLAB, Bilbao, Spain
Keywords
vocal tract length normalization; voice conversion; frequency warping plus amplitude scaling; speech synthesis;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present a method that determines the optimal configuration of a bilinear vocal tract length normalization function to transform the frequency axis of one voice according to a specific target voice. Given a number of parallel utterances of the involved speakers, the single parameter of this function can be calculated through an iterative procedure by minimizing an objective error measure defined in the cepstral domain. This method is also applicable when multiple warping classes are considered, and it can be complemented with amplitude correction filters. The resulting physically motivated cepstral transformation yields highly satisfactory conversion accuracy and improved quality with respect to standard statistical systems.
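The abstract does not state the warping formula itself. As a minimal sketch, assuming the commonly used first-order all-pass (bilinear) form of the frequency warping, the single parameter (called `alpha` here, a name chosen only for illustration) maps a normalized angular frequency in [0, pi] as follows. This illustrates only the warping function, not the iterative estimation procedure described above.

```python
import math


def bilinear_warp(omega: float, alpha: float) -> float:
    """Warp a normalized angular frequency omega (radians, 0..pi).

    Assumes the standard first-order all-pass (bilinear) warping;
    alpha is the single warping parameter and must satisfy |alpha| < 1.
    """
    if not -1.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between -1 and 1")
    # The denominator 1 - alpha*cos(omega) is positive for |alpha| < 1,
    # so atan2 behaves like a plain arctangent here.
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


if __name__ == "__main__":
    # Print the warped frequency axis for an illustrative alpha value.
    for k in range(5):
        w = k * math.pi / 4
        print(f"{w:.4f} -> {bilinear_warp(w, 0.1):.4f}")
```

With `alpha = 0` the mapping is the identity, the endpoints 0 and pi stay fixed for any valid `alpha`, and warping with `-alpha` undoes a warp with `alpha`.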
Pages: 86-89
Number of pages: 4
Related papers
50 in total
  • [31] Comments on vocal tract length normalization equals linear transformation in cepstral space
    Afify, Mohamed
    Siohan, Olivier
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (05): : 1731 - 1732
  • [32] ESTIMATION OF VOCAL-TRACT LENGTH FROM ACOUSTIC DATA
    WAKITA, H
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1974, 55 : S21 - S21
  • [33] Fast and robust joint estimation of vocal tract and voice source parameters
    Ding, W
    Campbell, N
    Higuchi, N
    Kasuya, H
    [J]. 1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1291 - 1294
  • [34] A Precise Estimation of Vocal Tract Parameters for High Quality Voice Morphing
    Xu, Ning
    Yang, Zhen
    [J]. ICSP: 2008 9TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, VOLS 1-5, PROCEEDINGS, 2008, : 684 - 687
  • [35] Discrimination of Voice Pitch and Vocal-Tract Length in Cochlear Implant Users
    Gaudrain, Etienne
    Baskent, Deniz
    [J]. EAR AND HEARING, 2018, 39 (02): : 226 - 237
  • [36] Vocal tract length normalization for speaker independent acoustic-to-articulatory speech inversion
    Sivaraman, Ganesh
    Mitra, Vikramjit
    Nam, Hosung
    Tiede, Mark
    Espy-Wilson, Carol
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 455 - 459
  • [37] Fuzzy Phoneme Classification Using Multi-speaker Vocal Tract Length Normalization
    Lung, Jensen Wong Jing
    Salam, Md Sah Hj
    Rehman, Amjad
    Rahim, Mohd Shafry Mohd
    Saba, Tanzila
    [J]. IETE TECHNICAL REVIEW, 2014, 31 (02) : 128 - 136
  • [38] Real-time vocal tract length normalization in a phonological awareness teaching system
    Paczolay, D
    Kocsor, A
    Tóth, L
    [J]. TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, 2003, 2807 : 309 - 314
  • [39] THE ITERATIVE TRANSFORMATION METHOD AND LENGTH ESTIMATION FOR TUBULAR FLOW REACTORS
    FAZIO, R
    [J]. APPLIED MATHEMATICS AND COMPUTATION, 1991, 42 (02) : 105 - 110
  • [40] Unsupervised estimation of the human vocal tract length over sentence level utterances
    Necioglu, BF
    Clements, MA
    Barnwell, TP
    [J]. 2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000, : 1319 - 1322