STATISTICAL F0 PREDICTION FOR ELECTROLARYNGEAL SPEECH ENHANCEMENT CONSIDERING GENERATIVE PROCESS OF F0 CONTOURS WITHIN PRODUCT OF EXPERTS FRAMEWORK

Authors
Tanaka, Kou [1]
Kameoka, Hirokazu [2]
Toda, Tomoki [3]
Nakamura, Satoshi [1]
Affiliations
[1] Nara Inst Sci & Technol, Grad Sch Informat Sci, Ikoma, Nara, Japan
[2] NTT Corp, NTT Commun Sci Labs, Tokyo, Tokyo, Japan
[3] Nagoya Univ, Informat Technol Ctr, Nagoya, Aichi 4648601, Japan
Keywords
Electrolaryngeal speech enhancement; F0 prediction; Generative model; Product of Experts; Voice conversion
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We previously proposed a statistical fundamental frequency (F0) prediction method that predicts the underlying F0 contour of electrolaryngeal (EL) speech from its spectral feature sequence. Although this method was shown to improve the overall naturalness of EL speech, the predicted F0 contour was still unnatural compared with that of normal speech. One possible way to improve the naturalness of the predicted F0 contours is to take into account the physical mechanism of vocal phonation. Recently, a statistical model of voice F0 contours was formulated by constructing a stochastic counterpart of the Fujisaki model, a well-founded mathematical model representing the control mechanism of vocal fold vibration. This paper proposes a Product-of-Experts model that incorporates this generative model of voice F0 contours into the statistical F0 prediction model. Based on the constructed model, we derive algorithms for parameter training and F0 prediction. Experimental results revealed that the proposed method outperformed our previously proposed method in terms of the naturalness of the predicted F0 contours.
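The abstract does not spell out the model, so the following is only a generic illustration of the Product-of-Experts idea, not the authors' formulation. It assumes two hypothetical one-dimensional Gaussian "experts" over a single frame's log-F0 value: one standing in for a spectrum-based predictor and one for a contour-based generative model. The normalized product of two Gaussian densities is again Gaussian, with a precision equal to the sum of the two precisions and a precision-weighted mean. All names and numbers here are assumptions made for the sketch.

```python
import math


def product_of_gaussians(mu_a, var_a, mu_b, var_b):
    """Combine two 1-D Gaussian experts N(mu_a, var_a) and N(mu_b, var_b).

    Returns the mean and variance of the normalized product density.
    """
    prec_a = 1.0 / var_a  # precision (inverse variance) of expert A
    prec_b = 1.0 / var_b  # precision (inverse variance) of expert B
    var = 1.0 / (prec_a + prec_b)  # precisions add under the product
    mu = var * (prec_a * mu_a + prec_b * mu_b)  # precision-weighted mean
    return mu, var


# Hypothetical per-frame experts over log-F0 (values chosen for illustration).
spectral_expert = (math.log(120.0), 0.04)  # assumed spectrum-based prediction
contour_expert = (math.log(100.0), 0.01)   # assumed contour-model prediction

mu, var = product_of_gaussians(*spectral_expert, *contour_expert)
f0_hz = math.exp(mu)  # combined estimate, converted back to Hz
```

In this toy setting the combined estimate lies between the two experts and sits closer to the one with the smaller variance, which is the basic reason a product of experts lets a more confident model constrain a less confident one.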
Pages: 5665-5669
Number of pages: 5