Composite Wavelet Model for Stability-Oriented Speech Synthesis from Cepstral Features

Authors
Koguchi, Junya [1]
Sagayama, Shigeki [1]
Affiliations
[1] Meiji Univ, Tokyo, Japan
DOI
Not available
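Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809
Abstract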
This paper discusses a stability-oriented vocoder for statistical speech synthesis, based on a Gabor wavelet approximation of the source signal. In conventional vocoders built on recursive filters, the filter gain characteristics often degrade sound quality: sharp resonances excited by a particular overtone of the excitation signal make the recursive filter behave unstably. To cope with this problem, we previously proposed the Composite Wavelet Model (CWM) and have made several improvements to it as a vocoder. Because it is based on non-recursive filters, it synthesizes stable speech that is robust to changes in the F0 parameter. In this paper, we further discuss the optimal number of mixture components for improving synthetic speech quality, determine it through subjective experimental evaluations, and report the results of incorporating the model into an HMM-based speech synthesis system. Objective experimental evaluations confirmed the improved stability of the amplitude of the synthetic speech.
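The abstract's central idea, a pitch-period waveform built from a mixture of Gabor wavelets and placed without any recursive filtering, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the component tuples (center frequency, bandwidth, weight), the default wavelet length, and the plain overlap-add at pitch marks are all assumptions made for illustration only.

```python
import math

def gabor_wavelet(center_hz, bandwidth_hz, sample_rate, half_len):
    """Finite-length Gaussian-windowed cosine (2 * half_len + 1 samples)."""
    # Assumption: bandwidth_hz is the Gaussian spread in frequency, so the
    # matching spread in time is 1 / (2 * pi * bandwidth_hz).
    sigma_t = 1.0 / (2.0 * math.pi * bandwidth_hz)
    wavelet = []
    for n in range(-half_len, half_len + 1):
        t = n / sample_rate
        envelope = math.exp(-0.5 * (t / sigma_t) ** 2)
        wavelet.append(envelope * math.cos(2.0 * math.pi * center_hz * t))
    return wavelet

def composite_waveform(components, sample_rate, half_len):
    """Weighted sum of Gabor wavelets, one per mixture component."""
    total = [0.0] * (2 * half_len + 1)
    for center_hz, bandwidth_hz, weight in components:
        wavelet = gabor_wavelet(center_hz, bandwidth_hz, sample_rate, half_len)
        for i, value in enumerate(wavelet):
            total[i] += weight * value
    return total

def synthesize(components, f0_hz, sample_rate, duration_s, half_len=200):
    """Overlap-add the composite waveform at each pitch mark.

    No output sample is fed back into the computation, so the result is a
    finite sum of bounded terms for any F0 (hypothetical illustration).
    """
    n_samples = int(duration_s * sample_rate)
    unit = composite_waveform(components, sample_rate, half_len)
    out = [0.0] * n_samples
    period = sample_rate / f0_hz  # pitch period in samples
    k = 0
    while k * period < n_samples:
        center = int(round(k * period))
        for i, value in enumerate(unit):
            idx = center - half_len + i
            if 0 <= idx < n_samples:
                out[idx] += value
        k += 1
    return out

# Hypothetical three-component mixture: (center Hz, bandwidth Hz, weight).
components = [(500.0, 100.0, 1.0), (1500.0, 120.0, 0.5), (2500.0, 150.0, 0.3)]
speech_like = synthesize(components, f0_hz=200.0, sample_rate=16000, duration_s=0.05)
```

Because each output sample is a finite sum of bounded wavelet values, the peak amplitude is limited by the sum of the component weights times the number of overlapping pitch periods, whatever F0 is chosen. A recursive filter with a sharp resonance offers no such bound when an overtone lands on the resonance.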
Pages: 1697-1701
Page count: 5