Composite Wavelet Model for Stability-Oriented Speech Synthesis from Cepstral Features

Authors
Koguchi, Junya [1]
Sagayama, Shigeki [1]
Affiliations
[1] Meiji Univ, Tokyo, Japan
DOI
Not available
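Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809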
Abstract
This paper discusses a stability-oriented vocoder for statistical speech synthesis based on a Gabor wavelet approximation of the source signal. In conventional vocoders built on recursive filters, sharp resonances in the filter gain characteristics can be excited by a particular overtone of the excitation signal, and the resulting unstable filter behavior often degrades sound quality. To cope with this problem, we previously proposed the Composite Wavelet Model (CWM), which avoids such filter-caused problems, and have made several improvements to it as a vocoder. Because it is based on non-recursive filters, CWM synthesizes stable speech that is robust to changes in the F0 parameter. In this paper, we further discuss the optimal number of mixture components for improving synthetic speech quality, determine it through subjective evaluation experiments, and report the results of incorporating CWM into an HMM-based speech synthesis system. Objective evaluation experiments confirmed improved stability in the amplitude of the synthetic speech.
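The abstract gives no formulas, so the following is only a minimal sketch of the general idea, under stated assumptions: the spectral envelope is taken to be a mixture of Gaussians, each Gaussian corresponds in the time domain to a Gabor wavelet, and speech is produced by overlap-adding the summed ("composite") wavelet at pitch intervals. All function names, the component values, and the 10 ms truncation are hypothetical and are not taken from the paper.

```python
import math

def gabor_wavelet(t, weight, center_hz, bandwidth_hz):
    """Gabor wavelet whose spectrum is a Gaussian centered at center_hz
    with standard deviation bandwidth_hz (assumed parameterization)."""
    # A Gaussian of std sigma_f in frequency has a Gaussian time envelope
    # of std 1 / (2 * pi * sigma_f).
    envelope = math.exp(-2.0 * (math.pi * bandwidth_hz * t) ** 2)
    return weight * envelope * math.cos(2.0 * math.pi * center_hz * t)

def composite_wavelet(components, sample_rate, half_width_s=0.01):
    """Sum the Gabor wavelets of all mixture components on a finite grid."""
    n = int(round(half_width_s * sample_rate))
    return [
        sum(gabor_wavelet(k / sample_rate, w, fc, bw) for (w, fc, bw) in components)
        for k in range(-n, n + 1)
    ]

def synthesize(components, f0_hz, duration_s, sample_rate=16000):
    """Overlap-add the composite wavelet once per pitch period.
    No output sample is fed back, so the synthesis is non-recursive."""
    wavelet = composite_wavelet(components, sample_rate)
    half = len(wavelet) // 2
    n_samples = int(round(duration_s * sample_rate))
    out = [0.0] * n_samples
    period = sample_rate / f0_hz
    t = 0.0
    while t < n_samples:
        center = int(round(t))
        for k, value in enumerate(wavelet):
            idx = center + k - half
            if 0 <= idx < n_samples:
                out[idx] += value
        t += period
    return out

# Hypothetical three-component envelope: (weight, center Hz, bandwidth Hz).
components = [(1.0, 500.0, 80.0), (0.5, 1500.0, 120.0), (0.25, 2500.0, 150.0)]
low = synthesize(components, f0_hz=100.0, duration_s=0.125)
high = synthesize(components, f0_hz=250.0, duration_s=0.125)
```

Because each wavelet has finite length and nothing is fed back, the output amplitude is bounded by the sum of the component weights times the number of overlapping wavelets, whatever F0 is chosen. This is the kind of stability the abstract attributes to non-recursive filtering, although the paper's actual formulation may differ.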
Pages: 1697-1701
Number of pages: 5