Composite Wavelet Model for Stability-Oriented Speech Synthesis from Cepstral Features

Cited by: 0

Authors:

Koguchi, Junya [1]

Sagayama, Shigeki [1]

Affiliations:

[1] Meiji Univ, Tokyo, Japan

Source:

2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) | 2018

Keywords:

DOI:

Not available

Chinese Library Classification number:

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

This paper discusses a stability-oriented vocoder based on Gabor wavelet approximation of the source signal for statistical speech synthesis. In conventional vocoders with recursive filters, the filter gain characteristics often degrade sound quality: sharp resonances driven by a particular overtone of the excitation signal make the recursive filters behave unstably. To cope with this problem, we previously proposed the Composite Wavelet Model (CWM), which avoids filter-caused problems, and have made several improvements to it as a vocoder. Because it is based on non-recursive filters, it can synthesize stable speech that is robust to changes in the F0 parameter. In this paper, we further discuss the optimal number of mixture components for improving synthetic speech quality, determine it through subjective experimental evaluations, and report the results of incorporating the vocoder into an HMM-based speech synthesis system. Objective experimental evaluations confirmed improved stability in the amplitude of the synthetic speech.


Pages: 1697-1701

Page count: 5
