HMM speech synthesis based on MDCT representation

Cited by: 5
Authors
Biagetti G. [1 ]
Crippa P. [1 ]
Falaschetti L. [1 ]
Turchetti C. [1 ]
Affiliations
[1] DII – Dipartimento di Ingegneria dell’Informazione, Università Politecnica delle Marche, Via Brecce Bianche 12, Ancona
Keywords
HMM; MDCT; Mel-cepstral analysis; Overlap-and-add; Speech synthesis
DOI
10.1007/s10772-018-09571-9
Abstract
Hidden Markov model (HMM) based text-to-speech (TTS) has become one of the most promising approaches, as it has proven to be a particularly flexible and robust framework for generating synthetic speech. However, several factors, such as the mel-cepstral vocoder and over-smoothing, degrade the quality of synthetic speech. This paper presents an HMM speech synthesis technique based on the modified discrete cosine transform (MDCT) representation to cope with these two issues. To this end, we use an analysis/synthesis technique based on MDCT that guarantees a perfect reconstruction of the signal frame from feature vectors and allows for a 50% overlap between frames without increasing the size of the data vector, in contrast to the conventional mel-cepstral spectral parameters, which do not ensure direct speech waveform reconstruction. Experimental results, evaluated using both objective and subjective tests, show that good sound quality is obtained. © 2018, Springer Science+Business Media, LLC, part of Springer Nature.
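The two MDCT properties named in the abstract (perfect reconstruction and 50% overlap without a larger data vector) can be illustrated with a minimal sketch. This is not the authors' implementation: the sine window, the frame size `N = 8`, the test signal, and all function names are assumptions chosen for illustration. Each frame of `2 * N` samples yields only `N` coefficients, and overlap-and-add of the inverse transforms cancels the time-domain aliasing wherever two frames overlap.

```python
import math

def sine_window(N):
    # Assumed window; it satisfies w[n]^2 + w[n + N]^2 = 1 (Princen-Bradley).
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def mdct(frame, window):
    # 2N windowed samples -> N coefficients.
    N = len(frame) // 2
    n0 = 0.5 + N / 2
    return [
        sum(window[n] * frame[n] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
            for n in range(2 * N))
        for k in range(N)
    ]

def imdct(coeffs, window):
    # N coefficients -> 2N windowed, time-aliased samples (scale 2/N).
    N = len(coeffs)
    n0 = 0.5 + N / 2
    return [
        (2.0 / N) * window[n]
        * sum(coeffs[k] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
              for k in range(N))
        for n in range(2 * N)
    ]

def analyze(signal, N):
    # Frames of length 2N taken every N samples (50% overlap).
    window = sine_window(N)
    return [mdct(signal[s:s + 2 * N], window)
            for s in range(0, len(signal) - 2 * N + 1, N)]

def synthesize(frames, N):
    # Overlap-and-add of the inverse transforms.
    window = sine_window(N)
    out = [0.0] * (N * (len(frames) + 1))
    for i, coeffs in enumerate(frames):
        for n, value in enumerate(imdct(coeffs, window)):
            out[i * N + n] += value
    return out

N = 8  # hypothetical hop size, for illustration only
signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(8 * N)]
frames = analyze(signal, N)
recon = synthesize(frames, N)

# The first and last N samples are covered by a single frame only,
# so the comparison is restricted to the interior of the signal.
max_err = max(abs(signal[n] - recon[n]) for n in range(N, len(signal) - N))
print(len(frames), "frames of", N, "coefficients; max interior error:", max_err)
```

In this sketch the frames advance by `N` samples while each frame stores `N` coefficients, so the coefficient count matches the sample count apart from the two boundary half-frames. A real system would typically pad the signal at both ends so that the boundary samples are also reconstructed.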
Pages: 1045-1055
Number of pages: 10