Robust Estimation of Multiple-Regression HMM Parameters for Dimension-Based Expressive Dialogue Speech Synthesis

Cited by: 0
Authors
Nagata, Tomohiro [1]
Mori, Hiroki [1]
Nose, Takashi [2]
Affiliations
[1] Utsunomiya Univ, Grad Sch Engn, Utsunomiya, Tochigi, Japan
[2] Tokyo Inst Technol, Grad Sch Sci & Engn, Tokyo, Japan
Keywords
HMM-based speech synthesis; spontaneous speech; paralinguistic information; UU Database; MRHSMM; MAP estimation; ADAPTATION; MODEL
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes spontaneous dialogue speech synthesis based on a multiple-regression hidden semi-Markov model (MRHSMM), which enables users to specify the paralinguistic information of synthesized speech with a dimensional representation. Paralinguistic aspects of synthesized speech are controlled by multiple regression models whose explanatory variables are abstract dimensions such as pleasant-unpleasant and aroused-sleepy. For robust estimation of the regression matrices of the MRHSMM from unbalanced spontaneous dialogue speech samples, the re-estimation formulae were derived in the framework of maximum a posteriori (MAP) estimation. The result of a perceptual experiment confirmed that the naturalness of synthesized speech was improved by applying MAP estimation to the regression matrices. In addition, a high correlation (R ≃ 0.7) was observed between given and perceived paralinguistic information, which implies that the proposed method could successfully reflect intended paralinguistic messages in the synthesized speech.
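The idea of MAP-estimating a regression matrix can be illustrated with a minimal sketch. It is not the re-estimation formula derived in the paper, which this record does not reproduce. It assumes a single Gaussian state whose mean is `H @ [1, v]`, where `v` holds the dimension values (for example pleasantness and arousal), an identity output covariance, and a Gaussian prior on `H` centred on `prior_H` with weight `tau`. The function name and all parameter names are hypothetical.

```python
import numpy as np


def map_regression_matrix(obs, styles, prior_H, tau):
    """MAP-style estimate of a regression matrix H (illustrative assumption).

    Minimises  sum_t ||o_t - H xi_t||^2 + tau * ||H - prior_H||_F^2,
    where xi_t = [1, v_t] is the style vector with a bias term.

    obs:     (T, D) observed feature vectors o_t
    styles:  (T, L) dimension values v_t, e.g. pleasantness and arousal
    prior_H: (D, L + 1) prior mean of the regression matrix
    tau:     non-negative prior weight; tau = 0 gives the least-squares fit
    """
    T = obs.shape[0]
    xi = np.hstack([np.ones((T, 1)), styles])        # (T, L + 1)
    A = obs.T @ xi + tau * prior_H                   # (D, L + 1)
    B = xi.T @ xi + tau * np.eye(xi.shape[1])        # (L + 1, L + 1), symmetric
    # Solve H B = A for H; since B is symmetric, H^T = B^{-1} A^T.
    return np.linalg.solve(B, A.T).T
```

With few or unbalanced samples, `xi.T @ xi` can be close to singular; the `tau` term keeps the system solvable and pulls the estimate toward the prior, which is the general sense in which a MAP estimate is more robust than a maximum-likelihood one.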
Pages: 1548-1552
Number of pages: 5
Related papers
50 in total
  • [1] Dimensional paralinguistic information control based on multiple-regression HSMM for spontaneous dialogue speech synthesis with robust parameter estimation
    Nagata, Tomohiro
    Mori, Hiroki
    Nose, Takashi
    [J]. SPEECH COMMUNICATION, 2017, 88: 137-148
  • [2] EMOTIONAL SPEECH RECOGNITION BASED ON STYLE ESTIMATION AND ADAPTATION WITH MULTIPLE-REGRESSION HMM
    Ijima, Yusuke
    Tachibana, Makoto
    Nose, Takashi
    Kobayashi, Takao
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS, 2009: 4157-4160
  • [3] A Rapid Model Adaptation Technique for Emotional Speech Recognition with Style Estimation Based on Multiple-Regression HMM
    Ijima, Yusuke
    Nose, Takashi
    Tachibana, Makoto
    Kobayashi, Takao
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (01): 107-115
  • [4] An intuitive style control technique in HMM-based expressive speech synthesis using subjective style intensity and multiple-regression global variance model
    Nose, Takashi
    Kobayashi, Takao
    [J]. SPEECH COMMUNICATION, 2013, 55 (02): 347-357
  • [5] An On-line Adaptation Technique for Emotional Speech Recognition Using Style Estimation with Multiple-Regression HMM
    Ijima, Yusuke
    Tachibana, Makoto
    Nose, Takashi
    Kobayashi, Takao
    [J]. INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008: 1297-1300
  • [6] Speaking Style Adaptation for Spontaneous Speech Recognition Using Multiple-Regression HMM
    Ijima, Yusuke
    Matsubara, Takeshi
    Nose, Takashi
    Kobayashi, Takao
    [J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009: 548-551
  • [7] Robust Voicing Detection and Estimation for HMM-Based Speech Synthesis
    Narendra, N. P.
    Rao, K. Sreenivasa
    [J]. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2015, 34 (08): 2597-2619
  • [8] A Perceptual Expressivity Modeling Technique for Speech Synthesis Based on Multiple-Regression HSMM
    Nose, Takashi
    Kobayashi, Takao
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011: 116-119
  • [9] DIALOGUE CONTEXT SENSITIVE HMM-BASED SPEECH SYNTHESIS
    Tsiakoulis, Pirros
    Breslin, Catherine
    Gasic, Milica
    Henderson, Matthew
    Kim, Dongho
    Szummer, Martin
    Thomson, Blaise
    Young, Steve
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014
  • [10] FACTORED MLLR ADAPTATION FOR HMM-BASED EXPRESSIVE SPEECH SYNTHESIS
    Sung, June Sig
    Hong, Doo Hwa
    Lee, Chul Min
    Kim, Nam Soo
    [J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012: 974-977