Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech

Cited by: 134
Authors
Nakamura, Keigo [1 ]
Toda, Tomoki [1 ]
Saruwatari, Hiroshi [1 ]
Shikano, Kiyohiro [1 ]
Affiliations
[1] Nara Inst Sci & Technol, Grad Sch Informat Sci, Ikoma, Nara 6300192, Japan
Keywords
Electrolaryngeal speech; Voice conversion; Speaking-aid system; Speech enhancement; Air-pressure sensor; Silence excitation; Non-audible murmur; Laryngectomee; MAXIMUM-LIKELIHOOD; LARYNGECTOMY
DOI
10.1016/j.specom.2011.07.007
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
An electrolarynx (EL) is a medical device that generates sound source signals to provide laryngectomees with a voice. In this article we focus on two problems of speech produced with an EL (EL speech). One problem is that EL speech is extremely unnatural and the other is that sound source signals with high energy are generated by an EL, and therefore, the signals often annoy surrounding people. To address these two problems, in this article we propose three speaking-aid systems that enhance three different types of EL speech signals: EL speech, EL speech using an air-pressure sensor (EL-air speech), and silent EL speech. The air-pressure sensor enables a laryngectomee to manipulate the F0 contours of EL speech using exhaled air that flows from the tracheostoma. Silent EL speech is produced with a new sound source unit that generates signals with extremely low energy. Our speaking-aid systems address the poor quality of EL speech using voice conversion (VC), which transforms acoustic features so that it appears as if the speech is uttered by another person. Our systems estimate spectral parameters, F0 and aperiodic components independently. The result of experimental evaluations demonstrates that the use of an air-pressure sensor dramatically improves F0 estimation accuracy. Moreover, it is revealed that the converted speech signals are preferred to source EL speech. © 2011 Elsevier B.V. All rights reserved.
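The abstract does not spell out the conversion function, so the following is only an illustrative sketch of one classic form of GMM-based voice conversion: the minimum mean-square-error mapping E[y | x] under a joint GMM over source and target features. It is not taken from the article (whose keywords suggest a maximum-likelihood variant), it uses scalar features for brevity, and every name and parameter value (`convert_frame`, `TOY_GMM`, the dictionary keys) is hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def convert_frame(x, components):
    """Map a scalar source feature x to E[y | x] under a joint GMM over (x, y).

    Each component is a dict with the (hypothetical) keys
    weight, mean_x, mean_y, var_xx and cov_yx.
    """
    # Posterior probability of each mixture component given the source feature.
    likelihoods = [
        c["weight"] * gaussian_pdf(x, c["mean_x"], c["var_xx"]) for c in components
    ]
    total = sum(likelihoods)
    posteriors = [lik / total for lik in likelihoods]

    # Posterior-weighted sum of the per-component conditional means of y given x.
    converted = 0.0
    for post, c in zip(posteriors, components):
        cond_mean = c["mean_y"] + c["cov_yx"] / c["var_xx"] * (x - c["mean_x"])
        converted += post * cond_mean
    return converted


# Toy two-component joint GMM; the numbers are made up for illustration only.
TOY_GMM = [
    {"weight": 0.5, "mean_x": -1.0, "mean_y": 2.0, "var_xx": 1.0, "cov_yx": 0.5},
    {"weight": 0.5, "mean_x": 1.0, "mean_y": 4.0, "var_xx": 1.0, "cov_yx": 0.5},
]

if __name__ == "__main__":
    print(convert_frame(0.0, TOY_GMM))
```

In a real system the features would be vectors (for example spectral parameters per frame) and the GMM parameters would be trained on time-aligned source and target speech; the scalar case above only shows the shape of the mapping.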
Pages: 134-146
Number of pages: 13
Related papers
50 in total
  • [21] Electrolaryngeal speech enhancement based on a two stage framework with bottleneck feature refinement and voice conversion
    Yang, Yaogen
    Zhang, Haozhe
    Cai, Zexin
    Shi, Yao
    Li, Ming
    Zhang, Dong
    Ding, Xiaojun
    Deng, Jianhua
    Wang, Jie
    [J]. BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2023, 80
  • [22] A Study of Mutual Information for GMM-Based Spectral Conversion
    Hwang, Hsin-Te
    Tsao, Yu
    Wang, Hsin-Min
    Wang, Yih-Ru
    Chen, Sin-Horng
    [J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 78 - 81
  • [23] GMM-BASED ACOUSTIC MODELING FOR EMBEDDED SPEECH RECOGNITION
    Levy, Christophe
    Linares, Georges
    Bonastre, Jean-Francois
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1726 - 1729
  • [24] GMM-based a priori SNR estimation in speech enhancement
    Lei, Jianjun
    Wang, Jian
    Liu, Gang
    Guo, Jun
    [J]. WCICA 2006: SIXTH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-12, CONFERENCE PROCEEDINGS, 2006, : 4293 - +
  • [25] The Use of Air-Pressure Sensor in Electrolaryngeal Speech Enhancement Based on Statistical Voice Conversion
    Nakamura, Keigo
    Toda, Tomoki
    Saruwatari, Hiroshi
    Shikano, Kiyohiro
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 1628 - 1631
  • [26] Experiment with GMM-Based Artefact Localization in Czech Synthetic Speech
    Pribil, Jiri
    Pribilova, Anna
    Matousek, Jindrich
    [J]. TEXT, SPEECH, AND DIALOGUE (TSD 2015), 2015, 9302 : 23 - 31
  • [27] Speech Recognition in a Home Environment Using Parallel Decoding with GMM-Based Noise Modeling
    Machida, Kohei
    Nose, Takashi
    Ito, Akinori
    [J]. 2014 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2014,
  • [28] A GMM-based telephone channel classification for Mandarin speech recognition
    Xu, W
    Peng, X
    Wang, BX
    [J]. 2004 7TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS 1-3, 2004, : 642 - 645
  • [29] Non-intrusive GMM-based speech quality measurement
    Falk, TH
    Xu, QF
    Chan, WY
    [J]. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 125 - 128
  • [30] Target Speech GMM-based Spectral Compensation for Noise Robust Speech Recognition
    Shinozaki, Takahiro
    Furui, Sadaoki
    [J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1223 - 1226