Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech

Cited by: 134
Authors
Nakamura, Keigo [1 ]
Toda, Tomoki [1 ]
Saruwatari, Hiroshi [1 ]
Shikano, Kiyohiro [1 ]
Affiliations
[1] Nara Inst Sci & Technol, Grad Sch Informat Sci, Ikoma, Nara 6300192, Japan
Keywords
Electrolaryngeal speech; Voice conversion; Speaking-aid system; Speech enhancement; Air-pressure sensor; Silence excitation; Non-audible murmur; Laryngectomee; MAXIMUM-LIKELIHOOD; LARYNGECTOMY
DOI
10.1016/j.specom.2011.07.007
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
An electrolarynx (EL) is a medical device that generates sound source signals to provide laryngectomees with a voice. In this article we focus on two problems of speech produced with an EL (EL speech). One problem is that EL speech is extremely unnatural and the other is that sound source signals with high energy are generated by an EL, and therefore, the signals often annoy surrounding people. To address these two problems, in this article we propose three speaking-aid systems that enhance three different types of EL speech signals: EL speech, EL speech using an air-pressure sensor (EL-air speech), and silent EL speech. The air-pressure sensor enables a laryngectomee to manipulate the F0 contours of EL speech using exhaled air that flows from the tracheostoma. Silent EL speech is produced with a new sound source unit that generates signals with extremely low energy. Our speaking-aid systems address the poor quality of EL speech using voice conversion (VC), which transforms acoustic features so that it appears as if the speech is uttered by another person. Our systems estimate spectral parameters, F0 and aperiodic components independently. The result of experimental evaluations demonstrates that the use of an air-pressure sensor dramatically improves F0 estimation accuracy. Moreover, it is revealed that the converted speech signals are preferred to source EL speech. (C) 2011 Elsevier B.V. All rights reserved.
Pages: 134-146
Page count: 13