Speaker Dependent Approach for Enhancing a Glossectomy Patient's Speech via GMM-based Voice Conversion

Cited by: 7

Authors:

Tanaka, Kei [1]

Hara, Sunao [1]

Abe, Masanobu [1]

Sato, Masaaki [2]

Minagi, Shogo [2]

Affiliations:

[1] Okayama Univ, Grad Sch Nat Sci & Technol, Okayama, Japan

[2] Okayama Univ, Grad Sch Med Dent & Pharmaceut Sci, Okayama, Japan

Source:

18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017

Keywords:

voice conversion; speech intelligibility; glossectomy; prostheses

DOI:

10.21437/Interspeech.2017-841

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper proposes using a GMM-based voice conversion algorithm to generate speaker-dependent mapping functions that improve the intelligibility of speech uttered by patients who have undergone a wide glossectomy. The speaker-dependent approach makes it possible to generate mapping functions that reconstruct the missing spectral features of a patient's speech without the influence of speaker factors. The idea is simple: collect speech uttered by a patient before and after the glossectomy. In practice, however, it is hard to ask patients to record speech solely for algorithm development. To evaluate the proposed approach, we therefore simulated glossectomy patients by fabricating an intraoral appliance that covers the lower dental arch and the tongue surface to restrain tongue movements. In terms of Mel-frequency cepstrum (MFC) distance, applying voice conversion reduced the distances by 25% and 42% for the speaker-dependent and speaker-independent cases, respectively. In terms of phoneme intelligibility, dictation tests showed that speech reconstructed by the speaker-dependent approach was almost always more intelligible than the original speech uttered by the simulated patients, whereas speech reconstructed by the speaker-independent approach was not.
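The abstract reports results as percentage reductions in MFC distance but does not define the distance itself. The following is a minimal sketch of how such a figure could be computed, assuming the widely used mel-cepstral distortion formula in dB, frame sequences that are already time-aligned, and cepstral vectors from which the caller has dropped the energy term. The function names and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def mel_cepstral_distance(ref_frames, test_frames):
    """Mean mel-cepstral distortion (dB) over two time-aligned frame sequences.

    Assumption: the common formula (10 / ln 10) * sqrt(2 * sum(diff^2)) per frame.
    The paper may define its MFC distance differently.
    """
    if len(ref_frames) != len(test_frames) or not ref_frames:
        raise ValueError("sequences must be non-empty and time-aligned")
    scale = 10.0 / math.log(10.0)
    total = 0.0
    for ref, test in zip(ref_frames, test_frames):
        squared_error = sum((r - t) ** 2 for r, t in zip(ref, test))
        total += scale * math.sqrt(2.0 * squared_error)
    return total / len(ref_frames)


def relative_reduction(before, after):
    """Percentage by which the distance shrank after conversion."""
    return 100.0 * (before - after) / before


# Toy, made-up cepstral vectors (two frames, two coefficients each).
target = [[1.0, 0.5], [0.5, 0.0]]      # stand-in for normal speech
source = [[0.0, 0.5], [0.5, 1.0]]      # stand-in for impaired speech
converted = [[0.5, 0.5], [0.5, 0.5]]   # stand-in for voice-converted speech

d_before = mel_cepstral_distance(target, source)
d_after = mel_cepstral_distance(target, converted)
reduction = relative_reduction(d_before, d_after)
print(f"distance before: {d_before:.2f} dB, after: {d_after:.2f} dB")
print(f"relative reduction: {reduction:.1f}%")
```

In this reading, the 25% and 42% figures in the abstract would correspond to `relative_reduction` evaluated on distances measured before and after conversion, though the abstract alone does not confirm which distance variant or alignment procedure was used.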

Pages: 3384-3388

Number of pages: 5

43 items in total

[1] Enhancing a Glossectomy Patient's Speech via GMM-based Voice Conversion
Tanaka, Kei
Hara, Sunao
Abe, Masanobu
Minagi, Shogo
[J]. 2016 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2016,
[2] GMM-Based Speaker Gender and Age Classification After Voice Conversion
Pribil, Jiri
Pribilova, Anna
Matousek, Jindrich
[J]. 2016 FIRST INTERNATIONAL WORKSHOP ON SENSING, PROCESSING AND LEARNING FOR INTELLIGENT MACHINES (SPLINE), 2016,
[3] Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech
Nakamura, Keigo
Toda, Tomoki
Saruwatari, Hiroshi
Shikano, Kiyohiro
[J]. SPEECH COMMUNICATION, 2012, 54 (01) : 134 - 146
[4] Incorporating Global Variance in the Training Phase of GMM-based Voice Conversion
Hwang, Hsin-Te
Tsao, Yu
Wang, Hsin-Min
Wang, Yih-Ru
Chen, Sin-Horng
[J]. 2013 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2013,
[5] A pruning approach for GMM-based speaker verification in mobile embedded systems
Leung, CC
Moon, YS
Meng, H
[J]. BIOMETRIC AUTHENTICATION, PROCEEDINGS, 2004, 3072 : 607 - 613
[6] Voice Conversion Using Bilinear Model Integrated with Joint GMM-based Classification
Sun, Xinjian
Zhang, Xiongwei
Yang, Jibin
Cao, Tieyong
[J]. 2013 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND TECHNOLOGY (ICIST), 2013, : 1225 - 1228
[7] Enhancing the Performance of a GMM-based Speaker Identification System in a Multi-Microphone Setup
Stergiou, Andreas
Pnevmatikakis, Aristodemos
Polymenakos, Lazaros C.
[J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1463 - 1466
[8] Modulation Spectrum-Based Post-Filter for GMM-Based Voice Conversion
Takamichi, Shinnosuke
Toda, Tomoki
Black, Alan W.
Nakamura, Satoshi
[J]. 2014 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2014,
[9] Voice Conversion for TTS Systems with Tuning on the Target Speaker Based on GMM
Zahariev, Vadim
Azarov, Elias
Petrovsky, Alexander
[J]. SPEECH AND COMPUTER, SPECOM 2017, 2017, 10458 : 788 - 798
[10] Objective Comparison of Four GMM-Based Methods for PMA-to-Speech Conversion
Erro, Daniel
Hernaez, Inma
Serrano, Luis
Saratxaga, Ibon
Navas, Eva
[J]. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES, IBERSPEECH 2016, 2016, 10077 : 24 - 32
