The Use of Air-Pressure Sensor in Electrolaryngeal Speech Enhancement Based on Statistical Voice Conversion

Cited by: 0

Authors:

Nakamura, Keigo [1]

Toda, Tomoki [1]

Saruwatari, Hiroshi [1]

Shikano, Kiyohiro [1]

Affiliations:

[1] Nara Institute of Science and Technology, Graduate School of Information Science, Nara, Japan

Source:

11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 | 2010

Keywords:

Electrolarynx; Air-pressure sensor; Laryngectomee; Voice conversion; Speaking-aid;

DOI:

Not available

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

In our previous work, we proposed a speaking-aid system that converts electrolaryngeal speech (EL speech) to normal speech using a statistical voice conversion technique. The main weakness of that system is the difficulty of estimating natural fundamental frequency (F0) contours from EL speech, which contains only built-in F0 contours. This paper proposes another speaking-aid system that uses an air-pressure sensor to let laryngectomees control the F0 contours of EL speech with their breath. The experimental results demonstrate that 1) the correlation coefficient of F0 contours between the converted and target speech improves from 0.58 to 0.78 with the air-pressure sensor, and 2) the synthetic speech converted by the proposed system sounds more natural and is preferred over that of our conventional speaking-aid system.
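
As an illustration of the F0 correlation measure mentioned in the abstract, the following is a minimal sketch of a Pearson correlation between two F0 contours. The abstract does not describe the evaluation protocol, so several details here are assumptions: the function name `f0_correlation` is hypothetical, the two contours are assumed to be already time-aligned frame by frame, unvoiced frames are assumed to be marked with 0, and only frames voiced in both contours are compared.

```python
import numpy as np


def f0_correlation(f0_converted, f0_target):
    """Pearson correlation of two F0 contours (Hz).

    Assumptions (not taken from the paper): the contours are
    time-aligned, unvoiced frames are encoded as 0, and only frames
    voiced in both contours enter the computation.
    """
    a = np.asarray(f0_converted, dtype=float)
    b = np.asarray(f0_target, dtype=float)
    if a.shape != b.shape:
        raise ValueError("contours must be time-aligned and equal in length")

    # Keep only frames where both contours carry a voiced F0 value.
    voiced = (a > 0) & (b > 0)
    if voiced.sum() < 2:
        raise ValueError("need at least two frames voiced in both contours")

    # np.corrcoef returns the 2x2 correlation matrix; the off-diagonal
    # entry is the correlation between the two contours.
    return float(np.corrcoef(a[voiced], b[voiced])[0, 1])
```

Under these assumptions, a converted contour that rises and falls together with the target yields a value near 1, while a flat, built-in contour that ignores the target's movement yields a value near 0 (or an undefined value if it is perfectly constant).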


Pages: 1628-1631

Page count: 4
