Real-Time Vibration Control of an Electrolarynx Based on Statistical F0 Contour Prediction

Authors
Tanaka, Kou [1]
Toda, Tomoki [2]
Neubig, Graham [1]
Nakamura, Satoshi [1]
Affiliations
[1] Nara Inst Sci & Technol, Grad Sch Informat Sci, 8916-5 Takayama Cho, Ikoma, Nara, Japan
[2] Nagoya Univ, Informat Technol Ctr, Chikusa Ku, Furo Cho, Nagoya, Aichi, Japan
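Keywords
VOICE CONVERSION
DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract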
An electrolarynx is a speaking aid device that artificially generates excitation sounds to help laryngectomees produce electrolaryngeal (EL) speech. Although EL speech is quite intelligible, its naturalness suffers significantly from the unnatural fundamental frequency (F0) patterns of the mechanical excitation sounds. To make it possible to produce more natural-sounding EL speech, we have proposed a method that automatically controls the F0 patterns of the excitation sounds generated by the electrolarynx, based on statistical F0 prediction, which predicts F0 patterns from the produced EL speech in real time. In our previous work, we developed a prototype system by implementing the proposed real-time prediction method in an actual, physical electrolarynx. Using this prototype, we found that the improvements in the naturalness of EL speech it yields tend to be smaller than those yielded by batch-type prediction. In this paper, we examine the negative impact of the latency of real-time prediction on F0 prediction accuracy, and to alleviate it, we propose two methods: 1) modeling of segmented continuous F0 (CF0) patterns and 2) prediction of forthcoming F0 values. The experimental results demonstrate that 1) the conventional real-time prediction method needs a large delay to predict CF0 patterns and 2) the proposed methods have a positive impact on real-time prediction.
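The abstract does not say how CF0 patterns are built. As a rough illustration only, the sketch below assumes the common construction in which unvoiced frames (marked here as F0 = 0) are filled by linear interpolation between the neighbouring voiced frames. The function names `continuous_f0` and `required_lookahead` are hypothetical and are not taken from the paper. The second function counts how many future frames each frame would need before its interpolated value is known, which is one plausible reading of why predicting CF0 in real time requires a delay.

```python
from typing import List


def continuous_f0(f0: List[float]) -> List[float]:
    """Fill unvoiced frames (f0 == 0) by linear interpolation.

    Assumption: this is an illustrative CF0 construction, not the
    paper's exact procedure. Leading and trailing unvoiced frames
    are held at the nearest voiced value.
    """
    voiced = [i for i, v in enumerate(f0) if v > 0]
    if not voiced:
        return list(f0)
    out = list(f0)
    # Hold the first and last voiced values at the edges.
    for i in range(voiced[0]):
        out[i] = f0[voiced[0]]
    for i in range(voiced[-1] + 1, len(f0)):
        out[i] = f0[voiced[-1]]
    # Interpolate linearly inside each unvoiced gap.
    for a, b in zip(voiced, voiced[1:]):
        for i in range(a + 1, b):
            w = (i - a) / (b - a)
            out[i] = (1 - w) * f0[a] + w * f0[b]
    return out


def required_lookahead(f0: List[float]) -> List[int]:
    """Frames of future context each frame needs for interpolation.

    Voiced frames need none; an unvoiced frame must wait until the
    next voiced frame arrives. Trailing unvoiced frames are held, so
    they need none either.
    """
    look = [0] * len(f0)
    next_voiced = None
    for i in range(len(f0) - 1, -1, -1):
        if f0[i] > 0:
            next_voiced = i
        elif next_voiced is not None:
            look[i] = next_voiced - i
    return look


if __name__ == "__main__":
    f0 = [0.0, 100.0, 0.0, 0.0, 130.0, 0.0]  # made-up F0 values in Hz
    print(continuous_f0(f0))
    print(required_lookahead(f0))
```

In this toy input, the frames inside the unvoiced gap cannot be assigned a CF0 value until the next voiced frame has been observed, so a frame-by-frame system either waits (adding latency) or works with shorter segments or forecasts, which is the direction the two proposed methods take.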
Pages: 1333-1337
Number of pages: 5