A Digital Signal Processor Implementation of Silent/Electrolaryngeal Speech Enhancement based on Real-Time Statistical Voice Conversion

Cited by: 0
Authors
Moriguchi, Takuto [1]
Toda, Tomoki [1]
Sano, Motoaki [2]
Sato, Hiroshi [2]
Neubig, Graham [1]
Sakti, Sakriani [1]
Nakamura, Satoshi [1]
Affiliations
[1] Nara Inst Sci & Technol, Grad Sch Informat Sci, Nara, Japan
[2] Foster Elect Co Ltd, Akishima, Tokyo, Japan
Keywords
statistical voice conversion; real-time processing; reduction of computational cost; DSP; non-audible murmur; electrolaryngeal speech
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present a digital signal processor (DSP) implementation of real-time statistical voice conversion (VC) for silent speech enhancement and electrolaryngeal speech enhancement. As a silent speech interface, we focus on non-audible murmur (NAM), which can be used in situations where audible speech is not acceptable. Electrolaryngeal speech is a typical type of alaryngeal speech, produced by an alternative speaking method for laryngectomees. However, the sound quality of both NAM and electrolaryngeal speech suffers from a lack of naturalness. VC has proven to be a promising approach to this problem, and it has been successfully implemented on devices with sufficient computational resources. An implementation on devices that are highly portable but have limited computational resources would greatly contribute to its practical use. In this paper, we further implement real-time VC on a DSP. To implement the two speech enhancement systems based on real-time VC, one from NAM to a whispered voice and the other from electrolaryngeal speech to a natural voice, we propose several methods for reducing computational cost while preserving conversion accuracy. We conduct experimental evaluations and show that real-time VC is capable of running on a DSP with little degradation.
Pages: 3071-3075
Number of pages: 5