A Digital Signal Processor Implementation of Silent/Electrolaryngeal Speech Enhancement based on Real-Time Statistical Voice Conversion

Cited by: 0
Authors
Moriguchi, Takuto [1]
Toda, Tomoki [1]
Sano, Motoaki [2]
Sato, Hiroshi [2]
Neubig, Graham [1]
Sakti, Sakriani [1]
Nakamura, Satoshi [1]
Affiliations
[1] Nara Inst Sci & Technol, Grad Sch Informat Sci, Nara, Japan
[2] Foster Elect Co Ltd, Akishima, Tokyo, Japan
Keywords
statistical voice conversion; real-time processing; reduction of computational cost; DSP; non-audible murmur; electrolaryngeal speech
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present a digital signal processor (DSP) implementation of real-time statistical voice conversion (VC) for silent speech enhancement and electrolaryngeal speech enhancement. As a silent speech interface, we focus on non-audible murmur (NAM), which can be used in situations where audible speech is not acceptable. Electrolaryngeal speech is a typical type of alaryngeal speech, produced by an alternative speaking method for laryngectomees. However, both NAM and electrolaryngeal speech lack naturalness. VC has proven to be a promising approach to this problem, and it has been successfully implemented on devices with sufficient computational resources. An implementation on devices that are highly portable but have limited computational resources would greatly contribute to its practical use. In this paper, we further implement real-time VC on a DSP. To implement the two speech enhancement systems based on real-time VC, one from NAM to a whispered voice and the other from electrolaryngeal speech to a natural voice, we propose several methods for reducing computational cost while preserving conversion accuracy. We conduct experimental evaluations and show that real-time VC is capable of running on a DSP with little degradation.
Pages: 3071-3075
Number of pages: 5
Related papers
50 in total
  • [31] REAL-TIME LINEAR QUADRATIC CONTROL USING DIGITAL SIGNAL PROCESSOR
    Slavov, T.
    Mollov, L.
    Petkov, P.
    TWMS JOURNAL OF PURE AND APPLIED MATHEMATICS, 2012, 3 (02): 145-157
  • [32] Esophageal Speech Enhancement Based on Statistical Voice Conversion with Gaussian Mixture Models
    Doi, Hironori
    Nakamura, Keigo
    Toda, Tomoki
    Saruwatari, Hiroshi
    Shikano, Kiyohiro
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (09): 2472-2482
  • [33] Real-Time Digital Signal Processing Based on FPGAs for Electronic Skin Implementation
    Ibrahim, Ali
    Gastaldo, Paolo
    Chible, Hussein
    Valle, Maurizio
    SENSORS, 2017, 17 (03)
  • [34] A LABORATORY-BASED COURSE IN REAL-TIME DIGITAL SIGNAL PROCESSING IMPLEMENTATION
    Budge, Scott E.
    2009 IEEE 13TH DIGITAL SIGNAL PROCESSING WORKSHOP & 5TH IEEE PROCESSING EDUCATION WORKSHOP, VOLS 1 AND 2, PROCEEDINGS, 2009: 762-767
  • [35] ON THE DIGITAL SIGNAL PROCESSOR BASED PROGRAMMABLE REAL-TIME REED SOLOMON CODING DECODING SYSTEM
    OU, YJZ
    LIN, ZY
    INTERNATIONAL JOURNAL OF SATELLITE COMMUNICATIONS, 1990, 8 (02): 95-98
  • [36] SYSTEM BASED ON A DIGITAL SIGNAL PROCESSOR FOR REAL-TIME CROSS-CORRELATION OF NEURAL SIGNALS
    FIORE, L
    PIRCHIO, M
    RICCI, D
    ACATI, G
    FIORINO, C
    MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, 1994, 32 (05): 593-596
  • [37] Real-time digital signal processor-based system for MHD mode identification in ISTTOK
    Carvalho, BB
    Fernandes, H
    Sousa, J
    Varandas, CAF
    REVIEW OF SCIENTIFIC INSTRUMENTS, 2004, 75 (10): 4265-4267
  • [38] Design and FPGA Implementation of a Real-time Processor for the HDR Conversion of Images and Videos
    Licciardo, Gian Domenico
    Cappetta, Carmine
    Di Benedetto, Luigi
    2016 8TH COMPUTER SCIENCE AND ELECTRONIC ENGINEERING CONFERENCE (CEEC), 2016: 192-197
  • [39] Real-time Control of a DNN-based Articulatory Synthesizer for Silent Speech Conversion: a pilot study
    Bocquelet, Florent
    Hueber, Thomas
    Girin, Laurent
    Savariaux, Christophe
    Yvert, Blaise
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015: 2405-2409
  • [40] Efficient implementation of real-time PFFT processor based on FPGA
    Ling, Xiao-Feng
    Gong, Xin-Bao
    Jin, Rong-Hong
    Shanghai Jiaotong Daxue Xuebao/Journal of Shanghai Jiaotong University, 2012, 46 (11): 1811-1815