Denoising Recurrent Neural Network for Deep Bidirectional LSTM based Voice Conversion

Cited by: 7
Authors
Wu, Jie [1]
Huang, Dongyan [2]
Xie, Lei [1]
Li, Haizhou [2, 3]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian, Shaanxi, Peoples R China
[2] ASTAR, Inst Infocomm Res, Singapore, Singapore
[3] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore
Funding
National Natural Science Foundation of China
Keywords
residual error; Gaussian noise; denoising; recurrent neural network; voice conversion;
DOI
10.21437/Interspeech.2017-694
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper studies post-processing in deep bidirectional Long Short-Term Memory (DBLSTM) based voice conversion, where the statistical parameters are optimized to generate speech that exhibits properties similar to those of the target speech. However, a residual error always remains between the converted speech and the target speech. We reformulate the residual error problem as speech restoration, which aims to recover the target speech samples from the converted ones. Specifically, we propose a denoising recurrent neural network (DeRNN) that introduces regularization during training to shape the distribution of the converted data in latent space. We compare the proposed approach with global variance (GV), modulation spectrum (MS) and recurrent neural network (RNN) based postfilters, which serve a similar purpose. The subjective test results show that the proposed approach significantly outperforms these conventional approaches in terms of quality and similarity.
Pages: 3379-3383
Number of pages: 5