PHASE RECONSTRUCTION FROM AMPLITUDE SPECTROGRAMS BASED ON VON-MISES-DISTRIBUTION DEEP NEURAL NETWORK

Cited by: 0
Authors
Takamichi, Shinnosuke [1]
Saito, Yuki [1]
Takamune, Norihiro [1]
Kitamura, Daichi [2]
Saruwatari, Hiroshi [1]
Affiliations
[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Tokyo, Japan
[2] Kagawa Coll, Natl Inst Technol, Dept Elect & Comp Engn, Takamatsu, Kagawa, Japan
Keywords
speech analysis; phase reconstruction; deep neural network; von Mises distribution; group delay
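DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract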
This paper presents a deep neural network (DNN)-based method for phase reconstruction from amplitude spectrograms. In audio signal and speech processing, the amplitude spectrogram is often the representation that is processed, and the corresponding phase spectrogram is then reconstructed from the amplitude spectrogram with the Griffin-Lim method. However, the Griffin-Lim method causes unnatural artifacts in synthetic speech. To address this problem, we introduce the von-Mises-distribution DNN for phase reconstruction. The DNN is a generative model based on the von Mises distribution, which can model the distribution of a periodic variable such as a phase, and the model parameters of the DNN are estimated on the basis of the maximum likelihood criterion. Furthermore, we propose a group-delay loss for DNN training that brings the predicted group delay close to the natural group delay. The experimental results demonstrate that 1) the trained DNN can predict group delay more accurately than the phases themselves, and 2) our phase reconstruction methods achieve better speech quality than the conventional Griffin-Lim method.
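The two training criteria named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the fixed concentration parameter `kappa`, the finite-difference definition of group delay, and the `1 - cos` form of the group-delay loss are all assumptions chosen for illustration. Phases are assumed to be arrays of shape (time frames, frequency bins) in radians.

```python
import numpy as np

def von_mises_nll(phase_true, phase_pred, kappa=1.0):
    """Mean negative log-likelihood of phase_true under a von Mises
    distribution with mean phase_pred and concentration kappa (assumed fixed).

    p(y | mu, kappa) = exp(kappa * cos(y - mu)) / (2 * pi * I0(kappa))
    """
    log_norm = np.log(2.0 * np.pi * np.i0(kappa))  # log normalizing constant
    return float(np.mean(log_norm - kappa * np.cos(phase_true - phase_pred)))

def group_delay(phase):
    """Approximate group delay as the negative finite difference of the
    phase along the frequency axis (last axis). Assumed definition."""
    return -np.diff(phase, axis=-1)

def group_delay_loss(phase_true, phase_pred):
    """Cosine-based mismatch between natural and predicted group delays.
    Using the cosine keeps the loss periodic, so no unwrapping is needed.
    Hypothetical form; the paper may define the loss differently."""
    diff = group_delay(phase_true) - group_delay(phase_pred)
    return float(np.mean(1.0 - np.cos(diff)))
```

Under these assumptions, both losses are invariant to 2π shifts of the prediction, and the group-delay loss additionally ignores a constant phase offset across frequency bins, since such an offset cancels in the finite difference.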
Pages: 286-290
Number of pages: 5
Related papers
50 in total
  • [31] Deep Neural Network Based Complex Spectrogram Reconstruction for Speech Bandwidth Expansion
    Yu, Hongjiang
    Zhu, Wei-Ping
    2020 18TH IEEE INTERNATIONAL NEW CIRCUITS AND SYSTEMS CONFERENCE (NEWCAS'20), 2020: 110 - 113
  • [32] Understanding and Boosting of Deep Convolutional Neural Network Based on Sample Distribution
    Zheng, Qinghe
    Yang, Mingqiang
    Zhang, Qingrui
    Zhang, Xinxin
    Yang, Jiajie
    PROCEEDINGS OF 2017 IEEE 2ND INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC), 2017: 823 - 827
  • [33] Predictions for Three-Month Postoperative Vocal Recovery after Thyroid Surgery from Spectrograms with Deep Neural Network
    Lee, Jeong Hoon
    Lee, Chang Yoon
    Eom, Jin Seop
    Pak, Mingun
    Jeong, Hee Seok
    Son, Hee Young
    SENSORS, 2022, 22 (17)
  • [34] Distribution network distributed state estimation method based on an integrated deep neural network
    Zhang W.
    Fan Y.
    Hou J.
    Song Y.
    Dianli Xitong Baohu yu Kongzhi/Power System Protection and Control, 2024, 52 (03): 128 - 140
  • [35] Speech Magnitude Spectrum Reconstruction from MFCCs Using Deep Neural Network
    Jiang Wenbin
    Liu Peilin
    Wen Fei
    CHINESE JOURNAL OF ELECTRONICS, 2018, 27 (02) : 393 - 398
  • [36] Speech Magnitude Spectrum Reconstruction from MFCCs Using Deep Neural Network
    JIANG Wenbin
    LIU Peilin
    WEN Fei
    Chinese Journal of Electronics, 2018, 27 (02) : 393 - 398
  • [37] Packet Loss Concealment Based on Phase Correction and Deep Neural Network
    Ji, Qiang
    Bao, Changchun
    Cui, Zihao
    APPLIED SCIENCES-BASEL, 2022, 12 (19)
  • [38] Gear and bearing diagnostics using neural network-based amplitude and phase demodulation
    Larson, EC
    Wipf, DP
    Parker, BE
    CRITICAL LINK: DIAGNOSIS TO PROGNOSIS, 1997: 511 - 521
  • [39] Absolute Phase Unwrapping with Deep Neural Network for Structured Light 3D Reconstruction
    Chang, Wan
    Xiang, Sen
    Deng, Huiping
    Wu, Jin
    PROCEEDINGS OF THE 33RD CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2021), 2021: 420 - 425
  • [40] Deep Griffin-Lim Iteration: Trainable Iterative Phase Reconstruction Using Neural Network
    Masuyama, Yoshiki
    Yatabe, Kohei
    Koizumi, Yuma
    Oikawa, Yasuhiro
    Harada, Noboru
    IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2021, 15 (01) : 37 - 50