A Study on Sampling of STFT Modifications in Time and Frequency Domains for DNN-Based Speech Dereverberation

被引:0
|
作者
Wu, Bo [1 ]
Li, Kehuang [2 ]
Yang, Minglei [1 ]
Lee, Chin-Hui [2 ]
机构
[1] Xidian Univ, Natl Lab Radar Signal Proc, Xian, Peoples R China
[2] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30332 USA
基金
中国国家自然科学基金;
关键词
D O I
暂无
中图分类号
TM [电工技术]; TN [电子技术、通信技术];
学科分类号
0808 ; 0809 ;
摘要
We investigate the effects of time and frequency sampling on short-time Fourier transform modifications to be used for speech dereverberation based on deep neural networks (DNNs). We first show that by adopting a linear activation function at the output layer and globally normalizing the target features into zero mean and unit variance, better performances can be obtained than existing DNN approaches. Then we show that the quality of dereverberated speech could be degraded with denser sampling in time for longer reverberation times, even at the price of increased computational complexities, requiring an adaptive time sampling strategy. On the other hand, the difference between the unwrapped phases of reverberant and anechoic speech becomes negligible with a dense sampling in frequency, implying a reduced speech distortion. Therefore, there is a great potential to enhance DNN based acoustic signal processing if the conventional sampling strategy can be carefully adjusted.
引用
收藏
页数:4
相关论文
共 50 条
  • [1] DNN-Based Linear Prediction Residual Enhancement for Speech Dereverberation
    Feng, Xinyang
    Li, Nuo
    He, Zunwen
    Zhang, Yan
    Zhang, Wancheng
    [J]. 2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2021, : 541 - 545
  • [2] INVERTIBLE DNN-BASED NONLINEAR TIME-FREQUENCY TRANSFORM FOR SPEECH ENHANCEMENT
    Lakeuchi, Daiki
    Yatabe, Kohei
    Koizumi, Yuma
    Oikawa, Yasuhiro
    Harada, Noboru
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6644 - 6648
  • [3] Improved Time-Frequency Trajectory Excitation Vocoder for DNN-Based Speech Synthesis
    Song, Eunwoo
    Soong, Frank K.
    Kang, Hong-Goo
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2253 - 2257
  • [4] A study of speaker adaptation for DNN-based speech synthesis
    Wu, Zhizheng
    Swietojanski, Pawel
    Veaux, Christophe
    Renals, Steve
    King, Simon
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 879 - 883
  • [5] DNN-Based Arabic Speech Synthesis
    Amrouche, Aissa
    Bentrcia, Youssouf
    Boubakeur, Khadidja Nesrine
    Abed, Ahcene
    [J]. 2022 9TH INTERNATIONAL CONFERENCE ON ELECTRICAL AND ELECTRONICS ENGINEERING (ICEEE 2022), 2022, : 378 - 382
  • [6] Robust Beam forming for Speech Recognition Using DNN-Based Time-Frequency Masks Estimation
    Jiang, Wenbin
    Wen, Fei
    Liu, Peilin
    [J]. IEEE ACCESS, 2018, 6 : 52385 - 52392
  • [7] DNN-BASED ENHANCEMENT OF NOISY AND REVERBERANT SPEECH
    Zhao, Yan
    Wang, DeLiang
    Merks, Ivo
    Zhang, Tao
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 6525 - 6529
  • [8] DNN-BASED SPEECH RECOGNITION FOR GLOBALPHONE LANGUAGES
    Tachbelie, Martha Yifiru
    Abulimiti, Ayimunishagu
    Abate, Solomon Teferra
    Schultz, Tanja
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8269 - 8273
  • [9] IMPACT OF SINGLE-MICROPHONE DEREVERBERATION ON DNN-BASED MEETING TRANSCRIPTION SYSTEMS
    Yoshioka, Takuya
    Chen, Xie
    Gales, Mark J. F.
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [10] A DNN-based emotional speech synthesis by speaker adaptation
    Yang, Hongwu
    Zhang, Weizhao
    Zhi, Pengpeng
    [J]. 2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 633 - 637