Blind Separation and Dereverberation of Speech Mixtures by Joint Optimization

Cited by: 105
Authors
Yoshioka, Takuya [1]
Nakatani, Tomohiro [1]
Miyoshi, Masato [2]
Okuno, Hiroshi G. [3]
Affiliations
[1] NTT Corp, NTT Commun Sci Labs, Kyoto 6190237, Japan
[2] Kanazawa Univ, Grad Sch Nat Sci & Technol, Kanazawa, Ishikawa 9201192, Japan
[3] Kyoto Univ, Grad Sch Informat, Dept Intelligence Sci & Technol, Kyoto 6068501, Japan
Keywords
Blind source separation (BSS); blind dereverberation (BD); conditional separation and dereverberation (CSD); CONVOLUTIVE MIXTURES; IDENTIFICATION; DECONVOLUTION; SUPPRESSION; ALGORITHMS; SIGNALS
DOI
10.1109/TASL.2010.2045183
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
This paper proposes a method for performing blind source separation (BSS) and blind dereverberation (BD) at the same time for speech mixtures. In most previous studies, BSS and BD have been investigated separately. The separation performance of conventional BSS methods deteriorates as the reverberation time increases while many existing BD methods rely on the assumption that there is only one sound source in a room. Therefore, it has been difficult to perform both BSS and BD when the reverberation time is long. The proposed method uses a network, in which dereverberation and separation networks are connected in tandem, to estimate source signals. The parameters for the dereverberation network (prediction matrices) and those for the separation network (separation matrices) are jointly optimized. This enables a BD process to take a BSS process into account. The prediction and separation matrices are alternately optimized with each depending on the other; hence, we call the proposed method the conditional separation and dereverberation (CSD) method. Comprehensive evaluation results are reported, where all the speech materials contained in the complete test set of the TIMIT corpus are used. The CSD method improves the signal-to-interference ratio by an average of about 4 dB over the conventional frequency-domain BSS approach for reverberation times of 0.3 and 0.5 s. The direct-to-reverberation ratio is also improved by about 10 dB.
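The tandem structure described in the abstract can be illustrated with a minimal sketch. It assumes a single frequency bin of a multichannel STFT and a delayed linear-prediction form for the dereverberation stage; the abstract names prediction matrices and separation matrices but does not give this exact formula, so the names `dereverberate`, `separate`, `tandem`, the `delay` parameter, and the data layout are all hypothetical. The alternating optimization of the matrices is not shown; the matrices are simply given.

```python
def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def dereverberate(x, G, delay):
    """Subtract reverberation predicted from past frames (assumed form).

    x: list of T frames, each a list of M values for one frequency bin.
    G: list of K prediction matrices, each M x M (hypothetical layout).
    delay: number of frames between the current frame and the newest
           frame used for prediction.
    """
    y = []
    for t, frame in enumerate(x):
        out = list(frame)
        for k, Gk in enumerate(G):
            past = t - delay - k
            if past < 0:
                continue  # frames before the start are treated as zero
            pred = matvec(Gk, x[past])
            out = [o - p for o, p in zip(out, pred)]
        y.append(out)
    return y


def separate(y, W):
    """Apply one separation matrix W (M x M) to every frame."""
    return [matvec(W, frame) for frame in y]


def tandem(x, G, W, delay=1):
    """Dereverberation stage followed by separation stage."""
    return separate(dereverberate(x, G, delay), W)


# Toy input: a direct sound at frame 0 and a half-amplitude echo at frame 2.
x = [[1.0, 0.5], [0.0, 0.0], [0.5, 0.25]]
zero = [[0.0, 0.0], [0.0, 0.0]]
half = [[0.5, 0.0], [0.0, 0.5]]
identity = [[1.0, 0.0], [0.0, 1.0]]

# With delay=1, G[1] acts on the frame two steps back, where the echo originates.
s = tandem(x, [zero, half], identity, delay=1)
```

In this toy case the echo at frame 2 is cancelled and the direct sound at frame 0 passes through unchanged. In the method the abstract describes, both sets of matrices would be estimated blindly from the observed mixtures rather than chosen by hand.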
Pages: 69-84
Page count: 16
Related papers
50 items in total
  • [31] Blind speech separation of nonlinear convolutive mixtures for robust speech recognition
    Koutras, A.
    Dermatas, E.
    Kokkinakis, G.
    Control and Intelligent Systems, 2002, 30 (02) : 83 - 90
  • [32] CNN-QTLBO: an optimal blind source separation and blind dereverberation scheme using lightweight CNN-QTLBO and PCDP-LDA for speech mixtures
    Jasmine J. C. Sheeja
    B. Sankaragomathi
    Signal, Image and Video Processing, 2022, 16 : 1323 - 1331
  • [33] Delay and predict equalization for blind speech dereverberation
    Triki, Mahdi
    Slock, Dirk T. M.
    2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 4955 - 4958
  • [34] SpatialNet: Extensively Learning Spatial Information for Multichannel Joint Speech Separation, Denoising and Dereverberation
    Quan, Changsheng
    Li, Xiaofei
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 1310 - 1323
  • [35] Joint Noise Suppression and Dereverberation of separating speech signals by using prediction and separation matrix
    Palagan, C. Anna
    PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON COMMUNICATION AND ELECTRONICS SYSTEMS (ICCES 2018), 2018, : 202 - 207
  • [36] AUTOREGRESSIVE FAST MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR JOINT BLIND SOURCE SEPARATION AND DEREVERBERATION
    Sekiguchi, Kouhei
    Bando, Yoshiaki
    Nugraha, Aditya Arie
    Fontaine, Mathieu
    Yoshii, Kazuyoshi
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 511 - 515
  • [37] SEMI-BLIND SPEECH ENHANCEMENT BASED ON RECURRENT NEURAL NETWORK FOR SOURCE SEPARATION AND DEREVERBERATION
    Wake, Masaya
    Bando, Yoshiaki
    Mimura, Masato
    Itoyama, Katsutoshi
    Yoshii, Kazuyoshi
    Kawahara, Tatsuya
    2017 IEEE 27TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, 2017,
  • [38] Implementation and Assessment of Joint Source Separation and Dereverberation
    Moffat, David
    Reiss, Joshua D.
    60TH AES INTERNATIONAL CONFERENCE ON DREAMS (DEREVERBERATION AND REVERBERATION OF AUDIO, MUSIC, AND SPEECH), 2016,
  • [39] Convolutive blind separation of speech mixtures using the natural gradient
    Douglas, SC
    Sun, XA
    SPEECH COMMUNICATION, 2003, 39 (1-2) : 65 - 78
  • [40] Subband-based blind separation for convolutive mixtures of speech
    Araki, S
    Makino, S
    Aichner, R
    Nishikawa, T
    Saruwatari, H
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2005, E88A (12) : 3593 - 3603