Blind Separation and Dereverberation of Speech Mixtures by Joint Optimization

Cited by: 105
Authors
Yoshioka, Takuya [1]
Nakatani, Tomohiro [1]
Miyoshi, Masato [2]
Okuno, Hiroshi G. [3]
Affiliations
[1] NTT Corp, NTT Commun Sci Labs, Kyoto 6190237, Japan
[2] Kanazawa Univ, Grad Sch Nat Sci & Technol, Kanazawa, Ishikawa 9201192, Japan
[3] Kyoto Univ, Grad Sch Informat, Dept Intelligence Sci & Technol, Kyoto 6068501, Japan
Keywords
Blind source separation (BSS); blind dereverberation (BD); conditional separation and dereverberation (CSD); convolutive mixtures; identification; deconvolution; suppression; algorithms; signals
DOI
10.1109/TASL.2010.2045183
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper proposes a method for performing blind source separation (BSS) and blind dereverberation (BD) at the same time for speech mixtures. In most previous studies, BSS and BD have been investigated separately. The separation performance of conventional BSS methods deteriorates as the reverberation time increases while many existing BD methods rely on the assumption that there is only one sound source in a room. Therefore, it has been difficult to perform both BSS and BD when the reverberation time is long. The proposed method uses a network, in which dereverberation and separation networks are connected in tandem, to estimate source signals. The parameters for the dereverberation network (prediction matrices) and those for the separation network (separation matrices) are jointly optimized. This enables a BD process to take a BSS process into account. The prediction and separation matrices are alternately optimized with each depending on the other; hence, we call the proposed method the conditional separation and dereverberation (CSD) method. Comprehensive evaluation results are reported, where all the speech materials contained in the complete test set of the TIMIT corpus are used. The CSD method improves the signal-to-interference ratio by an average of about 4 dB over the conventional frequency-domain BSS approach for reverberation times of 0.3 and 0.5 s. The direct-to-reverberation ratio is also improved by about 10 dB.
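The tandem structure described in the abstract can be illustrated with a minimal sketch of the forward pass at a single frequency bin: a dereverberation stage subtracts a linear prediction built from past observed frames, and a separation stage then applies a matrix to the result. This is not the authors' implementation. The function name tandem_forward, the array shapes, the delay parameter, and the convention of applying the Hermitian transpose of each prediction matrix are assumptions made for illustration, and the alternating optimization of the matrices is not shown.

```python
import numpy as np

def tandem_forward(x, G, W, delay=1):
    """Apply a dereverberation stage followed by a separation stage.

    All names and shapes here are illustrative assumptions:
    x: (T, M) complex STFT frames of M microphones at one frequency bin.
    G: (K, M, M) prediction matrices, one per tap.
    W: (M, M) separation matrix.
    delay: lag (in frames) of the first prediction tap.
    """
    T, M = x.shape
    d = x.astype(complex)  # astype returns a copy, so x is not modified
    for k in range(G.shape[0]):
        lag = delay + k
        if lag >= T:
            break
        # d[t] -= G[k]^H x[t - lag], written for row-vector frames
        d[lag:] -= x[:T - lag] @ G[k].conj()
    # s[t] = W d[t], written for row-vector frames
    return d @ W.T
```

With all prediction matrices set to zero and an identity separation matrix, the function returns its input unchanged, which is a convenient sanity check before plugging in estimated matrices.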
Pages: 69-84
Page count: 16
Related papers
50 in total
  • [1] JOINT BLIND DEREVERBERATION AND SEPARATION OF SPEECH MIXTURES
    Jan, Tariqullah
    Wang, Wenwu
    [J]. 2012 PROCEEDINGS OF THE 20TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2012: 2343-2347
  • [2] Computationally Efficient and Versatile Framework for Joint Optimization of Blind Speech Separation and Dereverberation
    Nakatani, Tomohiro
    Ikeshita, Rintaro
    Kinoshita, Keisuke
    Sawada, Hiroshi
    Araki, Shoko
    [J]. INTERSPEECH 2020, 2020: 91-95
  • [3] Online blind source separation and dereverberation of speech based on a joint diagonalizability constraint
    Yu, Ho-Gun
    Kim, Do-Hui
    Song, Min-Hwan
    Park, Hyung-Min
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2021, 40 (05): 503-514
  • [4] A low-complexity joint optimization of blind source separation and dereverberation
    Wang T.
    Yang F.
    Yang J.
    [J]. Shengxue Xuebao/Acta Acustica, 2024, 49 (01): 163-170
  • [5] LOW LATENCY ONLINE BLIND SOURCE SEPARATION BASED ON JOINT OPTIMIZATION WITH BLIND DEREVERBERATION
    Ueda, Tetsuya
    Nakatani, Tomohiro
    Ikeshita, Rintaro
    Kinoshita, Keisuke
    Araki, Shoko
    Makino, Shoji
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021: 506-510
  • [6] Dereverberation and Signal Separation of Speech Signal Mixtures
    Nordholm, Sven
    Hai Huyen Dam
    [J]. 2022 11TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND INFORMATION SCIENCES (ICCAIS), 2022: 96-100
  • [7] Joint Multichannel Blind Speech Separation and Dereverberation: A Real-Time Algorithmic Implementation
    Rotili, Rudy
    De Simone, Claudio
    Perelli, Alessandro
    Cifani, Simone
    Squartini, Stefano
    [J]. ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS, 2010, 93: 85-93
  • [8] Real-Time Joint Blind Speech Separation and Dereverberation in Presence of Overlapping Speakers
    Rotili, Rudy
    Principi, Emanuele
    Squartini, Stefano
    Piazza, Francesco
    [J]. ADVANCES IN NEURAL NETWORKS - ISNN 2011, PT II, 2011, 6676: 437-446
  • [9] Blind Speech Separation and Dereverberation using neural beamforming
    Pfeifenberger, Lukas
    Pernkopf, Franz
    [J]. SPEECH COMMUNICATION, 2022, 140: 29-41
  • [10] A Novel Approach for Blind Separation and Dereverberation of Speech Mixtures using Multiple step Linear Predictive Coding
    Ehsan, Wajeeha
    Jan, Tariqullah
    [J]. 2015 INTERNATIONAL CONFERENCE ON EMERGING TECHNOLOGIES (ICET), 2015