Computationally Efficient and Versatile Framework for Joint Optimization of Blind Speech Separation and Dereverberation

Cited by: 3
Authors
Nakatani, Tomohiro [1 ]
Ikeshita, Rintaro [1 ]
Kinoshita, Keisuke [1 ]
Sawada, Hiroshi [1 ]
Araki, Shoko [1 ]
Affiliations
[1] NTT Corp, Tokyo, Japan
Source
Keywords
Blind source separation; dereverberation; automatic speech recognition; independent component analysis; mixtures
DOI
10.21437/Interspeech.2020-2138
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
This paper proposes new blind signal processing techniques for optimizing a multi-input multi-output (MIMO) convolutional beamformer (CBF) in a computationally efficient way to simultaneously perform dereverberation and source separation. For effective CBF optimization, a conventional technique factorizes the CBF into a multiple-target weighted prediction error (WPE) based dereverberation filter and a separation matrix. However, this technique requires the calculation of a huge spatio-temporal covariance matrix that reflects the statistics of all the sources, which makes the computational cost very high. For computationally efficient optimization, this paper introduces two techniques: one that decomposes the huge covariance matrix into ones for individual sources, and another that decomposes the CBF into sub-filters for estimating individual sources. Both techniques substantially reduce the size of the covariance matrices that must be calculated, which greatly reduces the computational cost without loss of optimality.
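The factorization mentioned in the abstract (a dereverberation filter followed by a separation matrix) can be illustrated with a minimal NumPy sketch for a single frequency bin. Everything named here is an assumption made for illustration and is not taken from the paper: the function names, the array shapes, the prediction delay, and the random filters `G` and `W`. No optimization is performed; the sketch only checks that applying the two stages in sequence gives the same output as one combined convolutional beamformer.

```python
import numpy as np

def stack_delayed(x, delay, taps):
    """Stack delayed frames of x (shape (T, M)) into shape (T, M * taps).

    Row t holds [x[t-delay], x[t-delay-1], ..., x[t-delay-taps+1]],
    with zeros where the index would fall before the first frame.
    """
    T, M = x.shape
    xbar = np.zeros((T, M * taps), dtype=x.dtype)
    for k in range(taps):
        shift = delay + k
        if shift < T:
            xbar[shift:, k * M:(k + 1) * M] = x[:T - shift]
    return xbar

def factorized_cbf(x, G, W, delay):
    """Two-stage form: d_t = x_t - G^H xbar_t, then y_t = W^H d_t."""
    taps = G.shape[0] // x.shape[1]
    xbar = stack_delayed(x, delay, taps)
    d = x - xbar @ G.conj()        # dereverberation stage (prediction filter G)
    return d @ W.conj()            # separation stage (matrix W)

def direct_cbf(x, G, W, delay):
    """Single combined filter: y_t = W^H x_t - (G W)^H xbar_t."""
    taps = G.shape[0] // x.shape[1]
    xbar = stack_delayed(x, delay, taps)
    return x @ W.conj() - xbar @ (G @ W).conj()

# Hypothetical sizes: T frames, M microphones, L filter taps, prediction delay D.
T, M, L, D = 200, 3, 8, 2
rng = np.random.default_rng(0)
x = rng.standard_normal((T, M)) + 1j * rng.standard_normal((T, M))
G = rng.standard_normal((M * L, M)) + 1j * rng.standard_normal((M * L, M))
W = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))

y_two_stage = factorized_cbf(x, G, W, D)
y_combined = direct_cbf(x, G, W, D)
same = np.allclose(y_two_stage, y_combined)
```

In this sketch the stacked vector has `M * L` entries per frame, so any covariance matrix built from it grows with both the number of microphones and the filter length, which is the kind of cost the abstract refers to.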
Pages: 91-95
Number of pages: 5
Related papers
50 in total
  • [1] Blind Separation and Dereverberation of Speech Mixtures by Joint Optimization
    Yoshioka, Takuya
    Nakatani, Tomohiro
    Miyoshi, Masato
    Okuno, Hiroshi G.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (01): 69-84
  • [2] JOINT BLIND DEREVERBERATION AND SEPARATION OF SPEECH MIXTURES
    Jan, Tariqullah
    Wang, Wenwu
    2012 PROCEEDINGS OF THE 20TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2012: 2343-2347
  • [3] Online blind source separation and dereverberation of speech based on a joint diagonalizability constraint
    Yu, Ho-Gun
    Kim, Do-Hui
    Song, Min-Hwan
    Park, Hyung-Min
    JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2021, 40 (05): 503-514
  • [4] A low-complexity joint optimization of blind source separation and dereverberation
    Wang T.
    Yang F.
    Yang J.
    Shengxue Xuebao/Acta Acustica, 2024, 49 (01): 163-170
  • [5] LOW LATENCY ONLINE BLIND SOURCE SEPARATION BASED ON JOINT OPTIMIZATION WITH BLIND DEREVERBERATION
    Ueda, Tetsuya
    Nakatani, Tomohiro
    Ikeshita, Rintaro
    Kinoshita, Keisuke
    Araki, Shoko
    Makino, Shoji
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021: 506-510
  • [6] Joint Multichannel Blind Speech Separation and Dereverberation: A Real-Time Algorithmic Implementation
    Rotili, Rudy
    De Simone, Claudio
    Perelli, Alessandro
    Cifani, Simone
    Squartini, Stefano
    ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS, 2010, 93: 85-93
  • [7] Real-Time Joint Blind Speech Separation and Dereverberation in Presence of Overlapping Speakers
    Rotili, Rudy
    Principi, Emanuele
    Squartini, Stefano
    Piazza, Francesco
    ADVANCES IN NEURAL NETWORKS - ISNN 2011, PT II, 2011, 6676: 437-446
  • [8] Blind Speech Separation and Dereverberation using neural beamforming
    Pfeifenberger, Lukas
    Pernkopf, Franz
    SPEECH COMMUNICATION, 2022, 140: 29-41
  • [9] A Semi-blind Source Separation Approach for Speech Dereverberation
    Wang, Ziteng
    Na, Yueyue
    Liu, Zhang
    Li, Yun
    Tian, Biao
    Fu, Qiang
    INTERSPEECH 2020, 2020: 3925-3929
  • [10] Blind and Spatially-Regularized Online Joint Optimization of Source Separation, Dereverberation, and Noise Reduction
    Ueda, Tetsuya
    Nakatani, Tomohiro
    Ikeshita, Rintaro
    Kinoshita, Keisuke
    Araki, Shoko
    Makino, Shoji
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32: 1157-1172