Location Feature Integration for Clustering-Based Speech Separation in Distributed Microphone Arrays

Cited by: 18
Authors
Souden, Mehrez [1]
Kinoshita, Keisuke [2]
Delcroix, Marc [2]
Nakatani, Tomohiro [2]
Affiliations
[1] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30308 USA
[2] NTT Commun Sci Labs, Kyoto 6190237, Japan
Keywords
Blind source separation; decision fusion; distributed; microphone array processing; location-based speech clustering; speech enhancement; MULTICHANNEL WIENER FILTER; BLIND SOURCE SEPARATION; EM; ALGORITHMS; MIXTURES;
DOI
10.1109/TASLP.2013.2292308
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In distributed microphone arrays (DMAs), source location information can be defined at the intra-node and inter-node levels. The first type of information results from the diversity of acoustic channels recorded by microphones embedded in the same node, while the second is attributed to the differences between the acoustic channels observed by spatially distributed nodes. Both cues are useful in DMA processing, and the aim of this paper is to use both to cluster and separate multiple competing speech signals. To capture the intra-node information, we employ the normalized recording vector; at the inter-node level, we consider different features, including the energy level differences with and without the phase differences between nodes. We model the intra-node information using the Watson mixture model (WMM), and propose using the Gamma mixture model (GaMM), Dirichlet mixture model (DMM), and WMM to model the different inter-node location features. Furthermore, we propose several integrations of the intra-node and inter-node feature contributions to cluster speech recordings using the expectation-maximization algorithm. Finally, simulation results are provided to demonstrate the performance of all the resulting methods.
Pages: 354-367
Number of pages: 14
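The two kinds of location feature named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the per-bin list-of-complex-numbers input layout, and the choice of energy ratios that sum to one as the inter-node feature are all assumptions made here for illustration. The abstract only states that a normalized recording vector is used within a node and that energy level differences are used between nodes.

```python
import math


def normalized_recording_vector(x):
    """Scale one node's complex observation vector (one time-frequency bin)
    to unit Euclidean norm. Hypothetical helper, not taken from the paper."""
    norm = math.sqrt(sum(abs(v) ** 2 for v in x))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero observation vector")
    return [v / norm for v in x]


def internode_energy_ratios(node_vectors):
    """Return each node's share of the total observed energy in one bin.

    Assumption: the inter-node level feature is expressed as energy ratios
    that sum to one; the paper may define its feature differently.
    """
    energies = [sum(abs(v) ** 2 for v in x) for x in node_vectors]
    total = sum(energies)
    if total == 0.0:
        raise ValueError("no energy observed at any node")
    return [e / total for e in energies]


# Illustrative values only: two nodes with two microphones each.
node_a = [3 + 0j, 0 + 4j]
node_b = [1 + 0j, 0 + 0j]

intra_feature = normalized_recording_vector(node_a)
inter_feature = internode_energy_ratios([node_a, node_b])
```

Unit-norm vectors of this kind are the sort of data a Watson mixture model describes, and non-negative ratios summing to one are the sort of data a Dirichlet mixture model describes, which is why these two forms were chosen for the sketch.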