DNN-Based Mask Estimation for Distributed Speech Enhancement in Spatially Unconstrained Microphone Arrays

Cited by: 7
Authors
Furnon, Nicolas [1]
Serizel, Romain [1]
Essid, Slim [2]
Illina, Irina [1]
Affiliations
[1] Univ Lorraine, CNRS, Inria, Loria, F-54000 Nancy, France
[2] Inst Polytech Paris, Telecom Paris, LTCI, F-91764 Palaiseau, France
Keywords
Microphone arrays; Speech enhancement; Estimation; Speech processing; Noise measurement; Noise reduction; Distortion; Distributed algorithm; MULTICHANNEL WIENER FILTER; LOW-RANK APPROXIMATION; NOISE-REDUCTION; SIGNAL ESTIMATION; SENSOR NETWORKS; SINGLE; SEGREGATION; BEAMFORMER; ALGORITHMS
DOI
10.1109/TASLP.2021.3092838
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Deep neural network (DNN)-based speech enhancement algorithms in microphone arrays have now proven to be efficient solutions to speech understanding and speech recognition in noisy environments. However, in the context of ad-hoc microphone arrays, many challenges remain and raise the need for distributed processing. In this paper, we propose to extend a previously introduced distributed DNN-based time-frequency mask estimation scheme that can efficiently use spatial information in the form of so-called compressed signals, which are pre-filtered target estimations. We study the performance of this algorithm, named Tango, under realistic acoustic conditions and investigate practical aspects of its optimal application. We show that the nodes in the microphone array cooperate by taking advantage of their spatial coverage in the room. We also propose to use the compressed signals to convey not only the target estimation but also the noise estimation, in order to exploit the acoustic diversity recorded throughout the microphone array.
Pages: 2310-2323
Number of pages: 14
Related papers
50 in total
  • [1] DNN-BASED DISTRIBUTED MULTICHANNEL MASK ESTIMATION FOR SPEECH ENHANCEMENT IN MICROPHONE ARRAYS
    Furnon, Nicolas
    Serizel, Romain
    Illina, Irina
    Essid, Slim
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 4672 - 4676
  • [2] DISTRIBUTED SPEECH SEPARATION IN SPATIALLY UNCONSTRAINED MICROPHONE ARRAYS
    Furnon, Nicolas
    Serizel, Romain
    Illina, Irina
    Essid, Slim
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 4490 - 4494
  • [3] DNN-BASED SPEECH MASK ESTIMATION FOR EIGENVECTOR BEAMFORMING
    Pfeifenberger, Lukas
    Zoehrer, Matthias
    Pernkopf, Franz
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 66 - 70
  • [4] DNN-BASED SPEECH PRESENCE PROBABILITY ESTIMATION FOR MULTI-FRAME SINGLE-MICROPHONE SPEECH ENHANCEMENT
    Tammen, Marvin
    Fischer, Doerte
    Meyer, Bernd T.
    Doclo, Simon
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 191 - 195
  • [5] Attention-based distributed speech enhancement for unconstrained microphone arrays with varying number of nodes
    Furnon, Nicolas
    Serizel, Romain
    Essid, Slim
    Illina, Irina
    [J]. 29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 1095 - 1099
  • [6] Power Exponent Based Weighting Criterion for DNN-Based Mask Approximation in Speech Enhancement
    Cui, Zihao
    Bao, Changchun
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 618 - 622
  • [7] DNN-BASED ENHANCEMENT OF NOISY AND REVERBERANT SPEECH
    Zhao, Yan
    Wang, DeLiang
    Merks, Ivo
    Zhang, Tao
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 6525 - 6529
  • [8] Towards More Efficient DNN-Based Speech Enhancement Using Quantized Correlation Mask
    Abdullah, Salinna
    Zamani, Majid
    Demosthenous, Andreas
    [J]. IEEE ACCESS, 2021, 9 : 24350 - 24362
  • [9] Online Multichannel Speech Enhancement Based on Recursive EM and DNN-Based Speech Presence Estimation
    Martin-Donas, Juan Manuel
    Jensen, Jesper
    Tan, Zheng-Hua
    Gomez, Angel M.
    Peinado, Antonio M.
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 3080 - 3094
  • [10] DNN-based Feature Transformation for Speech Recognition Using Throat Microphone
    Lin, Shengke
    Tsunakawa, Takashi
    Nishida, Masafumi
    Nishimura, Masafumi
    [J]. 2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017), 2017, : 596 - 599