DNN-Based Mask Estimation for Distributed Speech Enhancement in Spatially Unconstrained Microphone Arrays

Cited by: 7

Authors:

Furnon, Nicolas ^{[1]}

Serizel, Romain ^{[1]}

Essid, Slim ^{[2]}

Illina, Irina ^{[1]}

Affiliations:

[1] Univ Lorraine, CNRS, Inria, Loria, F-54000 Nancy, France

[2] Inst Polytech Paris, Telecom Paris, LTCI, F-91764 Palaiseau, France

Source:

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2021 / Vol. 29

Keywords:

Microphone arrays; Speech enhancement; Estimation; Speech processing; Noise measurement; Noise reduction; Distortion; Distributed algorithm; microphone arrays; speech enhancement; MULTICHANNEL WIENER FILTER; LOW-RANK APPROXIMATION; NOISE-REDUCTION; SIGNAL ESTIMATION; SENSOR NETWORKS; SINGLE; SEGREGATION; BEAMFORMER; ALGORITHMS;

DOI:

10.1109/TASLP.2021.3092838

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Deep neural network (DNN)-based speech enhancement algorithms in microphone arrays have now proven to be efficient solutions to speech understanding and speech recognition in noisy environments. However, in the context of ad-hoc microphone arrays, many challenges remain and raise the need for distributed processing. In this paper, we propose to extend a previously introduced distributed DNN-based time-frequency mask estimation scheme that can efficiently use spatial information in the form of so-called compressed signals, which are pre-filtered target estimates. We study the performance of this algorithm, named Tango, under realistic acoustic conditions and investigate practical aspects of its optimal application. We show that the nodes in the microphone array cooperate by taking advantage of their spatial coverage of the room. We also propose to use the compressed signals to convey not only the target estimate but also the noise estimate, in order to exploit the acoustic diversity recorded throughout the microphone array.

Pages: 2310-2323

Number of pages: 14

50 items in total

[1] DNN-BASED DISTRIBUTED MULTICHANNEL MASK ESTIMATION FOR SPEECH ENHANCEMENT IN MICROPHONE ARRAYS
Furnon, Nicolas
Serizel, Romain
Illina, Irina
Essid, Slim
[J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 4672 - 4676
[2] DISTRIBUTED SPEECH SEPARATION IN SPATIALLY UNCONSTRAINED MICROPHONE ARRAYS
Furnon, Nicolas
Serizel, Romain
Illina, Irina
Essid, Slim
[J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 4490 - 4494
[3] DNN-BASED SPEECH MASK ESTIMATION FOR EIGENVECTOR BEAMFORMING
Pfeifenberger, Lukas
Zoehrer, Matthias
Pernkopf, Franz
[J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 66 - 70
[4] DNN-BASED SPEECH PRESENCE PROBABILITY ESTIMATION FOR MULTI-FRAME SINGLE-MICROPHONE SPEECH ENHANCEMENT
Tammen, Marvin
Fischer, Doerte
Meyer, Bernd T.
Doclo, Simon
[J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 191 - 195
[5] Attention-based distributed speech enhancement for unconstrained microphone arrays with varying number of nodes
Furnon, Nicolas
Serizel, Romain
Essid, Slim
Illina, Irina
[J]. 29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 1095 - 1099
[6] Power Exponent Based Weighting Criterion for DNN-Based Mask Approximation in Speech Enhancement
Cui, Zihao
Bao, Changchun
[J]. IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 618 - 622
[7] DNN-BASED ENHANCEMENT OF NOISY AND REVERBERANT SPEECH
Zhao, Yan
Wang, DeLiang
Merks, Ivo
Zhang, Tao
[J]. 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 6525 - 6529
[8] Towards More Efficient DNN-Based Speech Enhancement Using Quantized Correlation Mask
Abdullah, Salinna
Zamani, Majid
Demosthenous, Andreas
[J]. IEEE ACCESS, 2021, 9 : 24350 - 24362
[9] Online Multichannel Speech Enhancement Based on Recursive EM and DNN-Based Speech Presence Estimation
Martin-Donas, Juan Manuel
Jensen, Jesper
Tan, Zheng-Hua
Gomez, Angel M.
Peinado, Antonio M.
[J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 3080 - 3094
[10] DNN-based Feature Transformation for Speech Recognition Using Throat Microphone
Lin, Shengke
Tsunakawa, Takashi
Nishida, Masafumi
Nishimura, Masafumi
[J]. 2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017), 2017, : 596 - 599
