ONLINE DEEP ATTRACTOR NETWORK FOR REAL-TIME SINGLE-CHANNEL SPEECH SEPARATION

Cited by: 0

Authors:

Han, Cong [1]

Luo, Yi [1]

Mesgarani, Nima [1]

Affiliations:

[1] Columbia Univ, Dept Elect Engn, New York, NY 10027 USA

Source:

2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2019

Funding:

U.S. National Science Foundation

Keywords:

Source separation; speaker-independent; attractor network; real-time

DOI:

10.1109/icassp.2019.8682884

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

Speaker-independent speech separation is a challenging audio processing problem. In recent years, several deep learning algorithms have been proposed to address it. Most of these methods use noncausal implementations, which limits their application in real-time scenarios such as wearable hearing devices and low-latency telecommunication. In this paper, we propose the Online Deep Attractor Network (ODANet), a causal extension of the Deep Attractor Network (DANet) that enables real-time speech separation. In contrast to DANet, which estimates a global attractor point for each speaker from the entire utterance, ODANet estimates the attractors at each time step and tracks them using a dynamic weighting function that relies only on causal information. This not only solves the speaker tracking problem but also allows ODANet to generate more stable embeddings across time. Experimental results show that ODANet can achieve separation accuracy similar to that of the noncausal DANet on both two-speaker and three-speaker speech separation problems, which makes it a suitable candidate for applications that require robust real-time speech processing.
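The per-time-step attractor tracking described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the weighting rule here (a running average weighted by accumulated soft-assignment mass) is an assumption standing in for the dynamic weighting function, and the names `update_attractors` and `mass` are hypothetical. The embeddings are taken as given, whereas in the real system a network would produce them.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def update_attractors(attractors, mass, frame_embeddings):
    """One causal step using only the current frame and past state.

    attractors: list of C vectors, the running attractor per speaker
    mass: list of C floats, accumulated assignment weight per speaker
          (assumed weighting state, not taken from the paper)
    frame_embeddings: list of F vectors, one embedding per frequency bin
    Returns (new_attractors, new_mass, masks); masks[f] sums to 1 over speakers.
    """
    # Soft-assign each bin to a speaker by similarity to the previous attractors.
    masks = [softmax([dot(v, a) for a in attractors]) for v in frame_embeddings]
    new_attractors, new_mass = [], []
    for c, (a_prev, m_prev) in enumerate(zip(attractors, mass)):
        weights = [m[c] for m in masks]
        m_cur = sum(weights)
        if m_cur == 0.0:
            # No evidence in this frame: keep the previous attractor.
            new_attractors.append(list(a_prev))
            new_mass.append(m_prev)
            continue
        # Attractor estimated from the current frame alone (weighted mean).
        a_cur = [
            sum(w * v[d] for w, v in zip(weights, frame_embeddings)) / m_cur
            for d in range(len(a_prev))
        ]
        # Assumed dynamic weight: the current frame's share of all evidence so far.
        alpha = m_cur / (m_prev + m_cur)
        new_attractors.append(
            [(1.0 - alpha) * p + alpha * q for p, q in zip(a_prev, a_cur)]
        )
        new_mass.append(m_prev + m_cur)
    return new_attractors, new_mass, masks
```

Calling `update_attractors` once per incoming frame, feeding back the returned attractors and mass, keeps each speaker tied to the same attractor index over time. This is the sense in which attractor tracking addresses speaker tracking in this sketch.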

Pages: 361-365

Page count: 5
