ITERATIVE DEEP NEURAL NETWORKS FOR SPEAKER-INDEPENDENT BINAURAL BLIND SPEECH SEPARATION

Cited by: 0

Authors:

Liu, Qingju [1]

Xu, Yong [1]

Jackson, Philip J. B. [1]

Wang, Wenwu [1]

Coleman, Philip [2]

Affiliations:

[1] Univ Surrey, Ctr Vis Speech & Signal Proc, Guildford, Surrey, England

[2] Univ Surrey, Inst Sound Recording, Guildford, Surrey, England

Source:

2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2018

Funding:

Engineering and Physical Sciences Research Council (EPSRC), UK;

Keywords:

Deep neural network; binaural blind speech separation; spectral and spatial; iterative DNN;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

In this paper, we propose an iterative deep neural network (DNN)-based binaural source separation scheme for recovering two concurrent speech signals in a room environment. Besides the commonly used spectral features, the DNN also takes non-linearly wrapped binaural spatial features as input, which are refined iteratively using parameters estimated from the DNN output via a feedback loop. Different DNN structures have been tested, including a classic multilayer perceptron regression architecture as well as a new hybrid network with both convolutional and densely connected layers. Objective evaluations in terms of PESQ and STOI showed consistent improvement over baseline methods using traditional binaural features, especially when the hybrid DNN architecture was employed. In addition, our proposed scheme is robust to mismatches between the training and testing data.
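
The feedback loop described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the features, the network, or the update rule, so everything below is an assumption made for illustration: the spatial cues are taken to be interaural level and phase differences (ILD, IPD), the "non-linear" mapping is a tanh/cosine squashing, the parameters are mask-weighted reference cues per frequency band, and `placeholder_model` is a hypothetical stand-in for a trained DNN. It is not the authors' implementation.

```python
import cmath
import math

def binaural_cues(left, right, eps=1e-8):
    """Per-bin interaural level difference (dB) and phase difference (rad).

    `left` and `right` are lists of frequency bands, each a list of complex
    time-frequency coefficients (assumed input layout).
    """
    ild = [[20.0 * math.log10((abs(l) + eps) / (abs(r) + eps)) for l, r in zip(lf, rf)]
           for lf, rf in zip(left, right)]
    ipd = [[cmath.phase(complex(l) * complex(r).conjugate()) for l, r in zip(lf, rf)]
           for lf, rf in zip(left, right)]
    return ild, ipd

def reference_cues(ild_row, ipd_row, mask_row, eps=1e-8):
    """Mask-weighted mean ILD and circular-mean IPD for one frequency band."""
    total = sum(mask_row) + eps
    ild_ref = sum(m * v for m, v in zip(mask_row, ild_row)) / total
    ipd_ref = cmath.phase(sum(m * cmath.exp(1j * p) for m, p in zip(mask_row, ipd_row)))
    return ild_ref, ipd_ref

def placeholder_model(log_mag, d_ild, d_ipd):
    """Hypothetical stand-in for a trained DNN: returns a soft mask value in (0, 1)."""
    score = 2.0 * d_ipd - 2.0 * abs(d_ild)  # log_mag is ignored by this toy scorer
    return 1.0 / (1.0 + math.exp(-score))

def iterative_masks(left, right, n_iter=3, model=placeholder_model):
    """Alternate between estimating a mask and refining the spatial features."""
    ild, ipd = binaural_cues(left, right)
    mask = [[0.5] * len(row) for row in ild]  # uninformative initial mask
    for _ in range(n_iter):
        new_mask = []
        for f in range(len(ild)):
            # Feedback step: parameters are re-estimated from the previous mask.
            ild_ref, ipd_ref = reference_cues(ild[f], ipd[f], mask[f])
            row = []
            for t in range(len(ild[f])):
                log_mag = math.log(abs(left[f][t]) + abs(right[f][t]) + 1e-8)
                d_ild = math.tanh((ild[f][t] - ild_ref) / 10.0)  # bounded level deviation
                d_ipd = math.cos(ipd[f][t] - ipd_ref)            # phase deviation in [-1, 1]
                row.append(model(log_mag, d_ild, d_ipd))
            new_mask.append(row)
        mask = new_mask
    return mask
```

In this sketch the spectral feature (`log_mag`) is passed to the model alongside the spatial features, mirroring the abstract's description of combined spectral and spatial inputs; a real system would replace `placeholder_model` with a trained network and apply the resulting mask to the mixture spectrogram.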

Pages: 541-545

Number of pages: 5
