ITERATIVE DEEP NEURAL NETWORKS FOR SPEAKER-INDEPENDENT BINAURAL BLIND SPEECH SEPARATION

Cited by: 0
Authors
Liu, Qingju [1 ]
Xu, Yong [1 ]
Jackson, Philip J. B. [1 ]
Wang, Wenwu [1 ]
Coleman, Philip [2 ]
Affiliations
[1] Univ Surrey, Ctr Vis Speech & Signal Proc, Guildford, Surrey, England
[2] Univ Surrey, Inst Sound Recording, Guildford, Surrey, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
Deep neural network; binaural blind speech separation; spectral and spatial; iterative DNN;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Subject Classification Codes
070206; 082403;
Abstract
In this paper, we propose an iterative deep neural network (DNN)-based binaural source separation scheme, for recovering two concurrent speech signals in a room environment. Besides the commonly-used spectral features, the DNN also takes non-linearly wrapped binaural spatial features as input, which are refined iteratively using parameters estimated from the DNN output via a feedback loop. Different DNN structures have been tested, including a classic multilayer perceptron regression architecture as well as a new hybrid network with both convolutional and densely-connected layers. Objective evaluations in terms of PESQ and STOI showed consistent improvement over baseline methods using traditional binaural features, especially when the hybrid DNN architecture was employed. In addition, our proposed scheme is robust to mismatches between the training and testing data.
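The feedback loop described in the abstract might be sketched as follows. Everything here is an illustrative assumption rather than the paper's actual formulation: the Gaussian warping of the interaural phase difference (IPD), the parameter re-estimation rule, and the `dnn` callable are all hypothetical stand-ins for the spatial-feature refinement the abstract describes.

```python
import numpy as np

def warp_spatial_feature(ipd, sigma):
    """Non-linearly warp binaural IPD features (assumed Gaussian kernel)."""
    return np.exp(-(ipd ** 2) / (2.0 * sigma ** 2))

def iterative_separation(spec_feat, ipd, dnn, n_iters=3):
    """Iteratively refine warped spatial features from DNN output masks.

    spec_feat : (T, F) spectral features
    ipd       : (T, F) interaural phase differences
    dnn       : callable mapping (T, 2F) input to a (T, F) soft mask
    """
    sigma = 1.0
    mask = None
    for _ in range(n_iters):
        spatial = warp_spatial_feature(ipd, sigma)
        x = np.concatenate([spec_feat, spatial], axis=-1)
        mask = dnn(x)  # soft time-frequency mask for the target speaker
        # Feedback step (illustrative): re-estimate the warping parameter
        # from IPD values the current mask attributes to the target.
        active = mask > 0.5
        if np.any(active):
            sigma = max(0.1, float(np.std(ipd[active])))
    return mask
```

A toy run with a placeholder "network" (a sigmoid over the spectral half of the input) confirms the loop produces a valid soft mask; in the paper this role would be played by the trained MLP or hybrid convolutional/densely-connected model.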
Pages: 541 - 545
Page count: 5
Related Papers
50 records in total
  • [1] Continuous speech of speaker-independent based on two weight neural networks
    Cao Wen-ming
    Ye Hong
    Xu Chun-yan
    Wang Shou-jue
    PROCEEDINGS OF 2005 CHINESE CONTROL AND DECISION CONFERENCE, VOLS 1 AND 2, 2005, : 1415 - +
  • [2] Speaker-independent speech recognition by means of functional-link neural networks
    Ugena, A
    de Arriaga, F
    El Alami, M
    15TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 2, PROCEEDINGS: PATTERN RECOGNITION AND NEURAL NETWORKS, 2000, : 1018 - 1021
  • [3] SPEAKER-INDEPENDENT CONSONANT CLASSIFICATION IN CONTINUOUS SPEECH WITH DISTINCTIVE FEATURES AND NEURAL NETWORKS
    DEMORI, R
    FLAMMIA, G
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1993, 94 (06): : 3091 - 3103
  • [4] Compact modular neural networks in a hybrid speaker-independent speech recognition system
    Glaeser, A
    ICNN - 1996 IEEE INTERNATIONAL CONFERENCE ON NEURAL NETWORKS, VOLS. 1-4, 1996, : 1895 - 1899
  • [5] Binaural reverberant Speech separation based on deep neural networks
    Zhang, Xueliang
    Wang, DeLiang
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 2018 - 2022
  • [6] LOW-LATENCY SPEAKER-INDEPENDENT CONTINUOUS SPEECH SEPARATION
    Yoshioka, Takuya
    Chen, Zhuo
    Liu, Changliang
    Xiao, Xiong
    Erdogan, Hakan
    Dimitriadis, Dimitrios
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6980 - 6984
  • [7] A CASA APPROACH TO DEEP LEARNING BASED SPEAKER-INDEPENDENT CO-CHANNEL SPEECH SEPARATION
    Liu, Yuzhou
    Wang, DeLiang
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5399 - 5403
  • [8] PERMUTATION INVARIANT TRAINING OF DEEP MODELS FOR SPEAKER-INDEPENDENT MULTI-TALKER SPEECH SEPARATION
    Yu, Dong
    Kolbaek, Morten
    Tan, Zheng-Hua
    Jensen, Jesper
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 241 - 245
  • [9] Speaker Adversarial Neural Network (SANN) for Speaker-independent Speech Emotion Recognition
    Fahad, Md Shah
    Ranjan, Ashish
    Deepak, Akshay
    Pradhan, Gayadhar
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2022, 41 (11) : 6113 - 6135