ITERATIVE DEEP NEURAL NETWORKS FOR SPEAKER-INDEPENDENT BINAURAL BLIND SPEECH SEPARATION

Cited by: 0
Authors
Liu, Qingju [1]
Xu, Yong [1]
Jackson, Philip J. B. [1]
Wang, Wenwu [1]
Coleman, Philip [2]
Affiliations
[1] Univ Surrey, Ctr Vis Speech & Signal Proc, Guildford, Surrey, England
[2] Univ Surrey, Inst Sound Recording, Guildford, Surrey, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Deep neural network; binaural blind speech separation; spectral and spatial; iterative DNN
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
In this paper, we propose an iterative deep neural network (DNN)-based binaural source separation scheme for recovering two concurrent speech signals in a room environment. Besides the commonly used spectral features, the DNN also takes non-linearly wrapped binaural spatial features as input, which are refined iteratively using parameters estimated from the DNN output via a feedback loop. Different DNN structures have been tested, including a classic multilayer perceptron regression architecture as well as a new hybrid network with both convolutional and densely-connected layers. Objective evaluations in terms of PESQ and STOI showed consistent improvement over baseline methods using traditional binaural features, especially when the hybrid DNN architecture was employed. In addition, our proposed scheme is robust to mismatches between the training and testing data.
Pages: 541-545
Number of pages: 5
Related papers
50 in total
  • [31] CBLDNN-BASED SPEAKER-INDEPENDENT SPEECH SEPARATION VIA GENERATIVE ADVERSARIAL TRAINING
    Li, Chenxing
    Zhu, Lei
    Xu, Shuang
    Gao, Peng
    Xu, Bo
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 711 - 715
  • [32] SPEAKER-CONSISTENT PARSING FOR SPEAKER-INDEPENDENT CONTINUOUS SPEECH RECOGNITION
    YAMAGUCHI, K
    SINGER, H
    MATSUNAGA, S
    SAGAYAMA, S
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 1995, E78D (06) : 719 - 724
  • [33] SPEAKER-INDEPENDENT DETECTION OF CHILD-DIRECTED SPEECH
    Schuster, Sebastian
    Pancoast, Stephanie
    Ganjoo, Milind
    Frank, Michael C.
    Jurafsky, Dan
    2014 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY SLT 2014, 2014, : 366 - 371
  • [34] Deep neural networks based binary classification for single channel speaker independent multi-talker speech separation
    Saleem, Nasir
    Khattak, Muhammad Irfan
    APPLIED ACOUSTICS, 2020, 167
  • [35] Biomimetic pattern recognition for speaker-independent speech recognition
    Qin, H
    Wang, SJ
    Sun, H
    PROCEEDINGS OF THE 2005 INTERNATIONAL CONFERENCE ON NEURAL NETWORKS AND BRAIN, VOLS 1-3, 2005, : 1290 - 1294
  • [36] On Speaker-Independent Personality Perception and Prediction from Speech
    Polzehl, Tim
    Schoenenberg, Katrin
    Moeller, Sebastian
    Metze, Florian
    Mohammadi, Gelareh
    Vinciarelli, Alessandro
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 258 - 261
  • [37] Speaker-Independent Speech Recognition using Visual Features
    Pooventhiran, G.
    Sandeep, A.
    Manthiravalli, K.
    Harish, D.
    Renuka, Karthika D.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (11) : 616 - 620
  • [38] Generalized Cyclic Transformations in Speaker-Independent Speech Recognition
    Mueller, Florian
    Belilovsky, Eugene
    Mertins, Alfred
    2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009), 2009, : 211 - 215
  • [39] SPEAKER-INDEPENDENT PERCEPTION OF HUMAN SPEECH BY ZEBRA FINCHES
    Ohms, Verena R.
    van Heijningen, Caroline A. A.
    Gill, Arike
    Beckers, Gabriel J. L.
    ten Cate, Carel
    EVOLUTION OF LANGUAGE, PROCEEDINGS, 2010, : 467 - +
  • [40] Uighur speaker-independent speech recognition based on CDCPM
    Wang, K.L.
    2001, Science Press (38):