A SOURCE/FILTER MODEL WITH ADAPTIVE CONSTRAINTS FOR NMF-BASED SPEECH SEPARATION

Cited by: 0
Authors
Bouvier, Damien [1]
Obin, Nicolas [1]
Liuni, Marco [1]
Roebel, Axel [1]
Affiliations
[1] UPMC, IRCAM, CNRS, UMR STMS IRCAM, Paris, France
Keywords
speech separation; non-negative matrix factorization; source/filter model; constraints; NONNEGATIVE MATRIX FACTORIZATION; PARTS;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper introduces a constrained source/filter model for semi-supervised speech separation based on non-negative matrix factorization (NMF). The objective is to inform NMF with prior knowledge about speech, so that the separation is physically meaningful. To do so, a source/filter model (referred to as the Instantaneous Mixture Model, or IMM) is integrated into the NMF. Furthermore, constraints are added to the IMM-NMF in order to control the behaviour of the NMF during separation and to enforce its physical meaning. In particular, a speech-specific constraint, based on the source/filter coherence of speech, and a method for automatically adapting the constraint weights during separation are presented. The proposed source/filter model is also semi-supervised: during training, one filter basis is estimated for each phoneme of a speaker; during separation, the estimated filter bases are then used in the constrained source/filter model. An experimental evaluation of speech separation was conducted on the TIMIT speakers database mixed with various environmental background noises from the QUT-NOISE database. This evaluation showed that the use of adaptive constraints increases the performance of the source/filter model for speaker-dependent speech separation, and that the model compares favorably to fully-supervised speech separation.
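To make the semi-supervised setting in the abstract concrete, the following is a minimal sketch of plain semi-supervised NMF, not the paper's IMM-NMF: it has no source/filter decomposition and no adaptive constraints. It assumes a Euclidean cost with standard multiplicative updates, which the abstract does not specify. The function name `semi_supervised_nmf` and its parameters are hypothetical. Pre-trained speech bases `W_speech` stay fixed, noise bases are learned from the mixture, and the speech estimate is obtained with a soft mask.

```python
import numpy as np

def semi_supervised_nmf(V, W_speech, n_noise=4, n_iter=200, eps=1e-12, seed=0):
    """Approximate V by [W_speech, W_noise] @ H, keeping W_speech fixed.

    V        : non-negative magnitude spectrogram, shape (n_freq, n_frames)
    W_speech : pre-trained non-negative speech bases, shape (n_freq, n_speech)
    Assumption: Euclidean cost with multiplicative updates (illustrative only).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    n_speech = W_speech.shape[1]

    # Random non-negative initialisation of the free parameters.
    W_noise = rng.random((n_freq, n_noise)) + eps
    H = rng.random((n_speech + n_noise, n_frames)) + eps

    for _ in range(n_iter):
        W = np.hstack([W_speech, W_noise])
        # Update all activations (speech and noise).
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Update only the noise bases; the speech bases are never modified.
        H_noise = H[n_speech:]
        W_noise *= (V @ H_noise.T) / (W @ H @ H_noise.T + eps)

    W = np.hstack([W_speech, W_noise])
    V_hat = W @ H + eps
    # Soft (Wiener-like) mask: share of the model attributed to speech.
    speech_est = (W_speech @ H[:n_speech]) / V_hat * V
    return speech_est, W_noise, H
```

Because the mask lies between 0 and 1, the speech estimate never exceeds the mixture magnitude in any time-frequency bin. A constrained model such as the one described above would add penalty terms to the cost and adjust their weights during the iterations.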
Pages: 131-135
Number of pages: 5
Related papers
50 in total
  • [41] Cluster-based Language Model for Spoken Document Retrieval Using NMF-Based Document Clustering
    Hu, Xinhui
    Isotani, Ryosuke
    Kawai, Hisashi
    Nakamura, Satoshi
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 705 - 708
  • [42] Source-filter separation for nonstationary voiced speech based on sinusoidal representation
    Ito, Masashi
    Ohara, Keiji
    Ito, Akinori
    Yano, Masafumi
    ACOUSTICAL SCIENCE AND TECHNOLOGY, 2010, 31 (02) : 181 - 184
  • [43] Classifying NMF Components Based on Vector Similarity for Speech and Music Separation
    Zheng, Nengheng
    Cai, Yi
    Li, Xia
    Lee, Tan
    2012 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2012,
  • [44] DEEP GENERATIVE MODEL LEARNING FOR BLIND SPECTRUM CARTOGRAPHY WITH NMF-BASED RADIO MAP DISAGGREGATION
    Shrestha, Sagar
    Fu, Xiao
    Hong, Mingyi
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 4920 - 4924
  • [45] Upgrading Sparse NMF algorithm for blind source separation through Adaptive Parameterized Hybrid Kernel based approach
    ParimalaGandhi, A.
    Vijayan, S.
    MEASUREMENT, 2019, 143 : 11 - 21
  • [46] ADAPTIVE SPARSE SOURCE SEPARATION WITH APPLICATION TO SPEECH SIGNALS
    Azizi, Elham
    Mohimani, G. Hosein
    Babaie-Zadeh, Massoud
    ICSPC: 2007 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS, VOLS 1-3, PROCEEDINGS, 2007, : 640 - 643
  • [47] Single channel source separation using graph sparse NMF and adaptive dictionary learning
    Pham, Tuan
    Lee, Yuan-Shan
    Lin, Yan-Bo
    Li, Yung-Hui
    Tai, Tzu-Chiang
    Wang, Jia-Ching
    INTELLIGENT DATA ANALYSIS, 2017, 21 : S5 - S19
  • [48] Speech Source Separation Using Variational Autoencoder and Bandpass Filter
    Do, Hao Duc
    Tran, Son Thai
    Chau, Duc Thanh
    IEEE ACCESS, 2020, 8 : 156219 - 156231
  • [49] Source-filter Separation of Speech Signal in the Phase Domain
    Loweimi, Erfan
    Barker, Jon
    Hain, Thomas
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 598 - 602
  • [50] Deep Learning Based Speech Separation via NMF-Style Reconstructions
    Nie, Shuai
    Liang, Shan
    Liu, Wenju
    Zhang, Xueliang
    Tao, Jianhua
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (11) : 2043 - 2055