Supervised Separation of Speech from Background Piano Music using a Nonnegative Matrix Factorization Approach

Cited by: 0
Authors
Martinez-Colon, A. [1]
Canadas-Quesada, F. J. [1]
Vera-Candeas, P. [1]
Ruiz-Reyes, N. [1]
Moreno-Fuentes, F. [1]
Affiliation
[1] Univ Jaen, Telecommun Engn Dept, Jaen, Spain
Source
STAIRS 2014 | 2014 / Vol. 264
Keywords
Sound separation; Non-negative matrix factorization; training; supervised; sparse; interference;
DOI
10.3233/978-1-61499-421-3-181
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a supervised algorithm for separating speech from non-stationary background noise (piano music) in single-channel recordings. The proposed algorithm, based on a nonnegative matrix factorization (NMF) approach, extracts speech from mixtures containing isolated piano notes or chords by learning the set of spectral patterns generated by independent syllables and piano notes. Moreover, a sparsity constraint is used to improve the quality of the separated signals. The proposal was tested on several audio mixtures composed of real-world piano recordings and Spanish speech, showing promising results.
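The abstract does not give the cost function, update rules, or the form of the sparsity constraint, so the following is only a minimal sketch of the general supervised-NMF idea it describes, under stated assumptions: a Kullback-Leibler cost with multiplicative updates, an L1 penalty on the activations as the sparsity term, and a soft-mask reconstruction. The function names `nmf_kl` and `separate`, and all parameter values, are hypothetical and not taken from the paper. Inputs are assumed to be nonnegative magnitude spectrograms (frequency bins by frames).

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def nmf_kl(V, n_components, n_iter=200, W=None, sparsity=0.0, seed=0):
    """KL-divergence NMF with multiplicative updates (assumed cost).

    If W is given, it is held fixed and only the activations H are
    estimated; `sparsity` is an L1 weight on H (assumed constraint).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    learn_W = W is None
    if learn_W:
        W = rng.random((n_freq, n_components)) + EPS
    H = rng.random((W.shape[1], n_frames)) + EPS
    ones = np.ones_like(V)
    for _ in range(n_iter):
        WH = W @ H + EPS
        # Activation update; the L1 penalty appears in the denominator.
        H *= (W.T @ (V / WH)) / (W.T @ ones + sparsity + EPS)
        if learn_W:
            WH = W @ H + EPS
            W *= ((V / WH) @ H.T) / (ones @ H.T + EPS)
            # Normalise each basis to unit sum and rescale H so that
            # the product W @ H is unchanged.
            scale = W.sum(axis=0, keepdims=True) + EPS
            W /= scale
            H *= scale.T
    return W, H


def separate(V_mix, W_speech, W_piano, sparsity=0.1, n_iter=200):
    """Split a mixture spectrogram using pre-trained, fixed bases."""
    W = np.hstack([W_speech, W_piano])
    _, H = nmf_kl(V_mix, W.shape[1], n_iter=n_iter, W=W, sparsity=sparsity)
    k = W_speech.shape[1]
    S = W_speech @ H[:k]   # speech part of the model
    P = W_piano @ H[k:]    # piano part of the model
    mask = S / (S + P + EPS)  # soft mask with values in [0, 1]
    return mask * V_mix, (1.0 - mask) * V_mix
```

In this sketch the training stage would call `nmf_kl` once on a speech-only spectrogram and once on a piano-only spectrogram to obtain `W_speech` and `W_piano`; `separate` then keeps both dictionaries fixed and estimates only the activations of the mixture. Converting the masked spectrograms back to audio (for example with the mixture phase and an inverse STFT) is left out.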
Pages: 181-190
Number of pages: 10